Ryzen Benson (@ryzenbenson) 's Twitter Profile
Ryzen Benson

@ryzenbenson

Postdoc at UCSF in Radiation Oncology | Cancer Informatics Researcher

ID: 1235724206786011141

Joined: 06-03-2020 00:29:56

410 Tweets

166 Followers

190 Following

Vicky Tiase (@vtiase) 's Twitter Profile Photo

Costco members will be eligible for online primary care visits for $29 and mental health visits for $79 bloomberg.com/news/articles/… via Bloomberg

Julian Hong (@julian_hong) 's Twitter Profile Photo

Come swing by resident Will Chen’s poster presenting findings from our prostate cancer metastasis-directed SBRT experience in the PSMA PET era at #ASTRO23!

Thanks ASTRO and Prostate Cancer Foundation PCF Science for your support!

UCSF Helen Diller Family Comprehensive Cancer Ctr
Bo Wang (@bowang87) 's Twitter Profile Photo

Interested in LLMs for genomic research but don't know where to start? Looking for a review/survey to get started in this field? 👇👇😀

I am very excited to share that our review paper titled "To Transformers and Beyond: Large Language Models for the Genome" is now available as
Sophia Yang, Ph.D. (@sophiamyang) 's Twitter Profile Photo

What is Mixture-of-Experts (MoE)?

MoE is a neural network architecture design that integrates layers of experts/models within the Transformer block. As data flows through the MoE layers, each input token is dynamically routed to a subset of the experts for computation. This
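To make the routing concrete, here is a minimal, hedged sketch of a top-k MoE layer in PyTorch. The expert count, hidden sizes, and top_k value are arbitrary assumptions for illustration, not any particular model's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal Mixture-of-Experts layer: a gate routes each token to top_k experts."""
    def __init__(self, d_model=64, d_ff=128, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)          # router / gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                    # x: (tokens, d_model)
        scores = self.gate(x)                                # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)       # pick top_k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                     # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)       # 10 token embeddings
print(TopKMoE()(tokens).shape)     # torch.Size([10, 64])
```

Only top_k of the num_experts sub-networks ever run for a given token, which is where the efficiency argument in the next post comes from.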
Bindu Reddy (@bindureddy) 's Twitter Profile Photo

Transformer MoE Architectures - Why They Are More Efficient

The Mistral 8x7B MoE model performs like a solid 70B-parameter, GPT-3.5-class model. Instead of having every part of the model work on every task, an MoE model splits the work among many specialized sub-models, or "experts." Each expert is good
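A hedged back-of-the-envelope on why this is efficient: assuming roughly 7B parameters per expert and 2 active experts per token (the commonly cited Mixtral-style setup; real models also share attention and embedding weights across experts, so these are approximations), each token only touches a fraction of the stored parameters.

```python
# Rough, illustrative arithmetic only; not an exact parameter count for any model.
num_experts = 8
params_per_expert_b = 7.0        # ~7B parameters per expert (assumed)
active_experts_per_token = 2

total_b = num_experts * params_per_expert_b                    # ~56B stored
active_b = active_experts_per_token * params_per_expert_b      # ~14B used per token

print(f"Total parameters:      ~{total_b:.0f}B")
print(f"Active per token:      ~{active_b:.0f}B")
print(f"Compute fraction used: {active_b / total_b:.0%}")      # ~25%
```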
AK (@_akhaliq) 's Twitter Profile Photo

Google presents Patchscopes

A Unifying Framework for Inspecting Hidden Representations of Language Models

paper page: huggingface.co/papers/2401.06…

Inspecting the information encoded in hidden representations of large language models (LLMs) can explain models' behavior and verify
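As a loose illustration of what "inspecting hidden representations" can look like in practice, here is a much simpler, related trick (logit-lens-style decoding of intermediate layers), not the Patchscopes framework itself; the model name and the GPT-2-specific module paths are assumptions for the sketch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model for illustration; Patchscopes itself is far more general.
name = "gpt2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

prompt = "The Eiffel Tower is located in the city of"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Decode what each layer's hidden state at the last position "believes" the next
# token is, by pushing it through the final layer norm and the unembedding matrix.
for layer, h in enumerate(out.hidden_states):
    vec = model.transformer.ln_f(h[0, -1])              # GPT-2-specific module names
    token_id = (vec @ model.lm_head.weight.T).argmax().item()
    print(f"layer {layer:2d} -> {tok.decode(token_id)!r}")
```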
Wendy Chapman (@wendywchapman) 's Twitter Profile Photo

We need to consider implementation and validation at the outset of technology design rather than treating it as an afterthought. I introduce several types of infrastructure to support digital health innovators, focusing especially on AI in healthcare. wix.to/czabI8M

JAMA Oncology (@jamaonc) 's Twitter Profile Photo

A machine learning approach using daily step counts from wearable devices across three prospective trials was developed and validated in this study to predict unplanned hospitalization of patients undergoing chemoradiation. ja.ma/4ax3WA1

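Purely as an illustrative sketch of the general approach, and not the study's actual features, model, or data, a step-count-based hospitalization classifier might be prototyped along these lines with scikit-learn and synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in: daily step counts for 200 patients over 14 days of treatment.
steps = rng.normal(5000, 2000, size=(200, 14)).clip(min=0)
# Toy label: lower and declining activity loosely raises hospitalization risk.
risk = -steps.mean(axis=1) / 4000 - (steps[:, -7:].mean(axis=1) - steps[:, :7].mean(axis=1)) / 2000
hospitalized = (risk + rng.normal(0, 0.5, 200)) > np.median(risk)

# Simple per-patient features: mean, minimum, and week-over-week change in steps.
X = np.column_stack([
    steps.mean(axis=1),
    steps.min(axis=1),
    steps[:, -7:].mean(axis=1) - steps[:, :7].mean(axis=1),
])
X_tr, X_te, y_tr, y_te = train_test_split(X, hospitalized, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("AUC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```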
Wei Ping (@_weiping) 's Twitter Profile Photo

Introducing RankRAG, a novel RAG framework that instruction-tunes a single LLM for the dual purposes of top-k context ranking and answer generation in RAG.

For context ranking, it performs exceptionally well by incorporating a small fraction of ranking data into the training
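A rough sketch of the RankRAG idea: one instruction-tuned LLM is used first to rank retrieved contexts and then to answer from the top-ranked ones. The `llm()` completion function and the prompt wording are hypothetical stand-ins, not the paper's implementation.

```python
# Sketch only: `llm(prompt) -> str` is a hypothetical completion function standing
# in for whatever single instruction-tuned model you actually use for both steps.

def rank_contexts(llm, question, passages, top_k=3):
    """Ask the same LLM to score each retrieved passage's relevance; keep the top_k."""
    scored = []
    for p in passages:
        prompt = (f"Question: {question}\nPassage: {p}\n"
                  "On a scale of 0-10, how relevant is this passage? Answer with a number.")
        try:
            score = float(llm(prompt).strip().split()[0])
        except ValueError:
            score = 0.0
        scored.append((score, p))
    return [p for _, p in sorted(scored, reverse=True)[:top_k]]

def answer(llm, question, passages):
    """Generate the final answer conditioned only on the top-ranked contexts."""
    context = "\n\n".join(rank_contexts(llm, question, passages))
    return llm(f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:")
```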
Aakash Kumar Nain (@a_k_nain) 's Twitter Profile Photo

I went through the Llama-3 technical report (92 pages!). The report is very detailed, and it will be hard to describe everything in a single tweet, but I will try to summarize it in the best possible way. Here we go...

Overview
- Standard dense Transformer with minor changes
-
Leland McInnes (@leland_mcinnes) 's Twitter Profile Photo

Datamapplot 0.4 is out now, and has far more powerful and effective interactive plots. Here is an example of a Data Map of 2.4 million papers on ArXiv, ready to be explored.
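For readers who want to try it, a minimal sketch of the interactive-plot workflow is below; the coordinates and labels are toy stand-ins, and the exact datamapplot call signature is written from memory and may differ across versions.

```python
import numpy as np
import datamapplot

# Toy stand-in data: 2-D layout coordinates (e.g. from UMAP) plus one label per point.
# Keyword names below are assumptions; check the datamapplot docs for your version.
coords = np.random.normal(size=(1000, 2))
labels = np.array(["topic A"] * 500 + ["topic B"] * 500)
hover = [f"document {i}" for i in range(1000)]

plot = datamapplot.create_interactive_plot(coords, labels, hover_text=hover)
plot.save("data_map.html")   # open in a browser to pan, zoom, and search the map
```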

Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Now that Microsoft open-sourced the code for one of THE CLASSIC papers of 2024, I am revisiting the MASTERPIECE.

📚 "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits"

BitNet b1.58 70B was 4.1 times faster and 8.9 times higher throughput capable than the
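For intuition on where "1.58 bits" comes from, below is a minimal sketch of ternary weight quantization with absmean scaling in the spirit of BitNet b1.58 (each weight lands in {-1, 0, +1}, and log2(3) ≈ 1.58 bits per weight); it is illustrative only, not the released Microsoft code.

```python
import torch

def ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Quantize weights to {-1, 0, +1} with an absmean scale, BitNet b1.58-style.

    Each weight carries ~1.58 bits of information, and the per-tensor scale lets
    matrix multiplies be replaced by additions and sign flips.
    """
    scale = w.abs().mean().clamp(min=eps)          # absmean scaling factor
    w_q = (w / scale).round().clamp(-1, 1)         # round, then clip to {-1, 0, 1}
    return w_q, scale

w = torch.randn(4, 4)
w_q, scale = ternary_quantize(w)
print(w_q)                                # entries are all -1., 0., or 1.
print((w_q * scale - w).abs().mean())     # rough reconstruction error
```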
Julian Hong (@julian_hong) 's Twitter Profile Photo

This study at computational scale gives insights into potential priority areas (and affected populations) on which to focus symptom mitigation strategies, which impacts both cancer outcomes and healthcare costs. Thanks to collaborators including Ryzen Benson, Jie Jane Chen, MD, and Jean Feng!

Julian Hong (@julian_hong) 's Twitter Profile Photo

Hot off the press! Ryzen Benson led our team in this review of large language models in cancer care and research as part of the latest IMIA yearbook. Hope it's a good resource for those interested in the area!

UCSF Helen Diller Family Comprehensive Cancer Ctr | UCSF Bakar Computational Health Sciences Institute | UC Joint Computational Precision Health Program

thieme-connect.com/products/ejour…

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning

"we propose Tango, a novel framework that uses RL to concurrently train  both an LLM generator and a verifier in an interleaved manner. A central  innovation of Tango is its generative, process-level