Thomas Savage (@thomasrsavage) Twitter Tweets • TwiCopy

npj Digital Medicine

2 years ago

Mitigating the 'black box' of AI: LLMs can imitate diagnostic reasoning strategies when solving clinical cases & provide an interpretable means to assess if the generated answer is true/false based on the diagnostic reasoning's factual & logical accuracy. nature.com/articles/s4174…

Likes: 23 · Replies: 1 · Reposts: 9

Omar Khattab

@lateinteraction

2 years ago

Long context will eventually work, then will eventually become less expensive and scale better. For now, though, the tradeoffs may not be great. (Must note this plot says 3000, not 10M.)

Likes: 84 · Replies: 4 · Reposts: 9

Jim Fan

@drjimfan

a year ago

We live in such strange times. Apple, a company famous for its secrecy, published a paper with a staggering amount of detail on their multimodal foundation model. Those who are supposed to be open are now wayyy less open than Apple. MM1 is a treasure trove of analysis. They discuss

Likes: 4.4K · Replies: 55 · Reposts: 732

JAMA Internal Medicine

@jamainternalmed

a year ago

In this quasi-experimental study, a deterioration model intervention was found to be associated with a decreased risk of escalations in care during hospitalization. ja.ma/4aqjG7V

Likes: 15 · Replies: 0 · Reposts: 3

Anil Makam

@anilmakam

a year ago

Fascinating regression discontinuity study in JAMA Internal Medicine of Epic's deterioration index (EDI) by Rob Gallo, MD. EDI alerts reduced rapid response & ICU transfers, though the result is sensitive to bandwidth choice below & above the EDI threshold at which the alert fires. jamanetwork.com/journals/jamai…

Likes: 20 · Replies: 4 · Reposts: 5

Jonathan H Chen MD PhD

@jonc101x

a year ago

Increase your sample size by asking an LLM chatbot the same question 100 times. Wait, maybe not? Rob Gallo, MD diving in on evaluating generative #AI. Is repeated prompting like asking the same person a question, or like random sampling from a population of people? jamanetwork.com/journals/jama/…

Likes: 9 · Replies: 0 · Reposts: 6

Thomas Savage

@thomasrsavage

a year ago

LLMs seem overconfident when responding to medical questions, so how do we know when they are actually uncertain? In our preprint we review strategies to estimate LLM uncertainty for medical diagnosis and treatment selection. medrxiv.org/content/10.110…

Likes: 2 · Replies: 0 · Reposts: 0

Thomas Savage

@thomasrsavage

a year ago

LLM fine-tuning is surprisingly underused in medicine. With data siloed, we will need fine-tuning to learn knowledge and preferences that are unique to our health systems. Here we show the benefits of SFT and DPO for many common medical NLP tasks (link: arxiv.org/pdf/2409.12741)

Likes: 0 · Replies: 0 · Reposts: 0

Andrew Ng

@andrewyng

a year ago

A decision on SB-1047 is due soon. Governor Gavin Newsom has said he's concerned about its "chilling effect, particularly in the open source community". He's right, and I hope he will veto this. If you agree, please like/retweet this to show your support for VETOing SB-1047!

Likes: 1.1K · Replies: 73 · Reposts: 481

Jonathan H Chen MD PhD

@jonc101x

10 months ago

Large language model chatbot #AI systems are remarkably accurate on medical questions, but hard to use in high-stakes medicine when you're unsure how confident the answer is (chatbots have a tendency to express high confidence, regardless of factuality). academic.oup.com/jamia/article-…

Likes: 24 · Replies: 2 · Reposts: 4

Penn LDI

@pennldi

9 months ago

LDI Fellow Thomas Savage's study shows that large language models can estimate their uncertainty in medical diagnosis using sample consistency (SC) proxies, which proved most reliable for uncertainty detection. Learn more here. CC: Penn Medicine academic.oup.com/jamia/advance-…

Likes: 1 · Replies: 0 · Reposts: 2

npj Digital Medicine

@npjdigitalmed

8 months ago

2🥈 Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine nature.com/articles/s4174…

Likes: 3 · Replies: 1 · Reposts: 2

npj Digital Medicine

@npjdigitalmed

7 months ago

Listen to Thomas Savage at Penn, in our first 2-minute Author Spotlight, discuss his work which was one of the top 10 cited papers at @npjdigitalmed in 2024!

Likes: 4 · Replies: 0 · Reposts: 3