Jack Hessel (@jmhessel)'s Twitter Profile
Jack Hessel

@jmhessel

ML, NLP, CV. PhD from @CornellCIS; Opinions my own.

ID:121516577

Link: https://jmhessel.com/ | Joined: 09-03-2010 19:02:10

2.1K Tweets

3.3K Followers

908 Following

Mohit Bansal (@mohitban47)

🚨 We have postdoc openings at UNC 🙂

Exciting+diverse NLP/CV/ML topics**, freedom to create research agenda, competitive funding, very strong students, many collabs w/ other faculty & universities+companies, superb quality of life/weather. Please apply + help spread the word…

jack morris (@jxmnop)

one of the most important things I know about deep learning I learned from this paper: 'Pretraining Without Attention'

this is what I found so surprising:
these people developed an architecture very different from Transformers called BiGS, spent months and months optimizing it and…

Andrej Karpathy (@karpathy)

Consider being a labeler for an LLM. The prompt is “give me a random number between 1 and 10”. What SFT & RM labels do you contribute? What does this do to the network when trained on?

In a subtle way this problem is present in every prompt that does not have a single unique answer.

Luca Soldaini 🎀 (@soldni)

Earnest question: why don’t top AI labs share their safety tools?

Seems like it would be pretty aligned to their mission?

Yuntian Deng (@yuntiandeng)

Will your paper catch the eye of AK (@_akhaliq)? I built a demo that predicts if AK will select a paper. It has 50% F1 using DeBERTa finetuned on data from the past year.

As a test, our upcoming WildChat arXiv has a 56% chance. Hopefully not a false positive🤞

🔗huggingface.co/spaces/yuntian…

Jack Hessel (@jmhessel)

You have an eval metric for a task that you're not 100% sure correlates with what you're really trying to get at. You see that the metric monotonically improves when you try 7b -> 13b -> 70b LLMs. This observation...

Maxwell Forbes (@maxforbes)

Unlike any sane person who gets a PhD in NLP right now, afterwards I made a game. I just released it in early access: talktomehuman.com. Talk to NPCs who talk back at you, and try to persuade your way out of sticky situations.

Jack Hessel (@jmhessel)

had to google this to keep up with llm training discourse (subsequently facepalmed because I probably should have figured the latin pattern out bi now)

Seungju Han (@SeungjuHan3)

🥰Excited to share that I will be joining AI2 (Allen Institute for AI) MOSAIC this September as a predoctoral young investigator!! So excited to continue working with amazing Yejin Choi, Nouha Dziri, Liwei Jiang, and Kavel Rao, and can't wait to collaborate with others!

Sasha Rush (@srush_nlp)

I like to think of myself as a researcher, but almost certainly the most valuable use of my time is writing US Visa letters.

Weijia Shi (@WeijiaShi2)

When augmented with retrieval, LMs sometimes overlook retrieved docs and hallucinate 🤖💭

To make LMs trust evidence more and hallucinate less, we introduce Context-Aware Decoding: a decoding algorithm improving LM's focus on input contexts

📖 arxiv.org/pdf/2305.14739… #NAACL2024

Jack Hessel (@jmhessel)

Cool paper from Eric Zelikman et al.:

Quiet-STaR induces chain-of-thought tokens during pretraining, and uses RL to encourage the model to generate "thoughts" that improve language modeling performance. A clever step beyond next-word prediction :-)

arxiv.org/abs/2403.09629
