So Yeon (Tiffany) Min on Industry Job Market (@soyeontiffmin)'s Twitter Profile
So Yeon (Tiffany) Min on Industry Job Market

@soyeontiffmin

5th-year PhD student at CMU MLD @mldcmu & Apple AI/ML fellowship recipient. Prev: Apple, Meta; B.S. and M.Eng. from @MITEECS. Advised by @rsalakhu and @ybisk.

ID: 1449394298294837255

Link: https://soyeonm.github.io/ · Joined: 16-10-2021 15:18:34

453 Tweets

755 Followers

240 Following

Pratyush Maini (@pratyushmaini)

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens🧑🏼‍🍳
- 3B LLMs beat 8B models🚀
- Pareto frontier for performance
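BeyondWeb's actual recipe isn't given in the tweet; the general pattern behind this line of synthetic pretraining data is rephrasing existing web documents with an instruction-tuned LLM. A minimal sketch under that assumption (the model id, prompt, and `rephrase_document` helper are illustrative, not DatologyAI's pipeline):

```python
# Sketch of source rephrasing for synthetic pretraining data.
# NOTE: model id, prompt, and helper names are illustrative assumptions;
# this is not BeyondWeb's actual pipeline.
from transformers import pipeline

rephraser = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

PROMPT = (
    "Rewrite the following web text as a clear, information-dense "
    "passage, preserving every fact:\n\n{doc}\n\nRewritten passage:"
)

def rephrase_document(doc: str) -> str:
    """Produce one synthetic variant of a raw web document."""
    out = rephraser(
        PROMPT.format(doc=doc),
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        return_full_text=False,  # keep only the generated continuation
    )
    return out[0]["generated_text"].strip()

raw_web_docs = ["click HERE!! best pasta recipe uses 500g flour 3 eggs ..."]
# One noisy document can yield several cleaner synthetic variants,
# stretching the effective pretraining token budget beyond the raw web.
synthetic_corpus = [rephrase_document(d) for d in raw_web_docs for _ in range(3)]
```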
Stephanie Milani (@steph_milani)

🌻 Excited to announce that I’ve moved to NYC to start as an Assistant Prof/Faculty Fellow at New York University!

If you’re in the area, reach out & let’s chat! Would love coffee & tea recs as well 🍵
Kashu Yamazaki (@kashu_yamazaki)

I'm honored to have been selected for Forbes JAPAN's "30 Under 30", the 30 people under 30 who will change the world!

I'll keep devoting myself to research. Let's make Japan the center of robotics again!!!!!

30 UNDER 30【Forbes JAPAN】 #u30fj
Paul Liang (@pliang279)

A bit late, but finally got around to posting the recorded and edited lecture videos for the **How to AI (Almost) Anything** course I taught at MIT in spring 2025.

YouTube playlist: youtube.com/watch?v=0MYt0u…

Course website and materials: mit-mi.github.io/how2ai-course/…

Today's AI can be
Chris Paxton (@chris_j_paxton)

Training a Whole-Body Control Foundation Model -- new work from my team at Agility Robotics.

A neural network for controlling our humanoid robots that is robust to disturbances, can handle heavy objects, and is a powerful platform for learning new whole-body skills.

Qian Huang (@qhwang3)

Yesterday was my last day at xAI. It’s been an incredible ride for the past year and a half, probably the most adventurous and fastest-growing period of my life so far. Best wishes for the team going forward. Looking forward to what’s next!
Sukjun (June) Hwang (@sukjun_hwang)

Coming from a computer vision background and now in sequence modeling, I’m often struck by how disconnected LLMs and vision feel. Our work, AUSM, treats video as language -- and it reveals a few blind spots we’ve overlooked.

Sachin Goyal (@goyalsachin007)

1/Excited to share the first in a series of my research updates on LLM pretraining🚀.
Our new work shows *distilled pretraining*—increasingly used to train deployable models—has trade-offs:
✅ Boosts test-time scaling
⚠️ Weakens in-context learning
✨ Needs tailored data curation
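The thread doesn't spell out the training objective; for readers unfamiliar with distilled pretraining, here is a minimal sketch of the usual setup, where the student is trained against a teacher's soft token distribution rather than only the one-hot next token (PyTorch; the mixing weight `alpha` and temperature `tau` are illustrative assumptions, not the paper's recipe):

```python
# Sketch of a standard distilled-pretraining objective: mix hard
# next-token cross-entropy with a KL term matching the teacher.
import torch
import torch.nn.functional as F

def distilled_pretraining_loss(student_logits: torch.Tensor,
                               teacher_logits: torch.Tensor,
                               targets: torch.Tensor,
                               alpha: float = 0.5,
                               tau: float = 1.0) -> torch.Tensor:
    """student_logits, teacher_logits: (batch, seq, vocab); targets: (batch, seq)."""
    # Hard loss: ordinary next-token prediction.
    ce = F.cross_entropy(student_logits.flatten(0, 1), targets.flatten())
    # Soft loss: KL divergence to the teacher's temperature-scaled distribution.
    log_p_student = F.log_softmax(student_logits / tau, dim=-1).flatten(0, 1)
    p_teacher = F.softmax(teacher_logits / tau, dim=-1).flatten(0, 1)
    kd = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2
    return (1 - alpha) * ce + alpha * kd

# Toy shapes: batch 2, sequence 4, vocab 10.
s, t = torch.randn(2, 4, 10), torch.randn(2, 4, 10)
y = torch.randint(0, 10, (2, 4))
print(distilled_pretraining_loss(s, t, y))
```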
Mahi Shafiullah 🏠🤖 (@notmahi)

Deeply honored to be a part of MIT Tech Review Innovators Under 35 List this year.
This recognition highlights our work on building robot intelligence that generalizes to unseen and unstructured human environments, executed with my friends & colleagues at NYU Courant & beyond.
Sanket Vaibhav Mehta, Ph.D. (@sanketvmehta)

Mind-blowing view of #AuroraBorealis on my flight from JFK→SFO. The wildest part? I edited this shot on my phone at 35,000 ft using freakin' Nano Banana 🍌 via Google Gemini App ♊️ and posting this from the same altitude right now thanks to Delta WiFi. The future is officially here ✨
Dylan Sam (@dylanjsam)

🚨Excited to introduce a major development in building safer language models: Safety Pretraining!

Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining.

🧵(1/n)
Pratyush Maini (@pratyushmaini)

We can’t keep slapping an alignment bandage 🩹 on harmful LLMs and calling it safety. Let’s fix the leak at the source by making LLMs safe by design.

Introducing Safety Pretraining 🛡️
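Neither thread includes the method's details; as a hedged illustration of what "safe by design" data curation can look like, here is a sketch that scores raw pretraining documents with an off-the-shelf toxicity classifier and filters before training (the checkpoint, threshold, and `is_safe` helper are assumptions, not the Safety Pretraining paper's recipe):

```python
# Sketch of pretraining-time safety filtering (illustrative only):
# score each document with an off-the-shelf toxicity classifier and
# keep only low-risk ones, so harmful text never enters training.
from transformers import pipeline

safety_clf = pipeline("text-classification", model="unitary/toxic-bert")

def is_safe(doc: str, threshold: float = 0.5) -> bool:
    """Assumed helper: flag a document as unsafe above a toxicity threshold."""
    pred = safety_clf(doc[:2000], truncation=True)[0]
    return not (pred["label"] == "toxic" and pred["score"] >= threshold)

corpus = [
    "A friendly guide to baking sourdough at home.",
    "You are a worthless idiot and everyone hates you.",
]
pretraining_corpus = [d for d in corpus if is_safe(d)]
print(pretraining_corpus)  # only the benign document survives
```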
Pratyush Maini (@pratyushmaini)

One thing years of memorization research has made clear: unlearning is fundamentally hard. Neurons are polysemantic & concepts are massively distributed. There’s no clean 'delete'. We need architectures that are "unlearnable by design".

Introducing Memorization Sinks 🛁⬇️
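The tweet names the idea but not the mechanism. One way to read "unlearnable by design" is to reserve a small per-document slice of MLP neurons that alone absorbs sequence-specific memorization, so deletion reduces to zeroing that slice. A minimal sketch under that assumption (the sizes, gating scheme, and `SinkMLP`/`unlearn` names are all illustrative, not the paper's architecture):

```python
# Toy MLP block with per-document "sink" neurons (illustrative only).
import torch
import torch.nn as nn

class SinkMLP(nn.Module):
    def __init__(self, d_model=256, d_shared=1024, n_docs=100, d_sink=8):
        super().__init__()
        self.shared = nn.Linear(d_model, d_shared)       # capacity shared by all data
        self.sink = nn.Linear(d_model, n_docs * d_sink)  # per-document capacity
        self.out = nn.Linear(d_shared + n_docs * d_sink, d_model)
        self.d_sink = d_sink

    def forward(self, x, doc_id):
        h_shared = torch.relu(self.shared(x))
        h_sink = torch.relu(self.sink(x))
        # Gate: only the current document's sink slice may fire, so
        # sequence-specific memorization concentrates there.
        mask = torch.zeros_like(h_sink)
        lo = doc_id * self.d_sink
        mask[..., lo:lo + self.d_sink] = 1.0
        return self.out(torch.cat([h_shared, h_sink * mask], dim=-1))

    @torch.no_grad()
    def unlearn(self, doc_id):
        # "Deleting" a document = zeroing its sink neurons;
        # no hunting for distributed weights.
        lo = doc_id * self.d_sink
        self.sink.weight[lo:lo + self.d_sink].zero_()
        self.sink.bias[lo:lo + self.d_sink].zero_()

mlp = SinkMLP()
x = torch.randn(4, 256)
y = mlp(x, doc_id=7)    # only doc 7's sink neurons participate
mlp.unlearn(doc_id=7)   # forget doc 7 by erasing its slice
```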

niki parmar (@nikiparmar09)

We just dropped Sonnet 4.5, the best coding model! Agents are truly here now -- autonomous task solving, complex multi-step tasks, parallel agents, combined with new tools and features, and a lot more… Check it out here 👇

Emily Byun (@yewonbyun_)

💡Can we trust synthetic data for statistical inference?

We show that synthetic data (e.g. LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moments of the synthetic data and those of the real data.
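The thread stops at the intuition. To make the interaction of moments concrete, here is a toy sketch in the spirit of prediction-powered inference (not the paper's estimator): a large synthetic sample supplies a low-variance first moment, and a small paired real sample corrects its bias.

```python
# Toy illustration (not the paper's method): combine the first moment of
# abundant-but-biased synthetic data with a bias correction estimated
# from scarce real data.
import numpy as np

rng = np.random.default_rng(0)
n_real, n_synth = 50, 50_000

real = rng.normal(1.0, 1.0, size=n_real)        # scarce real sample, true mean 1.0
synthetic = rng.normal(1.2, 1.0, size=n_synth)  # abundant synthetic sample, biased +0.2
# Synthetic stand-ins generated for the same units as the real sample:
paired_synth = real + 0.2 + rng.normal(0.0, 0.3, size=n_real)

bias = np.mean(real - paired_synth)   # estimates the generator's bias
theta_hat = np.mean(synthetic) + bias # debiased, low-variance synthetic mean

print(f"real-only estimate:       {np.mean(real):.3f}")
print(f"moment-combined estimate: {theta_hat:.3f}")
# Because paired_synth tracks the real points closely, the correction term
# has far lower variance than the real-only mean.
```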
Dylan Sam (@dylanjsam)

Very interesting insights into understanding when and why synthetic data (although imperfect and biased) can boost the performance of statistical inference!! 📈📈

Shrimai (@shrimai_)

Thank you Rohan Paul for highlighting our work! 💫 Front-Loading Reasoning shows that including reasoning data in pretraining is beneficial, does not lead to overfitting after SFT, & has a latent effect unlocked by SFT!

Paper: arxiv.org/abs/2510.03264
Blog:

Jimin Mun (@jiminmun_)

Amazing theoretical work on how to generate text-based synthetic data that will *actually* improve performance on statistical inference!! 🤩