Yao Fu (@francis_yao_)'s Twitter Profile
Yao Fu

@francis_yao_

Research Scientist at @GoogleDeepMind studying language models and complex reasoning

ID: 741313237287768064

Link: https://yaofu.notion.site/
Joined: 10-06-2016 16:57:03

844 Tweets

15.15K Followers

1.1K Following

Alex Cheema - e/acc (@ac_crypto)'s Twitter Profile Photo

🥇 I won the tiny corp x AGI House SF hackathon solo (+ support of George Hotz 🌑)

Hacked together a distributed AI training cluster of @apple Mac Minis with Thunderbolt 4 interconnect (40 Gbps) for linear scaling

Detailed post coming soon™ Exo Labs Emad
rahul (@rahulgs)'s Twitter Profile Photo

🤔 extracted the full ~5,000-token Claude 3.5 Sonnet claude.ai system prompt: gist.github.com/1rgs/b31a1de86…

this is a great template for function calling / tool use 

notes: 

artifacts: seem to be a fully in-context abstraction, model not finetuned for it

allowed types:
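
As a sketch of what such a tool-use template encodes, here is a minimal, invented function-calling prompt in Python; the tool schema and wording are hypothetical, not the extracted claude.ai prompt itself:

```python
# Hypothetical sketch of the function-calling pattern such a system prompt
# encodes; the tool schema below is invented for illustration.
import json

TOOLS = [{
    "name": "get_weather",  # hypothetical example tool
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

SYSTEM_PROMPT = f"""You have access to these tools:

{json.dumps(TOOLS, indent=2)}

To call a tool, reply with a single JSON object like
{{"tool": "<name>", "input": {{"city": "..."}}}};
otherwise answer the user directly."""
```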
Junxian He (@junxian_he)'s Twitter Profile Photo

Rejection sampling is commonly used to synthesize SOTA SFT data for math reasoning, which filters out the synthetic responses with incorrect answers. However, have you realized that this process may implicitly introduce bias towards easy samples? — Easy queries become more and
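
A minimal sketch of the rejection-sampling pipeline the tweet critiques, assuming hypothetical `generate` (sampler) and `check_answer` (verifier) helpers:

```python
# Minimal sketch of rejection sampling for math SFT data; `generate` and
# `check_answer` are hypothetical stand-ins for a sampler and a verifier.
def rejection_sample(query, gold_answer, generate, check_answer, k=16):
    """Sample k solutions and keep only those with the correct final answer."""
    kept = [r for r in (generate(query) for _ in range(k))
            if check_answer(r, gold_answer)]
    # The bias the tweet points at: easy queries pass the filter often and
    # hard queries rarely, so the kept dataset skews toward easy samples.
    return kept
```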

Yao Fu (@francis_yao_)'s Twitter Profile Photo

I'll be attending ICML Conference next week to present my long context data engineering paper! Come discuss long-context architecture, data, and inference efficiency, and I'll argue why long context is much better than RAG arxiv.org/abs/2402.10171

Shayne Longpre (@shayneredford)'s Twitter Profile Photo

✨New Preprint ✨ How are shifting norms on the web impacting AI?

We find:

📉 A rapid decline in the consenting data commons (the web)

⚖️ Differing access to data by company due to crawling restrictions (e.g. 🔻26% OpenAI, 🔻13% Anthropic)

⛔️ Robots.txt preference protocols
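
As background, a small sketch of how a crawler consults robots.txt using Python's standard library; the site URL and agent names are illustrative:

```python
# Sketch of honoring robots.txt with the standard-library parser; the URL
# and user agents here are illustrative.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# Sites increasingly gate AI crawlers specifically, so permission can differ
# by user agent — the differential access the preprint measures.
for agent in ["GPTBot", "ClaudeBot", "Googlebot"]:
    print(agent, rp.can_fetch(agent, "https://example.com/some/page"))
```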
Jonas Geiping (@jonasgeiping)'s Twitter Profile Photo

Modern LLMs have large vocab sizes and long seq lengths, which lead to an annoying peak in memory due to logit activations...

... so, I wasted some time last month writing a fused Triton kernel to do nn.Linear + nn.CrossEntropyLoss without a memory peak
⬇️
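
Not the fused Triton kernel itself, but a plain-PyTorch sketch of the underlying memory trick: chunk the projection and loss so the full (seq_len, vocab) logit tensor is never materialized at once:

```python
import torch
import torch.nn.functional as F

def chunked_linear_ce(hidden, weight, targets, chunk=1024):
    """hidden: (N, d), weight: (vocab, d), targets: (N,) -> mean CE loss."""
    # Caveat: autograd still saves each chunk's logits for backward; the
    # real fused kernel avoids that peak too (torch.utils.checkpoint per
    # chunk would be the plain-PyTorch workaround).
    total = hidden.new_zeros(())
    for i in range(0, hidden.size(0), chunk):
        logits = hidden[i:i + chunk] @ weight.T   # only (chunk, vocab) alive
        total = total + F.cross_entropy(logits, targets[i:i + chunk],
                                        reduction="sum")
    return total / targets.numel()
```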
Junxian He (@junxian_he)'s Twitter Profile Photo

Great to see DART-Math featured in NuminaMath's report, outperforming DeepseekMath-RL by 5 points on OOD tests even though our model is trained only on synthetic data from DeepseekMath-RL itself.

Yao Fu (@francis_yao_)'s Twitter Profile Photo

Albert is one of the most thoughtful researchers and was there in the early days of AI4Math, particularly the early days of autoformalization, when nobody (including myself 😅) understood its importance (if you're still in doubt, check AlphaProof). So huge congrats on

Nathan Lambert (@natolambert)'s Twitter Profile Photo

Ai2 released OLMoE today. It's our best model to date.
- 1.3B active, 6.9B total parameters, 64 experts per layer
- Trained on 5T tokens from DCLM baseline + Dolma
- New preview of Tulu 3 post training recipe
- Fully open source
- Actually SOTA for ~1B active params

I'm most
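
A toy sketch of the sparse-MoE routing pattern those bullets describe, where only k of the experts run per token (hence the gap between active and total parameters); dimensions, k, and expert shape here are illustrative, not the OLMoE config:

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Toy top-k mixture-of-experts layer; only k experts run per token."""
    def __init__(self, d_model=64, n_experts=64, k=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.k = k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        w, idx = scores.topk(self.k, dim=-1)   # route each token to k experts
        w = w.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in idx[:, slot].unique().tolist():
                m = idx[:, slot] == e          # tokens routed to expert e
                out[m] += w[m, slot].unsqueeze(1) * self.experts[e](x[m])
        return out

moe = TinyMoE()
y = moe(torch.randn(16, 64))  # 16 tokens in, 16 tokens out
```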
Xiang Yue (@xiangyue96)'s Twitter Profile Photo

🚀 Introducing MMMU-Pro: A more robust version of MMMU
arxiv.org/pdf/2409.02813…
After launching MMMU, we received valuable feedback from the community:

1️⃣ Some questions were answerable without even seeing the images.
2️⃣ Models didn’t always "know" the answer but found shortcuts
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

We’re presenting AlphaProteo: an AI system for designing novel proteins that bind more successfully to target molecules. 🧬 It could help scientists better understand how biological systems function, save time in research, advance drug design and more. 🧵 dpmd.ai/3XuMqbX

Yao Fu (@francis_yao_)'s Twitter Profile Photo

Looking back, the largest problem of my own PhD journey is reading too many papers and writing too little code 😮‍💨

CLS (@chengleisi)'s Twitter Profile Photo

Automating AI research is exciting! But can LLMs actually produce novel, expert-level research ideas?

After a year-long study, we obtained the first statistically significant conclusion: LLM-generated ideas are more novel than ideas written by expert human researchers.
Yuntian Deng (@yuntiandeng)'s Twitter Profile Photo

What do people use ChatGPT for? We built WildVis, an interactive tool to visualize the embeddings of million-scale chat datasets like WildChat.

Work done with Wenting Zhao, Jack Hessel, Sean (Xiang) Ren, Claire Cardie, and Yejin Choi.

📝huggingface.co/papers/2409.03…
🔗wildvisualizer.com/embeddings/eng…

1/7
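
A generic sketch of the embed-then-project recipe behind tools like WildVis; the embedding model and UMAP settings are assumptions, not necessarily the paper's setup:

```python
# Embed chats, then project to 2D for a browsable map of conversation topics;
# model choice and UMAP parameters are illustrative.
from sentence_transformers import SentenceTransformer
import umap

# Stand-in corpus; in practice this would be millions of chat transcripts.
texts = [f"example chat transcript {i}" for i in range(1000)]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts)                 # (n, 384) vectors

# Each conversation becomes a point in an interactive 2D scatter plot.
coords = umap.UMAP(n_components=2).fit_transform(embeddings)
```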