Nan Jiang (@nanjiangwill) 's Twitter Profile
Nan Jiang

@nanjiangwill

research @UChicagoCS @Amazon

ID: 948753149602414592

Website: https://www.nanjiangwill.com/ | Joined: 04-01-2018 03:09:12

28 Tweets

65 Followers

248 Following

Marcus Min (@marcusjmin) 's Twitter Profile Photo


🚨 #GPT4 doesn't understand the code/specification written by itself!? 🚨

🥳 Check out our #ICLR2024 paper "Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain" 🥳 #LLM

Paper: arxiv.org/abs/2310.14053
Code: github.com/marcusm117/Ide…

[1/6]
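A minimal sketch of the self-consistency loop this kind of evaluation suggests, assuming hypothetical `generate_code`, `summarize_code`, and `run_tests` helpers rather than the official IdentityChain API: the model repeatedly translates its own output between natural language and code, and self-consistency holds only if every program in the chain behaves the same on the tests.

```python
# Illustrative NL <-> code self-consistency chain (hypothetical helper names;
# see github.com/marcusm117/Ide… for the real implementation).
def self_consistency_chain(model, nl_spec, run_tests, steps=3):
    spec = nl_spec
    programs = []
    for _ in range(steps):
        code = model.generate_code(spec)    # NL spec -> program (assumed API)
        programs.append(code)
        spec = model.summarize_code(code)   # program -> new NL spec (assumed API)
    # Self-consistent if every program in the chain passes/fails the same tests.
    reference = run_tests(programs[0])
    return all(run_tests(p) == reference for p in programs[1:])
```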
Wai Keen Vong (@wkvong) 's Twitter Profile Photo

1/ Today in Science, we train a neural net from scratch through the eyes and ears of one child. The model learns to map words to visual referents, showing how grounded language learning from just one child's perspective is possible with today's AI tools. science.org/doi/10.1126/sc…

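A rough sketch of the CLIP-style contrastive objective that this kind of word-to-referent grounding typically uses (an assumption on my part, not necessarily the paper's exact architecture): embeddings of video frames and the words spoken alongside them are pulled together, while mismatched pairs are pushed apart.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(frame_emb, word_emb, temperature=0.07):
    # frame_emb, word_emb: (batch, dim) L2-normalized embeddings of video
    # frames and the co-occurring words from the child's audio stream.
    logits = frame_emb @ word_emb.T / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy: each frame should match its own word and vice versa.
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
```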
Jason Hu (@onjas_6) 's Twitter Profile Photo

🚀 Introducing RouterBench, the first comprehensive benchmark for evaluating LLM routers! 🎉 In a collaboration between Martian and Prof. Kurt Keutzer at UC Berkeley, we've created the first holistic framework to assess LLM routing systems. 🧵1/8 To read more:

Xiuyu Li (@xiuyu_l) 's Twitter Profile Photo


Handling long context in LLMs is expensive, but can we cut the cost by learning them offline for a specific set/genre of documents?

Introducing LLoCO, our new technique that learns documents offline through context compression and in-domain finetuning using LoRA, which achieves…
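A minimal sketch of the in-domain LoRA finetuning half of this recipe using Hugging Face `peft` (the base model, rank, and target modules below are placeholder assumptions, and the context-compression step is omitted):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model, not necessarily LLoCO's
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters so only a small set of weights is finetuned in-domain.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()
# Training on the compressed document representations would follow here.
```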
Sasha Rush (@srush_nlp) 's Twitter Profile Photo

I teach a class where students code up an ML library from scratch in Python. Wenting showed me that a Claude Agent (with interactive unit test feedback and the spec) could solve it 100%. We thought it would be fun to scale this idea to every Python library in the world.
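A sketch of the generate-test-revise loop described here, with hypothetical `agent.generate` / `agent.revise` calls standing in for the actual Claude Agent and grading harness:

```python
def solve_with_test_feedback(agent, spec, run_tests, max_rounds=10):
    code = agent.generate(spec)                # draft implementation from the spec
    for _ in range(max_rounds):
        failures = run_tests(code)             # interactive unit-test feedback
        if not failures:
            return code                        # all tests pass
        code = agent.revise(code, failures)    # repair using the failing tests
    return code
```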

Nan Jiang (@nanjiangwill) 's Twitter Profile Photo

So... can agents now build a package from scratch? Test them on Commit0! This has been an amazing and fun project this summer! Huge thanks to Wenting and to everyone in the lab for their support and guidance! 🚀👏

Wenting Zhao (@wzhao_nlp) 's Twitter Profile Photo

Coding agents can debug their own outputs, but what if none of the fixes are correct? We overcome sparse rewards by making them continuous📈 Instead of having binary execution rewards, we introduce a learned verifier to measure how close the current solution is to a correct one📏
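A toy sketch of the difference between a binary execution reward and a continuous, verifier-scored one (the `verifier.score` interface is an illustrative assumption, not the paper's API):

```python
def binary_reward(solution, tests):
    # Sparse: 1.0 only if every test passes, otherwise 0.0.
    return 1.0 if all(test(solution) for test in tests) else 0.0

def verifier_reward(problem, solution, verifier):
    # Dense: a learned verifier estimates how close the candidate fix is to a
    # correct one, returning a score in [0, 1] even when some tests still fail.
    return verifier.score(problem, solution)
```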

Wenting Zhao (@wzhao_nlp) 's Twitter Profile Photo

Some personal news: I'll join UMass Amherst CS as an assistant professor in fall 2026. Until then, I'll postdoc at Meta NYC. Reasoning will continue to be my main interest, with a focus on data-centric approaches🤩 If you're also interested, apply to work with me (PhDs & a postdoc)!

Songlin Yang (@songlinyang4) 's Twitter Profile Photo

📢 (1/16) Introducing PaTH 🛣️ — a RoPE-free contextualized position encoding scheme, built for stronger state tracking, better extrapolation, and hardware-efficient training. PaTH outperforms RoPE across short and long language modeling benchmarks arxiv.org/abs/2505.16381

Christopher Manning (@chrmanning) 's Twitter Profile Photo

This paper by Ivan Lee & Taylor Berg-Kirkpatrick was great! Best thing I’ve seen at #COLM2025 so far! Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models openreview.net/forum?id=AFMGb…