Jean Kaddour (@jeankaddour)'s Twitter Profile
Jean Kaddour

@jeankaddour

pyspur.dev
PhD Student in ML @ UCL

ID: 1863618842

Link: https://www.jeankaddour.com/ · Joined: 14-09-2013 12:05:44

650 Tweets

1.1K Followers

2.2K Following

Robert Kirk (@_robertkirk):

Very cool work I've had the pleasure of advising Yi Xu on: investigating non-transitive preferences in LLM judges, showing how they can lead to inconsistent model rankings, and demonstrating how to fix this while maintaining computational efficiency!
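
For intuition, here is a minimal self-contained sketch (hypothetical numbers, not the paper's data) of how non-transitive pairwise judge preferences make any single ranking impossible:

```python
from itertools import permutations

# wins[(x, y)]: fraction of prompts where an LLM judge preferred model x over y.
# Hypothetical values with a cycle: A beats B, B beats C, but C beats A.
wins = {
    ("A", "B"): 0.7, ("B", "A"): 0.3,
    ("B", "C"): 0.6, ("C", "B"): 0.4,
    ("C", "A"): 0.8, ("A", "C"): 0.2,
}

def consistent(order):
    """A ranking is consistent if every higher-ranked model beats every lower-ranked one."""
    return all(wins[(order[i], order[j])] > 0.5
               for i in range(len(order)) for j in range(i + 1, len(order)))

# No total order satisfies all pairwise preferences, so the "winner" depends on
# which comparisons you run and in what order, i.e. rankings are inconsistent.
print([o for o in permutations("ABC") if consistent(o)])  # -> []
```

One standard way to recover a consistent ordering (whether or not it is the thread's fix) is to fit scalar strength scores to the pairwise win rates, e.g. with a Bradley-Terry model.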

Max Bartolo (@max_nlp):

I really enjoyed my Machine Learning Street Talk chat with Tim at #NeurIPS2024 about some of the research we've been doing on reasoning, robustness and human feedback. If you have an hour to spare and are interested in some semi-coherent thoughts revolving around AI robustness, it may be worth…

Antonin Schrab (@antoninschrab):

🎓PhD in Foundational AI done☑️
UCL Centre for Artificial Intelligence · Gatsby Computational Neuroscience Unit

Huge thanks to my supervisors Benjamin Guedj and Arthur Gretton, and to all collaborators!

Check out my article & summary table unifying all my PhD work!

A Unified View of Optimal Kernel Hypothesis Testing
arxiv.org/abs/2503.07084

hardmaru (@hardmaru):

Excited to release our technical report: “The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search”‼️ pub.sakana.ai/ai-scientist-v… The AI Scientist-v2 incorporates an “Agentic Tree Search” approach into the workflow, enabling deeper and more…
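
"Agentic Tree Search" is only named, not specified, in the tweet; in spirit it is a best-first search over a tree of candidate experiments, where an agent proposes children and an automated critic scores them. A generic sketch of that pattern (illustrative, not Sakana's implementation):

```python
import heapq

def best_first_tree_search(root, expand, score, budget=100):
    """Best-first search: repeatedly expand the most promising node.
    expand(node) -> child candidates (e.g. revised code or experiments);
    score(node)  -> float rating (e.g. from an executor or automated reviewer)."""
    frontier = [(-score(root), 0, root)]  # max-heap via negated scores
    best, best_score, n = root, score(root), 0
    while frontier and n < budget:
        neg_s, _, node = heapq.heappop(frontier)
        if -neg_s > best_score:
            best, best_score = node, -neg_s
        for child in expand(node):
            n += 1
            heapq.heappush(frontier, (-score(child), n, child))
    return best
```

Deeper exploration then falls out of the search budget rather than a fixed, linear pipeline.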

Terry Yue Zhuo (@terryyuezhuo):

DeepCoder-14B on BigCodeBench-Hard

Prefilling w/o Reasoning (ranked 81st/195):
22.3% Complete, 18.2% Instruct, 20.3% Average

No Prefilling, w/ Reasoning (ranked 87th/195):
22.3% Complete, 16.9% Instruct, 19.6% Average

o1 (reasoning=high) & o3 (reasoning=medium): 35.5% on…

Jude Wells (@_judewells):

I really like this ProGen3 paper because, contrary to the title, I think it actually shows there is relatively little to be gained from massively scaling protein language models. 1/n

Aryo Pradipta Gema (@aryopg):

MMLU-Redux just touched down at #NAACL2025! 🎉
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope 😅
If anyone's swinging by, give our research some love! Hit me up if you check it out! 👋

Oliver Stanley (@_oliverstanley):

Introducing Reasoning Gym: over 100 procedurally generated reasoning environments for evaluation and RLVR of language models. Generate virtually infinite training or evaluation data with fine-grained difficulty control and automatic verifiers. 🧵 1/
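
The core recipe behind environments like these is a seeded procedural generator paired with an automatic verifier. A minimal self-contained sketch of that pattern (illustrative; these function names are not Reasoning Gym's actual API):

```python
import random

def make_task(difficulty: int, seed: int) -> dict:
    """Procedurally generate one arithmetic task; difficulty scales operand count and size."""
    rng = random.Random(seed)
    terms = [rng.randint(1, 10 ** difficulty) for _ in range(difficulty + 1)]
    return {"question": f"What is {' + '.join(map(str, terms))}?",
            "answer": str(sum(terms))}

def verify(task: dict, model_output: str) -> bool:
    """Automatic verifier: exact match against the generated ground truth."""
    return model_output.strip() == task["answer"]

# Any (difficulty, seed) pair yields a fresh, checkable example: virtually
# infinite training/eval data, and verify() doubles as a reward signal for RLVR.
task = make_task(difficulty=2, seed=42)
print(task["question"], verify(task, "wrong"), verify(task, task["answer"]))
```
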
Shizhe Diao (@shizhediao):

Does RL truly expand a model’s reasoning🧠capabilities? Contrary to recent claims, the answer is yes—if you push RL training long enough!

Introducing ProRL 😎, a novel training recipe that scales RL to >2k steps, empowering the world’s leading 1.5B reasoning model💥and offering…

Zafir Stojanovski (@zafstojano):

Super excited to share 💪🧠Reasoning Gym! 🧵

We provide over 100 data generators and verifiers spanning several domains (algebra, arithmetic, code, geometry, logic, games) for training the next generation of reasoning models.

In essence, we can generate an infinite amount of…

Niccolò Ajroldi (@n_ajroldi):

New ICML paper! 🎉⚡️

Averaging checkpoints is a well-known method to accelerate training and improve performance of ML models. Can we see these benefits on tasks from a structured and diverse benchmark for optimization algorithms such as AlgoPerf? mlcommons.org/benchmarks/alg…
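
For readers unfamiliar with the method: checkpoint averaging in its simplest form takes a uniform average of parameters across checkpoints saved during training. A minimal PyTorch sketch (generic, not the paper's exact implementation):

```python
import torch

def average_checkpoints(paths):
    """Uniformly average parameters across checkpoints saved as state dicts."""
    avg, n = None, len(paths)
    for path in paths:
        state = torch.load(path, map_location="cpu")  # assumes torch.save(model.state_dict(), path)
        if avg is None:
            avg = {k: v.float().clone() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    return {k: v / n for k, v in avg.items()}

# Usage: model.load_state_dict(average_checkpoints(["ckpt_100.pt", "ckpt_200.pt"]))
```
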
Robert Lange (@roberttlange):

Text-to-LoRA: What if you no longer had to fine-tune your LLM for every single downstream task?

🚀 Stoked to share our work on instant LLM adaptation using meta-learned hypernetworks 📝 → 🔥

The idea is simple yet elegant: We text-condition a hypernetwork to output LoRA…
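
The truncated idea: a hypernetwork, conditioned on an embedding of the task description, directly emits the LoRA low-rank factors for a frozen base layer. A minimal sketch with assumed dimensions (illustrative, not the paper's architecture):

```python
import torch
import torch.nn as nn

class LoRAHyperNet(nn.Module):
    """Map a task-description embedding to LoRA factors A (r x d_in) and B (d_out x r)."""
    def __init__(self, emb_dim=768, d_in=1024, d_out=1024, rank=8):
        super().__init__()
        self.rank, self.d_in, self.d_out = rank, d_in, d_out
        self.net = nn.Sequential(
            nn.Linear(emb_dim, 512), nn.ReLU(),
            nn.Linear(512, rank * (d_in + d_out)),
        )

    def forward(self, task_emb):
        flat = self.net(task_emb)
        A = flat[: self.rank * self.d_in].view(self.rank, self.d_in)
        B = flat[self.rank * self.d_in:].view(self.d_out, self.rank)
        return A, B  # adapter update: delta_W = B @ A on the frozen weight

# One forward pass per task description yields adapter weights; no per-task fine-tuning loop.
A, B = LoRAHyperNet()(torch.randn(768))
print((B @ A).shape)  # torch.Size([1024, 1024])
```
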
Reuben Adams (@reubenjadams):

Thread on the Apple paper and LLM reasoning claims in general. LLMs often fall back to baseline patterns when they can’t reason things through: making up plausible answers when they don’t know, or providing generic but incorrect coding solutions when the situation is complex.