Bhuwan Dhingra (@bhuwandhingra)'s Twitter Profile
Bhuwan Dhingra

@bhuwandhingra

Natural Language Processing / Machine Learning research. Assistant Professor @dukecompsci, @duke_nlp; Research Scientist @Apple

ID: 2490439280

Link: https://users.cs.duke.edu/~bdhingra/

Joined: 11-05-2014 21:45:36

103 Tweets

1.1K Followers

310 Following

Bhuwan Dhingra (@bhuwandhingra)

Happy to share my first paper at Apple, led by Roy Xie. TL;DR: Interleaving <think> and <answer> blocks during reasoning reduces the time-to-first-token *and* improves accuracy.

Together AI (@togethercompute)

🚀 Introducing Mixture-of-Agents Alignment (MoAA), a new method to "distill" the collective intelligence of open-source LLMs into a single, efficient model. MoAA outperforms GPT-4o as a teacher, boosting smaller models like Llama3.1-8B to rival models 10x their size!

Bhuwan Dhingra (@bhuwandhingra)

Backtracking allows reasoning models to go back and correct mistakes in their solution attempts. What sorts of tasks benefit from this behavior? And can we boost it using SFT? Hongyi James Cai's new preprint answers these questions and more. Check it out!

Tian Li (@litian0331)

Want to train LLMs at lower cost? We introduce BiClip, a clipping-based method that "approximates" adaptive optimizers without maintaining expensive preconditioners.

Ruoming Pang (@ruomingpang)

At WWDC we are introducing a new generation of LLMs developed to enhance the Apple Intelligence features. We are also introducing the new Foundation Models framework, which gives app developers direct access to the on-device foundation language model. machinelearning.apple.com/research/apple…

Bhuwan Dhingra (@bhuwandhingra)

The technical report for the second generation of Apple Foundation Models is out. It's been a great year contributing to this effort and being part of an amazing team!