Prasann Singhal (@prasann_singhal)'s Twitter Profile
Prasann Singhal

@prasann_singhal

4th-year undergrad #NLProc Researcher at UT Austin, advised by @gregd_nlp

ID: 1349785093510934528

Link: https://prasanns.github.io/ · Joined: 14-01-2021 18:27:08

82 Tweets

274 Followers

722 Following

Ryo Kamoi (@ryokamoi)

We will present our survey on self-correction of LLMs (TACL) at #EMNLP2024 in person!
Oral: Nov 12 (Tue) 11:00- (Language Modeling 1)
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
arxiv.org/abs/2406.01297
x.com/RyoKamoi/statu…

Greg Durrett (@gregd_nlp)

I won't be at #EMNLP2024, but my students & collaborators are presenting:
🔍 Detecting factual errors from LLMs (Liyan Tang)
🛠️ Detect, critique, & refine pipeline (Manya Wadhwa, Lucy Zhao)
🏭 Synthetic data generation (Abhishek Divekar)
📄 Fact-checking (@anxruddy) at FEVER.
Links🧵
Manya Wadhwa (@manyawadhwa1)

I'll be presenting this work at #EMNLP2024 🌴 on Tuesday, 4-5:30pm, Poster Session C in Jasmine Hall!
Stop by or reach out if you are interested in tools for verification, making explanations useful, or evaluation in general!
Updated 📜 arxiv.org/abs/2407.02397
Lucy Zhao

Xi Ye (@xiye_nlp)

🔔 I'm recruiting multiple fully funded MSc/PhD students at the University of Alberta for Fall 2025! Join my lab working on NLP, especially reasoning and interpretability (see my website for more details about my research). Apply by December 15!

Zayne Sprague (@zaynesprague)

Interesting perspective, thanks for sharing! As one of the authors of the “CoT mainly helps on math/logic” paper, I agree with a lot of this, especially the connection to generator/validator gaps. One of our aims going into this project was to find datasets beyond math/logic

Jessy Li (@jessyjli)

🌟Job ad🌟 We (Greg Durrett, Matt Lease, and I) are hiring a postdoc fellow within the CosmicAI Institute to do galactic work with LLMs and generative AI! If you would like to push the frontiers of foundation models to help solve mysteries of the universe, please apply!

Jacob Springer (@jacspringer)

Training with more data = better LLMs, right? 🚨

False! Scaling language models by adding more pre-training data can decrease your performance after post-training!

Introducing "catastrophic overtraining." 🥁🧵+arXiv 👇

1/9
Tanishq Kumar (@tanishqkumar07)

trained a nanoGPT? feeling behind before o4-mini?

🚨🚨i'm open-sourcing beyond-nanoGPT, an internal codebase to help people go from LLM basics to research-level understanding. 🚨🚨

it contains thousands of lines of from-scratch, annotated pytorch implementing advanced
Sriram Padmanabhan (@srirampad05)

Are LMs sensitive to suspicious coincidences? Our paper finds that, when given access to knowledge of the hypothesis space, LMs can show sensitivity to such coincidences, displaying parallels with human inductive reasoning. w/Kanishka Misra 🌊, Kyle Mahowald, Eunsol Choi

Greg Durrett (@gregd_nlp)

Check out Ramya et al.'s work on understanding discourse similarities in LLM-generated text! We see this as an important step in quantifying the "sameyness" of LLM text, which we think will be a step towards fixing it!

Manya Wadhwa (@manyawadhwa1)

Evaluating language model responses on open-ended tasks is hard! 🤔 We introduce EvalAgent, a framework that identifies nuanced and diverse criteria 📋✍️. EvalAgent identifies 👩‍🏫🎓 expert advice on the web that implicitly addresses the user's prompt 🧵👇

Greg Durrett (@gregd_nlp)

Check out Manya's work on evaluation for open-ended tasks! The criteria from EvalAgent can be plugged into LLM-as-a-judge or used for refinement. Great tool with a ton of potential, and there's LOTS to do here for making LLMs better at writing!

Anirudh Khatry (@anirudhkhatry)

🚀Introducing CRUST-Bench, a dataset for C-to-Rust transpilation for full codebases 🛠️
A dataset of 100 real-world C repositories across various domains, each paired with:
🦀 Handwritten safe Rust interfaces.
🧪 Rust test cases to validate correctness.
🧵[1/6]
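
To make the benchmark's structure concrete, here is a rough sketch of the kind of pairing it describes. This is a hypothetical, made-up example (the function `sum_positive` is not drawn from CRUST-Bench itself): a C routine, the handwritten safe Rust interface a transpiler would target, and a Rust test that validates the transpiled code.

```rust
// Hypothetical illustration of a CRUST-Bench-style pairing (made-up example,
// not taken from the dataset): a C routine, a safe Rust interface to target,
// and a Rust test that validates the transpiled implementation.
//
// C original, for reference:
//   /* sums the strictly positive entries of xs */
//   long long sum_positive(const int *xs, size_t n);

/// Safe Rust interface the transpiled code is expected to implement.
pub fn sum_positive(xs: &[i32]) -> i64 {
    // One possible implementation a transpiler might emit.
    xs.iter().filter(|&&x| x > 0).map(|&x| x as i64).sum()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn sums_only_positive_entries() {
        assert_eq!(sum_positive(&[3, -1, 4, -1, 5]), 12);
    }

    #[test]
    fn empty_slice_sums_to_zero() {
        assert_eq!(sum_positive(&[]), 0);
    }
}
```

Running `cargo test` against interfaces and tests like these is what lets correctness be checked beyond compilation alone.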
Greg Durrett (@gregd_nlp)

New work led by Liyan Tang with a strong new model for chart understanding! Check out the blog post, model, and playground! Very fun to play around with Bespoke-MiniChart-7B and see what a 7B VLM can do!

Greg Durrett (@gregd_nlp)

Check out Anirudh's work on a new benchmark for C-to-Rust transpilation! 100 realistic-scale C projects, plus target Rust interfaces + Rust tests that let us validate the transpiled code beyond what prior benchmarks allow.

Mahesh Sathiamoorthy (@madiator)

Happy to announce Bespoke-Minichart-7B! This was a tough cookie to crack, and involved a lot of data curation and modeling work, but overall very happy with the results! Congrats to the team and especially to Liyan Tang for running so many experiments that helped us understand

thom lake (@thomlake)

Interested in how alignment changes the response distribution defined by LLMs? Come check out my poster at 2 PM at #NAACL2025 x.com/thomlake/statu…

Liyan Tang (@liyantang4)

Introducing ChartMuseum🖼️, testing visual reasoning with diverse real-world charts!

✍🏻Entirely human-written questions by 13 CS researchers
👀Emphasis on visual reasoning – hard to verbalize via text CoTs
📉Humans reach 93%, vs. 63% for Gemini-2.5-Pro & 38% for Qwen2.5-72B
Gaurav Ghosal (@gaurav_ghosal)

1/So much of privacy research is designing post-hoc methods to make models mem. free.
It’s time we turn that around with architectural changes. Excited to add Memorization Sinks to the transformer architecture this #ICML2025 to isolate memorization during LLM training🧵
Greg Durrett (@gregd_nlp)

📢I'm joining NYU (Courant CS + Center for Data Science) starting this fall!

I’m excited to connect with new NYU colleagues and keep working on LLM reasoning, reliability, coding, creativity, and more!

I’m also looking to build connections in the NYC area more broadly. Please