Kerem Zaman (@keremzaman3)'s Twitter Profile
Kerem Zaman

@keremzaman3

PhD student @uncnlp | prev. BSc @UniBogazici | kal '18

ID: 1289141085588029445

https://keremzaman.com | 31-07-2020 10:10:01

326 Tweets

369 Followers

1.1K Following

Usman Anwar (@usmananwar391)

✨New AI Safety paper on CoT Monitorability✨ We use information theory to answer when Chain-of-Thought monitoring works, and how to make it better.

Niloofar (on faculty job market!) (@niloofar_mire)

Privacy in LLMs is not just Memorization! We reviewed 1322 papers (2016–25) across ML, NLP & SEC: 92% fixate on memorization/chat leaks. We map 5 urgent problems + a roadmap, to prevent surveillance, inference, aggregation and other negative outcomes.

Kerem Zaman (@keremzaman3)

shoutout to Ai2 Asta!! it’s incredibly good at surfacing the exact papers I’m looking for. the results are super precise and it deserves more attention!

Emre Can Acikgoz (@emrecanacikgoz)

Consider an LLM Agent that could train itself while testing. What if it could also sense its own weaknesses and use them at test-time training? 🚨New paper!🚨 We investigate a new test-time self-improvement (TT-SI) algorithm that enables agents to self-improve using only one

Nil Gurel (@nilgurelphd)

Excited for tomorrow! 🎙 Honored to join the AI x Flexible Biosensors panel at #SFTechWeek by a16z. Join us! 📅 Saturday, Oct 11 | 11:00 AM PT 🔗 RSVP: partiful.com/e/zRAxQyASrlwt… Tech Week

Kerem Zaman (@keremzaman3)

While Kumru is getting this much attention, let's clear up the hallucination issue a bit. Two of the scenarios in which LLM hallucinations occur are:
- the model gives a wrong answer about information it has never encountered before
- the model gives a wrong answer even though it knows the correct one

Niloofar (on faculty job market!) (@niloofar_mire)

LLMs are solving IMO problems, but can they grade them? In our new paper, we find they catch errors well but *fumble partial credit*. Our solution: agentic workflows that auto-generate rubrics and grade step-by-step, matching human consistency.

Jaap Jumelet (@jumeletj)

🌍 Introducing BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data! LLMs learn from vastly more data than humans ever experience. BabyLM challenges this paradigm by focusing on developmentally plausible data. We extend this effort to 45 new languages!

Michael Saxon (@m2saxon)

The viral new "Definition of AGI" paper has fake citations which do not exist. And it specifically TELLS you to read them! Proof: different articles present at the specified journal/volume/page number, and their titles exist nowhere on any searchable repository.

Niloofar (on faculty job market!) (@niloofar_mire)

I'm recruiting students for fall 2026 thru Language Technologies Institute | @CarnegieMellon & CMU Engineering & Public Policy, in:
1. Privacy & security of LLMs, coding, long horizon & embodied agents (robotics)
2. Tiny local LLMs
3. AI for scientific reasoning, esp. chemistry
4. Latent reasoning
5. anything YOU are passionate about!