Mimansa Jaiswal (@mimansaj)'s Twitter Profile
Mimansa Jaiswal

@mimansaj

Robustness, Data & Annotations, Evaluation & Interpretability in LLMs

ID: 1682735822

Link: http://mimansajaiswal.github.io
Joined: 19-08-2013 08:37:34

4.4K Tweets

3.3K Followers

4.4K Following

Alicia Guo (@upcycledwords)

earlier this summer I published my first paper of my phd! ✨ a qualitative study on how creative writers are using AI in their writing and what their strategies were in order to align with their personal writing values

xuan (ɕɥɛn / sh-yen) (@xuanalogue)

tried co-writing a code tutorial notebook w a "mid-tier" LLM yesterday (sonnet 4) and couldn't shake the feeling that I was talking to an idiot god that's incredibly fast & knowledgeable but also gets incredibly basic things wrong

Vijay V. (@vijaytarian)

Closed-source models like Kimi K2 are post-trained using rubrics, but these aren't available open-source for researchers. To change this, we're releasing the WildChecklists dataset (huggingface.co/datasets/viswa…) and code (github.com/viswavi/RLCF) from our paper on checklist-based RL!

Mimansa Jaiswal (@mimansaj)

There is no way Yutori is the only "always on" scouting agent, right? What are some others I am missing? * Gemini's scheduled actions are in 99% of the cases pretty useless.

Woosuk Kwon (@woosuk_k)

At Thinking Machines, our work includes collaborating with the broader research community. Today we are excited to share that we are building a vLLM team at Thinking Machines to advance open-source vLLM and serve frontier models. If you are interested, please DM me or Barret Zoph!

Cursor (@cursor_ai)

We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.

Yossi Matias (@ymatias)

Today we are releasing a research experiment, Learn Your Way, which explores how generative AI can transform static textbook materials into an engaging experience for every student. Our efficacy study showed that students using Learn Your Way scored higher on a long-term recall

Sophie Xhonneux (@sophiexhon11060)

📣Call for Blog Posts at #ICLR2026 ICLR 2026 Following the success of the past iterations, we are opening the Call for Blog Posts 2026! iclr-blogposts.github.io/2026/about/#a-… Please retweet!

Simran Arora (@simran_s_arora)

Very excited to share that I've finished my phd @stanford and will be joining @caltech’s cms department as an assistant professor. Looking forward to working with students and colleagues on ml systems! Grateful to my amazing advisor and labmates @hazyresearch for the best time

awkwardgoat3🧣 (@divijabhasin)

Yes, I’m married. No, I don’t wear mangalsutra or sindoor. No, that’s not controversial. It’s peaceful ☺️ instagram.com/reel/DO2vpqhkm…

Mimansa Jaiswal (@mimansaj)

New short post where I talk about realizing I still sometimes need to double-check basic math from LLMs (even after using them heavily for text entangled numeric operations all year 😭). Now back to writing my deep research experience post. Link in the next tweet.

Vimal Thilak🦉🐒 (@aggieinca)

🚨 Machine Learning Research Internship opportunity in Apple MLR! We are looking for a PhD research intern with a strong interest in world modeling, planning or learning video representations for planning and/or reasoning. If interested, apply by sending an email to me at

Yoav Gur Arieh (@guryoav)

🧠 To reason over text and track entities, we find that language models use three types of 'pointers'! They were thought to rely only on a positional one—but when many entities appear, that system breaks down. Our new paper shows what these pointers are and how they interact 👇

Logan Kilpatrick (@officiallogank)

Introducing grounding with Google Maps in the Gemini API, bringing data about 250 million places and Gemini together to create all new experiences 🗺️! So powerful to connect things like maps + search together in a single experience : )