Grari Vincent (@grarivincent)'s Twitter Profile
Grari Vincent

@grarivincent

Fellow at @Stanford University & Researcher @trail_lab US | Responsible AI | Ph.D. from @Sorbonne_Univ_ @mlia_isir

ID: 1074707291058962432

Joined: 17-12-2018 16:46:00

39 Tweets

89 Followers

185 Following

Fei-Fei Li (@drfeifei)

After 3+ years, today is the day that my book “The Worlds I See” gets to see the world itself. It is a science memoir of the intertwining histories of me becoming an #AI scientist, and the making of the modern AI itself. All versions are now on Amazon a.co/d/fYKf74L 1/

Mickael Chen (@mickael_chen)

I completely missed the ICLR blogpost track. It's a great idea that could benefit from some more visibility. iclr-blogposts.github.io/2024/call/

Percy Liang (@percyliang)

In Dec 2022, we released HELM for evaluating language models. Now, we are releasing HEIM for text-to-image models, building on the HELM infrastructure. We're excited to do more in the multimodal space!

James Zou (@james_y_zou)

📢Excited to share our new NEJM AI paper studying clinical adoption of #AI using billions of insurance claims! onepub-media.nejmgroup-production.org/ai/media/b35da…? We found that most US medical AI claims come from a few AI models (eg diabetic retinopathy), which are more likely to be used near academic

Percy Liang (@percyliang)

The goal is simple: a robust, scalable, easy-to-use, and blazing fast endpoint for open models like Llama 2, Mistral, etc. The implementation is anything but. Super impressed with the team for making this happen! And we're not done yet...if you're interested, come talk to us.

Tatsunori Hashimoto (@tatsu_hashimoto)

I've been using a GPT-4 paper assistant that reads the daily arXiv feed and makes personalized recommendations in Slack. It's worked pretty well for me (today's paper demo tatsu-lab.github.io/gpt_paper_assi…). If this sounds helpful, you can set up your own bot here github.com/tatsu-lab/gpt_….

Bindu Reddy (@bindureddy)


Finally, we have a hallucination leaderboard! 😍😍

Key Takeaways

📍 Not surprisingly, GPT-4 is the lowest.

📍 Open source Llama 2 70 is pretty competitive!

📍 Google's models are the lowest. Again, this is not surprising given that the #1 reason Bard is not usable is its
Shao-Hua Sun (@shaohua0116)


#ICLR2024 ICLR 2026 score statistics (7304 papers):
mean: 5.10; max: 8.67; min: 1.00
>8.5: top 0.59-0.67%
8.0: top 0.68-1.75%
7.5: top 1.93-3.35%
7.0: top 5.20-8.17%
6.75: top 8.42-10.34%
6.5: top 12.13-15.39%
6.25: top 17.48-20.56%
6.0: top 21.02-28.29%
5.75: top 29.07-33.27%
Mickael Chen (@mickael_chen)

We are presenting our poster on links between GANs and Diffusion at Neurips. Don't be scared by the formalism, ⁦Jean-Yves Franceschi⁩ is awesome and will explain everything intuitively!

Thibault Laugel (@thibaultlaugel)

💡Group fairness algorithms may generalize poorly: although a model may seem "fair" globally and appear to provide the same opportunities to men and women overall, this may not hold when looking at a subpopulation of the dataset, e.g. people over 60.

James Zou (@james_y_zou)


Does #RAG/web search solve #LLM hallucinations?

We find that even with RAG, 45% of responses by #GPT4 to medical queries are not fully supported by retrieved URLs. The problem is much worse for GPT-4 w/o RAG, #Gemini and #Claude arxiv.org/pdf/2402.02008…

RAG ≠ faithful to source
Remi Cadene (@remicadene)


Important paper from Chelsea Finn and Sergey Levine lab: arxiv.org/abs/2402.19432…
The authors trained a unique diffusion policy on a real-world dataset of various robots and tasks (manipulation arms, wheeled robots, self-driving cars, robot dogs, drones).
Co-training gets 5%-20%
Percy Liang (@percyliang)

Levanter has the Sophia optimizer now, so you can train models ~2x faster. Together with Llama + Mistral, LoRA, TPU + GPU support, reproducibility, scalability, legibility, clean codebase, why not give Levanter a spin for your next LM training/fine-tuning run?

Andrej Karpathy (@karpathy)


# CUDA/C++ origins of Deep Learning

Fun fact: many people might have heard about the ImageNet / AlexNet moment of 2012, and the deep learning revolution it started.
en.wikipedia.org/wiki/AlexNet

What's maybe a bit less known is that the code backing this winning submission to the
Thibault Laugel (@thibaultlaugel)


Attending #ICLR2024? Join Grari Vincent and me this afternoon at 4:30pm for our poster session!

In our work, we propose a new method, ROAD, to build models that are not only fair globally, but locally as well!