Pauline Luc (@paulineluc_)'s Twitter Profile
Pauline Luc

@paulineluc_

Research Scientist @ Google DeepMind - working on video models for science. Worked on video generation; self-supervised learning; VLMs - 🦩; point tracking.

ID: 552588592

Joined: 13-04-2012 10:24:49

27 Tweets

502 Followers

505 Following

Roman Ring (@inoryy)

A group of Flamingos is called “flamboyance” which could be an apt description for the family of vision-language models I’m thrilled to see out in the wild! I believe using large pre-trained models in creative ways will be key and hope our work is a step in the right direction.

Yana Hasson (@yanahasson)

A lot happened in the last year! I defended my PhD and joined @DeepMind, where I worked with an incredible team on Flamingo 🦩, a visual language model. Flamingos can fly, they can dance, and this one writes pretty well too!

Conor Durkan (@conormdurkan)

Chatting with Flamingo about images is definitely the most organic experience I’ve had with an ML model. The ability to readily describe output from e.g. DALL-E 2 might be the closest we’ve come to two independently-trained large-scale models having a conversation 👀

Arthur Mensch (@arthurmensch)

10B extra parameters for adaptation and visual conditioning, new cross-modality data, and a lot of love make Chinchilla able to see!

Antoine Miech (@antoine77340)

Finally able to share what I have been working on this year! 🦩 TL;DR: we took our best LM (Chinchilla), froze it, added new visual layers to it, and trained 🦩 on full webpages with images instead of just image-text pairs. Check out the visual dialogue examples from the paper!
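
The recipe described here, freezing a strong language model and adding new trainable visual layers, can be sketched in a few lines. Below is a minimal, illustrative PyTorch sketch of a tanh-gated cross-attention adapter in that spirit; module names and sizes are hypothetical, not DeepMind's actual code.

```python
# Illustrative sketch: a frozen LM block plus a new, trainable,
# tanh-gated cross-attention layer that attends to visual features.
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Gate initialised at zero: at step 0 the frozen LM's behaviour is
        # exactly preserved, and the visual pathway fades in during training.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, text_hidden, visual_tokens):
        # Text tokens (queries) attend to visual tokens (keys/values).
        attended, _ = self.attn(text_hidden, visual_tokens, visual_tokens)
        return text_hidden + torch.tanh(self.gate) * attended

# Frozen LM block + trainable adapter: only the new layers get gradients.
lm_block = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
for p in lm_block.parameters():
    p.requires_grad = False
adapter = GatedCrossAttention(dim=512)

text = torch.randn(2, 16, 512)    # (batch, text tokens, dim)
visual = torch.randn(2, 64, 512)  # (batch, visual tokens, dim)
out = lm_block(adapter(text, visual))
print(out.shape)  # torch.Size([2, 16, 512])
```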

Google DeepMind (@googledeepmind)

In case you missed it... Flamingo 🦩, a new SOTA visual language model. Read more below ⬇️ Paper: dpmd.ai/dm-flamingo-pa… Blog: dpmd.ai/dm-flamingo

Antoine Yang (@antoineyang2)

Introducing Vid2Seq, a new visual language model for dense video captioning. To appear at #CVPR2023. Work done at Google w/ Arsha Nagrani, P.H. Seo, Antoine Miech, Jordi Pont-Tuset, I. Laptev, J. Sivic, Cordelia Schmid. Page: antoyang.github.io/vid2seq.html Paper: arxiv.org/abs/2302.14115 🧵/5
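
For context, Vid2Seq casts dense captioning as predicting a single token sequence in which event timestamps are quantized into special time tokens interleaved with caption text. A rough sketch of that serialization, with an illustrative bin count and token format rather than the paper's exact configuration:

```python
# Hedged sketch of Vid2Seq-style time tokens: timestamps are quantized
# into discrete bins and emitted as special tokens around each caption.
def time_token(t_seconds: float, video_len: float, num_bins: int = 100) -> str:
    """Quantize a timestamp into one of `num_bins` discrete time tokens."""
    bin_id = min(int(t_seconds / video_len * num_bins), num_bins - 1)
    return f"<time_{bin_id}>"

def serialize_events(events, video_len):
    """events: list of (start_s, end_s, caption) -> one target string."""
    parts = []
    for start, end, caption in events:
        parts += [time_token(start, video_len), time_token(end, video_len), caption]
    return " ".join(parts)

print(serialize_events([(2.0, 5.5, "a flamingo lands"),
                        (10.0, 14.0, "it starts to dance")], video_len=20.0))
# <time_10> <time_27> a flamingo lands <time_50> <time_70> it starts to dance
```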

Anas Awadalla (@anas_awadalla)

🦩 Introducing OpenFlamingo! A framework for training and evaluating Large Multimodal Models (LMMs) capable of processing images and text. More details below (including a multimodal LLaMA model!)⬇️ Blog: laion.ai/blog/open-flam… Demo: 7164d2142d11.ngrok.app
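
Loading a model with the framework looks roughly like the snippet below, reconstructed from memory of the project's README; argument names and checkpoint ids vary across releases, so treat all of them as assumptions and defer to the GitHub repo.

```python
# Hedged sketch of loading an OpenFlamingo model. The function name and
# kwargs follow the project's README from memory; verify against the repo.
from open_flamingo import create_model_and_transforms

model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",       # frozen vision backbone
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="anas-awadalla/mpt-1b-redpajama-200b",  # frozen LM (assumed id)
    tokenizer_path="anas-awadalla/mpt-1b-redpajama-200b",
    cross_attn_every_n_layers=1,               # new trainable layers
)
```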

@emilymbender.bsky.social (@emilymbender)

Okay, so that AI letter signed by lots of AI researchers calling for a "Pause [on] Giant AI Experiments"? It's just dripping with #Aihype. Here's a quick rundown. >>

Aran Komatsuzaki (@arankomatsuzaki)

Demystifying CLIP Data
Reveals CLIP’s data curation approach and makes it open to the community.
repo: github.com/facebookresear…
abs: arxiv.org/abs/2309.16671

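The curation recipe the paper reveals is, at its core, simple: keep image-text pairs whose captions match entries from a large metadata list, then flatten the distribution by capping each entry's contribution (the paper uses roughly 500K entries and a cap around 20K pairs per entry, if memory serves). A simplified, illustrative Python sketch; the names and matching rule are assumptions, not the paper's code:

```python
# Hedged sketch of CLIP-style data curation: substring-match captions
# against metadata entries, then cap each entry's contribution so head
# entries are subsampled and tail entries are kept whole.
import random
from collections import defaultdict

def curate(pairs, metadata, cap=20_000, seed=0):
    """pairs: list of (image_id, caption); metadata: list of query strings."""
    rng = random.Random(seed)
    per_entry = defaultdict(list)
    for pair in pairs:
        caption = pair[1].lower()
        for entry in metadata:  # naive O(pairs x metadata) matching, for clarity
            if entry in caption:
                per_entry[entry].append(pair)
    kept = set()
    for entry, matched in per_entry.items():
        if len(matched) > cap:
            matched = rng.sample(matched, cap)  # subsample over-represented entries
        kept.update(matched)
    return list(kept)
```
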
Demis Hassabis (@demishassabis)

Thrilled to share #Lyria, the world's most sophisticated AI music generation system. From just a text prompt, Lyria produces compelling music & vocals. Also: building new Music AI tools for artists to amplify creativity, in partnership w/ YT & the music industry. deepmind.google/discover/blog/…

Alex Sablayrolles (@alexsablay)

Our latest release from Mistral AI: Mixtral 8x7B, a mixture of experts
- performance of GPT-3.5
- inference cost of a 12B model
- context length of 32K
- speaks English, French, Italian, German and Spanish
Blog post: mistral.ai/news/mixtral-o…

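The "GPT-3.5 performance at 12B inference cost" trade-off comes from sparse mixture-of-experts routing: each token is processed by only 2 of 8 expert MLPs, so only a fraction of the weights run per token. A minimal, illustrative PyTorch sketch of top-2 routing (sizes are toy, not Mixtral's):

```python
# Illustrative top-2 mixture-of-experts layer: a router scores experts
# per token, the top 2 run, and their outputs are mixed by softmax weight.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, dim=512, hidden=2048, num_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        logits = self.router(x)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalise over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):           # run only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

moe = Top2MoE()
print(moe(torch.randn(10, 512)).shape)  # torch.Size([10, 512])
```
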
Pierre Stock (@pierrestock)

Mixtral 8x7B is here, only 11 weeks after Mistral 7B. It outperforms Llama 2 70B and GPT-3.5 on most benchmarks, at the inference cost of a 12B dense model, with a 32k-token context size.

Thomas Mesnard (@mesnard_thomas)

Thrilled to present to you Gemma! A family of lightweight, state-of-the-art and open models by Google DeepMind. We provide both pre-trained and fine-tuned checkpoints for easy tuning, responsible development, and community-driven innovation! More info at ai.google.dev/gemma

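For reference, the released checkpoints can be loaded with Hugging Face transformers roughly as below; the "google/gemma-2b" id matches the initial release, and accepting the license on the Hub is assumed.

```python
# Hedged sketch of loading a Gemma checkpoint via Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

inputs = tokenizer("Flamingos can fly, and", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
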
Carl Doersch (@carldoersch)

We present a new SOTA on point tracking, via self-supervised training on real, unlabeled videos! BootsTAPIR achieves 67.4% AJ on TAP-Vid DAVIS with minimal architecture changes, and tracks 10K points on a 50-frame video in 6 secs. PyTorch & JAX implementations on GitHub. bootstap.github.io
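
The AJ (Average Jaccard) number quoted above is the standard TAP-Vid point-tracking metric. A simplified sketch from memory of its definition, the Jaccard of correctly tracked visible points averaged over pixel thresholds, with toy data:

```python
# Hedged sketch of Average Jaccard: a point counts as a true positive when
# it is visible, predicted visible, and within the pixel threshold; scores
# are averaged over thresholds {1, 2, 4, 8, 16}. Simplified for illustration.
import numpy as np

def average_jaccard(gt_xy, gt_vis, pred_xy, pred_vis,
                    thresholds=(1, 2, 4, 8, 16)):
    """All arrays are (frames, points, ...); xy in pixels, vis boolean."""
    dist = np.linalg.norm(gt_xy - pred_xy, axis=-1)
    scores = []
    for thr in thresholds:
        close = dist <= thr
        tp = (gt_vis & pred_vis & close).sum()
        fp = (pred_vis & ~(gt_vis & close)).sum()  # predicted visible, wrong
        fn = (gt_vis & ~(pred_vis & close)).sum()  # missed visible points
        scores.append(tp / max(tp + fp + fn, 1))
    return float(np.mean(scores))

# Toy example: 50 frames, 3 points, perfect predictions -> AJ = 1.0
gt_xy = np.random.rand(50, 3, 2) * 256
gt_vis = np.ones((50, 3), dtype=bool)
print(average_jaccard(gt_xy, gt_vis, gt_xy.copy(), gt_vis.copy()))
```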

Skanda (@skandakoppula)

We're excited to release TAPVid-3D: an evaluation benchmark of 4,000+ real-world videos and 2.1 million metric 3D point trajectories, for the task of Tracking Any Point in 3D!

Pauline Luc (@paulineluc_)

So pleased and proud to share with you what our team has been up to, on an ambitious journey to build a video foundation model for scientific domains! ✨ 🚀 🎞️ 🧪 #ICCV2025 #AI4Science

joao carreira (@joaocarreira)

Scaling 4D Representations – new preprint arxiv.org/abs/2412.15212 and models now available github.com/google-deepmin…