João Maria Janeiro (@joaomjaneiro) Twitter Tweets • TwiCopy

Fabian Gloeckle

@fabiangloeckle

10 months ago

New generation of cross-lingual embedding models -- the same could work for code + math

🚨 RELEASE ALERT ‼️ github.com/facebookresear… THIS CHANGES EVERYTHING $META just dropped a game-changing codebase! Now everyone can do LLM research! 😱 🧵10 best things people are already building with lingua 🔥👇

Adrien Bardes

@adrienbardes

10 months ago

Job alert 🚨 My team AI at Meta is looking for a PhD intern to join us in 2025 in Paris. We are working on self-supervised learning from video, world modelling and JEPA ! Apply here or reach out directly: metacareers.com/jobs/168411027…

Mathurin Videau

@mathuvu_

10 months ago

Meta Lingua: a minimal, fast LLM codebase for training and inference. By researchers, for researchers. Easily hackable, still reproducible. Built-in efficiency, profiling (cpu, gpu and mem) and interpretability (automatic activation and gradient statistics) Joint work w/ Badr Youbi Idrissi

Tom Sander @NeurIPS

@rednastom

10 months ago

🔒Image watermarking is promising for digital content protection. But images often undergo many modifications—spliced or altered by AI. Today at AI at Meta, we released Watermark Anything that answers not only "where does the image come from," but "what part comes from where." 🧵

João Maria Janeiro

@joaomjaneiro

9 months ago

It was great being a part of this large project with so many amazing people! Reinventing the way text generation models work, moving away from the traditional token paradigm of LLMs. Check it out! Paper: arxiv.org/abs/2412.08821

Belen Alastruey

@b_alastruey

8 months ago

Happy to share our team's work on Large Concept Models (LCMs), a new approach for language modeling that goes beyond standard token-based LLMs by operating in a multilingual and multimodal embedding space. Check out the full paper! 📄: ai.meta.com/research/publi…

João Maria Janeiro

@joaomjaneiro

8 months ago

Amazing new work on audio generation from AI at Meta, specifically stem-level music editing through text and audio language modelling trained on watermarked data with easy detection!

Piotr Bojanowski

@p_bojanowski

7 months ago

🔥 The DINO team is looking for a PostDoc! 🔥 If you are about to graduate, and want to be part of what’s next for SSL, don’t hesitate to reach out! Link to job offer : metacareers.com/jobs/502476149…

João Maria Janeiro

@joaomjaneiro

7 months ago

Another great paper by Tim Darcet et al. (AI at Meta). They explore how to make masked image modeling work effectively, through a thorough exploration of every component of the pipeline: masking, architecture, loss, targets to predict... Check it out! Code is also accessible!

João Maria Janeiro

@joaomjaneiro

6 months ago

Another banger from Quentin Garrido et al. from AI at Meta. They explore how JEPA models (models that predict in latent space) have a better understanding of intuitive physics! Check it out:

Wassim (Wes) Bouaziz

@_vassim

5 months ago

Want to know if an ML model was trained on your dataset with 1 API call? See you in conferences 🙌 Excited to share that our paper Data Taggants for image data was accepted at ICLR 2025 🎉 Our follow-up on audio data was accepted at ICASSP 2025! 🎉 Check out the details below 👇

João Maria Janeiro

@joaomjaneiro

5 months ago

Want to know if your LLM can understand code well? Check out this new paper by Pierre Chambon from AI at Meta! It is a complex, non-saturated benchmark that will surely put LLMs to the test on their understanding of code!

João Maria Janeiro

@joaomjaneiro

5 months ago

Are you struggling to extract relevant features from your data? Check out this new work from Krunoslav Lehman Pavasovic from AI at Meta, where they propose a new training objective to learn more relevant features!

Kunhao Zheng @ ICLR 2025

@kunhaoz

4 months ago

🚨 Your RL only improves 𝗽𝗮𝘀𝘀@𝟭, not 𝗽𝗮𝘀𝘀@𝗸? 🚨 That’s not a bug — it’s a 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗼𝗳 𝘁𝗵𝗲 𝗼𝗯𝗷𝗲𝗰𝘁𝗶𝘃𝗲 you’re optimizing. You get what you optimize for. If you want better pass@k, you need to optimize for pass@k at training time. 🧵 How?
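For context on the two metrics named in this post, here is a minimal sketch of how pass@k is commonly estimated from n samples per problem, of which c are correct, using the standard unbiased combinatorial estimator. It is not the training method from the thread, which is not given here; the function name and the toy numbers are illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of which c are correct.

    Standard unbiased estimator: 1 - C(n - c, k) / C(n, k).
    Illustrative helper, not code from the thread.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts, n, k):
    """Average pass@k over problems, given per-problem correct counts."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Two hypothetical models, 10 samples on each of two problems (made-up counts).
# "Peaked" always solves problem A and never problem B.
# "Spread" solves each problem half of the time.
peaked = [10, 0]
spread = [5, 5]

peaked_p1 = mean_pass_at_k(peaked, n=10, k=1)
spread_p1 = mean_pass_at_k(spread, n=10, k=1)
peaked_p5 = mean_pass_at_k(peaked, n=10, k=5)
spread_p5 = mean_pass_at_k(spread, n=10, k=5)

print(f"pass@1: peaked={peaked_p1:.3f} spread={spread_p1:.3f}")
print(f"pass@5: peaked={peaked_p5:.3f} spread={spread_p5:.3f}")
```

In this toy setup both models have the same pass@1, yet only the second one benefits from drawing more samples, which shows how the two metrics can come apart.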

João Maria Janeiro

@joaomjaneiro

3 months ago

Another great work by the DINO team! If you want a CLIP-like model, check it out! Stop by their poster at CVPR!

João Maria Janeiro

@joaomjaneiro

a month ago

If you are attending ACL2025 join our oral presentation! Happening at 15:00 in room 1.86 🙂

João Maria Janeiro

@joaomjaneiro

a month ago

Are you struggling to improve the performance of your multilingual model? The reason might be the languages you are mixing! But how can you know which languages to mix to maximize performance? Check out our paper!

João Maria Janeiro

@joaomjaneiro

18 days ago

You thought DINOv2 was large scale? Check out the new DINOv3: larger scale, better performance, amazing features! Check it out:
