Lucio Dery Jnr Mwinm (@derylucio) Twitter Tweets • TwiCopy

Victor Akinwande

a year ago

For large-scale causal discovery, there's no need to use NOTEARS for its speed. Consider using LiNGAM. We've parallelized it, achieving a 32x speed-up on GPUs.
NOTEARS:
Scalable: ✅
Identifiability guarantees: ❌
AcceleratedLiNGAM:
Scalable: ✅
Identifiability guarantees: ✅

41 likes, 2 replies, 10 retweets

Junhong Shen

@junhongshen1

a year ago

Introducing Unified PDE Solvers (UPS), a step towards efficiently building foundation models for PDE solvers (arxiv.org/abs/2403.07187)! Starting from a pretrained LM, UPS tackles diverse spatiotemporal PDEs with SOTA accuracy, using ~20x less data and a single A6000! 🧵[1/x]

102 likes, 3 replies, 22 retweets

Arthur Douillard

@ar_douillard

a year ago

I'm super excited to release DiPaCo, a new kind of mixture of experts, that can scale engineering-wise to data centers across the entire world! A few words about it in this thread 🧵

295 likes, 12 replies, 49 retweets

Simran Khanuja

@simi_97k

a year ago

Ever noticed how Pixar adapts movies for international markets? The beloved newscaster in Zootopia is a jaguar in Brazil, a panda in China, a koala in Australia … While machine translation (MT) has only dealt with language in speech/text thus far, we extend the scope of MT to

248 likes, 11 replies, 39 retweets

Lucio Dery Jnr Mwinm

@derylucio

a year ago

Check out our work on adapting multitask learning as a tool against worst-case group error. Our modified MTL approach (main task + pre-training auxiliary objective + L1 embedding reg) is competitive against bespoke DRO (Distributionally Robust Optimization) methods

16 likes, 0 replies, 2 retweets

Jonas Pfeiffer

@pfeiffjo

a year ago

Are you interested in working with us on modularity and continual learning? Consider applying to our open full-time RE position in NYC: boards.greenhouse.io/deepmind/jobs/…

75 likes, 0 replies, 7 retweets

Amrith Setlur

@setlur_amrith

a year ago

🚨 Interested in synthetic data and LLM reasoning? Our new work studies scaling laws for synthetic data and RL for math reasoning. TLDR: Step-level RL (per-step DPO in fig) on self-generated answers improves sample efficiency of synthetic data by 8x! arxiv.org/abs/2406.14532 1/🧵

147 likes, 3 replies, 42 retweets

Siyan Zhao

@siyan_zhao

a year ago

Have you wondered how the decision boundary of in-context learning in LLMs compares to traditional models like Decision Trees and KNN? 🤔 Our research uncovers unexpected irregularities and non-smoothness in LLMs' in-context decision boundaries. 🔍 📄: arxiv.org/abs/2406.11233

1.1K likes, 21 replies, 201 retweets

Sara Hooker

@sarahookr

a year ago

Applications now officially open until end of month x.com/cohereforai/st… Looking forward to hearing from all of you!

91 likes, 0 replies, 25 retweets

Simran Khanuja

@simi_97k

10 months ago

Thank you so much EMNLP 2025 for this wonderful recognition! I’m so honored and humbled 💕 Thanks Graham Neubig for your support throughout! We’ve been working on this for 1.5 years and everyone who has spoken with me in the recent past knows how passionately I feel about this

444 likes, 77 replies, 30 retweets

John Hewitt

@johnhewtt

9 months ago

I’m hiring PhD students in computer science at Columbia! Our lab will tackle core challenges in understanding and controlling neural models that interact with language. For example:
- methods for LLM control
- discoveries of LLM properties
- pretraining for understanding

881 likes, 18 replies, 155 retweets

Sam Altman

@sama

8 months ago

it is hard to overstate how much alec radford has contributed to the field, and how much of everyone's current progress traces back to his work. i believe he is a genius at the level of einstein, and also he is one of my favorite people ever--hard to imagine a nicer, warmer, or

8.8K likes, 314 replies, 412 retweets

Gokul Swamy

@g_k_swamy

6 months ago

1.5 yrs ago, we set out to answer a seemingly simple question: what are we *actually* getting out of RL in fine-tuning? I'm thrilled to share a pearl we found on the deepest dive of my PhD: the value of RL in RLHF seems to come from *generation-verification gaps*. Get ready to🤿!

1.1K likes, 24 replies, 231 retweets

Lucio Dery Jnr Mwinm

@derylucio

6 months ago

The future of intelligence will be distributed … and DiLoCo may just be the 🔑 ingredient. We develop scaling laws for DiLoCo with many interesting findings; see 🧵. Work led by Zachary Charles. Such a fun collab, cheers to many more.

15 likes, 0 replies, 7 retweets

Lucio Dery Jnr Mwinm

@derylucio

5 months ago

Hi friends! I'll be in Singapore for ICLR 2025. Would be great to catch up and talk about research :) I'm interested in most things ML, but my current primary focus has been:
- model self-improvement
- continual learning and modularity
- distributed / collaborative learning

9 likes, 1 reply, 0 retweets

Asher Trockman

@ashertrockman

5 months ago

Are you a frontier lab investing untold sums in training? Are you trying to stay competitive? Are you finding that your competitors' models are ... thinking a bit too much like yours? Then antidistillation.com might be for you! Sam Altman Elon Musk

139 likes, 5 replies, 29 retweets

Arthur Douillard

@ar_douillard

4 months ago

30+ accepted papers
6 oral papers
6 guest speakers

Join us at ICLR 2026 on the 27th in Hall 4 #3 for a full-day workshop on Modularity for Collaborative, Decentralized, and Continual Learning: sites.google.com/corp/view/mcdc…

Lucio Dery Jnr Mwinm, Fengyuan Liu, and myself will be organizing

103 likes, 3 replies, 28 retweets

Lucio Dery Jnr Mwinm

@derylucio

4 months ago

Come to our workshop !!! It’ll be fun — I promise !!

7 likes, 0 replies, 1 retweet

Hamidah Oderinwale

@didaoh

22 days ago

1/ With Benjamin Laufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

138 likes, 8 replies, 19 retweets

Yuqing Du

@d_yuqing

8 days ago

🥹🥹 All vague-posting aside, super happy this model is finally out there & proud of everyone for making this happen 💖 let us know what you think!

262 likes, 10 replies, 13 retweets