Lucio Dery Jnr Mwinm (@derylucio)'s Twitter Profile
Lucio Dery Jnr Mwinm

@derylucio

ID: 2998763504

Link: https://ldery.github.io/
Date: 27-01-2015 22:31:54

257 Tweets

518 Followers

967 Following

Victor Akinwande (@aknvictor)

For large-scale causal discovery, there's no need to use NOTEARS for its speed. Consider using LiNGAM. We've parallelized it, achieving a 32x speed-up on GPUs.

NOTEARS:
Scalable: ✅
Identifiability guarantees: ❌

AcceleratedLiNGAM:
Scalable: ✅
Identifiability guarantees: ✅
Junhong Shen (@junhongshen1)

Introducing Unified PDE Solvers (UPS), a step towards efficiently building foundation models for PDE solvers (arxiv.org/abs/2403.07187)! Starting from a pretrained LM, UPS tackles diverse spatiotemporal PDEs with SOTA accuracy, using ~20x less data and a single A6000! 🧵[1/x]
Arthur Douillard (@ar_douillard)

I'm super excited to release DiPaCo, a new kind of mixture of experts, that can scale engineering-wise to data centers across the entire world! A few words about it in this thread 🧵

Simran Khanuja (@simi_97k)

Ever noticed how Pixar adapts movies for international markets? The beloved newscaster in Zootopia is a jaguar in Brazil, a panda in China, a koala in Australia …

While machine translation (MT) has only dealt with language in speech/text thus far, we extend the scope of MT to
Lucio Dery Jnr Mwinm (@derylucio)

Check out our work on adapting multitask learning as a tool against worst-case group error. Our modified MTL approach (main task + pre-training auxiliary objective + L1 embedding reg) is competitive against bespoke DRO (Distributionally Robust Optimization) methods.

Jonas Pfeiffer (@pfeiffjo)

Are you interested in working with us on modularity and continual learning? Consider applying to our open full-time RE position in NYC: boards.greenhouse.io/deepmind/jobs/…

Amrith Setlur (@setlur_amrith)

🚨 Interested in synthetic data and LLM reasoning? Our new work studies scaling laws for synthetic data and RL for math reasoning.
TLDR: Step-level RL (per-step DPO in fig) on self-generated answers improves sample efficiency of synthetic data by 8x! arxiv.org/abs/2406.14532

1/🧵
Siyan Zhao (@siyan_zhao)

Have you wondered how the decision boundary of in-context learning in LLMs compares to traditional models like Decision Trees and KNN? 🤔 Our research uncovers unexpected irregularities and non-smoothness in LLMs' in-context decision boundaries. 🔍 📄: arxiv.org/abs/2406.11233

Simran Khanuja (@simi_97k)

Thank you so much EMNLP 2025 for this wonderful recognition! I’m so honored and humbled 💕 Thanks Graham Neubig for your support throughout! We’ve been working on this for 1.5 years and everyone who has spoken with me in the recent past knows how passionately I feel about this

John Hewitt (@johnhewtt)

I’m hiring PhD students in computer science at Columbia! Our lab will tackle core challenges in understanding and controlling neural models that interact with language. for example,
- methods for LLM control
- discoveries of LLM properties
- pretraining for understanding

Sam Altman (@sama)

it is hard to overstate how much alec radford has contributed to the field, and how much of everyone's current progress traces back to his work. i believe he is a genius at the level of einstein, and also he is one of my favorite people ever--hard to imagine a nicer, warmer, or

Gokul Swamy (@g_k_swamy)

1.5 yrs ago, we set out to answer a seemingly simple question: what are we *actually* getting out of RL in fine-tuning? I'm thrilled to share a pearl we found on the deepest dive of my PhD: the value of RL in RLHF seems to come from *generation-verification gaps*. Get ready to🤿!
Lucio Dery Jnr Mwinm (@derylucio)

The future of intelligence will be distributed … and DiLoCo may just be the 🔑 ingredient. We develop scaling laws for DiLoCo with many interesting findings, see 🧵. Work led by Zachary Charles. Such a fun collab, cheers to many more.

Lucio Dery Jnr Mwinm (@derylucio)

Hi Friends! I'll be in Singapore for ICLR 2025. Would be great to catch up and talk about research :) I'm interested in most things ML but my current primary focus has been:
- model self-improvement
- continual learning and modularity
- distributed / collaborative learning

Asher Trockman (@ashertrockman)

Are you a frontier lab investing untold sums in training? Are you trying to stay competitive? Are you finding that your competitors' models are ... thinking a bit too much like yours?

Then antidistillation.com might be for you! Sam Altman (@sama) Elon Musk (@elonmusk)
Arthur Douillard (@ar_douillard)

30+ accepted papers

6 oral papers

6 guest speakers

join us at ICLR 2026 (@iclr_conf) on the 27th, Hall 4 #3, for a full-day workshop on Modularity for Collaborative, Decentralized, and Continual Learning

sites.google.com/corp/view/mcdc…

Lucio Dery Jnr Mwinm (@derylucio), Fengyuan Liu, and myself will be organizing
Hamidah Oderinwale (@didaoh)

1/ With Benjamin Laufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵

Yuqing Du (@d_yuqing)

🥹🥹 All vague-posting aside, super happy this model is finally out there & proud of everyone for making this happen 💖 let us know what you think!