Kaichen Zhang (@kaichenzhang358)'s Twitter Profile
Kaichen Zhang

@kaichenzhang358

student at NTU@sg

ID: 1788365212980301824

Website: https://kcz358.github.io
Joined: 09-05-2024 00:27:57

38 Tweets

49 Followers

34 Following

Li Bo (@boli68567011)

SAE Made Easy

github.com/EvolvingLMMs-L…

Sparse Autoencoders (SAEs) have become a cornerstone of explainable AI, powering safety and interpretability research at leading labs like Anthropic, OpenAI, and Google.

Despite their effectiveness, training SAEs has
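The repo's own training API isn't shown in the tweet, but the core SAE idea is compact enough to sketch. Below is a minimal, generic top-k sparse autoencoder in PyTorch: it reconstructs model activations through an overcomplete dictionary while keeping only a few latents active per example. All names and hyperparameters are illustrative, not taken from the EvolvingLMMs-Lab code.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal top-k sparse autoencoder over model activations (generic sketch)."""
    def __init__(self, d_model: int, d_dict: int, k: int = 32):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)   # overcomplete dictionary
        self.decoder = nn.Linear(d_dict, d_model)
        self.k = k

    def forward(self, acts: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # Encode, then keep only the k largest latents per example (the sparsity).
        latents = torch.relu(self.encoder(acts))
        topk = torch.topk(latents, self.k, dim=-1)
        sparse = torch.zeros_like(latents).scatter_(-1, topk.indices, topk.values)
        recon = self.decoder(sparse)
        return recon, sparse

# Training objective: reconstruct activations from the sparse latent code.
sae = SparseAutoencoder(d_model=768, d_dict=16384, k=32)
acts = torch.randn(8, 768)              # stand-in for captured LLM activations
recon, sparse = sae(acts)
loss = torch.mean((recon - acts) ** 2)  # reconstruction loss; top-k enforces sparsity
```

The active latents are the candidate interpretable "features"; interpretability work then inspects which inputs turn each one on.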
Li Bo (@boli68567011)

😻 LMMs-Eval upgrades to v0.4, better evals for better models.
- multi-node evals, tensor-parallel + data-parallel (tp+dp).
- new doc_to_message support for interleaved-modality inputs, fully compatible with the official OpenAI message format, suitable for evaluating more complex tasks.
- unified
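The tweet doesn't show doc_to_message's exact signature, so the sketch below is an assumption about its shape: a per-task hook that maps a dataset row to OpenAI-style chat messages whose content list interleaves image and text parts, which is the official format the tweet says v0.4 is compatible with.

```python
# Hypothetical doc_to_message hook: the name comes from the tweet, but this
# exact signature is an assumption. It maps a dataset row to OpenAI-style
# messages whose content list interleaves image and text parts.
def doc_to_message(doc: dict) -> list[dict]:
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": doc["image_url"]}},
                {"type": "text", "text": doc["question"]},
            ],
        }
    ]

messages = doc_to_message(
    {"image_url": "https://example.com/chart.png",
     "question": "What trend does this chart show?"}
)
```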
Li Bo (@boli68567011)

Throughout my journey in developing multimodal models, I’ve always wanted a framework that lets me plug & play modality encoders/decoders on top of an auto-regressive LLM. I want to prototype fast, try new architectures, and have my demo files scale effortlessly — with full
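As a rough sketch of the plug-and-play pattern this describes (assumed names, not any framework's real API): a swappable modality encoder feeds a small projector that maps its embeddings into the LLM's token-embedding space, and the auto-regressive LLM runs over the concatenated sequence.

```python
import torch
import torch.nn as nn

class MultimodalWrapper(nn.Module):
    """Generic plug-and-play wiring: swappable encoder -> projector -> AR LLM."""
    def __init__(self, llm: nn.Module, encoder: nn.Module, enc_dim: int, llm_dim: int):
        super().__init__()
        self.llm = llm
        self.encoder = encoder                        # any modality encoder that
                                                      # emits (batch, tokens, enc_dim)
        self.projector = nn.Linear(enc_dim, llm_dim)  # aligns encoder outputs
                                                      # to the LLM embedding space

    def forward(self, pixels: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        vision_tokens = self.projector(self.encoder(pixels))
        # Prepend projected modality tokens to the text embeddings, then run
        # the auto-regressive LLM over the combined sequence.
        inputs = torch.cat([vision_tokens, text_embeds], dim=1)
        return self.llm(inputs)
```

Swapping architectures then means changing only the encoder module and its output dimension, which is the fast-prototyping property the tweet is after.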

Kaichen Zhang (@kaichenzhang358)

🚀 Releasing LMMs Engine by EvolvingLMMs‑Lab — a lean, flexible framework for any-to-any modality pretraining & fine-tuning. 🔧 Built with cutting-edge optimizations: FSDP2, Ulysses Sequence Parallel, Flash Attention 2 📚 Dive in: github.com/EvolvingLMMs-L…
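Of the optimizations named, FSDP2 is the easiest to illustrate. A minimal sketch, assuming a recent PyTorch where the fully_shard API is available and a process group launched via torchrun; the model is a stand-in, and Ulysses sequence parallelism and Flash Attention 2 are separate integrations not shown here.

```python
import torch.nn as nn
from torch.distributed.fsdp import fully_shard  # FSDP2 API in recent PyTorch

# Stand-in model; the point is the wrapping pattern, not the architecture.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True),
    num_layers=24,
)

# Assumes torch.distributed is already initialized (e.g. launched with torchrun).
# FSDP2 shards parameters per-module: wrap each block, then the root module,
# so each layer's weights are gathered only while that layer runs.
for layer in model.layers:
    fully_shard(layer)
fully_shard(model)
```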

Ziwei Liu (@liuziwei7)

🔥One-Stop Training Engine for Unified Models🔥

⚡️LMMs-Engine⚡️ is a lean and flexible unified model training engine built for hacking at scale

* Supports multimodal inputs and outputs, from AR, diffusion, and linear models to unified models like BAGEL

🏠github.com/EvolvingLMMs-L…
Nathan Lambert (@natolambert)

Love to see more fully open post-training recipes (this one for multimodal reasoning). It's surprising how rare open post-training data is, given that the opportunity for impact is huge. Lots of people will try it, and simple data methods can still improve on SOTA.

Lidong Bing (@lidongbing)

🔥 Introducing LongVT: teaching multimodal LLMs to "Actively Look Back" and understand long videos just like humans! We tackle the "sparse evidence" and "hallucination" issues in long-video reasoning with an end-to-end agentic solution. Project: evolvinglmms-lab.github.io/LongVT/ Paper:
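The paper link is truncated above, so the sketch below is only a guess at the shape of an "actively look back" loop, with every name hypothetical rather than LongVT's actual implementation: the model answers from sparsely sampled frames and, when unsure, names an earlier segment to re-sample densely before retrying.

```python
# Hypothetical sketch of an agentic look-back loop; video.sample, model.reason,
# and the result fields are all invented for illustration.
def answer_with_lookback(video, question, model, max_rounds: int = 3):
    window = (0.0, video.duration)          # start with the whole video
    frames = video.sample(window, fps=0.5)  # coarse, sparse sampling
    result = None
    for _ in range(max_rounds):
        result = model.reason(frames, question)  # answer + self-assessment
        if result.confident:
            return result.answer
        # Low confidence: the model names the segment it needs to re-inspect,
        # and we re-sample that span at a higher frame rate (the "look back").
        window = result.segment_to_revisit
        frames = video.sample(window, fps=4.0)
    return result.answer
```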

Ziwei Liu (@liuziwei7)

Our open-source tools and models have become trusted infrastructure for the global AI community, with representative repos including:

🚂 LMMs-Engine: github.com/EvolvingLMMs-L…
📊 LMMs-Eval: github.com/EvolvingLMMs-L…
🤖 LLaVA-OneVision-1.5: github.com/EvolvingLMMs-L…