Krithik Ramesh (@krithiktweets)'s Twitter Profile
Krithik Ramesh

@krithiktweets

AI + Math @MIT, compbio stuff @broadinstitute, prev: research @togethercompute

ID: 834174443211534336

Website: http://krithikramesh.com
Joined: 21-02-2017 22:54:20

2.2K Tweets

655 Followers

649 Following

Pardis Sabeti (@pardissabeti)

We’re excited to share the completed preprint of Delphy — our new tool for scalable, near-real-time #Bayesianphylogenetics for outbreaks! 🚀

Check it out here: biorxiv.org/content/10.110…
Explore Delphy: delphy.fathom.info

Led by Patrick Varilly, sabeti_lab, and Fathom. 🧵 1/16

Gage Moreno (@gagekmoreno)

I’m thrilled to share our latest preprint! We analyzed >130,000 SARS-CoV-2 genomes from MA to investigate complex transmission dynamics—from statewide patterns to transmission within specific facilities and at the individual level 🦠🧬 Check out the preprint here ⬇️ medrxiv.org/content/10.110…

soham (@sohamgovande)

introducing chipmunk—a training-free algorithm making ai video generation 3.7x & image gen 1.6x faster! ⚡️

our kernels for column-sparse attention are 9.3x faster than FlashAttention-3, and column-sparse GEMM is 2.5x faster vs. cuBLAS

a thread on the GPU kernel optimizations 🧵
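The thread above only quotes the headline speedups; as a rough illustration of what column-sparse attention means, here is a minimal NumPy sketch that keeps only a small set of key/value columns per attention call. This is an intuition-level reference under assumptions, not chipmunk's fused GPU kernels, and the column-selection heuristic (total attention mass) is a placeholder.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def column_sparse_attention(q, k, v, keep=64):
    # q, k, v: (seq_len, head_dim). Dense reference for intuition only;
    # the real speedup comes from fused sparse GPU kernels that never
    # materialize the full score matrix.
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (seq, seq) attention logits
    col_mass = softmax(scores, axis=-1).sum(axis=0)  # total mass landing on each key column
    cols = np.argsort(col_mass)[-keep:]              # keep the heaviest `keep` columns (placeholder heuristic)
    return softmax(scores[:, cols], axis=-1) @ v[cols]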

Jerry Liu (@jerrywliu)

I'm at #ICLR2025 this week to present our work on 🔬high-precision algorithm learning🔬 with Transformers! Stop by our poster session Thursday afternoon!

🔗 arxiv.org/abs/2503.12295

With Jess Grogan, Owen Dugan, Ashish Rao, Simran Arora, Atri Rudra, and hazyresearch!
Michael Zhang (@mzhangio)

We studied (and will now talk about) how these ideas let us:

- Take existing Transformer LLMs, and turn them into SoTA subquadratic LLMs

- Get SoTA quality, despite only training 0.2% of past methods' model parameters, with 0.4% of their training tokens (i.e., a 2500x boost in …
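The tweet does not spell out the mechanism, but a common route from a quadratic Transformer to a subquadratic one is to swap softmax attention for a kernelized (linear) attention whose key/value statistics are accumulated once and reused. The sketch below is a generic linear-attention illustration under that assumption, not the specific method being presented.

import numpy as np

def phi(x):
    # Simple positive feature map (ReLU); real methods learn or tune this map.
    return np.maximum(x, 0.0) + 1e-6

def linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). Cost is O(n * d * d_v) instead of O(n^2 * d).
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                     # (d, d_v) summed key-value statistics
    z = kf.sum(axis=0)                # (d,) normalizer statistics
    return (qf @ kv) / (qf @ z)[:, None]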
Azalia Mirhoseini (@azaliamirh)

Excited to release SWiRL: A synthetic data generation and multi-step RL approach for reasoning and tool use!

With SWiRL, the model’s capability generalizes to new tasks and tools. For example, a model trained to use a retrieval tool to solve multi-hop knowledge-intensive …
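For readers unfamiliar with the setup, a multi-step tool-use trajectory of the kind described interleaves model turns with retrieval calls, and each intermediate step can then be scored for RL. The sketch below is purely illustrative; call_model, retrieve, and the SEARCH: convention are hypothetical placeholders, not SWiRL's actual interfaces.

def rollout(question, call_model, retrieve, max_steps=4):
    # Hypothetical multi-step tool-use rollout: at each step the model either
    # issues a retrieval query ("SEARCH: ...") or emits a final answer.
    trajectory, context = [], question
    for _ in range(max_steps):
        step = call_model(context)
        if step.startswith("SEARCH:"):
            docs = retrieve(step[len("SEARCH:"):].strip())
            context += f"\n{step}\nRESULTS: {docs}"
            trajectory.append(("tool_call", step, docs))
        else:
            trajectory.append(("answer", step))
            break
    return trajectory   # each intermediate step can be rewarded for multi-step RL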
Ted Zadouri (@tedzadouri)

"Pre-training was hard, inference easy; now everything is hard."-Jensen Huang. Inference drives AI progress b/c of test-time compute. Introducing inference aware attn: parallel-friendly, high arithmetic intensity – Grouped-Tied Attn & Grouped Latent Attn

"Pre-training was hard, inference easy; now everything is hard."-Jensen Huang. Inference drives AI progress b/c of test-time compute.

Introducing inference aware attn: parallel-friendly, high arithmetic intensity – Grouped-Tied Attn &amp; Grouped Latent Attn
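The tweet names Grouped-Tied Attention and Grouped Latent Attention without details. As background on why grouping raises arithmetic intensity at decode time, here is a minimal grouped-query attention sketch, a well-known related baseline in which several query heads share one K/V head (and therefore a smaller KV cache); it is not the paper's variants.

import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def grouped_query_attention(q, k, v, n_kv_heads=2):
    # q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    # Several query heads share one K/V head, so the KV cache shrinks and
    # each cached byte is reused by more FLOPs (higher arithmetic intensity).
    n_q_heads = q.shape[0]
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group
        s = softmax(q[h] @ k[kv].T / np.sqrt(q.shape[-1]))
        out[h] = s @ v[kv]
    return out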
Azalia Mirhoseini (@azaliamirh)

In the test-time scaling era, we all would love a higher-throughput serving engine! Introducing Tokasaurus, an LLM inference engine for high-throughput workloads with large and small models!

Led by Jordan Juravsky, in collaboration with hazyresearch and an amazing team!
Albert Gu (@_albertgu)

exciting to see that hybrid models maintain reasoning performance with few attention layers. benefits of linear architectures are prominent for long reasoning traces, when efficiency is bottlenecked by decoding - seems like a free win if reasoning ability is preserved as well!
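A rough back-of-the-envelope calculation of the decoding bottleneck mentioned above: at decode step t, softmax attention must touch a KV cache that grows with t, while a linear/recurrent layer only updates a fixed-size state. The constants below are illustrative assumptions, not numbers from the thread.

def per_token_decode_cost(seq_len, d_model=4096, d_state=64, n_layers=32):
    # Illustrative FLOP counts only (all constants are assumptions).
    attn = 2 * seq_len * d_model * n_layers    # must read the whole KV cache
    linear = 2 * d_model * d_state * n_layers  # fixed-size recurrent state update
    return attn, linear

for t in (1_000, 32_000, 128_000):
    a, l = per_token_decode_cost(t)
    print(f"trace length {t:>7}: attention vs. linear per-token cost ~ {a / l:.0f}x")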

Adam Zweiger (@adamzweiger)

Excited to share our new work on Self-Adapting Language Models! This is my first first-author paper and I’m grateful to be able to work with such an amazing team of collaborators: Jyo Pari, Han Guo, Ekin Akyürek, Yoon Kim, and Pulkit Agrawal.

Eva Yi Xie (@evayixie)

1/ Excited to share our recent work in #ICML2025, “A multi-region brain model to elucidate the role of hippocampus in spatially embedded decision-making”. 🎉

🔗 minzsiure.github.io/multiregion-br…

Joint w/ FieteGroup, Jaedong Hwang, Brody Lab, Princeton Neuroscience Institute

⬇️ 🧵 for key takeaways