Krithik Ramesh (@krithiktweets)'s Twitter Profile
Krithik Ramesh

@krithiktweets

AI + Math @MIT, compbio stuff @broadinstitute, prev: research @togethercompute

ID: 834174443211534336

Website: http://krithikramesh.com
Joined: 21-02-2017 22:54:20

2.2K Tweets

655 Followers

649 Following

Pardis Sabeti (@pardissabeti)

We’re excited to share the completed preprint of Delphy — our new tool for scalable, near-real-time #Bayesianphylogenetics for outbreaks! 🚀

Check it out here: biorxiv.org/content/10.110…
Explore Delphy: delphy.fathom.info

Led by Patrick Varilly, sabeti_lab, and Fathom. 🧵 1/16

Gage Moreno (@gagekmoreno)

I’m thrilled to share our latest preprint! We analyzed >130,000 SARS-CoV-2 genomes from MA to investigate complex transmission dynamics—from statewide patterns to transmission within specific facilities and at the individual level 🦠🧬 Check out the preprint here ⬇️ medrxiv.org/content/10.110…

soham (@sohamgovande)

introducing chipmunk—a training-free algorithm making ai video generation 3.7x & image gen 1.6x faster! ⚡️

our kernels for column-sparse attention are 9.3x faster than FlashAttention-3, and column-sparse GEMM is 2.5x faster vs. cuBLAS

a thread on the GPU kernel optimizations 🧵
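The thread above only quotes the headline speedups; as a rough illustration of what column-sparse attention means, here is a minimal NumPy sketch that keeps only a small set of key/value columns per attention call. This is an intuition-level reference under assumptions, not chipmunk's fused GPU kernels, and the column-selection heuristic (total attention mass) is a placeholder.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def column_sparse_attention(q, k, v, keep=64):
    # q, k, v: (seq_len, head_dim). Dense reference for intuition only;
    # the real speedup comes from fused sparse GPU kernels that never
    # materialize the full score matrix.
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (seq, seq) attention logits
    col_mass = softmax(scores, axis=-1).sum(axis=0)  # total mass landing on each key column
    cols = np.argsort(col_mass)[-keep:]              # keep the heaviest `keep` columns (placeholder heuristic)
    return softmax(scores[:, cols], axis=-1) @ v[cols]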

Jerry Liu (@jerrywliu)

I'm at #ICLR2025 this week to present our work on 🔬high-precision algorithm learning🔬 with Transformers! Stop by our poster session Thursday afternoon!

🔗 arxiv.org/abs/2503.12295

With Jess Grogan, Owen Dugan, Ashish Rao, Simran Arora, Atri Rudra, and hazyresearch!
Michael Zhang (@mzhangio)

We studied (and will now talk about) how these ideas let us:

- Take existing Transformer LLMs, and turn them into SoTA subquadratic LLMs

- Get SoTA quality, despite only training 0.2% of past methods' model parameters, with 0.4% of their training tokens (i.e., a 2500x boost in …
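The tweet does not spell out the mechanism, but a common route from a quadratic Transformer to a subquadratic one is to swap softmax attention for a kernelized (linear) attention whose key/value statistics are accumulated once and reused. The sketch below is a generic linear-attention illustration under that assumption, not the specific method being presented.

import numpy as np

def phi(x):
    # Simple positive feature map (ReLU); real methods learn or tune this map.
    return np.maximum(x, 0.0) + 1e-6

def linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). Cost is O(n * d * d_v) instead of O(n^2 * d).
    qf, kf = phi(q), phi(k)
    kv = kf.T @ v                     # (d, d_v) summed key-value statistics
    z = kf.sum(axis=0)                # (d,) normalizer statistics
    return (qf @ kv) / (qf @ z)[:, None]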
Azalia Mirhoseini (@azaliamirh)

Excited to release SWiRL: A synthetic data generation and multi-step RL approach for reasoning and tool use!

With SWiRL, the model’s capability generalizes to new tasks and tools. For example, a model trained to use a retrieval tool to solve multi-hop knowledge-intensive …
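For readers unfamiliar with the setup, a multi-step tool-use trajectory of the kind described interleaves model turns with retrieval calls, and each intermediate step can then be scored for RL. The sketch below is purely illustrative; call_model, retrieve, and the SEARCH: convention are hypothetical placeholders, not SWiRL's actual interfaces.

def rollout(question, call_model, retrieve, max_steps=4):
    # Hypothetical multi-step tool-use rollout: at each step the model either
    # issues a retrieval query ("SEARCH: ...") or emits a final answer.
    trajectory, context = [], question
    for _ in range(max_steps):
        step = call_model(context)
        if step.startswith("SEARCH:"):
            docs = retrieve(step[len("SEARCH:"):].strip())
            context += f"\n{step}\nRESULTS: {docs}"
            trajectory.append(("tool_call", step, docs))
        else:
            trajectory.append(("answer", step))
            break
    return trajectory   # each intermediate step can be rewarded for multi-step RL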
Ted Zadouri (@tedzadouri)

"Pre-training was hard, inference easy; now everything is hard."-Jensen Huang. Inference drives AI progress b/c of test-time compute. Introducing inference aware attn: parallel-friendly, high arithmetic intensity – Grouped-Tied Attn & Grouped Latent Attn

"Pre-training was hard, inference easy; now everything is hard."-Jensen Huang. Inference drives AI progress b/c of test-time compute.

Introducing inference aware attn: parallel-friendly, high arithmetic intensity – Grouped-Tied Attn &amp; Grouped Latent Attn
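The tweet names Grouped-Tied Attention and Grouped Latent Attention without details. As background on why grouping raises arithmetic intensity at decode time, here is a minimal grouped-query attention sketch, a well-known related baseline in which several query heads share one K/V head (and therefore a smaller KV cache); it is not the paper's variants.

import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def grouped_query_attention(q, k, v, n_kv_heads=2):
    # q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    # Several query heads share one K/V head, so the KV cache shrinks and
    # each cached byte is reused by more FLOPs (higher arithmetic intensity).
    n_q_heads = q.shape[0]
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group
        s = softmax(q[h] @ k[kv].T / np.sqrt(q.shape[-1]))
        out[h] = s @ v[kv]
    return out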
Azalia Mirhoseini (@azaliamirh)

In the test-time scaling era, we all would love a higher-throughput serving engine! Introducing Tokasaurus, an LLM inference engine for high-throughput workloads with large and small models!

Led by Jordan Juravsky, in collaboration with hazyresearch and an amazing team!
Albert Gu (@_albertgu)

exciting to see that hybrid models maintain reasoning performance with few attention layers. benefits of linear architectures are prominent for long reasoning traces, when efficiency is bottlenecked by decoding - seems like a free win if reasoning ability is preserved as well!
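A rough back-of-the-envelope calculation of the decoding bottleneck mentioned above: at decode step t, softmax attention must touch a KV cache that grows with t, while a linear/recurrent layer only updates a fixed-size state. The constants below are illustrative assumptions, not numbers from the thread.

def per_token_decode_cost(seq_len, d_model=4096, d_state=64, n_layers=32):
    # Illustrative FLOP counts only (all constants are assumptions).
    attn = 2 * seq_len * d_model * n_layers    # must read the whole KV cache
    linear = 2 * d_model * d_state * n_layers  # fixed-size recurrent state update
    return attn, linear

for t in (1_000, 32_000, 128_000):
    a, l = per_token_decode_cost(t)
    print(f"trace length {t:>7}: attention vs. linear per-token cost ~ {a / l:.0f}x")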

Adam Zweiger (@adamzweiger)

Excited to share our new work on Self-Adapting Language Models! This is my first first-author paper and I’m grateful to be able to work with such an amazing team of collaborators: Jyo Pari, Han Guo, Ekin Akyürek, Yoon Kim, and Pulkit Agrawal.

Eva Yi Xie (@evayixie)

1/ Excited to share our recent work in #ICML2025, “A multi-region brain model to elucidate the role of hippocampus in spatially embedded decision-making”. 🎉

🔗 minzsiure.github.io/multiregion-br…

Joint w/ FieteGroup, Jaedong Hwang, Brody Lab, Princeton Neuroscience Institute

⬇️ 🧵 for key takeaways