Sarah Gurev (@sarahgurev)'s Twitter Profile
Sarah Gurev

@sarahgurev

PhD student at MIT EECS. Machine learning for meaningful problems in biology in Debora Marks’ lab.

ID: 1456771656362332161

05-11-2021 23:54:04

64 Tweets

940 Followers

2.2K Following

Benjamin Chang (@benjamin0chang)

Excited to present the first paper of my PhD: Robin, a multi-agent system for automating scientific discovery! 💡Robin was applied to identify a promising therapeutic candidate for dry AMD, ripasudil, which is a clinically-approved drug not previously used for dry AMD. 🧵👇

Nic Fishman (@njwfish)

🚨 New preprint 🚨 We introduce Generative Distribution Embeddings (GDEs) — a framework for learning representations of distributions, not just datapoints. GDEs enable multiscale modeling and come with elegant statistical theory and some miraculous geometric results! 🧵

Jonathan Frazer (@jonnygfrazer)

Want to improve your protein or genomic language model’s performance at zero-shot variant effect prediction? We propose a simple adjustment to likelihood-based approaches. biorxiv.org/content/10.110…

Aviv Spinner (@avivspinner)

Can protein expression be ‘solved’? We review current methods for measuring and predicting protein expression (wet lab + ML techniques!) & propose how this ubiquitously interesting problem could be solved. What do you think? sciencedirect.com/science/articl…

Anshul Kundaje (anshulkundaje@bluesky) (@anshulkundaje)

Great blog post. All ML folks working in the bio space should seriously read this. This issue is seriously plaguing the field. Lots of bombastic AI4bio papers getting tons of attention (pun intended) that end up being entirely hollow under the surface. This is BAD for the field.

Ruben Weitzman (@ruben_weitzman)

🚨ICML Paper Alert🚨 What if finding the right protein homologs wasn't a slow search, but a learned part of the model itself? We introduce 𝐏𝐫𝐨𝐭𝐫𝐢𝐞𝐯𝐞𝐫, an end-to-end framework that learns to retrieve the most useful homologs for self-supervised reconstruction! (1/12)

Shira Weingarten-Gabbay (@weingartenshira)

If you love #viruses, #ribosomes, and genomic #darkmatter, this thread is for you!! 💫 We're excited to share our published work developing Massively Parallel Ribosome Profiling (MPRP), which uncovered ~4,000 hidden proteins in ~700 viral genomes. science.org/doi/10.1126/sc…

owl (@owl_posting)

Endometriosis is an incredibly interesting disease. 5k words, 23 minutes reading time, covering one of the strangest conditions I've ever heard about. Link: owlposting.com/p/endometriosi… Very grateful to Shilpa Pothapragada for initial inspiration for this piece + reviewing it!!

Alan Amin (@alannawzadamin)

There are many domain-specific noise processes for discrete diffusion, but masking dominates! Why? We show masking exploits a key property of discrete diffusion, which we use to unlock the potential of those structured processes and beat masking! Nate Gruver Andrew Gordon Wilson 1/7

Pascal Notin (@notinpascal)

🚨 New paper 🚨 RNA modeling just got its own Gym! 🏋️ Introducing RNAGym, large-scale benchmarks for RNA fitness and structure prediction. 🧵 1/9

Yunha Hwang (@micro_yunha)

🚨 new paper alert! science.org/doi/10.1126/sc… During my PhD, one of the most frustrating challenges was trying to interpret genes labeled as “hypothetical proteins.” 1/n

Alan Amin (@alannawzadamin)

We can make population genetics studies more powerful by building priors of variant effect size from features like binding. But we’ve been stuck on linear models! We introduce DeepWAS to learn deep priors on millions of variants! #ICML2025 Andres Potapczynski, Andrew Gordon Wilson 1/7

Aviv Spinner (@avivspinner)

Whose protein design model reigns supreme? You design, we test, science wins 😛 Protein Engineering Tournament 2025 registration is live!!!!

Alice Yang (@alicey_ang)

Curious about why pretraining misses important features?🚩 Our new paper reveals a key limitation in how deep learning models learn – an information saturation bottleneck.🤯 Networks cannot learn new features after “saturating” information on related features. 💡(1/2)

Pascal Notin (@notinpascal)

Congratulations to the entire @ProfluentAI team on this incredible milestone! OpenCRISPR-1 represents a paradigm shift - the first AI-designed CRISPR protein to successfully edit human DNA with fewer off-target effects. We're moving from discovery-based to engineered biology. 🧬

Aviv Spinner (@avivspinner)

1/5 Biological data is noisy, redundant, and ever-growing. 🗣️ In our new paper (first paper of my post doc!! ⚡️), we track model performance across 14 years of UniRef100 snapshots to ask: how does pLM performance scale with training data?

Yo Akiyama (@yoakiyama)

Excited to share work with Zhidian Zhang, Milot Mirdita, Martin Steinegger, and Sergey Ovchinnikov biorxiv.org/content/10.110… TLDR: We introduce MSA Pairformer, a 111M parameter protein language model that challenges the scaling paradigm in self-supervised protein language modeling 🧵