Christina Baek (@_christinabaek)'s Twitter Profile
Christina Baek

@_christinabaek

PhD student @mldcmu | Past: intern @GoogleAI

ID: 1409604978004594688

Link: http://kebaek.github.io | Joined: 28-06-2021 20:11:47

53 Tweets

978 Followers

309 Following

Qing Qu (@qu_1006)'s Twitter Profile Photo

Our recent work won the Best Paper award at the NeurIPS 2023 Workshop on Diffusion Models. We recently made significant revisions to the entire article and re-uploaded a new version to arXiv (arxiv.org/abs/2310.05264), adding many new experiments and insights.
Pratyush Maini (@pratyushmaini)'s Twitter Profile Photo

1/ 🥁Scaling Laws for Data Filtering 🥁

TLDR: Data Curation *cannot* be compute agnostic!
In our #CVPR2024 paper, we develop the first scaling laws for heterogeneous & limited web data.

w/ Sachin Goyal, Zachary Lipton, Aditi Raghunathan, Zico Kolter
📝: arxiv.org/abs/2404.07177
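To make the thread's claim concrete: a scaling law here is a fitted curve of loss versus data quantity, estimated separately for each filtered data pool, so that pools can be compared at a given compute budget. Below is a minimal sketch of fitting one such curve; the power-law form, the data points, and the use of scipy's curve_fit are illustrative assumptions, not the paper's actual law or measurements.

```python
# Illustrative only: fit loss(n) = a * n^(-b) + c to made-up
# (dataset size, eval loss) points for a single filtered data pool.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, b, c):
    # a: scale, b: decay exponent, c: irreducible loss floor
    return a * n ** (-b) + c

n = np.array([1e6, 3e6, 1e7, 3e7, 1e8])          # samples seen (hypothetical)
loss = np.array([3.20, 2.90, 2.60, 2.45, 2.35])  # eval loss (hypothetical)

(a, b, c), _ = curve_fit(scaling_law, n, loss, p0=[10.0, 0.2, 2.0])
print(f"loss(n) = {a:.2f} * n^(-{b:.3f}) + {c:.2f}")
# Fitting one curve per pool is what makes curation compute-dependent:
# an aggressively filtered pool can win at small n yet lose at large n
# once it is exhausted and repeated.
```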
Pratyush Maini (@pratyushmaini)'s Twitter Profile Photo

1/ What does it mean for an LLM to “memorize” a doc? Exactly regurgitating a NYT article? Of course. Just training on NYT? Harder to say.

We take big strides in this discourse w/*Adversarial Compression*
w/ Avi Schwarzschild, Zhili Feng, Zachary Lipton, Zico Kolter
🌐: locuslab.github.io/acr-memorizati… 🧵
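The compression framing can be made concrete: a target string counts as memorized when a prompt shorter than the string elicits it verbatim, i.e. the ratio of target length to prompt length exceeds 1. A rough sketch of that check follows; the gpt2 model is an arbitrary stand-in, and the adversarial search for the *shortest* eliciting prompt (the hard part) is omitted.

```python
# Illustrative only: the "is the prompt a net compression?" check.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # arbitrary stand-in
model = AutoModelForCausalLM.from_pretrained("gpt2")

def compression_ratio(prompt: str, target: str) -> float:
    """Target length / prompt length in tokens, counted only when
    greedy decoding reproduces the target exactly; 0 otherwise."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    target_ids = tok(target, add_special_tokens=False).input_ids
    out = model.generate(prompt_ids, max_new_tokens=len(target_ids),
                         do_sample=False)
    completion = out[0, prompt_ids.shape[1]:].tolist()
    if completion != target_ids:
        return 0.0  # the prompt fails to elicit the target verbatim
    return len(target_ids) / prompt_ids.shape[1]

# A ratio > 1 means the prompt compresses the target: the operational
# signal for memorization under this framing.
print(compression_ratio("my prompt", "some target passage"))
```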
Amrith Setlur (@setlur_amrith)'s Twitter Profile Photo

🚨 Interested in synthetic data and LLM reasoning? Our new work studies scaling laws for synthetic data and RL for math reasoning.
TLDR: Step-level RL (per-step DPO in fig) on self-generated answers improves sample efficiency of synthetic data by 8x! arxiv.org/abs/2406.14532

1/🧵
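For reference, the per-step DPO mentioned in the thread follows the standard DPO objective, applied to preferred vs. dispreferred reasoning steps rather than whole answers. A minimal sketch, assuming per-step log-probabilities have already been computed under the policy and under a frozen reference model:

```python
# Illustrative only: standard DPO loss on a (chosen step, rejected step)
# pair; the tensors stand in for summed token log-probs of each step.
import torch
import torch.nn.functional as F

def step_dpo_loss(logp_chosen, logp_rejected,
                  ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Reward margin = policy log-ratio minus reference log-ratio.
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -F.logsigmoid(beta * margin).mean()

# Toy pair: a correct next reasoning step vs. an incorrect one,
# scored by the policy and by the frozen reference model.
loss = step_dpo_loss(torch.tensor([-5.0]), torch.tensor([-7.0]),
                     torch.tensor([-6.0]), torch.tensor([-6.5]))
print(loss.item())
```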
Zico Kolter (@zicokolter)'s Twitter Profile Photo

I'm excited to announce that I am joining the OpenAI Board of Directors. I'm looking forward to sharing my perspectives and expertise on AI safety and robustness to help guide the amazing work being done at OpenAI.

Kevin Li (@kevinyli_)'s Twitter Profile Photo

Attention is all you need; at least the matrices are, if you want to distill Transformers into alternative architectures, like Mamba, with our new distillation method: MOHAWK!

We also release a fully subquadratic, performant 1.5B model distilled from Phi-1.5 with only 3B tokens!
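A rough sketch of the matrix-matching intuition in the first line: self-attention materializes a sequence-mixing matrix, and a subquadratic student can be trained so its own mixing matrix matches the teacher's, layer by layer. The shapes, the Frobenius loss, and the toy tensors below are illustrative assumptions; MOHAWK's actual recipe involves further alignment and end-to-end distillation stages.

```python
# Illustrative only: Frobenius loss between the teacher's attention
# matrix and the student's materialized mixing matrix for one layer.
import torch

def matrix_alignment_loss(teacher_attn, student_mix):
    # Both shaped (batch, heads, seq_len, seq_len); matrix_norm
    # defaults to the Frobenius norm over the last two dims.
    return torch.linalg.matrix_norm(teacher_attn - student_mix).mean()

B, H, L = 2, 4, 16  # toy sizes
teacher = torch.softmax(torch.randn(B, H, L, L), dim=-1)
student_logits = torch.randn(B, H, L, L, requires_grad=True)
student = torch.softmax(student_logits, dim=-1)

loss = matrix_alignment_loss(teacher, student)
loss.backward()  # gradients flow back to the student's parameters
print(loss.item())
```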