Qizhen (Irene) Zhang (@irenezhang30)'s Twitter Profile
Qizhen (Irene) Zhang

@irenezhang30

PhD @UniofOxford, Research Scientist Intern @AIatMeta
prev: Member of Technical Staff @Cohere

ID: 4694088007

Link: https://irenezhang30.github.io/ · Joined: 02-01-2016 05:23:48

49 Tweets

537 Followers

334 Following

Daniella Ye (@daniella_yz)'s Twitter Profile Photo

Beyond their use in assisting human evaluation (e.g. CriticGPT), can critiques directly enhance preference learning? During my @Cohere internship, we explored using synthetic critiques from large language models to improve reward models. 📑Preprint: arxiv.org/abs/2405.20850

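For intuition, here is a minimal sketch of the general idea of critique-augmented reward modeling: an LLM-written critique of each response is appended to the input before the reward model scores it, with the usual Bradley–Terry preference loss on top. `generate_critique` and `ToyRewardModel` are hypothetical stand-ins for illustration, not the paper's implementation.

```python
# Sketch only: critique-augmented reward modeling on a toy model.
import torch
import torch.nn as nn

def generate_critique(prompt: str, response: str) -> str:
    # Hypothetical stand-in: in practice this would call a strong LLM.
    return f"Critique: check whether '{response}' correctly answers '{prompt}'."

class ToyRewardModel(nn.Module):
    """Bag-of-characters encoder + linear head; stands in for a transformer."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(256, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, text: str) -> torch.Tensor:
        ids = torch.tensor([min(ord(c), 255) for c in text])
        return self.head(self.embed(ids).mean(dim=0))

model = ToyRewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

prompt, chosen, rejected = "2+2?", "4", "5"
critique_c = generate_critique(prompt, chosen)
critique_r = generate_critique(prompt, rejected)

# Standard Bradley–Terry preference loss, now on critique-augmented inputs.
r_c = model(f"{prompt}\n{chosen}\n{critique_c}")
r_r = model(f"{prompt}\n{rejected}\n{critique_r}")
loss = -torch.nn.functional.logsigmoid(r_c - r_r).mean()
loss.backward()
opt.step()
```

The design point is that the reward model no longer has to judge a response from scratch; it can condition on an explicit critique when assigning its scalar score.
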
Qizhen (Irene) Zhang (@irenezhang30)'s Twitter Profile Photo

I’ll be at ICML next week presenting our work on efficient initialisation for mixture of experts (arXiv link dropping soon). Come to our spotlight talk & poster sessions and say hi! 💙

Chris Lu (@_chris_lu_)'s Twitter Profile Photo

Excited to share The AI Scientist! We use LLMs to autonomously come up with research ideas, implement them, do literature search, write them up, and review them -- producing full-length papers on AI without human intervention. Co-led with Cong Lu and Robert Lange
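To give a rough sense of the loop described here, a minimal sketch of an idea → implement → literature search → write-up → review pipeline, assuming a hypothetical `call_llm` client. The real AI Scientist system is far more involved; every prompt and stage below is a simplification.

```python
# Sketch only: a single-pass "autonomous research" loop.
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a real LLM API client.
    return f"[LLM output for: {prompt[:40]}...]"

def run_ai_scientist(topic: str, n_reviews: int = 2) -> dict:
    idea = call_llm(f"Propose a novel research idea about {topic}.")
    code = call_llm(f"Write an experiment implementing this idea: {idea}")
    related = call_llm(f"List related work relevant to: {idea}")
    paper = call_llm(
        f"Write a full paper.\nIdea: {idea}\nCode: {code}\nRelated work: {related}"
    )
    reviews = [call_llm(f"Review this paper critically: {paper}")
               for _ in range(n_reviews)]
    return {"idea": idea, "paper": paper, "reviews": reviews}

if __name__ == "__main__":
    result = run_ai_scientist("mixture-of-experts pretraining")
    print(result["idea"])
```
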

Ahmet Üstün (@ahmetustun89)'s Twitter Profile Photo

I'm incredibly proud that Aya received the #ACL2024 Best Paper Award 🥹. Huge congratulations to the Aya team and the Cohere For AI community, who made this possible by extending the frontiers of LLMs to multilingual settings and building the Aya Model and Aya Dataset 🌿🌏

Laura Ruis (@lauraruis)'s Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Qizhen (Irene) Zhang (@irenezhang30)'s Twitter Profile Photo

I'll be at #NeurIPS2024 this week with 2 works on pretraining MoEs:
1) BAM: Why & how to use attention experts for MoE upcycling (Wed 4:30 pm @ East Exhibit Hall A-C #3011) x.com/IreneZhang30/s…
2) Nexus: An MoE framework that easily adapts to new data distributions, led by the
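For intuition on upcycling, a minimal sketch in which every MoE expert, attention as well as FFN, is initialised as a copy of a pretrained dense layer while the router is trained from scratch. Module names, shapes, and the per-sequence top-1 routing are simplifying assumptions for illustration, not BAM's implementation.

```python
# Sketch only: dense-to-MoE upcycling with attention experts.
import copy
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, dim: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

class MoEBlock(nn.Module):
    def __init__(self, dense: DenseBlock, n_experts: int, dim: int = 64):
        super().__init__()
        # Upcycle: each expert starts as an exact copy of the dense weights,
        # so the MoE begins from the dense model's function.
        self.attn_experts = nn.ModuleList(
            [copy.deepcopy(dense.attn) for _ in range(n_experts)])
        self.ffn_experts = nn.ModuleList(
            [copy.deepcopy(dense.ffn) for _ in range(n_experts)])
        self.router = nn.Linear(dim, n_experts)  # trained from scratch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Simplified top-1 routing per sequence (real MoEs route per token).
        gate = self.router(x.mean(dim=1)).argmax(dim=-1)
        out = torch.empty_like(x)
        for i, e in enumerate(gate.tolist()):
            h = x[i : i + 1]
            h, _ = self.attn_experts[e](h, h, h)
            out[i] = (h + self.ffn_experts[e](h)).squeeze(0)
        return out

dense = DenseBlock()
moe = MoEBlock(dense, n_experts=4)
y = moe(torch.randn(2, 10, 64))  # batch of 2 sequences, length 10
```
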

Acyr Locatelli (@acyr_l)'s Twitter Profile Photo

I'm hiring performance engineers for the pre-training team at Cohere. If you enjoy writing efficient kernels and working on hardware-aligned architecture design and optimisation, do reach out! Check out the live job posting here: jobs.ashbyhq.com/cohere/d42f5fd…