Qizhen (Irene) Zhang (@irenezhang30)'s Twitter Profile
Qizhen (Irene) Zhang

@irenezhang30

PhD @UniofOxford, Research Scientist Intern @AIatMeta
prev: Member of Technical Staff @Cohere

ID: 4694088007

Link: https://irenezhang30.github.io/ · Joined: 02-01-2016 05:23:48

49 Tweets

537 Followers

334 Following

Daniella Ye (@daniella_yz)'s Twitter Profile Photo

Beyond their use in assisting human evaluation (e.g. CriticGPT), can critiques directly enhance preference learning? During my @Cohere internship, we explored using synthetic critiques from large language models to improve reward models. 📑Preprint: arxiv.org/abs/2405.20850

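For intuition, here is a minimal sketch of the general idea of critique-augmented reward modeling: an LLM-written critique of each response is appended to the input before the reward model scores it, with the usual Bradley–Terry preference loss on top. `generate_critique` and `ToyRewardModel` are hypothetical stand-ins for illustration, not the paper's implementation.

```python
# Sketch only: critique-augmented reward modeling on a toy model.
import torch
import torch.nn as nn

def generate_critique(prompt: str, response: str) -> str:
    # Hypothetical stand-in: in practice this would call a strong LLM.
    return f"Critique: check whether '{response}' correctly answers '{prompt}'."

class ToyRewardModel(nn.Module):
    """Bag-of-characters encoder + linear head; stands in for a transformer."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(256, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, text: str) -> torch.Tensor:
        ids = torch.tensor([min(ord(c), 255) for c in text])
        return self.head(self.embed(ids).mean(dim=0))

model = ToyRewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

prompt, chosen, rejected = "2+2?", "4", "5"
critique_c = generate_critique(prompt, chosen)
critique_r = generate_critique(prompt, rejected)

# Standard Bradley–Terry preference loss, now on critique-augmented inputs.
r_c = model(f"{prompt}\n{chosen}\n{critique_c}")
r_r = model(f"{prompt}\n{rejected}\n{critique_r}")
loss = -torch.nn.functional.logsigmoid(r_c - r_r).mean()
loss.backward()
opt.step()
```

The design point is that the reward model no longer has to judge a response from scratch; it can condition on an explicit critique when assigning its scalar score.
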
Qizhen (Irene) Zhang (@irenezhang30)'s Twitter Profile Photo

I’ll be at ICML next week presenting our work on efficient initialisation for mixture of experts (arXiv link dropping soon). Come to our spotlight talk & poster sessions and say hi! 💙

Chris Lu (@_chris_lu_)'s Twitter Profile Photo

Excited to share The AI Scientist! We use LLMs to autonomously come up with research ideas, implement them, do literature search, write them up, and review them -- producing full-length papers on AI without human intervention. Co-led with Cong Lu and Robert Lange
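To give a rough sense of the loop described here, a minimal sketch of an idea → implement → literature search → write-up → review pipeline, assuming a hypothetical `call_llm` client. The real AI Scientist system is far more involved; every prompt and stage below is a simplification.

```python
# Sketch only: a single-pass "autonomous research" loop.
def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a real LLM API client.
    return f"[LLM output for: {prompt[:40]}...]"

def run_ai_scientist(topic: str, n_reviews: int = 2) -> dict:
    idea = call_llm(f"Propose a novel research idea about {topic}.")
    code = call_llm(f"Write an experiment implementing this idea: {idea}")
    related = call_llm(f"List related work relevant to: {idea}")
    paper = call_llm(
        f"Write a full paper.\nIdea: {idea}\nCode: {code}\nRelated work: {related}"
    )
    reviews = [call_llm(f"Review this paper critically: {paper}")
               for _ in range(n_reviews)]
    return {"idea": idea, "paper": paper, "reviews": reviews}

if __name__ == "__main__":
    result = run_ai_scientist("mixture-of-experts pretraining")
    print(result["idea"])
```
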

Ahmet Üstün (@ahmetustun89)'s Twitter Profile Photo

I'm incredibly proud that Aya received the #ACL2024 Best Paper Award 🥹. Huge congratulations to the Aya team and the Cohere For AI community, who made this possible by extending the frontiers of LLMs to multilingual settings and building the Aya Model and Aya Dataset 🌿🌏

Laura Ruis (@lauraruis)'s Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Qizhen (Irene) Zhang (@irenezhang30)'s Twitter Profile Photo

I'll be at #NeurIPS2024 this week with 2 works on pretraining MoEs:
1) BAM: Why & how to use attention experts for MoE upcycling (Wed 4:30 pm @ East Exhibit Hall A-C #3011) x.com/IreneZhang30/s…
2) Nexus: An MoE framework that easily adapts to new data distributions, led by the
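For intuition on upcycling, a minimal sketch in which every MoE expert, attention as well as FFN, is initialised as a copy of a pretrained dense layer while the router is trained from scratch. Module names, shapes, and the per-sequence top-1 routing are simplifying assumptions for illustration, not BAM's implementation.

```python
# Sketch only: dense-to-MoE upcycling with attention experts.
import copy
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, dim: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

class MoEBlock(nn.Module):
    def __init__(self, dense: DenseBlock, n_experts: int, dim: int = 64):
        super().__init__()
        # Upcycle: each expert starts as an exact copy of the dense weights,
        # so the MoE begins from the dense model's function.
        self.attn_experts = nn.ModuleList(
            [copy.deepcopy(dense.attn) for _ in range(n_experts)])
        self.ffn_experts = nn.ModuleList(
            [copy.deepcopy(dense.ffn) for _ in range(n_experts)])
        self.router = nn.Linear(dim, n_experts)  # trained from scratch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Simplified top-1 routing per sequence (real MoEs route per token).
        gate = self.router(x.mean(dim=1)).argmax(dim=-1)
        out = torch.empty_like(x)
        for i, e in enumerate(gate.tolist()):
            h = x[i : i + 1]
            h, _ = self.attn_experts[e](h, h, h)
            out[i] = (h + self.ffn_experts[e](h)).squeeze(0)
        return out

dense = DenseBlock()
moe = MoEBlock(dense, n_experts=4)
y = moe(torch.randn(2, 10, 64))  # batch of 2 sequences, length 10
```
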

Acyr Locatelli (@acyr_l)'s Twitter Profile Photo

I'm hiring performance engineers for the pre-training team at Cohere. If you enjoy writing efficient kernels and working on hardware-aligned architecture design and optimisation, do reach out! Check out the live job posting here: jobs.ashbyhq.com/cohere/d42f5fd…