Jacqueline He (@jcqln_h)'s Twitter Profile
Jacqueline He

@jcqln_h

cs phd @uwnlp, prev. bse cs @princeton

ID: 992257972955791360

Link: http://jacqueline-he.github.io · Joined: 04-05-2018 04:21:50

196 Tweets

149 Followers

61 Following

Howard Yen (@howardyen1)'s Twitter Profile Photo

Introducing HELMET, a long-context benchmark that supports >=128K length, covering 7 diverse applications.

We evaluated 51 long-context models and found that HELMET provides more reliable signals for model development

github.com/princeton-nlp/…

A 🧵 on why you should use HELMET⛑️
Jacqueline He (@jcqln_h)'s Twitter Profile Photo

Check out our OpenScholar project!! Huge congrats to Akari Asai for leading the project — working with her has been a wonderful experience!! 🌟

Akari Asai (@akariasai)'s Twitter Profile Photo

🚨 I’m on the job market this year! 🚨
I’m completing my Allen School Ph.D. (2025), where I identify and tackle key LLM limitations like hallucinations by developing new models—Retrieval-Augmented LMs—to build more reliable real-world AI systems. Learn more in the thread! 🧵
Hila Gonen (@hila_gonen)'s Twitter Profile Photo

Extremely excited to share that I will be joining UBC Computer Science as an Assistant Professor this summer! I will be recruiting students this coming cycle!

Ai2 (@allen_ai)'s Twitter Profile Photo

Can AI really help with literature reviews? 🧐

Meet Ai2 ScholarQA, an experimental solution that allows you to ask questions that require multiple scientific papers to answer. It gives more in-depth, detailed, and contextual answers with table comparisons and expandable sections.
Hamish Ivison (@hamishivi)'s Twitter Profile Photo

We trained a diffusion LM!

🔁 Adapted from Mistral v0.1/v0.3.
📊 Beats AR models in GSM8k when we finetune on math data.
📈 Performance improves by using more test-time compute (reward guidance or more diffusion steps).

Check out Jake Tae's thread for more details!
Stella Li (@stellalisy)'s Twitter Profile Photo

Asking the right questions can make or break decisions in high-stakes fields like medicine, law, and beyond✴️
Our new framework ALFA—ALignment with Fine-grained Attributes—teaches LLMs to PROACTIVELY seek information through better questions🏥❓ (co-led with Jimin Mun)
👉🏻🧵
Hamish Ivison (@hamishivi)'s Twitter Profile Photo

How well do data-selection methods work for instruction-tuning at scale?

Turns out, when you look at large, varied data pools, lots of recent methods lag behind simple baselines, and a simple embedding-based method (RDS) does best!

More below ⬇️ (1/8)
Zhiyuan Zeng (@zhiyuanzeng_)'s Twitter Profile Photo

Is a single accuracy number all we can get from model evals?🤔

🚨Does NOT tell where the model fails
🚨Does NOT tell how to improve it

Introducing EvalTree🌳
🔍identifying LM weaknesses in natural language
🚀weaknesses serve as actionable guidance

(paper&demo 🔗in🧵) [1/n]

Ilia Shumailov🦔 (@iliaishacked)'s Twitter Profile Photo

Are modern large language models (LLMs) vulnerable to privacy attacks that can determine if given data was used for training? Models and datasets are quite large; what should we even expect? Our new paper looks into this exact question. 🧵 (1/10)
