Lucy Li (@lucy3_li)'s Twitter Profile
Lucy Li

@lucy3_li

@UCBerkeley PhD student + @allen_ai. Human-centered #NLProc, computational social science, AI fairness. she/her. https://t.co/rtSSUhWQnL

ID: 861417356756533248

Link: http://lucy3.github.io · Joined: 08-05-2017 03:07:57

3.0K Tweets

4.1K Followers

1.5K Following

Michelle Lam (@michelle123lam)'s Twitter Profile Photo

“Can we get a new text analysis tool?”
“No—we have Topic Model at home”

Topic Model at home: outputs vague keywords; needs constant parameter fiddling🫠

Is there a better way? We introduce LLooM, a concept induction tool to explore text data in terms of interpretable concepts🧵

Omar Shaikh (@oshaikh13)'s Twitter Profile Photo

Tired of your language model 'delving' into things? Or maybe you like delving! We're working on a new interaction & method to customize language models, and we're looking for participants! If you're interested, please fill out the Google Form below.

forms.gle/6hyPgvSsqRSYHp…

Lucy Li (@lucy3_li)'s Twitter Profile Photo

I am a fifth-year PhD student, yet I still get nervous when I see the little icon in Overleaf showing that my advisor is actively looking at our paper

Lucy Li (@lucy3_li)'s Twitter Profile Photo

📢 The Gates Foundation has posted a Request for Information on AI-Powered Innovations in Mathematics Teaching & Learning
usprogram.gatesfoundation.org/news-and-insig…

If you are working at the intersection of AI & math education, especially approaches that center equity, consider responding!

Sasha Rush (@srush_nlp)'s Twitter Profile Photo

Lazy twitter: A common question in NLP class is 'if xBERT worked well, why didn't people make it bigger?' but I realize I just don't know the answer. I assume people tried but that a lot of that is unpublished. Is the theory that denoising gets too easy for big models?

Shayne Longpre (@ShayneRedford)'s Twitter Profile Photo

🌟Several dataset releases deserve a mention for their incredible data measurement work 🌟

➡️ The Pile (arxiv.org/abs/2101.00027) Leo Gao Stella Biderman

➡️ ROOTS (arxiv.org/abs/2303.03915) Hugo Laurençon++

➡️ Dolma (arxiv.org/abs/2402.00159) Luca Soldaini 🎀 Kyle Lo

14/

Xiang Yue (@xiangyue96)'s Twitter Profile Photo

🚀Introducing VisualWebBench: A Comprehensive Benchmark for Multimodal Web Page Understanding and Grounding. visualwebbench.github.io

🤔What's this all about? Why this benchmark?
> Back in Nov 2023, when we released MMMU (mmmu-benchmark.github.io), a comprehensive multimodal

Jiayi Pan (@pan_jiayipan)'s Twitter Profile Photo

New paper from @Berkeley_AI on Autonomous Evaluation and Refinement of Digital Agents!

We show that VLM/LLM-based evaluators can significantly improve the performance of agents for web browsing and device control, advancing the state of the art by 29% to 75%.

arxiv.org/abs/2404.06474 [🧵]

Vaibhav Adlakha (@vaibhav_adlakha)'s Twitter Profile Photo

We introduce LLM2Vec, a simple approach to transform any decoder-only LLM into a text encoder. We achieve SOTA performance on MTEB in the unsupervised and supervised category (among the models trained only on publicly available data). 🧵1/N

Paper: arxiv.org/abs/2404.05961
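The tweet does not spell out the mechanics, but the core step in turning a decoder-only LLM into a text encoder is pooling its per-token hidden states into a single vector. A minimal sketch of masked mean pooling, one common pooling choice; the helper below is illustrative and is not the paper's actual code (LLM2Vec's full recipe involves more, e.g. its training procedure):

```python
def mean_pool(hidden_states, attention_mask):
    """Average the per-token hidden states of non-padding tokens.

    hidden_states: list of per-token vectors (lists of floats) from the
        model's last layer.
    attention_mask: list of 1 (real token) / 0 (padding) flags, same length.
    """
    kept = [h for h, m in zip(hidden_states, attention_mask) if m]
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

# toy example: 3 tokens, last one is padding and is ignored
hs = [[1.0, 2.0], [3.0, 4.0], [99.0, 99.0]]
emb = mean_pool(hs, [1, 1, 0])  # → [2.0, 3.0]
```

The resulting fixed-size vector can then be compared across texts (e.g. by cosine similarity), which is what embedding benchmarks like MTEB evaluate.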

Esin Durmus (@esindurmusnlp)'s Twitter Profile Photo

Our latest study measures how persuasive language models like Claude are compared to humans. We find a general scaling trend: newer models tend to be more persuasive, with Claude 3 Opus generating arguments that don't differ statistically from human-written ones.

Dallas Card (@dallascard)'s Twitter Profile Photo

I'm excited to share that the journal version of our paper, 'An archival perspective on pretraining data', is now available (open access) from Patterns!

This project was led by Meera Desai, along with Irene Pasquetto, Abigail Jacobs, and myself

1/n

Pablo Montalvo (@m_olbap)'s Twitter Profile Photo

It was hard to find quality OCR data... until today! Super excited to announce the release of the 2 largest public OCR datasets ever 📜 📜

OCR is critical for document AI: here, 26M+ pages, 18b text tokens, 6TB! Thanks to UCSF Library, Industry Documents Library and PDF Association
🧶 ↓

Manuel Mager (Turatemai) (@pywirrarika)'s Twitter Profile Photo

Lucy Li Most of us in LATAM identify ourselves as part of the American continent. Sadly, that does not align with the US/European conception of more than one continent (South, North...). This is why "Panamerica" is a really inclusive term, and it is our proposal.

Umang Bhatt (@umangsbhatt)'s Twitter Profile Photo

Congrats to those with accepted papers -- time to head to Rio 🥳

Applications for ACM FAccT financial support are due on April 12 for anyone interested! We have funds available for registration, travel, accommodation, etc.

facctconference.org/2024/scholarsh…

Yekyung Kim (@YekyungKim)'s Twitter Profile Photo

Summarizing long documents (>100K tokens) is a popular use case for LLMs, but how faithful are these summaries? We present FABLES, a dataset of human annotations of faithfulness & content selection in LLM-generated summaries of books.

arxiv.org/abs/2404.01261

🧵below:

Alice Oh (@aliceoh)'s Twitter Profile Photo

Let's be honest. For everyone in ML/NLP, it's a really exciting time but also very stressful with so many new papers, models, benchmarks, deadlines, reviews, talks, workshops, conferences...

How do you keep up and stay sane?

For me, the only solution is collaboration. 1/n

Andrew Gray | @generalising@mastodon.flooey.org (@generalising)'s Twitter Profile Photo

I have a preprint out! Evidence for extensive appearance of chatGPT/LLM derived text in scholarly papers, signalled by words that mysteriously became a lot more popular in 2023 - eg 'commendable'. I estimate upwards of 60,000 papers last year (& rising...) arxiv.org/abs/2403.16887
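The detection signal described here is a year-over-year spike in a word's relative frequency. A toy sketch of that comparison; the corpora and the `frequency_ratio` helper are hypothetical illustrations, not the preprint's actual pipeline:

```python
from collections import Counter

def frequency_ratio(word, tokens_2022, tokens_2023):
    """Ratio of `word`'s relative frequency in 2023 vs. 2022."""
    f22 = Counter(tokens_2022)[word] / len(tokens_2022)
    f23 = Counter(tokens_2023)[word] / len(tokens_2023)
    return f23 / f22 if f22 else float("inf")

# toy corpora: 'commendable' appears ten times as often in the 2023 sample
tokens_2022 = ["the", "results", "are", "solid"] * 100 + ["commendable"]
tokens_2023 = ["the", "results", "are", "solid"] * 100 + ["commendable"] * 10

ratio = frequency_ratio("commendable", tokens_2022, tokens_2023)  # ≈ 9.8
```

A large ratio for a word with no obvious real-world cause is the kind of anomaly the preprint uses as evidence of LLM-derived text.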
