Shicheng Liu (@shichenggliu) Twitter Tweets • TwiCopy

Stanford OVAL

10 months ago

Democratizing AI-Assisted Access to Knowledge! The Stanford OVAL Lab is leading an initiative to create a public AI Assistant that democratizes access to the world's knowledge. Our pilot program, already embraced by over 400,000 users, generates Wikipedia-like articles through

31 likes, 2 replies, 14 reposts

Aryaman Arora

@aryaman2020

10 months ago

new paper! 🫡 we introduce 🪓AxBench, a scalable benchmark that evaluates interpretability techniques on two axes: concept detection and model steering. we find that: 🥇prompting and finetuning are still best 🥈supervised interp methods are effective 😮SAEs lag behind

416 likes, 7 replies, 67 reposts

Yijia Shao

@echoshao8899

9 months ago

🎉 For the first time ever: Collaborate with AI agents in real-time! Collaborative Gym UI is now IRB-approved and live at cogym.saltlab.stanford.edu! A group of agents is eager to work with you. By providing feedback, you will see the agent's identity and its feedback to you!

118 likes, 1 reply, 31 reposts

Stanford NLP Group

@stanfordnlp

9 months ago

Congratulations to Stanford NLP Group founder Christopher Manning for being elected to The National Academy of Engineering (NAE, National Academies) Class of 2025 for the development and dissemination of natural language processing methods.

309 likes, 9 replies, 34 reposts

CLS

@chengleisi

5 months ago

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.

553 likes, 10 replies, 162 reposts

Shicheng Liu

@shichenggliu

4 months ago

Come check out GenieWorksheets for programmable, trustworthy LLM-based conversational agents, and don’t miss the chance to talk to Harshit Joshi!

3 likes, 0 replies, 0 reposts

Harshit Joshi

@harshitj__

4 months ago

flying to Vienna 🇦🇹 for ACL to present Genie Worksheets (Monday 11am)! come and say hi if you want to talk about how to create controllable and reliable application layers on top of LLMs, knowledge discovery and curation, or just wanna hang

40 likes, 2 replies, 17 reposts

Shicheng Liu

@shichenggliu

3 months ago

This is great work to check out! Cursor / Claude Code often attempt to accomplish tasks with shortcuts, such as just skipping a pytest. Research like this is increasingly necessary to build trustworthy and truly instruction-following systems.

5 likes, 0 replies, 1 repost

Stanford NLP Group

@stanfordnlp

3 months ago

Stanford NLP Group has grown from a single office to our own wing! And our new lounge is coming together. But, more to get in place before the big reveal.

112 likes, 1 reply, 2 reposts

Ken Liu

@kenziyuliu

3 months ago

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:

362 likes, 12 replies, 72 reposts

Shicheng Liu

@shichenggliu

2 months ago

Come and talk to Aryaman at COLM!

1 like, 0 replies, 0 reposts

Rulin Shao

@rulinshao

2 months ago

#COLM2025 Please drop by our ReasonIR poster at Poster3 #967 (11:00am - 1:00pm Wed) by Varsha Kishore 🥰 Happy to answer questions or chat online--feel free to DM! I've been exploring deep research training lately to empower reasoning+search for complex tasks💪 Stay tuned!

98 likes, 0 replies, 9 reposts

Shicheng Liu

@shichenggliu

2 months ago

Many inconsistencies in Wikipedia discovered with the help of LLMs!

22 likes, 2 replies, 2 reposts

Eugenia Rho

@eugeniarho

a month ago

Had such a great time at the Stanford NLP Group reunion this past weekend, catching up with old friends and meeting new ones. Huge thanks to student organizers Dilara Soylu and Shicheng Liu for making everything seamless.

46 likes, 1 reply, 7 reposts

Kristina Gligorić

@krisgligoric

21 days ago

I'm recruiting multiple PhD students for Fall 2026 in Computer Science at JHU Computer Science 🍂 Apply to work on AI for social sciences/human behavior, social NLP, and LLMs for real-world applied domains you're passionate about! Learn more kristinagligoric.com & help spread the word!

655 likes, 15 replies, 155 reposts

Shicheng Liu

@shichenggliu

15 days ago

Alex is a fantastic researcher! It would be great working with him!

0 likes, 0 replies, 0 reposts