Shicheng Liu (@shichenggliu) 's Twitter Profile
Shicheng Liu

@shichenggliu

CS PhD @StanfordNLP @StanfordOVAL

ID: 1779975320910934016

Link: https://george1459.github.io/
Date: 15-04-2024 20:49:27

31 Tweets

169 Followers

131 Following

Stanford OVAL (@stanfordoval) 's Twitter Profile Photo

Democratizing AI-Assisted Access to Knowledge!

The Stanford OVAL Lab is leading an initiative to create a public AI Assistant that democratizes access to the world's knowledge. Our pilot program, already embraced by over 400,000 users, generates Wikipedia-like articles through
Aryaman Arora (@aryaman2020) 's Twitter Profile Photo

new paper! 🫡

we introduce 🪓AxBench, a scalable benchmark that evaluates interpretability techniques on two axes: concept detection and model steering.

we find that:
🥇prompting and finetuning are still best
🥈supervised interp methods are effective
😮SAEs lag behind
Yijia Shao (@echoshao8899) 's Twitter Profile Photo

🎉 For the first time ever: Collaborate with AI agents in real-time! Collaborative Gym UI is now IRB-approved and alive at cogym.saltlab.stanford.edu!
A group of agents is eager to work with you. By providing feedback, you will see the agent's identity and its feedback to you!
Stanford NLP Group (@stanfordnlp) 's Twitter Profile Photo

Congratulations to Stanford NLP Group founder Christopher Manning for being elected to The National Academy of Engineering (NAE, National Academies) Class of 2025 for the development and dissemination of natural language processing methods.
CLS (@chengleisi) 's Twitter Profile Photo

Are AI scientists already better than human researchers?

We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts.

Main finding: LLM ideas result in worse projects than human ideas.
Shicheng Liu (@shichenggliu) 's Twitter Profile Photo

Come check out GenieWorksheets for programmable trustworthy LLM-based conversational agents and don’t miss the chance to talk to Harshit Joshi!

Harshit Joshi (@harshitj__) 's Twitter Profile Photo

flying to Vienna 🇦🇹 for ACL to present Genie Worksheets (Monday 11am)!

come and say hi if you want to talk about how to create controllable and reliable application layers on top of LLMs, knowledge discovery and curation, or just wanna hang
Shicheng Liu (@shichenggliu) 's Twitter Profile Photo

This is great work to check out! Cursor / Claude Code often attempt to accomplish tasks with shortcuts, such as just skipping a pytest. Research like this is increasingly necessary to build trustworthy and truly instruction-following systems.

Ken Liu (@kenziyuliu) 's Twitter Profile Photo

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions.

Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:
Rulin Shao (@rulinshao) 's Twitter Profile Photo

#COLM2025 Please drop by our ReasonIR poster at Poster3 #967 (11:00am - 1:00pm Wed) by Varsha Kishore 🥰

Happy to answer questions or chat online--feel free to DM! I've been exploring deep research training lately to empower reasoning+search for complex tasks💪 Stay tuned!
Eugenia Rho (@eugeniarho) 's Twitter Profile Photo

Had such a great time at the Stanford NLP Group reunion this past weekend, catching up with old friends and meeting new ones. Huge thanks to student organizers Dilara Soylu and Shicheng Liu for making everything seamless.

Kristina Gligorić (@krisgligoric) 's Twitter Profile Photo

I'm recruiting multiple PhD students for Fall 2026 in Computer Science at JHU Computer Science 🍂
Apply to work on AI for social sciences/human behavior, social NLP, and LLMs for real-world applied domains you're passionate about!
Learn more kristinagligoric.com & help spread the word!