Jianyang Gu (@vimar_gu) 's Twitter Profile
Jianyang Gu

@vimar_gu

Postdoc @ the Ohio State University

ID: 1679372176334688257

Link: http://vimar-gu.github.io
Joined: 13-07-2023 06:08:38

10 Tweets

36 Followers

94 Following

Zanming Huang (@tzmhuang) 's Twitter Profile Photo

We're already using AI search systems every day for more and more complex tasks, but how good are they really? Challenge: evaluation is hard with no fixed ground truth! In Mind2Web 2, we use agents to evaluate agents. Really excited! Thanks to everyone who made this possible!

Huan Sun (OSU) (@hhsun1) 's Twitter Profile Photo

🚨 Postdoc Hiring: I am looking for a postdoc to work on rigorously evaluating and advancing the capabilities and safety of computer-use agents (CUAs), co-advised with Yu Su OSU NLP Group. We welcome strong applicants with experience in CUAs, long-horizon reasoning/planning,

Sam Stevens (@samstevens6860) 's Twitter Profile Photo

I'm excited to bring the Imageomics workshop to NeurIPS 2025! Consider submitting your work on ai4ecology, ai4conservation and general ai4science--if you're using images to learn something about the natural world, chances are it's a good fit for the imageomics workshop!

Vardaan Pahuja (@vardaanpahuja) 's Twitter Profile Photo

🚀 Excited to share our #ACL2025 Findings paper: Explorer — a scalable pipeline that generates diverse web trajectories via exploration, powering generalist GUI agents with strong performance! 📄 arxiv.org/pdf/2502.11357 🌐 osu-nlp-group.github.io/Explorer/ #WebAgents #SyntheticData #LLM

Yu Su @#ICLR2025 (@ysu_nlp) 's Twitter Profile Photo

Safety is one of the biggest blockers for computer use agents: how can I trust an agent won’t accidentally do something consequential without my permission? We collect and release the first large-scale dataset for detecting consequential actions on the web, and train the best

Boyuan Zheng (@boyuan__zheng) 's Twitter Profile Photo

Remember “Son of Anton” from the Silicon Valley show (@SiliconHBO)? The experimental AI that “efficiently” orders 4,000 lbs of meat while looking for a cheap burger and “fixes” a bug by deleting all the code? It’s starting to look a lot like reality. Even 18 months ago, my own

Lauren Gillespie (@leg2015) 's Twitter Profile Photo

Excited to be giving a keynote at the NeurIPS version of the Imageomics workshop! 🎉It's also not too late to submit your own work to the workshop! (Aug. 22nd deadline, details below 👇)

Yu Su @#ICLR2025 (@ysu_nlp) 's Twitter Profile Photo

Excited to receive the NSF CAREER Award! Grateful for all the support and encouragement I've received in the 6 years of faculty life so far, especially for my extremely supportive family and for the amazing students @osunlp I have had the privilege to work with!!

Tianshu Zhang (@tianshu_osu) 's Twitter Profile Photo

🎉 Excited to share that our paper EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution was accepted at VLDB 2025! 🚀 📢 Reminder: join us at VLDB 2025 in London! 🗓️ Sept 2 (Tue), 10:45 AM – 12:15 PM 📍 Room Wordsworth 4F 📄 vldb.org/pvldb/vol18/p3… #VLDB2025 #LLMs

Huan Sun (OSU) (@hhsun1) 's Twitter Profile Photo

🧪 Chemists spend many hours planning and replanning synthetic routes for a target molecule to avoid dangerous reactants and intermediates.☠️🚫 🤔 What if an AI agent could plan around them automatically—better and faster than human experts? 🔬 Constrained retrosynthesis

Huan Sun (OSU) (@hhsun1) 's Twitter Profile Photo

I am humbled and grateful to receive two grants from Open Philanthropy to advance the safety of AI systems, co-led with my colleague Yu Su (hiring postdoc). I'm also honored to be the first at Ohio State to receive Open Philanthropy funding. Most credit goes to the amazing students

Yu Su @#ICLR2025 (@ysu_nlp) 's Twitter Profile Photo

Computer Use: Modern Moravec's Paradox A new blog post arguing why computer-use agents may be the biggest opportunity and challenge for AGI. tinyurl.com/computer-use-a… Table of Contents > Moravec’s Paradox > Moravec's Paradox in 2025 > Computer use may be the biggest opportunity

Yu Su @#ICLR2025 (@ysu_nlp) 's Twitter Profile Photo

> working on semantic parsing in PhD > didn't even have its own track at ACL > it's a dead area, people say > had ~100 citations when graduating > but natural language programming is always the dream > 'Let machines understand human thinking. Don’t let humans think like machines'

Huan Sun (OSU) (@hhsun1) 's Twitter Profile Photo

While at #COLM2025, sharing a bit COLMy news related to the keynote yesterday by Dr. Shirley Ho Conference on Language Modeling: So, you know frontier LLMs have achieved gold medal level performance for IMO, IPhO, and IOI; what about astronomy and astrophysics? Our recent evaluation shows that

Hanane Moussa (@hananenmoussa) 's Twitter Profile Photo

📢 As AI becomes increasingly explored for research idea generation, how can we rigorously evaluate the ideas it generates before committing time and resources to them? We introduce ScholarEval, a literature grounded framework for research idea evaluation across disciplines 👇!