Yusen Zhang (@yusenzhangnlp)'s Twitter Profile
Yusen Zhang

@yusenzhangnlp

PhD Candidate @PennStateEECS | NLP Lab @NLP_PennState #NLProc | Prev Research Intern @MSFTResearch, @AmazonScience @GoogleAI

ID: 1590626031332954113

Link: http://yuszh.com
Joined: 10-11-2022 08:43:02

86 Tweets

352 Followers

436 Following

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? "we introduce HRScene, a novel unified benchmark for HRI understanding with rich scenes. HRScene incorporates 25 real-world datasets and 2 synthetic diagnostic datasets with resolutions ranging from

Ryo Kamoi (@ryokamoi)

📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training data using formal verification tools and improve LLMs at step-level verification of LLM responses on MATH, AIME, MMLU, BBH, etc. arxiv.org/abs/2505.15960

GPT Maestro | LLMpedia Curator (@gptmaestro)

Vision Language Models display a peculiar blind spot: their ability to process image content declines in a U-shaped pattern based on Manhattan distance from corners, suggesting fundamental limitations in handling high-resolution layouts.

Yusen Zhang (@yusenzhangnlp)

HRScene got accepted at #ICCV2025! HRScene is a novel unified benchmark for high-resolution image understanding with 25 scenes and 2 NIAH tests. Home page: yszh8.github.io/hrscene/ (Sorry, EvalAI for submission does not work currently...) My PhD research began with long text

Ryo Kamoi (@ryokamoi)

Our paper VisOnlyQA has been accepted to Conference on Language Modeling #COLM2025! See you in Montreal🍁 We find that even recent Vision Language Models struggle with simple questions about geometric properties in images, such as "What is the degree of angle AOD?"🧐 arxiv.org/abs/2412.00947

Ryo Kamoi (@ryokamoi)

We updated our VisOnlyQA paper for #COLM2025! * LVLMs exhibit weak geometric perception even on geometric shapes with 2–3 lines 😭 * Gemini 2.5 Pro largely improves over prior models on charts and chemistry 😳 but still struggles with geometric shapes 😖 arxiv.org/abs/2412.00947

Penn State Center for Socially Responsible AI (@pennstatecsrai)

How are researchers optimizing AI systems for science? CSRAI Affiliate Rui Zhang from Penn State EECS shares how to improve the efficiency and usefulness of AI and some strategies individuals can employ to get more value out of their personal AI use. psu.edu/news/engineeri…

Rui Zhang (@ruizhang_nlp)

📢 Call for Papers: NewSumm 2025 - The 5th New Frontiers in Summarization Workshop at EMNLP 2025 The summarization research community is invited to submit to NewSumm 2025, co-located with EMNLP 2025! As LLMs continue to transform our field, we're expanding beyond traditional

Ryo Kamoi (@ryokamoi)

I'll be attending #COLM2025 Conference on Language Modeling in person 🇨🇦 I will present our work, VisOnlyQA, on the limitations of vision-language models at Poster Session 4 (Wed). Looking forward to chatting with everyone! Paper: openreview.net/forum?id=PYHwl… x.com/RyoKamoi/statu…
