Tianzhe Chu @ ICLR 2025 (@tianzhec) 's Twitter Profile
Tianzhe Chu @ ICLR 2025

@tianzhec

Now @hkudatascience. Previous @ShanghaiTechUni, visited @UCBerkeley.

ID: 1565621205528326145

Link: http://tianzhechu.com · Joined: 02-09-2022 08:42:45

73 Tweets

230 Followers

171 Following

Kyunghyun Cho (@kchonyc) 's Twitter Profile Photo

it feels like Yann LeCun is going through his decades-old ideas and re-introducing them one at a time 😂 was the optimal alpha = 2/3 and the optimal gamma = 1.7159? 🤣🤣🤣

Anthropic (@anthropicai) 's Twitter Profile Photo

New Anthropic research: Tracing the thoughts of a large language model. We built a "microscope" to inspect what happens inside AI models and use it to understand Claude’s (often complex and surprising) internal mechanisms.

Peyman Milanfar (@docmilanfar) 's Twitter Profile Photo

Statistical "degrees of freedom" (df) is in general not the same as "the # of parameters." The df for any 1-1 (‘image-to-image’) model ŷ =𝐟(y) : ℝⁿ → ℝⁿ is the trace of its Jacobian: df = div[ 𝐟(y) ] = Trace[ ∇ 𝐟(y) ] 1/n

Statistical "degrees of freedom" (df) is in general not the same as "the # of parameters."

The df for any 1-1 (‘image-to-image’) model

ŷ = 𝐟(y) : ℝⁿ → ℝⁿ

is the trace of its Jacobian:

df = div[𝐟(y)] = Trace[∇𝐟(y)]

1/n
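The distinction above is easy to check numerically. A minimal sketch (the moving-average smoother, the variable names, and the finite-difference estimator are illustrative assumptions, not from the thread): for a linear smoother f(y) = Ay, the degrees of freedom Trace[∇f(y)] reduce to trace(A), which can be far smaller than any naive parameter count.

```python
import numpy as np

# For an image-to-image model f: R^n -> R^n, statistical degrees of freedom
# are df = Trace[∇f(y)], not the number of parameters. Toy example
# (assumption, for illustration): a linear 3-tap moving-average smoother
# f(y) = A y, whose df is exactly trace(A).

rng = np.random.default_rng(0)
n = 50

# Build the smoother: each output averages up to 3 neighboring inputs.
A = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        if 0 <= j < n:
            A[i, j] = 1 / 3

def f(y):
    return A @ y

# Estimate df = div[f(y)] = sum_i ∂f_i/∂y_i by finite differences.
y = rng.normal(size=n)
eps = 1e-6
df_est = sum((f(y + eps * e)[i] - f(y)[i]) / eps
             for i, e in enumerate(np.eye(n)))

print(round(df_est, 3), round(np.trace(A), 3))  # prints 16.667 16.667
```

Here A has n² = 2500 entries, yet df = n/3 ≈ 16.7, illustrating the thread's point that df and parameter count are different quantities; for nonlinear denoisers the trace is usually estimated stochastically rather than with n forward passes.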
David Fan (@davidjfan) 's Twitter Profile Photo

Can visual SSL match CLIP on VQA?

Yes! We show with controlled experiments that visual SSL can be competitive even on OCR/Chart VQA, as demonstrated by our new Web-SSL model family (1B-7B params) which is trained purely on web images – without any language supervision.
Peter Tong (@tongpetersb) 's Twitter Profile Photo

Vision models have been smaller than language models; what if we scale them up?

Introducing Web-SSL: A family of billion-scale SSL vision models (up to 7B parameters) trained on billions of images without language supervision, using VQA to evaluate the learned representation.
Peter Tong (@tongpetersb) 's Twitter Profile Photo

We're open-sourcing the training code for MetaMorph! MetaMorph offers a lightweight framework for turning LLMs into unified multimodal models: (multimodal) tokens -> transformers -> diffusion -> pixel! This is our best take on unified modeling as of November 2024, and

Tianzhe Chu @ ICLR 2025 (@tianzhec) 's Twitter Profile Photo

Will be at ICLR 2025!
No paper
No plan
With camera

Venmo me $5 and you can get an edited portrait plus an IG follower.
Tariffed 245% if you pay by Zelle
Druv Pai (@druv_pai) 's Twitter Profile Photo

I'm at ICLR this week! I'll be presenting ToST, a (provably) computationally efficient high-performance deep architecture derived from information theory and convex analysis principles.

📅 Saturday April 26, 10AM-12:30PM
📌 Hall 3 + Hall 2B #145
💡Awarded a Spotlight!

(1/3)
Chun-Hsiao (Daniel) Yeh (@danielyehhh) 's Twitter Profile Photo

❗️❗️ Can MLLMs understand scenes from multiple camera viewpoints — like humans?

🧭 We introduce All-Angles Bench — 2,100+ QA pairs on multi-view scenes.

📊 We evaluate 27 top MLLMs, including Gemini-2.0-Flash, Claude-3.7-Sonnet, and GPT-4o.

🌐 Project: danielchyeh.github.io/All-Angles-Ben…
Kai He (@kai__he) 's Twitter Profile Photo

🚀 Introducing UniRelight, a general-purpose relighting framework powered by video diffusion models. 🌟UniRelight jointly models the distribution of scene intrinsics and illumination, enabling high-quality relighting and intrinsic decomposition from a single image or video.

Two Minute Papers (@twominutepapers) 's Twitter Profile Photo

NVIDIA’s AI watched 150,000 videos… and learned to relight scenes incredibly well! No game engine. No 3D software. And it has an amazing cat demo. 🐱💡
Hold on to your papers! Full video: youtube.com/watch?v=yRk6vG…
Jyo Pari (@jyo_pari) 's Twitter Profile Photo

For agents to improve over time, they can’t afford to forget what they’ve already mastered.

We found that supervised fine-tuning forgets more than RL when training on a new task! 

Want to find out why? 👇
Jasper (@zjasper666) 's Twitter Profile Photo

GAUSS: General Assessment of Underlying Structured Skills in Mathematics

We’re excited to launch GAUSS, a next-generation math AI benchmark built to overcome the limitations of low skill resolution in today’s benchmarks.

What it does
GAUSS profiles LLMs across 12 cognitive
Conference on Parsimony and Learning (CPAL) (@cpalconf) 's Twitter Profile Photo

Calling all parsimony and learning researchers 🚨🚨 The 3rd annual CPAL will be held in Tübingen, Germany, March 23–26, 2026! Check out this year's website for all the details: cpal.cc

Saining Xie (@sainingxie) 's Twitter Profile Photo

three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right.

today, we introduce Representation Autoencoders (RAE).

>> Retire VAEs. Use RAEs. 👇(1/n)