Zhecan James Wang (@zcjw2021)'s Twitter Profile
Zhecan James Wang

@zcjw2021

CS Ph.D.@Columbia University, Student Researcher@Google Deepmind, Part-time Researcher@Microsoft Research, Department Representative@Columbia Engineering

ID: 1300123937121153024

Link: http://zhecanwang.com · Joined: 30-08-2020 17:31:25

38 Tweets

75 Followers

495 Following

Rohan Paul (@rohanpaul_ai):


Consolidated insights on LLM fine-tuning - a long read across 114 pages.

"Ultimate Guide to Fine-Tuning LLMs"

Worth a read during the weekend.

A few areas it covers 👇

📊 Fine-tuning Pipeline

→ Outlines a seven-stage process for fine-tuning LLMs, from data preparation to

Xueqing Wu (@xueqing_w):

🔥Come check our 📊𝗗𝗔𝗖𝗢 paper at #NeurIPS2024! Our poster will be at 📍𝗪𝗲𝘀𝘁 𝗕𝗮𝗹𝗹𝗿𝗼𝗼𝗺 #𝟱𝟯𝟬𝟰 ⏰ tomorrow, Thursday 12/12, 𝟭𝟭𝗮𝗺–𝟮𝗽𝗺! 😆I’ll be handing out 🍫 chocolates!

Zhecan James Wang (@zcjw2021):

Excited to be at #NeurIPS2024 in #Vancouver from today through Sunday! If you’re interested in multimodal learning, I’d love to connect. Feel free to drop by, chat, and catch our poster and oral sessions:

Poster Session:
⏲️When: Thu, Dec 12, 11 a.m. – 2 p.m. PST
🗻Where: East

Jiao Sun (@sunjiao123sun_):


Mitigating racial bias from LLMs is a lot easier than removing it from humans! 

Can’t believe this happened at the best AI conference, NeurIPS Conference.

We have ethical reviews for authors, but not for invited speakers? 😡
Dhruv Batra (@dhruvbatradb):


Brilliant talk by Ilya Sutskever, but he's wrong on one point.

We are NOT running out of data. We are running out of human-written text.

We have more videos than we know what to do with. We just haven't solved pre-training in vision. 

Just go out and sense the world. Data is easy.
Zhecan James Wang (@zcjw2021):

As a fellow Oliner, I’m proud and excited to see Olin College alumni like Alec Radford and Luke Metz pushing the boundaries of AI research. Their work at the frontier of the field is truly inspiring!

Rohan Paul (@rohanpaul_ai):


Nice paper from Google DeepMind.

When models share work, they accidentally share your secrets too.

MoE models can leak user prompts through expert routing vulnerabilities in batched processing.

Expert-Choice Routing and token dropping in MoE create a backdoor to steal user
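
To make the routing side channel concrete, here is a minimal toy sketch (my own illustration under simplifying assumptions, not code from the DeepMind paper): in expert-choice routing each expert keeps only its top-capacity tokens across the whole batch, so which of a victim's tokens get dropped depends on the attacker's co-batched tokens.

    # Toy sketch of expert-choice routing with a fixed per-expert capacity.
    # Simplified for illustration; the real attack in the paper is more involved.
    import numpy as np

    def expert_choice_route(scores, capacity):
        # scores: (num_tokens, num_experts) router scores for one batch.
        # Each expert keeps its `capacity` highest-scoring tokens, batch-wide.
        kept = np.zeros(scores.shape, dtype=bool)
        for e in range(scores.shape[1]):
            top = np.argsort(scores[:, e])[-capacity:]
            kept[top, e] = True
        return kept

    rng = np.random.default_rng(0)
    victim = rng.normal(size=(4, 2))       # victim's 4 tokens, 2 experts
    attacker_a = rng.normal(size=(4, 2))   # attacker filler, variant A
    attacker_b = attacker_a + 3.0          # variant B: high scores that crowd the experts

    for name, attacker in [("A", attacker_a), ("B", attacker_b)]:
        mask = expert_choice_route(np.vstack([victim, attacker]), capacity=4)
        print(name, "victim tokens kept per expert:", mask[:4].sum(axis=0))
    # The victim's keep/drop pattern changes with the attacker's co-batched tokens,
    # which is the kind of cross-user interference the paper exploits.
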
Rohan Paul (@rohanpaul_ai):


The most discussed AI papers from last week (15-Dec-2024 to 21-Dec-2024) 👇

(consider subscribing, it's FREE and I publish it daily)

open.substack.com/pub/rohanpaul/…
DeepSeek (@deepseek_ai):

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n

Jianshu Zhang ✈️ICLR2025🇸🇬 (@sterzhang):


🚀 Introducing VLM²-Bench!

A simple yet essential ability that we use in daily life.

But when tackling vision-centric tasks without relying on prior knowledge, can VLMs perform well? 🤔

🔗 Project Page: vlm2-bench.github.io

More details below! 👇 (1/n)
Hritik Bansal (@hbxnov):


📢Scaling test-time compute via generative verification (GenRM) is an emerging paradigm, claimed to be more efficient than self-consistency (SC) for reasoning. But such claims are misleading ☠️

Our compute-matched analysis shows that SC outperforms GenRM across most budgets! 🧵
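
For context, self-consistency here just means sampling several reasoning paths and majority-voting their final answers; a minimal sketch (my own toy illustration, with sample_answer as a hypothetical stand-in for one LLM call):

    # Toy sketch of self-consistency (SC): sample N reasoning paths and
    # return the majority final answer. Under a matched compute budget,
    # these N samples are what GenRM's extra verifier calls trade against.
    from collections import Counter
    import random

    def sample_answer(question):
        # Hypothetical stand-in for one sampled LLM reasoning path;
        # only its final answer matters for the vote.
        return random.choice(["72", "72", "68"])

    def self_consistency(question, n_samples=16):
        answers = [sample_answer(question) for _ in range(n_samples)]
        return Counter(answers).most_common(1)[0][0]

    print(self_consistency("What is 8 * 9?"))  # majority answer, e.g. "72"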