Zhecan James Wang (@zcjw2021)'s Twitter Profile
Zhecan James Wang

@zcjw2021

CS Ph.D.@Columbia University, Student Researcher@Google Deepmind, Part-time Researcher@Microsoft Research, Department Representative@Columbia Engineering

ID: 1300123937121153024

Link: http://zhecanwang.com · Joined: 30-08-2020 17:31:25

38 Tweets

75 Followers

495 Following

Rohan Paul (@rohanpaul_ai):


Consolidated insights on LLM fine-tuning - a long read across 114 pages.

"Ultimate Guide to Fine-Tuning LLMs"

Worth a read during the weekend.

A few areas it covers 👇

📊 Fine-tuning Pipeline

→ Outlines a seven-stage process for fine-tuning LLMs, from data preparation to

Xueqing Wu (@xueqing_w):

🔥Come check our 📊𝗗𝗔𝗖𝗢 paper at #NeurIPS2024! Our poster will be at 📍𝗪𝗲𝘀𝘁 𝗕𝗮𝗹𝗹𝗿𝗼𝗼𝗺 #𝟱𝟯𝟬𝟰 ⏰ tomorrow, Thursday 12/12, 𝟭𝟭𝗮𝗺–𝟮𝗽𝗺! 😆I’ll be handing out 🍫 chocolates!

Zhecan James Wang (@zcjw2021):

Excited to be at #NeurIPS2024 in #Vancouver from today through Sunday! If you’re interested in multimodal learning, I’d love to connect. Feel free to drop by, chat, and catch our poster and oral sessions:

Poster Session:
⏲️When: Thu, Dec 12, 11 a.m. – 2 p.m. PST
🗻Where: East

Jiao Sun (@sunjiao123sun_):


Mitigating racial bias from LLMs is a lot easier than removing it from humans! 

Can’t believe this happened at the best AI conference, NeurIPS Conference.

We have ethical reviews for authors, but not for invited speakers? 😡
Dhruv Batra (@dhruvbatradb):


Brilliant talk by Ilya Sutskever, but he's wrong on one point.

We are NOT running out of data. We are running out of human-written text.

We have more videos than we know what to do with. We just haven't solved pre-training in vision. 

Just go out and sense the world. Data is easy.
Zhecan James Wang (@zcjw2021):

As a fellow Oliner, I’m proud and excited to see Olin College alumni like Alec Radford and Luke Metz pushing the boundaries of AI research. Their work at the frontier of the field is truly inspiring!

Rohan Paul (@rohanpaul_ai):


Nice paper from Google DeepMind.

When models share work, they accidentally share your secrets too.

MoE models can leak user prompts through expert routing vulnerabilities in batched processing.

Expert-Choice Routing and token dropping in MoE create a backdoor to steal user
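
To make the routing side channel concrete, here is a minimal toy sketch (my own illustration under simplifying assumptions, not code from the DeepMind paper): in expert-choice routing each expert keeps only its top-capacity tokens across the whole batch, so which of a victim's tokens get dropped depends on the attacker's co-batched tokens.

    # Toy sketch of expert-choice routing with a fixed per-expert capacity.
    # Simplified for illustration; the real attack in the paper is more involved.
    import numpy as np

    def expert_choice_route(scores, capacity):
        # scores: (num_tokens, num_experts) router scores for one batch.
        # Each expert keeps its `capacity` highest-scoring tokens, batch-wide.
        kept = np.zeros(scores.shape, dtype=bool)
        for e in range(scores.shape[1]):
            top = np.argsort(scores[:, e])[-capacity:]
            kept[top, e] = True
        return kept

    rng = np.random.default_rng(0)
    victim = rng.normal(size=(4, 2))       # victim's 4 tokens, 2 experts
    attacker_a = rng.normal(size=(4, 2))   # attacker filler, variant A
    attacker_b = attacker_a + 3.0          # variant B: high scores that crowd the experts

    for name, attacker in [("A", attacker_a), ("B", attacker_b)]:
        mask = expert_choice_route(np.vstack([victim, attacker]), capacity=4)
        print(name, "victim tokens kept per expert:", mask[:4].sum(axis=0))
    # The victim's keep/drop pattern changes with the attacker's co-batched tokens,
    # which is the kind of cross-user interference the paper exploits.
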
Rohan Paul (@rohanpaul_ai):


The most discussed AI papers from last week (15-Dec-2024 to 21-Dec-2024) 👇

(consider subscribing, it's FREE and I publish it daily)

open.substack.com/pub/rohanpaul/…
DeepSeek (@deepseek_ai):

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n

Jianshu Zhang ✈️ICLR2025🇸🇬 (@sterzhang):


🚀 Introducing VLM²-Bench!

A simple yet essential ability that we use in daily life.

But when tackling vision-centric tasks without relying on prior knowledge, can VLMs perform well? 🤔

🔗 Project Page: vlm2-bench.github.io

More details below! 👇 (1/n)
Hritik Bansal (@hbxnov):


📢Scaling test-time compute via generative verification (GenRM) is an emerging paradigm, claimed to be more efficient than self-consistency (SC) for reasoning. But such claims are misleading ☠️

Our compute-matched analysis shows that SC outperforms GenRM across most budgets! 🧵
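
For context, self-consistency here just means sampling several reasoning paths and majority-voting their final answers; a minimal sketch (my own toy illustration, with sample_answer as a hypothetical stand-in for one LLM call):

    # Toy sketch of self-consistency (SC): sample N reasoning paths and
    # return the majority final answer. Under a matched compute budget,
    # these N samples are what GenRM's extra verifier calls trade against.
    from collections import Counter
    import random

    def sample_answer(question):
        # Hypothetical stand-in for one sampled LLM reasoning path;
        # only its final answer matters for the vote.
        return random.choice(["72", "72", "68"])

    def self_consistency(question, n_samples=16):
        answers = [sample_answer(question) for _ in range(n_samples)]
        return Counter(answers).most_common(1)[0][0]

    print(self_consistency("What is 8 * 9?"))  # majority answer, e.g. "72"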