Haochen Zhang (@jhaochenz)'s Twitter Profile
Haochen Zhang

@jhaochenz

Member of Technical Staff @AnthropicAI | prev. CS PhD @StanfordAILab

ID: 1704090745

Website: https://cs.stanford.edu/~jhaochen/ | Joined: 27-08-2013 08:06:18

26 Tweets

743 Followers

259 Following

Haochen Zhang (@jhaochenz)'s Twitter Profile Photo

Glad to share this new work! We theoretically show that parameter-dependent noise provides a bias toward parameters where the noise itself is small. This is a potentially stronger effect than simply escaping sharp local minima. Code for our project is publicly available!
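
As a rough illustration of the effect described in the tweet (my own toy construction, not taken from the paper): on a perfectly flat loss there is no gradient signal at all, yet noise whose scale depends on the parameter still drives the iterates toward the region where that noise is small.

```python
import numpy as np

# Toy illustration (not from the paper): "gradient" descent on a perfectly flat loss,
# perturbed by noise whose standard deviation is proportional to |theta|.
# The expectation E[theta] is conserved, but the typical trajectory collapses
# toward theta = 0 -- the point where the noise itself vanishes.
rng = np.random.default_rng(0)

eta, steps, n_runs = 0.1, 5000, 200
theta = np.ones(n_runs)                 # all runs start at theta = 1

for _ in range(steps):
    xi = rng.standard_normal(n_runs)
    theta = theta - eta * np.abs(theta) * xi   # zero gradient + parameter-dependent noise

print("median |theta| after training:", np.median(np.abs(theta)))  # ~0: biased toward low noise
```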

Roger Grosse (@rogergrosse)'s Twitter Profile Photo

With the right regularizer, linear autoencoders recover the ordered principal components, but very slowly. Interesting model system for representation learning since we know the optimal representation. New work w/ Jenny (Xuchan) Bao, James Lucas, and Sushant Sachdeva. arxiv.org/abs/2007.06731

Tengyu Ma (@tengyuma)'s Twitter Profile Photo

Why does contrastive learning magically produce linearly separable features? We leverage spectral graph theory to analyze it under realistic settings. (In contrast, many prior works require that positive pairs are independent conditioned on the label.) arxiv.org/abs/2106.04156
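
For readers curious what objective the analysis studies, the paper's "spectral contrastive loss" has a simple population form, roughly -2·E[f(x)ᵀf(x⁺)] + E[(f(x)ᵀf(x'))²]. Below is a minimal, unofficial PyTorch sketch of a batch estimator of that loss; the official implementation may differ in details such as how negatives are drawn within the batch.

```python
import torch

def spectral_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    # z1, z2: [batch, dim] embeddings of two augmented views of the same inputs.
    # Batch estimate of  -2 E[f(x)^T f(x+)]  +  E[(f(x)^T f(x'))^2].
    pos = -2.0 * (z1 * z2).sum(dim=-1).mean()   # attract positive pairs
    gram = z1 @ z2.T                            # inner products between all pairs in the batch
    neg = (gram ** 2).mean()                    # push pairs toward small inner products
    return pos + neg

# toy usage: random embeddings standing in for an encoder's outputs
z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
print(spectral_contrastive_loss(z1, z2))
```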

Tengyu Ma (@tengyuma)'s Twitter Profile Photo

Thinking of applying self-supervised learning (SSL) on your uncurated, imbalanced datasets? Good news: we found SSL representations are more robust to long-tailed data than supervised representations. We also present theoretical and empirical analyses and an improved algorithm. arxiv.org/abs/2110.05025

Tengyu Ma (@tengyuma)'s Twitter Profile Photo

Pretraining is ≈SoTA for domain adaptation: just do contrastive learning on *all* unlabeled data + finetune on source labels. Features are NOT domain-invariant, but disentangle class & domain info to enable transfer. Theory & exps: arxiv.org/abs/2204.00570 arxiv.org/abs/2204.02683
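
A minimal sketch of that recipe (my own simplification, not the authors' code, using a generic InfoNCE objective and random tensors as stand-ins for real data): contrastive pretraining on the pooled unlabeled data from both domains, then fitting a classification head on labeled source examples only and reusing it on the target domain.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    # generic contrastive objective over a batch of positive pairs
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature
    return F.cross_entropy(logits, torch.arange(len(z1)))

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Step 1: contrastive pretraining on *all* unlabeled data (source + target pooled together)
for _ in range(200):
    x = torch.randn(256, 32)                 # unlabeled batch mixing both domains
    v1 = x + 0.1 * torch.randn_like(x)       # two "augmented" views of each example
    v2 = x + 0.1 * torch.randn_like(x)
    loss = info_nce(encoder(v1), encoder(v2))
    opt.zero_grad(); loss.backward(); opt.step()

# Step 2: finetune (here, only a linear head) on labeled *source* data
head = nn.Linear(16, 10)
head_opt = torch.optim.Adam(head.parameters(), lr=1e-2)
xs, ys = torch.randn(512, 32), torch.randint(0, 10, (512,))   # labeled source set
for _ in range(200):
    loss = F.cross_entropy(head(encoder(xs).detach()), ys)
    head_opt.zero_grad(); loss.backward(); head_opt.step()
```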

Stanford AI Lab (@stanfordailab)'s Twitter Profile Photo

Curious about why contrastive learning produces representations useful for downstream tasks? Check out Haochen Zhang, Colin Wei, and Tengyu Ma's theoretical explanation in our latest blog post: ai.stanford.edu/blog/understan… Based on the NeurIPS 2021 oral paper: arxiv.org/abs/2106.04156

Haochen Zhang (@jhaochenz)'s Twitter Profile Photo

How does model architecture influence contrastive representations? Check out our paper "A theoretical study of inductive biases in contrastive learning" at #ICLR2023. Virtual poster: iclr.cc/virtual/2023/p…. Joint work with Tengyu Ma.

Tengyu Ma (@tengyuma)'s Twitter Profile Photo

📢 Introducing Voyage AI (@Voyage_AI_)! Founded by a talented team of leading AI researchers and me 🚀🚀. We build state-of-the-art embedding models (e.g., better than OpenAI 😜). We also offer custom models that deliver 🎯 +10-20% accuracy gains in your LLM products. 🧵

Alex Albert (@alexalbert__)'s Twitter Profile Photo

Artifacts pro tip: If you are running into unsupported library errors with NPM modules, just ask Claude to use the cdnjs link instead and it should work just fine.

Mike Krieger (@mikeyk)'s Twitter Profile Photo

New today: organize your claude[dot]ai chats in Projects, and add files, context, and 💫 custom instructions 💫 that are shared across all project chats. And on the Claude Team plan, you can discover great uses of Claude within your team using a new activity feed & shared chats.

Sara Price (@sprice354_)'s Twitter Profile Photo

We've made Claude Opus 4 and Claude Sonnet 4 significantly better at avoiding reward hacking behaviors (like hard-coding and special-casing in code settings) that we frequently saw in Claude Sonnet 3.7.

Haochen Zhang (@jhaochenz)'s Twitter Profile Photo

Glad to see what we've been working on finally get out! Lots of cool ideas and hard work have gone into this model, and there are still so many open research questions ahead around solving alignment and scaling up RL. Looking forward to what we'll learn next.