ML@CMU (@mlcmublog) Twitter Tweets • TwiCopy

ML@CMU

@mlcmublog

Official Twitter account for the ML@CMU blog @mldcmu @SCSatCMU

ID: 1233552889055834112

Link: https://blog.ml.cmu.edu/

Joined: 29-02-2020 00:42:45

110 Tweets

2.2K Followers

20 Following

ML@CMU

a year ago

blog.ml.cmu.edu/2024/10/07/vqa… With the rapid advancement of text-to-visual models like Sora, Midjourney, and Stable Diffusion, evaluating how well the generated imagery follows input text prompts has become a major challenge. However, work by Zhiqiu Lin, Deepak Pathak, Baiqi Li, Emily

Likes: 11 · Replies: 0 · Retweets: 3

ML@CMU

a year ago

blog.ml.cmu.edu/2024/10/29/jai… AI-powered robots are alarmingly easy to jailbreak to perform dangerous tasks, including delivering bombs, surveilling humans, and ignoring traffic laws. What does the future hold for AI-powered robots? Learn more in our latest blog post, based on work

Likes: 14 · Replies: 0 · Retweets: 6

ML@CMU

a year ago

blog.ml.cmu.edu/2024/11/07/ide… Demining 70+ war-affected countries could take 1,100 years at the current pace. This AI-powered tool, developed in close collaboration with the UN in work led by Mateo Dulce, halves false alarms and speeds up clearance. Now tested in Afghanistan &

Likes: 2 · Replies: 0 · Retweets: 1

ML@CMU

a year ago

blog.ml.cmu.edu/2024/12/02/car… Check out our latest blog post on CMU @ NeurIPS 2024!

Likes: 2 · Replies: 0 · Retweets: 0

ML@CMU

a year ago

blog.ml.cmu.edu/2024/12/06/scr… A critical question arises when using large language models: should we fine-tune them or rely on prompting with in-context examples? Recent work led by Junhong Shen and collaborators demonstrates that we can develop state-of-the-art web agents by

Likes: 14 · Replies: 0 · Retweets: 3

ML@CMU

a year ago

blog.ml.cmu.edu/2024/12/12/hum… Have you had difficulty using a new machine for DIY or latte-making? Have you forgotten to add spice during cooking? Riku Arakawa, Hiromu Yakura, Vimal Mollyn, Jill Fain Lehman, and Mayank Goel are leveraging multimodal sensing to improve the

Likes: 14 · Replies: 0 · Retweets: 5

ML@CMU

a year ago

blog.ml.cmu.edu/2025/01/02/ind… Why is our brain 🧠 modular with specialized areas? Recent research by Ruiyi Zhang @Xaqlab shows that artificial agents 🤖 with modular architectures—mirroring brain-like specialization—achieve better learning and generalization in naturalistic navigation

Likes: 5 · Replies: 0 · Retweets: 2

ML@CMU

10 months ago

blog.ml.cmu.edu/2025/01/08/opt… How can we train LLMs to solve complex challenges beyond just data scaling? In a new blogpost, Amrith Setlur, Yuxiao Qu, Matthew Yang, Lunjun Zhang, Virginia Smith, and Aviral Kumar demonstrate that Meta RL can help LLMs better optimize test-time compute

Likes: 91 · Replies: 3 · Retweets: 22

ML@CMU

7 months ago

blog.ml.cmu.edu/2025/04/09/cop… How do real-world developer preferences compare to existing evaluations? A CMU and UC Berkeley team led by Wayne Chi and Valerie Chen created Copilot Arena to collect user preferences on in-the-wild workflows. This blogpost overviews the design and

Likes: 18 · Replies: 0 · Retweets: 7

ML@CMU

7 months ago

blog.ml.cmu.edu/2025/04/18/llm… 📈⚠️ Is your LLM unlearning benchmark measuring what you think it is? In a new blog post authored by Pratiksha Thaker, Shengyuan Hu, Neil Kale, Yash Maurya, Steven Wu, and Virginia Smith, we discuss why empirical benchmarks are necessary but not

Likes: 12 · Replies: 0 · Retweets: 11

ML@CMU

7 months ago

blog.ml.cmu.edu/2025/04/21/all… Check out our new blog post on ALLIE, a new chess AI that actually plays like a human! Unlike Stockfish or AlphaZero, which focus on winning at all costs, ALLIE uses a transformer model trained on human chess games to make moves, ponder and resign like

Likes: 1 · Replies: 0 · Retweets: 2

ML@CMU

7 months ago

blog.ml.cmu.edu/2025/04/23/car… Check out our latest blog post on CMU @ ICLR 2025!

Likes: 4 · Replies: 0 · Retweets: 2

ML@CMU

6 months ago

blog.ml.cmu.edu/2025/05/22/unl… Are your LLMs truly forgetting unwanted data? In this new blog post authored by Shengyuan Hu, Yiwei Fu, Steven Wu, and Virginia Smith, we discuss how benign relearning can jog an unlearned LLM's memory to recover knowledge that is supposed to be forgotten.

Likes: 5 · Replies: 0 · Retweets: 3

ML@CMU

6 months ago

blog.ml.cmu.edu/2025/06/01/rlh… In this in-depth coding tutorial, Zhaolin Gao and Gokul Swamy walk through the steps to train an LLM via RL from Human Feedback!

Likes: 25 · Replies: 0 · Retweets: 8