Mert Yuksekgonul (@mertyuksekgonul) Twitter Tweets • TwiCopy

Irena Gao

Many providers offer inference APIs for the same models: for example, there were over nine Llama-3 8B APIs in Summer 2024. Do all of these APIs serve the same completion distribution as the original model? In our new paper, ✨Model Equality Testing: Which Model is This API

Weixin Liang

@liang_weixin

7 months ago

How can we reduce pretraining costs for multi-modal models without sacrificing quality? We study this Q in our new work: arxiv.org/abs/2411.04996 At AI at Meta, we introduce Mixture-of-Transformers (MoT), a sparse architecture with modality-aware sparsity for every non-embedding

Fatih Dinc

@fatihdin4en

7 months ago

I am defending my thesis next week, if you are around Stanford, please feel free to join! Here is the abstract and details! I will talk about several unpublished works on latent circuits subserving neural manifolds, how to train RNNs with millions of parameters on laptops with

Mackenzie Mathis, PhD

@trackingactions

7 months ago

I’m happy to support an AI Fellow at EPFL in my group! ⬇️ Aside from our applied #AI4Science work, we are excited to push on fundamental problems in #ML Here is some recent work: arxiv.org/abs/2410.10744 & sslneurips23.github.io/paper_pdfs/pap…

Teddi Worledge

@teddiworledge

6 months ago

🧵LLMs are great at synthesizing info, but unreliable at citing sources. Search engines are the opposite. What lies between them? Our new paper runs human evals on 7 systems across the✨extractive-abstractive spectrum✨for utility, citation quality, time-to-verify, & fluency!

Luke Bailey

@lukebailey181

6 months ago

Can interpretability help defend LLMs? We find we can reshape activations while preserving a model’s behavior. This lets us attack latent-space defenses, from SAEs and probes to Circuit Breakers. We can attack so precisely that we make a harmfulness probe output this QR code. 🧵

Wanjia Zhao

@wanjiazhao1203

4 months ago

Introducing #SIRIUS🌟: A self-improving multi-agent LLM framework that learns from successful interactions and refines failed trajectories, enhancing college-level reasoning and competitive negotiations. 📜Preprint: arxiv.org/pdf/2502.04780 💻code: github.com/zou-group/siri… 1/N

Barışcan KURTKAYA

@bariskurtkaya

3 months ago

Our preprint is out!🚀 We explore attractor mechanisms that subserve short-term memory by training 35,000+ recurrent neural networks! Most importantly, we present a phase diagram that reveals how learning rate & delay length shape attractor dynamics. 👉arxiv.org/abs/2502.17433

James Zou

@james_y_zou

3 months ago

⚡️Really thrilled that #textgrad is published in @nature today!⚡️ We present a general method for genAI to self-improve via our new *calculus of text*. We show how this optimizes agents🤖, molecules🧬, code🖥️, treatments💊, non-differentiable systems🤯 + more!

nature

@nature

3 months ago

Nature research paper: Optimizing generative AI by backpropagating language model feedback go.nature.com/4ikGj1Y

Karan Dalal

@karansdalal

2 months ago

Today, we're releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by

James Zou

@james_y_zou

2 months ago

Can LLMs learn to reason better by "cheating"?🤯 Excited to introduce #cheatsheet: a dynamic memory module enabling LLMs to learn + reuse insights from tackling previous problems 🎯Claude3.5 23% ➡️ 50% AIME 2024 🎯GPT4o 10% ➡️ 99% on Game of 24 Great job Mirac Suzgun w/ awesome

Fatih Dinc

@fatihdin4en

2 months ago

As we say in Turkish "Yasasin 23 Nisan!" BTF is now officially accepting funding applications for summer research internships: bridgetoturkiye.org/our-work/schol… The program requires you to apply with a mentor, who can be a PhD or a postdoc in a US institution. Good luck!

Mehmet Hamza Erol

@mhamzaerol

2 months ago

How much does a correct answer from an LM cost? How much has AI lowered the cost of solving problems? Meet Cost‑of‑Pass: An Economic Framework for Evaluating LMs! Cost‑of‑Pass = expected $ for one correct answer. Frontier Cost‑of‑Pass = cheapest route: an LM or a human expert.
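
The two definitions in the post above can be illustrated with a small sketch. It assumes independent attempts with a fixed cost per attempt and a fixed success probability, so the expected spend until the first correct answer is cost divided by success rate; this is one simple reading of "expected $ for one correct answer", not necessarily the paper's exact formulation. The function names, solver names, and numbers are hypothetical.

```python
from typing import Mapping, Tuple


def cost_of_pass(cost_per_attempt: float, success_rate: float) -> float:
    """Expected dollars spent to obtain one correct answer.

    Assumption (not from the post): attempts are independent, each costs
    `cost_per_attempt`, and each succeeds with probability `success_rate`.
    The number of attempts is then geometric with mean 1 / success_rate.
    """
    if cost_per_attempt < 0:
        raise ValueError("cost_per_attempt must be non-negative")
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must lie in [0, 1]")
    if success_rate == 0.0:
        # A solver that never succeeds has an unbounded expected cost.
        return float("inf")
    return cost_per_attempt / success_rate


def frontier_cost_of_pass(
    solvers: Mapping[str, Tuple[float, float]],
) -> Tuple[str, float]:
    """Return the cheapest solver (an LM or a human expert) and its cost-of-pass.

    `solvers` maps a hypothetical solver name to (cost_per_attempt, success_rate).
    """
    costs = {name: cost_of_pass(c, p) for name, (c, p) in solvers.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


if __name__ == "__main__":
    # Illustrative numbers only; none of these come from the paper.
    solvers = {
        "lm_a": (1.0, 0.25),
        "lm_b": (0.5, 0.0),
        "human_expert": (10.0, 1.0),
    }
    print(frontier_cost_of_pass(solvers))
```

Treating the human expert as just another entry in `solvers` keeps the frontier a plain minimum over all available routes.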

Mert Yuksekgonul

@mertyuksekgonul

2 months ago

Hamza has been working to make progress in AI and reasoning feel more measurable and grounded. He’s also genuinely enjoyable to work with. Check out his work and give him a follow (faculty friends, he’ll be applying to PhD programs next year 👀)

Sabri Eyuboglu

@eyuboglusabri

a month ago

🇸🇬 If you're at ICLR and interested in model compression and conditional computation, go chat with Roberto and Jerry!! In our paper led by Roberto, we show how to convert any dense, pretrained linear layer into an MoE-like layer with dynamic sparsity!!

Yiğit Korkmaz

@yigitkkorkmaz

a month ago

We see increasingly capable robot policies every day. Yet during execution, they often act reasonably but fail to complete tasks, e.g. due to novel scenes or objects. Wouldn't it be nice if we could provide a handful of interventions to the robot policies and they could learn from them?

James Zou

@james_y_zou

a month ago

💸We expand the economic framework of cost-of-production to quantify the benefits of different LLMs arxiv.org/pdf/2504.13359 Llama3 8B + o1-mini really stand out as milestone jumps in efficiency and capability resp! Great job Mehmet Hamza Erol Mert Yuksekgonul Batu El Mirac Suzgun👏

De Marke Sports

@demarkesports

a day ago

Exactly, brother, soccer...
