Shahan (@shahanmemon) Twitter Tweets • TwiCopy

Abhilasha Ravichander

@lasha_nlp

5 months ago

We are launching HALoGEN💡, a way to systematically study *when* and *why* LLMs still hallucinate. New work w/ Shrusti Ghela* David Wadden Yejin Choi 💫 🧵 [1/n]

165 likes, 1 reply, 40 retweets

Sander Dieleman

@sedielem

2 months ago

New blog post: let's talk about latents! sander.ai/2025/04/15/lat…

946 likes, 24 replies, 188 retweets

Ben Blaiszik

Academia or industry, expert or novice, infrastructure shouldn't be your bottleneck. Garden democratizes access to OpenCatalyst models for ALL researchers. This is how tomorrow's breakthroughs will be found. Garden is a new superpower for scientists.

38 likes, 1 reply, 4 retweets

Parshin Shojaee

@parshinshojaee

2 months ago

Scientific discovery with LLMs has so much potential yet is underexplored. Our new benchmark **LLM-SRBench** enables rigorous evaluations of equation discovery with LLMs! 🧠Key takeaway: Even SOTA discovery models with strong LLM backbones still fail to discover mathematical

170 likes, 3 replies, 30 retweets

Shahan

@shahanmemon

2 months ago

Wish I was a pre-doc! This is a great opportunity!

1 like, 0 replies, 0 retweets

Emma Hoes

@emmahoes93

2 months ago

🚨New paper out in PNASNews ! Existential AI risks do **not** distract from immediate harms. In our study (n = 10,800), people consistently prioritize current threats - bias, misinformation, job loss - over sci-fi doom! 💥👉 pnas.org/doi/10.1073/pn…

114 likes, 3 replies, 20 retweets

liminal

@liminal1988

2 months ago

Appreciate it.

7.7K likes, 18 replies, 1.1K retweets

Melissa Pan

@melissapan

2 months ago

🚨 Why Do Multi-Agent LLM Systems Fail? ⁉️ 🔥 Introducing MAST: The first multi-agent failure taxonomy - consists of 14 failure modes and 3 categories, generalizes for diverse multi-agent systems and tasks! Paper: arxiv.org/pdf/2503.13657 Code: github.com/multi-agent-sy… 🧵1/n

186 likes, 4 replies, 54 retweets

Dashun Wang

@dashunwang

2 months ago

🚨 Our latest paper is out today in Science! We uncover stark and systematic partisan differences in the amount, content, and character of science used in policy, which mirror differences in political elites’ trust in science. Four years in the making. Led by Zander Furnas 1/n

385 likes, 3 replies, 129 retweets

John B. Holbein

@johnholbein1

2 months ago

Here's some good news! The file drawer problem may have diminished in recent years, at least in social science survey experiments "This suggests increased recognition of the importance of null results."

216 likes, 7 replies, 55 retweets

Ronen Tamari

@rtk254

2 months ago

"A society that can no longer read complex texts may soon find itself unable to think complex thoughts [...] reading becomes not just a cognitive act, but a civic one: a rehearsal for the intellectual stamina that democracy requires." 🎯

13 likes, 1 reply, 4 retweets

Haofei Yu 🦋 @haofeiyu.bsky.social

@haofeiyu44

2 months ago

🧪 Want an AI-generated paper draft in just 1 minute? Or dreaming of building auto-research apps but frustrated with setups? Meet tiny-scientist, a minimal package to start AI-powered research: 👉 pip install tiny-scientist 🔗 github.com/ulab-uiuc/tiny… #AIAgent #pythonpackages

30 likes, 3 replies, 10 retweets

Arthur Spirling

@arthur_spirling

2 months ago

Again, I think academics have perhaps not quite grokked what this sort of wholesale removal of a model means for replication in science. This isn’t versioned or downloadable, and you won’t be able to recreate it

261 likes, 9 replies, 37 retweets

Atoosa Kasirzadeh

@dr_atoosa

2 months ago

📢 New paper with Iason Gabriel is out! 2025 is being called the year of AI agents, with overwhelming headlines about them every day. But we lack a shared vocabulary to distinguish their fundamental properties. Our paper aims to bridge this gap. A 🧵

123 likes, 5 replies, 33 retweets

Sam Rodriques

@sgrodriques

2 months ago

Watch our team explain how you can use the FutureHouse Platform to come up with new hypotheses and make new discoveries.

360 likes, 10 replies, 54 retweets

TechCrunch

@techcrunch

2 months ago

FutureHouse releases AI tools it claims can accelerate science | TechCrunch techcrunch.com/2025/05/01/fut…

101 likes, 6 replies, 33 retweets

Dr Claire Malone FRSA

@geeknproud42

2 months ago

A new webinar series is coming soon 🎤 Designed for anyone who writes—emails, reports, blogs, posts—and wants to use AI as a tool for clarity, not chaos. 💬 Prompt better ⚡ Write smarter 🧠 Keep your voice Watch this space 👀 #AI #writing #ChatGPT #ProductivityHacks

2 likes, 0 replies, 1 retweet

Deb Raji

@rajiinio

2 months ago

Lately, I've been seriously exploring what it could mean to move beyond the benchmarking paradigm in ML evaluation and it's led to some stats-y papers: (1) a critique of current experimental evals of prediction based interventions & (2) a framework for adverse events reporting.

102 likes, 5 replies, 16 retweets

Shahan

@shahanmemon

2 months ago

Apply to Lucy as a #PhD student next cycle. What an opportunity. #AcademicTwitter

4 likes, 0 replies, 0 retweets

Sayash Kapoor

@sayashk

a month ago

Seth Lazar It is necessary to invest in alternatives right now. Without it, we might see the worst aspects of our current platform economy amplified. We don't have all the answers in the paper, but we have a blueprint for where to start. Paper: arxiv.org/pdf/2505.04345

7 likes, 0 replies, 1 retweet
