Doug Downey (@_dougdowney) Twitter Tweets • TwiCopy

Doug Downey

@_dougdowney

Research Manager at @allen_ai, Prof at @northwesterncs

ID: 1258151736105041920

Joined: 06-05-2020 21:48:55

90 Tweets

342 Followers

195 Following

Doug Downey

a year ago

New scientific QA system led by Sergey Feldman, Amanpreet Singh, Joseph Chee Chang, Aakanksha Naik and team from AI2 & UW! Following AI2/UW's prior open QA system (x.com/AkariAsai/stat…) by Akari Asai, this adds thematic clustering, tables, and the latest proprietary models.

11 likes, 0 replies, 5 retweets

Ai2

10 months ago

Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning from Verifiable Rewards (RLVR) scales to 405B - with performance on

1.1K likes, 154 replies, 379 retweets

Ai2

10 months ago

We took our most efficient model and made an open-source iOS app📱but why? As phones get faster, more AI will happen on device. With OLMoE, researchers, developers, and users can get a feel for this future: fully private LLMs, available anytime. Learn more from Luca Soldaini 🎀👇

655 likes, 47 replies, 107 retweets

Ai2

10 months ago

Introducing olmOCR, our open-source tool to extract clean plain text from PDFs! Built for scale, olmOCR handles many document types with high throughput. Run it on your own GPU for free—at over 3000 tokens/s, equivalent to $190 per million pages, or 1/32 the cost of GPT-4o!

1.1K likes, 85 replies, 266 retweets

Ai2

9 months ago

We’re excited to share some updates to Ai2 ScholarQA:
🗂️ You can now sign in via Google to save your query history across devices and browsers.
📚 We added 108M+ paper abstracts to our corpus - expect to get even better responses!
✨ The backbone model has been updated to the

166 likes, 3 replies, 37 retweets

Ai2

9 months ago

Announcing OLMo 2 32B: the first fully open model to beat GPT 3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks. Comparable to best open-weight models, but a fraction of training compute. When you have a good recipe, ✨ magical things happen when you scale it up!

668 likes, 29 replies, 161 retweets

Ai2

9 months ago

Meet Ai2 Paper Finder, an LLM-powered literature search system. Searching for relevant work is a multi-step process that requires iteration. Paper Finder mimics this workflow — and helps researchers find more papers than ever 🔍

1.1K likes, 19 replies, 220 retweets

Ai2

8 months ago

Imagine AI doing science: reading papers, generating ideas, designing and running experiments, analyzing results… How many more discoveries can we reveal? 🧐 Meet CodeScientist, a promising next step toward autonomous scientific discovery. 🧵

371 likes, 6 replies, 106 retweets

Ai2

8 months ago

For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦

638 likes, 17 replies, 167 retweets

Ai2

8 months ago

Ever wonder how LLM developers choose their pretraining data? It’s not guesswork— all AI labs create small-scale models as experiments, but the models and their data are rarely shared. DataDecide opens up the process: 1,050 models, 30k checkpoints, 25 datasets & 10 benchmarks 🧵

659 likes, 11 replies, 121 retweets

Semantic Scholar Research @ AI2

@ai2_s2research

7 months ago

Ai2 Semantic Scholar is hiring an #ml #nlp #ai reasoning researcher for a Research Scientist, Agents for Science position with target start dates in 2025. Excited about developing AI systems with deep reasoning capabilities for science? Send an application our way!

18 likes, 0 replies, 9 retweets

Ai2

5 months ago

Introducing SciArena, a platform for benchmarking models across scientific literature tasks. Inspired by Chatbot Arena, SciArena applies a crowdsourced LLM evaluation approach to the scientific domain. 🧵

381 likes, 12 replies, 63 retweets

Doug Downey

5 months ago

This was a fun collaboration led by Yilun Zhao and Kaiyan Zhang from Arman Cohan's lab at Yale University. Annotators preferred o3 in our study, which was found to give more detailed and technical answers. Curious to see if community voting changes the picture!

7 likes, 1 reply, 0 retweets

Ai2

5 months ago

We’ve upgraded ScholarQA, our agent that helps researchers conduct literature reviews efficiently by providing detailed answers. Now, when ScholarQA cites a source, it won’t just tell you which paper it came from–you’ll see the exact quote, highlighted in the original PDF. 🧵

154 likes, 5 replies, 32 retweets

Ai2

5 months ago

Great science starts with great questions. 🤔✨ Meet AutoDS—an AI that doesn’t just hunt for answers, it decides which questions are worth asking. 🧵

332 likes, 2 replies, 38 retweets

Hita K

4 months ago

Are you a researcher in CS or a CS-adjacent field who could use help in refining your research ideas? Want to try our new AI-powered tool that helps with just that in a paid user study? Details and sign up here! forms.gle/UPFjyJ59uuZ5Zb…

19 likes, 2 replies, 6 retweets

Ai2

4 months ago

With fresh support of $75M from U.S. National Science Foundation and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡

659 likes, 31 replies, 64 retweets

Ai2

4 months ago

🚀 In March, we launched Paper Finder, an LLM-powered literature search agent that surfaces papers other tools miss. Now, we’re releasing an open-source snapshot to enable others to inspect & build on it—and reproduce the results. 🧵

444 likes, 5 replies, 60 retweets

Ai2

4 months ago

🚨 SciArena update + evaluation of new models including GPT-5! 🚨 With thousands of new votes, new LLMs are reshaping our leaderboard for scientific literature tasks. o3 still leads—but GPT-5, Claude Opus 4.1, & more are closing the gap.

114 likes, 8 replies, 4 retweets