Michael Skarlinski (@m_skarlinski)'s Twitter Profile
Michael Skarlinski

@m_skarlinski

ML/Engineering enthusiast and Member of the Technical Staff @ FutureHouse

ID: 1787551300844232704

Joined: 06-05-2024 18:33:44

4 Tweets

43 Followers

4 Following

Sam Rodriques (@sgrodriques)'s Twitter Profile Photo

Introducing PaperQA2, the first AI agent that conducts entire scientific literature reviews on its own. PaperQA2 is also the first agent to beat PhD and Postdoc-level biology researchers on multiple literature research tasks, as measured both by accuracy on objective benchmarks

Andrew White 🐦‍⬛ (@andrewwhite01)'s Twitter Profile Photo

We used PaperQA2 to extract claims from papers and then check whether they're contradicted anywhere in the literature. This task is time-consuming for humans, but we applied it to hundreds of papers to look for trends in disagreement across fields, decades, and journals.

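A minimal sketch of the kind of claim-contradiction pipeline described above, assuming hypothetical `ask_llm` and `ask_literature` callables that wrap an LLM and a PaperQA2-style literature QA agent; this is not FutureHouse's actual code, just an illustration of the workflow.

```python
# Sketch of a claim-extraction + contradiction-check loop.
# `ask_llm` and `ask_literature` are hypothetical callables (str -> str),
# NOT the real PaperQA2 API.
from dataclasses import dataclass


@dataclass
class ContradictionCheck:
    claim: str
    verdict: str   # e.g. "contradicted", "supported", "no evidence"
    evidence: str  # the full cited answer returned by the QA call


def extract_claims(paper_text: str, ask_llm) -> list[str]:
    """Ask an LLM to list the distinct factual claims a paper makes."""
    prompt = (
        "List the distinct scientific claims made in the following paper, "
        "one per line:\n\n" + paper_text
    )
    return [line.strip() for line in ask_llm(prompt).splitlines() if line.strip()]


def check_claim(claim: str, ask_literature) -> ContradictionCheck:
    """Ask the literature-QA agent whether any paper contradicts the claim."""
    question = (
        "Is the following claim contradicted anywhere in the literature? "
        "Answer 'contradicted', 'supported', or 'no evidence' on the first "
        f"line, then cite sources.\n\nClaim: {claim}"
    )
    answer = ask_literature(question)
    verdict = (answer.splitlines() or ["no evidence"])[0].strip().lower()
    return ContradictionCheck(claim=claim, verdict=verdict, evidence=answer)
```

Running `check_claim` over every claim from hundreds of papers would yield the per-field and per-decade disagreement counts the tweet refers to.
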
Ethan Mollick (@emollick)'s Twitter Profile Photo

Wow: Just tried PaperQA, an open-source AI-powered literature review tool whose research paper claims it achieves "superhuman synthesis of scientific knowledge." I tested it against papers I wrote, and it seems like the real deal, putting together a good summary with accurate details.

Andrew White 🐦‍⬛ (@andrewwhite01)'s Twitter Profile Photo

It took us about 9 months of exploration to build agents that can do superhuman scientific literature summary and Q&A. Michael Skarlinski wrote up what failed and what was essential in an engineering blog post: futurehouse.org/research-annou…

Michael Skarlinski (@m_skarlinski)'s Twitter Profile Photo

Can you exchange frontier model compute for more accuracy in RAG? Yes. Check out our other experiments leading to the PaperQA2 design in our new engineering blog post: futurehouse.org/research-annou…

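One illustrative way to spend extra frontier-model compute for RAG accuracy, in the spirit of the tweet above: score and condense each retrieved chunk with an LLM before composing the final answer. `retrieve` and `llm` are hypothetical callables, and this sketch is not the blog post's actual experimental setup.

```python
# Sketch: trade more LLM calls for RAG accuracy by reranking/condensing
# retrieved chunks instead of stuffing raw top-k text into the final prompt.
# `retrieve(question, k)` -> list[str] and `llm(prompt)` -> str are
# hypothetical stand-ins, not the PaperQA2 API.

def answer_with_reranking(question: str, retrieve, llm, k: int = 20, keep: int = 5) -> str:
    # Cheap first-pass retrieval (e.g. vector search) returning raw text chunks.
    chunks = retrieve(question, k=k)

    # Extra frontier-model compute: one call per chunk to judge relevance
    # and condense it down to what actually bears on the question.
    scored = []
    for chunk in chunks:
        judgment = llm(
            f"Question: {question}\n\nPassage:\n{chunk}\n\n"
            "Summarize only the parts relevant to the question, then on the "
            "last line give a relevance score from 0 to 10."
        )
        lines = judgment.splitlines() or [""]
        try:
            score = float(lines[-1].strip().split()[-1])
        except (ValueError, IndexError):
            score = 0.0
        scored.append((score, "\n".join(lines[:-1])))

    # Keep only the highest-scoring condensed evidence for the final answer.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    context = "\n\n".join(summary for _, summary in scored[:keep])
    return llm(
        f"Answer the question using only this evidence:\n\n{context}\n\n"
        f"Question: {question}"
    )
```

The accuracy gain comes at the cost of roughly k extra model calls per question, which is the compute-for-accuracy exchange the tweet asks about.
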
Andrew White 🐦‍⬛ (@andrewwhite01)'s Twitter Profile Photo

Another finding was how important it is to use multiple retrieval strategies. Using LLMs, you can get query expansion easily by having the model rewrite the question multiple ways. We also found that exploiting scientific-literature metadata, i.e., the citation graph, helped a lot. 2/3

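A rough sketch of the LLM query-expansion idea from the tweet: rewrite the question several ways, run retrieval for each variant, and merge the results. `llm` and `search` are hypothetical stand-ins, not FutureHouse's implementation; a citation-graph traversal could be plugged in as just another `search` strategy.

```python
# Sketch of LLM-based query expansion feeding multiple retrieval passes.
# `llm(prompt)` -> str and `search(query, k)` -> list of document ids are
# hypothetical callables.

def expand_query(question: str, llm, n_variants: int = 3) -> list[str]:
    """Ask the LLM for several rewrites of the question, keep the original too."""
    prompt = (
        f"Rewrite the following research question {n_variants} different ways, "
        f"one per line, varying keywords and phrasing:\n\n{question}"
    )
    variants = [v.strip() for v in llm(prompt).splitlines() if v.strip()]
    return [question] + variants[:n_variants]


def retrieve_with_expansion(question: str, llm, search, k: int = 10) -> list[str]:
    """Run retrieval for every query variant and merge results, deduplicated."""
    seen, merged = set(), []
    for query in expand_query(question, llm):
        for doc_id in search(query, k=k):  # keyword, dense, or citation-graph search
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged
```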