Ryan-Rhys Griffiths (@ryan__rhys) Twitter Tweets • TwiCopy

Delighted to join Sam Rodriques and Andrew White 🐦‍⬛ at FutureHouse last week. It was a very difficult decision to turn down a faculty position for this opportunity, but these are unprecedented times in #AI4Science and I strongly believe in FutureHouse's mission.

116 likes, 9 replies, 2 reposts

FutureHouse

@futurehousesf

2 years ago

ChemCrow was one of the first serious demonstrations of using AI to automate science. There will be many more to come. Major congratulations to the team: Sam Cox, Carlo Baldassari, Oliver Schilter, Andrew White 🐦‍⬛, and Philippe Schwaller (he/him).

26 likes, 1 reply, 6 reposts

Ryan-Rhys Griffiths

@ryan__rhys

a year ago

An extensive new biology benchmark for LLMs!

14 likes, 0 replies, 1 repost

Ryan-Rhys Griffiths

@ryan__rhys

a year ago

A much-delayed #BOHackathon presentation on input warping for Bayesian optimization in GAUCHE. Many thanks to Mathieu, Sang Truong, and Anthony Onwuli for collaborating, and to Sterling G. Baird and the Acceleration Consortium (AC) for organizing! Code: github.com/leojklarner/ga…

45 likes, 0 replies, 8 reposts

Sam Rodriques

@sgrodriques

a year ago

Introducing PaperQA2, the first AI agent that conducts entire scientific literature reviews on its own. PaperQA2 is also the first agent to beat PhD and Postdoc-level biology researchers on multiple literature research tasks, as measured both by accuracy on objective benchmarks

3.3K likes, 78 replies, 772 reposts

Ryan-Rhys Griffiths

@ryan__rhys

a year ago

The awkward moment when you realize you were a Top Reviewer for #NeurIPS2023 only when checking the list for #NeurIPS2024 (for which you were not a Top Reviewer) 🤔

24 likes, 1 reply, 0 reposts

Mathieu

@miniapeur

a year ago

(3/3) I had the great pleasure of seeing old friends, making new ones and being stopped by some of my followers. In particular: Clément Bonet, Daniel Augusto, Austin Tripp, Bruno Aristimunha, Thomas Kipf, Claas Voelcker, Lorenzo Giusti, Jessica Dafflon🍍, Ryan-Rhys Griffiths, 𝗝𝗼𝘀𝗵𝘂𝗮, Irene Cannistraci

15 likes, 2 replies, 1 repost

Sam Rodriques

@sgrodriques

9 months ago

The next frontier for AI Agents in Science will be data analysis. Today, we're releasing BixBench, the most sophisticated benchmark yet for data analysis in biology. Agents that can do these tasks will be powerful tools for discovery. So far, they're not even close.

270 likes, 10 replies, 43 reposts

Andrew White 🐦‍⬛

@andrewwhite01

9 months ago

Half of an AI scientist is rejecting or accepting hypotheses. FutureHouse and ScienceMachine just put out ~300 novel hypotheses from ~50 published papers along with ground-truth data. Humans take 4.2 hours to solve these and frontier models get 10-20% correct. SWE-bench for bio

203 likes, 7 replies, 33 reposts

Ryan-Rhys Griffiths

@ryan__rhys

7 months ago

The FutureHouse platform is now live and ready to use, featuring a variety of ways to leverage language agents in scientific endeavors: platform.futurehouse.org

22 likes, 0 replies, 1 repost

Sam Rodriques

@sgrodriques

7 months ago

Today, we’re announcing the first major discovery made by our AI Scientist with the lab in the loop: a promising new treatment for dry AMD, a major cause of blindness. Our agents generated the hypotheses, designed the experiments, analyzed the data, iterated, even made figures

3.3K likes, 113 replies, 708 reposts

Ryan-Rhys Griffiths

@ryan__rhys

6 months ago

Today we're releasing ether0, a large reasoning model for chemistry trained with reinforcement learning via GRPO. Read more in our exclusive with Nature below:
Nature: nature.com/articles/d4158…
Preprint: storage.googleapis.com/aviary-public/…
Model: huggingface.co/futurehouse/et…
Benchmark:

26 likes, 1 reply, 2 reposts

Simon Frieder

@friederrrr

5 months ago

IMO2025 has begun. Last year, AlphaProof won a silver medal (though no paper nor software was released so we have to trust that claim and the mathematicians that had access). This year, a whole bunch of organizations requested access to the IMO problems, so it will be

13 likes, 0 replies, 1 repost

Simon Frieder

@friederrrr

5 months ago

The IMO is changing - and is walking in the footsteps of chess competitions. Last year it was just the #aimoprize that it hosted. This year there was an associated event where a handful of AI companies and organizations tested some of their models on the IMO problems. The

12 likes, 2 replies, 2 reposts

Simon Frieder

@friederrrr

5 months ago

Below is a chronological thread that summarizes the main developments around the IMO25 and the controversial AI evaluation. (Note that due to managing the AIMO, I can't really comment on ongoing developments, and the links below should not be construed as me endorsing them.)

4 likes, 1 reply, 1 repost

Simon Frieder

@friederrrr

4 months ago

OpenAI have just released an open-weight LLM. openai.com/index/introduc… This is great news for the AIMO3 competition we'll launch. Get your GPUs ready to fine-tune the lemmas out of GPT-OSS! :)

8 likes, 0 replies, 1 repost

Ryan-Rhys Griffiths

@ryan__rhys

a month ago

From J-1 to Green Card in 15 months. I'm open-sourcing my complete 1600-page EB-1A petition (pictured). Includes:
→ Complete LaTeX petition template
→ Example reference letters
→ Timeline
In light of Trump's comments on streamlining the EB-1A process, there's never been a