Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile
Ryan-Rhys Griffiths

@ryan__rhys

Research Scientist @FutureHouseSF | ex-@Meta | ex-@Mila_Quebec | ex-@Huawei | PhD @Cambridge_Uni | LLMs | AI4Science | Bayesian Optimization | Chess FIDE Master

ID: 1280142008070307843

Link: https://github.com/Ryan-Rhys

Joined: 06-07-2020 14:10:27

188 Tweets

4.4K Followers

4.4K Following

Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile Photo

Delighted to join Sam Rodriques and Andrew White 🐦‍⬛ FutureHouse last week. It was a very difficult decision to turn down a faculty position for this opportunity but these are unprecedented times in #AI4Science and I strongly believe in FutureHouse's mission.

Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile Photo

A much delayed #BOHackathon presentation on input warping for Bayesian optimization in GAUCHE. Many thanks to Mathieu Sang Truong and Anthony Onwuli for collaborating and Sterling G. Baird and Acceleration Consortium (AC) for organizing! Code: github.com/leojklarner/ga…

Sam Rodriques (@sgrodriques) 's Twitter Profile Photo

Introducing PaperQA2, the first AI agent that conducts entire scientific literature reviews on its own. PaperQA2 is also the first agent to beat PhD and Postdoc-level biology researchers on multiple literature research tasks, as measured both by accuracy on objective benchmarks

Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile Photo

The awkward moment when you realize you were a Top Reviewer for #NeurIPS2023 only when checking the list for #NeurIPS2024 (for which you were not a Top Reviewer) 🤔

Sam Rodriques (@sgrodriques) 's Twitter Profile Photo

The next frontier for AI Agents in Science will be data analysis. Today, we're releasing BixBench, the most sophisticated benchmark yet for data analysis in biology. Agents that can do these tasks will be powerful tools for discovery. So far, they're not even close.

Andrew White 🐦‍⬛ (@andrewwhite01) 's Twitter Profile Photo

Half of an AI scientist is rejecting or accepting hypotheses. FutureHouse and ScienceMachine just put out ~300 novel hypotheses from ~50 published papers along with ground-truth data. Humans take 4.2 hours to solve these and frontier models get 10-20% correct. SWE-bench for bio

Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile Photo

The FutureHouse platform is now live and ready to use, featuring a variety of ways to leverage language agents in scientific endeavors: platform.futurehouse.org

Sam Rodriques (@sgrodriques) 's Twitter Profile Photo

Today, we're announcing the first major discovery made by our AI Scientist with the lab in the loop: a promising new treatment for dry AMD, a major cause of blindness. Our agents generated the hypotheses, designed the experiments, analyzed the data, iterated, even made figures

Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile Photo

Today we're releasing ether0, a large reasoning model for chemistry trained with reinforcement learning via GRPO. Read more in our exclusive with Nature below: Nature: nature.com/articles/d4158… Preprint: storage.googleapis.com/aviary-public/… Model: huggingface.co/futurehouse/et… Benchmark:

Simon Frieder (@friederrrr) 's Twitter Profile Photo

IMO2025 has begun. Last year, AlphaProof won a silver medal (though no paper nor software was released so we have to trust that claim and the mathematicians that had access). This year, a whole bunch of organizations requested access to the IMO problems, so it will be

Simon Frieder (@friederrrr) 's Twitter Profile Photo

The IMO is changing - and is walking in the footsteps of chess competitions. Last year it was just the #aimoprize that it hosted. This year there was an associated event where a handful of AI companies and organizations tested some of their models on the IMO problems. The

Simon Frieder (@friederrrr) 's Twitter Profile Photo

Below is a chronological thread that summarizes the main developments around the IMO25 and the controversial AI evaluation. (Note that due to managing the AIMO, I can't really comment on ongoing developments, and the links below should not be construed as me endorsing them.)

Simon Frieder (@friederrrr) 's Twitter Profile Photo

OpenAI have just released an open-weight LLM. openai.com/index/introduc… This is great news for the AIMO3 competition we'll launch. Get your GPUs ready to fine-tune the lemmas out of GPT-OSS! :)

Ryan-Rhys Griffiths (@ryan__rhys) 's Twitter Profile Photo

From J-1 to Green Card in 15 months. I'm open-sourcing my complete 1600-page EB-1A petition (pictured). Includes: → Complete LaTeX petition template → Example reference letters → Timeline In light of Trump's comments on streamlining the EB-1A process, there's never been a

Simon Frieder (@friederrrr) 's Twitter Profile Photo

Can you solve this Olympiad-level problem? This is the kind of problem LLMs have to solve to compete at the third AI Math Olympiad.
