Charles Sutton @ ✈️ ICML 2024 (@randomlywalking)'s Twitter Profile
Charles Sutton @ ✈️ ICML 2024

@randomlywalking

Research scientist @GoogleAI / Previously academic @InfAtEd / Deep learning to help people write code. / @[email protected] / ❤️s: 🐱🐶☕️🍕

ID: 21815759

Link: http://homepages.inf.ed.ac.uk/csutton/

Joined: 25-02-2009 00:12:29

4,4K Tweets

17,17K Followers

1,1K Following

Naman Jain (@stringchaos)

Check out our new ICML paper on R2E which converts code repositories to environments for evaluating coding LLMs! Key takeaway -- execution is the cornerstone and we synthesize test cases for making arbitrary functions executable!

Koushik Sen (@koushik77)

Would you ride an airplane which has not made a test flight? Would you trust code generated by an LLM? Obviously no, until it has been verified/tested. Most popular LLM based code synthesis benchmarks are evaluated based on insufficient tests. R2E enables you to convert any

Charles Sutton @ ✈️ ICML 2024 (@randomlywalking)

Interesting post. I don't have a considered view about HCI's relevance to AI alignment, but there are great insights here about interdisciplinary collaboration more generally.

Vaibhav Tulsyan (@xennygrimmato_)

Great results on CyberSecEval 2! "Project Naptime", an agent from the Project Zero team at Google, achieves new top scores of 100% on the "Buffer Overflow" tests (from 5%) and 76% on the "Advanced Memory Corruption" tests (from 24%).

👩‍💻 Paige Bailey (@dynamicwebpaige)

✨ Gemini, as applied to code migrations at Google: "For this workstream we found that 80% of the code modifications in the landed CLs were AI-authored, the rest were human-authored. The total time spent on the migration was reduced by an estimated 50% as reported by the

Charles Sutton @ ✈️ ICML 2024 (@randomlywalking)

Despite the travel disruptions today, I seem to be travelling successfully to #ICML2024. Looking forward to seeing Vienna for the first time and catching up with folks!

Stephan Hoyer (@shoyer)

I'm incredibly proud to share NeuralGCM, our new AI and physics based approach to weather and climate modeling with state-of-the-art accuracy, published today in nature: nature.com/articles/s4158…

Ansong Ni (@ansongni)

I won't be at #ICML2024 this year due to visa issues but don't worry as you'll get a major upgrade for the presenters: 🌟Charles Sutton @ ✈️ ICML 2024 and Arman Cohan 🌟 will be presenting the poster tomorrow afternoon, come and learn more about NExT! ⏰: 1PM CEST 📍: Hall C 4-9 #607

Yisong Yue (@yisongyue)

Planning an AI for Code/Math meetup Friday night at #ICML2024 with Swarat Chaudhuri. Drinks sponsored by Asari AI. DM me with your name & email if you want info.

Google DeepMind (@googledeepmind)

We're presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level. 🥈 It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 dpmd.ai/imo-silver

Pedro A. Ortega (@adaptiveagents)

The statement "AI is compression" is remarkably vague. Give me any distribution over a vast, finite space X. I can easily create a near-optimal Huffman code for it. Yet this code lacks any generalization capabilities.
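
The Huffman-code point in the tweet above can be made concrete with a minimal sketch. The four-symbol distribution and the function name `huffman_code` are assumptions chosen for illustration; they do not come from the tweet. The sketch builds a prefix code whose expected length matches the entropy of this particular distribution, yet the result is only a lookup table over the symbols it was given.

```python
import heapq
import math
from itertools import count


def huffman_code(probs):
    """Build a binary Huffman code for a dict mapping symbol -> probability."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one bit to be transmitted.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codewords.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical distribution, used only for illustration.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)

# Expected codeword length versus the entropy lower bound, both in bits.
expected_length = sum(p * len(code[s]) for s, p in probs.items())
entropy = -sum(p * math.log2(p) for p in probs.values())
```

Here `code` has no entry for any symbol outside `probs`, such as `"e"`, which is one way to read the tweet's remark that such a code does not generalize.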