Dylan (@dylan_works_)'s Twitter Profile
Dylan

@dylan_works_

Working on LLM Post-Training

ID: 1448697561410244608

Joined: 14-10-2021 17:09:58

156 Tweets

389 Followers

2.2K Following

Ming Zhong (@mingzhong_)

LLMs are winning gold medals in math & coding🥇, but what if they have to learn a new language from scratch, with only a grammar book and a dictionary? 🤔 On our new language Camlang, GPT-5's reasoning plummets: 98% in English → 47%. Check out our paper for more details!

Yiran Wu (@yiranwu18)

Introducing 🛡️ExCyTIn‑Bench: Evaluating LLM agents on Cyber Threat Investigations. It’s built on an Azure tenant, a real Security Operations Center environment, covering 57 tables. Explore how LLMs fare in realistic, multi-hop incident detection! #Cybersecurity #AI #LLM #Benchmark
