Lucas Vogel (@lucasvogel_dev)'s Twitter Profile
Lucas Vogel

@lucasvogel_dev

Dev Intern @ groq | CS & Econ @ WashU

ID: 1873020235399540737

Joined: 28-12-2024 14:57:02

68 Tweets

54 Followers

86 Following

Lucas Vogel (@lucasvogel_dev):

Interesting idea of dropping overlap in eval datasets to decrease costs. As evaluations become harder and longer, costs will become incredibly high. Ensuring tasks don’t become repetitive, yet still rigorously test capabilities, is one way to mitigate this.

Lucas Vogel (@lucasvogel_dev):

Diminishing marginal returns on short tasks or simple QA evals can be misleading. In an economic sense, the value of LMs is dependent on the length of their task horizon in “human hours”. Can it do 5 hours of my work accurately or 10? Labs are beginning to prioritize horizon

Lucas Vogel (@lucasvogel_dev):

Got the chance to work out of the Gateway X office today. Two things I learned ⬇️
1. Code editors aren’t just for code
I sat in on a meeting where the CTO went through an entire business-first workflow in VS Code using Claude Code. The best thing: no code whatsoever. He

Mason Wang (@masonwang025):

(1/2) i felt like no one actually teaches you a good framework for how to read (ML) papers well + fast, so i wrote this 5-minute read. tldr: because so many papers suck, here's how to go through them quickly and revisit the good ones