Wasu Top Piriyakulkij (@topwasu)'s Twitter Profile
Wasu Top Piriyakulkij

@topwasu

Cornell PhD student cs.cornell.edu/~wp237/

ID: 1278176158312534016

Joined: 01-07-2020 03:58:43

38 Tweets

28 Followers

156 Following

Yuandong Tian (@tydsh)'s Twitter Profile Photo

Our team (LLM+reasoning/planning) is hiring multiple research interns for 2025. If you are interested, please apply via the following link. Thanks! metacareers.com/jobs/532549086…

Roberta Raileanu (@robertarail)'s Twitter Profile Photo

I’m looking for a PhD intern for next year to work at the intersection of LLM-based agents and open-ended learning, part of the Llama Research Team in London. If interested, please send me an email with a short paragraph outlining some research ideas, and apply at the link below.

Oliver Habryka (@ohabryka)'s Twitter Profile Photo

I compiled all the emails released as part of the Musk v. Altman lawsuit in chronological order (link in reply). IMO a really valuable read. Extremely consequential decisions made in these emails.

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

JB Rubinovitz joshwa It’s hard to understand now, but the Atari RL paper of 2013 and its extensions were by far the dominant meme. One single general learning algorithm discovered an optimal strategy for Breakout and so many other games. You just had to improve and scale it enough. My recollection of the…

finbarr (@finbarrtimbers)'s Twitter Profile Photo

It turns out that Atari has a bunch of specific properties (determinism, fully observable states, a small discrete action space, etc.) that are critical, and relaxing these assumptions is very, very, very hard.

Jeff Clune (@jeffclune)'s Twitter Profile Photo

The secret to doing good research is always to be a little underemployed. You waste years by not being able to waste hours. - Amos Tversky

Selçuk Korkmaz (@selcukorkmaz)'s Twitter Profile Photo

In statistical modeling, particularly within the context of regression analysis and analysis of variance (ANOVA), fixed effects and random effects are two fundamental concepts that describe different types of variables or factors in a model. Here’s a straightforward explanation:

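The thread is truncated after the colon. For context, the standard distinction: a fixed effect is a parameter attached to specific levels you care about directly (e.g., the slope of a covariate or treatment), while a random effect treats levels as draws from a population, so you estimate their variance rather than a coefficient per level. A minimal Python sketch with statsmodels, using hypothetical simulated data (everything below is illustrative, not from the tweet):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical simulated data: student test scores nested within schools.
rng = np.random.default_rng(0)
n_schools, per_school = 20, 30
school = np.repeat(np.arange(n_schools), per_school)
hours = rng.uniform(0, 10, n_schools * per_school)   # fixed-effect covariate
school_shift = rng.normal(0, 2, n_schools)[school]   # random per-school intercept
score = 50 + 3 * hours + school_shift + rng.normal(0, 5, len(hours))
df = pd.DataFrame({"score": score, "hours": hours, "school": school})

# Fixed effect: the slope on `hours`, a single population-level parameter.
# Random effect: per-school intercepts treated as draws from N(0, sigma^2);
# we estimate the variance sigma^2, not a coefficient for each school.
model = smf.mixedlm("score ~ hours", df, groups=df["school"])
print(model.fit().summary())
```

Entering the grouping as a categorical fixed effect instead (`score ~ hours + C(school)`) would estimate a separate coefficient per school; the random-effects version pools schools toward a shared distribution.
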
Albert Tseng (@tsengalb99)'s Twitter Profile Photo

I will be at #NeurIPS2024 in person next week presenting QTIP (arxiv.org/abs/2406.11235), our latest LLM quantization algorithm that achieves SOTA results with trellis quantization. Reach out if you'd like to chat!

Zenna Tavares (@zennatavares)'s Twitter Profile Photo

Thrilled that joint work by Kevin Ellis's lab and Basis won 1st prize in the ARC Prize Paper Awards and 2nd prize in ARC-AGI-PUB (w/ MIT). This is our first result from Project MARA: an effort to build Modeling, Abstraction, and Reasoning Agents capable of "everyday science".

Edward Grefenstette (@egrefen)'s Twitter Profile Photo

"We're seeing today results I anticipated in X" "Today's result is a instance of [overly general framework I wrote in 20XX" "As I've been talking about years, we finally see..." STFU bro — if you didn't build it, you didn't build it.

Gaoyue Zhou (@gaoyuezhou)'s Twitter Profile Photo

Can we extend the power of world models beyond just online model-based learning? Absolutely! We believe the true potential of world models lies in enabling agents to reason at test time. Introducing DINO-WM: World Models on Pre-trained Visual Features for Zero-shot Planning.

Albert Tseng (@tsengalb99)'s Twitter Profile Photo

Excited to announce our #AISTATS📜on training LLMs with MXFP4! We use stoch. rounding and random Hadamard transforms (all fast on HW) to get low-variance, unbiased gradient estimates with MXFP4 GEMMs. We get a ~30% speedup over FP8 with almost no PPL gap! arxiv.org/abs/2502.20586

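A rough NumPy sketch of the two ingredients named in the tweet, stochastic rounding and a random Hadamard transform. This is illustrative only: the helper names, grid step, and sizes are invented here, and the real method quantizes with per-block MXFP4 scales inside GEMMs, which this omits.

```python
import numpy as np

def fwht(y):
    """Fast Walsh-Hadamard transform of a length-2^k vector (orthonormal)."""
    y = y.astype(np.float64).copy()
    n, h = len(y), 1
    while h < n:
        for i in range(0, n, 2 * h):
            a, b = y[i:i + h].copy(), y[i + h:i + 2 * h].copy()
            y[i:i + h], y[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return y / np.sqrt(n)

def stochastic_round(x, step, rng):
    """Round to multiples of `step`, rounding up with probability equal to the
    fractional part, so the rounding is unbiased: E[stochastic_round(x)] == x."""
    scaled = x / step
    low = np.floor(scaled)
    return (low + (rng.random(x.shape) < scaled - low)) * step

rng = np.random.default_rng(0)
x = rng.normal(size=8)
x_rot = fwht(x * rng.choice([-1.0, 1.0], size=8))   # random Hadamard transform
x_q = stochastic_round(x_rot, step=0.25, rng=rng)   # quantize to a coarse grid

# Unbiasedness check: averaging many independent roundings recovers x_rot.
avg = np.mean([stochastic_round(x_rot, 0.25, np.random.default_rng(s))
               for s in range(4000)], axis=0)
print(np.max(np.abs(avg - x_rot)))   # should be small
```

The random sign flip plus Hadamard rotation spreads any large coordinate across all dimensions, making the rotated vector friendlier to a coarse 4-bit grid, while stochastic rounding keeps the quantized gradient estimates unbiased in expectation.
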
Yingheng Wang (@yingheng_wang)'s Twitter Profile Photo

❓ Are LLMs actually problem solvers or just good at regurgitating facts?

🚨New Benchmark Alert! We built HeuriGym to benchmark if LLMs can craft real heuristics for real-world hard combinatorial optimization problems.

🛞 We’re open-sourcing it all:
✅ 9 problems
✅ Iterative…
Moksh Jain (@jainmoksh)'s Twitter Profile Photo

As the field moves towards agents doing science, the ability to understand novel environments through interaction becomes critical. AutumnBench is an attempt at measuring this abstract capability in both humans and current LLMs. Check out the blog post for more insights!