Gilad (@gil2rok)'s Twitter Profile
Gilad

@gil2rok

stats + probabilistic ml researcher (sampling, generative models). I sort in exponential time. novelty seeker. @FlatironInst @Columbia. DMs open.

ID: 1727791812884963328

Link: https://gil2rok.github.io/
Joined: 23-11-2023 20:51:12

939 Tweets

746 Followers

2.2K Following

Gilad (@gil2rok)

- Probabilistic ML: An Introduction by Kevin Murphy
- Probabilistic ML: Advanced Topics by also Kevin Murphy
- Reinforcement Learning: An Overview by also also Kevin Murphy
- A First Course in Monte Carlo Methods by D. Sanz-Alonso and O. Al-Ghattas

Nathan Lambert (@natolambert)

America needs to take open models more seriously. This summer the early lead in open model adoption of the US via Llama has been overtaken by Chinese models.

With The American Truly Open Models (ATOM) Project we're looking to build support and express the urgency of this issue.

Gilad (@gil2rok)

TIL: Big tech companies often use *monorepos* (single repo for all code) to manage dependencies across *microservices* (independently deployable services). Counter-intuitive but powerful: centralized code, distributed deployment.

Gilad (@gil2rok)

One of the most significant parts of GPT-5 (and other top LLMs) is that it takes us closer to “on-demand software.” This will radically transform the economy.

Gilad (@gil2rok)

Any advice for when to use Claude Code vs Cursor, Simon Willison?

Claude Code: requires high trust, large code changes, non-critical code
Cursor: more granular changes, encourages more hands-on approvals, when you need to understand the code changes

Arc Jax (@arcjax7)

Nobody does JAX like JAX, folks! Super fast with compilation—turns Python into lightning. Vectorization? Handles massive batches, all at once! And the gradients—nobody gets gradients like JAX, believe me! People everywhere are saying, “Sir, it’s the best!” Tremendous technology!

Gilad (@gil2rok)

What I like about JAX is not that it’s faster (mostly on TPUs), but that its abstractions are just so much cleaner than PyTorch’s. Clear thinking saves you more time than faster training!

Gilad (@gil2rok)

Diffusion LLMs outperform traditional autoregressive LLMs by learning multiple orderings of the same data (instead of learning ONLY left-to-right). This is only helpful when we are data-constrained (training for >1 epoch). Does this occur in most frontier labs? Genuine question.
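
To make the "multiple orderings" point concrete, here is a rough sketch (an illustration only, not taken from any diffusion-LLM codebase). It assumes the any-order view of masked-diffusion training, where each sampled ordering behaves like a different autoregressive factorization of the same sequence. The helper names `left_to_right_tasks` and `any_order_tasks` are hypothetical, and the code only counts the distinct "predict this position from these visible positions" tasks one sequence can yield.

```python
from itertools import permutations


def left_to_right_tasks(tokens):
    """Distinct (visible positions, target position) pairs for the single
    left-to-right order: position i is predicted from positions 0..i-1."""
    return {(frozenset(range(i)), i) for i in range(len(tokens))}


def any_order_tasks(tokens):
    """Distinct (visible positions, target position) pairs collected over
    every possible generation order (assumption: any-order training)."""
    n = len(tokens)
    tasks = set()
    for order in permutations(range(n)):
        for step, target in enumerate(order):
            # Everything generated before this step is visible context.
            tasks.add((frozenset(order[:step]), target))
    return tasks


tokens = ["the", "cat", "sat", "down"]
ltr = left_to_right_tasks(tokens)
any_order = any_order_tasks(tokens)
print(len(ltr), len(any_order))  # prints "4 32"
```

For a sequence of length n, the any-order count is n · 2^(n-1), since each target can be paired with any subset of the other positions. That gives one intuition for why repeated epochs over the same data could keep supplying new prediction tasks under this assumption.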

Gilad (@gil2rok)

Anyone else find most LaTeX CV templates to be ugly and/or hard to use?? Like why can’t they just work and look nice? I’ve truly never *once* been able to understand the style file for these templates!!