Pierre Richemond 🇪🇺 (@theonekloud)'s Twitter Profile
Pierre Richemond 🇪🇺

@theonekloud

Pretraining @cohere. @ImperialCollege PhD, Paris VI - @Polytechnique, ENST, @HECParis alum. Prev @GoogleDeepMind scientist, @GoldmanSachs trader. Views mine.

ID: 173117297

Joined: 31-07-2010 13:44:22

13.1K Tweets

1.1K Followers

873 Following

Matthias Niessner (@mattniessner) 's Twitter Profile Photo

(1/2)
How to accelerate the reconstruction of 3D Gaussian Splatting?

3DGS-LM replaces the commonly used ADAM optimizer with a tailored Levenberg-Marquardt (LM). 
=> We are 30% faster than 3DGS for the same quality.

lukashoel.github.io/3DGS-LM/
youtu.be/tDiGuGMssg8
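For context on why LM can beat Adam here: 3DGS fitting is a nonlinear least-squares problem, and LM exploits that structure through the Gauss-Newton matrix J^T J rather than first-order moment estimates. A toy LM step follows (illustrative only; not the paper's batched CUDA solver, and the curve-fitting example is invented):

```python
# One Levenberg-Marquardt step: solve (J^T J + lam*I) dx = -J^T r.
import numpy as np

def lm_step(residual_fn, jac_fn, x, lam=1e-2):
    r = residual_fn(x)                  # residuals, shape (m,)
    J = jac_fn(x)                       # Jacobian, shape (m, n)
    A = J.T @ J + lam * np.eye(x.size)  # damped Gauss-Newton system
    return x + np.linalg.solve(A, -J.T @ r)

# Toy problem: fit y = a*exp(b*t) to noisy samples.
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(1.5 * t) + 0.01 * np.random.randn(50)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)

p = np.array([1.0, 1.0])
for _ in range(20):
    p = lm_step(res, jac, p)  # converges in a handful of iterations
```

On least-squares objectives like the 3DGS photometric error, each step uses curvature information and so needs far fewer iterations than per-parameter first-order updates, which is where a wall-clock speedup can come from.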
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

Thyme: Think Beyond Images

"Thyme transcends traditional "thinking with images" paradigms by  autonomously generating and executing diverse image processing and  computational operations through executable code, significantly  enhancing performance on high-resolution perception
Pratyush Maini (@pratyushmaini) 's Twitter Profile Photo

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens 🧑🏼‍🍳
- 3B LLMs beat 8B models 🚀
- Pareto frontier for performance
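As a rough illustration of the rephrasing family of synthetic-data pipelines that BeyondWeb belongs to (the thread does not disclose DatologyAI's actual recipe; `llm_generate` is a placeholder for any instruction-tuned model):

```python
# Turn raw web documents into cleaner synthetic training text by asking an
# instruction-tuned model to rewrite them, then mix into the pretraining set.
REPHRASE_PROMPT = (
    "Rewrite the following web document as a clear, self-contained "
    "explanation, preserving every fact:\n\n{doc}"
)

def synthesize(docs, llm_generate):
    for doc in docs:
        yield llm_generate(REPHRASE_PROMPT.format(doc=doc))
```

Scaled to trillions of tokens, mixtures like this are how a 3B model could be trained to beat an 8B model trained on raw web data, per the thread's claim.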
Oleksii Kuchaiev (@kuchaev) 's Twitter Profile Photo

We are excited to release the Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with an open base model and training data. It also supports runtime "thinking" budget control. HF collection with base and post-trained models: huggingface.co/collections/nv…
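A crude way to emulate a runtime thinking budget with vanilla `transformers` (the model id is assumed from the announcement, and the `</think>` delimiter and two-phase trick are assumptions, not Nemotron's documented interface):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"  # assumed HF repo name
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is 17 * 24?"}],
    tokenize=False, add_generation_prompt=True)

ids = tok(prompt, return_tensors="pt")
# Phase 1: allow at most 256 tokens of "thinking".
think = model.generate(**ids, max_new_tokens=256)
# Phase 2: force the reasoning span closed and decode the final answer.
closed = tok(tok.decode(think[0]) + "\n</think>\n", return_tensors="pt")
print(tok.decode(model.generate(**closed, max_new_tokens=64)[0]))
```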
Jason Weston (@jaseweston) 's Twitter Profile Photo

🤖 Introducing OptimalThinkingBench 🤖
📝: arxiv.org/abs/2508.13141
- Thinking LLMs use a lot of tokens & overthink; non-thinking LLMs underthink & underperform.
- We introduce a benchmark which scores models in the quest to find the best mix.
- OptimalThinkingBench reports the F1…
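The truncated last bullet refers to an F1-style score; the standard formula (the benchmark's exact definition is in the paper) is the harmonic mean of precision and recall:

```python
# F1 = 2PR / (P + R); here P and R would measure how often a model thinks
# exactly as much as a problem warrants. Numbers are illustrative only.
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(f1(0.8, 0.6))  # ~0.686
```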
Evan Solomon (@evanlsolomon) 's Twitter Profile Photo

Canada has signed an MOU with 🇨🇦 AI leader Cohere to explore how AI can make government services faster, smarter and more secure, while backing Canadian tech on the global stage. We're championing Canadian companies building world-class tools, and making them work for Canadians.

Jay Alammar (@jayalammar) 's Twitter Profile Photo

The Illustrated GPT-OSS

New post! A visual tour of the architecture, message formatting, and reasoning of the latest GPT.

Link in 🧵
Saining Xie (@sainingxie) 's Twitter Profile Photo

I know OP is click-baiting, but let me bite... FWIW, every researcher's DREAM is to find out their architecture is wrong. If it's never wrong, that's a bigger problem. We try to break DiT every day w/ SiT, REPA, REPA-E etc., but you gotta form hypotheses, run experiments, test, not…

Sander Dieleman (@sedielem) 's Twitter Profile Photo

New survey on diffusion language models: arxiv.org/abs/2508.10875 (via Nicolas Perez-Nieves). Covers pre/post-training, inference and multimodality, with very nice illustrations.

I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023 🥲
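For contrast with the continuous, embedding-space Gaussian-noise approach being mourned here, the discrete masking corruption that dominates current diffusion LMs looks roughly like this (toy shapes, no real model):

```python
import torch

def forward_mask(tokens, t, mask_id):
    # Corrupt a sequence by independently masking each token w.p. t.
    noise = torch.rand(tokens.shape)
    return torch.where(noise < t, torch.full_like(tokens, mask_id), tokens)

vocab, mask_id = 1000, 999
x0 = torch.randint(0, vocab - 1, (1, 16))      # clean token ids
xt = forward_mask(x0, t=0.5, mask_id=mask_id)  # ~half the positions masked
# A denoiser network then predicts the original tokens at masked positions;
# sampling iterates from t=1 (fully masked) down to t=0.
```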
Jonathan Gorard (@getjonwithit) 's Twitter Profile Photo

Today, I decided to try again at using generative AI (GPT-5 and Grok 4) to assist with a math/physics research problem. For context: I've been developing and testing a new first-order, flux-conservative formulation of the Einstein equations that can be evolved... (1/6)

Ernest Ryu (@ernestryu) 's Twitter Profile Photo

This is really exciting and impressive, and this stuff is in my area of mathematics research (convex optimization). I have a nuanced take. 🧵 (1/9)

TechCrunch (@techcrunch) 's Twitter Profile Photo

The freeze went into effect last week, and it's not clear how long it will last, The Journal's sources say. Meta is still likely working through its reorg, which split its AI unit, Meta Superintelligence Labs, into four new groups: TBD Labs, run by form... techcrunch.com/2025/08/21/rep…

Chris Offner (@chrisoffner3d) 's Twitter Profile Photo

"It is beautiful. It is elegant. Does it work well in practice? Not really. This is often the caveat we face in research: the things that are beautiful don't work and the things that work are not beautiful." โ€“ Daniel Cremers

Jiawei Zhao (@jiawzhao) 's Twitter Profile Photo

Introducing DeepConf: Deep Think with Confidence 🚀 First method to achieve 99.9% on AIME 2025 with open-source models! Using GPT-OSS-120B, even without tools, we reached this almost-perfect accuracy while saving up to 85% of generated tokens. It also delivers many strong…
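A sketch of the confidence-filtered self-consistency idea behind DeepConf: sample many reasoning traces, score each by its mean token log-probability, keep only the most confident, and majority-vote over those. The paper's scoring is more refined than this; `fake_sampler` is a stand-in:

```python
import random
from collections import Counter

def deepconf_vote(sample_trace, n=64, keep_frac=0.1):
    traces = [sample_trace() for _ in range(n)]    # (answer, confidence)
    traces.sort(key=lambda t: t[1], reverse=True)  # most confident first
    kept = traces[: max(1, int(n * keep_frac))]    # drop low-confidence
    return Counter(a for a, _ in kept).most_common(1)[0][0]

def fake_sampler():
    # Stand-in for sampling a full reasoning trace from the model.
    return random.choice(["42", "42", "41"]), -random.random()

print(deepconf_vote(fake_sampler))
```

Filtering before voting is also where the token savings come from: low-confidence traces can be terminated early instead of generated to completion.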

Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo

This is an unwise statement that can only make people confused about what LLMs can or cannot do. Let me tell you something: programming is NOT about solving this kind of ad hoc automation problem. Yeah, by scraping available data and then clustering it, LLMs can sometimes solve…

SemiAnalysis (@semianalysis_) 's Twitter Profile Photo

TogetherAI's Chief Scientist Tri Dao announced Flash Attention v4 at the Hot Chips conference, which is up to 22% faster than the attention kernel implementation from NVIDIA's cuDNN library. Tri Dao was able to achieve this with 2 key algorithmic changes. Firstly, it uses a new online…
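The tweet cuts off at "a new online", likely referring to a variant of the online softmax trick FlashAttention is built on: a single pass keeps a running max and a rescaled running sum, so the softmax normalizer never needs the whole row in memory. A minimal scalar version (whatever v4 actually changes is not in the truncated text):

```python
import math

def online_softmax_denominator(xs):
    m, s = float("-inf"), 0.0
    for x in xs:
        if x > m:
            s = s * math.exp(m - x) + 1.0  # rescale old sum to the new max
            m = x
        else:
            s += math.exp(x - m)
    return m, s  # softmax(x_i) = exp(x_i - m) / s

m, s = online_softmax_denominator([1.0, 3.0, 2.0, 5.0])
```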
Matt Beard (@matt_beard_) 's Twitter Profile Photo

How is it possible for the New York Times to get this so completely wrong? Apple ahead of Anthropic in AI? Did nobody read this before it went out?
