
Pierre Richemond 🇪🇺
@theonekloud
Pretraining @cohere. @ImperialCollege PhD, Paris VI - @Polytechnique, ENST, @HECParis alum. Prev @GoogleDeepMind scientist, @GoldmanSachs trader. Views mine.
ID: 173117297
31-07-2010 13:44:22
13.13K Tweets
1.1K Followers
873 Following

(1/2) How to accelerate the reconstruction of 3D Gaussian Splatting? 3DGS-LM replaces the commonly used ADAM optimizer with a tailored Levenberg-Marquardt (LM). => We are 30% faster than ADAM for the same quality. lukashoel.github.io/3DGS-LM/ youtu.be/tDiGuGMssg8
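A minimal sketch of the underlying idea, not the 3DGS-LM implementation itself (the paper's solver, Jacobian handling and GPU parallelization are specific to Gaussian Splatting): a Levenberg-Marquardt step solves damped normal equations on the residuals, instead of Adam's first-order moment updates. The function names and the toy fitting problem below are illustrative assumptions.

import numpy as np

def lm_step(residual_fn, jacobian_fn, params, lam=1e-3):
    # One Levenberg-Marquardt update on a least-squares objective.
    # residual_fn(params) -> r with shape (m,)
    # jacobian_fn(params) -> J with shape (m, n), J[i, j] = dr_i / dparams_j
    # lam damps the Gauss-Newton step: lam -> 0 gives Gauss-Newton,
    # large lam approaches a (scaled) gradient-descent step.
    r = residual_fn(params)
    J = jacobian_fn(params)
    JtJ = J.T @ J
    A = JtJ + lam * np.diag(np.diag(JtJ))  # damped normal equations
    g = J.T @ r                            # gradient of 0.5 * ||r||^2
    delta = np.linalg.solve(A, -g)
    return params + delta

# Toy usage: fit y = a*x + b by least squares.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 0.5 + 0.01 * rng.standard_normal(50)
residual = lambda p: p[0] * x + p[1] - y
jacobian = lambda p: np.stack([x, np.ones_like(x)], axis=1)
params = np.zeros(2)
for _ in range(10):
    params = lm_step(residual, jacobian, params)
print(params)  # close to [2.0, 0.5]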

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens:
- 3B LLMs beat 8B models
- Pareto frontier for performance

Canada has signed an MOU with 🇨🇦 AI leader Cohere to explore how AI can make government services faster, smarter and more secure, while backing Canadian tech on the global stage. We're championing Canadian companies building world-class tools, and making them work for Canadians.


I know OP is click-baiting, but let me bite... FWIW, every researcher's DREAM is to find out their architecture is wrong. If it's never wrong, that's a bigger problem. We try to break DiT every day w/ SiT, REPA, REPA-E etc., but you gotta form hypotheses, run experiments, test, not

New survey on diffusion language models: arxiv.org/abs/2508.10875 (via Nicolas Perez-Nieves). Covers pre/post-training, inference and multimodality, with very nice illustrations. I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023 🥲
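For readers unfamiliar with the distinction, a toy sketch of my own (not from the survey) of the two forward corruption processes: continuous diffusion LMs add Gaussian noise to token embeddings, while the discrete variants that now dominate corrupt the token sequence itself, typically by masking. The shapes, ids and schedules below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def continuous_forward(token_embeddings, t, betas):
    # Continuous approach: q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0,
    # (1 - alpha_bar_t) * I), applied to token *embeddings*.
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    noise = rng.standard_normal(token_embeddings.shape)
    return np.sqrt(alpha_bar) * token_embeddings + np.sqrt(1.0 - alpha_bar) * noise

def masked_forward(token_ids, t, num_steps, mask_id):
    # Discrete (absorbing-state) approach: each token is independently
    # replaced by a mask id with probability growing in t.
    mask_prob = (t + 1) / num_steps
    mask = rng.random(token_ids.shape) < mask_prob
    return np.where(mask, mask_id, token_ids)

# Toy usage.
emb = rng.standard_normal((8, 16))          # 8 tokens, 16-dim embeddings
betas = np.linspace(1e-4, 0.02, 100)
print(continuous_forward(emb, t=50, betas=betas).shape)   # (8, 16)

ids = np.array([5, 9, 2, 7, 1, 3, 8, 4])
print(masked_forward(ids, t=50, num_steps=100, mask_id=0))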

The freeze went into effect last week, and it's not clear how long it will last, The Journal's sources say. Meta is still likely working through its reorg, which split its AI unit, Meta Superintelligence Labs, into four new groups: TBD Labs, run by form... techcrunch.com/2025/08/21/rep…
