Pierre Richemond 🇪🇺 (@theonekloud)'s Twitter Profile
Pierre Richemond 🇪🇺

@theonekloud

Pretraining @cohere. @ImperialCollege PhD, Paris VI - @Polytechnique, ENST, @HECParis alum. Prev @GoogleDeepMind scientist, @GoldmanSachs trader. Views mine.

ID: 173117297

Joined: 31-07-2010 13:44:22

13.1K Tweets

1.1K Followers

873 Following

Matthias Niessner (@mattniessner) 's Twitter Profile Photo

(1/2)
How to accelerate the reconstruction of 3D Gaussian Splatting?

3DGS-LM replaces the commonly used ADAM optimizer with a tailored Levenberg-Marquardt (LM). 
=> We are 30% faster than 3DGS for the same quality.

lukashoel.github.io/3DGS-LM/
youtu.be/tDiGuGMssg8
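For context on why LM can beat Adam here: 3DGS fitting is a nonlinear least-squares problem, and LM exploits that structure through the Gauss-Newton matrix J^T J rather than first-order moment estimates. A toy LM step follows (illustrative only; not the paper's batched CUDA solver, and the curve-fitting example is invented):

```python
# One Levenberg-Marquardt step: solve (J^T J + lam*I) dx = -J^T r.
import numpy as np

def lm_step(residual_fn, jac_fn, x, lam=1e-2):
    r = residual_fn(x)                  # residuals, shape (m,)
    J = jac_fn(x)                       # Jacobian, shape (m, n)
    A = J.T @ J + lam * np.eye(x.size)  # damped Gauss-Newton system
    return x + np.linalg.solve(A, -J.T @ r)

# Toy problem: fit y = a*exp(b*t) to noisy samples.
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(1.5 * t) + 0.01 * np.random.randn(50)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)

p = np.array([1.0, 1.0])
for _ in range(20):
    p = lm_step(res, jac, p)  # converges in a handful of iterations
```

On least-squares objectives like the 3DGS photometric error, each step uses curvature information and so needs far fewer iterations than per-parameter first-order updates, which is where a wall-clock speedup can come from.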
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

Thyme: Think Beyond Images

"Thyme transcends traditional "thinking with images" paradigms by  autonomously generating and executing diverse image processing and  computational operations through executable code, significantly  enhancing performance on high-resolution perception
Pratyush Maini (@pratyushmaini) 's Twitter Profile Photo

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens 🧑🏼‍🍳
- 3B LLMs beat 8B models 🚀
- Pareto frontier for performance
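As a rough illustration of the rephrasing family of synthetic-data pipelines that BeyondWeb belongs to (the thread does not disclose DatologyAI's actual recipe; `llm_generate` is a placeholder for any instruction-tuned model):

```python
# Turn raw web documents into cleaner synthetic training text by asking an
# instruction-tuned model to rewrite them, then mix into the pretraining set.
REPHRASE_PROMPT = (
    "Rewrite the following web document as a clear, self-contained "
    "explanation, preserving every fact:\n\n{doc}"
)

def synthesize(docs, llm_generate):
    for doc in docs:
        yield llm_generate(REPHRASE_PROMPT.format(doc=doc))
```

Scaled to trillions of tokens, mixtures like this are how a 3B model could be trained to beat an 8B model trained on raw web data, per the thread's claim.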
Oleksii Kuchaiev (@kuchaev) 's Twitter Profile Photo

We are excited to release the Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with an open base model and training data. It also supports runtime "thinking" budget control. HF collection with base and post-trained models: huggingface.co/collections/nv…
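A crude way to emulate a runtime thinking budget with vanilla `transformers` (the model id is assumed from the announcement, and the `</think>` delimiter and two-phase trick are assumptions, not Nemotron's documented interface):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"  # assumed HF repo name
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is 17 * 24?"}],
    tokenize=False, add_generation_prompt=True)

ids = tok(prompt, return_tensors="pt")
# Phase 1: allow at most 256 tokens of "thinking".
think = model.generate(**ids, max_new_tokens=256)
# Phase 2: force the reasoning span closed and decode the final answer.
closed = tok(tok.decode(think[0]) + "\n</think>\n", return_tensors="pt")
print(tok.decode(model.generate(**closed, max_new_tokens=64)[0]))
```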
Jason Weston (@jaseweston) 's Twitter Profile Photo

🤖 Introducing OptimalThinkingBench 🤖
📝: arxiv.org/abs/2508.13141
- Thinking LLMs use a lot of tokens & overthink; non-thinking LLMs underthink & underperform.
- We introduce a benchmark which scores models in the quest to find the best mix.
- OptimalThinkingBench reports the F1…
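The truncated last bullet refers to an F1-style score; the standard formula (the benchmark's exact definition is in the paper) is the harmonic mean of precision and recall:

```python
# F1 = 2PR / (P + R); here P and R would measure how often a model thinks
# exactly as much as a problem warrants. Numbers are illustrative only.
def f1(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(f1(0.8, 0.6))  # ~0.686
```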
Evan Solomon (@evanlsolomon) 's Twitter Profile Photo

Canada has signed an MOU with 🇨🇦 AI leader Cohere to explore how AI can make government services faster, smarter and more secure, while backing Canadian tech on the global stage. We're championing Canadian companies building world-class tools, and making them work for Canadians.

Jay Alammar (@jayalammar) 's Twitter Profile Photo

The Illustrated GPT-OSS

New post! A visual tour of the architecture, message formatting, and reasoning of the latest GPT.

Link in 🧵
Saining Xie (@sainingxie) 's Twitter Profile Photo

I know OP is click-baiting, but let me bite... FWIW, every researcher's DREAM is to find out their architecture is wrong. If it's never wrong, that's a bigger problem. We try to break DiT every day w/ SiT, REPA, REPA-E etc., but you gotta form hypotheses, run experiments, test, not…

Sander Dieleman (@sedielem) 's Twitter Profile Photo

New survey on diffusion language models: arxiv.org/abs/2508.10875 (via Nicolas Perez-Nieves). Covers pre/post-training, inference and multimodality, with very nice illustrations.

I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023 🥲
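For contrast with the continuous, embedding-space Gaussian-noise approach being mourned here, the discrete masking corruption that dominates current diffusion LMs looks roughly like this (toy shapes, no real model):

```python
import torch

def forward_mask(tokens, t, mask_id):
    # Corrupt a sequence by independently masking each token w.p. t.
    noise = torch.rand(tokens.shape)
    return torch.where(noise < t, torch.full_like(tokens, mask_id), tokens)

vocab, mask_id = 1000, 999
x0 = torch.randint(0, vocab - 1, (1, 16))      # clean token ids
xt = forward_mask(x0, t=0.5, mask_id=mask_id)  # ~half the positions masked
# A denoiser network then predicts the original tokens at masked positions;
# sampling iterates from t=1 (fully masked) down to t=0.
```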
Jonathan Gorard (@getjonwithit) 's Twitter Profile Photo

Today, I decided to try again at using generative AI (GPT-5 and Grok 4) to assist with a math/physics research problem. For context: I've been developing and testing a new first-order, flux-conservative formulation of the Einstein equations that can be evolved... (1/6)

Ernest Ryu (@ernestryu) 's Twitter Profile Photo

This is really exciting and impressive, and this stuff is in my area of mathematics research (convex optimization). I have a nuanced take. 🧵 (1/9)

TechCrunch (@techcrunch) 's Twitter Profile Photo

The freeze went into effect last week, and it's not clear how long it will last, The Journal's sources say. Meta is still likely working through its reorg, which split its AI unit, Meta Superintelligence Labs, into four new groups: TBD Labs, run by form... techcrunch.com/2025/08/21/rep…

Chris Offner (@chrisoffner3d) 's Twitter Profile Photo

"It is beautiful. It is elegant. Does it work well in practice? Not really. This is often the caveat we face in research: the things that are beautiful don't work and the things that work are not beautiful." โ€“ Daniel Cremers

Jiawei Zhao (@jiawzhao) 's Twitter Profile Photo

Introducing DeepConf: Deep Think with Confidence 🚀 First method to achieve 99.9% on AIME 2025 with open-source models! Using GPT-OSS-120B, even without tools, we reached this almost-perfect accuracy while saving up to 85% of generated tokens. It also delivers many strong…
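A sketch of the confidence-filtered self-consistency idea behind DeepConf: sample many reasoning traces, score each by its mean token log-probability, keep only the most confident, and majority-vote over those. The paper's scoring is more refined than this; `fake_sampler` is a stand-in:

```python
import random
from collections import Counter

def deepconf_vote(sample_trace, n=64, keep_frac=0.1):
    traces = [sample_trace() for _ in range(n)]    # (answer, confidence)
    traces.sort(key=lambda t: t[1], reverse=True)  # most confident first
    kept = traces[: max(1, int(n * keep_frac))]    # drop low-confidence
    return Counter(a for a, _ in kept).most_common(1)[0][0]

def fake_sampler():
    # Stand-in for sampling a full reasoning trace from the model.
    return random.choice(["42", "42", "41"]), -random.random()

print(deepconf_vote(fake_sampler))
```

Filtering before voting is also where the token savings come from: low-confidence traces can be terminated early instead of generated to completion.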

Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo

This is an unwise statement that can only make people confused about what LLMs can or cannot do. Let me tell you something: programming is NOT about solving this kind of ad hoc automation problem. Yeah, by scraping available data and then clustering it, LLMs can sometimes solve…

SemiAnalysis (@semianalysis_) 's Twitter Profile Photo

TogetherAI's Chief Scientist Tri Dao announced Flash Attention v4 at the Hot Chips conference, which is up to 22% faster than the attention kernel implementation from NVIDIA's cuDNN library. Tri Dao was able to achieve this with 2 key algorithmic changes. Firstly, it uses a new online…
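The tweet cuts off at "a new online", likely referring to a variant of the online softmax trick FlashAttention is built on: a single pass keeps a running max and a rescaled running sum, so the softmax normalizer never needs the whole row in memory. A minimal scalar version (whatever v4 actually changes is not in the truncated text):

```python
import math

def online_softmax_denominator(xs):
    m, s = float("-inf"), 0.0
    for x in xs:
        if x > m:
            s = s * math.exp(m - x) + 1.0  # rescale old sum to the new max
            m = x
        else:
            s += math.exp(x - m)
    return m, s  # softmax(x_i) = exp(x_i - m) / s

m, s = online_softmax_denominator([1.0, 3.0, 2.0, 5.0])
```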
Matt Beard (@matt_beard_) 's Twitter Profile Photo

How is it possible for the New York Times to get this so completely wrong? Apple ahead of Anthropic in AI? Did nobody read this before it went out?
