NYRE (@sleenyre) Twitter Tweets • TwiCopy

NYRE

@sleenyre

ML @krea_ai, prev fire fighter oleve.co

ID: 1656820518119407619

Link: https://re-n-y.github.io/devlog/rambling/ • Joined: 12-05-2023 00:36:20

169 Tweets

266 Followers

178 Following

NYRE

@sleenyre

2 months ago

I've also been wanting to write a blog post on pretraining + post-training: open challenges and solutions for the new round of generative models. Hopefully I will get to it this month.

Likes: 6 • Replies: 0 • Retweets: 0

today we're open-sourcing Krea Realtime. this 14B autoregressive model is 10x larger than any open-source equivalent, and it can generate long-form videos at 11 fps on a single B200. weights and technical report below 👇

Likes: 1.1K • Replies: 59 • Retweets: 203

Vik Paruchuri

@vikparuchuri

a month ago

I'm excited to announce that Chandra OCR is open source!
- Full layout information
- Extracts and captions images and diagrams
- Strong handwriting, form, table support
- Works with transformers and vLLM

Likes: 1.1K • Replies: 56 • Retweets: 188

NYRE

@sleenyre

a month ago

What is torchcomms.....? Whatever it is looking forward to it.

Likes: 5 • Replies: 0 • Retweets: 0

NYRE

@sleenyre

a month ago

Torchcomm + Monarch
Christmas came early. I know what I'm doing this weekend :)
pytorch.org/blog/torchcomm… pytorch.org/blog/introduci…

Likes: 10 • Replies: 0 • Retweets: 1

Sayak Paul

@risingsayak

a month ago

With simple changes, I was able to cut down KREA AI's new real-time video gen's timing from 25.54s to 18.14s 🔥🚀
1. FA3 through `kernels`
2. Regional compilation
3. Selective (FP8) quantization
Notes are in 🧵 below

Likes: 108 • Replies: 5 • Retweets: 13

NYRE

@sleenyre

a month ago

Here's a fun ML engineering question. In TorchTitan / Lingua, qkv projections are unfused (i.e. three separate linear layers), which is known to be inefficient. Is this on purpose? If so, why?

Likes: 149 • Replies: 8 • Retweets: 2
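
For readers unfamiliar with the distinction raised in the question above, here is a minimal sketch of fused versus unfused QKV projections. It uses plain Python lists rather than PyTorch, and every name in it (`matmul`, `fuse_weights`, `unfused_qkv`, `fused_qkv`) is hypothetical and chosen for illustration; it is not how TorchTitan or Lingua implement attention. It only shows that one projection with a concatenated weight matrix gives the same Q, K and V as three separate projections, which is why fusing is usually treated as a pure efficiency choice.

```python
import random


def matmul(x, w):
    """Multiply x (n x d_in) by w (d_in x d_out), both as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of floats in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def unfused_qkv(x, wq, wk, wv):
    """Three separate projections: three matmuls over the same input."""
    return matmul(x, wq), matmul(x, wk), matmul(x, wv)


def fuse_weights(wq, wk, wv):
    """Concatenate the three weight matrices along the output dimension."""
    return [rq + rk + rv for rq, rk, rv in zip(wq, wk, wv)]


def fused_qkv(x, w_qkv, d_model):
    """One projection with the concatenated weight, then split into q, k, v."""
    out = matmul(x, w_qkv)
    q = [row[:d_model] for row in out]
    k = [row[d_model:2 * d_model] for row in out]
    v = [row[2 * d_model:] for row in out]
    return q, k, v
```

In a real framework the fused form launches one larger matrix multiply instead of three smaller ones. The sketch does not address the tweet's actual question of why a training codebase might keep the projections separate.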

NYRE

@sleenyre

19 days ago

Did you know that you can cache torch.compile to Redis?

Likes: 5 • Replies: 0 • Retweets: 0
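
As a rough illustration of the Redis caching mentioned above, the sketch below sets the environment variables that, as far as I understand PyTorch's compile-time caching documentation, switch on Inductor's remote FX graph cache and point it at a Redis server. The helper name `configure_redis_compile_cache` and the example host are hypothetical, and the variable names should be checked against the PyTorch version in use. The tweet itself does not say which mechanism it means.

```python
import os


def configure_redis_compile_cache(host="localhost", port=6379):
    """Set env vars for Inductor's Redis-backed remote cache (assumed names).

    Call this before torch.compile runs; it does not import torch itself.
    """
    settings = {
        # Assumed flag: enables the remote (Redis-based) FX graph cache.
        "TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE": "1",
        # Assumed names: where the Redis server is reachable.
        "TORCHINDUCTOR_REDIS_HOST": host,
        "TORCHINDUCTOR_REDIS_PORT": str(port),
    }
    os.environ.update(settings)
    return settings
```

A usage example would be calling `configure_redis_compile_cache("redis.internal", 6379)` at process start, where `redis.internal` is a placeholder hostname. A reachable Redis server and the `redis` Python package would also be needed for the cache to take effect.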

Physical Intelligence

@physical_int

15 days ago

Our model can now learn from its own experience with RL! Our new π*0.6 model can more than double throughput over a base model trained without RL, and can perform real-world tasks: making espresso drinks, folding diverse laundry, and assembling boxes. More in the thread below.

Likes: 1.1K • Replies: 72 • Retweets: 299