Ted (@tedinreallife)'s Twitter Profile
Ted

@tedinreallife

still on pace for PhD-before-50

ID: 2732174480

Joined: 14-08-2014 16:36:01

1.1K Tweets

120 Followers

556 Following

Glenn K. Lockwood (@glennklockwood)'s Twitter Profile Photo

The AI world is in a GPU crunch and meanwhile NERSC is offering 50% off its A100 GPUs (rest.nersc.gov/REST/announcem…). They could make a killing by backfilling idle capacity with commercial workloads 💰 💰 💰

Soumith Chintala (@soumithchintala)'s Twitter Profile Photo

Regulation starts at roughly two orders of magnitude larger than a ~70B Transformer trained on 2T tokens -- which is ~5e24. Note: increasing the size of the dataset OR the size of the transformer increases training flops. The (rumored) size of GPT-4 is regulated.
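For context, a minimal back-of-envelope sketch using the standard FLOPs ≈ 6·N·D approximation (N = parameters, D = training tokens). This rule of thumb is an assumption here and not necessarily the accounting behind the tweet's ~5e24 figure:

```python
# Rule-of-thumb training compute for a dense Transformer:
# total FLOPs ~= 6 * N * D, with N = parameters, D = training tokens.
def training_flops(n_params: float, n_tokens: float) -> float:
    return 6.0 * n_params * n_tokens

base = training_flops(70e9, 2e12)            # ~70B params on 2T tokens
print(f"base run:  {base:.1e} FLOPs")        # ~8.4e23
print(f"100x base: {100 * base:.1e} FLOPs")  # ~8.4e25, i.e. near the
# 1e26-FLOP thresholds discussed in compute-based AI regulation.
```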

Christian Szegedy (@chrszegedy)'s Twitter Profile Photo

Inception used 1.5X less compute than AlexNet and 12X less than VGG, outperforming both. The trend continued with MobileNet... etc. IMO, today's LLMs are insanely inefficient per unit of compute. Regulations that impose limits on the amount of compute spent on AI training will just

Eliezer Yudkowsky ⏹️ (@esyudkowsky)'s Twitter Profile Photo

Me: Can you draw a very normal image? ChatGPT: Here is a very normal image depicting a tranquil suburban street scene during the daytime. Me: Not bad, but can you go more normal than that? (cont.)

Prof. Anima Anandkumar (@animaanandkumar)'s Twitter Profile Photo

How do we capture local features across multiple resolutions? While standard convolutional layers work only on a fixed input-resolution, we design local neural operators that learn integral and differential kernels, and are principled ways to extend standard convolutions to

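A minimal sketch of the rescaling idea behind such layers, under the assumption that "integral" weights scale with the grid spacing h (so the sum approximates an integral) while zero-mean "differential" stencils scale with 1/h (so they approximate a derivative). `LocalOperator1D` is hypothetical, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalOperator1D(nn.Module):
    """Toy resolution-aware local layer: the same learned stencils
    target a fixed continuous operator at any input resolution."""
    def __init__(self, width: int = 5):
        super().__init__()
        self.w_int = nn.Parameter(torch.randn(1, 1, width) * 0.1)
        self.w_diff = nn.Parameter(torch.randn(1, 1, width) * 0.1)

    def forward(self, u: torch.Tensor, h: float) -> torch.Tensor:
        # u: (batch, 1, n) samples of a function on a grid with spacing h
        pad = self.w_int.shape[-1] // 2
        integral = F.conv1d(u, self.w_int, padding=pad) * h
        stencil = self.w_diff - self.w_diff.mean()   # zero-mean stencil
        derivative = F.conv1d(u, stencil, padding=pad) / h
        return integral + derivative

# The same module applied to one function sampled at two resolutions:
op = LocalOperator1D()
u64 = torch.sin(torch.linspace(0, 6.28, 64)).view(1, 1, -1)
u256 = torch.sin(torch.linspace(0, 6.28, 256)).view(1, 1, -1)
y64, y256 = op(u64, h=6.28 / 63), op(u256, h=6.28 / 255)
```

A plain convolution would compute a different operator at each resolution; the h and 1/h scalings are what let one set of weights behave consistently across grids.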
Bojan Tunguz (@tunguz)'s Twitter Profile Photo

And today is T - 2 weeks for my other @NVIDIA #GTC session - a fireside chat with Christian Szegedy, cofounder of @xAI. Christian is one of the seminal research figures in the Deep Learning community, but the main focus of our chat will be on something that he has been working very

Damien Teney (@damienteney)'s Twitter Profile Photo

Why do neural nets generalize so well?🤔 There's a ton of work on SGD, flat minima, ... but the root cause is that their inductive biases somehow match properties of real-world data.🌎 We've examined these inductive biases in *untrained* networks.🎲 arxiv.org/abs/2403.02241 ⬇️

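A toy probe of the same question: sample many *untrained* ReLU nets and measure how wiggly the functions they compute are. This is far simpler than the paper's methodology and only illustrates the "inspect inductive biases before training" idea; the sign-change count is a crude, hypothetical complexity proxy:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(widths):
    """Sample an untrained ReLU MLP with Gaussian init."""
    return [(rng.normal(0, 1 / np.sqrt(fi), (fo, fi)),
             rng.normal(0, 0.1, (fo, 1)))
            for fi, fo in zip(widths[:-1], widths[1:])]

def forward(params, x):
    h = x
    for W, b in params[:-1]:
        h = np.maximum(W @ h + b, 0)
    W, b = params[-1]
    return W @ h + b

# Evaluate many untrained nets on a 1D slice and count sign changes
# around the mean: random nets tend to realize smooth, low-frequency
# functions rather than arbitrary ones.
xs = np.linspace(-3, 3, 512).reshape(1, -1)
counts = []
for _ in range(200):
    ys = forward(random_relu_net([1, 64, 64, 1]), xs).ravel()
    counts.append(int((np.diff(np.sign(ys - ys.mean())) != 0).sum()))
print("median sign changes across 200 untrained nets:",
      int(np.median(counts)))
```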
Ted (@tedinreallife)'s Twitter Profile Photo

Mechanism for feature learning in neural networks and backpropagation-free machine learning models | Science science.org/doi/10.1126/sc…

Zhenhailong Wang (@zhenhailongw)'s Twitter Profile Photo

Large multimodal models often lack precise low-level perception needed for high-level reasoning, even with simple vector graphics. We bridge this gap by proposing an intermediate symbolic representation that leverages LLMs for text-based reasoning. mikewangwzhl.github.io/VDLM 🧵1/4
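A toy sketch of the pipeline shape (vector graphics -> symbolic text -> LLM prompt). The `svg_to_symbolic` function and its output format are hypothetical stand-ins, not VDLM's actual intermediate representation:

```python
import xml.etree.ElementTree as ET

def svg_to_symbolic(svg_text: str) -> str:
    """Flatten SVG primitives into plain-text facts that a text-only
    LLM can reason over (toy stand-in for a symbolic representation)."""
    ns = "{http://www.w3.org/2000/svg}"
    facts = []
    for el in ET.fromstring(svg_text).iter():
        tag = el.tag.replace(ns, "")
        if tag == "circle":
            facts.append(f"circle(cx={el.get('cx')}, cy={el.get('cy')}, "
                         f"r={el.get('r')})")
        elif tag == "rect":
            facts.append(f"rect(x={el.get('x')}, y={el.get('y')}, "
                         f"w={el.get('width')}, h={el.get('height')})")
    return "\n".join(facts)

svg = ('<svg xmlns="http://www.w3.org/2000/svg">'
       '<circle cx="10" cy="10" r="5"/>'
       '<rect x="0" y="0" width="4" height="4"/></svg>')
prompt = ("Given these shapes:\n" + svg_to_symbolic(svg) +
          "\nWhich shape encloses more area?")
# `prompt` would then go to a text-only LLM for the reasoning step.
print(prompt)
```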