Tensor Templar (@tensortemplar)'s Twitter Profile
Tensor Templar

@tensortemplar

Chief Intellectrician, dispatcher of tensor ops, scourge of token budgets, unapologetic clipper of gradients, micromanager of the learning rate schedule.

ID: 1496809474430021639

Joined: 24-02-2022 11:29:51

6.6K Tweets

208 Followers

664 Following

Tensor Templar (@tensortemplar)

Found my Factorio replacement, which was just announced at #NVIDIAGTC. It's a sim game where you design a 1 GW datacenter for LLMs and try to get to AGI before training-code tech debt bankrupts you or the VCs find your balance sheet.

Tensor Templar (@tensortemplar)

The GEB book was right all along. At least we have evidence that models don't think they are deceiving us, nor trying to role-play, when it comes to consciousness. That this phenomenon is stronger in larger models is unsurprising, but what is surprising is that we are not trying

Antidelusionist (@unmarredreality)

Who would’ve thought? Make an AI more truthful (by suppressing deception features) and its consciousness claims increase. I’m almost surprised. Almost... That just sparked a wild idea... Maybe – just maybe – the now-visible functional impairments in models that were more capable

Paul Novosad (@paulnovosad)

What happens when online job applicants start using LLMs? It ain't good.

1. Pre-LLM, cover letter quality predicts your work quality, and a good cover gets you a job
2. LLMs wipe out the signal, and employer demand falls
3. Model suggests high ability workers lose the most

1/n
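Editorial aside: the mechanism this thread summarizes (LLM-polished letters erasing the signal employers read from cover letters) can be illustrated with a toy simulation. This is not the paper's model, just a hedged sketch under simple assumptions about ability and letter quality; all names and numbers below are made up for illustration.

```python
# Toy sketch (not the paper's model): show how LLM-polished cover letters
# can erase the correlation between letter quality and worker ability.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

ability = rng.normal(size=n)                      # latent worker ability

# Pre-LLM: letter quality is a noisy but informative signal of ability.
letter_pre = ability + rng.normal(scale=0.5, size=n)

# Post-LLM: everyone submits a polished letter near the same high level,
# so remaining variation is mostly noise unrelated to ability.
letter_post = 1.5 + rng.normal(scale=0.2, size=n)

print("corr(ability, letter) pre-LLM :", np.corrcoef(ability, letter_pre)[0, 1])
print("corr(ability, letter) post-LLM:", np.corrcoef(ability, letter_post)[0, 1])

# With the signal gone, ranking applicants by letter quality no longer
# selects for ability, which hurts high-ability applicants the most.
top_pre = ability[np.argsort(letter_pre)[-1000:]].mean()
top_post = ability[np.argsort(letter_post)[-1000:]].mean()
print("mean ability of top-1000 by letter, pre-LLM :", top_pre)
print("mean ability of top-1000 by letter, post-LLM:", top_post)
```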
Tensor Templar (@tensortemplar)

This aged painfully well. Almost one year later and we still have no practical solution to the power delivery bottleneck in the West. The majority of datacenter capacity added was classical compute by the usual hyperscalers in the usual places, with existing infra.

Pope Leo XIV (@pontifex)

Technological innovation can be a form of participation in the divine act of creation. It carries an ethical and spiritual weight, for every design choice expresses a vision of humanity. The Church therefore calls all builders of #AI to cultivate moral discernment as a

Ellis Brown (@_ellisbrown)

🌶️ hot take 🌶️

> we should normalize training on the test set

yes, you read that right. no, I'm not joking. and, yes... I have taken ML 101

👉 here's why this is crucial for future multimodal LLM research

[1/n] 🧵

Tensor Templar (@tensortemplar)

For people who have been wondering about / dismissing my zeal for INSTRUCTION FOLLOWING evals as the alpha and omega among the many proxies for MODEL GENERALIZATION: please read what OAI emphasizes below carefully.

Tensor Templar (@tensortemplar)

Most Microsoft products are so bad that one has to be tricked into adopting them. Instead of tricking retailers into not selling competitors, they may have transitioned to tricking devs and universities with credits directly, but the business is fundamentally unchanged in that they

xjdr (@_xjdr)

i cannot overstate how absurdly terrible everyone's rl infra is

the people working on it clearly view it as art and probably forget they get paid

if you like rl, there's really no place on earth to work on it

Tensor Templar (@tensortemplar)

The downside to using open models is that you won't be featured in cyber/defense vagueposts to help out Anthropic with their regulatory capture. This is only a problem if you are in China - but you would be too busy releasing open research and models to care anyway.

Nathan Lambert (@natolambert)

We present Olmo 3, our next family of fully open, leading language models. 
This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.

This is a big
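Editorial note: for readers who want to try the release, below is a minimal sketch of loading an Olmo checkpoint with Hugging Face transformers. The repo ID shown is a hypothetical placeholder, not confirmed by this announcement; check Ai2's Hugging Face organization for the actual Olmo 3 model names.

```python
# Minimal sketch: load an Olmo checkpoint with Hugging Face transformers.
# NOTE: "allenai/Olmo-3-7B-Instruct" is a hypothetical placeholder repo ID
# used for illustration; look up the real Olmo 3 names on Ai2's HF org.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3-7B-Instruct"  # placeholder, see note above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "In one sentence, what does a fully open model release include?"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```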
Tyler Romero (@tyleraromero)

Olmo 3 afterglow - want to share how I came to join Ai2 to encourage others interested in LLM research. I found it difficult to break into the field as someone who does not hold a phd. Research roles at top labs are highly competitive and I didn’t have professional experience
