Shekswess (@shekswess) 's Twitter Profile
Shekswess

@shekswess

Machine Learning Lead @lokahq | College Professor @brainster

ID: 1848019348780204032

Link: http://shekswess.github.io/ · Joined: 20-10-2024 15:12:10

94 Tweets

8 Followers

217 Following

Sudo su (@sudoingx) 's Twitter Profile Photo

the people telling you a single 3090 can't ship production quality are not wrong about the ceiling. they're wrong about the conclusion. most of them prompted a model twice, watched it hallucinate, and made a youtube video titled "local AI is NOT ready." they never iterated.

Maziyar PANAHI (@maziyarpanahi) 's Twitter Profile Photo

One protein. 10^150 possible DNA sequences. We trained a transformer to pick the right one. 25 species. $165. mRNA language models that learn context-dependent codon preferences from natural coding sequences. Not frequency tables from the 1980s. Actual sequence understanding.

Jürgen Schmidhuber (@schmidhuberai) 's Twitter Profile Photo

Dr. LeCun's heavily promoted Joint Embedding Predictive Architecture (JEPA, 2022) [5] is the heart of his new company. However, the core ideas are not original to LeCun. Instead, JEPA is essentially identical to our 1992 Predictability Maximization system (PMAX) [1][14].

Joseph Suarez (e/🐡) (@jsuarez5341) 's Twitter Profile Photo

You really can't hate Schmidhuber. Both my first paper and our upcoming Puffer 4 release heavily rely on a trick from one of his now mostly forgotten papers. The dude would be much happier if someone had just handed him a couple of 5090s back then to make all his ideas work better

Liquid AI (@liquidai_) 's Twitter Profile Photo

Today, we release LFM2.5-350M. Agentic loops at 350M parameters. A 350M model trained for reliable data extraction and tool use, where models at this scale typically struggle. <500MB when quantized, built for environments where compute, memory, and latency are constrained. 🧵

will brown (@willccbb) 's Twitter Profile Photo

hiring 1-2 more interns this summer for Applied Research @primeintellect

focus areas = agentic RL, data + evals, or forward-deployed

in-person in SF, relo support provided, US work auth required (sorry), intended for current students

DM me something sick you've been working on

PrismML (@prismml) 's Twitter Profile Photo

Today, we are emerging from stealth and launching PrismML, an AI lab with Caltech origins that is centered on building the most concentrated form of intelligence. At PrismML, we believe that the next major leaps in AI will be driven by order-of-magnitude improvements in

Hynek Kydlíček (@hkydlicek) 's Twitter Profile Photo

Oh shit, it seems like all the HF Research team pretraining data has been accidentally leaked to the public. The web, PDFs, and synthetic datasets are exposed on the HF FineData org... Apparently, an intern used CC to push the data with private=False.

GitLawb (@gitlawb) 's Twitter Profile Photo

We forked the leaked Claude Code source and made it work with ANY LLM: GPT, DeepSeek, Gemini, Llama, MiniMax. Open source. The name is OpenCode

🍓🍓🍓 (@iruletheworldmo) 's Twitter Profile Photo

🚨BREAKING FRONTIER MODEL NEWS

claude mythos set for release april 16th

dario has more leaks than the titanic, here's some info from anthropic staff.

>95 or higher on every single benchmark. except arc agi 3, yet to be tested on.
>dramatically outperforms opus 4.6 on coding,

Shekswess (@shekswess) 's Twitter Profile Photo

I haven't felt this enthusiastic about the interesting stuff I'm working on in a really long time. I hope everything turns out really cool!