Utopic e/λ (@utopicdev)'s Twitter Profile
Utopic e/λ

@utopicdev

AI Designer and Builder.
Technology to save the world.
There Is No Planet B...

ID: 1671549614950957056

21-06-2023 16:05:07

4.4K Tweets

190 Followers

3.3K Following

Yanheng He (@yanhenghe)

🔥 Excited to share our work "Efficient Agent Training for Computer Use"

Q: Do computer use agents need massive data or complex RL to excel?

A: No, with just 312 high-quality trajectories, Qwen2.5-VL can outperform Claude 3.7, setting a new SOTA for Windows computer use.

1/6
Jason Rosenfeld (@jrosenfeld13)

AGI achieved internally. Using DSPy and Keras 3, I built a system where an LLM can self-reflect and modify its own keras source code based on iterative performance. See its reasoning as it changes its neural network architecture after an iteration.

Open source soon.
LightOn (@lightonio)

🌐 From Matching to Reasoning — Retrieval just grew a brain.

LightOn introduces Reason-ModernColBERT, a State-of-the-Art multi-vector model purpose-built for the era of Deep Research — where matching isn’t enough, and true insight demands reasoning.

Built on #ModernBERT and
NovaSky (@novaskyai)

1/N Introducing SkyRL-SQL, a simple, data-efficient RL pipeline for Text-to-SQL that trains LLMs to interactively probe, refine, and verify SQL queries with a real database.

🚀 Early Result: trained on just ~600 samples, SkyRL-SQL-7B outperforms GPT-4o, o4-mini, and SFT model
kyutai (@kyutai_labs)

Talk to unmute.sh 🔊, the most modular voice AI around. Empower any text LLM with voice, instantly, by wrapping it with our new speech-to-text and text-to-speech. Any personality, any voice. Interruptible, smart turn-taking. We’ll open-source everything within the

Maxime Labonne (@maximelabonne)

The French Ministry of Culture released 175k high-quality arena-style preferences

It's exactly the type of data LMSYS stopped releasing.

They created their own chatbot arena with 55 models and open-sourced everything. Incredible work!

🤗 Dataset: huggingface.co/datasets/minis…
Philipp Schmid (@_philschmid)

I just generated a 5:30 min Multi-Speaker Podcast on Agentic Patterns using Gemini 2.5 Flash and our new Text-to-speech (TTS) Model! At I/O we launched native controllable Audio Generation for Gemini 2.5 Pro & Flash. > Controllable style, accent, pace, tone. > single and

clem 🤗 (@clementdelangue)

We want to give more visibility to the whole AI community!

So everyone can now share community blog posts on Hugging Face. Whether you want to share about your latest science breakthrough, the model, dataset or space that you build or just your opinion on the latest AI dramas, you
Sasha Rush (@srush_nlp)

Strong recommend for this book and the JAX/TPU docs, even if you are using Torch / GPUs. Clean notation and mental model for some challenging ideas. 

github.com/jax-ml/scaling…
github.com/jax-ml/scaling…
docs.jax.dev/en/latest/note…
Eugene Yang (@eyangtw)

🚨Wouldn’t it be nice if your agentic search system could reason over all your docs?

✨Introducing Rank-K, a listwise reranker that benefits from test-time compute and long-context! Rank-K sets a new SoTA for reasoning-based reranking, without reasoning chains from other models.
Eugene Yurtsev (@veryboldbagel)

Samuel Colvin, Pydantic, LangChain: Done! If you’d like to collaborate on a framework-agnostic solution, let us know. A code interpreter isn't going to be the deciding feature when choosing between agent frameworks. Many use cases will require a full container-based sandbox anyway...

Omar Khattab (@lateinteraction)

Love this! What GRPO does with this is amplify behavior that produces code aligned with this. Another, far more sample efficient way to amplify that behavior is to grab the whole trajectory that worked best and stick it into the prompt(s). That gives you dspy.BootstrapFewShot,
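
A minimal sketch of the idea in the tweet above: run a program over a few training examples, keep the trajectories that scored best, and paste them into the prompt as demonstrations. This is plain Python, not the DSPy API; `bootstrap_demos`, `build_prompt`, `toy_program`, and `exact_match` are hypothetical names, and `toy_program` is a deterministic stand-in for an LLM call.

```python
from typing import Callable, List, Tuple

Trajectory = Tuple[str, str]  # (question, answer the program produced)


def bootstrap_demos(
    program: Callable[[str, List[Trajectory]], str],
    trainset: List[Tuple[str, str]],
    metric: Callable[[str, str], float],
    max_demos: int = 2,
) -> List[Trajectory]:
    """Run the program on each example and keep the best-scoring trajectories."""
    scored = []
    for question, gold in trainset:
        prediction = program(question, [])  # no demonstrations yet
        score = metric(prediction, gold)
        if score > 0:  # discard trajectories that failed the metric
            scored.append((score, (question, prediction)))
    # Stable sort: ties keep their original training-set order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [trajectory for _, trajectory in scored[:max_demos]]


def build_prompt(question: str, demos: List[Trajectory]) -> str:
    """Place the kept trajectories in front of the new question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in demos]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


def toy_program(question: str, demos: List[Trajectory]) -> str:
    """Hypothetical stand-in for an LLM: only adds single-digit numbers."""
    a, b = (int(x) for x in question.split("+"))
    return str(a + b) if a < 10 and b < 10 else "?"


def exact_match(prediction: str, gold: str) -> float:
    return 1.0 if prediction == gold else 0.0


trainset = [("2+3", "5"), ("40+2", "42"), ("1+1", "2")]
demos = bootstrap_demos(toy_program, trainset, exact_match, max_demos=2)
prompt = build_prompt("4+4", demos)
print(prompt)
```

No weights are updated here: the only thing that changes between runs is the prompt text, which is why this route needs far fewer samples than an RL loop.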

integral. (@integral_io)

DSPy simplifies prompt tuning for optimal LLM responses. We fine-tune prompts based on input/output analysis, addressing incorrect or inappropriate LLM behavior.

TuringPost (@theturingpost)

12 types of JEPA (Joint-Embedding Predictive Architecture)

▪️ I-JEPA
▪️ MC-JEPA
▪️ V-JEPA
▪️ UI-JEPA
▪️ A-JEPA (Audio-based JEPA)
▪️ S-JEPA
▪️ TI-JEPA
▪️ T-JEPA
▪️ ACT-JEPA
▪️ Brain-JEPA
▪️ 3D-JEPA
▪️ Point-JEPA

Save the list and check this out for the links and more info:
Utopic e/λ (@utopicdev)

I don't think people really use LLMs for complex code or projects. Here, Claude 4 couldn't solve anything; everything remains the same...