Dominik Soós (@domsoos)'s Twitter Profile
Dominik Soós

@domsoos

made in hungary🇭🇺 phd student @oducs @webscidl💻 @odufootball @ccsffootball alum

ID: 772109471388610560

Link: http://domsoos.github.io
Joined: 03-09-2016 16:30:17

415 Tweets

352 Followers

345 Following

Deedy (@deedydas)

Research published by Google Deepmind reveals OpenAI Strawberry's approach. Searches at inference through potential responses to reason better. “test-time compute can be used to outperform a 14× larger model.”

Rohan Paul (@rohanpaul_ai)

Now that Microsoft open-sourced the code for one of THE CLASSIC papers of 2024, I am revising the MASTERPIECE. 📚 "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" BitNet b1.58 70B was 4.1 times faster and 8.9 times higher throughput capable than the

Andriy Burkov (@burkov)

This is the system prompt for Apple Intelligence. Turns out Apple's prompt engineers are as clueless about how LLMs work as all the others.

Dominik Soós (@domsoos)

From Budapest🇭🇺 to Norfolk🇺🇸: my journey from the field to the frontlines of AI research. Join me in a story of teamwork, resilience, and passion with City College of San Francisco Football, ODU Football, Jian Wu, ODU Computer Science, and WS-DL Group, ODU CS. Discover more here: ws-dl.blogspot.com/2024/11/2024-1…

Jason Wei (@_jasonwei)

Prediction: within the next year there will be a pretty sharp transition of focus in AI from general user adoption to the ability to accelerate science and engineering. For the past two years it has been about user base and general adoption across the public. This is very

Deedy (@deedydas)

Using light as a neural network, as this viral video depicts, is actually closer than you think. In 5-10yrs, we could have matrix multiplications in constant time O(1) with 95% less energy. This is the next era of Moore's Law. Let's talk about Silicon Photonics... 1/9

Toby Ord (@tobyordoxford)

The Scaling Paradox: AI capabilities have improved remarkably quickly, fuelled by the explosive scale-up of resources to train the leading models. But the scaling laws that inspired this rush actually show very poor returns to scale. What’s going on? 1/ tobyord.com/writing/the-sc…

ollama (@ollama)

DeepSeek's first-generation reasoning models are achieving performance comparable to OpenAI's o1 across math, code, and reasoning tasks! Give it a try! 👇 7B distilled: ollama run deepseek-r1:7b More distilled sizes are available. 🧵

DAIR.AI (@dair_ai)

The Danger of Overthinking: There have been a few papers that look at overthinking in large reasoning models (LRMs). This one provides the most extensive analysis of the issue. It looks at 4K software engineering task trajectories to understand how reasoning models handle

Dominik Soós (@domsoos)

Excited to share that I’ve accepted a Computational Science Internship at Fermilab! Can’t wait to join a world-class lab and contribute to cutting-edge research this summer! ODU Computer Science WS-DL Group, ODU CS

PicoCreator - AI Model Builder 🌉 (@picocreator)

❗️Attention is NOT all you need ❗️ Using only 8 GPUs (not a cluster), we trained a Qwerky-72B (and 32B), without any transformer attention. With evals far surpassing GPT 3.5 turbo, and closing in on 4o-mini. All with 100x++ lower inference cost, via RWKV linear scaling

MatthewBerman (@matthewberman)

We knew very little about how LLMs actually work...until now. Anthropic just dropped the most insane research paper, detailing some of the ways AI "thinks." And it's completely different than we thought. Here are their wild findings: 🧵

Kent Lee Platte (@mathbomb)

Pat Conroy is a FB prospect in the 2025 draft class. He scored a 9.98 RAS out of a possible 10.00. This ranked 2 out of 540 FB from 1987 to 2025. ras.football/ras-informatio…

Sergio Pereira (@sergiorocks)

I almost got scammed yesterday! I'll post it all here to showcase the anatomy of an online scam in the time of LLMs. 1. I received this invitation to speak at a tech summit at a well known University in China.

elvis (@omarsar0)

LLMs Get Lost in Multi-turn Conversation. The cat is out of the bag. Pay attention, devs. This is one of the most common issues when building with LLMs today. Glad there is now a paper to share insights. Here are my notes:

Dominik Soós (@domsoos)

Paper accepted to ACM Hypertext '25! 🎉 We show we can improve open-source LLMs to human-level accuracy at distinguishing between human-written and LLM-generated science news. Thankful to my advisors for their support. Jian Wu Meng Jiang WS-DL Group, ODU CS ODU Computer Science
