Vigneshwarar (@vignesh_warar)'s Twitter Profile
Vigneshwarar

@vignesh_warar

tweet mostly about ai/building

building:
graphthem.com (new!)
keepitshot.com (@KeepitShot)
@GraspSearch (paused)

ID: 769481360

Link: https://vigneshwarar.substack.com · Joined: 20-08-2012 12:59:37

1.1K Tweets

695 Followers

653 Following

ARC Prize (@arcprize)'s Twitter Profile Photo

Today we are announcing ARC-AGI-2, an unsaturated frontier AGI benchmark that challenges AI reasoning systems (same relative ease for humans).

Grand Prize: 85%, ~$0.42/task efficiency

Current Performance:
* Base LLMs: 0%
* Reasoning Systems: <4%
Vigneshwarar (@vignesh_warar)'s Twitter Profile Photo

A TikTok-style clip featuring a repeatedly watched moment from a movie or series on Netflix, with a hook, would be a banger for discoverability.

ARC Prize (@arcprize)'s Twitter Profile Photo

o3 and o4-mini on ARC-AGI's Semi Private Evaluation

* o3-medium scores 53% on ARC-AGI-1
* o4-mini shows state-of-the-art efficiency
* ARC-AGI-2 remains virtually unsolved (<3%)

Through analysis we highlight differences from o3-preview and other model behavior
Adam Zweiger (@adamzweiger)'s Twitter Profile Photo

Here are all the architecture tricks used by gpt-oss:

- Attention sinks - for each attention head, have a learned scalar such that softmax(qk) becomes a softmax over [a_1, a_2, ..., a_T, sink]. Tokens don't have to attend to anything if all the attention scores are low!
-
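The sink mechanism described above can be sketched in a few lines. This is a minimal illustration (not gpt-oss's actual implementation): a single query against a set of keys, with a hypothetical learned scalar `sink` appended to the logits before the softmax. The sink's probability mass is discarded, so when all token scores are low, the real tokens can collectively receive almost no attention.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_with_sink(q, K, sink):
    """Attention weights with a learned per-head sink scalar (sketch).

    The sink logit joins the softmax but contributes no value vector;
    its share of probability is simply dropped, so the returned token
    weights may sum to less than 1.
    """
    scores = K @ q / np.sqrt(q.shape[-1])   # (T,) scaled dot-product logits
    logits = np.append(scores, sink)        # (T+1,) append the sink logit
    probs = softmax(logits)                 # softmax over [a_1..a_T, sink]
    return probs[:-1]                       # discard the sink's mass

# Toy example: all attention scores are zero, so with a large sink
# logit nearly all probability mass flows into the sink.
q = np.zeros(4)
K = np.zeros((3, 4))
w = attention_with_sink(q, K, sink=10.0)
# w sums to far less than 1: the head effectively attends to nothing.
```

In a full model the sink scalar would be a learned parameter per attention head; here it is just passed in as a number to show the mechanics.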

Guangxuan Xiao (@guangxuan_xiao)'s Twitter Profile Photo

I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models.

For those interested in the details:
hanlab.mit.edu/blog/streaming…
Vigneshwarar (@vignesh_warar)'s Twitter Profile Photo

AI explanations + visualizations, tightly integrated. Understand complex concepts intuitively, in minutes. Can't wait to ship.