Vigneshwarar (@vignesh_warar)'s Twitter Profile
Vigneshwarar

@vignesh_warar

tweet mostly about ai/building

building:
graphthem.com (new!)
keepitshot.com (@KeepitShot)
@GraspSearch (paused)

ID: 769481360

Link: https://vigneshwarar.substack.com · Joined: 20-08-2012 12:59:37

1.1K Tweets

695 Followers

653 Following

ARC Prize (@arcprize)'s Twitter Profile Photo

Today we are announcing ARC-AGI-2, an unsaturated frontier AGI benchmark that challenges AI reasoning systems (same relative ease for humans).

Grand Prize: 85%, ~$0.42/task efficiency

Current Performance:
* Base LLMs: 0%
* Reasoning Systems: <4%
Vigneshwarar (@vignesh_warar)'s Twitter Profile Photo

A TikTok-style clip featuring a repeatedly watched moment from a movie or series on Netflix, with a hook, would be a banger for discoverability.

ARC Prize (@arcprize)'s Twitter Profile Photo

o3 and o4-mini on ARC-AGI's Semi Private Evaluation

* o3-medium scores 53% on ARC-AGI-1
* o4-mini shows state-of-the-art efficiency
* ARC-AGI-2 remains virtually unsolved (<3%)

Through analysis we highlight differences from o3-preview and other model behavior
Adam Zweiger (@adamzweiger)'s Twitter Profile Photo

Here are all the architecture tricks used by gpt-oss:

- Attention sinks - for each attention head, have a learned scalar such that softmax(qk) becomes a softmax over [a_1, a_2, ..., a_T, sink]. Tokens don't have to attend to anything if all the attention scores are low!
-
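The sink mechanism described above can be sketched in a few lines. This is a minimal illustration (not gpt-oss's actual implementation): a single query against a set of keys, with a hypothetical learned scalar `sink` appended to the logits before the softmax. The sink's probability mass is discarded, so when all token scores are low, the real tokens can collectively receive almost no attention.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_with_sink(q, K, sink):
    """Attention weights with a learned per-head sink scalar (sketch).

    The sink logit joins the softmax but contributes no value vector;
    its share of probability is simply dropped, so the returned token
    weights may sum to less than 1.
    """
    scores = K @ q / np.sqrt(q.shape[-1])   # (T,) scaled dot-product logits
    logits = np.append(scores, sink)        # (T+1,) append the sink logit
    probs = softmax(logits)                 # softmax over [a_1..a_T, sink]
    return probs[:-1]                       # discard the sink's mass

# Toy example: all attention scores are zero, so with a large sink
# logit nearly all probability mass flows into the sink.
q = np.zeros(4)
K = np.zeros((3, 4))
w = attention_with_sink(q, K, sink=10.0)
# w sums to far less than 1: the head effectively attends to nothing.
```

In a full model the sink scalar would be a learned parameter per attention head; here it is just passed in as a number to show the mechanics.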

Guangxuan Xiao (@guangxuan_xiao)'s Twitter Profile Photo

I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models.

For those interested in the details:
hanlab.mit.edu/blog/streaming…
Vigneshwarar (@vignesh_warar)'s Twitter Profile Photo

AI explanations + visualizations, tightly integrated. Understand complex concepts intuitively, in minutes. Can't wait to ship.