Anish (@anishhacko)'s Twitter Profile
Anish

@anishhacko

AI and Neuro

ID: 359346904

Link: https://anish.bearblog.dev/ · Joined: 21-08-2011 12:35:05

287 Tweets

53 Followers

453 Following

Anish (@anishhacko)'s Twitter Profile Photo

What does it mean to create a dataset for ChatGPT, according to you? My view is “you understand every persona and how trickily they can prompt a question” and how flawlessly the model can respond to it… 😇

Sachin (@sachdh)'s Twitter Profile Photo

Excited to share Aryabhatta 1.0, our leading model that scores 90.2% on JEE Mains, outperforming frontier models like o4 mini and Gemini Flash 2.5. Trained by us at AthenaAgent, in collaboration with Physics Wallah (PW), using custom RLVR training on 130K+ curated JEE problems.

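(For context: RLVR here means reinforcement learning with verifiable rewards, where a deterministic checker scores each completion against a curated answer key instead of a learned reward model. The sketch below only illustrates that idea under an assumed answer format; the function names and reward design are hypothetical and not taken from AthenaAgent's actual pipeline.)

```python
# Minimal sketch of a "verifiable reward" as used in RLVR-style training.
# The reward is computed by a deterministic checker against a curated
# answer key, not by a learned reward model. Answer format is assumed.

import re

def extract_final_answer(completion: str) -> str | None:
    """Pull the last number-like token out of a model completion (hypothetical format)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None

def verifiable_reward(completion: str, ground_truth: str, tol: float = 1e-6) -> float:
    """Return 1.0 if the model's final answer matches the answer key, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        return 1.0 if abs(float(answer) - float(ground_truth)) <= tol else 0.0
    except ValueError:
        return 1.0 if answer.strip() == ground_truth.strip() else 0.0

# Usage: plug this reward into any policy-gradient loop (e.g. PPO/GRPO)
# over the curated problem set; the verifier replaces preference labels.
print(verifiable_reward("The acceleration is 9.8 m/s^2, so the answer is 9.8", "9.8"))  # 1.0
```
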
Bytez (@bytez)'s Twitter Profile Photo

GPU costs holding back your killer idea? Apply for free inference for your startup (credits worth $100K).
- Credits apply to open/closed source models (Anthropic, etc.)
- Use thousands of models with a single API
Don’t just dream it. Build it. 👇

Omar Sanseviero (@osanseviero)'s Twitter Profile Photo

Some fun things people may have missed from Gemma 3 270M:
1. Out of 270M params, 170M are embedding params and 100M are transformer blocks. BERT from 2018 was larger 🤯
2. The vocabulary is quite large (262,144 tokens). This makes Gemma 3 270M a very good model to be hyper…
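(A rough sanity check of that parameter split, assuming the publicly reported hidden size of 640 for Gemma 3 270M; the width is an assumption, not stated in the tweet.)

```python
# Back-of-the-envelope check of the parameter split quoted above.
# Assumption (not from the tweet): hidden size of 640 with the
# 262,144-token Gemma tokenizer.
vocab_size = 262_144
hidden_size = 640           # assumed model width
total_params = 270_000_000  # nominal size from the model name

embedding_params = vocab_size * hidden_size
transformer_params = total_params - embedding_params

print(f"embedding params:   {embedding_params/1e6:.0f}M")    # ~168M, i.e. the "170M" quoted
print(f"transformer params: {transformer_params/1e6:.0f}M")  # ~102M, i.e. the "100M" quoted
```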

Avi Chawla (@_avichawla)'s Twitter Profile Photo

The growth of LLM context length with time:
- GPT-3.5-turbo → 4k tokens
- OpenAI GPT-4 → 8k tokens
- Claude 2 → 100k tokens
- Llama 3 → 128k tokens
- Gemini → 1M tokens
Let's understand how they extend the context length of LLMs:
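(One common way to extend context is RoPE position interpolation: positions of a longer sequence are rescaled into the position range the model was trained on. Below is a minimal sketch assuming a standard rotary-embedding setup; it is not the specific method used by any of the models listed above.)

```python
import numpy as np

def rope_angles(positions: np.ndarray, dim: int, base: float = 10000.0,
                scale: float = 1.0) -> np.ndarray:
    """Rotary-embedding angles; scale < 1 is position interpolation,
    squeezing longer sequences into the originally trained position range."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions * scale, inv_freq)   # shape (seq_len, dim/2)

trained_ctx, target_ctx, head_dim = 4096, 16384, 64

# Naive extrapolation: positions beyond 4096 were never seen during training.
plain = rope_angles(np.arange(target_ctx), head_dim)

# Position interpolation: rescale so position 16383 lands where ~4095 used to.
interpolated = rope_angles(np.arange(target_ctx), head_dim,
                           scale=trained_ctx / target_ctx)

print(plain[-1, 0], interpolated[-1, 0])  # interpolated angle stays in the trained range
```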

Anish (@anishhacko)'s Twitter Profile Photo

diabrowser.com/skills/graham — Evaluate a person online like investor Graham Duncan. Just use the prompt and see :)

Anish (@anishhacko)'s Twitter Profile Photo

What we are certainly missing in the AI-first space is “precision engineering”. Solving for it puts you upfront :)

Anish (@anishhacko)'s Twitter Profile Photo

The laziness to aggregate high-quality data pushes up the demand for LLMs (API calls). Why?
- We can train an SLM on high-quality data and serve it at scale in the long run.
- Data under our control, model weights under our control.
But: we are lazy 🥸