Yifan (@yifanbth) 's Twitter Profile
Yifan

@yifanbth

tech founder cooking in AI -
dev deep dives: youtube.com/@YifanBTH -
ex: founding team @improbableio -
@cambridgeuni compsci

ID: 383951897

linkhttp://bento.me/yifan calendar_today02-10-2011 20:11:10

372 Tweet

1,1K Followers

340 Following

Yifan (@yifanbth) 's Twitter Profile Photo

hot take: $1000+ pro max plus LLM subscriptions are coming this year reasoning models like o3 already cost $17k per task on some benchmarks if you're running complex agents daily, even $1000/month starts looking reasonable vs per-token pricing providers will test this for sure

Yifan (@yifanbth) 's Twitter Profile Photo

our standards for sharing LLM progress is dropping rapidly 2022: peer-reviewed papers with months of review 2024: detailed company blog posts 2025: afternoon tweets from the CEO we went from rigorous science to "trust me bro" in just 3 years

Yifan (@yifanbth) 's Twitter Profile Photo

productivity hack: set youtube to 2x speed counterintuitive but you'll be MORE focused, not less. the faster pace forces your brain to stay engaged instead of drifting off works best for educational content, tutorials, even podcasts. saves time + improves attention

Yifan (@yifanbth) 's Twitter Profile Photo

found a weird LLM quirk: if you have only one function defined with tool_choice="auto", both sonnet-4 and gemini-2.5-pro will spam call it regardless of need tried everything - explicit "don't use unless needed" instructions, better function descriptions, prompt optimisation.

Yifan (@yifanbth) 's Twitter Profile Photo

been doing some deep dives into current browser offerings, just realised that Zen browser was mostly done by a single person, extremely impressive given the polish it's basically what Arc should have been - vertical tabs, open source

been doing some deep dives into current browser offerings, just realised that Zen browser was mostly done by a single person, extremely impressive given the polish

it's basically what Arc should have been - vertical tabs, open source
Yifan (@yifanbth) 's Twitter Profile Photo

been using o3 for most of my daily search tasks and it's become that perfect sweet spot between 4o's quick facts and deep research's verbosity interestingly, OpenAI o3 with search is quite a lot better than Perplexity with o3. it frequently goes 5-10 rounds of

Yifan (@yifanbth) 's Twitter Profile Photo

spent a day optimising my production support bot to add function call capabilities. tried o4-mini-high, sonnet-4, and gemini-2.5-pro with identical system prompts and tool definitions. completely unexpected results: • o4-mini-high: accurate tool calls, ok-ish responses •

Yifan (@yifanbth) 's Twitter Profile Photo

grok's twitter search is that missing piece i didn't know i needed instead of manually scrolling through #AI twitter to gauge sentiment on new releases, just ask grok "how are people reacting to claude code's weekly limits" saves me 20+ mins of timeline archaeology daily

Yifan (@yifanbth) 's Twitter Profile Photo

spent the whole weekend reverse engineering claude code, there's so much good stuff hidden in plain sight. Anthropic's effort in prompt engineering is insane big video coming up!

Yifan (@yifanbth) 's Twitter Profile Photo

the price for gpt-oss-120b is extremely competitive if the real-world performance lives up to the benchmarks input: ~$0.20/mtok output: ~$0.65/mtok

the price for gpt-oss-120b is extremely competitive if the real-world performance lives up to the benchmarks

input: ~$0.20/mtok 
output: ~$0.65/mtok
Yifan (@yifanbth) 's Twitter Profile Photo

kubernetes-related debugging via CLI coding agents is bliss caveat: you gotta review every command carefully to avoid accidental deletions

Yifan (@yifanbth) 's Twitter Profile Photo

honestly, the years with ChatGPT turned me into that annoying person who asks "but how does that actually work?" about everything custom instructions with constructive feedback on every chat made me more reflective than years of attempted journaling and now there's zero excuse

Yifan (@yifanbth) 's Twitter Profile Photo

being able to reference past chats is one of the best features that ChatGPT and Claude have ever added. despite coming out later, Claude's UX is superior, with more precise indications of which chats were referenced I suspect this will cause a slight spike in user retention,