Jr Kibs (@jrkibs) 's Twitter Profile
Jr Kibs

@jrkibs

“History began when humans invented gods, and will end when humans become gods” Yuval Noah Harari

ID: 1282014745

20-03-2013 01:37:58

11,11K Tweets

438 Followers

1,1K Following

elvis (@omarsar0) 's Twitter Profile Photo

Evaluating LLM-based Agents This report has a comprehensive list of methods for evaluating AI Agents. Don't ignore evals. If done right, they are a game-changer. Highly recommend it to AI devs. (bookmark it)

Chubby♨️ (@kimmonismus) 's Twitter Profile Photo

Today I read that AI agents are already being introduced into law firms around the world, initially to perform repetitive tasks and later to take on real legal work. In the meantime, there are already numerous use cases on Reddit, such as those in which AI helps to conduct

Machine Learning Street Talk (@mlstreettalk) 's Twitter Profile Photo

Very interesting work by Sakana AI. They have designed a MoE / novel test-time inference framework, inspired by MCTS, that finds the best "switching path" of frontier models: at depth 0 it generates a code solution, and from depth >0 it iteratively edits the existing solution

Ai2 (@allen_ai) 's Twitter Profile Photo

Today we released SciArena, an open evaluation platform where researchers can compare and vote on foundation models for scientific literature tasks. 👇

Jr Kibs (@jrkibs) 's Twitter Profile Photo

Here’s a show that gives us clues on how we might deal with Alien Intelligence. AI Safety folks, this one’s for you. I invite you to watch this show carefully. I invite you to be like Mitsuki: think like them in order to hack them. Anthropic Amanda Askell Jan Leike

Maitreyee Wairagkar (@maitreyee_w) 's Twitter Profile Photo

Check out the new Nature Research Briefings article “Brain implant decodes neural activity to produce expressive speech” which summarizes our recent voice-synthesis neuroprosthesis paper. It also gives a sneak peek into the story behind this paper. doi.org/10.1038/d41586…

Tom Yeh (@proftomyeh) 's Twitter Profile Photo

Context Engineering by hand ✍️ This exercise shows you how it goes far beyond prompt engineering. Do you think this new AI buzzword will stick around?

Jr Kibs (@jrkibs) 's Twitter Profile Photo

"You might have heard rumors of companies looking to acquire us. We are flattered by their attention but are focused on seeing our work through." Zuck, this one's for you.

Jr Kibs (@jrkibs) 's Twitter Profile Photo

Freelancers using Codex can easily handle 3 to 4 clients. It’s just insane. Especially when you pair it with Claude (for design), oh my Gosh. You can build apps that look like they came straight out of a sci-fi movie. I literally feel like we’re in the middle of a takeoff.

Jr Kibs (@jrkibs) 's Twitter Profile Photo

It’s Karpathy who sets the direction for the whole community. In early 2024, he popularized “prompt engineering.” A year later, he made “vibe coding” mainstream. And now? Everyone’s talking about “context engineering” ever since he tweeted about it a few days ago.

Jr Kibs (@jrkibs) 's Twitter Profile Photo

Here, it’s not the answers that matter; it’s the questions being asked. Each one invites us to dig deep, to respond based on what we believe to be true. They push us to question the very nature of our reality.

Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

this story is going wildly viral on Reddit. ChatGPT flagged a hidden gene defect that doctors missed for a decade. ChatGPT ingested the patient’s MRI, CT, broad lab panels and years of unexplained symptoms. It noticed that normal serum B12 clashed with nerve pain and fatigue,
