Martin Bel (@__mbel__)'s Twitter Profile
Martin Bel

@__mbel__

...

ID: 1589662075873271809

Link: https://www.youtube.com/@martinbel
Joined: 07-11-2022 16:52:44

955 Tweets

592 Followers

125 Following

Martin Bel (@__mbel__)

The alpaca prompt for data generation. Probably their most original idea. If you run it on a GPT, you will understand how they generated the data for finetuning. github.com/tatsu-lab/stan…

prof-g (@robertghrist)

spent a few hours with Claude 3.5 sonnet doing some mathematics research. you are underestimating the impact AI will have on research. yes, you. yes, I'm serious. no, it does not replace mathematicians. but the augmentation is about to take off.

Ivan Werning (@ivanwerning)

I tried Google's NotebookLM "radio conversation generator" on my paper about taxing robots (the irony!). I was blown away by the results. Are robots and AI coming for your jobs journalists? 😱

François Chollet (@fchollet)

A common misconception about Transformers is to believe that they're a sequence-processing architecture. They're not. They're a *set-processing* architecture. Transformers are 100% order-agnostic (which was the big innovation compared to RNNs, back in late 2016 -- you compute

naklecha (@naklecha)

today, i'm excited to release a reinforcement learning guide that carefully explains the intuition and implementation details behind every single fundamental algorithm in the field. enjoy :) naklecha.com/reinforcement-…

Germán Milano (@german_milano)

1/8 Everyone's doing great work building agents for bug fixing. But I'm curious—does anyone have insights on which types of issues your agents are good at, and which ones pose the biggest challenges? 🧵 Cc: Martin Bel José Lamas Rodolfo Anibal Gaston Milano Millan Marcelo Pérez

Paul Gauthier (@paulgauthier)

DeepSeek R1 gets 57% on the aider polyglot benchmark, ranks 2nd behind o1:

62% o1 (high)
57% DeepSeek R1
52% Sonnet
48% DeepSeek Chat V3

Full leaderboard: aider.chat/docs/leaderboa…

Sebastian Raschka (@rasbt)

But before I get to the reasoning model space... if you are looking to do some focused offline reading this weekend, I just re-compiled my take on the "noteworthy AI research papers of 2024" into one PDF-export-friendly 47-page mega-post with TOC and all: sebastianraschka.com/blog/2025/llm-…

José Lamas (@jlamasrios)

Congrats to the Data Science team behind Code Fixer at Globant!! After ranking #1 on SWE-Bench Lite in Nov ’24, their major upgrades in multimodal preprocessing, prompt design, and large codebase navigation (especially for complex frontend/backend stacks) made it #1 on SWE-Bench

Jeremy Howard (@jeremyphoward)

sam mcallister Personally I encourage my team to use other folks' tools too so we have a realistic view of where we stand in the market. I've noticed that most folks at most big labs seem quite unfamiliar with the competition, OTOH. You gotta be an expert user to understand capabilities.