Joe Hoover (@joeehoover)'s Twitter Profile
Joe Hoover

@joeehoover

Automating AI eval @Apple

ID: 462542884

Joined: 13-01-2012 02:14:13

221 Tweets

563 Followers

1.1K Following

Horace He (@chhillee)'s Twitter Profile Photo

Happy to OSS gpt-fast, a fast and hackable implementation of transformer inference in <1000 lines of native PyTorch with support for quantization, speculative decoding, TP, Nvidia/AMD support, and more! Code: github.com/pytorch-labs/g… Blog: pytorch.org/blog/accelerat… (1/12)
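
One of gpt-fast's tricks is weight-only int8 quantization, which halves the memory traffic that dominates decoding. A minimal sketch of the idea in native PyTorch, assuming per-output-channel scales; the class and names are illustrative, not gpt-fast's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightOnlyInt8Linear(nn.Module):
    """Illustrative weight-only int8 linear, in the spirit of gpt-fast's
    int8 path (not the repo's actual code)."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        w = linear.weight.detach()                          # [out, in]
        scale = w.abs().amax(dim=1, keepdim=True) / 127.0   # per-channel scale
        self.register_buffer("weight_int8", torch.round(w / scale).to(torch.int8))
        self.register_buffer("scale", scale)
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dequantize on the fly: decoding is memory-bound, so reading int8
        # weights is a win even with the extra multiply.
        w = self.weight_int8.to(x.dtype) * self.scale.to(x.dtype)
        return F.linear(x, w, self.bias)

layer = nn.Linear(4096, 4096)
qlayer = WeightOnlyInt8Linear(layer)
x = torch.randn(1, 4096)
print((layer(x) - qlayer(x)).abs().max())  # small quantization error
```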

Replicate (@replicate)'s Twitter Profile Photo

Businesses are building on open-source AI. But we’ve only reached a tiny fraction of them. That's why we raised a $40M Series B. Open-source is open for business 😎 replicate.com/blog/series-b

Hongyang Zhang (@hongyangzh)'s Twitter Profile Photo

Introducing EAGLE, a new method for fast LLM decoding based on compression:
- 3x faster 🚀 than vanilla decoding
- 2x faster 🚀 than Lookahead (on its benchmark)
- 1.6x faster 🚀 than Medusa (on its benchmark)
- provably maintains the text distribution
- trainable (in 1~2 days) and testable on RTX 3090s
Playground:
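
The "provably maintains the text distribution" property comes from the standard speculative-sampling accept/reject step that EAGLE shares with other draft-and-verify methods; EAGLE's novelty is in how the draft is produced. A sketch of that verification rule (my own naming, not EAGLE's code):

```python
import torch

def speculative_accept(p_target: torch.Tensor, q_draft: torch.Tensor, token: int) -> int:
    """Accept/reject rule that preserves the target model's distribution.
    p_target, q_draft: probability vectors over the vocab; `token` was
    sampled from q_draft by the draft model."""
    accept_prob = min(1.0, (p_target[token] / q_draft[token]).item())
    if torch.rand(()).item() < accept_prob:
        return token                                # keep the drafted token
    residual = torch.clamp(p_target - q_draft, min=0.0)
    residual = residual / residual.sum()            # resample from (p - q)+
    return int(torch.multinomial(residual, 1))
```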

Nate Raw (@_nateraw)'s Twitter Profile Photo

Feel the AGI!! 💪 Try out the new Mixtral model from Mistral AI, an 8x7B Mixture of Experts, now on Replicate! Big shout out to Dmytro Dzhulgakov for their minimal implementation that I used to get this shipped 🚀 The implementation is very slow for now, but it works 😅 replicate.com/nateraw/mixtra…
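
For context on "8x7B Mixture of Experts": each token is routed to the top-2 of 8 expert FFNs, so only a fraction of the weights run per token. A toy sketch of that routing (illustrative shapes, not the shipped implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Toy Mixtral-style layer: route each token to 2 of 8 expert FFNs."""

    def __init__(self, dim: int = 32, hidden: int = 64, n_experts: int = 8):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [tokens, dim]
        weights, idx = self.gate(x).topk(2, dim=-1)       # 2 experts per token
        weights = F.softmax(weights, dim=-1)              # renormalize the pair
        out = torch.zeros_like(x)
        for slot in range(2):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

print(Top2MoE()(torch.randn(5, 32)).shape)  # torch.Size([5, 32])
```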

Hamel Husain (@hamelhusain)'s Twitter Profile Photo

If you are using axolotl, you may have gotten confused by subtle differences in tokenization and seen artifacts like stray spaces. This can affect you at inference time. In this post, I explain exactly why it is happening and what you should do about it (link in second message 👇).

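A quick way to see the mismatch the post describes: encode a prompt and completion separately versus concatenated. Many BPE tokenizers fold leading spaces into tokens, so the two disagree. The checkpoint below is just an example:

```python
from transformers import AutoTokenizer

# Any Llama-style tokenizer shows the effect; this checkpoint is an example.
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

prompt, completion = "The answer is", " 42."
joint = tok.encode(prompt + completion, add_special_tokens=False)
pieces = (tok.encode(prompt, add_special_tokens=False)
          + tok.encode(completion, add_special_tokens=False))

print(joint == pieces)                    # often False: stray-space artifact
print(tok.convert_ids_to_tokens(joint))
print(tok.convert_ids_to_tokens(pieces))
```
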
Charlie Holtz (@charliebholtz)'s Twitter Profile Photo

I made a site to chat with Mistral AI's new 8x7B instruct model!
- free + open source
- streams at ~40 tokens/s
- beats GPT-3.5 on most benchmarks
Try it at mixtral.replicate.dev
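
If you'd rather stream tokens yourself than use the site, something like this should work with Replicate's Python client (pip install replicate, with REPLICATE_API_TOKEN set); the model slug below is my guess, not taken from the tweet:

```python
import replicate

# Stream tokens from a hosted Mixtral endpoint; the slug is illustrative.
for event in replicate.stream(
    "mistralai/mixtral-8x7b-instruct-v0.1",
    input={"prompt": "Explain speculative decoding in one paragraph."},
):
    print(str(event), end="", flush=True)
```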

AI at Meta (@aiatmeta)'s Twitter Profile Photo

Today we’re releasing Code Llama 70B: a new, more performant version of our LLM for code generation — available under the same license as previous Code Llama models.

Download the models ➡️ bit.ly/3Oil6bQ
• CodeLlama-70B
• CodeLlama-70B-Python
• CodeLlama-70B-Instruct

Replicate (@replicate)'s Twitter Profile Photo

Code Llama 70B is live on Replicate! It's the most powerful code generation model from AI at Meta, with instruct, Python, and base variants. Code Llama 70B Instruct is fine-tuned for understanding natural language instructions: replicate.com/meta/codellama…
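
Calling the instruct variant from Python might look like the following; the exact model slug is truncated in the tweet, so the one below is a guess:

```python
import replicate

# Hypothetical slug; check replicate.com for the real model page.
output = replicate.run(
    "meta/codellama-70b-instruct",
    input={"prompt": "Write a Python function that reverses a linked list."},
)
print("".join(output))
```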

Joe Hoover (@joeehoover)'s Twitter Profile Photo

We're building next-gen AI evaluation systems at Apple. Want to help? Looking for a contract ML Scientist/Engineer in NYC (or SEA/Bay Area). Need someone who wants to ship fast. Deep LLM experience required. Must be available now. DM resume + fit explanation. #MLjobs #Apple

Joe Hoover (@joeehoover)'s Twitter Profile Photo

We're building responsible AI evaluation systems at Apple. Want to help? Seeking a Senior Research Data Scientist (contract) in SEA/NYC/SD/Bay Area with data science, RAI, and human research expertise. Python + LLM experience required. Must be available now. DM resume + fit.

Joe Hoover (@joeehoover)'s Twitter Profile Photo

SOTA LLM agents trained on just 72 tasks, but not with GRPO 👀
- No value network
- No reward norms
- Beats o1
- Great discussion of why this worked in a small-data regime
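
For readers unfamiliar with the setup: with no value network and no reward normalization, the update reduces to a plain REINFORCE-style policy gradient on raw returns. A sketch of that objective (my formulation, not the paper's exact recipe):

```python
import torch

def reinforce_loss(logprobs: torch.Tensor, returns: torch.Tensor) -> torch.Tensor:
    """Policy-gradient objective with no learned baseline and no reward
    normalization (illustrative, not the paper's exact recipe).

    logprobs: [batch] summed token log-probs of each sampled rollout
    returns:  [batch] raw, unnormalized episode returns
    """
    return -(logprobs * returns.detach()).mean()

# Usage: sample rollouts from the policy, sum their token log-probs,
# score them, then backprop through reinforce_loss(logprobs, returns).
```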

Zichen Liu @ ICLR2025 (@zzlccc)'s Twitter Profile Photo

🚨There May Not be Aha Moment in R1-Zero-like Training: oatllm.notion.site/oat-zero A common belief about the recent R1-Zero-like training is that self-reflections *emerge* as a result of RL training. We carefully investigated and showed the opposite. 🧵
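
One cheap way to probe the claim yourself: look for reflective phrases in *base-model* samples, before any RL. A toy version of that probe (the cue list is illustrative, not the authors' exact set):

```python
# Count "aha"-style phrases in pre-RL samples; the cues are illustrative.
REFLECTION_CUES = ["wait,", "let me re-check", "on second thought", "i made a mistake"]

def has_self_reflection(sample: str) -> bool:
    text = sample.lower()
    return any(cue in text for cue in REFLECTION_CUES)

samples = [
    "The answer is 12. Wait, let me re-check the arithmetic...",
    "Compute 3 * 4 = 12, so the answer is 12.",
]
rate = sum(map(has_self_reflection, samples)) / len(samples)
print(f"self-reflection rate: {rate:.0%}")  # 50% on this toy pair
```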

Joe Hoover (@joeehoover)'s Twitter Profile Photo

"on the majority of the *original* benchmarks, over 50% of 'model errors' are actually caused by label noise!" 🥺 Doing the lord's work 🙏🙏