Rohit Patel (@_rohit_patel_) Twitter Tweets • TwiCopy

Rohit Patel

@_rohit_patel_

2 years ago

A look at the early impact of Llama 3: ai.meta.com/blog/meta-llam…

Likes: 0 · Replies: 0 · Reposts: 0

Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context

Likes: 5.5K · Replies: 271 · Reposts: 1.1K

Rohit Patel

@_rohit_patel_

a year ago

We're open source releasing the latest Llama models today. The largest of our models pushes new boundaries in many areas. Can't wait to see how the community will use these: github.com/meta-llama/lla…

Likes: 3 · Replies: 0 · Reposts: 0

Rohit Patel

@_rohit_patel_

a year ago

With the release of Meta Llama 3.1 we are putting out the evaluation data for anyone looking to replicate our evals: huggingface.co/meta-llama

Likes: 4 · Replies: 1 · Reposts: 1

Ahmad Al-Dahle

@ahmad_al_dahle

a year ago

The team at SambaNova just announced their new API offering: SambaNova Cloud. They’re achieving the fastest inference we’ve seen yet for Llama 70B (570 tokens/s) and 405B (132 tokens/s). Available for free via API with no waitlist today. 👏

Likes: 37 · Replies: 2 · Reposts: 6

lmarena.ai (formerly lmsys.org)

@lmarena_ai

a year ago

Due to strong community interest, we've collaborated with AI at Meta to compare the bf16 and fp8 versions of Llama-3.1-405b in Chatbot Arena!

With over 5K community votes, both versions show similar performance across the board:
- Overall: 1266 vs 1266
- Hard prompts: 1267 vs

Likes: 574 · Replies: 26 · Reposts: 72

Rohit Patel

@_rohit_patel_

a year ago

Today we're open source releasing the latest versions of our Llama models, Llama 3.2. We have 1B/3B models for text and 11B/90B multimodal models: ai.meta.com/blog/llama-3-2…

Likes: 1 · Replies: 1 · Reposts: 0

Rohit Patel

@_rohit_patel_

a year ago

A fully self-contained derivation of LLMs from middle school math: medium.com/@rohit-patel/u…

Likes: 14 · Replies: 0 · Reposts: 1

Towards Data Science

@tdatascience

a year ago

Curious about Large Language Models but don't know where to start? Rohit Patel's latest article breaks it all down from the basics, requiring only your ability to add and multiply. #LLM #ML towardsdatascience.com/understanding-…

Likes: 28 · Replies: 2 · Reposts: 6

Rohit Patel

@_rohit_patel_

a year ago

We’re releasing 1B & 3B quantized Llama models with the same quality as the original, while achieving a 2-4x speedup. We used two techniques: Quantization-Aware Training with LoRA adaptors, and SpinQuant: ai.meta.com/blog/meta-llam…

Likes: 0 · Replies: 0 · Reposts: 0

Rohit Patel

@_rohit_patel_

a year ago

We are releasing Llama 3.3 today: an updated Llama 70B open source instruct model that is comparable in performance to the 405B model. Happy holidays!!! 🥳 #llm #ai #llama On Meta: llama.com/llama-download… On Hugging Face: huggingface.co/meta-llama/Lla…

Likes: 0 · Replies: 0 · Reposts: 0

Towards Data Science

@tdatascience

a year ago

From our #TDSBestOf2024 collection: Rohit Patel with a beginner-friendly primer on LLMs and how they work under the hood. towardsdatascience.com/understanding-…

Likes: 10 · Replies: 0 · Reposts: 2

Rohit Patel

@_rohit_patel_

8 months ago

Our CRAG-MM Challenge (KDD Cup 2025) invites you to develop innovative multi-modal, multi-turn question-answering systems with a focus on RAG, using agentic tools to retrieve information. The goal is to improve visual reasoning: aicrowd.com/challenges/met…

Likes: 7 · Replies: 0 · Reposts: 2

Rohit Patel

@_rohit_patel_

5 months ago

Negative log-likelihood, cross entropy, and KL divergence. Related, simple and extremely useful concepts worth fully internalizing: medium.com/data-science-c… #ml #ai #statistics

Likes: 1 · Replies: 0 · Reposts: 0

Rohit Patel

@_rohit_patel_

4 months ago

Understanding reinforcement learning for model training from scratch. This took me a lot longer to write than anticipated, partly because the RLMT literature is not an easy read: rohit-patel.medium.com/understanding-…

Likes: 3 · Replies: 1 · Reposts: 0