Rohan Paul (@rohanpaul_ai)'s Twitter Profile
Rohan Paul

@rohanpaul_ai

ML Engineer (e/acc)

📌 https://t.co/x0IIWfnOt8

🚀 https://t.co/QEO4CKRl1b

Open LLMs is Happiness 💡

Ex Deutsche & HSBC.

DM for collaboration.

ID: 2588345408

Link: https://rohanpaul.gumroad.com/l/python-core-with-under-the-hood-explanations
Joined: 25-06-2014 22:38:54

10.4K Tweets

13.0K Followers

1.0K Following

Rohan Paul (@rohanpaul_ai):

'NExT: Teaching Large Language Models to Reason about Code Execution'

The key problem this paper aims to solve: LLMs of code are typically trained on the surface textual form of programs, and so may lack a semantic understanding of how programs actually execute at run-time. NExT addresses this by teaching models to inspect execution traces and reason about run-time behavior through chain-of-thought rationales.
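To make the idea concrete, here is a minimal sketch of collecting the kind of per-line variable-state trace such an approach can expose to the model, using Python's sys.settrace. This is illustrative only (trace_execution and buggy_mean are made-up names), not the paper's actual pipeline:

```python
import sys

def trace_execution(func, *args):
    """Run func(*args) and record (line number, local variables) at each step."""
    trace = []

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            trace.append((frame.f_lineno, dict(frame.f_locals)))  # snapshot state
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, trace

def buggy_mean(xs):
    total = 0
    for x in xs:
        total += x
    return total / (len(xs) - 1)  # off-by-one bug the trace makes visible

result, trace = trace_execution(buggy_mean, [2, 4, 6])
for lineno, local_vars in trace:
    print(f"line {lineno}: {local_vars}")
# Serialized (line, locals) pairs like these can be placed in the prompt so the
# model reasons over run-time state rather than source text alone.
```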

Rohan Paul (@rohanpaul_ai):

The perplexity of quantized Llama 3 degrades quite a bit more than that of quantized Llama 2.

But looking at the perplexity numbers below, Llama 3 also has a higher initial perplexity than Llama-2 when not quantized.

Possible explanation 👇

The degree of specialization of each model on the WikiText corpus.
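For reference, a rough sketch of how WikiText perplexity numbers like these are typically computed with Hugging Face transformers. The model name, window size, and non-overlapping windows are simplifying assumptions, not the exact setup behind the quoted figures:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed; swap in the model under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Concatenate the WikiText-2 test split into one long string.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Non-overlapping 2048-token windows: a rough approximation (a sliding
# window with stride gives slightly more accurate numbers).
window, nlls = 2048, []
for start in range(0, input_ids.size(1) - 1, window):
    chunk = input_ids[:, start : start + window].to(model.device)
    with torch.no_grad():
        # With labels == inputs, the returned loss is the mean token NLL.
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss.float().cpu() * chunk.size(1))

ppl = torch.exp(torch.stack(nlls).sum() / input_ids.size(1))
print(f"perplexity: {ppl.item():.2f}")
```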

Rohan Paul (@rohanpaul_ai):

Google's new Med-Gemini surpasses the GPT-4 model family on every benchmark where a direct comparison could be made.

Achieves SoTA performance of 91.1% accuracy on the MedQA (USMLE) benchmark, using a novel uncertainty-guided search strategy.

📌 A very significant advancement in medical AI.
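A hedged sketch of the idea behind uncertainty-guided search (not Med-Gemini's actual implementation): sample several reasoning chains, and only fall back to retrieval when the sampled answers disagree too much. generate_answers and web_search are hypothetical stand-ins:

```python
import math
from collections import Counter

def vote_entropy(answers):
    """Shannon entropy (nats) of the empirical distribution over sampled answers."""
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def answer_with_uncertainty_guided_search(question, generate_answers, web_search,
                                          k=11, threshold=0.5):
    # Sample k reasoning chains; agreement means the model is confident.
    answers = generate_answers(question, n=k)
    if vote_entropy(answers) <= threshold:
        return Counter(answers).most_common(1)[0][0]  # confident: majority vote
    # High disagreement: retrieve external evidence, then answer again with it.
    evidence = web_search(question)
    answers = generate_answers(question, n=k, context=evidence)
    return Counter(answers).most_common(1)[0][0]
```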

Rohan Paul (@rohanpaul_ai):

Meta is really taking a leadership position in OSS contributions.

Beyond Llama-3, we have all of the below from them

(and this is not an exhaustive list)

- React
- PyTorch
- React Native
- GraphQL
- Jest
- Flow
- Yarn
- Hermes
- FBT
- Prophet
- Cassandra
- Mercurial (which is …

Rohan Paul (@rohanpaul_ai):

Comparison of TensorRT-LLM vs llama.cpp on consumer hardware, by Jan (@janframework)

(blog link in 1st comment)
---

Findings: 'TensorRT-LLM was:

- 30-70% faster than llama.cpp on the same hardware

- Consumed less memory on consecutive runs and marginally more GPU VRAM'
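A minimal sketch of the kind of tokens/sec measurement behind such comparisons, using the llama-cpp-python bindings. The model path and parameters are assumptions; Jan's actual benchmark harness and settings may differ:

```python
import time
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # assumed path
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,
    verbose=False,
)

prompt = "Explain KV caching in one paragraph."
start = time.perf_counter()
out = llm(prompt, max_tokens=256)
elapsed = time.perf_counter() - start

n_generated = out["usage"]["completion_tokens"]
print(f"{n_generated} tokens in {elapsed:.2f}s -> {n_generated / elapsed:.1f} tok/s")
```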

Rohan Paul (@rohanpaul_ai):

Really fantastic paper for a new understanding of in-context learning in Transformers.

'Transformers learn in-context'

In-context learning refers to the ability of Transformers to adapt their predictions based on the context provided in the input sequence, without the need for any updates to the model's weights.
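A tiny illustration of the phenomenon; the model and prompt here are assumptions for the sake of a runnable sketch (small models like GPT-2 show the effect far less reliably than the large models studied in this line of work):

```python
from transformers import pipeline

# Frozen model; no weights change between calls.
generator = pipeline("text-generation", model="gpt2")

few_shot_prompt = (
    "Translate English to French.\n"
    "sea -> mer\n"
    "sky -> ciel\n"
    "moon ->"
)

# The input-output mapping is induced purely from the examples in the prompt;
# no gradient step or parameter update happens here.
out = generator(few_shot_prompt, max_new_tokens=4, do_sample=False)
print(out[0]["generated_text"])
```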

Rohan Paul (@rohanpaul_ai):

Recently Microsoft announced 'ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks' 🔥

📌 It investigates the limitations of existing 4-bit quantization methods like GPTQ for large language models (LLMs), which tend to overfit the calibration data.
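A toy round-trip sketch of why bit width matters: simulate symmetric uniform "fake quantization" at several bit widths and compare reconstruction error. Note this uniform integer grid is a generic illustration, not ZeroQuant's FP6 floating-point format:

```python
import numpy as np

def fake_quantize(w, bits):
    """Quantize to a symmetric integer grid and dequantize back."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit, 31 for 6-bit
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=100_000).astype(np.float32)  # stand-in for a weight tensor

for bits in (4, 6, 8):
    err = np.mean((w - fake_quantize(w, bits)) ** 2)
    print(f"{bits}-bit MSE: {err:.2e}")  # error shrinks rapidly with more bits
```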

Yam Peleg (@Yampeleg):

anton For searching a database? Of course it does, context kills RAGs 300%.

Cache the dataset in the KV cache; it is better in every way, shape, or form.

RAG will only be used for explainability, because it is very easy to explain (and 'blame').

(RAG is also very hard to get to work…)
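A hedged sketch of the "cache the dataset in the KV cache" pattern with Hugging Face transformers (recent versions support passing a precomputed cache to generate); the tiny model and placeholder facts are assumptions so the sketch runs anywhere:

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

# The "dataset" is processed once; its KV cache is kept for every later query.
# It ends at a newline so the prefix tokenization stays stable when text is appended.
dataset_text = (
    "Fact: the capital of France is Paris.\n"
    "Fact: water boils at 100 C.\n"
)
with torch.no_grad():
    ctx = tok(dataset_text, return_tensors="pt")
    cache = model(**ctx, use_cache=True).past_key_values

def answer(query, max_new_tokens=10):
    ids = tok(dataset_text + query, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(
            ids,
            past_key_values=copy.deepcopy(cache),  # deepcopy: generate mutates the cache
            max_new_tokens=max_new_tokens,
            do_sample=False,
        )
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

print(answer("Q: What is the capital of France? A:"))
print(answer("Q: At what temperature does water boil? A:"))  # cache reused, no re-prefill
```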

Marques Brownlee (@MKBHD):

NEW VIDEO - Rabbit R1: Barely Reviewable

youtu.be/ddTV12hErTc

This is the pinnacle of a trend that's been annoying for years: Delivering barely finished products to win a 'race' and then continuing to build them after charging full price. Games, phones, cars, now AI in a box
