Aston Zhang (@astonzhangaz) 's Twitter Profile
Aston Zhang

@astonzhangaz

Long context lead of Llama. Lead author of d2l.ai.

ID: 1074411845199421440

Link: http://astonzhang.com · Joined: 16-12-2018 21:12:00

205 Tweets

9.9K Followers

92 Following

Zhuosheng Zhang (@zhangzhuosheng) 's Twitter Profile Photo

🚀 The research in the year 2023 has advanced so rapidly! 🌌Join us on an exciting journey from Chain-of-Thought to Language Agent!

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents
Mike Lewis (@ml_perception) 's Twitter Profile Photo

Excited to share a preview of Llama 3, including the release of 8B and 70B models (82 MMLU; should be the best open-weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT-4). Lots more still to come... ai.meta.com/blog/meta-llam…

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Llama 3 has been my focus since joining the Llama team last summer. Together, we've been tackling challenges across pre-training and human data, pre-training scaling, long context, post-training, and evaluations. It's been a rigorous yet thrilling journey:

🔹Our largest models
Laurens van der Maaten (@lvdmaaten) 's Twitter Profile Photo

Excited to share what I’ve been working on for the past 9 months. So incredibly proud of the entire team that worked tirelessly to make Llama 3 happen! And this is only the beginning… ai.meta.com/blog/meta-llam…

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Thanks Lex! We're thrilled that Llama, as an open model, supports developers. With the upgrades in this Llama 3 release, we're excited to see how video podcasts like this can help developers quickly build amazing things together.

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Riveting read! High-stakes decision-making demands reliable evaluations. Understanding evaluation failure modes is crucial. Post-training evaluations, which are closer to human interactions, pose more challenges than pre-training ones.

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Thanks AI at Meta for having me on the Llama for Developers podcast! Tokenizers play a crucial role in LLMs, impacting data handling, pre-training, post-training, and inference: 🔹With a larger vocabulary, domain-specific words are more likely to be single tokens, preserving
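The vocabulary-size point above can be sketched with a toy greedy longest-match tokenizer. This is an illustration only, not the real Llama 3 tokenizer; the vocabularies and the `tokenize` helper are hypothetical, but they show why a larger vocabulary can keep a domain-specific word as a single token:

```python
def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Fall back to a single character (real BPE falls back to bytes).
            tokens.append(text[i])
            i += 1
    return tokens

small_vocab = {"pre", "train", "ing", "token"}
large_vocab = small_vocab | {"pretraining", "tokenizer"}

print(tokenize("pretraining", small_vocab))  # ['pre', 'train', 'ing']
print(tokenize("pretraining", large_vocab))  # ['pretraining']
```

With the larger vocabulary the word costs one token instead of three, which shortens sequences for data handling, training, and inference alike.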

Ahmad Al-Dahle (@ahmad_al_dahle) 's Twitter Profile Photo

We get a lot of questions about the best ways to start experimenting with Llama — we teamed up with Andrew Ng’s team to build a great resource for this. We’ve made the full Prompt Engineering with Llama course available for free on @Coursera: coursera.org/projects/promp…

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Some context on why our smallest Llama 3 model went from 7B → 8B. More details on the changes to the tokenizer in the full conversation with @astonzhangAZ ➡️ youtu.be/Tmdk_H2WDj4

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Our Llama 3.1 405B is now openly available! After a year of dedicated effort, from project planning to launch reviews, we are thrilled to open-source the Llama 3 herd of models and share our findings through the paper:

🔹Llama 3.1 405B, continuously trained with a 128K context
Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

🚀 New paper from our Llama team at AI at Meta! We discuss "cross capabilities" and "Law of the Weakest Link" of large language models (LLMs):

🔹 Cross capabilities: the intersection of multiple distinct capabilities across different types of expertise necessary to address complex,
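The "Law of the Weakest Link" idea above can be sketched in a few lines: on a task that needs several capabilities at once, an LLM's performance tends to be bounded by its weakest required capability. The scores and capability names below are made up for illustration, not results from the paper:

```python
# Hypothetical per-capability scores for some model (illustrative only).
capability_scores = {
    "coding": 0.82,
    "reasoning": 0.75,
    "tool_use": 0.58,
}

def cross_capability_estimate(required, scores):
    """Estimate cross-capability performance as the weakest required skill."""
    return min(scores[c] for c in required)

# A task needing coding + tool use is bounded by tool use (the weak link).
print(cross_capability_estimate(["coding", "tool_use"], capability_scores))  # 0.58
```

The practical upshot: improving an already-strong capability does little for cross-capability tasks until the weakest component improves.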
Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

🚀 Exciting internship opportunity! Join the Llama team at AI at Meta and help redefine what's possible with large language models, from pre-training to post-training. Be part of our 2025 research internship and help shape the future of LLMs. Feel free to email or DM me 📩 Learn

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Our Llama 4’s industry-leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I’d been working on helped a bit with the long-term infinite-context goal toward AGI. Huge thanks to my incredible teammates!

🚀Llama 4 Scout
🔹17B
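As background for the tweet above: the "RoPE" in iRoPE refers to rotary position embeddings, where each pair of dimensions in a query/key vector is rotated by an angle that grows with token position. Below is a minimal sketch of plain RoPE only; iRoPE's specific interleaving scheme is described in Meta's Llama 4 release materials and is not reproduced here:

```python
import math

def rope(x, position, base=10000.0):
    """Apply rotary position embedding to a vector x (even length)."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        # Lower dimension pairs rotate faster; higher pairs rotate slower.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        # 2-D rotation of the (x[i], x[i+1]) pair.
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out

v = [1.0, 0.0, 0.5, 0.5]
rotated = rope(v, position=3)

# Rotation preserves the vector's norm; only relative angles change.
norm = lambda u: math.sqrt(sum(t * t for t in u))
print(abs(norm(rotated) - norm(v)) < 1e-9)  # True
```

Because the rotation angle depends only on position, the dot product between rotated queries and keys depends on their relative distance, which is what makes RoPE-style schemes a natural starting point for long-context extension.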
Sanyam Bhutani (@bhutanisanyam1) 's Twitter Profile Photo

1.5M tokens to a website in 5 minutes 🙏
- Upload an entire repo of apps
- Upload multiple sketches of a website
- Use the repo content to populate the template

Llama 4 supports 10M context + up to 10 images in a session: