Aston Zhang (@astonzhangaz) 's Twitter Profile
Aston Zhang

@astonzhangaz

Long context lead of Llama. Lead author of d2l.ai.

ID: 1074411845199421440

Link: http://astonzhang.com · Joined: 16-12-2018 21:12:00

205 Tweets

9.9K Followers

92 Following

Zhuosheng Zhang (@zhangzhuosheng) 's Twitter Profile Photo

🚀 The research in the year 2023 has advanced so rapidly! 🌌Join us on an exciting journey from Chain-of-Thought to Language Agent!

Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents
Mike Lewis (@ml_perception) 's Twitter Profile Photo

Excited to share a preview of Llama 3, including the release of 8B and 70B models (82 MMLU; should be the best open-weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT-4). Lots more still to come... ai.meta.com/blog/meta-llam…

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Llama 3 has been my focus since joining the Llama team last summer. Together, we've been tackling challenges across pre-training and human data, pre-training scaling, long context, post-training, and evaluations. It's been a rigorous yet thrilling journey:

🔹Our largest models
Laurens van der Maaten (@lvdmaaten) 's Twitter Profile Photo

Excited to share what I’ve been working on for the past 9 months. So incredibly proud of the entire team that worked tirelessly to make Llama 3 happen! And this is only the beginning… ai.meta.com/blog/meta-llam…

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Thanks Lex! We're thrilled that Llama, as an open model, supports developers. With the upgrades in this Llama 3 release, we're excited to see how video podcasts like this can help developers quickly build amazing things together.

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Riveting read! High-stakes decision-making demands reliable evaluations. Understanding evaluation failure modes is crucial. Post-training evaluations, which are closer to human interactions, pose more challenges than pre-training ones.

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Thanks AI at Meta for having me on the Llama for Developers podcast! Tokenizers play a crucial role in LLMs, impacting data handling, pre-training, post-training, and inference: 🔹With a larger vocabulary, domain-specific words are more likely to be single tokens, preserving
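The vocabulary-size point above can be sketched with a toy greedy longest-match tokenizer. This is an illustration only, not the real Llama 3 tokenizer; the vocabularies and the `tokenize` helper are hypothetical, but they show why a larger vocabulary can keep a domain-specific word as a single token:

```python
def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible match first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Fall back to a single character (real BPE falls back to bytes).
            tokens.append(text[i])
            i += 1
    return tokens

small_vocab = {"pre", "train", "ing", "token"}
large_vocab = small_vocab | {"pretraining", "tokenizer"}

print(tokenize("pretraining", small_vocab))  # ['pre', 'train', 'ing']
print(tokenize("pretraining", large_vocab))  # ['pretraining']
```

With the larger vocabulary the word costs one token instead of three, which shortens sequences for data handling, training, and inference alike.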

Ahmad Al-Dahle (@ahmad_al_dahle) 's Twitter Profile Photo

We get a lot of questions about the best ways to start experimenting with Llama — we teamed up with Andrew Ng’s team to build a great resource for this. We’ve made the full Prompt Engineering with Llama course available for free on @Coursera: coursera.org/projects/promp…

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Some context on why our smallest Llama 3 model went from 7B → 8B. More details on the changes to the tokenizer in the full conversation with @astonzhangAZ ➡️ youtu.be/Tmdk_H2WDj4

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Our Llama 3.1 405B is now openly available! After a year of dedicated effort, from project planning to launch reviews, we are thrilled to open-source the Llama 3 herd of models and share our findings through the paper:

🔹Llama 3.1 405B, continuously trained with a 128K context
Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

🚀 New paper from our Llama team at AI at Meta! We discuss "cross capabilities" and "Law of the Weakest Link" of large language models (LLMs):

🔹 Cross capabilities: the intersection of multiple distinct capabilities across different types of expertise necessary to address complex,
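The "Law of the Weakest Link" idea above can be sketched in a few lines: on a task that needs several capabilities at once, an LLM's performance tends to be bounded by its weakest required capability. The scores and capability names below are made up for illustration, not results from the paper:

```python
# Hypothetical per-capability scores for some model (illustrative only).
capability_scores = {
    "coding": 0.82,
    "reasoning": 0.75,
    "tool_use": 0.58,
}

def cross_capability_estimate(required, scores):
    """Estimate cross-capability performance as the weakest required skill."""
    return min(scores[c] for c in required)

# A task needing coding + tool use is bounded by tool use (the weak link).
print(cross_capability_estimate(["coding", "tool_use"], capability_scores))  # 0.58
```

The practical upshot: improving an already-strong capability does little for cross-capability tasks until the weakest component improves.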
Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

🚀 Exciting internship opportunity! Join the Llama team at AI at Meta and help redefine what's possible with large language models, from pre-training to post-training. Be part of our 2025 research internship and help shape the future of LLMs. Feel free to email or DM me 📩 Learn

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Our Llama 4’s industry-leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I’d been working on helped a bit with the long-term infinite-context goal toward AGI. Huge thanks to my incredible teammates!

🚀Llama 4 Scout
🔹17B
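As background for the tweet above: the "RoPE" in iRoPE refers to rotary position embeddings, where each pair of dimensions in a query/key vector is rotated by an angle that grows with token position. Below is a minimal sketch of plain RoPE only; iRoPE's specific interleaving scheme is described in Meta's Llama 4 release materials and is not reproduced here:

```python
import math

def rope(x, position, base=10000.0):
    """Apply rotary position embedding to a vector x (even length)."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        # Lower dimension pairs rotate faster; higher pairs rotate slower.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        # 2-D rotation of the (x[i], x[i+1]) pair.
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out

v = [1.0, 0.0, 0.5, 0.5]
rotated = rope(v, position=3)

# Rotation preserves the vector's norm; only relative angles change.
norm = lambda u: math.sqrt(sum(t * t for t in u))
print(abs(norm(rotated) - norm(v)) < 1e-9)  # True
```

Because the rotation angle depends only on position, the dot product between rotated queries and keys depends on their relative distance, which is what makes RoPE-style schemes a natural starting point for long-context extension.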
Sanyam Bhutani (@bhutanisanyam1) 's Twitter Profile Photo

1.5M tokens to a website in 5 minutes 🙏
- Upload an entire repo of apps
- Upload multiple sketches of a website
- Use the repo content to populate the template

Llama 4 supports 10M context + up to 10 images in a session: