Hector Liu (@waterluffy) 's Twitter Profile
Hector Liu

@waterluffy

@LLM360, Natural Language Processing, Computational Linguistics

ID: 25823622

https://hunterhector.github.io/ 22-03-2009 14:34:22

76 Tweet

221 Followers

280 Following

Morgan McGuire (Hack @ W&B Sep 21/22) (@morgymcg) 's Twitter Profile Photo

Love how the LLM360 team share their Weights & Biases workspaces publicly in the Metrics section for both Amber and Crystal Coder 😍 44 loss and eval charts logged during training, all publicly browsable wandb.ai/llm360/Amber

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

There's too much happening right now, so here's just a bunch of links GPT-4 + Medprompt -> SOTA MMLU microsoft.com/en-us/research… Mixtral 8x7B @ MLX nice and clean github.com/ml-explore/mlx… Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Cerebras (@cerebrassystems) 's Twitter Profile Photo

In coding tasks, CrystalCoder approaches StarCoder-base in accuracy. In language, CrystalCoder is comparable to Llama and MPT-7B. While previously #LLM builders had to choose between coding or language, CrystalCoder is optimal for both of these tasks. Check it out here:

Hector Liu (@waterluffy) 's Twitter Profile Photo

CrystalChat is currently the best model in LLM360, on both English and Code. With our special multiphase training, CrystalChat demonstrates surprising ability early during the training: see the training trajectory here: wandb.ai/llm360/Crystal…. A paper is coming soon!

Jiacheng Liu (@liujc1998) 's Twitter Profile Photo

It’s year 2024, and n-gram LMs are making a comeback!! We develop infini-gram, an engine that efficiently processes n-gram queries with unbounded n and trillion-token corpora. It takes merely 20 milliseconds to count the frequency of an arbitrarily long n-gram in RedPajama (1.4T

LLM360 (@llm360) 's Twitter Profile Photo

Huge congratulations to the Ai2 team! The OLMo series is a substantial contribution to the OSS LLM community with: - open training datasets - 500 checkpoints - analysis and evaluations Couldn’t be happier to see more projects with similar goals (cc EleutherAI)

Qian Liu (@sivil_taram) 's Twitter Profile Photo

Open access goes beyond the open of model weights; it also encompasses the release of comprehensive training recipes, from ELMo to OLMo🚀 Big shoutout to Ai2 , along with my respect to BigCode EleutherAI LLM360 for their commitment to FULLY transparent AI models.

Aurick Qiao (@aurickq) 's Twitter Profile Photo

Super proud to have played a small part in this large model! Arctic is a 480B parameter MoE with 128 experts, with weights released under Apache-2.0. We are also publishing a series of blog posts that detail and demystify large MoE training, check it out!

Maitrix.org (@maitrixorg) 's Twitter Profile Photo

🔥Introducing Pandora 🌏 🪐 a World Model that generates videos of world states with real-time language control 🎥🕹️ Simulate the world across domains in an _interactive_ way! check out more world-model.ai

Eric Xing (@ericxing) 's Twitter Profile Photo

A World Model that is real-time steerable with language command, and able to real-time reason at concept level in visual space. Time to move past LLM that lives in lingual world and enter the physical and sensory world! Zhiting Hu, Yann LeCun, MBZUAI

Hector Liu (@waterluffy) 's Twitter Profile Photo

Proud to see K2 finally out! Months of hard work of an amazing team and GPUs🖥️. Being transparent and matching Llama2-70B's performance, we believe K2 can push the frontier of AI in the OSS way. Stay tuned for more updates, or explore on our website llm360.ai now.

Hector Liu (@waterluffy) 's Twitter Profile Photo

Open source is well defined in software development, which features collaboration, peer production. The open weights ~= .exe analogy is neat: weights are like code, but compiled ones that require significant effort to reverse-engineer, not the right level for peer production.

Ludwig Schmidt (@lschmidt3) 's Twitter Profile Photo

Very excited about this! DCLM already led to a great training set for language models, and there is (much) more to understand + more room for improvement here.