Linqing Liu (@likicode)'s Twitter Profile
Linqing Liu

@likicode

Applied AI @Databricks | PhD @ucl_nlp | ex-Research Scientist intern @GoogleDeepMind @SFResearch

ID: 4035203532

Joined: 27-10-2015 12:14:47

105 Tweets

877 Followers

444 Following

Greg Yang (@thegregyang)'s Twitter Profile Photo

Finally launched x.ai! The mathematics of deep learning is profound, beautiful, and unreasonably effective. Developing the "theory of everything" for large neural networks will be central to taking AI to the next level. Conversely, this AI will enable everyone

Gin Jiang (@zhiyingj)'s Twitter Profile Photo

For anyone who's interested, here is the code: github.com/bazingagin/npc…. Btw, I'm the author of the paper, and thanks to Luke Gessler for digging my paper out of so many ACL papers lol 😂
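
For context, the linked repo implements the paper's compressor-based classification: score a document by its normalized compression distance (NCD) to labeled examples, then take a k-nearest-neighbor vote. Below is a minimal sketch of that idea using only gzip from the standard library, with toy made-up data; the repo's actual implementation differs in details.

```python
import gzip

def ncd(x: str, y: str) -> float:
    """Normalized compression distance: small when x and y share structure."""
    cx = len(gzip.compress(x.encode()))
    cy = len(gzip.compress(y.encode()))
    cxy = len(gzip.compress((x + " " + y).encode()))
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(text: str, train: list[tuple[str, str]], k: int = 3) -> str:
    """Label `text` by majority vote among its k nearest training examples under NCD."""
    nearest = sorted(train, key=lambda item: ncd(text, item[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Toy usage (made-up examples):
train = [
    ("stocks fell sharply after weak earnings", "finance"),
    ("the market rallied on rate-cut news", "finance"),
    ("the striker scored twice in the final", "sports"),
    ("the home team won the championship match", "sports"),
]
print(classify("shares dropped as investors sold off", train))  # expected: "finance"
```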

Nandan Thakur (@beirmug)'s Twitter Profile Photo

That's a wrap! The Waterloo (Cheriton School of Computer Science) team had fun attending the ACL 2023 Conference in Toronto, Canada! #ACL2023NLP 🇨🇦

We would like to congratulate ralphtang.eth, Linqing Liu, Gin Jiang, Jimmy Lin, et al. for winning the Best Paper Award at ACL 2023!! 🏆

Next stop is SIGIR 2023.

Igor Babuschkin (@ibab)'s Twitter Profile Photo

If you want to move past the AI hype and learn some real fundamental basics behind today's learning algorithms there's no better choice than MacKay's "Information Theory, Inference and Learning Algorithms". You can read the book for free on the official website:

Jean Kaddour (@jeankaddour)'s Twitter Profile Photo

📢 The costs of training (L)LMs have skyrocketed 🚀 in recent years, motivating efficient training algorithms. However, when pre-training BERT and T5 models with a fixed compute budget, we find their gains vanish compared to a baseline with a fully-decayed learning rate! 1/5

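The baseline referenced in the thread is simply a learning rate that is fully decayed by the end of the fixed compute budget. As a concrete illustration, here is a sketch of one common fully-decaying schedule (linear warmup, then cosine decay to zero); this is my own example, not necessarily the exact schedule from the paper.

```python
import math

def fully_decayed_lr(step: int, total_steps: int, peak_lr: float,
                     warmup_steps: int = 0) -> float:
    """Linear warmup, then cosine decay reaching 0 exactly when the budget runs out."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

# Near the end of the budget, the learning rate is essentially zero:
print(fully_decayed_lr(step=99_999, total_steps=100_000, peak_lr=3e-4, warmup_steps=1_000))
```
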
Meriem (@mellem_boo)'s Twitter Profile Photo

Very excited to share our latest work:
🤔 Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

w/ Edward Kim, Beyza Ermiş, Marzieh Fadaee, Sara Hooker
🔗: arxiv.org/abs/2310.14424

Arthur Mensch (@arthurmensch)'s Twitter Profile Photo

Announcing Mixtral 8x7B mistral.ai/news/mixtral-o… and our early developer platform mistral.ai/news/la-platef…. Very proud of the team!

Jonathan Frankle (@jefrankle)'s Twitter Profile Photo

Meet DBRX, a new sota open llm from Databricks. It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.

Matei Zaharia (@matei_zaharia)'s Twitter Profile Photo

At Databricks, we've built an awesome model training and tuning stack. We've now used it to release DBRX, the best open source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B. databricks.com/blog/introduci…

Ali Ghodsi (@alighodsi)'s Twitter Profile Photo

Today we released an open source model, DBRX, that beats all previous open source models on the standard benchmarks. The model itself is a Mixture of Experts (MoE) with roughly twice the brains (132B) but half the cost (36B) of Llama2-70B, making it both smart and cheap. Since
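
The "twice the brains, half the cost" arithmetic follows from top-k expert routing: all parameters live in the checkpoint, but only the router-selected experts run for each token. A back-of-the-envelope sketch, assuming the publicly described 16 experts with 4 active per token and an illustrative (not official) split between shared and per-expert weights:

```python
def moe_params(shared_b: float, expert_b: float, n_experts: int, top_k: int):
    """Return (total, active-per-token) parameter counts, in billions."""
    total = shared_b + n_experts * expert_b     # every expert is stored
    active = shared_b + top_k * expert_b        # only top_k experts run per token
    return total, active

# Illustrative split chosen to reproduce the quoted 132B total / 36B active:
total, active = moe_params(shared_b=4, expert_b=8, n_experts=16, top_k=4)
print(f"total = {total:.0f}B, active per token = {active:.0f}B")  # 132B / 36B
```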

Mistral AI (@mistralai)'s Twitter Profile Photo

magnet:?xt=urn:btih:9238b09245d0d8cd915be09927769d5f7584c1c9&dn=mixtral-8x22b&tr=udp%3A%2F%2Fopen.demonii.com%3A1337%2Fannounce&tr=http%3A%2F%https://t.co/OdtBUsbeV5%3A1337%2Fannounce

Arthur Mensch (@arthurmensch)'s Twitter Profile Photo

Official now, very proud of the team! Apache 2.0 and instructed versions for your pleasure, available today on la Plateforme mistral.ai/news/mixtral-8…

Noam Shazeer (@noamshazeer)'s Twitter Profile Photo

Character AI is serving 20,000 QPS. Here are the technologies we use to serve hyper-efficiently: research.character.ai/optimizing-inf…

lmarena.ai (formerly lmsys.org) (@lmarena_ai)'s Twitter Profile Photo

We are thrilled to announce the milestone release of SGLang Runtime v0.2, featuring significant inference optimizations after months of hard work.

It achieves up to 2.1x higher throughput compared to TRT-LLM and up to 3.8x higher throughput compared to vLLM. It consistently

Linqing Liu (@likicode)'s Twitter Profile Photo

Evaluating LLMs in enterprise domains can be challenging. In this post, we share how our applied AI team synthesized high-quality code tests for specific libraries to enhance system performance. Joint work with Matthew Hayes, Matei Zaharia, and Ritendra Datta!
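
The post itself doesn't include code, but the general pattern behind test-based evaluation of generated code is simple: run the candidate together with the synthesized assertions in an isolated process, and count a clean exit as a pass. A minimal, generic sketch of that pattern (not the Databricks pipeline itself):

```python
import subprocess
import sys
import tempfile

def passes_tests(candidate_code: str, test_code: str, timeout_s: float = 5.0) -> bool:
    """Run candidate code plus synthesized asserts in a subprocess; exit code 0 = pass."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code + "\n\n" + test_code + "\n")
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=timeout_s)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False

# Hypothetical model output scored against synthesized assertions:
candidate = "def add(a, b):\n    return a + b"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"
print(passes_tests(candidate, tests))  # True
```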

Demis Hassabis (@demishassabis)'s Twitter Profile Photo

Thrilled to kick off the Gemini 2.0 era with Gemini 2.0 Flash, an update to our workhorse model that outperforms even 1.5 Pro at twice the speed. It has really great multilingual skills, and can natively call tools, like Google Search. It’s the first release in the Gemini 2.0

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Good post from Balaji on the "verification gap". You could see it as there being two modes in creation. Borrowing GAN terminology: 1) generation and 2) discrimination. e.g. painting - you make a brush stroke (1) and then you look for a while to see if you improved the