Loreto Parisi (@loretoparisi)'s Twitter Profile
Loreto Parisi

@loretoparisi

MSc EECS @UninaIT 2006. Working on #MachineLearning. #AI @musixmatchai Engineering Director @musixmatch Tweets are my own

ID:7981712

Link: https://github.com/loretoparisi · Joined: 06-08-2007 00:59:59

15.9K Tweets

1.5K Followers

1.2K Following

Salman Khan (@KhanSalmanH)'s Twitter Profile Photo

LLaMA3 and Phi3 made a splash this week in the LLM Arena. But how strong is their visual understanding ability?

⚡We release LLaMA3-Vision and Phi3-Vision models that beat their larger LLM competitors.

Github: github.com/mbzuai-oryx/LL…
HF: huggingface.co/collections/MB…
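For context, a minimal sketch of how a LLaVA-style vision LLM like these is typically queried with Hugging Face transformers. The repo id below is a placeholder (the collection URL in the tweet is truncated), and the released checkpoints may use a different processor class or prompt format.

```python
# Sketch: querying a LLaVA-style vision LLM via transformers.
# The model id is hypothetical; substitute the actual checkpoint from the collection.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "MBZUAI/LLaVA-Phi-3-mini-hf"  # placeholder repo id, not confirmed by the tweet
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```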

Zengyi Qin (@qinzytech)'s Twitter Profile Photo

Introducing OpenVoice V2, our latest voice clone model

· Clone Any Voice, Speak in Many Languages
· Totally Free, Open-Sourced

Now your voice goes global in multiple languages🤯

Joint work by MyShell and MIT CSAIL

Philipp Schmid (@_philschmid)'s Twitter Profile Photo

We can do it! 🙌 First open LLM outperforms OpenAI GPT-4 (March) on MT-Bench. WizardLM 2 is a fine-tuned and preference-trained Mixtral 8x22B! 🤯

TL;DR;
🧮 Mixtral 8x22B based (141B-A40 MoE)
🔓 Apache 2.0 license
🤖 First > 9.00 on MT-Bench with an open LLM
🧬 Used multi-step…
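A minimal sketch of how such a checkpoint would be used with transformers, assuming it ships a standard chat template. The model id is an assumption (the tweet does not give one), and a 141B-parameter MoE needs multiple GPUs or heavy quantization in practice.

```python
# Sketch: chatting with a Mixtral-8x22B-based checkpoint via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/WizardLM-2-8x22B"  # assumed repo id, not stated in the tweet
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one paragraph."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```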

🇺🇦Ukrainian Front (@front_ukrainian)'s Twitter Profile Photo

⚡️🇺🇦Ukrainian pilots flew by helicopter to a boy from a front-line village to thank him.

The boy always greeted the airmen with the flag, so they decided to meet him and presented him with a package of sweets, toys and food for his family.

AMD Radeon (@amdradeon)'s Twitter Profile Photo

We are working to release Micro-Engine Scheduler (MES) documentation towards the end of May and will follow up with published source code for external review and feedback. We have also opened a GitHub tracker, which will have the latest status on fixes and release dates.

Loreto Parisi (@loretoparisi)'s Twitter Profile Photo

JetMoE-8B has 24 blocks. Each block has two MoE layers: Mixture of Attention heads (MoA) and Mixture of MLP Experts (MoE). Each MoA and MoE layer has 8 experts, and 2 experts are activated for each input token. 💡 ModuleFormer: Modularity Emerges from MoE
arxiv.org/abs/2306.04640
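A toy sketch of the routing pattern described above (8 experts per layer, top-2 active per token). This is a generic top-k MoE MLP layer for illustration only, not the actual JetMoE/ModuleFormer code, which also applies the same idea to attention heads.

```python
# Toy top-2-of-8 mixture-of-experts MLP layer in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.router(x)                 # (tokens, n_experts)
        top_w, top_idx = scores.topk(self.k, dim=-1)
        top_w = F.softmax(top_w, dim=-1)        # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):              # plain loops for clarity, not efficiency
            idx = top_idx[:, slot]
            w = top_w[:, slot].unsqueeze(-1)
            for e, expert in enumerate(self.experts):
                mask = idx == e                 # tokens that routed to expert e in this slot
                if mask.any():
                    out[mask] += w[mask] * expert(x[mask])
        return out

tokens = torch.randn(10, 512)
print(TopKMoE()(tokens).shape)                  # torch.Size([10, 512])
```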

Loreto Parisi (@loretoparisi)'s Twitter Profile Photo

1B Token Context Window is the largest C.W. to date. This implies as many as 750M English words (1×10^9 tokens × 0.75 words/token = 7.5×10^8). ICL capabilities at this scale are still to be proven, but a Needle In A Haystack (NIAH) test at 100% accuracy for the new FastLLM is impressive.
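A quick sanity check of that arithmetic, assuming the common rule of thumb of roughly 0.75 English words per token (the exact ratio varies by tokenizer and text):

```python
# Back-of-the-envelope token-to-word conversion for a 1B-token context window.
context_tokens = 1_000_000_000
words_per_token = 0.75  # rough heuristic, not an exact figure
print(f"{context_tokens * words_per_token:,.0f} words")  # 750,000,000 words
```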

Silke Hahn ✨ (@_SilkeHahn)'s Twitter Profile Photo

April Fools' joke or real? 🐣🤔🎧

Remember Voice Engine, OpenAI's voice-cloning program? One week earlier, the University of Texas had released a comparable tool: Voice Craft
the-decoder.de/open-source-st…

... which reportedly needs only 3 seconds of voice…

Loreto Parisi (@loretoparisi)'s Twitter Profile Photo

How do I return the response from an asynchronous call? (Good) old answer, still valid!
stackoverflow.com/a/36585554/758…
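The linked question and answer are about JavaScript, but the point carries over to any async runtime: you cannot return a value before the asynchronous call has completed; you have to await it (or consume it in a callback). A rough analogue of that pattern in Python's asyncio, for illustration only:

```python
# Sketch: the "return the result where you await it" pattern, asyncio version.
import asyncio

async def fetch_data():
    await asyncio.sleep(0.1)   # stands in for a network request
    return {"status": "ok"}

async def main():
    # Calling fetch_data() alone yields a coroutine object, not the data;
    # the result only exists once the coroutine is awaited.
    result = await fetch_data()
    print(result)

asyncio.run(main())
```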

Loreto Parisi (@loretoparisi)'s Twitter Profile Photo

Mixture of Experts (MoE)

‣ Mixtral 8x7B, 45B / 12B active = Mixtral8x7B-45x12B
‣ Qwen1.5-MoE-A2.7B, 14.3B / 2.7B active = Qwen1.5-14.3x2.7B
‣ Grok-1, 314B / 86B active = Grok-1-314x86B
‣ DBRX, 132B / 36B active = DBRX-132x36B
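The same figures expressed as active-to-total parameter ratios, using the numbers quoted above (not independently verified):

```python
# Active vs. total parameter counts (in billions) for the MoE models listed above.
models = {
    "Mixtral 8x7B":      (45.0, 12.0),
    "Qwen1.5-MoE-A2.7B": (14.3, 2.7),
    "Grok-1":            (314.0, 86.0),
    "DBRX":              (132.0, 36.0),
}
for name, (total_b, active_b) in models.items():
    print(f"{name:20s} {active_b:6.1f}B / {total_b:6.1f}B active = {active_b / total_b:.0%}")
```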

Loreto Parisi (@loretoparisi)'s Twitter Profile Photo

So basically this (DBRX, 132B params with 36B active) and Grok-1 (314B params MoE, 86B active) confirm that MoE-based LLMs are the architecture of choice for the 2024 horizon (I would not bet on it for 2025 yet).
