Fereshte Khani (@fereshte_khani) Twitter Tweets • TwiCopy

Fereshte Khani

@fereshte_khani


@OpenAI, CS Ph.D. @stanfordAILab

ID: 994846342118871040

Link: https://fereshte-khani.github.io/
Joined: 11-05-2018 07:47:06

333 Tweets

4,4K Followers

883 Following

AI Coffee Break with Letitia

7 months ago

We simply explain and illustrate Mamba and (Selective) State Space Models – SSMs. 📺 youtu.be/vrF3MtGwD0Y SSMs match performance of transformers, but are faster and more memory-efficient than them. This is crucial for long sequences! Incredible work by Albert Gu and Tri Dao! 👏


Likes: 253

Replies: 2

Fereshte Khani

@fereshte_khani

7 months ago

Lives Ended in Gaza… nytimes.com/interactive/20…

Likes: 0

Replies: 0

Zeyuan Allen-Zhu

@zeyuanallenzhu

6 months ago

Our 12 scaling laws (for LLM knowledge capacity) are out: arxiv.org/abs/2404.05405. Took me 4mos to submit 50,000 jobs; took Meta 1mo for legal review; FAIR sponsored 4,200,000 GPU hrs. Hope this is a new direction to study scaling laws + help practitioners make informed decisions


Likes: 1,1K

Replies: 28

Lilian Weng

5 months ago

🎨Spent some time refactoring the 2021 post on diffusion model with new content: lilianweng.github.io/posts/2021-07-… ⬇️ ⬇️ ⬇️ 🎬Then another short piece on diffusion video models: lilianweng.github.io/posts/2024-04-… (Yes, I had an intensive weekend🥹)

Likes: 1,1K

Replies: 30

Fereshte Khani

@fereshte_khani

5 months ago

In the long run we are all dead -Keynes

Likes: 5

Replies: 3

lmsys.org

5 months ago

Exciting update -- Llama-3 full result is out, now reaching top-5 on the Arena leaderboard🔥 We've got stable enough CIs with over 12K votes. No question now Llama-3 70B is the new king of open model. Its powerful 8B variant has also surpassed many larger-size models. What an


Likes: 1,1K

Replies: 31

Michael Black

@michael_j_black

5 months ago

Young scientists regularly ask me for career advice. Academia or industry? Big company or startup? US or Europe? Good scientists in AI disciplines are fortunate to have many choices. But choosing can be stressful. I always give the same advice. 1/10

Likes: 1,1K

Replies: 21

OpenAI

4 months ago

We’ll be streaming live on openai.com at 10AM PT Monday, May 13 to demo some ChatGPT and GPT-4 updates.

Likes: 10,10K

Replies: 572

Fereshte Khani

@fereshte_khani

4 months ago

😢

Likes: 6

Replies: 0

Behnam Neyshabur

4 months ago

I'm excited about this! Our team has been working really hard to improve Gemini 1.5 capabilities significantly on multiple fronts and in particular MATH/STEM! Please see the report here: goo.gle/GeminiV1-5

Likes: 169

Replies: 9

Tri Dao

4 months ago

With Albert Gu, we’ve built a rich theoretical framework of state-space duality, showing that many linear attn variants and SSMs are equivalent! The resulting model, Mamba-2 is better & faster than Mamba-1, and still matching strong Transformer arch on language modeling. 1/


Likes: 739

Replies: 5

Rohan Paul

3 months ago

Brilliant work by the Android agents team at Google DeepMind 📌 The authors introduce ANDROIDCONTROL, a new dataset of 15,283 human demonstrations of everyday tasks across 833 Android apps. Each task includes both high-level and low-level instructions. This allows studying agent


Likes: 172

Replies: 8

Robin Jia

2 months ago

For many years as a Stanford NLP Group PhD student, I loved attending these seminars. It’s good to be back, this time as a guest speaker! I’ll discuss my group’s recent progress on understanding and auditing large language models

Likes: 71

Replies: 0

Sam Altman

2 months ago

way back in 2022, the best model in the world was text-davinci-003. it was much, much worse than this new model. it cost 100x more.

Likes: 3,3K

Replies: 171

Rosanne Liu

2 months ago

New fundraiser to support 25 Nigerian students to attend Deep Learning Indaba in September! 🌍 In 2022 we supported 8 Nigerian students to attend Indaba. This year we are raising $20k to support 25(!) of them to travel to Senegal for likely the most important career event in their lives!


Likes: 163

Replies: 6

Zeyuan Allen-Zhu

@zeyuanallenzhu

2 months ago

Incredibly honored and humbled by the overwhelming response to my tutorial, and thank you everyone who attended in person. Truly heartwarming to hear how much you enjoyed it. Many have been asking for a recording, and I prepared one with my own subtitles youtu.be/yBL7J0kgldU


Likes: 895

Replies: 23

Zeyuan Allen-Zhu

@zeyuanallenzhu

2 months ago

Bad news (1/2): video taken down by ICML (brockmeyer@icml.cc) for copyright. While I can't agree (the consent I signed allows me to publish elsewhere) - I will respect it to save time for more important things. Too bad I delayed many things and spent 20+ hrs preparing the video.

Likes: 494

Replies: 38

Shefali

2 months ago

🥲🥲

Likes: 11,11K

Replies: 105

Zico Kolter

2 months ago

I'm excited to announce that I am joining the OpenAI Board of Directors. I'm looking forward to sharing my perspectives and expertise on AI safety and robustness to help guide the amazing work being done at OpenAI.

Likes: 1,1K

Replies: 79

Qinyuan Ye

a month ago

I'll be presenting our work on investigating the role of meta-prompt components in automatic prompt engineering, i.e., "𝗽𝗿𝗼𝗺𝗽𝘁 𝗲𝗻𝗴𝗶𝗻𝗲𝗲𝗿𝗶𝗻𝗴 𝗮 𝗽𝗿𝗼𝗺𝗽𝘁 𝗲𝗻𝗴𝗶𝗻𝗲𝗲𝗿 ", at Findings Session 1 (Mon 12:45) and NLRSE Workshop (Thu 4pm)! Please come say hi! 👋


Likes: 181

Replies: 1