William Chen (@chenwanch1)'s Twitter Profile
William Chen

@chenwanch1

PhD Student @LTIatCMU @SCSatCMU | Masters @LTIatCMU | Formerly @TXInstruments | @UCF ‘21

ID: 1403044813562433539

Link: http://wanchichen.github.io | Joined: 10-06-2021 17:42:13

247 Tweets

697 Followers

384 Following

Salah Zaiem (@salah_zaiem)'s Twitter Profile Photo

We are looking for audio and speech generation people in Zurich, Paris, or London to join our team at Google DeepMind. We build cutting-edge speech, music, and audio (also audio-visual) generation capabilities. Reach out to Jason or me if interested. Retweets very appreciated!

Masao (@mmiagshatoy)'s Twitter Profile Photo

Happy to share our #ICLR2025 paper:
"Context-Aware Dynamic Pruning for Speech Foundation Models" 🎉

💡 We introduce context-aware inference-time pruning.
🎯 On Speech Translation (ST), it cuts inference time by 34% (relative) with no drop in BLEU.

📄 openreview.net/forum?id=u2QdC…
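
The thread doesn't include code, so here is a rough, hypothetical sketch of what context-aware inference-time pruning can look like: a small gate scores each encoder layer from a context embedding, and low-scoring layers are skipped at inference. The class, the gating design, and the threshold below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class GatedEncoder(nn.Module):
    """Toy encoder whose layers can be skipped at inference based on a
    context embedding (hypothetical illustration, not the paper's model)."""

    def __init__(self, d_model=256, n_layers=12, n_heads=4, threshold=0.5):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
             for _ in range(n_layers)]
        )
        self.gate = nn.Linear(d_model, n_layers)  # one keep-score per layer
        self.threshold = threshold

    def forward(self, x, context):
        # context: (batch, d_model) summary of speaker/acoustic conditions
        keep = torch.sigmoid(self.gate(context)).mean(dim=0)  # (n_layers,)
        for score, layer in zip(keep, self.layers):
            if self.training or score >= self.threshold:
                x = layer(x)
            # else: the layer is pruned for this input, saving its compute
        return x

enc = GatedEncoder().eval()
x = torch.randn(2, 100, 256)   # (batch, frames, d_model)
ctx = x.mean(dim=1)            # toy context embedding
print(enc(x, ctx).shape)       # torch.Size([2, 100, 256])
```
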
Shinji Watanabe (@shinjiw_at_cmu)'s Twitter Profile Photo

📢 Introducing VERSA: our new open-source toolkit for speech & audio evaluation!
- 80+ metrics in one unified interface
- Flexible input support
- Distributed evaluation with Slurm
- ESPnet compatible
Check out the details: wavlab.org/activities/202… github.com/wavlab-speech/…
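
VERSA's actual API isn't shown in the tweet, and the sketch below does not claim to reproduce it. As a loose illustration of what "one unified interface" over many metrics means in practice, here is a hypothetical registry pattern; the metric names, signature, and config format are invented for the example.

```python
# Illustrative only: a registry mapping metric names to functions with one
# shared (prediction, reference, sample_rate) signature, so many metrics
# can be configured and run uniformly. Not VERSA's real API.
from typing import Callable, Dict
import numpy as np

METRICS: Dict[str, Callable[[np.ndarray, np.ndarray, int], float]] = {}

def register(name):
    def deco(fn):
        METRICS[name] = fn
        return fn
    return deco

@register("snr")
def snr(pred, ref, sr):
    noise = ref - pred
    return float(10 * np.log10(np.sum(ref**2) / (np.sum(noise**2) + 1e-12)))

@register("mae")
def mae(pred, ref, sr):
    return float(np.mean(np.abs(pred - ref)))

def evaluate(pred, ref, sr, config):
    # config is a list of metric names, mirroring a YAML-style setup.
    return {name: METRICS[name](pred, ref, sr) for name in config}

sr = 16000
ref = np.random.randn(sr)               # toy reference signal
pred = ref + 0.01 * np.random.randn(sr) # toy prediction
print(evaluate(pred, ref, sr, ["snr", "mae"]))
```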

Huck Yang 🇸🇬 ICLR 2025 (@huckiyang)'s Twitter Profile Photo

We are happy that 🦉 OWLS, 18B to 0.25B open ASR/AST limited-data scaling laws, has been accepted to ICML 2025, led by William Chen and Jinchuan Tian (田晋川) from Shinji Watanabe's WAVLab (@CarnegieMellon) and NVIDIA AI.
Models: huggingface.co/collections/es…
Paper: arxiv.org/pdf/2502.10373
Deepspeed ESPNet:

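To make the scaling-law idea concrete, the toy fit below uses a saturating power law, a common functional form in such studies; the form is only an assumption about what OWLS fits, and the (model size, WER) points are synthetic, invented for the demo. See the paper for the real measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    # Saturating power law: error falls as n^-alpha toward a floor c.
    return a * n ** (-alpha) + c

# Synthetic (parameter count in billions, WER) points, invented for the demo.
n_params = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 9.0, 18.0])
wer = np.array([14.1, 12.3, 10.8, 9.7, 8.9, 8.2, 7.8])

(a, alpha, c), _ = curve_fit(power_law, n_params, wer, p0=[5.0, 0.5, 7.0])
print(f"WER(N) ≈ {a:.2f} * N^(-{alpha:.2f}) + {c:.2f}")
print(f"Extrapolated WER at 36B params: {power_law(36.0, a, alpha, c):.2f}")
```
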
Andrew Rouditchenko 🇺🇦 (@arouditchenko)'s Twitter Profile Photo

Do you really need audio to fine-tune your Audio LLM? 🤔 Answer below:
Introducing Omni-R1, a simple GRPO fine-tuning method for Qwen2.5-Omni on audio question answering. It sets new state-of-the-art accuracies on the MMAU benchmark for Audio LLMs.
arxiv.org/abs/2505.09439
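
For readers unfamiliar with GRPO, the group-relative advantage it is built on is easy to sketch. The snippet below is a generic illustration under assumed 0/1 correctness rewards, not Omni-R1's actual training code; the full method also involves a clipped policy-gradient loss and reward design omitted here.

```python
# Generic sketch of the group-relative advantage at the heart of GRPO
# (Group Relative Policy Optimization); not Omni-R1's training code.
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scores for sampled answers.
    Each answer's advantage is its reward normalized within its own group,
    so no learned value function (critic) is needed."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + 1e-6)

# Toy example: 2 audio questions, 4 sampled answers each, with assumed
# 0/1 correctness rewards (Omni-R1's actual reward design may differ).
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 1.0, 0.0]])
print(grpo_advantages(rewards))
# Correct answers get positive advantage, incorrect ones negative; a
# PPO-style clipped policy-gradient loss then upweights their tokens.
```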

Shuichiro Shimizu / 清水 周一郎 (@cromz22)'s Twitter Profile Photo

Excited to share our survey paper accepted to #ACL2025NLP Findings: When Large Language Models Meet Speech: A Survey on Integration Approaches, by Zhengdong Yang, Shuichiro Shimizu, Yahan Yu, and Chenhui Chu. 1/5

William Chen (@chenwanch1)'s Twitter Profile Photo

7/7 papers accepted to #Interspeech2025 🎉 Lots of interesting work from my fantastic co-authors on long-form processing, multilingualism, and multi-modal foundation models. See y’all in Rotterdam 🇳🇱

William Chen (@chenwanch1)'s Twitter Profile Photo

I’ll be interning at Adobe Research in San Francisco this summer, working on audio generation. HMU if you’re in the area and want to chat about speech / audio AI!

jiatongshi (@jiatongshi)'s Twitter Profile Photo

🚀 Introducing Uni-VERSA: a unified model for multi-dimensional speech evaluation (naturalness, intelligibility, noise, prosody & more).
⚡ 109× faster than native VERSA metric computation
🤗 Pretrained models + Colab demo
🧰 VERSA integration coming!
🔗 huggingface.co/collections/es…
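
The tweet doesn't describe the architecture, but a speculative minimal sketch of a unified multi-metric evaluator (shared encoder, one regression head per metric; every name below is assumed, not the released model) shows why a single forward pass can beat running each metric's native implementation separately.

```python
# Speculative sketch, not Uni-VERSA's released architecture: one shared
# speech encoder feeding a regression head per evaluation dimension.
import torch
import torch.nn as nn

class MultiMetricEvaluator(nn.Module):
    def __init__(self, d_model=256,
                 metrics=("naturalness", "intelligibility", "noise", "prosody")):
        super().__init__()
        self.encoder = nn.GRU(80, d_model, batch_first=True)  # stand-in encoder
        self.heads = nn.ModuleDict({m: nn.Linear(d_model, 1) for m in metrics})

    def forward(self, feats):
        # feats: (batch, frames, 80), e.g. log-mel features
        _, h = self.encoder(feats)    # h: (1, batch, d_model)
        pooled = h[-1]                # (batch, d_model) utterance embedding
        return {m: head(pooled).squeeze(-1) for m, head in self.heads.items()}

scores = MultiMetricEvaluator()(torch.randn(2, 300, 80))
print({k: v.shape for k, v in scores.items()})
# One forward pass yields every metric, which is where a large speedup over
# invoking each metric's native pipeline would come from.
```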

Masao (@mmiagshatoy)'s Twitter Profile Photo

🚀 Happy to share our #INTERSPEECH2025 paper: using speaker & acoustic context, we dynamically adjust model paths, resulting in a 25.7% relative BLEU improvement in speech translation. We also analyze how context influences model behavior.
📜 Paper: arxiv.org/abs/2505.18860

jiatongshi (@jiatongshi)'s Twitter Profile Photo

🔊 New release: #ARECHO -> Autoregressive Evaluation via Chain-based Hypothesis Optimization.
• 87-metric coverage in one model 🧮
• Dynamic classifier chain 🤝
• Unified tokenization 🧩
• Confidence-aware decoding 🛡️
Built on #UniVERSA, heading to #VERSA. More ↓
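
As a hedged reading of those bullet points, a dynamic classifier chain with confidence-aware decoding might look roughly like the sketch below: metrics are predicted one at a time, each conditioned on the input plus the metrics already predicted, with the next metric chosen greedily by model confidence. Everything here is hypothetical: the module, the variance-based confidence, and the greedy ordering are assumptions, not ARECHO's code.

```python
# Hypothetical sketch of a dynamic, confidence-ordered classifier chain;
# not ARECHO's implementation.
import torch
import torch.nn as nn

class DynamicChain(nn.Module):
    def __init__(self, d_in=256, n_metrics=87):
        super().__init__()
        # Each step sees the input embedding plus one slot per metric,
        # filled in as predictions are made (zeros while unknown).
        self.step = nn.Linear(d_in + n_metrics, n_metrics * 2)  # mean, logvar
        self.n_metrics = n_metrics

    @torch.no_grad()
    def decode(self, x):
        known = torch.zeros(x.size(0), self.n_metrics)
        remaining = set(range(self.n_metrics))
        preds = {}
        while remaining:
            out = self.step(torch.cat([x, known], dim=-1))
            mean, logvar = out.chunk(2, dim=-1)
            # Confidence-aware ordering: commit to the still-unknown metric
            # the model is most certain about (lowest predicted variance).
            conf = (-logvar).mean(dim=0)
            idx = max(remaining, key=lambda i: conf[i].item())
            preds[idx] = mean[:, idx]
            known[:, idx] = mean[:, idx]   # condition later steps on it
            remaining.remove(idx)
        return preds

chain = DynamicChain()
preds = chain.decode(torch.randn(2, 256))  # (batch, input embedding)
print(len(preds))  # all 87 metrics, predicted in a confidence-driven order
```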