jiatongshi (@jiatongshi) Twitter Tweets • TwiCopy

Wen-Chin Huang

3 months ago

Enjoyed a great INTERSPEECH 2025 experience! (my first since 2019 in Austria😮‍💨) Kudos to the organizers! Please find our tutorial slides here: voicemos-challenge-2023.github.io/speech-synthes… Also, if you work on MOS prediction, make sure you check out SHEET! github.com/unilight/sheet

26 likes, 0 replies, 10 reposts

Pooneh Mousavi

@mousavipooneh

3 months ago

I’m happy to share that our paper, "Discrete Audio Tokens: More Than a Survey!", has been accepted at TMLR. 🎉 📄 Read: arxiv.org/pdf/2506.10274 🔎 Explore our tokenizer database & submit yours: poonehmousavi.github.io/dates-website/…

20 likes, 0 replies, 4 reposts

Wangyou Zhang

@emrys365

3 months ago

The 3rd URGENT Challenge just started! 🚀 🎯This time we further expand data diversity and pursue high data quality in the universal SE track. 🎯 I'm also very excited about the new track on SE-oriented SQA. We believe (found) this benefits several areas (SE, TTS, and so on).

4 likes, 0 replies, 2 reposts

🐿️🐒🗻📚🐹

@sythonuk

3 months ago

Let's use SingMOS! github.com/South-Twilight…

5 likes, 0 replies, 2 reposts

Shinji Watanabe

@shinjiw_at_cmu

3 months ago

espnet v.202509 released 🚀 github.com/espnet/espnet/… Includes many updates + fixes for NumPy 2.0 & Python 3.12 (thanks Nelson!). This is the last major update before we shift to the next-gen framework, ESPnet3. Interested in collaborating? Let us know!

35 likes, 0 replies, 12 reposts

jiatongshi

@jiatongshi

3 months ago

Happy to see many familiar faces, and proud of Stella and the world Gaia (hope to have more time to play the game after the ICASSP and ICLR submissions 😂)

7 likes, 0 replies, 0 reposts

jiatongshi

@jiatongshi

3 months ago

ARECHO has been accepted by #neurips25 as a spotlight! Many thanks to all the co-authors for their great effort and support!

27 likes, 1 reply, 6 reposts

Chris Donahue

@chrisdonahuey

3 months ago

Eval for music generation is notoriously ill-defined, but no fear! Presenting MAD, a new metric for music quality with stronger alignment to human preferences. Appearing at ISMIR this week! ⭐: github.com/i-need-sleep/m… 📖: arxiv.org/abs/2503.16669 🔊: mad-metric-83cde1d399d1.herokuapp.com 🧵

43 likes, 1 reply, 6 reposts

jiatongshi

@jiatongshi

2 months ago

Happy to help organize the challenge again!

4 likes, 0 replies, 1 repost

Wen-Chin Huang

@unilightwf

2 months ago

The SHEET toolkit and the MOS-Bench collection are having MAJOR updates! (1/3) 🗃️ First, MOS-Bench is expanded! Now it has 8 training sets and 17 test sets. A "MOS-Bench at a glance" page is now available -- check out this Google Spreadsheet! docs.google.com/spreadsheets/d…

4 likes, 1 reply, 4 reposts

Chris Donahue

@chrisdonahuey

2 months ago

🎵Music Arena ⚔️ was accepted to the NeurIPS 2025 Creativity Track, and we've released a big update to celebrate! Includes new models from Sonauto AI and ElevenLabs. Also, Music Arena is now available as a 🤗 Hugging Face space and dataset!

41 likes, 1 reply, 8 reposts

jiatongshi

@jiatongshi

a month ago

Speech isn’t just sound -> it’s how we turn thought into expression. Our new work, Speech-DRAME, measures how well speech AI can act, aligning evaluation with human perception. Paper: arxiv.org/abs/2511.01261 Code: github.com/Anuttacon/spee…

22 likes, 0 replies, 5 reposts

jiatongshi

@jiatongshi

25 days ago

It’s a pleasure to work with Haoran!

4 likes, 0 replies, 0 reposts

jiatongshi

@jiatongshi

18 days ago

This is exactly why we worked on ESPnet-Codec, but it is really hard to keep track, as people move fast nowadays. A similar issue arises in most speech tasks, from ASR and TTS to general speech LLMs. It's a bit of a sad time for driving scientific findings 🥲

26 likes, 3 replies, 4 reposts