Sung-Feng Huang (@sungfenghuang)'s Twitter Profile
Sung-Feng Huang

@sungfenghuang

National Taiwan University | Speech Processing & Machine Learning Lab @ntu_spml

ID: 378591891

Joined: 23-09-2011 13:32:43

22 Tweets

63 Followers

256 Following

RJ Skerry-Ryan (@rustyryan)'s Twitter Profile Photo

New work with Ron Weiss, Eric Battenberg, Soroosh Mariooryad, Durk Kingma -- finally achieving what Yuxuan Wang and I set out to do in 2016 before switching to spectrograms: direct waveform generation from characters. (1/7) abs: arxiv.org/abs/2011.03568 samples: google.github.io/tacotron/publi…

Sasha Rush (@srush_nlp)'s Twitter Profile Photo

Understanding the Difficulty of Training Transformers (arxiv.org/pdf/2004.08249…, Liyuan Liu (Lucas)) Studies the instability of transformers. Dives into the dark arts of NN stability, the impact of layer norm / residuals. Isolates residual paths as a main cause. Trains 60-layer transformers.

Hung-yi Lee (李宏毅) (@hungyilee2)'s Twitter Profile Photo

Three years ago, when we first tried to use GAN to realize unsupervised ASR (arxiv.org/abs/1804.00316), I thought the idea was sci-fi. But a few days ago, Facebook AI pushed the idea of using GAN for unsupervised ASR to 5.9% WER on Librispeech (ai.facebook.com/blog/wav2vec-u…).

NTU SPML Lab (@ntu_spml)'s Twitter Profile Photo

Honored to cooperate with researchers from Facebook, CMU, MIT, and JHU to develop SUPERB. When you pre-train an LM like BERT on text, you use GLUE to evaluate its performance. How about speech? You can use SUPERB, which will be the speech version of GLUE. superbbenchmark.org

Hung-yi Lee (李宏毅) (@hungyilee2)'s Twitter Profile Photo

Two tutorials at INTERSPEECH'22.
Self-Supervised Representation Learning for Speech Processing, slides: docs.google.com/presentation/d…
Neural Speech Synthesis, slides: github.com/tts-tutorial/i…

Xuanjun (Victor) Chen 🤖 (@xjchen_ntu)'s Twitter Profile Photo

🚨 Call for Papers – ASRU 2025 Special Session
🎤 Responsible Speech & Audio Generative AI
📍 Honolulu, Hawaii · Dec 2025
Join us to tackle accountability, fairness, and trust in generative speech/music/audio systems!
👉 Deadline: May 28, 2025
🔗 Details: codecfake.github.io/RespSA-GenAI/

Sreyan Ghosh (@sreyang)'s Twitter Profile Photo

We at NVIDIA and GAMMA UMD are excited to release Audio Flamingo 3, the most powerful, open, and capable large audio-language model to date!
Paper: arxiv.org/abs/2507.08128
Open-source model, code, and data: research.nvidia.com/labs/adlr/AF3/
Try it out here: huggingface.co/spaces/nvidia/…

Cheng Han Chiang (姜成翰) (@dcml0714)'s Twitter Profile Photo

1/7 🔗 Introducing STITCH: our new method to make Spoken Language Models (SLMs) think and talk at the same time. Paper link 👉 arxiv.org/abs/2507.15375