Xinhao Mei (@xinhao_mei) 's Twitter Profile
Xinhao Mei

@xinhao_mei

Research Scientist @ AI at Meta | PhD student @ University of Surrey.

ID: 1178271870329790464

Link: http://xinhaomei.github.io | Joined: 29-09-2019 11:34:55

33 Tweets

124 Followers

253 Following

Haohe Liu (@liuhaohe) 's Twitter Profile Photo

Can't wait to share our new Text-to-Audio model, AudioLDM. 😆 This video shows the generation result with a simple text prompt: "A music made by xxx". More demos coming soon!😉 The paper will be available next Monday on arXiv! 😊 Our model will be open-sourced soon!😎

AK (@_akhaliq) 's Twitter Profile Photo

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

large-scale weakly-labelled audio captioning dataset, comprising approximately 400k audio clips with paired captions

abs: arxiv.org/abs/2303.17395
Xinhao Mei (@xinhao_mei) 's Twitter Profile Photo

🔊 So excited to share our new work, WavCaps: a large-scale weakly-labelled audio captioning dataset. We utilize #ChatGPT to filter & transform noisy data into captions. See remarkable improvements over previous SOTA on multiple tasks! ☺️ Code: github.com/XinhaoMei/WavC…
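The tweet describes using ChatGPT to filter and transform noisy metadata into captions. As a rough illustration of that cleaning step, here is a minimal rule-based sketch — hypothetical heuristics only, not the actual WavCaps pipeline, and the function name `clean_raw_description` is invented for this example:

```python
import re

def clean_raw_description(raw, min_words=3):
    """Heuristic caption cleaner: a rule-based stand-in for the
    ChatGPT filtering/rewriting step. Returns None when the raw
    description is too noisy or too short to keep as a caption."""
    text = raw.strip()
    text = re.sub(r"#\w+", "", text)                            # drop hashtags
    text = re.sub(r"\.(wav|mp3|flac)\b", "", text, flags=re.I)  # drop file extensions
    text = re.sub(r"\s+", " ", text).strip()                    # normalize whitespace
    if len(text.split()) < min_words:                           # reject too-short items
        return None
    if not text.endswith("."):
        text += "."
    return text[0].upper() + text[1:]

print(clean_raw_description("  dog barking   in the park.wav #fieldrecording "))
```

A language model replaces these brittle regex rules with open-ended rewriting, but the filtering intent (reject uninformative items, normalize the rest into sentence-like captions) is the same.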

Haohe Liu (@liuhaohe) 's Twitter Profile Photo

Excited to announce that our paper, "AudioLDM: Text-to-Audio Generation with Latent Diffusion Models," has been accepted at #ICML2023. Many thanks to the reviewers for their invaluable feedback. It's nice to collaborate with Zehua Chen and other co-authors. Also, special

AK (@_akhaliq) 's Twitter Profile Photo

Universal Source Separation with Weakly Labelled Data

abs: arxiv.org/abs/2305.07447
paper page: huggingface.co/papers/2305.07…
github: github.com/bytedance/uss
Haohe Liu (@liuhaohe) 's Twitter Profile Photo

AudioLDM 2 paper is now available on arXiv: arxiv.org/pdf/2308.05734…
AudioLDM 2 project page (demo, code, Discord): audioldm.github.io/audioldm2/

Haohe Liu (@liuhaohe) 's Twitter Profile Photo

48kHz AudioLDM now open-sourced on GitHub 🔊 Text-to-HiFi audio generation, much better than the previous 16kHz version. The speed-optimized version will be available on HF and Diffusers soon. github.com/haoheliu/Audio…

Haohe Liu (@liuhaohe) 's Twitter Profile Photo

🔊 Introducing AudioSR: a plug-and-play & one-for-all solution to upsample your audio to stunning 48kHz quality!
👉 Significant improvement verified on MusicGen (32kHz), AudioLDM (16kHz), and FastSpeech2 (22kHz)!
Demo, code, and paper: audioldm.github.io/audiosr

#AudioSR
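Why does super-resolving 16kHz audio to 48kHz need a generative model rather than plain resampling? A signal sampled at 16kHz carries no content above its Nyquist frequency of 8kHz, and conventional (interpolation-based) upsampling cannot create any. The sketch below demonstrates this with ideal FFT-zero-padding resampling on a synthetic tone; the signal and numbers are illustrative, not from the AudioSR paper:

```python
import numpy as np

sr_in, sr_out = 16_000, 48_000
t = np.arange(sr_in) / sr_in                 # 1 second of audio -> 1 Hz FFT bins
x = np.sin(2 * np.pi * 5_000 * t)            # 5 kHz tone at a 16 kHz sample rate

# Ideal resampling to 48 kHz by zero-padding the spectrum: every bin
# above the original Nyquist (8 kHz) stays exactly zero.
X = np.fft.rfft(x)
X_up = np.zeros(sr_out // 2 + 1, dtype=complex)
X_up[: X.size] = X
y = np.fft.irfft(X_up, n=sr_out) * (sr_out / sr_in)

spec = np.abs(np.fft.rfft(y))
energy_above_8k = spec[8_000:].sum() / spec.sum()
print(energy_above_8k)  # ~0: no energy appears above the original Nyquist
```

Models like AudioSR instead *synthesize* plausible high-frequency content conditioned on the low band, which is why the 48kHz output can sound richer than any resampled version of the input.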
AK (@_akhaliq) 's Twitter Profile Photo

FoleyGen: Visually-Guided Audio Generation

paper page: huggingface.co/papers/2309.10…

Recent advancements in audio generation have been spurred by the evolution of large-scale deep learning models and expansive datasets. However, the task of video-to-audio (V2A) generation continues
Hung-yi Lee (李宏毅) (@hungyilee2) 's Twitter Profile Photo

Recent years have witnessed significant developments in audio codec models (an overview figure from arxiv.org/abs/2402.13236). We introduce Codec-SUPERB (arxiv.org/abs/2402.13071) to enable fair and comprehensive comparison. Leaderboard: codecsuperb.com

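Fair comparison of neural codecs, as Codec-SUPERB aims for, typically means matching bitrates. For a residual-vector-quantized (RVQ) codec the bitrate follows directly from the frame rate, the number of codebooks, and the codebook size. A quick sketch with illustrative, EnCodec-like numbers (not figures from either paper):

```python
import math

def rvq_bitrate_bps(frame_rate_hz, num_codebooks, codebook_size):
    """Bitrate of a residual-vector-quantized codec: each frame emits
    num_codebooks indices, each costing log2(codebook_size) bits."""
    return frame_rate_hz * num_codebooks * math.log2(codebook_size)

# Illustrative configuration: 75 frames/s, 8 codebooks of 1024 entries each
print(rvq_bitrate_bps(75, 8, 1024) / 1000, "kbps")  # -> 6.0 kbps
```

Dropping codebooks at inference time scales the bitrate down in proportion, which is how many RVQ codecs expose multiple operating points from a single model.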
Haohe Liu (@liuhaohe) 's Twitter Profile Photo

New challenge at IEEE ICME 2024: Semi-supervised Acoustic Scene Classification under Domain Shift! The final submission deadline is Mar 22. Innovate and compete to win up to $600! 🏆 Baseline & more info: ascchallenge.xshengyun.com #IEEEICME2024 #MachineLearning

Thomas Pellegrini (@topel290118) 's Twitter Profile Photo

Machine listening people, please consider participating in the audio captioning task of DCASE. A new baseline system is provided: CNext-trans, 28M params, 29.6% SPIDEr-FL score on Clotho-eval.

dcase.community/challenge2024/…

github.com/Labbeti/dcase2…

#DCASE #audiocaptioning
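For context on the score quoted above: SPIDEr is the arithmetic mean of SPICE (semantic propositional content) and CIDEr (consensus with reference captions); the FL variant additionally penalizes captions flagged by a fluency-error detector (penalty details omitted here). The base combination is just:

```python
def spider(spice, cider):
    """SPIDEr: mean of the semantic (SPICE) and consensus (CIDEr) scores."""
    return 0.5 * (spice + cider)

# Illustrative component scores, not the baseline's actual numbers
print(round(spider(0.14, 0.45), 3))  # -> 0.295
```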