WOOSUNG CHOI (@woosungchoi3) 's Twitter Profile
WOOSUNG CHOI

@woosungchoi3

ID: 1207259612014997504

Link: https://ws-choi.github.io/cv · Joined: 18-12-2019 11:21:59

650 Tweets

289 Followers

232 Following

arXiv Sound (@arxivsound) 's Twitter Profile Photo

Junyoung Koh, Soo Yong Kim, Gyu Hyeong Choi, Yongwon Choi, "AIBA: Attention-based Instrument Band Alignment for Text-to-Audio Diffusion," arxiv.org/abs/2509.20891

Koichi Saito (@koichi__saito) 's Twitter Profile Photo

🔊Our new work: SoundReactor, pushing V2A generation toward “Neural Sound Engine”! We tackle “frame-level online” V2A, i.e., without accessing any future video frames ✅Simple design ✅Full-band stereo with AV sync ✅Low frame-level latency on 30FPS 🎧koichi-saito-sony.github.io/soundreactor/

arXiv Sound (@arxivsound) 's Twitter Profile Photo

Azalea Gui, Woosung Choi, Junghyun Koo, Kazuki Shimada, Takashi Shibuya, Joan Serrà, Wei-Hsiang Liao, Yuki Mitsufuji, "Towards Blind Data Cleaning: A Case Study in Music Source Separation," arxiv.org/abs/2510.15409

arXiv Sound (@arxivsound) 's Twitter Profile Photo

Chihiro Nagashima, Akira Takahashi, Zhi Zhong, Shusuke Takahashi, Yuki Mitsufuji, "Studies for : A Human-AI Co-Creative Sound Artwork Using a Real-time Multi-channel Sound Generation Model," arxiv.org/abs/2510.25228

WOOSUNG CHOI (@woosungchoi3) 's Twitter Profile Photo

Excited to head to San Diego for #NeurIPS2025! I’ll be presenting at the Creative AI Session this Wednesday: Large-Scale Training Data Attribution for Music Generative Models via Unlearning. - neurips.cc/virtual/2025/l… - ai.sony/publications/L… See you in San Diego!

arXiv Sound (@arxivsound) 's Twitter Profile Photo

Longshen Ou, Xichu Ma, Ye Wang, "Joint Learning of Wording and Formatting for Singable Melody-to-Lyric Generation," arxiv.org/abs/2307.02146

camenduru (@camenduru) 's Twitter Profile Photo

🎵 LeVo SongGeneration on 🍞 TostUI 🎙 Thanks to Tencent LeVo Team ❤ 🎁 🥳 Happy New Year 🎇🥂

🐋 docker run --gpus all -p 3000:3000 --name tostui-songgeneration camenduru/tostui-songgeneration

🌐 levo-demo.github.io
🍞 github.com/camenduru/Tost…

Kyunghyun Cho (@kchonyc) 's Twitter Profile Photo

this seems like the perfect time to re-advertise this new textbook <Foundations of Linear Algebra> authored by Prof. Wanmo Kang and me, if you're interested in vectors and vector spaces (also a bit of cosine similarity.) link below.

Chieh-Hsin (Jesse) Lai (@jcjesselai) 's Twitter Profile Photo

🎓 Happy to share: CMU is incorporating our book 《The Principles of Diffusion Models》 as a core resource for their diffusion & flow-matching course materials. If you’re teaching or learning diffusion models — or want a systematic, principled handbook — feel free to use it too. pic.x.com/S034V7OX1a

Joan Serrà (@serrjoa) 's Twitter Profile Photo

INTERNSHIP ALERT! My team at Sony AI is seeking interns for various positions, starting this year, in #Barcelona and #Zurich. We have a total of four internship opportunities, each lasting between 3 to 6 months, focusing on different topics based on the location. 1/3

Yuki Mitsufuji (@mittu1204) 's Twitter Profile Photo

8 papers accepted at #ICLR2026 from our lab Sony AI, thanks to our strong interns and collaborators🎉
1. VIRTUE: Visual-Interactive Text-Image Universal Embedder arxiv.org/abs/2510.00523
2. CMT: Mid-Training for Efficient Learning of Consistency, Mean Flow, and Flow Map

SeungHeon Doh (@seungheon_doh) 's Twitter Profile Photo

🎧 Excited to share that our paper "LLM2Fx-Tools: Tool Calling For Music Post-Production" has been accepted to #ICLR2026 📃 Paper link: arxiv.org/abs/2512.01559 LLM2Fx-Tools generates executable sequences of audio effects (Fx-chain) using Chain-of-Thought.

Junghyun (Tony) Koo (@junghyun_koo) 's Twitter Profile Photo

🔊 How “good enough” are today’s MLLMs—especially for niche, domain-specific tasks? Multimodal LLMs have become incredibly powerful. But when it comes to highly specialized problems, bigger isn’t always better. 🧵1/4

이준원 Junwon Lee (@jnwnlee) 's Twitter Profile Photo

Our paper on Selective Video-to-Audio generation for compositional workflows has been accepted to #CVPR2026! Check out the demo video below 🎥 🔊 Hear What Matters! Text-conditioned Selective Video-to-Audio Generation Junwon Lee, Juhan Nam, Jiyoung Lee youtube.com/watch?v=eUocr6…

Yuhta Takida (@takiko_san) 's Twitter Profile Photo

🎉PAVAS, a framework for generating physically plausible audio from video, by integrating physics estimation at #CVPR2026! Led by our intern Hyun-Bin Oh (x.gd/pE0IB), in collaboration with 過密都市, Tae-Hyun Oh, and Yuki Mitsufuji. 🎧&📝: x.gd/ObKwe

Nicholas J. Bryan (@nicholasjbryan) 's Twitter Profile Photo

Audio VAEs + VQ-VAEs designed for #GenAI!
* Ultra-fast encoding for on-the-fly training pipelines,
* ~2x more compression (13Hz) w/ frontier quality,
* Any format (mono, stereo LR, MS, mel, raw),
* Cont. or discrete latents.
👏 Jonah Casebeer! w/ Ge Zhu, Zhepei Wang, me