SeungHeon Doh (@seungheon_doh) Twitter Tweets • TwiCopy

Kento Watanabe

a year ago

🚀 Exciting Tutorial Alert! 🚀 🎶 Join us at #ISMIR2024 on Nov 10 for the tutorial: "T6: Lyrics and Singing Voice Processing in MIR" 🎶 Discover transcription, alignment, lyrics analysis & voice conversion—advancing MIR applications! 👉 More: ismir2024.ismir.net/tutorials

arXiv Sound

@arxivsound

a year ago

``PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text,'' Hayeon Bang, Eunjin Choi, Megan Finch, Seungheon Doh, Seolhee Lee, Gyeong-Hoon Lee, Juhan Nam, ift.tt/v6oGM9e

Zachary Novack @ICLR2025 🇸🇬

@zacknovack

a year ago

Excited for my 1st #ISMIR2024 this week! Happy to chat about controllable + fast music generation 🙂 I'll be presenting our part 2 of DITTO, where we accelerate control to near real-time! DITTO-2: Distilled Diffusion Inference Time T-Optimization 🎹:ditto-music.github.io/ditto2/ 🧵

SeungHeon Doh

@seungheon_doh

a year ago

Don't miss the "Connecting Music Audio and Natural Language" tutorial at the ISMIR Conference. We have prepared presentations including Overview of Language Models (Jong Wook Kim 💟), Music Description (Ilaria Manco), Music Retrieval (me), and Music Generation (Zachary Novack, Ke Chen).

SeungHeon Doh

@seungheon_doh

a year ago

😂😂😂😂😂

Patrick O'Reilly

@reillyopatrick

a year ago

This Thursday, I'll be at ISMIR Conference presenting a tool for turning *any* percussive sound into drums: oreillyp.github.io/tria/ (1/5)

SeungHeon Doh

@seungheon_doh

a year ago

Start bsky! 🦋 bsky.app/profile/seungh…

NVIDIA AI Developer

@nvidiaaidev

a year ago

🎵 ✨The world’s most flexible sound machine? With text and audio inputs, this new #generativeAI model, named Fugatto, can create any combination of music, voices, and sounds.🎹 Read more in our blog by @RichardKerris ➡️ blogs.nvidia.com/blog/fugatto-g… #NVIDIAResearch Note: Some

SeungHeon Doh

@seungheon_doh

10 months ago

A new audio-symbolic-text joint embedding has been released (Like ImageBind!). Use it for music retrieval with multilingual queries, Conditioning, RAG, FD Score, and more!

SeungHeon Doh

@seungheon_doh

10 months ago

I have completed my Ph.D. journey! The title of my doctoral dissertation is "Connecting Audio and Natural Language for Music Annotation and Retrieval." I would like to express my deepest gratitude to my advisor, Professor Juhan Nam, and my (unofficial) co-advisor, Dr. Keunwoo Choi

arXiv Sound

@arxivsound

10 months ago

``TALKPLAY: Multimodal Music Recommendation with Large Language Models,'' Seungheon Doh, Keunwoo Choi, Juhan Nam, ift.tt/HYvzbMZ

Joan Serrà

@serrjoa

10 months ago

Got a "too familiar" tune from your generative model? Try checking for musical version matching (MVM)! But MVM works with full tracks, and your tune is just a segment... Well, in our latest work we tackle precisely this issue, and achieve SOTA results even on full tracks! 1/4

Keunwoo Choi

@keunwoochoi

9 months ago

hi all, hi audio gen folks. please check out KAD! we truly believe KAD is a great alternative to FAD. why? thread:

SanghyukChun

@sanghyukchun

9 months ago

Great work as always! Looks super interesting 😉 Keunwoo Choi

SeungHeon Doh

@seungheon_doh

9 months ago

Thanks for sharing :) AK. Check out more multi-turn music recommendation examples! - Demo: talkpl-ai.github.io/talkplay-demo/# - Paper: arxiv.org/abs/2502.13713 - Dataset: huggingface.co/datasets/talkp… (w/ Keunwoo Choi, Juhan Nam)

Kirak Kim

@_kirak_kim

9 months ago

🎶 I’ll be presenting at IEEE VR 2025 in Saint-Malo, France! My work, “Designing a VR Music Game for Stress Reduction,” explores VR active music therapy & gamified approaches. First time presenting at an international conference; excited to connect!

Nicholas J. Bryan

@nicholasjbryan

8 months ago

Introducing "DRAGON: Distributional Rewards Optimize Diffusion Generative Models"! 📖: arxiv.org/abs/2504.15217 🎹: ml-dragon.github.io/web/ A new framework for fine-tuning gen models towards a target distribution. By Yatong Bai w/Jonah Casebeer Somayeh Sojoudi Nicholas J. Bryan

Keunwoo Choi

@keunwoochoi

6 months ago

🧵 we updated the TalkPlay paper significantly. 1. check out the performance comparison. LLM-based recsys does a great job over multi-turn chat and recommendation. SeungHeon Doh

Grace Luo

@graceluo_

6 months ago

✨New preprint: Dual-Process Image Generation! We distill *feedback from a VLM* into *feed-forward image generation*, at inference time. The result is flexible control: parameterize tasks as multimodal inputs, visually inspect the images with the VLM, and update the generator.🧵
