Koichi Saito (@koichi__saito)'s Twitter Profile
Koichi Saito

@koichi__saito

Research engineer @SonyAI_global. Sound generation/restoration, deep generative modeling. Tweets are my own. Love Fender 🎸

ID: 1590271163124416512

Link: https://scholar.google.com/citations?user=UT-g5BAAAAAJ&hl=en
Joined: 09-11-2022 09:13:00

113 Tweets

251 Followers

360 Following

Dongjun Kim (@gimdong58085414)

🚀 Happy to announce our new model, PaGoDA (arxiv.org/abs/2405.14822). Following Progressively Growing GAN, PaGoDA extends the 1-step generator progressively to distill 64x64 pixel diffusion up to 512x512! All you need is 64x64 pixel diffusion! Chieh-Hsin (Jesse) Lai, Yuki Mitsufuji, Stefano Ermon

Yixiao Zhang (@yixiao_zhang_)

We have open-sourced instruct-MusicGen! 😀 It includes both training and inference code.

Since the Slakh2100 dataset contains copyrighted audio data, we are not able to publicly share the checkpoint 😭

Paper: arxiv.org/abs/2405.18386
Code: github.com/ldzhangyx/inst…
Demo: bit.ly/instruct-music…

Koichi Saito (@koichi__saito)

Thank you very much for sharing our work, AK, my friend Dongjun Kim, and my colleague Chieh-Hsin (Jesse) Lai :) Audio demo samples are now available here: koichi-saito-sony.github.io/soundctm/ Training/inference code and checkpoints are almost ready!

dadabots (@dadabots)

🥳 Announcing Stable Audio Open 1.0

WE GOT MODEL WEIGHTS OUT 🏁😅
- text2audio diffusion, T5 DiT
- 47s
- pre-trained on sfx/samples from FreeSound
- good for making samples for your music
- fine-tune it
- build upon it (controlnet, etc)

huggingface.co/stabilityai/st…

Marco Martínez (@marcoamaram)

🎧🔬 Given a music mixture and its multitrack recordings, can we retrieve the audio effects graph from audio alone? 🎶🔍 Check out 'Searching For Music Mixing Graphs: A Pruning Approach', great research by @SunghoL10754073 during his internship at Sony AI. arxiv.org/abs/2406.01049

Shinnosuke Takamichi / 高道 慎之介 (@forthshinji)

We are thrilled to announce YODAS v2!
- 400k hours, 149 languages of speech data (same as v1)
- supporting long-form speech
- higher sampling rate (24 kHz)

huggingface.co/datasets/espne…

Yixiao Zhang (@yixiao_zhang_)

🪄MusicMagus is accepted to IJCAI 2024! It is a lightweight, training-free music editing and style transfer algorithm that can be applied to a pretrained diffusion model. We have also open-sourced 🪄MusicMagus on GitHub: github.com/ldzhangyx/Musi…. Hugging Face Space coming soon!

roser batlle roca (@roserbatlleroca)

Extremely excited to announce that our article “Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio” has been accepted at #ISMIR2024! ✨ w/ #WeiHsiangLiao, Xavier Serra, Yuki Mitsufuji, Emilia Gomez 💐 Stay tuned for the MiRA tool release 👀

Chieh-Hsin (Jesse) Lai (@jcjesselai)

🎶🎶 Exciting News! We are organizing and presenting a tutorial at the ISMIR Conference in San Francisco, Nov. 10-14, on Diffusion Models for Music and Sound! 📅 Mark our session -- T3: From White Noise to Symphony: Diffusion Models for Music and Sound at ismir2024.ismir.net/tutorials 🌐 Check

角野隼斗 - かてぃん (@880hz)

It has been a year since I moved to NY, and I have finally been able to sign with North American management as well. I'll do my best!

sheldonartists.com/hayato-sumino

Marco Martínez (@marcoamaram)

GRAFX is an open-source library for audio graphs in PyTorch. Audio processing can be done efficiently on GPU with batched processing and various differentiable audio effects, including a multitap delay and zero-phase EQ.

Great work by @SunghoL10754073!

github.com/sh-lee97/grafx

Takuya Narihira (@tnarihi)

The model from our paper, GenWarp, is now available! You can also try out an interactive demo where you can generate a novel view of a given image with camera control. Please give it a try!

genwarp-nvs.github.io
Demo: huggingface.co/spaces/Sony/ge…
Code & Model: github.com/sony/genwarp
