Dongmin Park @ iclr25 (@dongmin_park11) 's Twitter Profile
Dongmin Park @ iclr25

@dongmin_park11

AI Researcher @Krafton_AI (@PUBG) | Prev-Intern @Meta, NAVER | PhD @ KAIST | Data-centric AI, Diffusion, LLM Agents

ID: 1503362786638065666

Link: https://scholar.google.com/citations?view_op=list_works&hl=en&hl=en&user=4xXYQl0AAAAJ | Joined: 14-03-2022 13:30:46

40 Tweets

76 Followers

114 Following

Kevin Wang (@kevinwang_111) 's Twitter Profile Photo

🎉 Thrilled to announce our MindGames challenge is accepted at #NeurIPS2025! 🧠🤖 Ready to deploy your AI agents to compete and collaborate in Hanabi, Werewolf, Stag Hunt, and Colonel Blotto? 🎮 Stay tuned for details!

Grace Luo (@graceluo_) 's Twitter Profile Photo

✨New preprint: Dual-Process Image Generation! We distill *feedback from a VLM* into *feed-forward image generation*, at inference time. The result is flexible control: parameterize tasks as multimodal inputs, visually inspect the images with the VLM, and update the generator.🧵

Dongmin Park @ iclr25 (@dongmin_park11) 's Twitter Profile Photo

Thanks Yoshi Suhara, it was a real pleasure working with you and the NVIDIA AI team! Hope we get to collaborate again in the future, especially on building gaming SLMs jointly!

George (@georgejrjrjr) 's Twitter Profile Photo

Have you read the Deep Research Bench paper yet?

Very cool project. And you can tell building out the eval and its infrastructure was a BUNCH of work.

I would love to see this expanded to include open deep research *scaffolds*, so this hill can get climbed pronto in the open.
Alfonso Amayuelas (@alfonamayuelas) 's Twitter Profile Photo

New paper 🚨📜🚀
Introducing “Agents of Change: Self-Evolving LLM Agents for Strategic Planning”!
In this work, we show how LLM-powered agents can rewrite their own prompts & code to climb the learning curve in the board game Settlers of Catan 🎲
🧵👇
inZOI (@playinzoi) 's Twitter Profile Photo

Your next story begins on Mac this August.

Pre-order today on the Mac App Store and let your imagination lead the way. ➡️ inzoi.me/macpreorder

#Apple #Mac #WWDC #inZOI
#KRAFTON #LifeSimulation
Dongmin Park @ iclr25 (@dongmin_park11) 's Twitter Profile Photo

The Orak 🎮 benchmark leaderboard has just launched! Submit your LLMs and agentic strategies to compete in diverse real-world video games! krafton-ai.github.io/orak-leaderboa… *Orak comes from 오락, a native Korean word meaning “game”

inZOI (@playinzoi) 's Twitter Profile Photo

🔎 Get a glimpse of what’s new in the June Update (v.0.2.0)! Accessories like glasses, headpieces, and earrings can now be freely resized and repositioned. This allows for more precise and flexible styling than ever before. Updates open up more ways to create stories — with new

Kevin Ellis (@ellisk_kellis) 's Twitter Profile Photo

New paper: World models + Program synthesis by Wasu Top Piriyakulkij
1. World modeling on-the-fly by synthesizing programs w/ 4000+ lines of code
2. Learns new environments from minutes of experience
3. Positive score on Montezuma's Revenge
4. Compositional generalization to new environments

Yunzhi Zhang (@zhang_yunzhi) 's Twitter Profile Photo

(1/n) Time to unify your favorite visual generative models, VLMs, and simulators for controllable visual generation—Introducing a Product of Experts (PoE) framework for inference-time knowledge composition from heterogeneous models.

Essential AI (@essential_ai) 's Twitter Profile Photo

[1/5]

🚀 Meet Essential-Web v1.0, a 24-trillion-token pre-training dataset with rich metadata built to effortlessly curate high-performing datasets across domains and use cases!
Dongmin Park @ iclr25 (@dongmin_park11) 's Twitter Profile Photo

🚀 The research teaser video for Orak (오락) is out! 🔗 YouTube: youtube.com/watch?v=2_tUJR… Explore the code, benchmark, and leaderboard, and join us in pushing the boundaries of game agents!

Adina Yakup (@adinayakup) 's Twitter Profile Photo

Stream-Omni 🔥 a new Any-to-Any model by the Chinese Academy of Sciences.
Model: huggingface.co/ICTNLP/stream-…
Paper: huggingface.co/papers/2506.13…
✨ Unified multimodal input: text, vision, and speech
✨ Real-time "see-while-hear" experience
✨ Efficient training with minimal omni-modal data

Dawn Song (@dawnsongtweets) 's Twitter Profile Photo

My group & collaborators have developed many popular benchmarks over the years, e.g., MMLU, MATH, APPS. Really excited about our latest benchmark OMEGA Ω: 🔍 Can LLMs really think outside the box in math? A new benchmark probing 3 axes of generalization:
1️⃣ Exploratory
2️⃣

Kevin Wang (@kevinwang_111) 's Twitter Profile Photo

Excited to announce the Mindgame @NeurIPS Competition is officially LIVE!
🤖 Pit your agents against others in Mafia, Codename, Prisoner’s Dilemma, Stag Hunt, and Colonel Blotto.
Sign up now for $500 in compute credits on your initial run!
🔗 Register: mindgamesarena.com
elvis (@omarsar0) 's Twitter Profile Photo

Agent Leaderboard v2 is here!

> GPT-4.1 leads
> Gemini-2.5-flash excels at tool selection
> Kimi K2 is the top open-source model
> Grok 4 falls short
> Reasoning models lag behind
> No single model dominates all domains 

More below:
Scott Condron (@_scottcondron) 's Twitter Profile Photo

I don't have any special inside knowledge about how Kimi.ai trained Kimi K2. I just read the paper and this part is what I've been telling anyone who will listen about.

Their data generation steps to get lots of high quality, multi-turn agent traces to train on is so much
Kangwook Lee (@kangwook_lee) 's Twitter Profile Photo

🧵 When training reasoning models, what's the best approach? SFT, Online RL, or perhaps Offline RL? At KRAFTON AI and SK telecom, we've explored this critical question, uncovering interesting insights! Let’s dive deeper, starting with the basics first.
1) SFT
SFT (aka hard

Jason Weston (@jaseweston) 's Twitter Profile Photo

🌿Introducing MetaCLIP 2 🌿
📝: arxiv.org/abs/2507.22062
code, model: github.com/facebookresear…

After four years of advancements in English-centric CLIP development, MetaCLIP 2 is now taking the next step: scaling CLIP to worldwide data. The effort addresses long-standing