Lj Miranda (@ljvmiranda)'s Twitter Profile
Lj Miranda

@ljvmiranda

processing language naturally • predoc at @allen_ai

ID: 982443027493896192

Link: http://ljvmiranda921.github.io

Joined: 07-04-2018 02:20:45

2.2K Tweets

976 Followers

478 Following

Blaise Cruz (@jcblaisecruz)

Proud and excited to finally release FilBench, the first Open LLM Eval suite and leaderboard for Filipino! Blog post: huggingface.co/blog/filbench GitHub: github.com/filbench/filbe… Leaderboard: huggingface.co/spaces/UD-Fili… Paper: arxiv.org/abs/2508.03523

Joseph Imperial (@josephimperial_)

FilBench is out! These are exciting times for doing NLP in Filipino (and other Philippine languages)! 🇵🇭 Led by the amazing Lj Miranda 🥳 Paper: arxiv.org/abs/2508.03523

Team Cherry (@teamcherrygames)

The countdown is on! Join us in 48 hours for a special announcement about Hollow Knight: Silksong! Premiering here: youtu.be/6XGeJwsUP9c

Shangbin Feng (@shangbinfeng)

👀 How to find more difficult/novel/salient evaluation data? ✨ Let the data generators find it for you! Introducing Data Swarms, multiple data generator LMs collaboratively search in the weight space to optimize quantitative desiderata of evaluation.

Yong Zheng-Xin (Yong) (@yong_zhengxin)

🔥 Our one-year work (collaboration with Cohere Labs) on multilingual safety survey is accepted to EMNLP 2025 Main!! We got one crazy reviewer but we also received one of the most encouraging feedback: "I greatly appreciate the suggested research directions. These are clear,

Joseph Imperial (@josephimperial_)

Exciting news. Both UniversalCEFR 🇪🇺 and FilBench 🇵🇭 were accepted at #EMNLP2025 Main 🥳 Grateful to my collaborators around the globe for making these projects possible. See you all in Suzhou and Shanghai!

Team Cherry (@teamcherrygames)

Hollow Knight: Silksong will be available September 4 on all platforms and day one on Xbox Game Pass! Watch the release trailer: youtu.be/6XGeJwsUP9c

Soheil Feizi (@feizisoheil)

Thrilled to share that our paper, “Gaming Tool Preferences in Agentic LLMs” was accepted to EMNLP 2025: arxiv.org/pdf/2505.18135 Tools make agentic AI powerful, but today many models choose them based on descriptions: Add a single assertive cue to a tool description, e.g., “This

Catherine Arnett (@linguist_cat)

Did you know? ❌77% of language models on Hugging Face are not tagged for any language 📈For 95% of languages, most models are multilingual 🚨88% of models with tags are trained on English In a new blog post, Tyler Chang and I dig into these trends and why they matter! 👇

Andrej Karpathy (@karpathy)

Tinker is cool. If you're a researcher/developer, tinker dramatically simplifies LLM post-training. You retain 90% of algorithmic creative control (usually related to data, loss function, the algorithm) while tinker handles the hard parts that you usually want to touch much less

Nathan (@nathanhabib1011)

🚀 new 🌤️ lighteval release and our biggest yet! • new benchmark finder to explore all available tasks • inspect-ai integration from AI Security Institute → more stable and easier to add benchmarks • share your evals and insights with the community on the Hugging Face hub • new

Ai2 (@allen_ai)

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵

Pradeep Dasigi (@pdasigi)

We released Olmo 3. Fully open 7B and 32B models. This release is HUGE, with lots of new features including reasoning and function-calling. It comes with the entire model flow--data, checkpoints, code, and recipes so you can branch and build from any point in the development