Lj Miranda (@ljvmiranda) Twitter Tweets • TwiCopy

Blaise Cruz

8 months ago

Proud and excited to finally release FilBench, the first Open LLM Eval suite and leaderboard for Filipino! Blog post: huggingface.co/blog/filbench GitHub: github.com/filbench/filbe… Leaderboard: huggingface.co/spaces/UD-Fili… Paper: arxiv.org/abs/2508.03523

Likes: 25 · Replies: 1 · Reposts: 6

Joseph Imperial

@josephimperial_

8 months ago

FilBench is out! These are exciting times for doing NLP in Filipino (and other Philippine languages)! 🇵🇭 Led by the amazing Lj Miranda 🥳 Paper: arxiv.org/abs/2508.03523

Likes: 13 · Replies: 0 · Reposts: 1

Team Cherry

@teamcherrygames

8 months ago

The countdown is on! Join us in 48 hours for a special announcement about Hollow Knight: Silksong! Premiering here: youtu.be/6XGeJwsUP9c

Likes: 113K · Replies: 3K · Reposts: 29K

Shangbin Feng

@shangbinfeng

8 months ago

👀 How to find more difficult/novel/salient evaluation data? ✨ Let the data generators find it for you! Introducing Data Swarms, multiple data generator LMs collaboratively search in the weight space to optimize quantitative desiderata of evaluation.

Likes: 114 · Replies: 2 · Reposts: 16

Yong Zheng-Xin (Yong)

@yong_zhengxin

8 months ago

🔥 Our one-year work (collaboration with Cohere Labs) on multilingual safety survey is accepted to EMNLP 2025 Main!! We got one crazy reviewer but we also received one of the most encouraging feedback: "I greatly appreciate the suggested research directions. These are clear,

Likes: 132 · Replies: 11 · Reposts: 13

Joseph Imperial

@josephimperial_

8 months ago

Exciting news. Both UniversalCEFR 🇪🇺 and FilBench 🇵🇭 were accepted at #EMNLP2025 Main 🥳 Grateful to my collaborators around the globe for making these projects possible. See you all in Suzhou and Shanghai!

Likes: 45 · Replies: 3 · Reposts: 4

Cohere Labs

@cohere_labs

7 months ago

Congrats Lj and team on the release of FilBench, we're excited to see what our grants program makes possible ✨

Likes: 19 · Replies: 0 · Reposts: 3

Team Cherry

@teamcherrygames

7 months ago

Hollow Knight: Silksong will be available September 4 on all platforms and day one on Xbox Game Pass! Watch the release trailer: youtu.be/6XGeJwsUP9c

Likes: 53K · Replies: 1K · Reposts: 17K

Soheil Feizi

@feizisoheil

7 months ago

Thrilled to share that our paper, “Gaming Tool Preferences in Agentic LLMs” was accepted to EMNLP 2025: arxiv.org/pdf/2505.18135 Tools make agentic AI powerful, but today many models choose them based on descriptions: Add a single assertive cue to a tool description, e.g., “This

Likes: 24 · Replies: 0 · Reposts: 7

Lj Miranda

@ljvmiranda

7 months ago

Even the OEC at POEA / OWWA should be investigated 🤣

Likes: 7 · Replies: 0 · Reposts: 0

Catherine Arnett

@linguist_cat

7 months ago

Did you know? ❌77% of language models on Hugging Face are not tagged for any language 📈For 95% of languages, most models are multilingual 🚨88% of models with tags are trained on English In a new blog post, Tyler Chang and I dig into these trends and why they matter! 👇

Likes: 25 · Replies: 2 · Reposts: 4

Andrej Karpathy

@karpathy

6 months ago

Tinker is cool. If you're a researcher/developer, tinker dramatically simplifies LLM post-training. You retain 90% of algorithmic creative control (usually related to data, loss function, the algorithm) while tinker handles the hard parts that you usually want to touch much less

Likes: 1K · Replies: 41 · Reposts: 181

Nathan

@nathanhabib1011

5 months ago

🚀 new 🌤️ lighteval release and our biggest yet! • new benchmark finder to explore all available tasks • inspect-ai integration from AI Security Institute → more stable and easier to add benchmarks • share your evals and insights with the community on the Hugging Face hub • new

Likes: 14 · Replies: 1 · Reposts: 11

Ai2

@allen_ai

5 months ago

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵

Likes: 1K · Replies: 47 · Reposts: 296

Pradeep Dasigi

@pdasigi

5 months ago

We released Olmo 3. Fully open 7B and 32B models. This release is HUGE, with lots of new features including reasoning and function-calling. It comes with the entire model flow--data, checkpoints, code, and recipes so you can branch and build from any point in the development

Likes: 20 · Replies: 3 · Reposts: 2