Sinclair Wang (@sinclairwang1) Twitter Tweets • TwiCopy

Luca Soldaini ✈️ ICLR 25

3 months ago

Emad Nathan Lambert Processing all of CommonCrawl is about $20-50k [0], plus maybe 10-50k H100 if you wanna do GPU classification [1]. You can extract 1T tokens from PDFs for around $10k [2]. The major expenses are synthetic data and verifying which of your approaches work [3].

Likes: 21 · Replies: 2 · Reposts: 1

Sinclair Wang

@sinclairwang1

3 months ago

Couldn't agree more!

Likes: 3 · Replies: 0 · Reposts: 0

Fan Zhou✈️ICLR2025

@fazhou_998

3 months ago

MegaMath has been accepted to Conference on Language Modeling 2025🥳 Hoping you find our data useful!

Likes: 81 · Replies: 0 · Reposts: 10

Kimi.ai

@kimi_moonshot

3 months ago

🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models 🔹Strong in coding and agentic tasks 🐤 Multimodal & thought-mode not supported for now With Kimi K2, advanced agentic intelligence

Likes: 3.3K · Replies: 158 · Reposts: 614

Yulun Du

@yulun_du

3 months ago

Capable, Agentic, and Open-sourced. Kimi K2 excels in knowledge, math, and coding, and is optimized for complex tool use. See how it can analyze data, generate interactive webpages, and more. Explore what's possible and start building today!

Likes: 52 · Replies: 1 · Reposts: 6

Shiyu Ni

@shictyu

3 months ago

🥳Happy to share that our paper "Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception" has been accepted by #ACL2025! We explore leveraging LLMs' internal states to improve their knowledge boundary perception from efficiency and risk perspectives.

Likes: 11 · Replies: 4 · Reposts: 9

Sinclair Wang

@sinclairwang1

3 months ago

Excited to share that our two papers have been accepted to #ICML2025 (ICML Conference)! However, I can't be there in person due to visa issues. What a pity.🥲

Feel free to check out our posters, either online or in person at the Vancouver Convention Center.

Programming Every Example:

Likes: 37 · Replies: 0 · Reposts: 6

Sinclair Wang

@sinclairwang1

3 months ago

Likes: 24 · Replies: 1 · Reposts: 4

AK

@_akhaliq

3 months ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Likes: 153 · Replies: 5 · Reposts: 26

Run-Ze Fan

@vfrz525_

3 months ago

🚨 New release: MegaScience The largest & highest-quality post-training dataset for scientific reasoning is now open-sourced (1.25M QA pairs)! 📈 Trained models outperform official Instruct baselines 🔬 Covers 7+ disciplines with university-level textbook-grade QA 📄 Paper:

Likes: 257 · Replies: 3 · Reposts: 52

Run-Ze Fan

@vfrz525_

3 months ago

When building MegaScience, we learned the hard way: 📈 Strong datasets need strong proxy models. Our data was too spicy 🌶️ for small models like Qwen2.5-1.5B & 3B—they just flopped. But once we tried Qwen3-14B and 30B… boom 💥, everything clicked. Kinda terrifying to think: if

Likes: 17 · Replies: 0 · Reposts: 3

Run-Ze Fan

@vfrz525_

2 months ago

🚀 In its first week, MegaScience ranks #4 on HuggingFace's Trending Datasets of the Week with 3.74k downloads! Thanks for the support — let’s keep pushing open science forward! 📷🌍

Likes: 8 · Replies: 0 · Reposts: 1

Sebastian Raschka

@rasbt

2 months ago

Next to Qwen3 of comparable size: Looks like gpt-oss is a wide (vs deep) model

Likes: 1.1K · Replies: 29 · Reposts: 255

Feng Yao

@fengyao1909

2 months ago

Failing on 𝐥𝐚𝐫𝐠𝐞-𝐬𝐜𝐚𝐥𝐞 𝐑𝐋 with VeRL? ⚠️ Mixing inference backend (𝐯𝐋𝐋𝐌/𝐒𝐆𝐋𝐚𝐧𝐠) with training backends (𝐅𝐒𝐃𝐏/𝐌𝐞𝐠𝐚𝐭𝐫𝐨𝐧) 𝐬𝐞𝐜𝐫𝐞𝐭𝐥𝐲 𝐭𝐮𝐫𝐧𝐬 𝐲𝐨𝐮𝐫 𝐑𝐋 𝐢𝐧𝐭𝐨 𝐨𝐟𝐟-𝐩𝐨𝐥𝐢𝐜𝐲 — even if they share the same weights! 📉 Blog:

Likes: 461 · Replies: 5 · Reposts: 69
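One way to picture the claim in the tweet above: if the inference backend and the training backend assign even slightly different log-probabilities to the same sampled tokens, the per-token importance ratio is no longer exactly 1, which is what "off-policy" means here. The snippet below is only a sketch under that reading. The log-probability values are made up for illustration, and `importance_ratios` is a hypothetical helper, not part of VeRL, vLLM, SGLang, FSDP, or Megatron.

```python
import math


def importance_ratios(trainer_logprobs, sampler_logprobs):
    """Per-token ratio pi_trainer(token) / pi_sampler(token), from log-probs."""
    return [math.exp(t - s) for t, s in zip(trainer_logprobs, sampler_logprobs)]


# Made-up log-probs for the same three sampled tokens (illustrative only).
sampler_logprobs = [-0.105, -1.204, -0.357]  # as scored by the inference backend
trainer_logprobs = [-0.105, -1.150, -0.400]  # as scored by the training backend

ratios = importance_ratios(trainer_logprobs, sampler_logprobs)

# On-policy data would give a ratio of 1 for every token.
is_on_policy = all(abs(r - 1.0) < 1e-6 for r in ratios)

print("ratios:", [round(r, 4) for r in ratios])
print("on-policy:", is_on_policy)
```

With identical log-probabilities on both sides every ratio is 1; any numerical gap between the two backends shows up as a ratio above or below 1.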

Fan Zhou✈️ICLR2025

@fazhou_998

2 months ago

1. npx @qwen-code/[email protected]
2. get 2000 free calls/day via Qwen Chat

quick math: let's suppose avg agentic interaction ≈ 32k context; 2000 × 32k ≈ 64 million tokens/day

Likes: 104 · Replies: 3 · Reposts: 10
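The quick math in the tweet above can be checked in a couple of lines. This is just the tweet's own back-of-the-envelope estimate, assuming "32k" means 32,000 tokens per call and that every one of the 2000 daily calls uses the full context; the variable names are illustrative.

```python
calls_per_day = 2000        # free calls per day, as stated in the tweet
tokens_per_call = 32_000    # assumption: "32k context" read as 32,000 tokens

tokens_per_day = calls_per_day * tokens_per_call

print(f"{tokens_per_day:,} tokens/day")
print(f"{tokens_per_day / 1_000_000:.0f} million tokens/day")
```

If "32k" were read as 32,768 tokens instead, the total would be about 65.5 million, so the tweet's "≈ 64 million" holds either way as a rough figure.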

Tianbao Xie

@tianbaox

2 months ago

🚀 OSWorld gets a major upgrade! OSWorld-Verified: 15 months of community feedback → 300+ fixes (ambiguity, graders…), 50x faster eval through AWS parallelization. More apples-to-apples comparison for reliable CUA evaluation ✨ 👇xlang.ai/blog/osworld-v…

Likes: 134 · Replies: 7 · Reposts: 29

机器之心 JIQIZHIXIN

@synced_global

2 months ago

Wow, this is really cool! This research answers the question: what if your computer-use AI was not a black box? OpenCUA: Open Foundations for Computer-Use Agents. Researchers from HKU, Moonshot AI, and others present OpenCUA, a fully open-source framework for building and

Likes: 156 · Replies: 5 · Reposts: 44