Shom (@shomlined)'s Twitter Profile
Shom

@shomlined

language model | sequence modeling | education | HCI

ID: 1443551154009358338

Joined: 30-09-2021 12:20:19

543 Tweets

275 Followers

1.1K Following

Shom (@shomlined):

Thanks LLM Security for covering our paper! This is a fast-growing field and we want to make sense of the LLM security landscape. Big update to the GitHub repo incoming, with a new taxonomy and more papers covered 😄 #AISafety #jailbreak

Shom (@shomlined):

Tired: transformers capture long-term dependency
Wired: fractals exhibit long-term dependency
Inspired: memory processes and 2D Ising models characterize long-term dependency

Taelin (@victortaelin):

HOC's Fast Discrete Program Search (DPS)

HOC will soon (EOY?) launch an API for our DPS solution. The interface will be simple:
- You give us a set of examples (input/output pairs)
- We'll give you a (Python?) function that models it

And that's it. It will be a universal
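The examples-in, function-out interface can be pictured as a brute-force discrete program search. This is a toy sketch, not HOC's actual method: the primitive op set, search depth, and function names here are all invented for illustration.

```python
from itertools import product

# Toy discrete program search: enumerate compositions of primitive
# integer ops until one reproduces every (input, output) example.
OPS = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def search(examples, max_depth=3):
    """Return the first op sequence (and callable) consistent with all examples."""
    for depth in range(1, max_depth + 1):
        for names in product(OPS, repeat=depth):
            def run(x, names=names):
                for n in names:
                    x = OPS[n](x)
                return x
            if all(run(i) == o for i, o in examples):
                return names, run
    return None

names, fn = search([(1, 4), (2, 6), (3, 8)])  # finds x -> (x + 1) * 2
```

Real DPS would need a far richer op language and a much smarter search order than this exhaustive enumeration, but the contract is the same: examples go in, a program consistent with them comes out.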

LibrAI (@libr_ai):

📝 Please fill in your information to get a free pass before they’re gone; only 3 days left to register! ⬇️ Check the comments for the link to our questionnaire. Let’s meet and talk about innovation, AI, and opportunities! #LibrAI #AI #GITEX #FreePass #GITEX2024 #ExpandNorthStar

Shom (@shomlined):

DeepSeek in Jan 2025 is going through the ChatGPT moment of Dec 2022: servers going down, user base surging, RL techniques making the model rise in performance.

J. AI Research-JAIR (@jair_editor):

New Article: "Against The Achilles' Heel: A Survey on Red Teaming for Generative Models" by Lin, Mu, Zhai, Wang, Wang, Wang, Gao, Zhang, Che, Baldwin, Han, and Li jair.org/index.php/jair…

Xinyu Yang (@xinyu2ml):

We will be presenting "APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding", a novel encoding method that enables:
🚀Pre-caching Contexts for Fast Inference
🐍Re-using Positions for Long Context

Our poster session is located in Hall 3 and Hall 2B,
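A rough sketch of the two bullets above as I read them: encode each retrieved context independently, each starting from position 0, so its key/value cache can be computed once and reused. The cache layout, function names, and fake encoder here are invented for illustration; see the APE paper for the actual method.

```python
import numpy as np

D = 8  # toy hidden size

def encode(tokens, start_pos=0):
    """Stand-in for a transformer encoder: fake key/value vectors
    that depend only on the token id and its *local* position."""
    pos = np.arange(start_pos, start_pos + len(tokens))
    keys = np.outer(tokens, np.ones(D)) + 0.01 * pos[:, None]
    values = 0.5 * np.outer(tokens, np.ones(D))
    return keys, values

# Pre-caching: each context is encoded separately from position 0,
# so its KV cache never depends on which other contexts it will
# later be combined with (positions are re-used across contexts).
contexts = {"doc_a": [3, 1, 4], "doc_b": [1, 5, 9, 2]}
kv_cache = {name: encode(toks, start_pos=0) for name, toks in contexts.items()}

# Query time: concatenate the cached KVs instead of re-encoding the
# concatenated prompt with fresh global positions.
keys = np.concatenate([kv_cache[n][0] for n in contexts])
values = np.concatenate([kv_cache[n][1] for n in contexts])
```

Because every context starts at position 0, adding or swapping a context never invalidates the others' caches, which is what makes the pre-caching fast at inference time.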

Shom (@shomlined):

I didn't play with o3 as much, but judging from my experience with Claude, its love of printing probably stems from having to print out results to be collected and judged in the RL loop. Its abuse of .get("key") and try/except may be caused by an error penalty.
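The defensive patterns the tweet refers to look roughly like this (a contrived example of the style, not actual model output):

```python
# The style in question: blanket .get() defaults plus a broad
# try/except, so malformed input never raises.
def summarize_defensive(record):
    try:
        name = record.get("name", "unknown")
        score = record.get("score", 0)
        return f"{name}: {score}"
    except Exception:
        return "error"

# The plainer version fails loudly on a missing key, which is
# exactly the behaviour an error-penalizing RL loop would train away.
def summarize_plain(record):
    return f'{record["name"]}: {record["score"]}'
```

On an empty dict, summarize_defensive silently returns "unknown: 0" while summarize_plain raises KeyError; a reward signal that only penalizes raised errors would favor the former.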

Shom (@shomlined):

Since its fifth generation, RWKV's main progress (outer-product states, data-dependent decay, and delta rules) has come only after works like RetNet, Mamba, and DeltaNet, with a few adjustments. I respect his efforts in training models, but he could give those works more credit.
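The three mechanisms named in the tweet can be written as one recurrent state update. This is a generic linear-attention sketch, not RWKV's (or DeltaNet's) exact formulation:

```python
import numpy as np

d = 4
rng = np.random.default_rng(0)

def step(S, k, v, decay):
    """One recurrence of a matrix-valued linear-attention state:
    - outer-product state: S stores associations v k^T
    - data-dependent decay: `decay` (a scalar here; per-channel in
      practice) gates how much of the old state survives
    - delta rule: subtract what S already returns for key k, so the
      new value replaces the old one instead of piling onto it."""
    v_old = S @ k
    return decay * S + np.outer(v - v_old, k)

S = np.zeros((d, d))
k = rng.standard_normal(d)
k /= np.linalg.norm(k)          # unit-norm key
v1 = rng.standard_normal(d)
v2 = rng.standard_normal(d)

S = step(S, k, v1, decay=1.0)   # write v1 under key k
S = step(S, k, v2, decay=1.0)   # delta rule overwrites it with v2
```

Reading back with the same unit-norm key now returns v2; without the "- v_old" delta term the state would return v1 + v2 instead.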

Aran Komatsuzaki (@arankomatsuzaki):

I'd like to see Meta build a lean LLM team around Narang, Allen-Zhu, Mike Lewis, Zettlemoyer, and Sukhbaatar and give them all the budget and power.