Mathis Pink (@mathispink)'s Twitter Profile
Mathis Pink

@mathispink

👀🧠(x) | x ∈ {👀🧠,🤖}
PhD student @mpi_sws_
trying to trick rocks into thinking and remembering.

ID: 1299449849637728257

Joined: 28-08-2020 20:53:17

92 Tweets

350 Followers

2.2K Following

Sebastian Michelmann (@s_michelmann)'s Twitter Profile Photo

Excited to share our new preprint bit.ly/402EYEb with Mariya Toneva, Norman Lab, and Manoj Kumar, in which we ask whether GPT-3 (a large language model) can segment narratives into meaningful events similarly to humans. We use an unconventional approach: ⬇️

Mathis Pink (@mathispink)'s Twitter Profile Photo

4/n 💡 We find that neither fine-tuning nor RAG supports episodic memory capabilities well (yet). In-context presentation supports some episodic memory capabilities, but at high cost and with insufficient length generalization, making it a poor candidate for episodic memory!

Omer Moussa (@ohmoussa2)'s Twitter Profile Photo

We are so excited to share the first work that demonstrates consistent downstream improvements for language tasks after fine-tuning with brain data!!
Improving semantic understanding in speech language models via brain-tuning
arxiv.org/abs/2410.09230
W/ Dietrich Klakow, Mariya Toneva

François Fleuret (@francoisfleuret)'s Twitter Profile Photo

Consider the prompt X="Describe a beautiful house." We can consider two processes to generate the answer Y: (A) sample P(Y | X) or, (B) sample an image Z with a conditional image density model P(Z | X) and then sample P(Y | Z). 1/3
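A minimal sketch of the two processes in Python, with hypothetical stand-in samplers (none of this is a real model API; the three sampler functions are toy placeholders for P(Y|X), P(Z|X), and P(Y|Z)):

```python
import random

# Toy placeholders for the conditional samplers; purely illustrative.
def sample_text_given_prompt(x):   # stands in for P(Y | X)
    return random.choice(["A stone cottage by a lake.", "A glass villa on a cliff."])

def sample_image_given_prompt(x):  # stands in for P(Z | X)
    return random.choice(["<image: cottage>", "<image: villa>"])

def sample_text_given_image(z):    # stands in for P(Y | Z)
    return f"A description of {z}."

def generate_direct(x):
    # Process (A): sample the answer directly, Y ~ P(Y | X).
    return sample_text_given_prompt(x)

def generate_via_image(x):
    # Process (B): first sample a latent image Z ~ P(Z | X),
    # then describe it by sampling Y ~ P(Y | Z).
    z = sample_image_given_prompt(x)
    return sample_text_given_image(z)

prompt = "Describe a beautiful house."
print(generate_direct(prompt))
print(generate_via_image(prompt))
```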

Mathis Pink (@mathispink)'s Twitter Profile Photo

x.com/MathisPink/sta… We think this is because LLMs do not have parametric episodic memory (as opposed to semantic memory)! We recently created SORT, a new benchmark task that tests temporal-order memory in LLMs.
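For illustration only, here is one way a temporal-order-memory probe could be constructed (a toy sketch under my own assumptions; the helper name and item format are hypothetical, not the actual SORT implementation):

```python
import random

def make_order_item(text, segment_len=50, seed=0):
    """Toy order-recall item: show a text, then ask which of two
    segments from it appeared earlier."""
    rng = random.Random(seed)
    words = text.split()
    segments = [" ".join(words[i:i + segment_len])
                for i in range(0, len(words), segment_len)]
    a_idx, b_idx = rng.sample(range(len(segments)), 2)
    prompt = (
        "Here is a text:\n" + text + "\n\n"
        "Which of these two segments appeared EARLIER in the text?\n"
        f"(A) {segments[a_idx]}\n(B) {segments[b_idx]}\n"
        "Answer with A or B."
    )
    answer = "A" if a_idx < b_idx else "B"
    return prompt, answer
```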

François Chollet (@fchollet)'s Twitter Profile Photo

Anyway, glad to see that the whole "let's just pretrain a bigger LLM" paradigm is dead. Model size is stagnating or even decreasing, while researchers are now looking at the right problems -- either test-time training or neurosymbolic approaches like test-time search and program synthesis.

Karan Goel (@krandiash)'s Twitter Profile Photo

A few interesting challenges in extending context windows. A model with a big prompt =/= "infinite context" in my mind. 10M tokens of context is not exactly on the path to infinite context. Instead, it requires a streaming model that has an efficient state with fast…
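A minimal sketch of the contrast being drawn, assuming a toy recurrent update (the exponential moving average here is just a placeholder for whatever efficient state update a real streaming architecture would use):

```python
import numpy as np

class StreamingState:
    """Toy streaming model state: fixed size, updated one token at a time."""
    def __init__(self, dim=16, decay=0.99, seed=0):
        self.rng = np.random.default_rng(seed)
        self.dim = dim
        self.decay = decay
        self.embed = {}             # lazily created toy token embeddings
        self.state = np.zeros(dim)  # state size is O(dim), not O(stream length)

    def update(self, token):
        if token not in self.embed:
            self.embed[token] = self.rng.normal(size=self.dim)
        # Constant-time, constant-memory update per token, unlike re-reading
        # an ever-growing prompt on every step.
        self.state = self.decay * self.state + (1 - self.decay) * self.embed[token]
        return self.state

model = StreamingState()
for tok in "ten million tokens is not the same as infinite context".split():
    state = model.update(tok)
print(state.shape)  # the state stays the same size however long the stream gets
```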

Martina Vilas (@martinagvilas)'s Twitter Profile Photo

We will be presenting this 💫 spotlight 💫 paper at #ICLR2025. Come say hi or DM me if you're interested in discussing AI #interpretability in Singapore!

📆 Poster Session 4 (#530)
🕰️ Fri 25 Apr. 3:00-5:30 PM
📝 openreview.net/forum?id=QogcG…
📊 iclr.cc/virtual/2025/p…

Omer Moussa (@ohmoussa2)'s Twitter Profile Photo

🚨 Excited to share our latest work published at Interspeech 2025: "Brain-tuned Speech Models Better Reflect Speech Processing Stages in the Brain"! 🧠🎧
arxiv.org/abs/2506.03832
W/ Mariya Toneva
We fine-tuned speech models directly with brain fMRI data, making them more brain-like. 🧵

Tim Kietzmann (@timkietzmann)'s Twitter Profile Photo

Exciting new preprint from the lab: "Adopting a human developmental visual diet yields robust, shape-based AI vision". A most wonderful case where brain inspiration massively improved AI solutions. Work with Zejin Lu (陆泽金), Sushrut Thorat, and Radoslaw Cichy. arxiv.org/abs/2507.03168