Mathis Pink (@mathispink)'s Twitter Profile
Mathis Pink

@mathispink

👀🧠(x) | x ∈ {👀🧠,🤖}
PhD student @mpi_sws_
trying to trick rocks into thinking and remembering.

ID: 1299449849637728257

Joined: 28-08-2020 20:53:17

92 Tweets

350 Followers

2.2K Following

Sebastian Michelmann (@s_michelmann)'s Twitter Profile Photo

Excited to share our new preprint bit.ly/402EYEb with Mariya Toneva, Norman Lab, and Manoj Kumar, in which we ask whether GPT-3 (a large language model) can segment narratives into meaningful events similarly to humans. We use an unconventional approach: ⬇️

Mathis Pink (@mathispink)'s Twitter Profile Photo

4/n 💡 We find that neither fine-tuning nor RAG supports episodic memory capabilities well (yet). In-context presentation supports some episodic memory capabilities, but at high cost and with insufficient length generalization, making it a poor candidate for episodic memory!

Omer Moussa (@ohmoussa2)'s Twitter Profile Photo

We are so excited to share the first work that demonstrates consistent downstream improvements for language tasks after fine-tuning with brain data!!
Improving semantic understanding in speech language models via brain-tuning
arxiv.org/abs/2410.09230
W/ Dietrich Klakow, Mariya Toneva

François Fleuret (@francoisfleuret)'s Twitter Profile Photo

Consider the prompt X="Describe a beautiful house." We can consider two processes to generate the answer Y: (A) sample P(Y | X) or, (B) sample an image Z with a conditional image density model P(Z | X) and then sample P(Y | Z). 1/3
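A minimal sketch of the two processes in Python, with hypothetical stand-in samplers (none of this is a real model API; the three sampler functions are toy placeholders for P(Y|X), P(Z|X), and P(Y|Z)):

```python
import random

# Toy placeholders for the conditional samplers; purely illustrative.
def sample_text_given_prompt(x):   # stands in for P(Y | X)
    return random.choice(["A stone cottage by a lake.", "A glass villa on a cliff."])

def sample_image_given_prompt(x):  # stands in for P(Z | X)
    return random.choice(["<image: cottage>", "<image: villa>"])

def sample_text_given_image(z):    # stands in for P(Y | Z)
    return f"A description of {z}."

def generate_direct(x):
    # Process (A): sample the answer directly, Y ~ P(Y | X).
    return sample_text_given_prompt(x)

def generate_via_image(x):
    # Process (B): first sample a latent image Z ~ P(Z | X),
    # then describe it by sampling Y ~ P(Y | Z).
    z = sample_image_given_prompt(x)
    return sample_text_given_image(z)

prompt = "Describe a beautiful house."
print(generate_direct(prompt))
print(generate_via_image(prompt))
```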

Mathis Pink (@mathispink)'s Twitter Profile Photo

x.com/MathisPink/sta… We think this is because LLMs do not have parametric episodic memory (as opposed to semantic memory)! We recently created SORT, a new benchmark task that tests temporal-order memory in LLMs.
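For illustration only, here is one way a temporal-order-memory probe could be constructed (a toy sketch under my own assumptions; the helper name and item format are hypothetical, not the actual SORT implementation):

```python
import random

def make_order_item(text, segment_len=50, seed=0):
    """Toy order-recall item: show a text, then ask which of two
    segments from it appeared earlier."""
    rng = random.Random(seed)
    words = text.split()
    segments = [" ".join(words[i:i + segment_len])
                for i in range(0, len(words), segment_len)]
    a_idx, b_idx = rng.sample(range(len(segments)), 2)
    prompt = (
        "Here is a text:\n" + text + "\n\n"
        "Which of these two segments appeared EARLIER in the text?\n"
        f"(A) {segments[a_idx]}\n(B) {segments[b_idx]}\n"
        "Answer with A or B."
    )
    answer = "A" if a_idx < b_idx else "B"
    return prompt, answer
```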

François Chollet (@fchollet)'s Twitter Profile Photo

Anyway, glad to see that the whole "let's just pretrain a bigger LLM" paradigm is dead. Model size is stagnating or even decreasing, while researchers are now looking at the right problems -- either test-time training or neurosymbolic approaches like test-time search and program synthesis.

Karan Goel (@krandiash)'s Twitter Profile Photo

A few interesting challenges in extending context windows. A model with a big prompt =/= "infinite context" in my mind. 10M tokens of context is not exactly on the path to infinite context. Instead, it requires a streaming model that has an efficient state with fast…
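A minimal sketch of the contrast being drawn, assuming a toy recurrent update (the exponential moving average here is just a placeholder for whatever efficient state update a real streaming architecture would use):

```python
import numpy as np

class StreamingState:
    """Toy streaming model state: fixed size, updated one token at a time."""
    def __init__(self, dim=16, decay=0.99, seed=0):
        self.rng = np.random.default_rng(seed)
        self.dim = dim
        self.decay = decay
        self.embed = {}             # lazily created toy token embeddings
        self.state = np.zeros(dim)  # state size is O(dim), not O(stream length)

    def update(self, token):
        if token not in self.embed:
            self.embed[token] = self.rng.normal(size=self.dim)
        # Constant-time, constant-memory update per token, unlike re-reading
        # an ever-growing prompt on every step.
        self.state = self.decay * self.state + (1 - self.decay) * self.embed[token]
        return self.state

model = StreamingState()
for tok in "ten million tokens is not the same as infinite context".split():
    state = model.update(tok)
print(state.shape)  # the state stays the same size however long the stream gets
```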

Martina Vilas (@martinagvilas)'s Twitter Profile Photo

We will be presenting this 💫 spotlight 💫 paper at #ICLR2025. Come say hi or DM me if you're interested in discussing AI #interpretability in Singapore!

📆 Poster Session 4 (#530)
🕰️ Fri 25 Apr. 3:00-5:30 PM
📝 openreview.net/forum?id=QogcG…
📊 iclr.cc/virtual/2025/p…

Omer Moussa (@ohmoussa2)'s Twitter Profile Photo

🚨 Excited to share our latest work published at Interspeech 2025: "Brain-tuned Speech Models Better Reflect Speech Processing Stages in the Brain"! 🧠🎧
arxiv.org/abs/2506.03832
W/ Mariya Toneva
We fine-tuned speech models directly with brain fMRI data, making them more brain-like. 🧵

Tim Kietzmann (@timkietzmann)'s Twitter Profile Photo

Exciting new preprint from the lab: "Adopting a human developmental visual diet yields robust, shape-based AI vision". A most wonderful case where brain inspiration massively improved AI solutions. Work with Zejin Lu (陆泽金), Sushrut Thorat, and Radoslaw Cichy. arxiv.org/abs/2507.03168