Alex Dimakis (@alexgdimakis)'s Twitter Profile
Alex Dimakis

@alexgdimakis

Professor, UC Berkeley | Founder @bespokelabsai

ID: 29178343

Link: https://people.eecs.berkeley.edu/~alexdimakis/ · Joined: 06-04-2009 10:45:43

4.4K Tweets

19.19K Followers

2.2K Following

Alex Dimakis (@alexgdimakis):

It was with great confusion that I just realized, at my old age, that in English the word "sycophant" means "insincere flatterer". In Greek (modern as in ancient) it means "slanderer", so I was confused about GPT-4 spreading slander.

Alex Dimakis (@alexgdimakis):

Great work on Phi-4. This seems to be the best open-weights model for reasoning, beating the previous best, QwQ-32B, even though it's only 14B.

Alex Dimakis (@alexgdimakis):

Really excited to be participating in the UC Berkeley entrepreneurs' mixer event, with amazing researchers, funders, and founders.

Alex Dimakis (@alexgdimakis):

Very cool result: KV-cache compression can be done with compressed sensing, by storing keys and values as sparse combinations of dictionary vectors. Interestingly, the dictionary is universal across inputs (but learned for each model).
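
A minimal sketch of the idea (not the paper's implementation): each cached key or value is approximated as a sparse combination of atoms from a fixed dictionary, so only a few (index, coefficient) pairs need to be stored. The random dictionary, sizes, and sparsity level below are illustrative assumptions; in the paper the dictionary is learned per model.

```python
# Toy sketch of KV-cache compression via sparse coding (illustrative, not the paper's code).
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
d, K, k = 128, 512, 8                      # head dim, dictionary size, sparsity (illustrative)

D = rng.standard_normal((d, K))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms; stands in for the learned dictionary

# Pretend this key really is (close to) a sparse combination of dictionary atoms.
true_idx = rng.choice(K, size=k, replace=False)
key = D[:, true_idx] @ rng.standard_normal(k) + 0.01 * rng.standard_normal(d)

# Encode: orthogonal matching pursuit finds a k-sparse code for the key.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(D, key)
idx = np.flatnonzero(omp.coef_)            # cache only these indices and coefficients

# Decode: reconstruct the key on demand from the sparse code.
key_hat = D[:, idx] @ omp.coef_[idx]
rel_err = np.linalg.norm(key - key_hat) / np.linalg.norm(key)
print(f"cached {len(idx)} numbers instead of {d}; relative error {rel_err:.3f}")
```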

Alex Dimakis (@alexgdimakis):

I thought of DSPy as a prompt optimization tool. But it can optimize the weights of multi-component AI systems too, including GRPO for multi-turn and tool calling; see this very interesting new addition: dspy.GRPO.
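
For context, a multi-component DSPy program looks roughly like the sketch below. The class and field names (SearchThenAnswer, search_query, etc.) are made up for illustration, and the exact way dspy.GRPO is configured to train such a pipeline is not shown; this only sketches the kind of program it would optimize.

```python
# Minimal multi-component DSPy pipeline (names are illustrative, not from the thread).
# Per the tweet, the new dspy.GRPO optimizer can tune such multi-turn / tool-calling
# programs, updating weights rather than only prompts.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))   # model name is just an example

class SearchThenAnswer(dspy.Module):
    """Two-step pipeline: write a search query, then answer using the retrieved context."""
    def __init__(self, search_fn):
        super().__init__()
        self.search_fn = search_fn                               # tool call, e.g. a retriever
        self.gen_query = dspy.ChainOfThought("question -> search_query")
        self.answer = dspy.ChainOfThought("question, context -> answer")

    def forward(self, question):
        query = self.gen_query(question=question).search_query
        context = self.search_fn(query)                          # the multi-turn / tool step
        return self.answer(question=question, context=context)

# program = SearchThenAnswer(search_fn=my_retriever)
# dspy.GRPO (the addition mentioned above) would then optimize this whole pipeline.
```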

Alex Dimakis (@alexgdimakis):

GroupNorm (normalizing groups of channels) considered harmful. It kills the relative means of different channels, as nicely explained here.
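
A quick toy check of the effect (mine, not from the linked post): GroupNorm subtracts a shared mean per group of channels, so two channels sitting at very different levels but in different groups come out looking identical in mean; their relative offset is gone.

```python
# Toy check: GroupNorm normalizes each group of channels with a shared mean/std,
# so the relative means *across* groups are wiped out.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(1, 4, 8, 8) * 0.1
x[:, 0] += 10.0    # group 0: channel 0 sits far above channel 1
x[:, 2] += 100.0   # group 1: channel 2 sits far above everything in group 0
x[:, 3] += 90.0

gn = nn.GroupNorm(num_groups=2, num_channels=4, affine=False)
y = gn(x)

print("channel means before:", x.mean(dim=(0, 2, 3)))   # ~[10, 0, 100, 90]
print("channel means after: ", y.mean(dim=(0, 2, 3)))   # ~[+1, -1, +1, -1]
# Channels 0 and 2 now have the same mean, even though they differed by ~90 before.
```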

Alex Dimakis (@alexgdimakis):

Very interesting paper showing that releasing embeddings of text is almost the same as releasing the text itself. The universality of embedding geometry across different models and datasets still puzzles me.
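
A toy illustration of the leakage (not the paper's inversion method, which reconstructs the text directly): even simple nearest-neighbor matching against a candidate pool reveals which text an "anonymized" embedding came from. The model name and sentences below are arbitrary choices for the demo.

```python
# Toy leakage demo: an embedding alone is enough to identify the underlying text
# given a plausible candidate pool. (The paper goes much further and inverts the
# embedding back into text without candidates.)
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")        # example embedder

corpus = [
    "The patient was diagnosed with type 2 diabetes in March.",
    "Quarterly revenue grew 12% driven by cloud services.",
    "The defendant pleaded not guilty at the arraignment.",
]
secret = "The patient was diagnosed with type 2 diabetes in March."

# Someone "only" releases this vector, not the text.
released_embedding = model.encode([secret])[0]

# An attacker matches it against candidates by cosine similarity.
cand = model.encode(corpus)
cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
q = released_embedding / np.linalg.norm(released_embedding)
print("best match:", corpus[int(np.argmax(cand @ q))])  # recovers the secret sentence
```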

Alex Dimakis (@alexgdimakis):

Incredible progress on multi-turn RL by the Berkeley NovaSky team! They get very good results on Text-to-SQL on the Spider benchmark. The agent learns to explore the database to answer questions very efficiently. Quick highlights: multi-turn RL learns faster and generalizes better.
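
To make "explores the database" concrete, here is a minimal sketch of the multi-turn text-to-SQL pattern (plain Python/sqlite, no RL, and not the NovaSky implementation): the agent spends its first turns on schema-inspection queries, then issues the answering query.

```python
# Minimal sketch of the multi-turn pattern: turns 1-2 explore the schema,
# the final turn answers the question. (Illustrative only; not the NovaSky code.)
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (id INTEGER, name TEXT, country TEXT);
    INSERT INTO singer VALUES (1, 'A', 'USA'), (2, 'B', 'France'), (3, 'C', 'USA');
""")

def run_sql(query):
    """One 'turn': the agent submits SQL and observes the result."""
    return conn.execute(query).fetchall()

# Turn 1: discover which tables exist.
tables = run_sql("SELECT name FROM sqlite_master WHERE type='table'")
# Turn 2: inspect the columns of a promising table.
schema = run_sql("PRAGMA table_info(singer)")
# Turn 3: having seen the schema, answer "How many singers are from the USA?"
answer = run_sql("SELECT COUNT(*) FROM singer WHERE country = 'USA'")
print(tables, [col[1] for col in schema], answer)   # [('singer',)] ['id','name','country'] [(2,)]
```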