search founder (@n0riskn0r3ward) Twitter Tweets • TwiCopy

search founder

@n0riskn0r3ward

Solo entrepreneur passionate about AI and search tech. Building a niche search product and sharing what I learn along the way.

ID: 1539267575611285504

21-06-2022 15:23:35

3.3K Tweets

2.2K Followers

1.1K Following

wh

@nrehiew_

2 months ago

New post! This time, about the current state of Long Context Evaluation. I discuss existing benchmarks, what makes a good long context eval, what's missing from existing ones and introduce a new one - LongCodeEdit :)

504 likes · 13 replies · 43 retweets

search founder

@n0riskn0r3ward

a month ago

Had never really tried formal topic modeling before, just read about it a bit. Tried BERTopic today with a real dataset. Might be a skill issue but the results were terrible, would not recommend. Now to play with “recursive language modeling” for this task…

8 likes · 2 replies · 0 retweets

search founder

@n0riskn0r3ward

a month ago

The recursive language modeling approach was about 100x more effective… Granted I modified the prompts to help steer things the direction I wanted it to go. But that’s also part of the beauty of the RLM approach vs traditional topic modeling.

11 likes · 2 replies · 0 retweets

search founder

@n0riskn0r3ward

a month ago

For a third test I tried something even simpler - what if I just point Codex to a parquet file with the 10k raw records (that I set up the RLM with in the Python REPL), and ask it to attempt to build a topic model. Turns out that also works. Not meant to be an apples to apples

14 likes · 1 reply · 0 retweets

search founder

@n0riskn0r3ward

18 days ago

Liked this honest take on prompt optimizers bc the tone reminds me of the parts of academic discourse I enjoyed most. In its better moments it welcomes open debate of ideas and is honest about the unsolved parts of the problem. The slide in the screenshot makes the key point IMO.

23 likes · 1 reply · 6 retweets

search founder

@n0riskn0r3ward

17 days ago

If OpenAI releases a crypto coin I’ma be unreasonably upset. Pls pls pls no.

1 like · 1 reply · 0 retweets

Alex Albert

@alexalbert__

16 days ago

Tool Search Tool: Instead of loading all tool definitions upfront, Claude discovers tools on-demand. Mark tools with defer_loading: true and only pay tokens for tools Claude actually needs. Up to an 85% token reduction and a big boost in accuracy on our MCP evals (79.5% to 88.1%)
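
To make the deferred-loading idea concrete, here is a toy sketch in Python. It is not the actual Claude API or Anthropic's implementation: the `Tool` class, the keyword-matching `search_tools` helper, the word-count "token" estimate, and the sample tools are all hypothetical. Only the `defer_loading` flag name comes from the tweet. The sketch just shows why keeping deferred tool definitions out of the initial context, and pulling in only the ones a search finds, could reduce what gets loaded.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    description: str
    defer_loading: bool = False  # flag name taken from the tweet; semantics here are assumed


def estimate_tokens(tool: Tool) -> int:
    # Crude stand-in for a tokenizer: count whitespace-separated words.
    return len(f"{tool.name} {tool.description}".split())


def initial_context(tools: list[Tool]) -> list[Tool]:
    # Only tools that are NOT deferred are loaded upfront.
    return [t for t in tools if not t.defer_loading]


def search_tools(tools: list[Tool], query: str) -> list[Tool]:
    # Hypothetical keyword search over deferred tools only:
    # a tool matches if its description shares any word with the query.
    words = set(query.lower().split())
    return [
        t for t in tools
        if t.defer_loading and words & set(t.description.lower().split())
    ]


# Made-up tool catalogue for illustration.
TOOLS = [
    Tool("get_weather", "Return the current weather for a city"),
    Tool("create_invoice", "Create a billing invoice for a customer", defer_loading=True),
    Tool("refund_payment", "Refund a customer payment", defer_loading=True),
    Tool("send_email", "Send an email message to a recipient", defer_loading=True),
]

upfront = initial_context(TOOLS)
found = search_tools(TOOLS, "refund the payment")

all_tokens = sum(estimate_tokens(t) for t in TOOLS)
loaded_tokens = sum(estimate_tokens(t) for t in upfront + found)

print("loaded upfront:", [t.name for t in upfront])
print("discovered on demand:", [t.name for t in found])
print("estimated tokens, everything upfront vs. deferred:", all_tokens, loaded_tokens)
```

In this toy setup only the non-deferred tool and the single matching deferred tool end up in context. The real feature presumably uses a much better search than word overlap, so treat the matching logic as a placeholder.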