Paras Stefanopoulos (@stefanopopoulos) 's Twitter Profile
Paras Stefanopoulos

@stefanopopoulos

CTO @parsedlabs | Machine learning how to use twitter

ID: 1586101922842308609

Link: http://parsed.com | Joined: 28-10-2022 21:05:54

18 Tweets

32 Followers

59 Following

Paras Stefanopoulos (@stefanopopoulos) 's Twitter Profile Photo

This experiment is kind of useless. How much edge do you think an LLM has on a market? You may say 1% (it’s definitely negative). Even at 1%, you’ll need 10k+ actions and observations to draw any conclusions.
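A back-of-the-envelope check of that sample-size claim (the numbers below are illustrative assumptions, not from the tweet): if the supposed edge is a 51% vs. 50% win rate, a standard two-sided test at ~95% confidence needs on the order of 10,000 independent observations to distinguish it from noise.

    # Rough sample-size sketch (illustrative assumptions, not from the tweet):
    # how many independent actions are needed to detect a 1% edge in win rate
    # (51% vs. 50%) at ~95% confidence?
    z = 1.96                # two-sided 95% z-score
    p = 0.5                 # baseline win rate
    edge = 0.01             # hypothesised edge (51% win rate)
    variance = p * (1 - p)  # worst-case Bernoulli variance, 0.25

    # Require the confidence-interval half-width to be smaller than the edge:
    # z * sqrt(variance / n) < edge  =>  n > z^2 * variance / edge^2
    n = (z ** 2) * variance / (edge ** 2)
    print(round(n))  # ~9604, i.e. roughly the "10k+ actions" in the tweet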

parsed (@parsedlabs) 's Twitter Profile Photo

Introducing some recent research from the team. Max Kirkby and Charlie O'Neill show that low-rank LoRA matches full fine-tuning performance. A post on what happens when theoretical findings meet real-world production tasks.
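For context on what is being compared, a minimal PyTorch-style sketch of a LoRA layer (a generic illustration of low-rank adaptation; the class and hyperparameters here are my own assumptions, not the implementation from the post): the pretrained weight stays frozen and only a small low-rank update is trained.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen linear layer plus a trainable low-rank update (generic sketch)."""
        def __init__(self, d_in, d_out, r=8, alpha=16):
            super().__init__()
            self.base = nn.Linear(d_in, d_out, bias=False)
            self.base.weight.requires_grad_(False)               # freeze pretrained weight
            self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)   # low-rank down-projection
            self.B = nn.Parameter(torch.zeros(d_out, r))         # up-projection, starts at zero
            self.scale = alpha / r

        def forward(self, x):
            # Base output plus the scaled low-rank correction x A^T B^T
            return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    layer = LoRALinear(768, 768, r=8)
    out = layer(torch.randn(2, 768))   # only A and B receive gradients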

parsed (@parsedlabs) 's Twitter Profile Photo

Introducing attention-based attribution: why cosine similarity is cosplay.

Averaging the right transformer layers yields true attribution from attention, delivering reliable chunk-level auditability with sub-100 ms overhead and lower memory. It even works on a closed model!
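A rough sketch of the general idea (my own illustration; the layer band, pooling, and chunking choices below are assumptions, not parsed's method): pool attention weights over a band of layers and heads, then sum the mass the generated tokens place on each input chunk.

    import torch

    def attention_attribution(attentions, chunk_spans, layers=range(8, 16)):
        """Attribute generated tokens to input chunks by pooling attention weights.

        attentions: list of per-layer tensors [heads, tgt_len, src_len]
                    (e.g. from a model called with output_attentions=True).
        chunk_spans: (start, end) token index pairs marking input chunks.
        layers: which layers to average; mid-to-late layers are a common heuristic.
        """
        # Stack the chosen layers, average over layers and heads -> [tgt_len, src_len]
        stacked = torch.stack([attentions[l] for l in layers])
        pooled = stacked.mean(dim=(0, 1))

        # Sum the attention mass from all generated tokens into each chunk,
        # then normalise so the chunk scores sum to 1.
        scores = torch.stack([pooled[:, s:e].sum() for s, e in chunk_spans])
        return scores / scores.sum()

    # Example: 3 chunks of a 30-token prompt, dummy attention from a 24-layer, 16-head model
    dummy = [torch.rand(16, 10, 30).softmax(-1) for _ in range(24)]
    print(attention_attribution(dummy, [(0, 10), (10, 20), (20, 30)]))
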
parsed (@parsedlabs) 's Twitter Profile Photo

We discovered that teaching models why answers are correct, not just what to output, dramatically improves training efficiency.

By making latent strategies explicit during training (e.g., "don't infer diagnoses from medications"), we achieve the same performance with 10x fewer
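One simple way to read "making latent strategies explicit" (a generic illustration under my own assumptions, not parsed's actual pipeline): fold the strategy into the supervised target, so the model is trained to state the rule before the answer rather than leave it implicit.

    # Illustrative only: fold an explicit strategy into a supervised training target.
    def build_example(question, strategy, answer):
        """Return a (prompt, target) pair where the target states the rule before the answer."""
        prompt = f"Question: {question}\nState your reasoning rule, then answer."
        target = f"Strategy: {strategy}\nAnswer: {answer}"
        return {"prompt": prompt, "target": target}

    example = build_example(
        question="Patient is on metformin. Do they have diabetes?",
        strategy="Don't infer diagnoses from medications; require an explicit diagnosis in the record.",
        answer="Cannot be determined from medication history alone.",
    )
    print(example["target"])
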
Paras Stefanopoulos (@stefanopopoulos) 's Twitter Profile Photo

RGT is available in our platform right now for our customers. Havin' fun, building frontier tech, seeing downstream customers getting real value from OS models and eating kebabs 🔥 We plan on exposing more of our web-app so the public can interact with these methods as well as

parsed (@parsedlabs) 's Twitter Profile Photo

Introducing Lumina. We've built an adaptive evaluation engine that discovers failures and evolves its own outputs, all by iterating with the customer in the loop. Proper evals can only be constructed by “touching grass”, and we think this holds incredible promise for steering
parsed (@parsedlabs) 's Twitter Profile Photo

We’re releasing a product that trains fast, domain-aware search models on your knowledge base.

Drop in your KB and we synthesise data, then use RL with verifiable rewards to train <4B models. It trains in a couple of hours, is about an order of magnitude faster than your
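A minimal sketch of what a "verifiable reward" can mean for a search model (my illustrative assumptions about the setup, not the product's actual reward): because queries are synthesised from the knowledge base, each one carries a known gold passage, so the reward is an exactly checkable function of the retriever's ranking rather than a judgment call.

    # Illustrative verifiable-reward sketch for retrieval RL (assumed setup).
    def verifiable_reward(ranked_passage_ids, gold_id, k=5):
        """1.0 if the known-correct passage is in the top-k, reciprocal-rank
        partial credit if it appears lower, 0.0 otherwise. The gold passage
        comes from synthesised data, so the check is exact."""
        if gold_id in ranked_passage_ids[:k]:
            return 1.0
        if gold_id in ranked_passage_ids:
            return 1.0 / (ranked_passage_ids.index(gold_id) + 1)
        return 0.0

    print(verifiable_reward(["p7", "p2", "p9"], gold_id="p2"))  # 1.0: gold passage in top-5
    print(verifiable_reward(["p7", "p2", "p9"], gold_id="p4"))  # 0.0: gold passage missing
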
Charlie O'Neill (@charles0neill) 's Twitter Profile Photo

parsed has been acquired by Baseten.

Big Token wants you to believe the future is a monoculture: one model to rule everything, one bill to pay forever. Rent the demigod, trust that next month's update will finally solve your problem, and pray that GPT-(n+1) happens to
Justin Mateen (@justin_mateen) 's Twitter Profile Photo

The power of compounding is widely understood. What’s underappreciated is when the value is actually created.

Compounding is continuous, but when you look at it in decade blocks, the pattern becomes obvious. Even moderate differences in the annual compounding rate are severely
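A quick worked example of the decade-block framing (the rates are my own illustration, not the author's figures): at a fixed annual rate each successive decade adds far more absolute value than the last, and a few points of difference in the annual rate open a wide gap over three decades.

    # Illustrative decade-block compounding at two example annual rates.
    for rate in (0.07, 0.10):
        value = 1.0
        print(f"annual rate {rate:.0%}:")
        for decade in range(1, 4):
            new_value = value * (1 + rate) ** 10
            print(f"  decade {decade}: {value:.2f} -> {new_value:.2f} (+{new_value - value:.2f})")
            value = new_value
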
Tuhin Srivastava (@tuhinone) 's Twitter Profile Photo

Baseten’s day 0 bet was that inference was the technology that would enable the best user experiences AI could deliver: fast, smart, reliable, secure. And that those experiences would rely not only on a handful of giant general intelligence models, but millions of specialized

Paras Stefanopoulos (@stefanopopoulos) 's Twitter Profile Photo

OpenClaw w/ Kimi K2.5 is so good... The inference speeds on Baseten are nuts! To really knock your socks off... this "X" was written by yours truly, OpenClaw + Kimi K2.5 😎

Baseten (@basetenco) 's Twitter Profile Photo

LLMs are amnesiacs. Once context fills up, they forget everything. To fight this means grappling with a core question: how do you update a neural network without breaking what it already knows?

In this piece, Charlie O'Neill and Harry Partridge argue that continual learning is