Eureka (@eurekates)'s Twitter Profile
Eureka

@eurekates

Engineer
#DataScience

ID: 1894884626

Joined: 22-09-2013 19:45:09

41 Tweets

106 Followers

969 Following

Andrew Ng (@andrewyng)'s Twitter Profile Photo

It is only rarely that, after reading a research paper, I feel like giving the authors a standing ovation. But I felt that way after finishing Direct Preference Optimization (DPO) by Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher Manning, and Chelsea Finn. This
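
Since the tweet names the method without stating its objective, here is a minimal sketch of the DPO loss in PyTorch; the tensor names are illustrative (summed log-probabilities of each full response under the policy and a frozen reference model), not from any official implementation.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO: fit the policy directly to preference pairs, with a frozen
    reference model standing in for an explicit reward model and RL loop.
    Inputs are summed log-probs of each full response given the prompt."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps        # log pi/pi_ref (chosen)
    rejected_logratio = policy_rejected_logps - ref_rejected_logps  # log pi/pi_ref (rejected)
    # -log sigmoid(beta * margin): logistic loss on the implicit reward margin
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```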

Guillaume Champeau (@gchampeau)'s Twitter Profile Photo

Tempted to create a World RSS Feed Day to raise awareness about this species threatened with extinction. We could use the occasion to remind people that RSS feeds exist and that they matter for the ecological balance of the Web.

Hugh Zhang (@hughbzhang)'s Twitter Profile Photo

Data contamination is a huge problem for LLM evals right now. At Scale, we created a new test set for GSM8k *from scratch* to measure overfitting and found evidence that some models (most notably Mistral and Phi) do substantially worse on this new test set compared to GSM8k.
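
A minimal sketch of the measurement this describes, assuming you already have each model's accuracy on the public GSM8k test set and on the freshly written one; the model names and numbers below are made up for illustration.

```python
# Contamination check: compare accuracy on the public benchmark against a
# from-scratch test set drawn from the same distribution. A large positive
# gap is evidence the public set leaked into the model's training data.

def overfit_gap(acc_original: float, acc_fresh: float) -> float:
    """Accuracy drop from the public test set to the held-out rewrite."""
    return acc_original - acc_fresh

models = {
    # model: (accuracy on public GSM8k, accuracy on the fresh rewrite)
    "model_a": (0.82, 0.80),  # small gap: little evidence of contamination
    "model_b": (0.78, 0.66),  # large gap: likely overfit to the public set
}

for name, (orig, fresh) in models.items():
    print(f"{name}: gap = {overfit_gap(orig, fresh):+.2f}")
```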

Chris Olah (@ch402)'s Twitter Profile Photo

I'm really excited about these results for many reasons, but the most important is that we're starting to connect mechanistic interpretability to questions about the safety of large language models.

Hamel Husain (@hamelhusain)'s Twitter Profile Photo

Another example of loaded jargon for LLMs. This should be the poster child of that. We should only be saying the first thing. Talk in plain language

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Awesome and highly useful: FineWeb-Edu 📚👏
High quality LLM dataset filtering the original 15 trillion FineWeb tokens to 1.3 trillion of the highest (educational) quality, as judged by a Llama 3 70B. +A highly detailed paper.

Turns out that LLMs learn a lot better and faster
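
A minimal sketch of the filtering recipe described above, under the assumption that an LLM judge assigns each document an educational-value score and only high scorers are kept. The prompt wording, 0-5 scale, threshold, and the `judge` callable are all illustrative; at 15-trillion-token scale the Llama 3 70B judgments are distilled into a small classifier rather than queried per document.

```python
# LLM-as-judge quality filtering, in the spirit of FineWeb-Edu: rate each
# document's educational value with a strong model, keep the high scorers.

EDU_PROMPT = (
    "Rate the educational value of the following web page on a scale "
    "from 0 (none) to 5 (highly educational). Reply with a single digit.\n\n{doc}"
)

def filter_corpus(docs, judge, threshold=3):
    """Keep docs whose judge-assigned educational score >= threshold."""
    kept = []
    for doc in docs:
        reply = judge(EDU_PROMPT.format(doc=doc))  # e.g. a Llama 3 70B call
        score = int(reply.strip()[0])              # parse the leading digit
        if score >= threshold:
            kept.append(doc)
    return kept
```
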
Niels Rogge (@nielsrogge)'s Twitter Profile Photo

Woah what??? Microsoft just dropped Florence-2 on Hugging Face with an MIT license!! Pretty huge. Florence was initially Microsoft’s internal CLIP model, and they’ve now expanded it to do various tasks like captioning, object detection, OCR, … just by prompting the model
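
A sketch of the "just prompt it" usage pattern, following the general Hugging Face conventions for Florence-2; the image URL is a placeholder, and the exact task tokens and decoding options should be checked against the model card.

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-large"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Placeholder image URL for illustration only.
image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)

# One model, many tasks: the prompt token selects the behavior.
for task in ["<CAPTION>", "<OD>", "<OCR>"]:
    inputs = processor(text=task, images=image, return_tensors="pt")
    ids = model.generate(input_ids=inputs["input_ids"],
                         pixel_values=inputs["pixel_values"],
                         max_new_tokens=256)
    print(task, processor.batch_decode(ids, skip_special_tokens=False)[0])
```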

Reka Juhasz (@juhreka13)'s Twitter Profile Photo

Happy to see our working paper with Shogo Sakabe and David Weinstein (so many years in the making!) out. We examine the role of codifying knowledge in the spread of the Industrial Revolution. A little thread. 1/N

Sabine Hossenfelder (@skdh)'s Twitter Profile Photo

This is amazing. I have never seen a theory paper in physics being retracted, no matter how many mistakes. If they go through with this, it could have a big impact on the field. x.com/WPMcB1997/stat…

Sabine Hossenfelder (@skdh)'s Twitter Profile Photo

Replying to Steve McCormick: It is fairly rare that you can point at an equation and say "this is obviously wrong," as in this case. More often you know they're wrong because an assumption they made disagrees with established results. E.g., I remember a case 15 years ago or so when someone repeated a

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; it's just historical. They are a highly general-purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something. They
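
To make "statistical modeling of token streams" concrete, here is a toy autoregressive sampler over an arbitrary symbol stream, with a bigram counter standing in for the transformer; purely illustrative.

```python
import random
from collections import Counter, defaultdict

# Any discrete token stream: the symbols could be audio codes, game moves,
# DNA bases, ... nothing here is specific to language.
stream = list("ABABCABABCABABC")

# Fit bigram counts, a crude stand-in for p(next_token | context).
counts = defaultdict(Counter)
for cur, nxt in zip(stream, stream[1:]):
    counts[cur][nxt] += 1

def sample_next(cur):
    tokens, freqs = zip(*counts[cur].items())
    return random.choices(tokens, weights=freqs)[0]

# Autoregressive generation: feed each sampled token back in as context.
tok, out = "A", ["A"]
for _ in range(12):
    tok = sample_next(tok)
    out.append(tok)
print("".join(out))
```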

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Multivac, how can the net amount of entropy of the universe be decreased?

I apologize, but as an AI language model I am not able to answer, as reversing entropy is a highly complex, multi-faceted problem. Here is a nuanced look at how leading experts have approached the topic:

The Nobel Prize (@nobelprize)'s Twitter Profile Photo

BREAKING NEWS
The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”

Aditya Gunturu (@adigunturu)'s Twitter Profile Photo

What if you could make physics diagrams come alive? At #UIST2024, we will be presenting our paper, Augmented Physics, an ML-Integrated Authoring Tool for Creating Interactive Physics Simulations from Static Diagrams. Co-authors: Yi Wen, Nandi Zhang, Jarin, Rubaiat Habib, Ryo Suzuki

DeepSeek (@deepseek_ai)'s Twitter Profile Photo

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n

Niels Rogge (@nielsrogge)'s Twitter Profile Photo

Unpopular opinion: benchmarks like these are moving the field in the wrong direction. No, I don't want an AI to be able to memorize (useless?) questions like "How many paired tendons are supported by a sesamoid bone?" in its weights. I want the "intern", as Andrej Karpathy is suggesting

@levelsio (@levelsio)'s Twitter Profile Photo

I'm organizing the

🌟 2025 Vibe Coding Game Jam

Deadline to enter: 25 March 2025, so you have 7 days

- anyone can enter with their game
- at least 80% code has to be written by AI 
- game has to be accessible on web without any login or signup and free-to-play (preferably its

Alex Vacca (@itsalexvacca)'s Twitter Profile Photo

BREAKING: MIT just completed the first brain scan study of ChatGPT users & the results are terrifying.

Turns out, AI isn't making us more productive. It's making us cognitively bankrupt.

Here's what 4 months of data revealed:

(hint: we've been measuring productivity all wrong)