Raphaël Sourty (@raphaelsrty) 's Twitter Profile
Raphaël Sourty

@raphaelsrty

Language Models, Knowledge Bases, Knowledge Distillation PhD | AI @LightonIO

ID: 1262679654239961088

Link: https://github.com/raphaelsty
Joined: 19-05-2020 09:41:23

313 Tweets

395 Followers

633 Following

Antoine Chaffin (@antoine_chaffin) 's Twitter Profile Photo

LightOn joins the OCR mania. We release a 1B model achieving SOTA results while being much faster than all the recent releases. It is also an end-to-end trainable solution for easy adaptation to your specific domains. We also share interesting insights (and soon the dataset!)

Iacopo Poli (@iacopo_poli) 's Twitter Profile Photo

The recipe for a fast, performant OCR model:
1. Tell Said that OCR is solved
2. Let him rage about the state of OCR
3. Get a few smart people in a GMeet with him
4. Tell them there are GPUs available
5. Wait a bit
6. Enjoy 🦉
Soon deployed in your favorite Enterprise environments

Oskar Hallström (@oskar_hallstrom) 's Twitter Profile Photo

The last few days have been insane in OCR land, with releases from DeepSeek, PaddlePaddle and others. Now we at LightOn are entering the game with our latest release, pushing the state of the art even further. Kudos to staghado, Baptiste Aubertin and Adrien Cavaillès 🥳

Daniel van Strien (@vanstriendaniel) 's Twitter Profile Photo

Vibe checks of the new LightOn OCR model on some 20th Century digitised books (from National Library of Scotland). This collection has existing OCR (turns out OCR was a thing before VLMs 😜) Vs the old OCR: Paragraph structure better maintained

Raphaël Sourty (@raphaelsrty) 's Twitter Profile Photo

Talk from my LightOn buddy Amélie Chatelain next week with Weights & Biases in London. A fresh perspective on the cost of LLMs at scale when you feed a decent amount of context into their prompts. I won't spoil the moral of the story, but retrieval might be helpful there.

staghado (@staghado) 's Twitter Profile Photo

A Halloween gift 🎃
New finetuning notebook for LightOnOCR-1B:
• Supports both Full and LoRA training
• Supports FineVision 🤗 subsets incl. OlmOCR-mix & handwritten IAM
• ~12 min/epoch on one H100 (can also run on Colab!)

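The LoRA option mentioned above trains small low-rank adapter matrices instead of the full weight matrices. This is a minimal NumPy sketch of the idea only (not the notebook's actual code); the dimensions, rank, and scaling factor are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretrained weight matrix (frozen during LoRA fine-tuning).
d_out, d_in = 64, 64
W = rng.standard_normal((d_out, d_in))

# Low-rank adapter: only A and B are trained. Rank r and scaling
# alpha are hypothetical values, not taken from the notebook.
r, alpha = 8, 16
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))  # B starts at zero, so the adapter is a no-op initially

def forward(x):
    # Effective weight: W + (alpha / r) * B @ A
    return x @ (W + (alpha / r) * (B @ A)).T

x = rng.standard_normal((1, d_in))
# With B = 0, the adapted model matches the frozen model exactly.
assert np.allclose(forward(x), x @ W.T)

# Trainable-parameter savings versus full fine-tuning:
full_params = W.size           # 64 * 64 = 4096
lora_params = A.size + B.size  # 8 * 64 + 64 * 8 = 1024
print(f"trainable params: {lora_params} vs {full_params}")
```

Because only `A` and `B` receive gradients, the optimizer state and checkpoint size shrink accordingly, which is what makes LoRA runs feasible on a single H100 or a Colab GPU.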