Manuel Faysse (@manuelfaysse) Twitter Tweets • TwiCopy

Manuel Faysse

@manuelfaysse

NLP (LLMs) & ML Privacy -
🥐CroissantLLM
PhD Candidate @CentraleSupelec
Prev: @imperialcollege, @epfl, @La_UPM

ID: 2220306764

Link: https://manuelfay.github.io/
Joined: 28-11-2013 20:22:28

185 Tweets

921 Followers

273 Following

Manuel Faysse

@manuelfaysse

3 months ago

Lots of interesting info on continued pretraining for domain-adapting LLMs in the associated paper!

Likes: 2, replies: 0, reposts: 2

Slides for my talk on NLP by Vision Language Models. personal.ntu.edu.sg/axsun/slides/N… I started with UTF-8, which makes language storage transparent to language processing. Then, LLMs make the traditional NLP pipeline (e.g., POS tagging, parsing, NER) transparent to NLP applications.

Likes: 86, replies: 1, reposts: 22

Manuel Faysse

@manuelfaysse

3 months ago

It was not obvious that pooling image tokens would induce only minimal performance degradation, as it does with text tokens! Large redundancies do exist between some patches (e.g., empty white patches), but we had seen that even those were useful as reasoning buffers, so very exciting results! 🚀

Likes: 11, replies: 2, reposts: 2
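
To make the idea of pooling image tokens concrete, here is a minimal sketch. It assumes the simplest possible scheme, averaging fixed-size groups of consecutive patch vectors; the tweet does not say which pooling method was used, so this is an illustration and not the authors' approach. The function name `mean_pool` and the toy vectors are hypothetical.

```python
def mean_pool(tokens, factor):
    """Average each consecutive group of `factor` token vectors into one vector.

    Assumption: fixed-window mean pooling, chosen only for illustration.
    The last group may be smaller than `factor`.
    """
    pooled = []
    for start in range(0, len(tokens), factor):
        group = tokens[start:start + factor]
        dim = len(group[0])
        pooled.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return pooled


# Toy 2-dimensional "patch embeddings" (made-up values).
patches = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 0.0], [5.0, 5.0]]
pooled = mean_pool(patches, 2)

print(len(patches), "->", len(pooled))  # 5 -> 3
print(pooled)  # [[2.0, 1.0], [1.0, 2.0], [5.0, 5.0]]
```

With a pooling factor of 2, the number of stored vectors is roughly halved, which is where the storage and compute savings would come from.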

Manuel Faysse

@manuelfaysse

3 months ago

Fun ColPali finding of the day: training a LoRA adapter on top of the "mix" version of PaliGemma, then using this adapter with the "pt" base model version, actually leads to better results (+2% DocVQA)! Crazy how many inference-time optimizations exist!

Likes: 9, replies: 1, reposts: 0

Manuel Faysse

@manuelfaysse

2 months ago

Enough people asked, so we obliged: here's the entire ColPali training set: huggingface.co/datasets/vidor… ! We hope this can help bootstrap some ColPali finetuning efforts, and we're eager to see cool work from the community!

Likes: 326, replies: 5, reposts: 48

Manuel Faysse

@manuelfaysse

2 months ago

Super happy that people agree with the main takeaway from our ColPali paper: the future of Document AI is doing everything in vision space, not over-engineering brittle text extraction pipelines!

Likes: 264, replies: 6, reposts: 22

Benjamin Clavié

@bclavie

2 months ago

RAG is increasingly going multi-modal, but document retrieval is tough, and layout gets in your way. But it shouldn't! Introducing 🪤RAGatouille's Vision-equipped, ColPali-powered sibling: 🐭Byaldi. With just a few lines of code, search through documents, with no pre-processing.

Likes: 690, replies: 19, reposts: 106

Jo Kristian Bergum

@jobergum

2 months ago

With 200M Hamming distances per second per CPU core over 128d binary ColPali embeddings, we are ready to tackle billion-scale PDF datasets. Harvesting the power of VLMs. Storage footprint of ColPali with binary embeddings is the same as for 7B embedding models using 4096

Likes: 209, replies: 3, reposts: 25
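
For readers unfamiliar with Hamming distance over binary embeddings, here is a minimal pure-Python sketch. It assumes sign-based binarization (a bit is set when the float component is positive), which is a common choice but is not stated in the tweet. The helper names `binarize` and `hamming` and the toy vectors are hypothetical. A 128-dimensional binary vector packs into 128 bits, that is 16 bytes.

```python
def binarize(vec):
    """Pack sign bits of a float vector into one int (bit i is set if vec[i] > 0).

    Assumption: sign-based binarization, used here only for illustration.
    """
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


# Toy 128-dimensional vectors (made-up values, not real ColPali embeddings).
query = [1.0 if i % 2 == 0 else -1.0 for i in range(128)]
doc_close = list(query)
doc_close[0] = -1.0            # flip the sign of one dimension
doc_far = [-x for x in query]  # flip the sign of every dimension

q, c, f = binarize(query), binarize(doc_close), binarize(doc_far)
print(hamming(q, c))  # 1
print(hamming(q, f))  # 128
```

Production systems would typically do the XOR and popcount with hardware instructions over packed bytes, which is where throughput on the order quoted above would come from; this sketch only shows the logic.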