Dharmik Gohil
@dharmikgohil_
22 | Backend SE | AWS & ELK Stack | TDD, Clean Architecture & OOP | TypeScript, Node.js & Express | Passionate About System Design & Distributed Systems
ID: 1387983278305513477
https://leetcode.com/dharmikgohil/ 30-04-2021 04:13:08
209 Tweets
66 Followers
417 Following
you can have the best LLM, the best vector database, and perfect prompt engineering. if your embedding model is wrong, your RAG system will still underperform. embeddings are the invisible layer most teams get wrong. an embedding model translates language
There are two ways to parse a PDF. one of them actually works. so let me do it quickly. Traditional Parsing: tools like pypdf read the X/Y coordinates of every character on the page then guess structure from position. characters close horizontally? that's a
There is a free and open source alternative to Screen Studio btw. It's called recordly.dev
most people think web scraping is the easy part of RAG. it's not. it's where most pipelines silently break. 6 problems nobody talks about: 1. modern websites don't serve content in the HTML 2. 80% of a webpage