Christy Bergman (@cbergman) Twitter Tweets • TwiCopy

Towards Data Science

2 years ago

Learn how to adopt RAG best practices by incorporating evaluations into your pipeline: Christy Bergman covers the ins and outs of optimizing chunkings, embeddings, and more. buff.ly/3LjysTk

Had a blast creating this #MultiModal tutorial/demo! Used fun mix of tools for awesome results! 💡✨ 🖼️Milvus for the #VectorDatabase 🧠 a tiny Clip #EmbeddingModel by Ash Vardanian 🤖 @chatgpt4o as the #LLM. Check it out! ➡️ github.com/christy/Zilliz… #AI #MachineLearning

Christy Bergman

@cbergman

2 years ago

Thanks Towards Data Science for the reshare! Iterate to find the best RAG combinations by: Changing the Chunking Strategy 📦 Changing the Embedding Model 📷 Changing the LLM Model 📷 I made a video youtube.com/watch?v=BzZLyP… Thanks to Greg Kamradt for the original chunking article!

Zilliz

@zilliz_universe

2 years ago

Monday Meetup is right around the corner! 🗣 Join us in SF on August 5 for exciting talks: 🔢 Using Ray Data for Multimodal Embedding Inference with Christy Bergman 📐 A Different Angle: Retrieval Optimized Embedding Models Marqo 🛠 Building the Future of Neural Search: How to

The AI Conference

@aiconference

2 years ago

🌟Join our expert panel at The AI Conference 2024 to explore advanced RAG (Retrieval-Augmented Generation) techniques. Learn how integrating information retrieval with generative models is revolutionizing AI, making it more contextually rich and useful in real-world

Christy Bergman

@cbergman

2 years ago

Interesting take-down how to do LoRA properly, quickly, with less memory, on all layers Daniel Han's tweet and blog unsloth.ai/blog/contpretr… ! > For continued pretraining, I advise people to train on all layers (inc gate) + lm_head, embed_tokens, use RS LoRA, use rank>=256

swyx

@swyx

2 years ago

CUDA MODE hackathon today! Here's Andrej Karpathy on the 🏖️ origin story of llm.c, and what it hints at for the fast, simple, llm-compiled future of custom software.

Christy Bergman

@cbergman

2 years ago

I just tried this hack. Thanks, I really needed that! 😂

Christy Bergman

@cbergman

2 years ago

Nice to meet and chat w/you too! Adam Seligman Felipe Hoffa It was fun to get some hands-on time and see what's new with Amazon Web Services Bedrock.

Christy Bergman

@cbergman

a year ago

Interesting! The most common inference quantization int8/fp8 is not necessarily the best. bf16 #quantization is a way better accuracy/latency tradeoff.

Christy Bergman

@cbergman

a year ago

Thanks paco nathan ! I'd better get started preparing my talk for that! #SonomaAI #FoodWineAI

Sam Altman

@sama

a year ago

GPT-4.5 is ready! good news: it is the first model that feels like talking to a thoughtful person to me. i have had several moments where i've sat back in my chair and been astonished at getting actually good advice from an AI. bad news: it is a giant, expensive model. we

DeepSeek

@deepseek_ai

a year ago

🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks. ⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster ⚡ 3.66 TiB/min

Towards Data Science

@tdatascience

a year ago

Thankfully Christy Bergman's article can help you identify key convos with an AI hack to perform semantic clustering simply by prompting LLMs! towardsdatascience.com/tutorial-seman…

Christy Bergman

@cbergman

10 months ago

Don't🍷about #OOM running out of memory! Hugging Face is making it easier to run huge #TransformerandDiffuser models on consumer GPUs w quantization, tensor parallelism, offloading. Hear from Steven Liu how to fit these models on your setup. lu.ma/taf3lmvj #HuggingFace

Christy Bergman

@cbergman

5 months ago

💓Andrew Ng Note to self: look here before next CFP submission or helping others. Ask the model to summarize best advice per conference CFP rules and topic submitter wants to talk about...
