Yiying Zhang (@yiying__zhang) Twitter Tweets • TwiCopy

Yiying Zhang

@yiying__zhang

Founder and CEO of GenseeAI, Associate Professor of Computer Science at UCSD. LLM serving, AI Workflows, Agents

ID: 936289743511441408

Joined: 30-11-2017 17:44:05

31 Tweets

1.1K Followers

139 Following

Yizhou Shan

4 years ago

Clio is a hardware-based (FPGA) memory disaggregation solution with a new virtual memory system, a customized transport, and a framework for computation offloading. [2/2]

Likes: 7, Replies: 0, Retweets: 1

Yiying Zhang

3 years ago

The Third Workshop on Resource Disaggregation and Serverless Computing (WORDS'22) will take place on 11/17/2022. Consider submitting your new or published work! The paper submission deadline is 9/29/2022. More info can be found at wordsworkshop.org

Likes: 12, Replies: 0, Retweets: 2

Yiying Zhang

3 years ago

The WORDS'22 deadline has been extended to 10/6. Consider submitting your new work (<= 5 pages) or published work (<= 2-page abstract).

Likes: 4, Replies: 1, Retweets: 2

Yiying Zhang

3 years ago

WORDS’22 (Workshop on Resource Disaggregation and Serverless) will happen on Nov 17th, both in person in San Diego, CA and virtually. Registration is free for both options! Check out the program and register here: wordsworkshop.org

Likes: 18, Replies: 0, Retweets: 8

Yiying Zhang

2 years ago

APSys paper registration due in three days!

Likes: 3, Replies: 0, Retweets: 0

Yiying Zhang

2 years ago

4th Workshop on Resource Disaggregation and Serverless Computing (co-located with SOSP’23). wordsworkshop.org Submissions open, deadline 7/16! Soliciting 5-page workshop papers and 2-page abstracts of recently published works on resource disaggregation/serverless computing.

Likes: 27, Replies: 0, Retweets: 8

Yiying Zhang

a year ago

Today, LLMs are constantly being augmented with tools, agents, models, RAG, etc. We built InferCept [ICML'24], the first serving framework designed for augmented LLMs. InferCept sustains a 1.6x-2x higher serving load than SOTA LLM serving systems. #AugLLM mlsys.wuklab.io/posts/infercep…

Likes: 29, Replies: 1, Retweets: 2

Yiying Zhang

a year ago

LLM prompts are getting longer and increasingly shared with agents, tools, documents, etc. We introduce Preble, the first distributed LLM serving system targeting long and shared prompts. Preble reduces latency by 1.5-14.5x over SOTA serving systems. #LLM mlsys.wuklab.io/posts/preble/

Likes: 25, Replies: 2, Retweets: 5

Yiying Zhang

a year ago

Join us at ICML in Vienna next Thursday, 11:30-1pm local time (poster session 5), for our poster on InferCept (augmented, or compound, AI serving system) at Hall C 4-9 #709. Learn more about InferCept in our newly posted video: youtube.com/watch?v=iOs1b0…

Likes: 6, Replies: 5, Retweets: 1

Yiying Zhang

9 months ago

WukLab's new study reveals that CPU scheduling overhead can dominate LLM inference time, up to 50% in systems like vLLM! Scheduling overhead can no longer be ignored as model forwarding speeds increase and more scheduling tasks get added. #LLM #vLLM #SGLang Read: tinyurl.com/yk4jeaz8

Likes: 55, Replies: 3, Retweets: 12

Yiying Zhang

7 months ago

Struggling with developing high-quality gen-AI apps? Meet Cognify: an open-source tool for automatically optimizing gen-AI workflows. 48% higher generation quality, 9x lower cost, fully compatible with LangChain, DSPy, Python. Read & try Cognify: tinyurl.com/a8b9cdnj #GenseeAI

Likes: 22, Replies: 0, Retweets: 4

Yiying Zhang

3 months ago

Boost your gen-AI workflow's quality by 2.8x with just $5 in 24 minutes! Check how Cognify autotunes gen-AI workflow’s quality and execution efficiency with a tiny budget in our latest blog post tinyurl.com/4tyvvdks. Paper tinyurl.com/3kx2xjn9. Code tinyurl.com/2tp9bndr.

Likes: 8, Replies: 0, Retweets: 4

Yiying Zhang

3 months ago

Check how Cognify uses only $5 and 24 minutes to cover a search space of $168K and weeks when autotuning gen-AI workflows in part 2 of our tech blog: tinyurl.com/yutx334k. Code tinyurl.com/2tp9bndr. Paper tinyurl.com/3kx2xjn9

Likes: 3, Replies: 0, Retweets: 0

Yiying Zhang

3 months ago

My team and I will be at Nvidia GTC in person next week. Happy to chat about GenseeAI and more! 🤝 #NVIDIA #GTC #GenAI

Likes: 5, Replies: 1, Retweets: 0

Yiying Zhang

2 months ago

We're collecting insights on the current & potential use of AI agents to help build better future infrastructure. Please take our quick 1-2 minute survey: lnkd.in/gcWU9mmQ. Your responses are valuable for our R&D (anonymous option available), and you will receive a $25-$50

Likes: 0, Replies: 0, Retweets: 0

Yiying Zhang

2 months ago

We are excited to launch the free beta of our AI agent/workflow serving platform, designed for intelligent execution optimization: tester.gensee.ai. Send me a direct message for an invitation code if you want to try it out. #AI #AIAgent #GenseeAI #LLMs #Infrastructure

Likes: 2, Replies: 0, Retweets: 0