Graphsignal (@graphsignalai)'s Twitter Profile
Graphsignal

@graphsignalai

Inference Observability

ID: 1102178345725374466

Website: https://graphsignal.com · Joined: 03-03-2019 12:06:09

77 Tweets

345 Followers

373 Following

Graphsignal (@graphsignalai):

Learn how to trace, monitor and debug #LlamaIndex applications in production and development. graphsignal.com/blog/tracing-a…

Rich Stone (@rstone57):

66% of organizations report their technology investments will be easier to justify if they support a #GenAI initiative bit.ly/3FtFKRm via Enterprise Strategy Group

elvis (@omarsar0):


Improving Information Retrieval in LLMs

One effective way to use open-source LLMs is for search tasks, which could power many other applications.

This work explores the use of instruction tuning to improve a language model's proficiency in information retrieval (IR) tasks.
Graphsignal (@graphsignalai):

#AI observability is evolving. Today's tools not only monitor AI performance but also unravel complex model behaviors, enhancing transparency and reliability.

Graphsignal (@graphsignalai):

Learn how to measure and analyze LLM streaming performance using time-to-first-token metrics and traces ➡️ graphsignal.com/blog/measuring…
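The tweet above concerns time-to-first-token (TTFT) metrics for LLM streaming. As a library-agnostic illustration (this is a minimal sketch, not the Graphsignal API; `fake_llm_stream` is a hypothetical stand-in for a real streaming response), TTFT can be measured by timing the arrival of the first item from a token stream:

```python
import time

def measure_ttft(stream):
    """Measure time-to-first-token and total latency for a token stream.

    `stream` is any iterable yielding tokens; in a real app this would be
    an LLM streaming response. Returns (ttft_seconds, total_seconds, n_tokens).
    """
    start = time.perf_counter()
    ttft = None
    n_tokens = 0
    for _ in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # first token arrived
        n_tokens += 1
    total = time.perf_counter() - start
    return ttft, total, n_tokens

def fake_llm_stream():
    # Hypothetical stand-in for a streaming LLM response:
    # slow first token (prefill), faster subsequent tokens (decode).
    time.sleep(0.05)
    yield "Hello"
    for t in [",", " world", "!"]:
        time.sleep(0.01)
        yield t

ttft, total, n = measure_ttft(fake_llm_stream())
print(f"TTFT: {ttft*1000:.0f} ms, total: {total*1000:.0f} ms, tokens: {n}")
```

In a real service the same idea applies per request: TTFT reflects queueing plus prefill cost, while total minus TTFT reflects decode throughput.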

Graphsignal (@graphsignalai):

New post: AI Debugging and Optimization For Production Inference graphsignal.com/blog/ai-debugg…

Use Claude Code to debug and optimize AI systems with rich production context from Graphsignal.

dstack (@dstackai):

Now Graphsignal integrates with dstack — add SGLang profiling, tracing, and GPU metrics to your inference services. pip install 'graphsignal[cu12]' + wrap with graphsignal-run. That's it. graphsignal.com/docs/integrati…

dstack (@dstackai):

Agent orchestration is evolving fast! Agents + orchestration + telemetry → closed-loop systems. Our friends at GraphSignal show how this unlocks continuous inference optimization in production — across heterogeneous hardware. This is where things get interesting.

Andrey Cheptsov (@andrey_cheptsov):

Config tuning is just the start. The same loop can optimize inference code and even custom CUDA kernels. It all depends on what tools the agent can use.

dstack (@dstackai):

autodebug by Graphsignal is a closed-loop system for inference optimization. It uses dstack to provision GPUs and redeploy services on each pass through the loop: benchmark → read profiling telemetry → tweak config → redeploy → repeat. What's interesting here is

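The benchmark → telemetry → tweak → redeploy loop described in the tweet can be sketched in a few lines. Everything below is a hypothetical stand-in: `benchmark`, `tweak_config`, and `redeploy` model the cycle with a toy latency formula and are not the actual autodebug, Graphsignal, or dstack APIs.

```python
# Toy closed-loop config optimizer for an inference service.

def benchmark(config):
    # Hypothetical: in this toy model, latency grows with batch size.
    return {"p95_latency_ms": 50 + 2 * config["max_batch_size"]}

def tweak_config(config, telemetry):
    # Hypothetical heuristic: shrink the batch size while latency is too high.
    if telemetry["p95_latency_ms"] > 80 and config["max_batch_size"] > 1:
        return {**config, "max_batch_size": config["max_batch_size"] // 2}
    return config

def redeploy(config):
    # Stand-in for provisioning/redeploying the service (e.g. via dstack).
    print(f"redeploying with {config}")

config = {"max_batch_size": 32}
for _ in range(5):  # benchmark -> read telemetry -> tweak -> redeploy -> repeat
    telemetry = benchmark(config)
    new_config = tweak_config(config, telemetry)
    if new_config == config:  # converged: latency target met
        break
    config = new_config
    redeploy(config)
print(config)
```

A real loop would benchmark a live deployment, read profiling telemetry from the observability layer, and could tweak far more than one parameter; the control structure, however, is exactly this.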