# StackTrace

StackTrace gives you full visibility into every LLM call, agent run, and RAG pipeline in your application.
## What it does
- Records every LLM request: model, tokens, cost, latency, and status
- Captures prompt content and completion content (optional)
- Groups related calls into traces with parent-child relationships
- Assigns semantic quality scores to responses
- Lets you drill into any request in the dashboard
## Concepts

**Trace** — a logical unit of work, typically one user request or agent run. A trace has a unique ID and can contain multiple spans.

**Span** — a single operation within a trace. For LLM calls, a span records the model, token counts, estimated cost, and timing.

**Score** — a numeric evaluation attached to a trace. Scores can be added manually or automatically (e.g. semantic similarity, correctness).
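The estimated cost recorded on a span is a function of token counts and per-model pricing. A minimal sketch of that calculation — the `PRICING` table, its rates, and the `estimate_cost_usd` helper are illustrative assumptions, not StackTrace's actual rates or API:

```python
# Illustrative per-1K-token pricing in USD; real rates vary by provider and model.
PRICING = {
    "gpt-4o": {"input": 0.0025, "output": 0.010},
}

def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate a span's cost from its token counts and a pricing table."""
    rates = PRICING[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]

# 1,200 prompt tokens + 300 completion tokens at the rates above → 0.006 USD
print(estimate_cost_usd("gpt-4o", 1200, 300))
```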
## Recording a trace

### One-line trace
```python
import stacklens

stacklens.configure(api_key="sl-xxxx")

response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise this document."}],
)

stacklens.trace(
    "document-summary",
    model="gpt-4o",
    provider="openai",
    input_tokens=response.usage.prompt_tokens,
    output_tokens=response.usage.completion_tokens,
)
```

### Context manager (for agent workflows)
```python
with stacklens.start_trace("support-agent") as span:
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    span.record_llm(
        model="gpt-4o",
        provider="openai",
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
        completion=response.choices[0].message.content,
        cost_usd=0.0015,
    )
    span.set_attribute("user_id", user_id)
    span.set_attribute("session_id", session_id)
    span.add_tag("support", "production")
```

## Supported providers
Pass any provider string. Common values:
| Provider | Value |
|---|---|
| OpenAI | openai |
| Anthropic | anthropic |
| Google Gemini | gemini |
| Azure OpenAI | azure-openai |
| AWS Bedrock | bedrock |
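If your application accepts free-form provider names, it can help to normalize them to the values above before recording a span. A small hypothetical helper — the alias map and `normalize_provider` function are assumptions for illustration, not part of the SDK:

```python
# Map common aliases to StackTrace provider values; the alias list is illustrative.
PROVIDER_ALIASES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "google": "gemini",
    "gemini": "gemini",
    "azure": "azure-openai",
    "azure-openai": "azure-openai",
    "aws": "bedrock",
    "bedrock": "bedrock",
}

def normalize_provider(name: str) -> str:
    """Trim, lower-case, and map a provider name to its canonical value."""
    key = name.strip().lower()
    if key not in PROVIDER_ALIASES:
        raise ValueError(f"Unknown provider: {name!r}")
    return PROVIDER_ALIASES[key]
```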
## Dashboard
The StackTrace dashboard shows:
- Trace list — filterable by model, status, date, and tags
- Trace detail — full span timeline, LLM metadata, and scores
- Usage analytics — total tokens, cost, latency percentiles (P50/P90/P95/P99), and model breakdown by day
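The latency percentiles in usage analytics follow the standard definition: P95 is the latency below which 95% of requests complete. A self-contained sketch of that computation using the nearest-rank method (not StackTrace's implementation):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample covering p% of the distribution."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]

# Request latencies in milliseconds for one day
latencies_ms = [88, 95, 99, 101, 102, 120, 130, 150, 310, 480]
print(percentile(latencies_ms, 50))  # → 102
print(percentile(latencies_ms, 95))  # → 480
```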
## SDK reference
See the Python SDK reference for the full API.