# StackTrace

StackTrace gives you full visibility into every LLM call, agent run, and RAG pipeline in your application.
## What it does
- Records every LLM request: model, tokens, cost, latency, and status
- Captures prompt content and completion content (optional)
- Groups related calls into traces with parent-child relationships
- Assigns semantic quality scores to responses
- Lets you drill into any request in the dashboard
## Concepts

**Trace** — a logical unit of work, typically one user request or agent run. A trace has a unique ID and can contain multiple spans.

**Span** — a single operation within a trace. For LLM calls, a span records the model, token counts, estimated cost, and timing.

**Score** — a numeric evaluation attached to a trace. Scores can be added manually or automatically (e.g. semantic similarity, correctness).
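The estimated cost recorded on a span is a function of token counts and per-model pricing. A minimal sketch of that calculation — the `PRICING` table, its rates, and the `estimate_cost_usd` helper are illustrative assumptions, not StackTrace's actual rates or API:

```python
# Illustrative per-1K-token pricing in USD; real rates vary by provider and model.
PRICING = {
    "gpt-4o": {"input": 0.0025, "output": 0.010},
}

def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate a span's cost from its token counts and a pricing table."""
    rates = PRICING[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]

# 1,200 prompt tokens + 300 completion tokens at the rates above → 0.006 USD
print(estimate_cost_usd("gpt-4o", 1200, 300))
```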
## Recording a trace

### One-line trace
```python
import stacklens

stacklens.configure(api_key="sl-xxxx")

response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise this document."}],
)

stacklens.trace(
    "document-summary",
    model="gpt-4o",
    provider="openai",
    input_tokens=response.usage.prompt_tokens,
    output_tokens=response.usage.completion_tokens,
)
```

### Context manager (for agent workflows)
```python
with stacklens.start_trace("support-agent") as span:
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    span.record_llm(
        model="gpt-4o",
        provider="openai",
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
        completion=response.choices[0].message.content,
        cost_usd=0.0015,
    )
    span.set_attribute("user_id", user_id)
    span.set_attribute("session_id", session_id)
    span.add_tag("support", "production")
```

## Supported providers
Pass any provider string. Common values:
| Provider | Value |
|---|---|
| OpenAI | openai |
| Anthropic | anthropic |
| Google Gemini | gemini |
| Azure OpenAI | azure-openai |
| AWS Bedrock | bedrock |
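If your application accepts free-form provider names, it can help to normalize them to the values above before recording a span. A small hypothetical helper — the alias map and `normalize_provider` function are assumptions for illustration, not part of the SDK:

```python
# Map common aliases to StackTrace provider values; the alias list is illustrative.
PROVIDER_ALIASES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "google": "gemini",
    "gemini": "gemini",
    "azure": "azure-openai",
    "azure-openai": "azure-openai",
    "aws": "bedrock",
    "bedrock": "bedrock",
}

def normalize_provider(name: str) -> str:
    """Trim, lower-case, and map a provider name to its canonical value."""
    key = name.strip().lower()
    if key not in PROVIDER_ALIASES:
        raise ValueError(f"Unknown provider: {name!r}")
    return PROVIDER_ALIASES[key]
```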
## Dashboard
The StackTrace dashboard shows:
- Trace list — filterable by model, status, date, and tags
- Trace detail — full span timeline, LLM metadata, and scores
- Usage analytics — total tokens, cost, latency percentiles (P50/P90/P95/P99), and model breakdown by day
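The latency percentiles in usage analytics follow the standard definition: P95 is the latency below which 95% of requests complete. A self-contained sketch of that computation using the nearest-rank method (not StackTrace's implementation):

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample covering p% of the distribution."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]

# Request latencies in milliseconds for one day
latencies_ms = [88, 95, 99, 101, 102, 120, 130, 150, 310, 480]
print(percentile(latencies_ms, 50))  # → 102
print(percentile(latencies_ms, 95))  # → 480
```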
## SDK reference
See the Python SDK reference for the full API.