
Observability & Pipeline Tracing


Every customer message that reaches your Sales Assistant triggers a deterministic pipeline with five phases: order extraction, routing rules, AI processing, callbacks, and post-run effects. QuotyAI captures detailed telemetry for each phase so you can see exactly what happened, when it happened, and why.


What Gets Recorded

Each pipeline run (one customer message → one assistant response) produces a run record with five phases:

Customer sends a message

┌───────────────────────────────┐
│  1. Order Extraction          │
│  What the assistant extracted │
│  from the message             │
└───────────────────────────────┘

┌───────────────────────────────┐
│  2. Deterministic Router      │
│  Which instructions fired,    │
│  their generated code,        │
│  actions produced, duration   │
└───────────────────────────────┘

┌───────────────────────────────┐
│  3. AI Agent (LLM)            │
│  Model used, token usage,     │
│  response preview, linked     │
│  to full LLM trace waterfall  │
└───────────────────────────────┘

┌───────────────────────────────┐
│  4. Deterministic Callback    │
│  Which callbacks ran, their   │
│  generated code, actions      │
└───────────────────────────────┘

┌───────────────────────────────┐
│  5. Post-Run Effects          │
│  Attachments sent, state      │
│  changes, handover events     │
└───────────────────────────────┘

Every phase captures start time, end time, duration, and status (success / failed / skipped). Errors are recorded at the phase level without breaking the pipeline.


The Pipeline Overview

From any AI message in the conversation view, click the observability button to open the Sales Assistant Observability modal. The default view is the Pipeline Overview — a vertical timeline of all five phases for that message.

Phase Cards

Each phase is a collapsible card showing:

  • Status dot (green = success, red = failed, gray = skipped)
  • Phase name and duration
  • Expand control that reveals the phase details

Order Extraction

Shows the input conversation state and the structured order data the assistant extracted. This is the raw data that routers and the AI will work with.

Router (Pre-AI Automation)

Lists every deterministic router instruction that executed, along with:

  • Instruction content (the original business rule)
  • Generated TypeScript code (preview with syntax highlighting)
  • Actions returned (send_message, update_state, short_circuit, etc.)
  • Per-instruction duration and success/failure
  • Whether the router short-circuited the AI

Each router instruction runs as an independent generated function. If one fails, the rest continue. The card shows each instruction as a sub-card that expands to reveal the generated code and actions.
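The isolation described above can be pictured with a small sketch. It is not QuotyAI's implementation: the RouterInstruction shape, the run signature, and the choice to keep executing later instructions after a short_circuit action are all assumptions made for illustration. Only the result field names are taken from the data model on this page.

```typescript
// Hypothetical shapes, loosely modeled on the fields listed on this page.
type Action = { type: string; [key: string]: unknown };

interface InstructionResult {
  instructionId: string;
  actions: Action[];
  durationMs: number;
  success: boolean;
  error?: string;
}

interface RouterInstruction {
  instructionId: string;
  // Stand-in for the generated function; the real signature is an assumption.
  run: (state: Record<string, unknown>) => Action[];
}

interface RouterPhaseResult {
  instructionResults: InstructionResult[];
  shortCircuited: boolean;
  totalActionsProduced: number;
}

function runRouterPhase(
  instructions: RouterInstruction[],
  state: Record<string, unknown>,
): RouterPhaseResult {
  const instructionResults: InstructionResult[] = [];
  let shortCircuited = false;
  let totalActionsProduced = 0;

  for (const instruction of instructions) {
    const started = Date.now();
    try {
      const actions = instruction.run(state);
      totalActionsProduced += actions.length;
      if (actions.some((action) => action.type === "short_circuit")) {
        shortCircuited = true;
      }
      instructionResults.push({
        instructionId: instruction.instructionId,
        actions,
        durationMs: Date.now() - started,
        success: true,
      });
    } catch (err) {
      // The failure is recorded on this instruction only; the loop keeps going.
      instructionResults.push({
        instructionId: instruction.instructionId,
        actions: [],
        durationMs: Date.now() - started,
        success: false,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  }

  return { instructionResults, shortCircuited, totalActionsProduced };
}
```

Wrapping each instruction in its own try/catch is what lets one sub-card show a failure while its siblings still report their actions.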

AI Agent (LLM)

Shows:

  • Model used (e.g., gpt-4o)
  • Token usage (prompt, completion, total)
  • Response preview
  • Whether a handover to a human agent occurred
  • Link to the full LLM trace waterfall — the complete LangChain trace with every tool call, chain step, and LLM invocation

The “View Full LLM Trace Waterfall” button switches to the trace tab and filters to the relevant trace. The linked LLM trace ID is stored in the run record so you can always cross-reference.

Callbacks (Post-AI Automation)

Same structure as the router phase: each callback instruction is listed with its generated code, actions, duration, and status. Callbacks run after the AI response, so they cannot change what was already sent to the customer.

Post-Run Effects

Shows:

  • Attachments sent — which instruction triggered each attachment and whether it succeeded
  • State changes — a before/after diff of every conversation state key that was modified
  • Handover events — if the run ended with a handover to a human agent, including the reason
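To make the before/after diff concrete, here is a minimal sketch of how a list of state changes could be derived from two state snapshots. The diffState helper is hypothetical, and comparing values by their JSON serialization is an assumption that only suits JSON-friendly state. The { key, from, to } shape mirrors the stateChanges entries in the data model.

```typescript
interface StateChange {
  key: string;
  from: unknown;
  to: unknown;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical helper: compares two snapshots key by key.
// Values are compared via JSON.stringify, which assumes JSON-serializable
// state and is sensitive to property order inside nested objects.
function diffState(
  before: Record<string, unknown>,
  after: Record<string, unknown>,
): StateChange[] {
  const changes: StateChange[] = [];

  // Keys that were modified or removed.
  for (const key of Object.keys(before)) {
    const to = hasOwn(after, key) ? after[key] : undefined;
    if (JSON.stringify(before[key]) !== JSON.stringify(to)) {
      changes.push({ key, from: before[key], to });
    }
  }

  // Keys that were added.
  for (const key of Object.keys(after)) {
    if (!hasOwn(before, key)) {
      changes.push({ key, from: undefined, to: after[key] });
    }
  }

  return changes;
}
```

With this sketch, a removed key appears with to set to undefined and an added key with from set to undefined. Unchanged keys are omitted entirely.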

The LLM Trace Waterfall

The second tab in the observability modal shows the LLM Trace Waterfall — a Gantt-chart visualization of everything that happened inside the AI agent:

  • Each LLM call is a span showing model, prompt tokens, completion tokens, and duration
  • Each tool call is a span showing the tool name, input arguments, and return value
  • Each chain step is a span showing the execution order
  • Spans are color-coded by type (LLM, Tool, Chain, Retriever, Other)
  • Expand any span to see its inputs, outputs, tags, and error details

The waterfall is the same view used for traditional LLM observability, but it is now linked directly from the pipeline overview through the llmRunIds array stored in the AI agent phase. Because llmRunIds is an array, a single pipeline run can link to more than one LLM run.

Search & Filter

  • Search — filter spans by name, type, or tags
  • Type filter — show only LLM calls, tool calls, chains, etc.
  • Sort — by start time, duration, name, or type
  • Statistics panel — aggregate metrics: total spans, avg/max/min duration, success rate, type breakdown
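The aggregates in the statistics panel are simple to reproduce from exported spans. The sketch below assumes a simplified Span shape (name, type, durationMs, success). The actual span fields are not documented on this page, so treat those names as illustrative.

```typescript
type SpanType = "LLM" | "Tool" | "Chain" | "Retriever" | "Other";

// Assumed, simplified span shape for illustration only.
interface Span {
  name: string;
  type: SpanType;
  durationMs: number;
  success: boolean;
}

interface SpanStats {
  totalSpans: number;
  avgDurationMs: number;
  maxDurationMs: number;
  minDurationMs: number;
  successRate: number; // fraction between 0 and 1
  typeBreakdown: Partial<Record<SpanType, number>>;
}

function summarizeSpans(spans: Span[]): SpanStats {
  const typeBreakdown: Partial<Record<SpanType, number>> = {};
  if (spans.length === 0) {
    return {
      totalSpans: 0,
      avgDurationMs: 0,
      maxDurationMs: 0,
      minDurationMs: 0,
      successRate: 0,
      typeBreakdown,
    };
  }

  let totalDuration = 0;
  let max = -Infinity;
  let min = Infinity;
  let succeeded = 0;

  for (const span of spans) {
    totalDuration += span.durationMs;
    max = Math.max(max, span.durationMs);
    min = Math.min(min, span.durationMs);
    if (span.success) succeeded += 1;
    typeBreakdown[span.type] = (typeBreakdown[span.type] ?? 0) + 1;
  }

  return {
    totalSpans: spans.length,
    avgDurationMs: totalDuration / spans.length,
    maxDurationMs: max,
    minDurationMs: min,
    successRate: succeeded / spans.length,
    typeBreakdown,
  };
}
```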

Export

Export the full trace as JSON or CSV, or copy all spans to the clipboard, for debugging or external analysis.
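If you post-process exported spans yourself, a CSV writer needs to quote fields that contain commas, quotes, or line breaks. The sketch below shows one way to do that. The column set (name, type, startTime, durationMs) is an assumption and may not match the columns of the product's own CSV export.

```typescript
// Assumed column set for illustration; the real export layout may differ.
interface ExportSpan {
  name: string;
  type: string;
  startTime: number;
  durationMs: number;
}

// Quote a field when it contains a comma, quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

function spansToCsv(spans: ExportSpan[]): string {
  const header = ["name", "type", "startTime", "durationMs"].join(",");
  const rows = spans.map((span) =>
    [span.name, span.type, span.startTime, span.durationMs]
      .map((field) => csvField(field))
      .join(","),
  );
  // Lines are joined with "\n"; switch to "\r\n" if a consumer requires it.
  return [header].concat(rows).join("\n");
}
```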


When Things Go Wrong

Failed Phase

If a phase fails (red status dot), the error message is captured and displayed in the phase card. The pipeline continues: one failed router instruction doesn’t block the others. The overall run status reflects whether the entire run succeeded or failed.

Short-Circuited AI

When a router instruction returns short_circuit, the AI agent phase is skipped entirely. The pipeline overview shows a warning badge on the router phase and the AI agent phase shows status “skipped” with no data. This is expected behavior — the router intentionally bypassed the AI for that message.

Missing Run Record

If the observability modal shows “No pipeline data” but a trace waterfall exists, the run may have started before observability was enabled, or the message may have been processed by an older assistant version. Pipeline recording was introduced alongside the Sales Assistant Observability feature, so messages processed before that release won’t have run records.


Technical Architecture

Data Model

Run records are stored in a separate sales-assistant-runs MongoDB collection (independent from LLM trace records). Each record contains:

{
  runId: string,           // UUID — correlates with LLM trace
  status: string,          // "completed" | "failed"
  totalDurationMs: number,
  customerMessage: string,
  channelType: string,
  phases: {
    orderExtraction?: { ... },
    deterministicRouter?: {
      status: string,
      startTime: number,
      endTime: number,
      durationMs: number,
      instructionResults: [{
        instructionId: string,
        instructionContent: string,
        category: string,
        generatedCode: string,
        actions: [{ type, ... }],
        durationMs: number,
        success: boolean,
        error?: string
      }],
      shortCircuited: boolean,
      totalActionsProduced: number
    },
    aiAgent?: {
      status: string,
      modelUsed: string,
      tokenUsage: { prompt, completion, total },
      responsePreview: string,
      llmRunIds: string[],  // cross-reference to LLM trace(s)
      handoverOccurred: boolean
    },
    deterministicCallback?: { ... },
    afterRun?: {
      attachmentsSent: [{ instructionId, attachmentId, success }],
      stateChanges: [{ key, from, to }],
      handoverOccurred: boolean,
      handoverReason?: string
    }
  },
  createdAt: string,
  completedAt?: string
}
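As an illustration of reading such a record, the sketch below types a small subset of the fields above and turns a record into a short textual summary. RunRecordSubset and describeRun are hypothetical names introduced here. How you fetch the record from the collection is left out on purpose.

```typescript
// Subset of the run record fields shown above; everything else is omitted.
interface InstructionResultRecord {
  instructionId: string;
  durationMs: number;
  success: boolean;
  error?: string;
}

interface RunRecordSubset {
  runId: string;
  status: string;
  phases: {
    deterministicRouter?: {
      instructionResults: InstructionResultRecord[];
      shortCircuited: boolean;
    };
    aiAgent?: {
      status: string;
      llmRunIds: string[];
    };
  };
}

// Hypothetical helper: summarizes what a reader would look for first.
function describeRun(run: RunRecordSubset): string[] {
  const lines: string[] = [`run ${run.runId}: ${run.status}`];

  const router = run.phases.deterministicRouter;
  if (router) {
    for (const result of router.instructionResults) {
      if (!result.success) {
        lines.push(
          `router instruction ${result.instructionId} failed: ${result.error ?? "unknown error"}`,
        );
      }
    }
    if (router.shortCircuited) {
      lines.push("router short-circuited the AI agent");
    }
  }

  const agent = run.phases.aiAgent;
  if (agent) {
    lines.push(`AI agent ${agent.status}, ${agent.llmRunIds.length} linked LLM trace(s)`);
  }

  return lines;
}
```

Every phase is optional in the schema, so the helper checks for presence before reading any phase field.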

Recording Pipeline

The recorder service wraps each phase with timing, status capture, and error handling:

  1. Phase start — records start time and phase metadata
  2. Phase execution — runs the actual logic (router functions, AI invoke, callbacks)
  3. Phase end — records end time, duration, status, and result data
  4. Error handling — if a phase throws, the error is captured in the phase record and the run continues

The recorder never throws — errors are always captured in the phase status so downstream phases can still execute.
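The four steps above amount to a timing wrapper with a try/catch around the phase logic. The following is a minimal sketch under stated assumptions, not the actual recorder service: recordPhase is a hypothetical name, the wrapper is synchronous for brevity (a real pipeline would presumably await asynchronous phases), and the clock is injectable so the behavior is easy to test.

```typescript
type PhaseStatus = "success" | "failed" | "skipped";

interface PhaseRecord<T> {
  status: PhaseStatus;
  startTime: number;
  endTime: number;
  durationMs: number;
  result?: T;
  error?: string;
}

// Hypothetical wrapper: runs one phase and always returns a record.
function recordPhase<T>(
  execute: () => T,
  now: () => number = Date.now,
): PhaseRecord<T> {
  const startTime = now(); // 1. phase start
  try {
    const result = execute(); // 2. phase execution
    const endTime = now(); // 3. phase end
    return {
      status: "success",
      startTime,
      endTime,
      durationMs: endTime - startTime,
      result,
    };
  } catch (err) {
    // 4. error handling: capture the error instead of rethrowing,
    // so the caller can move on to the next phase.
    const endTime = now();
    return {
      status: "failed",
      startTime,
      endTime,
      durationMs: endTime - startTime,
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Because the wrapper returns a record in both branches, the caller never needs its own try/catch to keep later phases running.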

Correlation

The pipeline run _id is the correlation key across the entire system:

Sales Assistant Run record (pipeline phases) —— _id

LLM Observability run —— assistantPipelineRunId

LangChain run tree —— child_runs (recursive)

The llmRunIds array in the AI agent phase links directly to the LLM observability runs, so you can jump from “the AI said this” to “here’s every LLM call, tool invocation, and chain step that produced it.”
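Once you have the run tree for one of those LLM run IDs, walking child_runs recursively yields the same ordered list of spans the waterfall displays. The sketch below assumes a simplified node shape. Only child_runs comes from this page, while id, name, and run_type are assumed field names.

```typescript
// Simplified run-tree node; only child_runs is taken from this page,
// the other field names are assumptions for illustration.
interface RunNode {
  id: string;
  name: string;
  run_type: string;
  child_runs?: RunNode[];
}

interface FlatRun {
  id: string;
  name: string;
  runType: string;
  depth: number;
}

// Depth-first walk: parents come before their children.
function flattenRunTree(root: RunNode, depth = 0): FlatRun[] {
  const rows: FlatRun[] = [
    { id: root.id, name: root.name, runType: root.run_type, depth },
  ];
  for (const child of root.child_runs ?? []) {
    rows.push(...flattenRunTree(child, depth + 1));
  }
  return rows;
}
```

The depth value is enough to indent rows when printing the tree or to group spans by nesting level.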


Best Practices

Debugging Router Instructions

When a router instruction doesn’t behave as expected, open the pipeline overview:

  1. Check the Router phase — did the instruction run? (green check) Did it fail? (red X)
  2. Expand the instruction to see the generated code — does it match your intent?
  3. Check the Actions Returned — what did the function actually produce?
  4. If the generated code is wrong, rewrite the instruction in clearer language and rebuild the assistant

Auditing AI Behavior

When an AI response seems off:

  1. Open the Pipeline Overview — check which conditional prompts were active
  2. Switch to the LLM Trace Waterfall — expand the LLM spans to see exactly what was sent to the model
  3. Check tool call inputs and outputs — did the AI call the right tool with the right arguments?

Monitoring Performance

The statistics panel in the trace waterfall shows aggregate metrics. Use it to:

  • Identify slow LLM calls or tool invocations
  • Monitor token usage trends
  • Track success/failure rates over time

Cross-Referencing

The run record includes both the customer message and the AI response preview. Combined with the state changes in the post-run effects, you can reconstruct the full conversation context for any automated interaction — useful for compliance audits and customer dispute resolution.