Distributed Tracing¶
Scouter provides OpenTelemetry-compatible distributed tracing built on Rust with a Python interface. The system supports synchronous and asynchronous code, automatic context propagation, and dual-export to Scouter's backend and external OTEL collectors.
Traces captured by Scouter can be evaluated offline or in production using TraceAssertionTask — validating span execution order, latency SLAs, token budgets, and more.
Architecture¶
graph TB
subgraph "Python Layer"
A[Application Code] --> B[Tracer Decorator/Context Manager]
B --> C[ActiveSpan]
end
subgraph "Rust Core"
C --> D[BaseTracer]
D --> E[SpanContext Store]
D --> F[Context Propagation]
F --> G[AsyncIO ContextVar]
end
subgraph "Export Pipeline"
D --> H[Scouter Exporter<br/>Required]
D --> I[OTEL Exporter<br/>Optional]
H --> J[(Scouter Backend)]
I --> K[(OTEL Collector)]
end
style H fill:#8059b6
style I fill:#5cb3cc
Dual Export System¶
Every span is exported to Scouter's backend and can optionally be exported to an external OTEL-compatible system as well:
| Export Target | Purpose | Configuration |
|---|---|---|
| Scouter (Required) | Correlation with models, drift profiles, events | transport_config |
| OTEL Collector (Optional) | Integration with existing observability stack | exporter |
Core Components¶
Initialization¶
Tracer Lifecycle¶
sequenceDiagram
participant App as Application
participant Tracer as BaseTracer
participant Store as Context Store
participant Exp as Exporters
App->>Tracer: init_tracer()
Tracer->>Store: Initialize global state
Tracer->>Exp: Configure exporters
App->>Tracer: start_as_current_span()
Tracer->>Store: Register span context
Store-->>App: ActiveSpan
App->>App: Execute logic
App->>Store: __exit__()
Store->>Exp: Export span data
Store->>Store: Cleanup context
App->>Tracer: shutdown_tracer()
Tracer->>Exp: Flush & shutdown
OpenTelemetry Compatibility¶
Scouter's tracing layer is built on top of the OpenTelemetry SDK. You can use Scouter as a drop-in TracerProvider in any OTEL-instrumented application.
ScouterInstrumentor¶
ScouterInstrumentor implements the standard OpenTelemetry BaseInstrumentor interface. It registers Scouter's TracerProvider as the global OTEL provider, so any library that calls opentelemetry.trace.get_tracer() will route spans through Scouter automatically.
ScouterInstrumentor is a singleton — calling it multiple times returns the same instance.
from scouter.tracing import ScouterInstrumentor
instrumentor = ScouterInstrumentor()
instrumentor.instrument()
This is equivalent to calling the convenience function:
With transport and batch configuration:
from scouter.tracing import ScouterInstrumentor
from scouter import GrpcConfig, BatchConfig
ScouterInstrumentor().instrument(
    transport_config=GrpcConfig(),
    batch_config=BatchConfig(scheduled_delay_ms=200),
)
With an external OTEL collector:
from scouter.tracing import ScouterInstrumentor, HttpSpanExporter, OtelExportConfig
from scouter import GrpcConfig
ScouterInstrumentor().instrument(
    transport_config=GrpcConfig(),
    exporter=HttpSpanExporter(
        export_config=OtelExportConfig(endpoint="http://otel-collector:4318")
    ),
)
Check instrumentation status:
Tear down instrumentation:
uninstrument() flushes pending spans, shuts down the provider, and resets the global OTEL tracer provider. The singleton is also reset, so the next call to ScouterInstrumentor() creates a fresh instance.
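The singleton-and-reset lifecycle described above can be pictured with a small standard-library sketch. `DemoInstrumentor` is a hypothetical stand-in, not Scouter's class; it only mirrors the described behavior, where repeated construction returns one instance until `uninstrument()` clears it.

```python
import threading


class DemoInstrumentor:
    """Hypothetical stand-in that mimics a resettable singleton."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Reuse the existing instance if one has already been created.
        with cls._lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst.instrumented = False
                cls._instance = inst
            return cls._instance

    def instrument(self) -> None:
        self.instrumented = True

    def uninstrument(self) -> None:
        # A real implementation would flush and shut down exporters here.
        self.instrumented = False
        with type(self)._lock:
            type(self)._instance = None  # next construction starts fresh


first = DemoInstrumentor()
second = DemoInstrumentor()
first.instrument()
same_before_reset = first is second

first.uninstrument()
fresh = DemoInstrumentor()
```

After `uninstrument()`, `fresh` is a new object with no instrumentation state carried over, which matches the reset behavior described for `ScouterInstrumentor`.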
ScouterInstrumentor vs init_tracer¶
Both paths initialize Scouter tracing and produce identical span output. The difference is integration scope:
| | init_tracer() | ScouterInstrumentor |
|---|---|---|
| Registers global OTEL provider | No | Yes |
| Works with OTEL auto-instrumentation libraries | No | Yes |
| Idiomatic for greenfield Scouter-only code | Yes | No |
| Follows OTel BaseInstrumentor lifecycle | No | Yes |
Use ScouterInstrumentor when your application uses OTEL auto-instrumentation libraries (e.g., opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-httpx) and you want their spans to flow through Scouter. Use init_tracer() for simpler setups where you instrument everything manually.
Using Scouter as a TracerProvider¶
You can construct a TracerProvider directly and set it as the global provider:
from opentelemetry import trace
from scouter.tracing import TracerProvider
from scouter import GrpcConfig
provider = TracerProvider(transport_config=GrpcConfig())
# Register the provider globally through the OTEL API
trace.set_tracer_provider(provider)
# Any OTEL-instrumented library now routes spans through Scouter
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("my-span") as span:
    span.set_attribute("key", "value")
Synchronous vs Asynchronous¶
Scouter's tracing supports both synchronous and asynchronous Python code. The same @tracer.span decorator applies to regular functions and to coroutines.
Sync Functions¶
from scouter.tracing import get_tracer
# init_tracer(...) must be called beforehand
tracer = get_tracer(name="sync-service")
@tracer.span("process_data")
def process_data(items: list[dict]) -> dict: # (1)
    result = {"processed": len(items)}
    return result  # (2)
- Function arguments automatically captured as span input
- Return value captured as span output
Async Functions¶
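As a rough illustration of how a single decorator can wrap both plain and `async def` functions, the sketch below uses only the standard library. `span` and `RECORDED` are hypothetical names, and this is not Scouter's implementation; it only shows the dispatch on `inspect.iscoroutinefunction` that such a decorator typically needs.

```python
import asyncio
import functools
import inspect
import time

RECORDED = []  # hypothetical in-memory sink standing in for an exporter


def span(name):
    """Hypothetical decorator that times sync and async callables."""

    def decorate(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    # Await inside the wrapper so the timing covers the coroutine body.
                    return await func(*args, **kwargs)
                finally:
                    RECORDED.append((name, time.perf_counter() - start))

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                RECORDED.append((name, time.perf_counter() - start))

        return sync_wrapper

    return decorate


@span("fetch_data")
async def fetch_data(n: int) -> int:
    await asyncio.sleep(0)  # yield to the event loop once
    return n * 2


@span("double")
def double(n: int) -> int:
    return n * 2


async_result = asyncio.run(fetch_data(21))
sync_result = double(21)
```

The async wrapper must itself be a coroutine; wrapping an `async def` with a plain function would record the span as finished before the awaited work runs.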
Generators & Streaming¶
Context Propagation¶
Within Process¶
Context automatically propagates through the call stack using Python's contextvars:
@tracer.span("parent_operation")
def parent():
    child()  # (1)
@tracer.span("child_operation")
def child():
    pass  # (2)
- Child automatically inherits parent's trace context
- parent_span_id automatically set to parent's span_id
Across Services¶
Extract trace headers from the current span and inject into downstream requests:
import httpx
from scouter import get_tracing_headers_from_current_span
@tracer.span("service_a_request")
async def call_service_b():
    headers = get_tracing_headers_from_current_span()  # (1)
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "http://service-b/api/endpoint",
            headers=headers,
        )
- Returns {"trace_id": "...", "span_id": "...", "is_sampled": "true"}
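On the receiving service, read the same headers and pass them to start_as_current_span:
from fastapi import Header
@app.post("/api/endpoint")
async def endpoint(
    # convert_underscores=False keeps the underscore header names sent by service A;
    # by default FastAPI would look for trace-id, span-id, and is-sampled instead
    trace_id: str = Header(None, convert_underscores=False),
    span_id: str = Header(None, convert_underscores=False),
    is_sampled: str = Header(None, convert_underscores=False),
):
    with tracer.start_as_current_span(
        name="service_b_handler",
        trace_id=trace_id,
        span_id=span_id,  # (1)
        remote_sampled=is_sampled == "true",
    ) as span:
        span.set_tag("service", "service-b")
        return process_request()
- Becomes the parent span ID, linking spans across services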
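The header round trip between the two services can be sketched without any framework. The helper names (`to_headers`, `from_headers`, `RemoteParent`) and the ID values are hypothetical; only the three header keys come from the example above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RemoteParent:
    """Hypothetical container for the upstream span's identity."""

    trace_id: str
    span_id: str
    sampled: bool


def to_headers(trace_id: str, span_id: str, sampled: bool) -> dict:
    # Header values must be strings, so the boolean is spelled out.
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "is_sampled": "true" if sampled else "false",
    }


def from_headers(headers: Mapping[str, str]) -> Optional[RemoteParent]:
    trace_id = headers.get("trace_id")
    span_id = headers.get("span_id")
    if not trace_id or not span_id:
        return None  # no upstream context: the handler should start a new trace
    return RemoteParent(trace_id, span_id, headers.get("is_sampled") == "true")


outgoing = to_headers("abc123", "def456", sampled=True)
incoming = from_headers(outgoing)
missing = from_headers({})
```

Handling the missing-header case explicitly matters for endpoints that are also called by clients that do not propagate trace context.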
Span Attributes & Events¶
Spans can be enriched with attributes, tags, and events to provide additional context:
from pydantic import BaseModel
class UserInput(BaseModel):
    user_id: int
    action: str
    metadata: dict
@tracer.span("process_user_action")
def handle_action(input: UserInput):  # (1)
    with tracer.current_span as span:
        span.set_tag("user_id", str(input.user_id))
        span.set_attribute("action_type", input.action)
        span.add_event("validation_completed", {  # (2)
            "checks_passed": 5,
            "duration_ms": 23,
        })
- Pydantic models automatically serialized
- Add structured events to spans for detailed observability
Tags vs Attributes¶
Understanding Tags
Tags are indexed attributes designed for efficient searching. They are stored in a separate backend table and prefixed with scouter.tag.* during export.
Note: Attributes are the OTEL-preferred way to store general metadata; Scouter uses tags for key metadata that users frequently filter or search on.
with tracer.start_as_current_span("api_request") as span:
    # Tags - indexed for search/filtering
    span.set_tag("environment", "production")  # (1)
    span.set_tag("customer_tier", "enterprise")
    # Attributes - general metadata
    span.set_attribute("request_size_bytes", 1024)  # (2)
    span.set_attribute("cache_hit", True)
- Stored as scouter.tag.environment in backend
- Stored as regular attributes
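One way to picture the `scouter.tag.` prefixing is as a flattening step at export time. The sketch below is an assumption about the shape of that step, not Scouter's exporter code; `flatten_for_export` and `split_tags` are hypothetical helpers.

```python
TAG_PREFIX = "scouter.tag."  # prefix named in the text above


def flatten_for_export(tags: dict, attributes: dict) -> dict:
    """Merge tags into the attribute map under the tag prefix."""
    exported = dict(attributes)
    for key, value in tags.items():
        exported[f"{TAG_PREFIX}{key}"] = value
    return exported


def split_tags(exported: dict) -> tuple:
    """Recover (tags, attributes) from a flattened attribute map."""
    tags, attributes = {}, {}
    for key, value in exported.items():
        if key.startswith(TAG_PREFIX):
            tags[key[len(TAG_PREFIX):]] = value
        else:
            attributes[key] = value
    return tags, attributes


exported = flatten_for_export(
    tags={"environment": "production", "customer_tier": "enterprise"},
    attributes={"request_size_bytes": 1024, "cache_hit": True},
)
tags_back, attributes_back = split_tags(exported)
```

Keeping tags in the same attribute map lets any OTEL-compatible backend receive them, while a prefix-aware backend can still route them to an indexed table.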
Span Kinds & Labels¶
Span Kinds¶
Use semantic span kinds for better trace visualization:
from scouter.tracing import SpanKind
# Server receiving request
with tracer.start_as_current_span("handle_request", kind=SpanKind.Server):
    process_request()
# Client making request
with tracer.start_as_current_span("call_api", kind=SpanKind.Client):
    httpx.get("https://api.example.com")
# Producer publishing message
with tracer.start_as_current_span("publish_event", kind=SpanKind.Producer):
    kafka_producer.send(...)
# Consumer processing message
with tracer.start_as_current_span("consume_event", kind=SpanKind.Consumer):
    handle_message(...)
Labels¶
Categorize spans for organizational purposes:
@tracer.span("train_model", label="ml-training")
def train_model():
    pass
@tracer.span("validate_input", label="data-validation")
def validate():
    pass
Error Handling¶
Exceptions are automatically captured with full tracebacks:
@tracer.span("risky_operation")
def process():
    try:
        might_fail()
    except ValueError as e:  # (1)
        raise
- Exception automatically recorded with type, value, and full traceback. Span status set to ERROR.
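A minimal sketch of how a span decorator can record an exception and still let it propagate, using only the standard library. `span` and `SPANS` are hypothetical, and the record layout is an assumption rather than Scouter's span schema.

```python
import functools
import traceback

SPANS = []  # hypothetical in-memory sink for finished spans


def span(name):
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            record = {"name": name, "status": "OK", "exception": None}
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                record["status"] = "ERROR"
                record["exception"] = {
                    "type": type(exc).__name__,
                    "value": str(exc),
                    "traceback": traceback.format_exc(),
                }
                raise  # never swallow the caller's exception
            finally:
                SPANS.append(record)

        return wrapper

    return decorate


@span("risky_operation")
def process():
    raise ValueError("bad input")


try:
    process()
except ValueError:
    propagated = True
else:
    propagated = False
```

Recording in `except` and appending in `finally` means successful and failed calls both produce exactly one span record.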
Real-time Monitoring¶
In addition to the standard span methods, Scouter provides convenience methods for real-time monitoring that insert entity records into queues. This is especially useful for APIs where you want to correlate traces with drift detection or evaluation records.
import logging
from contextlib import asynccontextmanager
from pathlib import Path
from fastapi import FastAPI
from scouter import GrpcConfig
from scouter.queue import ScouterQueue
from scouter.tracing import get_tracer, init_tracer
logger = logging.getLogger(__name__)
# usually called once at app startup
init_tracer(service_name="monitoring-service")
# module-level tracer so that handlers and tasks can share it
tracer = get_tracer(name="monitoring-service")
# example fastapi lifespan
@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("Starting up FastAPI app")
    queue = ScouterQueue.from_path(
        path={"genai": Path(...)},
        transport_config=GrpcConfig(),
    )
    # set the queue on the tracer
    tracer.set_scouter_queue(queue)
    yield
    logger.info("Shutting down FastAPI app")
    queue.shutdown()
    tracer.shutdown()
def monitoring_task():
    with tracer.start_as_current_span("monitoring_task") as span:
        # insert items into queue with the span
        span.insert_queue_item(
            "alias",  # (1)
            GenAIEvaluationRecord(...)  # (2)
        )
- Alias to identify the queue
- Any Scouter entity record type can be inserted into the queue
Performance Considerations¶
Sampling¶
Control trace sampling rates to balance performance and observability.
Each OTEL exporter can be instantiated with a sampling ratio between 0.0 and 1.0:
from scouter.tracing import init_tracer, HttpSpanExporter, GrpcSpanExporter
# HTTP exporter
init_tracer(
    service_name="sampled-service",
    exporter=HttpSpanExporter(
        sample_ratio=0.25,  # (1)
    ),
)
# gRPC exporter
init_tracer(
    service_name="sampled-service-grpc",
    exporter=GrpcSpanExporter(
        sample_ratio=0.25,
    ),
)
- 25% of spans exported to OTEL collector
Note
Sample ratio in the above example only affects the OTEL exporter. If you wish to enforce the same sampling ratio for both Scouter and OTEL exporters, you must set the sample_ratio parameter in the init_tracer function directly.
Enforce Global Sampling Ratio for both Scouter and OTEL exporters:
from scouter.tracing import init_tracer, HttpSpanExporter
init_tracer(
    service_name="globally-sampled-service",
    sample_ratio=0.1,  # (1)
    exporter=HttpSpanExporter(),
)
- 10% of spans exported to both Scouter and OTEL collector. This overrides individual exporter sampling ratios.
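To make the meaning of a sampling ratio concrete, here is a sketch of a deterministic, trace-ID-based decision, loosely modeled on the ratio samplers found in OpenTelemetry SDKs. It is an illustration under that assumption and does not claim to be how Scouter decides; `should_sample` is a hypothetical helper.

```python
import random

TRACE_ID_LIMIT = 1 << 64  # only the low 64 bits of the trace ID are compared


def should_sample(trace_id: int, sample_ratio: float) -> bool:
    """Keep a trace when its low 64 bits fall below ratio * 2**64."""
    if not 0.0 <= sample_ratio <= 1.0:
        raise ValueError("sample_ratio must be between 0.0 and 1.0")
    bound = round(sample_ratio * TRACE_ID_LIMIT)
    return (trace_id & (TRACE_ID_LIMIT - 1)) < bound


# Simulate 10,000 random 128-bit trace IDs with a fixed seed.
rng = random.Random(0)
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
kept = sum(should_sample(t, 0.25) for t in trace_ids)
```

Deciding from the trace ID rather than a fresh random draw means every span of a given trace gets the same answer, so traces are kept or dropped whole.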
Batch Export¶
Spans are exported in batches by default. Use BatchConfig to customize the batch settings as needed:
from scouter.tracing import BatchConfig, init_tracer, GrpcSpanExporter
init_tracer(
    service_name="high-throughput-service",
    batch_config=BatchConfig(
        max_queue_size=4096,
        scheduled_delay_ms=1000,  # (1)
        max_export_batch_size=1024,
    ),
    exporter=GrpcSpanExporter(batch_export=True),  # (2)
)
# Alternatively, omit batch_config to use the defaults
init_tracer(
    service_name="high-throughput-service",
    exporter=GrpcSpanExporter(batch_export=True),  # (3)
)
- Export spans every 1 second in batches
- Ensure exporter is set to batch mode (default is True)
- Exporter uses default batch settings if not specified
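The interaction between `max_queue_size` and `max_export_batch_size` can be sketched with a small synchronous batcher. `DemoBatcher` is hypothetical, and its default values are illustrative rather than Scouter's; a real batch processor would call `flush()` from a background timer every `scheduled_delay_ms` instead of by hand.

```python
from collections import deque


class DemoBatcher:
    """Hypothetical bounded queue that exports spans in fixed-size batches."""

    def __init__(self, export, max_queue_size=2048, max_export_batch_size=512):
        if max_export_batch_size > max_queue_size:
            raise ValueError("max_export_batch_size cannot exceed max_queue_size")
        self._export = export
        self._queue = deque()
        self._max_queue_size = max_queue_size
        self._max_batch = max_export_batch_size
        self.dropped = 0

    def on_end(self, span) -> None:
        # When the queue is full, drop the span rather than block the caller.
        if len(self._queue) >= self._max_queue_size:
            self.dropped += 1
            return
        self._queue.append(span)

    def flush(self) -> None:
        # Drain the queue in chunks of at most max_export_batch_size.
        while self._queue:
            size = min(self._max_batch, len(self._queue))
            self._export([self._queue.popleft() for _ in range(size)])


batches = []
batcher = DemoBatcher(batches.append, max_queue_size=5, max_export_batch_size=2)
for i in range(7):
    batcher.on_end(f"span-{i}")
batcher.flush()
```

In this run the queue holds five spans, so two are dropped and the rest leave in batches of two, two, and one. A larger queue absorbs bursts at the cost of memory; a shorter delay lowers export latency at the cost of more requests.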
Input/output Truncation¶
Large span inputs/outputs can be truncated to reduce payload size:
- Automatically truncates inputs/outputs exceeding 5000 characters
- Manually specify max length for input/output serialization
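A minimal sketch of length-based truncation, assuming inputs and outputs are serialized to JSON before the limit is applied. `serialize_truncated` and its `max_length` parameter are hypothetical names used for illustration; only the 5000-character default comes from the text above.

```python
import json

DEFAULT_MAX_LENGTH = 5000  # threshold quoted in the text above


def serialize_truncated(value, max_length: int = DEFAULT_MAX_LENGTH) -> str:
    """Serialize a value to JSON text and cut it at max_length characters."""
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    # default=str keeps non-JSON types (dates, paths) from raising.
    text = json.dumps(value, default=str)
    if len(text) <= max_length:
        return text
    return text[:max_length]


small = serialize_truncated({"a": 1})
large = serialize_truncated("x" * 10_000)
custom = serialize_truncated(list(range(100)), max_length=10)
```

Note that a truncated payload is generally no longer valid JSON, so consumers should treat it as display text rather than parse it.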