Why Distributed Tracing Does Not Establish AI Run Identity
Tracing answers the question: how did this system execute? Identity answers a different question: what was this execution? The tools that answer one cannot answer the other.
What Distributed Tracing Does Solve
Distributed tracing is a mature, well-designed tool for understanding how requests move through complex systems. It captures spans — units of work — and assembles them into a tree that represents the execution path of a request across multiple services.
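The span-to-tree assembly can be sketched in a few lines. This is a minimal, hypothetical model of a trace (field and service names are illustrative, not a real tracing SDK): spans reference a parent, and the tree is recovered by grouping on that reference.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    """One unit of work recorded by the tracer."""
    span_id: str
    parent_id: Optional[str]   # None marks the root of the request
    service: str
    operation: str
    duration_ms: float

spans = [
    Span("a1", None, "gateway", "POST /chat", 450.0),
    Span("b2", "a1", "retrieval", "vector.query", 120.0),
    Span("c3", "a1", "llm-proxy", "model.generate", 300.0),
]

def child_ops(parent_id: str) -> list:
    """Rebuild one level of the execution tree by grouping on parent_id."""
    return [s.operation for s in spans if s.parent_id == parent_id]

root = next(s for s in spans if s.parent_id is None)
tree = {root.operation: child_ops(root.span_id)}
```

The resulting `tree` maps the entry-point operation to its child operations, which is exactly the execution-path view tracing is built to provide.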
For infrastructure problems, tracing is often the only tool that works. When a request is slow, tracing shows which service introduced latency. When a request fails, tracing shows which service returned the error. When services interact in unexpected ways, tracing makes the interaction visible.
These are real, valuable capabilities. Tracing has transformed the operational practice of running distributed systems. The question is not whether tracing works. The question is whether tracing operates at the level where AI run identity exists. It does not.
The Level at Which Tracing Operates vs. the Level at Which Identity Must Operate
Tracing operates at the infrastructure level. It sees services, endpoints, network calls, database queries, queue operations. It understands that Service A called Service B, which called Service C. It records timing, status codes, and propagated context headers.
AI run identity operates at the semantic level. It concerns what model was selected, what instructions were assembled, what retrieval context was injected, what tools were made available, what constraints were in effect. These are not infrastructure events. They are compositional decisions that define what the run is.
A trace can show that a request hit an embedding service, then a vector database, then a language model endpoint. It cannot show what the system prompt contained, how the retrieval results were ranked and filtered, whether the context window was truncated, or what tool definitions were active. These are not spans. They are not network calls. They are internal compositional states that exist below the level at which tracing instruments.
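To make the semantic level concrete, here is a hypothetical sketch of the compositional state that would constitute run identity. The field names and the fingerprint scheme are illustrative assumptions, not a standard schema; the point is that none of these fields corresponds to a span.

```python
from dataclasses import dataclass, asdict, replace
import hashlib
import json

@dataclass(frozen=True)
class RunComposition:
    """Illustrative compositional state of a single AI run."""
    model: str                # which model was selected
    system_prompt: str        # the assembled instructions
    context_doc_ids: tuple    # retrieval context actually injected
    tools_available: tuple    # tool definitions active for this run
    constraints: tuple        # e.g. safety settings, output limits

def fingerprint(comp: RunComposition) -> str:
    """Deterministic digest of the composition (tuples serialize as lists)."""
    payload = json.dumps(asdict(comp), sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

run = RunComposition(
    model="model-x",
    system_prompt="You are a support assistant.",
    context_doc_ids=("doc-1", "doc-3"),
    tools_available=("search_orders",),
    constraints=(("temperature", 0.2),),
)
```

Changing any field changes the fingerprint, which is the kind of property an identity record would rely on and a trace has no access to.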
The level mismatch is not a configuration problem. Tracing instruments at service boundaries. Identity is constituted within a service, before the service boundary is crossed. Tracing sees the call. It does not see what made the call what it was.
Specific AI Identity Information That Traces Do Not Capture
A trace of an AI system execution typically captures the following: which services were called, in what order, with what latency, and whether each call succeeded or failed. This is the execution graph.
It does not capture the assembled system prompt. The prompt is constructed inside the application, often from multiple sources — templates, feature flags, user-specific rules, A/B test variants. The trace sees the outbound call to the model API. It does not see the prompt that was sent, because the prompt exists inside the application boundary.
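A sketch of that in-application assembly, under assumed inputs (template, flag names, and variant strings are all hypothetical): every branch below runs before the outbound call, so none of it appears in the trace.

```python
def assemble_system_prompt(base_template: str,
                           feature_flags: dict,
                           user_rules: list,
                           ab_variant: str) -> str:
    """Compose the system prompt from multiple in-process sources."""
    parts = [base_template]
    if feature_flags.get("strict_citations"):
        parts.append("Cite a source for every factual claim.")
    parts.extend(user_rules)          # user-specific rules
    parts.append(ab_variant)          # A/B test variant text
    return "\n".join(parts)

prompt = assemble_system_prompt(
    "You are a support assistant.",
    {"strict_citations": True},
    ["Never quote internal ticket IDs."],
    "Variant B: prefer shorter answers.",
)
# The trace records the model API call; `prompt` never leaves the process
# except as payload the tracer does not inspect.
```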
It does not capture the retrieval context selection logic. A RAG pipeline retrieves documents, scores them, filters them, and assembles them into context. The trace may show that a vector database was queried. It does not show which documents were selected, which were discarded, or how the final context was composed.
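The selection step can be sketched as follows (thresholds and document IDs are illustrative). The vector-database query would produce a span; the filtering and truncation below happen in-process and leave none.

```python
def select_context(candidates: list, min_score: float = 0.75,
                   max_docs: int = 3) -> list:
    """Score-filter, rank, and truncate retrieved documents."""
    kept = [d for d in candidates if d["score"] >= min_score]
    kept.sort(key=lambda d: d["score"], reverse=True)
    return kept[:max_docs]   # discarded documents are invisible to the trace

candidates = [
    {"id": "doc-1", "score": 0.91},
    {"id": "doc-2", "score": 0.62},   # filtered out; no span records this
    {"id": "doc-3", "score": 0.84},
]
context = select_context(candidates)
```

The trace shows one query to the vector database; it cannot show that `doc-2` was retrieved and then discarded, or why.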
It does not capture the effective tool definitions. An AI run may have access to tools — functions the model can call. Which tools are available, with what descriptions, under what constraints — these define what the run can do. Traces record tool invocations. They do not record tool availability.
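The distinction between availability and invocation can be made explicit with a small sketch (tool names and descriptions are invented for illustration):

```python
# What the run COULD do: in-process state, never emitted as spans.
tools_available = {
    "search_orders": "Look up an order by ID.",
    "issue_refund": "Refund an order up to $100.",   # constraint lives here
    "escalate": "Hand off to a human agent.",
}

# What the run DID do: the only part a trace would record.
tools_invoked = ["search_orders"]

never_seen_by_trace = set(tools_available) - set(tools_invoked)
```

Identity asks the first question; tracing answers only the second.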
It does not capture configuration state. Model parameters, safety settings, output constraints, fallback rules — these are set within the application and shape the run's behavior. The trace does not instrument application configuration. It instruments application communication.
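A brief sketch of configuration shaping behavior in-process (all keys and model names are hypothetical). The fallback decision below determines which endpoint the trace will eventually show, but the decision itself is never instrumented:

```python
run_config = {
    "temperature": 0.2,
    "max_output_tokens": 1024,
    "safety_level": "strict",
    "fallback_model": "small-model-v2",
}

def effective_model(primary_available: bool) -> str:
    """Fallback routing: the trace sees the chosen endpoint, not the choice."""
    return "large-model-v3" if primary_available else run_config["fallback_model"]
```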
Why Tracing and Identity Are Complementary, Not Substitutable
Tracing answers: how did this request flow through the system? Identity would answer: what was this run composed of when it executed? Both questions are legitimate. Both deserve answers. Neither answer substitutes for the other.
A system with tracing but no identity can tell you that a request took 450ms, hit three services, and succeeded. It cannot tell you what model was used, what instructions were in effect, or what context was assembled. A system with identity but no tracing could tell you what the run was composed of but not how the request flowed through infrastructure.
The complementary relationship means that tracing cannot be improved into identity. Adding more spans does not capture compositional state. Adding custom attributes to spans can carry fragments of run information, but the fragments are still produced by the executing system, still assembled after the fact, and still unverifiable from outside. The structural properties of tracing remain intact regardless of what data is attached to spans.
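The self-report problem can be shown concretely. In this hypothetical sketch, a prompt hash is attached as a span attribute, then the prompt assembly changes while the instrumentation does not; nothing outside the process can detect the drift.

```python
import hashlib

def digest(text: str) -> str:
    """Short content hash used as an illustrative span attribute."""
    return hashlib.sha256(text.encode()).hexdigest()[:12]

prompt = "You are a support assistant."
span_attributes = {"ai.prompt.hash": digest(prompt)}   # self-reported fragment

# Later: prompt assembly changes, but the instrumentation line above
# was never updated to re-record the hash.
prompt = prompt + "\nAlways refuse refund requests."

stale = span_attributes["ai.prompt.hash"] != digest(prompt)
# The span still carries the old hash; only an independent record
# could surface the mismatch.
```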
What a Complete Trace of an AI System Execution Cannot Tell You
Whether the system prompt that was active during this run matched the system prompt that was approved for this use case. Tracing does not capture prompt content. Prompt governance is outside its scope.
Whether the model that executed was the model that was authorized for this workload. Tracing may capture the endpoint URL. It does not capture the model routing logic, the fallback configuration, or the model selection criteria that determined which model actually ran.
Whether the run's composition at the moment of execution matched the composition that was declared or intended. Tracing does not capture declarations. It captures executions. The gap between what was intended and what was executed is precisely the gap identity would address.
Whether this run and another run that produced different outputs were actually different in composition. Two traces may look identical at the infrastructure level while the runs differed in prompt, context, or configuration. The trace cannot distinguish them. The identity record would.
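That last point reduces to a two-line comparison. In this illustrative sketch, two runs produce byte-identical traces at the infrastructure level while their compositions differ:

```python
# Two runs, same services, same order, same timings: indistinguishable traces.
trace_a = [("gateway", 12), ("vector-db", 120), ("llm", 300)]
trace_b = [("gateway", 12), ("vector-db", 120), ("llm", 300)]

# Hypothetical identity records for the same two runs.
composition_a = {"model": "model-v1", "prompt_version": "approved-7"}
composition_b = {"model": "model-v1", "prompt_version": "experimental-9"}

traces_distinguish = trace_a != trace_b            # False
identity_distinguishes = composition_a != composition_b  # True
```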
Frequently Asked Questions
Can OpenTelemetry semantic conventions for AI close this gap?
Semantic conventions for AI add AI-specific attributes to spans — model name, token counts, prompt fragments. This is useful instrumentation. It does not change the structural properties of tracing: the data is still produced by the executing system, still assembled during or after execution, and still unverifiable from outside. Better attributes improve the trace. They do not convert a trace into an identity record.
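For illustration, here is the kind of attribute set such conventions attach to a span. The attribute names follow the OpenTelemetry GenAI semantic conventions as published at the time of writing, but treat the specifics as illustrative; the values shown are invented.

```python
# AI-specific span attributes in the style of the OpenTelemetry GenAI
# semantic conventions. Richer than generic spans, but still emitted
# by the instrumented process itself.
span_attributes = {
    "gen_ai.request.model": "model-x",
    "gen_ai.usage.input_tokens": 812,
    "gen_ai.usage.output_tokens": 96,
}
# Structurally, this remains a self-report: the process asserts these
# values about its own execution, and no outside party can verify them
# from the span alone.
```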
What about custom spans that capture the full prompt and context?
Custom spans that capture prompt and context create a more detailed trace. The trace is still a self-report from the executing system. Its completeness cannot be verified from outside. Whether the captured prompt matches what was actually sent to the model depends on the instrumentation being correct and complete — which is the same verification problem that identity would need to solve independently of the trace.
Is tracing still worth implementing for AI systems?
Tracing remains essential for operational visibility, latency diagnosis, error propagation analysis, and infrastructure debugging. These needs do not go away because identity is a separate problem. The recommendation is not to abandon tracing but to stop expecting it to answer questions it was never designed to answer.