airunidentity.com

ai-run-identity-interpretation

AI Run Identity Must Not Depend on Interpretation

If establishing identity requires judgment, inference, or parsing decisions, the result is not identity. It is opinion with structure.

What Interpretation Means

What Interpretation Means in the Context of Identity

Interpretation is any step where a system or a person must make a judgment about what something means. Not what it contains. What it means. The distinction is precise and consequential.

Reading a value from a structured field is not interpretation. The field has a name. The value is present. The system reads it. But deciding which log entry corresponds to the system prompt — that is interpretation. Inferring the model version from an API response header — that is interpretation. Reconstructing the retrieval context from fragments scattered across multiple log lines — that is interpretation.

Interpretation enters any process where the relationship between the raw data and the identity is not explicit. When someone must decide what counts as part of the run, what order elements belong in, whether two values are equivalent, or what a missing field implies — interpretation is operating. And wherever interpretation operates, two parties examining the same data can reach different conclusions.

Identity that depends on interpretation is identity that depends on agreement. Agreement between systems. Agreement between teams. Agreement between the current reading and all future readings. That agreement does not exist in practice, and no convention or standard has ever achieved it at scale.

Why It Is Not Identity

Why Interpretation-Dependent Identity Is Not Identity

Identity must be singular. One run, one identity. If the process of establishing that identity involves judgment, then the identity is a function of the judge. Change the judge — change the system, the version, the configuration, the team — and the identity may change. An identity that changes with its observer is not an identity. It is a perspective.

This is not a philosophical distinction. It produces concrete failures. System A processes a run and derives an identity. System B processes the same run and derives a different identity. Both systems are functioning correctly within their own logic. The difference arises because each system interpreted some aspect of the input differently. Perhaps one system treated whitespace as significant and the other did not. Perhaps one system included a metadata field the other excluded. Perhaps one system normalized a value the other preserved verbatim.

Each of these differences is small. Each is defensible. And each is sufficient to produce a different identity for the same run. The problem is not that one system is wrong. The problem is that the identity was never fixed. It was always subject to interpretation, and interpretation always diverges.

When identity diverges, every downstream operation that depends on identity comparison fails silently. Audits match different records. Verifications confirm different things. Comparisons operate on different assumptions. The failure is invisible until someone explicitly checks whether two systems agree. They will not.

Where Interpretation Enters

Where Interpretation Enters Current Approaches

Interpretation is not an accidental feature of current identity approaches. It is embedded at every layer.

At the data layer, formats do not enforce a single representation. The same object can be serialized with keys in any order. Values can use different encodings. Optional fields may be present or absent. Each variation is semantically equivalent but structurally distinct. Any system that derives identity from the structure — rather than from a canonical form that eliminates structural variation — requires an interpretation step to decide which structural differences matter.

At the scope layer, there is no agreement on what constitutes a run. Does the run include the system prompt? Always, or only when it was explicitly set? Does it include retrieval context? What about tool definitions that were available but not invoked? Each decision about scope is an act of interpretation. Different systems make different decisions. They are all reasonable. None is authoritative.

At the semantic layer, values that look different may mean the same thing. A temperature of 0.7 and 0.70 are mathematically identical. A model name of "gpt-4" and "gpt-4-0613" may or may not refer to the same model depending on context. Deciding whether these are the same requires knowledge that is not present in the data itself. It requires interpretation.

Interpretation cannot be removed from these systems because it is not a flaw in their implementation. It is a consequence of using formats, schemas, and conventions that permit variation. Where variation is permitted, interpretation is required. Where interpretation is required, identity diverges.

What Would Be Required

What Interpretation-Free Identity Would Require

For identity to be free of interpretation, the representation of a run must permit no variation. Not "minimize variation." Not "standardize common cases." No variation at all. The same logical content must always produce the same physical form. If two representations of the same run can differ in any way — ordering, encoding, whitespace, field inclusion — then a system processing them must interpret which differences are meaningful. That interpretation step is exactly what must be eliminated.

This means the representation rules must be total. They must define how every type of value is encoded. How every edge case is handled. What happens with missing fields. What order elements appear in. What encoding is used for every character. The rules cannot defer to convention, because convention varies. They cannot defer to implementation, because implementations differ. The rules must be complete and unambiguous.

The scope must also be fixed. Which elements of a run are included in the identity cannot be a decision made by each system independently. It must be defined once and followed universally. If scope is configurable, then two systems with different configurations will produce different identities for the same run. The scope decision is itself an interpretation that must be removed.

And the derivation from representation to identity must be deterministic. Given the same representation, any system must produce the same identity. Not "the same identity within a tolerance." Not "the same identity for practical purposes." The same identity, bit for bit, every time.

These requirements are strict. They rule out most existing data formats, most existing serialization approaches, and most existing identity derivation methods. That is not an argument against the requirements. It is an observation about why no current system satisfies them.