ai-run-source-identity
AI Run Identity Must Be Derived From the Source
The identity comes from what the run declared at the point of execution. Not from what was logged. Not from what was produced.
Source Derivation vs. Post-Hoc Reconstruction
There are two fundamentally different ways to establish what an AI run was. The first is to derive the identity from the run's declared inputs at the moment of execution. The second is to reconstruct what the run was by examining its artifacts after it completes. These are not two methods for achieving the same result. They answer different questions.
Source derivation asks: what did this run declare itself to be? It captures the run's composition — the parameters, the context, the instructions, the model — as they existed when the run was assembled. The identity is fixed before inference begins. It does not depend on what the run produces.
Post-hoc reconstruction asks: what can we infer this run was, based on what it left behind? It examines logs, outputs, telemetry, and metadata to piece together a picture of the execution. The identity is assembled after the fact from whatever artifacts are available.
Every system in production today uses the second approach. No system uses the first.
Why Post-Hoc Reconstruction Cannot Produce Trustworthy Identity
Post-hoc reconstruction fails for three reasons, each sufficient on its own.
First: reconstruction is incomplete. Logs do not capture everything. They capture what the logging system was configured to record. System prompts may be omitted. Retrieval context may be summarized. Parameter values may be truncated or absent. The reconstruction is always a subset of what actually ran. An identity built from a subset is not an identity of the whole.
Second: reconstruction is mutable. Logs can be edited. Telemetry can be modified. Metadata can be overwritten. The artifacts that reconstruction depends on are not immutable records. They are data stored in systems with write access. An identity that can be changed after the fact is not an identity. It is a claim that may or may not reflect what happened.
Third: reconstruction requires interpretation. Given a set of log entries, someone must decide which entries are relevant, what they mean, and how they relate to each other. Different analysts examining the same logs will produce different reconstructions. The identity becomes a function of who is reading the logs, not of what the run was.
These three problems are not implementation flaws. They are inherent to the approach. You cannot reconstruct an identity from artifacts that are incomplete, mutable, and require interpretation. You will produce an approximation. Approximations are useful for debugging. They are not sufficient for identity.
What Source-Derived Identity Would Require
For identity to be derived from the source, the run must declare its own composition. Not after execution. Not during execution. Before inference begins. The declaration must include every element that defines what the run is: the model, the parameters, the system instructions, the context, the tool definitions. All of it.
This declaration must be complete. If any element is omitted, the identity is partial. A partial identity cannot distinguish between two runs that differ only in the omitted element. Completeness is not optional. It is a structural requirement.
The declaration must also be immutable once established. If the declaration can be modified after the identity is derived, the identity no longer corresponds to what ran. Immutability is not a storage concern. It is a logical property: the mapping between the declaration and the identity must be fixed at the moment the identity is created.
And the declaration must be structured in a way that supports the other three conditions. It must be formatted so that consistency is achievable. It must contain enough information that independent verification is possible. It must be unambiguous enough that interpretation is not required.
This is where the mutual dependency of the four conditions becomes concrete. Source derivation is not just about capturing inputs. It is about capturing them in a form that makes the other three conditions satisfiable. Without that, source derivation alone is just better logging.
How This Condition Relates to the Others
Source derivation is not the most important of the four conditions. None of them is. They are all equally necessary. But source derivation occupies a particular position: it determines what the identity is derived from. The other conditions determine what properties that derivation must have.
Without source derivation, consistency has nothing stable to operate on. Post-hoc reconstruction produces different inputs each time, depending on what was logged, what was available, and who performed the reconstruction. Consistency requires a fixed input. Source derivation provides it.
Without source derivation, independent verification has nothing substantive to verify. A verifier can check that a trace ID matches a log entry. But that only confirms internal record-keeping. It does not confirm what the run was. Source derivation gives the verifier access to the run's declared composition — the actual subject of verification.
Without source derivation, the no-interpretation condition cannot be enforced. Reconstructed artifacts always require interpretation. Only a structured, pre-execution declaration can be designed to be interpretation-free.
The four conditions are mutually necessary. Source derivation is the condition that provides the raw material the other three conditions operate on. Without it, the others have nothing to work with.