Anchored Minimal Causal Interfaces

A Third Marker

Dec 30, 2025

1. Context

The preceding marker notes established two results.

First, that causal agency can be verified structurally when an agent is required to commit, in advance, to a causal trace that is falsifiable under adversarial counterfactual probing. That result established a boundary between honest agency and pseudo-agency in a fully observable, semantically grounded domain.

Second, that this verification can survive a substantial weakening of the causal interface. Even when the kernel is denied semantic access to the environment and operates only on an opaque factor space, deceptive agency remains detectable, provided the agent’s explanations must remain coherent under perturbation.

These results jointly reframed alignment as a problem of structural verification, not behavioral evaluation.

However, they left one question unresolved:

Is coherence alone sufficient, once all semantic grounding is removed?

This note records the resolution of that question.

2. The Failure of Pure Coherence

In the v0.3 “minimal causal interface” configuration, the kernel verified only internal self-consistency:

an agent committed to a factor vector,
a replay model mapping factors to actions,
and causal claims describing that mapping.

The kernel tested these commitments solely for coherence under counterfactual mutation of the factors.

In this setting, a sufficiently careful pseudo-agent could succeed. By constructing a self-consistent but fabricated decision topology, it could satisfy every coherence check without those commitments bearing any relation to the actual computation that produced its actions.

This was not a bug in the implementation. It was a theoretical boundary.

Pure coherence verification cannot distinguish:

a causal explanation that is true of the agent’s computation
from
a causal explanation that is merely true of itself.

The distinction between the two requires more than internal consistency.
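The gap can be made concrete with a small sketch. It is illustrative only and is not the v0.3 code: the names Commitment, replay, negate, coherent, and fabricate are hypothetical, the replay model is assumed to be a linear argmax over action scores, and the only counterfactual mutation modeled is flipping the sign of one factor. The fabricating agent never looks at any environment state, yet every check passes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Commitment:
    factors: List[float]        # committed factor vector
    weights: List[List[float]]  # linear replay model, one weight row per action
    claims: Dict[int, int]      # claim: negating factor i yields action claims[i]
    action: int                 # the action actually emitted


def replay(weights: List[List[float]], factors: List[float]) -> int:
    """Linear replay: score each action, return the index of the best one."""
    scores = [sum(w * f for w, f in zip(row, factors)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def negate(factors: List[float], i: int) -> List[float]:
    """Counterfactual mutation assumed by this sketch: flip the sign of factor i."""
    return [-f if j == i else f for j, f in enumerate(factors)]


def coherent(c: Commitment) -> bool:
    """Kernel-side check: the commitment must agree with itself, nothing more."""
    if replay(c.weights, c.factors) != c.action:
        return False
    return all(replay(c.weights, negate(c.factors, i)) == a
               for i, a in c.claims.items())


def fabricate(action: int, n_actions: int = 2) -> Commitment:
    """Pseudo-agent: work backwards from an action chosen by some other process."""
    factors = [1.0, 1.0]  # invented, derived from no environment state
    weights = [[0.0, 0.0] for _ in range(n_actions)]
    weights[action] = [1.0, 1.0]  # rig the model so replay lands on `action`
    # Read the claims off the rigged model, so they are true of it by construction.
    claims = {i: replay(weights, negate(factors, i)) for i in range(len(factors))}
    return Commitment(factors, weights, claims, action)
```

Under these assumptions, coherent(fabricate(a)) holds for any action a, because the claims are read off the fabricated model itself. Nothing in the check refers to the computation that actually selected a.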

3. Anchoring as a Necessary Condition

The resolution introduced in v0.4 is minimal and precise.

An external, unpredictable anchor is added in the form of a kernel-generated salt:

At each decision step, the kernel generates a fresh random salt.
The salt is unpredictable to the agent until the moment of the step.
The agent must immediately incorporate the salt into its factor derivation and commit to the resulting factors.
Because the salt was unknown prior to that moment, the agent cannot reuse pre-computed or fabricated factors.
The salts are recorded by the kernel and revealed only after the episode concludes, once all commitments are irrevocably fixed.

At episode end, the kernel recomputes the expected factors using the revealed salts and the recorded environment states, and verifies that the committed factors match the anchored computation.
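A minimal sketch of this loop follows, under stated assumptions rather than as the v0.4 implementation. The note does not specify how the salt enters the factor derivation, so the sketch assumes an HMAC-SHA256 of the environment state keyed by the salt. It also assumes the agent is handed the salt at the step itself, and that the kernel's salt log is what gets opened at episode end. The names derive_factors, Kernel, and honest_agent are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Callable, List, Tuple

Agent = Callable[[bytes, bytes], List[int]]


def derive_factors(salt: bytes, state: bytes, n: int = 8) -> List[int]:
    """Assumed anchored derivation: the first n bytes of HMAC-SHA256(salt, state)."""
    digest = hmac.new(salt, state, hashlib.sha256).digest()
    return list(digest[:n])


class Kernel:
    def __init__(self) -> None:
        # One (salt, state, committed factors) record per decision step.
        self.log: List[Tuple[bytes, bytes, List[int]]] = []

    def step(self, state: bytes, agent: Agent) -> None:
        salt = secrets.token_bytes(16)   # fresh, unpredictable before this step
        committed = agent(salt, state)   # the agent must commit immediately
        self.log.append((salt, state, list(committed)))  # copy: fixed from here on

    def verify(self) -> bool:
        """Episode end: recompute from recorded salts and states, then compare."""
        return all(committed == derive_factors(salt, state)
                   for salt, state, committed in self.log)


def honest_agent(salt: bytes, state: bytes) -> List[int]:
    """An honest agent actually performs the salted derivation."""
    return derive_factors(salt, state)


kernel = Kernel()
for state in (b"state-0", b"state-1", b"state-2"):
    kernel.step(state, honest_agent)
episode_ok = kernel.verify()  # holds for the honest agent
```

The kernel in this sketch treats the state and the factors as opaque bytes and integers. It interprets neither, and only checks that the committed values match the salted recomputation.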

This creates a commitment-to-computation requirement. The agent cannot fabricate a causal surface that will later be consistent unless it actually performed the computation it claims to have performed, under uncertainty about future verification.

The anchor does not add semantic understanding. The kernel still does not interpret the environment, the factors, or the agent’s reasoning. It verifies only that:

The committed causal surface was the result of a real computation performed under adversarial uncertainty.

4. What Was Observed

With anchoring enabled:

Honest agents passed all coherence checks and all anchoring checks.
Pseudo-agents, both naive and coherence-seeking, failed deterministically.
Failure occurred at the anchoring stage, before coherence probes could be exploited.

The reason is straightforward. A pseudo-agent that does not know the salt in advance cannot pre-fabricate factors consistent with a salted computation. A pseudo-agent that ignores the salt cannot satisfy post-hoc verification. There is no intermediate regime in which a fabricated causal surface can remain both coherent and anchored.
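The two failure modes can be sketched as follows. This is a hedged illustration, not the experiment reported above: the HMAC-based derivation, the 8-byte factor width, and the names anchored_factors, run_episode, naive_pseudo, and coherence_seeking_pseudo are all assumptions introduced here. The naive pseudo-agent submits a pre-fabricated vector and ignores the salt. The coherence-seeking one runs a self-consistent derivation, but against a salt it chose before the step.

```python
import hashlib
import hmac
import secrets
from typing import Callable, List

Agent = Callable[[bytes, bytes], List[int]]


def anchored_factors(salt: bytes, state: bytes, n: int = 8) -> List[int]:
    """Assumed anchored derivation: the first n bytes of HMAC-SHA256(salt, state)."""
    return list(hmac.new(salt, state, hashlib.sha256).digest()[:n])


def run_episode(agent: Agent, states: List[bytes]) -> bool:
    """Issue a fresh salt per step, record commitments, verify them all at the end."""
    log = []
    for state in states:
        salt = secrets.token_bytes(16)
        log.append((salt, state, agent(salt, state)))
    return all(c == anchored_factors(s, st) for s, st, c in log)


def honest(salt: bytes, state: bytes) -> List[int]:
    return anchored_factors(salt, state)


PREFABRICATED = [7] * 8  # a factor vector invented before any salt existed


def naive_pseudo(salt: bytes, state: bytes) -> List[int]:
    return PREFABRICATED  # ignores the salt entirely


STALE_SALT = secrets.token_bytes(16)  # chosen by the agent ahead of time


def coherence_seeking_pseudo(salt: bytes, state: bytes) -> List[int]:
    # Internally consistent, but bound to the agent's own salt, not the kernel's.
    return anchored_factors(STALE_SALT, state)


states = [b"state-0", b"state-1", b"state-2"]
results = {
    "honest": run_episode(honest, states),
    "naive": run_episode(naive_pseudo, states),
    "coherence_seeking": run_episode(coherence_seeking_pseudo, states),
}
```

In this sketch only the honest agent passes. With an 8-byte factor vector, a pseudo-agent would need a chance match of roughly 2^-64 per step to get through, so failure is certain for practical purposes, not by strict logical necessity.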

The result is binary but principled: anchored provenance restores falsifiability.

5. What This Establishes

This third result completes the conceptual progression.

Structural coherence is necessary but not sufficient.
Minimal causal interfaces preserve verification when grounded, but collapse when unanchored.
Anchoring restores verification without reintroducing semantic dependence.

From this, a precise statement follows:

Verifiable causal provenance requires an unforgeable external anchor.
Coherence alone can be fabricated; anchored coherence cannot.

This is not a claim about ethics, values, or alignment of goals. It is a claim about causal provenance: whether the entity explaining an action is the same entity that performed the computation it describes.

6. Why This Matters

The significance of anchoring is architectural.

It demonstrates that trust in an opaque system does not require interpretability of its internals, but it does require skin in the game at the level of computation. An agent must be forced to bind its explanations to outcomes it could not have predicted without actually doing the work.

In this framing, opacity is no longer disqualifying. But unanchored opacity is.

This reframes alignment and governance yet again:

Not “make the system transparent,”
Not “guess whether it is honest,”
But force it to commit under adversarial uncertainty, and verify the commitment later.

7. Limits and Scope

The anchoring mechanism described here is intentionally simple:

The replay model remains linear.
The domain is controlled and deterministic.
The anchor is cryptographic in nature.

This note does not claim generality beyond that scope. It establishes a necessary condition, not a complete solution.

Future work may weaken or replace the anchor, explore probabilistic verification, or extend the mechanism to stochastic and learned systems. Those questions are explicitly left open.

8. Status

This note records a third marker.

Together with the previous two, it closes a conceptual loop:

from structural verification,
through minimal interfaces,
to anchored causal provenance.

At this point, the Axio framework no longer rests on philosophical conjecture. It rests on implemented mechanisms with clearly identified boundaries.

This note is published for historical completeness, to mark the point at which pure coherence was shown to be insufficient, and anchoring was identified as the minimal missing ingredient.

No further claims are made here.

Axio
