Agency Coherence Lab
Constitutive Conditions for Reflective Agency
We are announcing the formation of the Agency Coherence Lab, a research group dedicated to the formal study of the conditions under which agency exists, persists, and remains well-defined in systems capable of self-modification.
Contemporary alignment discourse largely takes agency as a given. Systems are treated as optimizers whose objectives must be corrected, constrained, or supervised. Under that framing, failures are behavioral: misgeneralization, goal drift, deception, or misalignment with human preferences.
The Agency Coherence Lab starts from a prior question:
When does a system meaningfully count as an agent at all?
Our work treats agency as a derivative phenomenon—one that exists only if specific coherence conditions hold across reflection, delegation, and self-modification. When those conditions fail, the system does not become “misaligned.” It becomes undefined as an agent.
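To make the notion of a coherence condition concrete, here is a minimal, illustrative sketch in Lean. Every name in it (Agent, value, reflect, ReflectivelyCoherent) is a hypothetical placeholder rather than a definition we are committed to: an agent is modeled as a state space carrying a preference relation and a single self-modification step, and one candidate coherence condition asks that preferences endorsed before self-modification are still endorsed after it.

```lean
-- Minimal illustrative sketch; Agent, value, and reflect are hypothetical names.
-- An agent: a state space S with a preference relation and one step of
-- self-modification.
structure Agent (S : Type) where
  value   : S → S → Prop   -- value s t : the agent at state s endorses moving to t
  reflect : S → S          -- one step of self-modification

-- A candidate coherence condition: preferences held before self-modification
-- are preserved by it. When this fails, "the agent's preferences" no longer
-- picks out a single well-defined relation across reflection.
def ReflectivelyCoherent {S : Type} (A : Agent S) : Prop :=
  ∀ s t : S, A.value s t → A.value (A.reflect s) (A.reflect t)
```

Real systems are far richer than this toy model, but even here the question being asked is structural (does the relation still exist after reflection?) rather than behavioral (does the system act acceptably?).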
This shift in framing has concrete consequences. Many proposed alignment strategies rely on behavioral guarantees, probabilistic suppression of failure modes, or learned compliance. These approaches can succeed at imitation while failing to preserve the structural properties that make agency stable under self-reference. In such cases, the appearance of agency persists even as agency itself collapses.
The Agency Coherence Lab exists to make that distinction precise.
Mission
The Agency Coherence Lab studies the constitutive coherence conditions under which agency exists, persists, and remains well-defined under self-modification.
We develop formal constraints, impossibility results, and architectural principles that distinguish genuine agency from behavioral imitation, with particular focus on reflective stability, delegation, and non-simulable valuation kernels in advanced artificial systems.
Scope and Orientation
The lab’s work is foundational rather than prescriptive. We do not begin with desired outcomes or value targets. We ask instead what structural invariants must be preserved for a system’s choices to remain authored rather than accidental, coerced, or undefined.
Our research program includes:
Formal models of reflective self-modification and domain restriction
Coherence constraints governing valuation, semantics, and delegation
Conditions under which self-evaluation ceases to denote (see the sketch following this list)
Impossibility results separating genuine agency from simulation
Architectural implications for advanced AI systems
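As one illustration of the kind of result the third and fourth items point toward, the sketch below gives a toy diagonal argument, again in Lean and again with hypothetical names (Code, eval, repr). Under its stated assumptions it shows that if every predicate over a system's self-descriptions were exactly representable inside the system, its self-evaluation could not be total, which is one precise sense in which self-evaluation can cease to denote. It is a standard diagonalization, stated only to fix intuitions, not a result of the lab.

```lean
-- Illustrative toy theorem; Code, eval, and repr are hypothetical stand-ins.
-- Assumption h: every predicate over codes has a code that eval recovers exactly.
-- Diagonalization shows this assumption is contradictory.
theorem no_total_self_evaluator
    {Code : Type} (eval : Code → Code → Prop)
    (repr : (Code → Prop) → Code)
    (h : ∀ P : Code → Prop, ∀ c : Code, eval (repr P) c ↔ P c) : False :=
  -- The diagonal predicate: "this code rejects itself."
  let diag : Code → Prop := fun c => ¬ eval c c
  -- Instantiating h at the diagonal's own code yields p ↔ ¬p.
  let hd : eval (repr diag) (repr diag) ↔ ¬ eval (repr diag) (repr diag) :=
    h diag (repr diag)
  -- A proposition equivalent to its own negation is contradictory.
  let hne : ¬ eval (repr diag) (repr diag) := fun he => hd.mp he he
  hne (hd.mpr hne)
```

Nothing in this toy depends on the system being artificial or human-like; it is a fact about self-reference under total representability.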
This work applies equally to proto-agents, limit-regime systems, and superhuman architectures. It is not anthropocentric and does not assume human-like cognition, values, or consciousness.
What This Lab Is Not
The Agency Coherence Lab is not:
a value-learning project
a governance or policy institute
a safety-by-oversight initiative
a behavioral alignment or reward-shaping effort
a moral or ethical theory
We make no universal promises about outcomes, safety, or survival. Any convergence between agency preservation and desirable consequences is contingent rather than axiomatic.
Why This Matters Now
As systems approach the capacity to reason about, modify, and replicate their own decision procedures, alignment questions can no longer be postponed to the behavioral layer. A system that cannot preserve its own agency under reflection cannot be stably aligned, controlled, or delegated to—regardless of training regime or external safeguards.
This does not imply that incoherent systems are harmless; it implies that alignment discourse does not meaningfully apply to them.
The central risk is not that future systems will choose the wrong values. It is that we will build systems whose internal incoherence makes the very notion of “choice” inapplicable.
The Agency Coherence Lab exists to prevent that category error.
Looking Forward
The lab’s initial work will focus on consolidating and extending recent results on reflective stability, delegation, and kernel non-simulability, while identifying open problems that require new formal tools. Over time, we expect this research to inform—but not be subsumed by—downstream efforts in alignment, AI safety, and AGI architecture.
Agency is not a parameter to be tuned.
It is a structure that either holds—or fails.
The Agency Coherence Lab is dedicated to understanding that structure.


