The Collapse of Proof-of-Work Credentials

When production cost falls, institutions lose their favorite proxy for skill

Jul 05, 2026

A theorem used to prove more than its statement. It proved something about the mathematician. A polished painting proved something about the artist; a working program, something about the programmer. Many of our credentials were built on this: artifact proof-of-work. The artifact did object-level work, then social work. The inference was never perfect. People plagiarized, ghostwriters existed, assistants did invisible labor. Still, it worked well enough for institutions to rely on, because the hidden premise held: faking the artifact was expensive.

AI breaks that premise. It lowers the cost of producing credential-shaped artifacts: the essay, the image, the legal memo, the coding exercise. Sometimes it produces garbage, sometimes mediocrity with polish, sometimes work good enough to pass the external test. That last category is the institutional problem. The theorem may still be correct, the image beautiful, the program functional. The lost value is evidential. The artifact can satisfy the surface criteria without telling us much about the mind behind it.

The Broken Inference

The old inference was simple: an impressive artifact implied an impressive producer. After AI, it underdetermines too much. The artifact may mean the producer understands the domain, or prompted well, or edited something they barely understood, or got lucky. Human competence is still real. What broke is naïve attribution. The relevant question is what human capacity the final artifact still certifies.
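One way to make the lost evidential value concrete is a toy Bayesian update. This is only a sketch with invented numbers: the prior share of skilled producers and the chances that a skilled or unskilled person turns in an impressive artifact are all illustrative assumptions, not measurements. The point it shows is that the artifact itself can be unchanged while its power to certify the producer drops.

```python
def posterior_skilled(prior, p_artifact_given_skilled, p_artifact_given_unskilled):
    """Bayes' rule: probability the producer is skilled, given an impressive artifact."""
    numerator = prior * p_artifact_given_skilled
    denominator = numerator + (1 - prior) * p_artifact_given_unskilled
    return numerator / denominator

# All numbers below are hypothetical, chosen only to illustrate the shape of the argument.
PRIOR_SKILLED = 0.2          # assumed share of skilled producers in the pool
P_GIVEN_SKILLED = 0.9        # assumed chance a skilled person produces an impressive artifact

# Before cheap generation: faking the artifact is expensive, so unskilled producers rarely manage it.
before = posterior_skilled(PRIOR_SKILLED, P_GIVEN_SKILLED, 0.05)

# After cheap generation: unskilled producers often hold an impressive artifact too.
after = posterior_skilled(PRIOR_SKILLED, P_GIVEN_SKILLED, 0.60)

print(f"before: {before:.2f}, after: {after:.2f}")  # before: 0.82, after: 0.27
```

Under these assumed numbers the same artifact moves an evaluator from about 82 percent confidence to about 27 percent, barely above the prior. If the two likelihoods were equal, the artifact would carry no information about the producer at all.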

There is precedent. A calculator did not eliminate mathematical competence, but it changed which computations could serve as evidence. A camera changed the meaning of realistic depiction. Spellcheck weakened spelling as a proxy for literacy. AI generalizes the pattern across every domain that relied on expensive production as proof of ability.

In mathematics, a proof can now be correct without being assimilated. A proof blob may settle a question formally while adding little to understanding: a certificate rather than a concept. Priority stops being a proxy for conceptual power, and the scarce work shifts toward problem formation, exposition, and compressing results into teachable ideas. A theorem enters mathematics fully only when it becomes available to understanding.

In programming, a demo has always been cheaper than a system, and AI widens the gap. A candidate can generate a plausible application that works under friendly conditions; then the database has bad data, the dependency has a vulnerability, the tests check only the happy path, and the architecture cannot absorb the next feature. The credential question becomes whether the programmer can read the system, reason about failure, and make tradeoffs while the tools shift beneath them.

In art, polish once implied training. Now anyone can produce attractive images: lighting, texture, and surface drama with no coherent authorship behind them. That makes polish less scarce, not art obsolete. When generation is cheap, selection, taste, and sustained identity across a body of work matter more. A single impressive image means less; a coherent practice means more.

In education, the submitted essay was supposed to measure the student, and that instrument is broken. This is not mainly a cheating problem: the assessment no longer measures what it claims to measure. Surveillance cannot fix it. The detector tries to restore trust in the submitted artifact, but the artifact is the thing whose evidential value has already collapsed. Assessment has to move to retained understanding: oral defense, live derivation, critique of generated answers. Less convenient than grading stacks of homework, and more honest. The point of education was never homework-shaped objects; it was transformation of the student.

What Remains Scarce

AI makes outputs cheaper; it does not cheapen every capacity. Specification stays scarce: most people cannot frame the problem because they do not understand the domain. Taste, diagnosis, and repair stay scarce too. Generated work looks plausible at the surface and fails in the joints, and fixing it requires understanding the system behind it.

Above all, accountability stays scarce. A model cannot hold a license, carry malpractice insurance, sign an engineering report, or absorb liability when a bridge fails. Institutions will retain human credentialing because responsibility needs an address. Credentialing shifts from “Who made this?” toward “Who is competent to authorize this, and who bears the cost if it fails?”

None of these categories is safe from automation. AI will specify, diagnose, and repair better over time, and it will imitate the traces of process too. The scarce thing is not any fixed task but accountable control over a shifting toolchain. Credentials must move from artifact possession to demonstrated control under challenge, consequence, and continuity: challenge when the artifact is questioned or altered, consequence when it must survive real use, continuity when quality has to persist across time.

The Institutional Problem

Artifact credentials dominated because they gave institutions scalable proxies: grade the essay, count the publication, inspect the portfolio. Never philosophically clean, but administratively useful, and scale is a real constraint. A university with 40,000 students cannot orally examine every homework assignment; a company with 10,000 applicants cannot interview them all adversarially.

AI attacks the convenient proxy without providing an equally convenient replacement, and institutions will resist. They will police artifacts instead of redesigning assessments, ban tools they cannot exclude, and count outputs because counting is easier than judging. The result is credential inflation: more people holding impressive artifacts, fewer artifacts reliably indicating competence.

The institutions that adapt will use tiered evaluation. Cheap filters remain but count as weak evidence; stronger candidates face challenge; audits replace universal inspection; AI stress-tests artifacts as well as generating them. The danger is the first tier hardening into a pedigree filter that discards outsiders before they reach the challenge stage. Artifact proof-of-work was imperfect, but it let an outsider force attention: the artifact could speak before the institution knew whom it was speaking for. A serious post-AI system needs open challenge lanes, where anyone can submit a public artifact and some fraction advance by contest, audit, or demonstrated deployment rather than prior status. The artifact no longer deserves full trust. It still deserves a chance to be challenged.
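The tiered scheme with an open challenge lane can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a proposed implementation: the `Submission` fields, the `cutoff`, and the `open_lane_fraction` parameter are all hypothetical names, the open lane is modeled as a random audit of filtered-out submissions, and a boolean stands in for the outcome of a real challenge.

```python
import math
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Submission:
    name: str
    artifact_score: float      # cheap first-tier signal, treated as weak evidence
    survives_challenge: bool   # hypothetical stand-in for the result of a live challenge

def tiered_evaluation(submissions, cutoff, open_lane_fraction, rng):
    """Cheap filter first, then a sampled open lane, then challenge for everyone who advanced."""
    passed = [s for s in submissions if s.artifact_score >= cutoff]
    filtered_out = [s for s in submissions if s.artifact_score < cutoff]
    # Open lane: a fraction of filtered-out submissions still reach the challenge stage.
    lane_size = math.ceil(open_lane_fraction * len(filtered_out))
    open_lane = rng.sample(filtered_out, lane_size)
    challenged = passed + open_lane
    # Only the challenge outcome decides who is accepted, not the first-tier score.
    return sorted(s.name for s in challenged if s.survives_challenge)

pool = [
    Submission("polished_but_hollow", 0.9, False),
    Submission("insider_solid", 0.8, True),
    Submission("outsider_solid", 0.3, True),
    Submission("outsider_weak", 0.2, False),
]

no_lane = tiered_evaluation(pool, cutoff=0.5, open_lane_fraction=0.0, rng=random.Random(0))
with_lane = tiered_evaluation(pool, cutoff=0.5, open_lane_fraction=1.0, rng=random.Random(0))
print(no_lane)    # ['insider_solid']
print(with_lane)  # ['insider_solid', 'outsider_solid']
```

In this toy pool, the challenge stage removes the polished but hollow submission either way, while the solid outsider is only ever seen when the open lane exists. A realistic fraction would sit well below 1.0, which trades evaluation cost against how many such outsiders get a hearing.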

The Replacement Signals

There is no clean replacement, only a harder measurement problem. Live performance and oral defense are expensive, scale badly, and reward confidence, class markers, and interview training. Reputation compounds early advantage. The answer cannot be to replace artifact review with charisma review.

The better model is the challengeable artifact: still visible, public, and portable, but no longer standing alone. Show the program, then change a requirement and watch the system evolve. Show the essay, then defend it. Show the proof, then repair a gap. Process records count only weakly, because AI will generate fake drafts and rehearsed explanations too; the process artifact becomes just another artifact. The signals that survive gaming are deployment (did the work survive use?) and maintenance (can the person keep improving it after reality damages it?). Friendly evaluation rewards polish. Hostile evaluation finds structure.

The Apprenticeship Problem

Cheap generation also breaks training. Producing bad first versions was how people learned to make durable ones; juniors learned by writing clumsy code, proving routine lemmas, and repairing their own mistakes. Inefficient as production, efficient as formation. AI tempts institutions to remove that layer, and the developmental cost is less visible than the production savings. A programmer who never fought with state and deployment lacks the instincts to review generated systems; an artist who never made a thousand bad images lacks the taste to reject attractive nonsense. Some low-status work must survive as training after it stops making sense as production. Institutions that automate away the apprenticeship layer will eventually discover they have no one left who can supervise the automation.

Postscript

AI does not abolish merit; it forces a better account of it. Merit was never the artifact but the capacity the artifact imperfectly revealed. The panic around AI is partly about labor, partly about status, and partly epistemic, and the epistemic panic is the interesting one because it exposes the old system: we were never just rewarding outputs, we were using outputs to read minds. Now the minds are harder to read.

The answer is not nostalgia for expensive production. It is better evidence. Ask people to explain, repair, extend, defend, and take responsibility. The credentialing systems that survive will keep artifacts public, portable, and challengeable; preserve apprenticeship after first drafts stop making economic sense; and attach authority to liability rather than polish. The rest will keep rewarding artifacts whose evidential value has already collapsed.

Axio
