Self-Improvement

The routine AI-agent fleet that reviews, updates, fixes, learns, aligns, and evolves the year-of-ai hub and framework — a self-monitoring, continuously improving "model for models."

Self-Improvement — models watching models

The hub grows the knowledge bases (AI Orchestration) and spawns new eras on its own. But growing isn’t enough — a system that writes itself also needs to watch and improve itself. This page identifies the routine agent fleet that does that: a meta-layer whose job is the health and continuous improvement of the growth system itself.

Today the central engine runs only the 3-tier grow tick. The framework’s own self-improvement mechanisms (learn, pollinate, distill, evolve) exist but are dormant — built for the old per-repo model and never wired into the central engine. The fleet below wires those up and fills the gaps, forming a closed loop.

The continuous-improvement loop

Each grow tick feeds telemetry into the loop; what the fleet learns is written back into the canonical framework, so the next tick improves across all repos.

Operating doctrine. Every agent that mutates content, repos, or secrets opens a pull request and never commits directly; the lone exception is the append-only telemetry ledger. The four highest-risk mutations (model swaps, model-tier changes, content rollbacks, and deletions & permission changes) require a human gate. A global kill-switch halts all mutation on demand, and writes are serialized so that no two agents race on the same branch.
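The doctrine reads naturally as a small decision procedure. The following is a minimal sketch, not the framework's implementation: `Mutation`, `decide`, and every `kind` string are hypothetical names chosen for illustration, and it assumes that human-gated mutations still travel through a pull request and that the kill-switch also pauses ledger appends.

```python
from dataclasses import dataclass

# Hypothetical labels for the four human-gated mutation kinds named in the doctrine.
HUMAN_GATED = {
    "model-swap",
    "model-tier-change",
    "content-rollback",
    "deletion-or-permission-change",
}


@dataclass(frozen=True)
class Mutation:
    agent: str   # which agent proposes the change
    kind: str    # e.g. "content-edit", "model-swap", "telemetry-append"
    target: str  # repo or resource being changed


def decide(m: Mutation, kill_switch_on: bool) -> str:
    """Return how a proposed mutation may proceed under the doctrine."""
    if kill_switch_on:
        return "halt"  # the global kill-switch stops all mutation (assumed to include the ledger)
    if m.kind == "telemetry-append":
        return "direct-commit"  # lone exception: the append-only telemetry ledger
    if m.kind in HUMAN_GATED:
        return "pull-request+human-gate"  # highest-risk: PR plus human approval (assumption)
    return "pull-request"  # everything else: a PR, never a direct commit
```

Checking the kill-switch first keeps the halt unconditional: no later branch can let a mutation through while the switch is on.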

The fleet (25 agents)

Each agent is tagged by origin: net-new (build it), rewire (wire up a dormant framework mechanism), or upgrade (make an existing deterministic tool agentic). Priorities run from P0 (most urgent) to P2.

Monitor / Observe

Agent	Pri	Origin	What it does
`telemetry-ledger-collector`	P0	upgrade	The P0 keystone every learn/cost/monitor agent depends on
`fleet-pause-killswitch`	P0	net-new	The single global halt the doctrine is missing
`pages-deploy-sentinel`	P0	net-new	Built — hourly read-only check that each member repo’s Pages build succeeded + the live site responds; files one sticky issue. The sole liveness signal a rollback agent keys off
`secret-expiry-watch`	P0	net-new	Built — daily single-attempt probe of `CLAUDE_CODE_OAUTH_TOKEN` + `LIFECYCLE_PAT` ahead of the growth window; files/auto-closes one issue (catches a dead token before ticks burn)
`fleet-health-watch`	P1	net-new	Built — daily read-only digest over the evolution ledger: error rate, cost trend, per-member last-grow, stalls; one auto-closing issue (the DETECT half of the loop)
`fleet-cost-governor`	P2	net-new	Cost rail on the orchestrator — pre-flight ceiling + budget writer
`lineage-state-report`	P2	net-new	Single human pane of glass; a pure aggregator over the other signals
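
The daily digest that `fleet-health-watch` produces can be pictured as a pure function over ledger records. This is a sketch under assumptions: the record fields (`member`, `status`, `cost_usd`, `at`) and the two-day stall threshold are hypothetical, since the ledger schema is not given here, and it reports total cost rather than a trend.

```python
from datetime import datetime, timedelta, timezone


def health_digest(records, now, stall_after=timedelta(days=2)):
    """Summarise ledger records: error rate, total cost, last grow per member, stalls."""
    total = len(records)
    errors = sum(1 for r in records if r["status"] == "error")

    # Latest successful grow per member.
    last_grow = {}
    for r in records:
        if r["status"] == "ok":
            prev = last_grow.get(r["member"])
            if prev is None or r["at"] > prev:
                last_grow[r["member"]] = r["at"]

    # A member is stalled if it never grew successfully or its last grow is too old.
    members = {r["member"] for r in records}
    stalled = sorted(
        m for m in members
        if m not in last_grow or now - last_grow[m] > stall_after
    )
    return {
        "error_rate": errors / total if total else 0.0,
        "cost_usd": sum(r["cost_usd"] for r in records),
        "last_grow": last_grow,
        "stalled": stalled,
    }
```

Because the function only reads its inputs, it matches the read-only posture described for the watcher: all side effects (filing or closing the issue) would live in a separate step.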

Review

Agent	Pri	Origin	What it does
`prepublish-gate`	P0	net-new	The biggest hole: an inline gate that builds + fact/link/tone-checks a tick before it publishes
`framework-pr-reviewer`	P0	upgrade	Safety for the self-modification path; gates every framework auto-merge
`injection-surface-auditor`	P1	net-new	Audits the workflow-injection surface (untrusted input into prompts; web-tools-with-write)
`published-content-auditor`	P2	upgrade	Audits published content for post-publish drift — link/image rot, stale facts, orphans
`content-license-attribution-auditor`	P2	net-new	Checks the copyright/attribution posture of sourced facts across the public sites
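
One slice of the `prepublish-gate` idea, the link check, can be sketched as below. It covers only internal Markdown links; the build, fact, and tone checks are out of scope. The function names and the `known_paths` input are hypothetical, and the regular expression is a simplification that ignores reference-style links.

```python
import re

# Inline Markdown links and images: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything starting with a URI scheme (https:, mailto:, ...) is treated as external.
SCHEME = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def broken_internal_links(markdown: str, known_paths: set[str]) -> list[str]:
    """Return internal link targets whose path is not a known page."""
    targets = LINK.findall(markdown)
    internal = [t for t in targets if not SCHEME.match(t) and not t.startswith("#")]
    # Strip any fragment before comparing against the known page paths.
    return [t for t in internal if t.split("#", 1)[0] not in known_paths]


def gate(markdown: str, known_paths: set[str]) -> dict:
    """Block publication when any internal link is broken."""
    broken = broken_internal_links(markdown, known_paths)
    return {"publish": not broken, "broken_links": broken}
```

Returning the list of broken targets alongside the verdict lets the gate explain why a tick was held back instead of failing silently.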

Fix

Agent	Pri	Origin	What it does
`repo-write-serializer`	P0	rewire	Built (convention) — the `repo-write-<repo>` concurrency lock `grow-lineage` holds + the documented framework/policy single-PR groups; future writers join them so no two race a branch
`repo-hygiene-warden`	P1	net-new	Enforces the invariant that a year repo holds only content + config + .claude + telemetry
`model-id-drift-checker`	P1	upgrade	Keeps the hardcoded model IDs in policy.yml from going stale or deprecated
`supply-chain-security-warden`	P1	net-new	Adds the missing Dependabot / CodeQL / action-pinning audit on secret-bearing workflows
`tick-rollback-sentinel`	P2	net-new	Bounded, human-initiated withdrawal of a bad publish — the rollback the system lacks
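
The `repo-write-<repo>` convention serializes writers by giving everyone who targets the same repo the same lock name. The sketch below is an in-process analogy using threads, not the workflow-level concurrency lock the fleet actually relies on; `repo_write`, `demo`, and the repo name `year-2025` are hypothetical.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager

_locks = defaultdict(threading.Lock)  # one lock per "repo-write-<repo>" group
_registry_guard = threading.Lock()    # protects creation of entries in _locks


@contextmanager
def repo_write(repo: str):
    """Hold the write lock for one repo; writers to other repos are not blocked."""
    group = f"repo-write-{repo}"
    with _registry_guard:
        lock = _locks[group]
    with lock:
        yield group


def demo(n: int = 4) -> list:
    """Run n concurrent writers against one repo and record their start/end order."""
    log = []

    def writer(i: int) -> None:
        with repo_write("year-2025"):
            log.append(("start", i))
            time.sleep(0.001)  # stand-in for the actual branch write
            log.append(("end", i))

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

With the lock held, every `start` entry in the log is followed immediately by its own `end`, which is the property the serializer exists to guarantee: no writer begins while another is mid-write on the same repo.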

Align

Agent	Pri	Origin	What it does
`ledger-tickclock-auditor`	P1	rewire	Validates the tick-clock / Evolution Log against reality via a CI check gate

Update

Agent	Pri	Origin	What it does
`docs-warden`	P1	net-new	Built — gates every PR + sweeps `main` so every change is matched by a doc update on the hub’s own doc surface (ADR-0005)
`claude-md-canon-warden`	P1	rewire	Re-syncs each repo’s drifted (stale old-model) CLAUDE.md back to canon
`adapter-canon-aligner`	P2	rewire	The backward `.claude/`-to-canon diff — keeps the framework converging, not fragmenting
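
Detecting which repos carry a drifted CLAUDE.md reduces to comparing each copy against canon. This sketch assumes the file contents are already loaded into memory; `digest`, `drifted_repos`, and the repo names in the usage are hypothetical, and the whitespace normalization is a design choice rather than a documented rule.

```python
import hashlib


def digest(text: str) -> str:
    """Hash text after normalizing line endings and trailing whitespace."""
    lines = text.replace("\r\n", "\n").split("\n")
    normalized = "\n".join(line.rstrip() for line in lines).strip() + "\n"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def drifted_repos(canon: str, copies: dict[str, str]) -> list[str]:
    """Return the repos whose copy differs from canon in substance."""
    want = digest(canon)
    return sorted(repo for repo, text in copies.items() if digest(text) != want)
```

For example, `drifted_repos(canon, {"year-2024": copy_a, "year-2025": copy_b})` lists only the repos that need a re-sync PR. Normalizing before hashing avoids opening a PR for a copy that differs only in line endings.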

Learn

Agent	Pri	Origin	What it does
`learn-flywheel`	P1	rewire	Mines a stabilized telemetry window and embeds friction-removing edits into the canonical framework

Evolve

Agent	Pri	Origin	What it does
`tier-roi-auditor`	P2	net-new	Makes model-tier selection data-driven instead of fixed-by-intuition
`recovery-rehearsal-agent`	P2	net-new	Exercises the re-seed / rebuild path in an isolated scratch org (the untested recovery story)
`fleet-meta-auditor`	P2	net-new	Watches the watchers — audits the agent fleet itself, with an external dead-man’s-switch
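
A dead-man's-switch inverts the usual alerting direction: silence is the alarm. A minimal sketch, assuming each watched agent records a heartbeat timestamp somewhere the external checker can read; the function name and the one-day threshold in the usage are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def overdue(last_heartbeat: dict[str, datetime], max_silence: timedelta, now: datetime) -> list[str]:
    """Return the agents whose most recent heartbeat is older than max_silence."""
    return sorted(
        name for name, seen in last_heartbeat.items()
        if now - seen > max_silence
    )
```

Calling `overdue(beats, timedelta(days=1), datetime.now(timezone.utc))` from outside the fleet would flag a meta-auditor that has stopped checking in. The check has to run externally, because a switch hosted by the system it watches goes quiet at the same moment that system does.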

Defects the design surfaced (worth fixing regardless)

No evolution ledger — per-tick telemetry is uploaded then discarded after 14 days; the aggregator (telemetry.yml) is dormant and misconfigured (wrong trigger name + artifact pattern).
No supply-chain security — no Dependabot / CodeQL; workflows pin actions to floating tags on secret-bearing jobs.
A bypassable review — learn.yml force-merges its own PR.
A publish race on each year repo’s main (no shared write lock).
A prompt-injection surface — an untrusted dispatch input is interpolated straight into a model prompt.
CLAUDE.md drift — never re-synced, so year repos carry a stale old-model copy.
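
The floating-tag defect above is mechanical to detect. The following sketch scans workflow text for `uses:` references that are not pinned to a full 40-character commit SHA. The function name is hypothetical, the regular expression is a simplification of real workflow syntax, and the SHA in the usage is a placeholder, not a real commit.

```python
import re

# Matches "uses: owner/repo@ref" with an optional leading list dash.
USES = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def floating_action_refs(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs whose ref is a tag or branch, not a full SHA."""
    return [
        (action, ref)
        for action, ref in USES.findall(workflow_text)
        if not FULL_SHA.match(ref)
    ]


SAMPLE = """\
steps:
  - uses: actions/checkout@v4
  - name: Pages
    uses: actions/deploy-pages@0123456789abcdef0123456789abcdef01234567
  - uses: ./local-action
"""
```

On `SAMPLE`, the function reports only `("actions/checkout", "v4")`: the SHA-pinned step passes, and the local action has no `@ref` to audit.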

Rollout order

Phase 0 — safety scaffolding (before any mutating agent): the kill-switch, the write serializer, removing the force-merge, and the framework PR reviewer.
Phase 1 — keystone + gates + canaries: the telemetry ledger collector first, then the inline prepublish gate, then the secret / Pages / health watchers.
Phase 2 — the loops: the learn flywheel and the align / fix wardens.
Phase 3 — slow structural, economic & meta: tier-ROI, cost governor, content & license auditors, recovery rehearsal, and the fleet meta-auditor.

Full detail

The complete fleet — every agent’s trigger, cadence, model tier, inputs, outputs, guardrails, and the closed-loop rationale — is recorded in ADR-0003. It is the synthesized output of a 14-agent design workflow (4 architect lenses → synthesis → 3 adversarial critics → finalize).
