Changelog

Substantive changes to the playbook. Minor formatting and copy edits are not listed.

2 May 2026

Role Positions: rewritten as forces, not prescriptions. Replaced prescriptive role definitions with a forces-based diagnostic. Each role section now describes the core pressure acting on it, poses diagnostic questions for teams, and flags observable anti-patterns. The section is explicitly framed as pre-descriptive: no organisation has yet produced the definitive account of how roles reshape around agentic workflows, and the playbook acknowledges this rather than prescribing answers. The Improvement Kata is recommended as the experimentation methodology for teams discovering their own role transitions.

Adoption Sequence: organisational rollout layer added. New subsection (try, scale, optimise) describing the organisational rollout pathway that runs in parallel to the engineering readiness sequence. Includes exit criteria for each stage, a pathway interaction table, cost as an adoption bottleneck, and shadow IT risk. References Jesse Vincent's Superpowers as a concrete example of the composable skills framework pattern that teams build during the optimise phase.

Adoption Sequence: guardrails expanded with agentic safety. Step 2 now covers two categories: engineering foundations (unchanged) and agentic safety (new). Agentic safety covers sandboxing, data leak prevention, inference provider resilience, prompt injection, shadow IT, and audit/provenance. Each is framed as an organisational decision, not an implementation detail.

Maturity Models: pre-agentic stages labelled explicitly. Individual stages 1–3 relabelled as pre-agentic. Tier names updated: Novice → Pre-agentic, Beginner → Supervised agentic, Competent → Autonomous agentic. Framing paragraph added before stage 1 explaining the qualitative difference at the stage 4 boundary. Organisational maturity stages 1–2 similarly noted as pre-agentic.

Open Questions: measurement section rewritten. Added the evidence-bias problem (provider-funded studies, where the funder has an incentive to report favourable results). Token consumption repositioned as a cost-management concern, not a performance indicator. Agent-provenance percentage reframed as a diagnostic input rather than a performance metric, with both sides of the debate acknowledged. Specification health added as a leading indicator. Volume-based metrics explicitly flagged as misleading.

Changes prompted by feedback from Billie Thompson (Armakuni), with thanks.

Licensed under CC BY-NC-SA 4.0.