Rollback strategy.

Every Axiom execution ships with a pre-verified rollback path and a measured time-to-restore. Rollback is not aspirational — it's tested before approval, never inferred at incident time.

The principle

Rollback is verified before apply, not after. If Axiom can't prove it can restore prior state with a measured RTO, the plan item is blocked at the governance gate until the rollback path is in place.

01

Philosophy

Three rules that make rollback real instead of aspirational:

  • Pre-flight capture — Axiom always captures sufficient state to restore before any modification begins
  • Measured RTO — rollback time is measured in advance, not estimated at incident time
  • Block if unverified — plan items without a verified rollback path are blocked at the governance gate
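
As a rough illustration, the three rules can be read as a single gate check. The sketch below is not Axiom's implementation: `PlanItem`, its fields, and `gate_decision` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlanItem:
    """Hypothetical plan item; field names are illustrative, not Axiom's schema."""
    resource_id: str
    state_captured: bool                   # pre-flight capture completed
    measured_rto_seconds: Optional[float]  # None if no rehearsal has measured it
    rollback_verified: bool                # rollback path tested before approval


def gate_decision(item: PlanItem) -> tuple[bool, list[str]]:
    """Return (allowed, reasons). Any unmet rule blocks the item."""
    reasons = []
    if not item.state_captured:
        reasons.append("no pre-flight state capture")
    if item.measured_rto_seconds is None:
        reasons.append("RTO not measured")
    if not item.rollback_verified:
        reasons.append("rollback path not verified")
    return (not reasons, reasons)


ok = PlanItem("sg-example", True, 42.0, True)
missing = PlanItem("db-example", True, None, False)
print(gate_decision(ok))       # (True, [])
print(gate_decision(missing))  # (False, ['RTO not measured', 'rollback path not verified'])
```

Returning the reasons alongside the decision keeps a blocked item explainable, which fits the audit-first approach described on this page.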

02 · Capture

Pre-flight state capture

Capture mechanism varies by resource type:

  • EC2 — AMI snapshot or volume snapshot before any instance modification
  • RDS — manual snapshot before any parameter group, instance-type, or replica change
  • S3 — bucket configuration JSON export before ACL/lifecycle/encryption change
  • IAM — AWS retains prior versions of managed policies automatically (up to five per policy); Axiom captures the policy-version ID
  • Security groups — full rule set captured as Terraform state before any modification
  • Lambda — function configuration + alias version captured

Capture is audited in the AxiomAuditEvent.beforeState field — immutable and queryable.
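
One way to picture the per-resource capture list is a lookup from resource type to capture mechanism, producing an immutable record. This is a sketch under assumptions: the `BeforeState` class, the dictionary keys, and `capture_record` are invented for illustration and do not describe the real AxiomAuditEvent.beforeState schema.

```python
from dataclasses import dataclass
from types import MappingProxyType

# Capture mechanism per resource type, following the list above.
# Keys and labels are illustrative names, not Axiom identifiers.
CAPTURE_MECHANISM = MappingProxyType({
    "ec2": "AMI or volume snapshot",
    "rds": "manual snapshot",
    "s3": "bucket configuration JSON export",
    "iam": "policy-version ID",
    "security_group": "rule set as Terraform state",
    "lambda": "function configuration + alias version",
})


@dataclass(frozen=True)  # frozen: the record cannot be modified after creation
class BeforeState:
    resource_type: str
    resource_id: str
    mechanism: str
    reference: str  # snapshot ID, AMI ID, policy version, ...


def capture_record(resource_type: str, resource_id: str, reference: str) -> BeforeState:
    """Build a capture record, refusing resource types with no known mechanism."""
    if resource_type not in CAPTURE_MECHANISM:
        raise ValueError(f"no capture mechanism for {resource_type!r}")
    return BeforeState(resource_type, resource_id,
                       CAPTURE_MECHANISM[resource_type], reference)


record = capture_record("iam", "policy/example", "v3")  # placeholder values
print(record.mechanism)  # policy-version ID
```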

03

The rollback plan

Every plan item has an attached rollbackPlan document containing:

  • Exact restore commands (Terraform or CLI)
  • Pre-flight state reference (snapshot ID, AMI ID, policy version)
  • Measured RTO from prior rehearsals on similar resources
  • Health verification criteria for confirming restore succeeded
  • Escalation path if rollback itself fails
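
A minimal sketch of what such a document might look like as a data structure, with a completeness check. The class, its field names, and the example values (including the plan file name and snapshot ID) are assumptions made for illustration, not Axiom's actual rollbackPlan format.

```python
from dataclasses import dataclass, field


@dataclass
class RollbackPlan:
    """Illustrative shape only; field names are assumptions, not Axiom's schema."""
    restore_commands: list[str] = field(default_factory=list)  # Terraform or CLI
    state_reference: str = ""          # snapshot ID, AMI ID, or policy version
    measured_rto_seconds: float = 0.0  # from prior rehearsals
    health_criteria: list[str] = field(default_factory=list)
    escalation_path: str = ""

    def missing_fields(self) -> list[str]:
        """Names of required parts that are empty or unset."""
        missing = []
        if not self.restore_commands:
            missing.append("restore_commands")
        if not self.state_reference:
            missing.append("state_reference")
        if self.measured_rto_seconds <= 0:
            missing.append("measured_rto_seconds")
        if not self.health_criteria:
            missing.append("health_criteria")
        if not self.escalation_path:
            missing.append("escalation_path")
        return missing


plan = RollbackPlan(
    restore_commands=["terraform apply rollback.tfplan"],  # hypothetical plan file
    state_reference="snap-EXAMPLE",                        # placeholder ID
    measured_rto_seconds=180.0,
    health_criteria=["ALB targets healthy"],
)
print(plan.missing_fields())  # ['escalation_path']
```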

04 · Triggers

When rollback fires

1

Automatic — health verification failure

If post-apply health checks fail (ALB target unhealthy, SLO breach, CloudWatch alarm trip), rollback fires automatically without further approval.

2

Automatic — drift detected post-apply

If the change produced unexpected drift in dependent resources (cascading impact), rollback fires automatically.

3

Manual — operator-triggered

From the audit log or dashboard, an authorized user can trigger rollback. Same path as automatic — pre-flight state restored using the stored rollback plan.

4

Multi-phase

Rollback respects phase boundaries. If phase 3 of 4 fails, only phases 1–3 are rolled back. Phase 4 was never started.
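
The trigger rules and the phase-boundary behaviour can be sketched as two small functions. Everything named here (`Trigger`, `rollback_trigger`, `phases_to_roll_back`) is hypothetical, and the order in which triggers are checked is an assumption; the text above does not specify a priority.

```python
from enum import Enum
from typing import Optional


class Trigger(Enum):
    HEALTH_CHECK_FAILURE = "health_check_failure"
    DRIFT_DETECTED = "drift_detected"
    OPERATOR = "operator"


def rollback_trigger(health_ok: bool, unexpected_drift: bool,
                     operator_requested: bool) -> Optional[Trigger]:
    """Return the trigger that fires, or None. Check order is an assumption."""
    if not health_ok:
        return Trigger.HEALTH_CHECK_FAILURE
    if unexpected_drift:
        return Trigger.DRIFT_DETECTED
    if operator_requested:
        return Trigger.OPERATOR
    return None


def phases_to_roll_back(failed_phase: int, total_phases: int) -> list[int]:
    """Phases that were started (1..failed_phase); later phases never ran.

    The undo order is not stated in the text; this returns ascending order.
    """
    if not 1 <= failed_phase <= total_phases:
        raise ValueError("failed_phase must be between 1 and total_phases")
    return list(range(1, failed_phase + 1))


print(rollback_trigger(health_ok=False, unexpected_drift=False,
                       operator_requested=False))  # Trigger.HEALTH_CHECK_FAILURE
print(phases_to_roll_back(3, 4))                   # [1, 2, 3]
```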

05

Verification after rollback

After rollback completes, Axiom verifies the original state was actually restored:

  • Resource configuration matches the captured beforeState
  • Health checks return to baseline
  • Dependent resources show no leftover drift
  • Cost shift reverts (no orphaned charges)

If verification fails, the rollback is escalated. A failure here typically means the rollback itself hit an unexpected condition (rare, but possible). Manual investigation begins from the audit log.
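
The four checks can be pictured as one function that returns a list of failures, where an empty list means the restore is verified. This is a sketch under assumptions: state is modelled as a flat dictionary, and `verify_restore` and its parameters are hypothetical names.

```python
_MISSING = object()  # sentinel so an absent key differs from a stored None


def verify_restore(before_state: dict, current_state: dict,
                   health_at_baseline: bool,
                   dependent_drift: list[str],
                   orphaned_charges: list[str]) -> list[str]:
    """Return verification failures; an empty list means restore is confirmed."""
    failures = []
    changed = sorted(
        key for key in before_state.keys() | current_state.keys()
        if before_state.get(key, _MISSING) != current_state.get(key, _MISSING)
    )
    if changed:
        failures.append(f"configuration differs from beforeState: {changed}")
    if not health_at_baseline:
        failures.append("health checks not at baseline")
    if dependent_drift:
        failures.append(f"leftover drift in: {sorted(dependent_drift)}")
    if orphaned_charges:
        failures.append(f"orphaned charges: {sorted(orphaned_charges)}")
    return failures


before = {"instance_type": "t3.medium", "monitoring": True}
after = {"instance_type": "t3.large", "monitoring": True}
print(verify_restore(before, before, True, [], []))  # []
print(verify_restore(before, after, True, [], []))
# ["configuration differs from beforeState: ['instance_type']"]
```

Any non-empty result would correspond to the escalation case described above.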

06 · Honest limits

What rollback cannot do

  • Restore deleted data that wasn't snapshotted by AWS (some configurations have no native snapshot mechanism)
  • Undo write operations into databases that bypass RDS snapshots (manual schema migrations, for example)
  • Undo a Lambda execution that already produced external side effects (emails sent, payments processed)
  • Reverse a security group rule change that allowed brief intrusion (the change is reverted; the intrusion still happened — incident response is a separate process)

For these cases, Axiom blocks the plan item at the governance gate. The plan can't proceed without compensating safety measures.
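
A hedged sketch of how these limits might feed the gate: each irreversible effect a change could cause must be paired with a compensating measure before the item can proceed. The enum members mirror the four cases above; the names and the one-measure-per-effect rule are assumptions for illustration.

```python
from enum import Enum, auto


class Irreversible(Enum):
    UNSNAPSHOTTED_DATA = auto()       # deleted data with no native snapshot
    DB_WRITES_OUTSIDE_SNAPSHOT = auto()
    EXTERNAL_SIDE_EFFECTS = auto()    # emails sent, payments processed
    EXPOSURE_WINDOW = auto()          # e.g. a security group briefly opened


def uncovered_effects(effects: set[Irreversible],
                      compensating: dict[Irreversible, str]) -> set[Irreversible]:
    """Effects with no compensating measure recorded (empty or absent)."""
    return {effect for effect in effects if not compensating.get(effect)}


def may_proceed(effects: set[Irreversible],
                compensating: dict[Irreversible, str]) -> bool:
    """Blocked unless every irreversible effect has a compensating measure."""
    return not uncovered_effects(effects, compensating)


effects = {Irreversible.DB_WRITES_OUTSIDE_SNAPSHOT, Irreversible.EXTERNAL_SIDE_EFFECTS}
measures = {Irreversible.DB_WRITES_OUTSIDE_SNAPSHOT: "logical dump taken before migration"}
print(may_proceed(effects, measures))  # False
print(may_proceed(set(), {}))          # True
```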

Trust questions

What is rollback?

A pre-verified path to restore the exact state before an Axiom-executed change, with a measured RTO.

Why verified in advance?

A rollback that is first tested at incident time is unreliable. Axiom proves it works before approval.

Is rollback safe?

Yes — it uses the same pre-flight state captured at plan time. No inference, no guessing.

What happens during rollback?

Pre-flight state is restored. Health checks verify that the restore succeeded. The audit log records both events.

Can rollback fail?

Rarely, for example if AWS APIs themselves are degraded. A failed rollback is recorded in the audit log and escalated to the operator immediately.

What if rollback isn't possible?

The plan item is blocked at the governance gate. Compensating safety measures must be in place before the change can proceed.
