Backup Standards

Backups protect node availability, not protocol correctness. A backup is useful only if it can be restored into a node that rejoins the intended network and passes chain-specific health checks.

Backup classes

| Data | Backup method | Restore expectation |
| --- | --- | --- |
| Node state | Filesystem snapshot, volume snapshot, or client-supported state export | Restore to a compatible client version, then resume sync from peers. |
| Configuration | Git-tracked manifests plus sealed or externalized secrets | Recreate the node without copying mutable host state. |
| Keys and secrets | Secret manager backup with access audit | Restore only through approved secret workflows; never into plaintext files in Git. |
| Indexer data | Database snapshot plus migration version | Restore with schema compatibility and replay from a known checkpoint. |

:::danger Key material
Validator keys, JWT secrets, API keys, database credentials, and private RPC credentials are not ordinary files. Store and restore them through the approved secret manager only. Do not include secrets in node snapshots, sample repos, support bundles, or incident notes.
:::
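
As one narrow illustration of keeping secrets out of bundles and repos, the sketch below lists files under a directory that contain a PEM private-key header. The function name is hypothetical, and the check is only a heuristic: it will not detect JWT secrets, API keys, JSON keystores, or database credentials, so it complements the secret manager workflow rather than replacing it.

```sh
# Hypothetical helper: lists files under directory $1 that contain a PEM
# private-key header such as "-----BEGIN EC PRIVATE KEY-----".
# Exit status follows grep: 0 if any file matched, 1 if none did.
files_with_pem_private_keys() {
  grep -rlE -e '-----BEGIN ([A-Z0-9]+ )*PRIVATE KEY-----' "$1"
}
```

A support-bundle or snapshot script could call this on its staging directory and refuse to publish if the function succeeds.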

Snapshot cadence

| Node role | Cadence | Retention | Notes |
| --- | --- | --- | --- |
| Production RPC full node | Daily snapshot plus pre-change snapshot | 7 daily, 4 weekly | Keep at least one restore point before client upgrades. |
| Archive or indexer node | Daily database or volume snapshot | Based on rebuild cost and storage budget | Archive rebuilds are expensive; validate capacity before extending retention. |
| Validator or signer-adjacent node | Pre-change and after successful upgrade | Policy-driven | Prioritize key handling and rollback safety over fast cloning. |
| Development/test node | Best effort | Short retention | Use for convenience, not disaster recovery. |
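
The "7 daily, 4 weekly" retention for production RPC nodes can be sketched as a small filter. This assumes a hypothetical listing format, one `<tier> <snapshot-id>` pair per line, newest first, where the tier is `daily` or `weekly`; the actual snapshot tool and its output will differ by platform.

```sh
# Sketch only: reads "<tier> <snapshot-id>" lines (newest first) on stdin and
# prints the ids that fall outside the retention window.
# $1 = daily snapshots to keep (default 7), $2 = weekly snapshots to keep (default 4).
# Lines with any other tier are never printed, so they are never expired.
expired_snapshots() {
  awk -v keep_daily="${1:-7}" -v keep_weekly="${2:-4}" '
    $1 == "daily"  { if (++daily  > keep_daily + 0)  print $2; next }
    $1 == "weekly" { if (++weekly > keep_weekly + 0) print $2; next }
  '
}
```

The function only prints candidates; deleting them is left to whatever storage tooling is in use, which keeps the sketch safe to dry-run.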

Restore drills

Run restore drills on a schedule, not during the first outage.

  1. Select a recent backup without modifying production retention.
  2. Restore into an isolated network or non-production namespace.
  3. Start the node with the documented image and config version.
  4. Confirm the node reaches the expected network and resumes sync.
  5. Run the chain-specific RPC smoke tests.
  6. Record restore duration, storage consumed, and any manual steps.

# Generic post-restore checklist; commands are examples, not a replacement for chain docs.
df -h
curl -fsS "$HEALTH_URL"
curl -fsS "$RPC_URL" -H 'content-type: application/json' -d "$RPC_SMOKE_PAYLOAD"
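
Steps 4 and 6 of the drill can be captured with two small helpers. This is a minimal sketch: the function names and the tab-separated record layout are assumptions, and how a head height is obtained depends entirely on the chain client.

```sh
# Hypothetical helper: succeeds if the second head-height sample ($2) is
# strictly greater than the first ($1), i.e. the node made sync progress.
sync_advanced() {
  [ "$2" -gt "$1" ]
}

# Hypothetical helper: prints one tab-separated drill record.
# $1 backup id, $2 start epoch, $3 end epoch, $4 restored size, $5 manual-step notes
drill_record() {
  if [ "$3" -lt "$2" ]; then
    echo "drill_record: end time precedes start time" >&2
    return 1
  fi
  # Columns: backup id, duration in seconds, restored size, notes
  printf '%s\t%s\t%s\t%s\n' "$1" "$(( $3 - $2 ))" "$4" "$5"
}
```

In a drill script, the epochs could come from `date +%s` captured before and after the restore, and the size from `du -sk` on the restored data directory.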

Backup validation checklist

| Check | Pass condition |
| --- | --- |
| Inventory | Each production node role has a named backup source and owner. |
| Encryption | Backups are encrypted at rest and in transit. |
| Access | Restore permissions are limited and audited. |
| Compatibility | Backup metadata records chain, network, client, version, data path, and snapshot height/checkpoint/slot. |
| Restore evidence | A restore drill has completed within the required interval. |
| Deletion | Retention expiry removes old backups without manual cleanup. |
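
The Compatibility check can be automated if backup metadata is stored in a predictable form. The sketch below assumes a hypothetical `key=value` metadata file with the key names `chain`, `network`, `client`, `version`, `data_path`, and `checkpoint`; these names are illustrative, not a defined schema.

```sh
# Hypothetical helper: prints each required key that is absent or has an
# empty value in the key=value metadata file given as $1.
# No output means every required field is present.
missing_metadata_fields() {
  for key in chain network client version data_path checkpoint; do
    grep -Eq "^${key}=.+" "$1" || echo "$key"
  done
}
```

A backup job could run this right after writing the metadata and mark the backup as invalid if anything is printed.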

:::warning Snapshot consistency
Do not assume a crash-consistent disk snapshot is application-consistent for every client or database. Prefer client-supported export, database-native backup, or a stopped/quiesced node when the chain client requires it.
:::
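
Where a stopped or quiesced node is required, the ordering matters more than the specific commands. The following sketch assumes three site-specific commands or shell functions (stop, snapshot, start), passed by name; all three are hypothetical placeholders.

```sh
# Hypothetical wrapper: $1 stops the node, $2 takes the snapshot, $3 starts the node.
# The node is restarted even when the snapshot fails, and the snapshot's
# exit status is what the caller sees.
snapshot_quiesced() {
  "$1" || return 1                      # do not snapshot a node that failed to stop
  snap_status=0
  "$2" || snap_status=$?                # remember the snapshot result
  "$3" || return 1                      # always attempt the restart
  return "$snap_status"
}
```

Restarting unconditionally favors node availability; a failed snapshot is then reported to the caller rather than leaving the node down.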

Before major changes

Create or verify a recent backup before:

  • Client upgrades or downgrades.
  • Data directory migrations.
  • Database schema migrations for indexers.
  • Pruning mode changes.
  • Storage class or persistent volume changes.
  • Moving nodes between runtimes.

Link backup evidence in the change record, and in the incident timeline if the backup becomes part of a recovery path.
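
The "create or verify a recent backup" step can be expressed as a simple gate in a change script. This is a sketch under assumptions: the function name is hypothetical, and what counts as "recent" (the maximum age) is a policy decision not defined here.

```sh
# Hypothetical helper: succeeds if the backup is no older than the allowed age.
# $1 backup completion time (epoch seconds), $2 current time (epoch seconds),
# $3 maximum acceptable age in seconds.
# A backup timestamp in the future is treated as invalid.
backup_is_fresh() {
  age=$(( $2 - $1 ))
  [ "$age" -ge 0 ] && [ "$age" -le "$3" ]
}
```

For example, a change script might call `backup_is_fresh "$BACKUP_EPOCH" "$(date +%s)" 86400` and stop before the upgrade if it fails, where `BACKUP_EPOCH` is an assumed variable read from the backup metadata.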