Planning database upgrades without marketing the drill

A play-by-play of the scripts, rollback checks, rehearsal notes, and the evidence we want before calling an upgrade zero-downtime.

Tobias Strauss
Infra lead
10 min read

Zero-downtime is a marketing word for an engineering claim that's almost never actually tested. Most upgrades succeed because they never hit the case the rollback was designed for. The opposite story, where the upgrade hit that case and the rollback worked, is rarer and worth more.

The pieces that have to exist before you call something zero-downtime

  1. A migration runbook reviewed by someone other than the author.
  2. A rollback script that's been executed end-to-end against a production-shaped snapshot.
  3. A pre-flight script that verifies the assumed schema state before the migration runs.
  4. A canary path that lets a small percentage of traffic hit the new code before the cutover, with explicit success criteria.
  5. An evidence-collection step that writes the state before, during, and after the migration to a separate audit log.
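To make item 3 concrete, here is a minimal sketch of a pre-flight check. It assumes SQLite as a stand-in for the real database, and the `accounts` table, its columns, and the `EXPECTED_SCHEMA` mapping are hypothetical names chosen for illustration. The idea is only that the migration refuses to start unless the schema it assumes is the schema that exists.

```python
import sqlite3

# Hypothetical: table name -> columns the migration assumes are present.
EXPECTED_SCHEMA = {
    "accounts": {"id", "email", "plan"},
}


def preflight(conn: sqlite3.Connection, expected: dict[str, set[str]]) -> list[str]:
    """Return a list of mismatches; an empty list means the migration may proceed."""
    problems = []
    for table, wanted in expected.items():
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        # Table names come from our own config, not from user input.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        actual = {row[1] for row in rows}
        if not actual:
            problems.append(f"table {table!r} is missing")
            continue
        missing = wanted - actual
        if missing:
            problems.append(f"table {table!r} lacks columns {sorted(missing)}")
    return problems


# Demo against an in-memory database that is missing the assumed `plan` column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
problems = preflight(conn, EXPECTED_SCHEMA)
print(problems)  # → ["table 'accounts' lacks columns ['plan']"]
```

A non-empty result should stop the run before any write happens. On a different engine the same check would query that engine's catalog instead of `PRAGMA table_info`.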

What we've learned from doing it badly

Skip step 3 and the migration silently no-ops on a row your code thought existed. Skip step 5 and the post-mortem becomes archeology. The expensive lesson, every time, is that the runbook is the rehearsal — if it hasn't been run on a snapshot, it hasn't been written.
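The silent no-op and the missing evidence can be guarded against in the same place. The sketch below is one possible shape, not our production tooling: it assumes SQLite, a hypothetical `accounts` table with a `plan` column, and a plain Python list standing in for the separate audit log. It records state before, during, and after the change, and rolls back if the number of rows changed differs from what the rehearsal predicted.

```python
import json
import sqlite3


def snapshot(conn, phase, audit_log):
    """Append per-plan row counts to the audit log (a list standing in for a separate store)."""
    rows = conn.execute(
        "SELECT plan, COUNT(*) FROM accounts GROUP BY plan ORDER BY plan"
    ).fetchall()
    audit_log.append(json.dumps({"phase": phase, "counts": dict(rows)}))


def migrate(conn, audit_log, expected_rows):
    """Run a hypothetical data migration, aborting if it touches an unexpected number of rows."""
    snapshot(conn, "before", audit_log)
    cur = conn.execute("UPDATE accounts SET plan = 'standard' WHERE plan = 'legacy'")
    audit_log.append(json.dumps({"phase": "during", "rows_changed": cur.rowcount}))
    if cur.rowcount != expected_rows:
        # A mismatch (including zero rows) means the assumed state was wrong.
        conn.rollback()
        snapshot(conn, "after-rollback", audit_log)
        raise RuntimeError(f"expected {expected_rows} rows, changed {cur.rowcount}")
    conn.commit()
    snapshot(conn, "after", audit_log)


# Demo: the rehearsal assumed one legacy row, but none exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, plan TEXT)")
conn.executemany("INSERT INTO accounts (plan) VALUES (?)", [("standard",), ("standard",)])
conn.commit()

audit_log = []
try:
    migrate(conn, audit_log, expected_rows=1)
except RuntimeError as exc:
    print("aborted:", exc)  # → aborted: expected 1 rows, changed 0
```

In a real run the audit entries would be written somewhere outside the database being migrated, so that a failed or rolled-back migration cannot take its own evidence with it.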

About PipSync

PipSync is a signal-to-execution routing platform. We do not provide investment advice, do not recommend signal sources, and do not hold client funds. Trading leveraged products involves substantial risk of loss.
