Database Migration Testing in CI: Catch Schema Bugs Before Production

Database migration testing in CI should prove more than "the SQL file runs." A migration can apply cleanly to an empty database and still break production when old rows, old application code, new application code, background workers, constraints, indexes, and rollback behavior all overlap.

That is the gap CI should reduce. It cannot fully simulate production traffic, data size, locks, or operational pressure, but it can catch the predictable migration bugs before the deploy reaches a live database.

This article focuses on the CI side of database migrations: what to test, which database states to create, how to verify compatibility, and where migration checks fit in the release pipeline. For the production rollout sequence, read Safe Database Migrations in Production. For rollback decisions after a migration starts moving data, read Database Migration Rollback Strategy in Production. For the wider release-safety path, see Testing And Software Delivery and SQL And Data Correctness.

Why Migration Tests Pass Locally And Fail In Production

Most migration tools make it easy to test the happy path. Create a fresh database. Apply all migrations. Run the test suite.

That proves the latest schema can exist. It does not prove the migration is safe for a production database that already contains data and already serves code.

Production has conditions a fresh local database usually does not:

Production condition	What a fresh database hides
Historical rows with old data shapes	New `NOT NULL`, enum, unique, and foreign key assumptions
Old and new application versions	Compatibility gaps during rolling deploys
Partially backfilled rows	Read paths that assume migration work already finished
Large tables	Lock, scan, and index build behavior
Background workers and scheduled jobs	Code paths outside the main web request flow
Previous migration history	Drift, edited migration files, missing files, or checksum issues
Rollback or forward-fix requirements	Unsafe assumptions about down migrations

The common mistake is treating migration testing as a schema creation test. The better question is:

Can the old data and overlapping application versions survive the next migration phase?

That is a different test.

What CI Should Prove Before A Migration Ships

A useful migration pipeline does not need to reproduce production perfectly. It should prove a few specific things repeatedly.

CI check	Question it answers
Migration history validation	Did anyone edit, delete, reorder, or forget a migration?
Fresh-schema apply	Can the full migration history create the current schema from scratch?
Production-shaped upgrade	Can the new migration apply to old schema plus old data?
Compatibility test	Can old and new code survive the mixed schema period?
Backfill verification	Are historical rows migrated without losing or overwriting valid data?
Constraint and index gate	Are dangerous DDL operations visible before deploy?
Rollback or forward-fix rehearsal	Does the release plan define what happens if this phase fails?

Those checks belong in CI because they turn migration review from "this looks safe" into "this failed against a state we care about."

They also make code review sharper. Reviewers can ask about a failed compatibility test or a missing old-row fixture instead of trying to reason about every production path from a SQL diff.

Start With Migration History Validation

The first gate should catch migration history problems before any application test runs.

For migration tools that store checksums or schema history, CI should verify that local migrations match what the database believes has already run. Flyway's validate command, for example, checks applied migrations against available migration files and fails on checksum, name, type, missing, or unapplied-version differences in its official documentation.

A simple pipeline stage might look like this:

migration-history:
  script:
    - flyway validate

For Prisma, production and test environments use prisma migrate deploy to apply pending migrations, and the Prisma documentation notes that the command compares applied migrations against migration history and applies pending migrations. It also explicitly separates production deployment from development reset workflows in the Prisma Migrate development and production docs.

migration-apply:
  script:
    - npx prisma migrate deploy

The exact tool does not matter as much as the rule:

CI should fail if the migration history is not reproducible.

Do not rely only on a local development database. Local databases often contain manual changes, reset history, experimental schemas, or data that makes a broken migration look fine.

Test The Upgrade From A Production-Shaped Baseline

The most important migration test is not "apply every migration to an empty database." It is "apply this migration to the kind of database production has right now."

Create a baseline fixture that represents the previous released schema and meaningful old data.

For example, suppose the migration changes users from name to full_name. The previous schema might look like this:

CREATE TABLE users (
  id BIGSERIAL PRIMARY KEY,
  email TEXT NOT NULL UNIQUE,
  name TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

The old data should include normal and uncomfortable cases:

INSERT INTO users (email, name, created_at) VALUES
  ('ada@example.test', 'Ada Lovelace', '2024-01-10T10:00:00Z'),
  ('grace@example.test', 'Grace Hopper', '2024-02-20T12:00:00Z'),
  ('legacy-spacing@example.test', '  Legacy Name  ', '2023-05-01T08:00:00Z');

Then CI applies only the new migration set on top of that baseline.

createdb migration_test
psql migration_test < test/fixtures/schema-before-users-full-name.sql
psql migration_test < test/fixtures/data-before-users-full-name.sql
./scripts/apply-pending-migrations migration_test

After the migration applies, assert the shape and data you expect:

SELECT COUNT(*) AS missing_full_name
FROM users
WHERE full_name IS NULL;

The expected answer depends on the phase. If the migration only expands the schema, full_name may be intentionally nullable. If the migration includes a backfill, missing values may be a failure.

The test should know which phase it is proving.

Test Mixed Schema Compatibility

Most safe database migrations rely on a compatibility window. Old code and new code must survive while the schema is between shapes.

For the name to full_name example, the phases might be:

Phase	Schema state	Application requirement
Expand	`name` exists, `full_name` added	Old code still reads and writes `name`
Dual write	Both columns exist	New code writes both columns
Backfill	Historical rows gradually populated	New code can read old, new, and partial rows
Switch read	New code prefers `full_name`	Fallback remains until old rows are proven safe
Contract	`name` removed	No deployed code reads or writes `name`

CI should test at least the mixed state. That is where many production bugs hide.

Application code should be tested against old rows:

it('renders a user created before full_name existed', async () => {
  await db.query(`
    INSERT INTO users (email, name, full_name)
    VALUES ('old@example.test', 'Old Name', NULL)
  `)

  const response = await request(app).get('/api/users/old@example.test')

  expect(response.status).toBe(200)
  expect(response.body.displayName).toBe('Old Name')
})

Then test new writes:

it('writes both name columns during the compatibility window', async () => {
  const response = await request(app)
    .post('/api/users')
    .send({ email: 'new@example.test', fullName: 'New Name' })

  expect(response.status).toBe(201)

  const row = await db.queryOne(`
    SELECT name, full_name
    FROM users
    WHERE email = 'new@example.test'
  `)

  expect(row.name).toBe('New Name')
  expect(row.full_name).toBe('New Name')
})

This is where How to Write API Integration Tests connects to database migration work. If the migration affects API behavior, the migration test should go through the API boundary too, not only through SQL assertions.

Verify Backfills As Data Transformations

Backfills should be tested like data transformations, not like background decoration.

A backfill test should answer:

which rows are eligible?
which rows are intentionally skipped?
can the job run twice without corrupting data?
can it resume from a checkpoint?
does it preserve user edits or newer writes?
does it leave evidence for verification?

For example, a backfill that copies name into full_name should not overwrite a value already written by new code:

UPDATE users
SET full_name = name
WHERE full_name IS NULL
  AND name IS NOT NULL;

That condition matters. A weaker backfill can erase valid data:

-- Dangerous: overwrites newer full_name values.
UPDATE users
SET full_name = name;

CI can catch that with a small fixture:

INSERT INTO users (email, name, full_name) VALUES
  ('old@example.test', 'Old Name', NULL),
  ('new@example.test', 'Old Source', 'New Source');

After the backfill:

SELECT email, name, full_name
FROM users
ORDER BY email;

Expected:

email	name	full_name
`new@example.test`	Old Source	New Source
`old@example.test`	Old Name	Old Name

This is a tiny test, but it protects an important rule:

A backfill should repair missing historical state, not overwrite legitimate new state.

For larger migrations, also test idempotency. Run the backfill twice and assert the second run changes nothing.

const first = await runUserFullNameBackfill()
const second = await runUserFullNameBackfill()

expect(first.updatedRows).toBe(1)
expect(second.updatedRows).toBe(0)

That one assertion prevents a surprising number of production backfill mistakes.

Make Dangerous DDL Visible In Code Review

Some migration failures are not logical data bugs. They are operational surprises.

PostgreSQL documentation for ALTER TABLE notes that lock levels differ by subform and that an ACCESS EXCLUSIVE lock is acquired unless explicitly noted. That is easy to miss in a review where the SQL file looks small. The PostgreSQL ALTER TABLE documentation is worth linking from team migration rules because the lock behavior depends on the exact operation.

CI cannot prove a hot production table will tolerate a lock. It can still flag risky statements before they merge.

Examples of statements that deserve review:

ALTER TABLE users ALTER COLUMN full_name SET NOT NULL;
ALTER TABLE users DROP COLUMN name;
ALTER TABLE orders ADD CONSTRAINT orders_reference_unique UNIQUE (reference);
CREATE INDEX orders_created_at_idx ON orders (created_at);

The correct answer is not always "never do this." The correct answer is "make the risk explicit."

For PostgreSQL indexes, CREATE INDEX CONCURRENTLY is often relevant for production because normal index creation can block writes. The PostgreSQL CREATE INDEX documentation explains both the benefit and the trade-off: concurrent builds allow normal operations to continue, but require more work, can take longer, and can leave invalid indexes if the build fails.

A CI linter can be simple at first:

const migrationSql = await fs.readFile(migrationPath, 'utf8')

const riskyPatterns = [
  /ALTER\s+TABLE\s+\w+\s+DROP\s+COLUMN/i,
  /ALTER\s+TABLE\s+\w+\s+ALTER\s+COLUMN\s+\w+\s+SET\s+NOT\s+NULL/i,
  /CREATE\s+INDEX\s+(?!CONCURRENTLY)/i,
  /TRUNCATE\s+TABLE/i,
]

for (const pattern of riskyPatterns) {
  if (pattern.test(migrationSql)) {
    throw new Error(`Migration needs explicit review: ${pattern}`)
  }
}

This should not replace human review. It should force the migration author to explain the plan: table size, lock expectation, rollout phase, stop condition, and rollback posture.

Do Not Treat Test Database Reset As Migration Testing

Resetting a test database is useful. It is not the same as testing a production migration.

For example, PostgreSQL TRUNCATE quickly removes rows, but the official docs note that it takes an ACCESS EXCLUSIVE lock on each table it operates on and that CASCADE can include dependent tables. That is fine for isolated test setup. It is not proof that a migration is production-safe.

Use reset commands for test isolation. Use migration tests for upgrade behavior.

Those are different jobs:

Operation	Good for	Not proof of
Drop and recreate database	Clean local development and test isolation	Upgrade safety from an existing production schema
Truncate tables	Fast serial test cleanup	Compatibility with old data or live traffic
Apply all migrations fresh	Current schema can be created from nothing	New migration works against historical data
Upgrade old fixture schema	Migration handles old schema and old data	Lock behavior under production traffic
API compatibility tests	Application survives mixed schema assumptions	Full operational safety at production data size

This distinction matters because a team can have a clean green CI pipeline and still ship a migration that fails against real old rows.

Add A Migration Test Matrix To The Pull Request

Every risky migration should name the states it tested.

A lightweight pull request matrix is enough:

State tested	Required evidence
Fresh schema	Full migration history applies
Previous production schema	New migration applies to old schema fixture
Old rows	Historical rows still read correctly
New writes	New code writes the expected old/new fields
Partial backfill	Reads survive rows before and after backfill
Backfill idempotency	Second run changes zero already-correct rows
Constraint/index risk	Lock/build risk reviewed and rollout plan named
Rollback or forward-fix phase	Stop condition and action are documented

The matrix helps reviewers see what is missing. If a migration changes a column type but has no old-row fixture, that is visible. If it adds a constraint but has no invalid historical data sample, that is visible. If it drops a column but has no proof that old code is gone, that is visible.

The point is not bureaucracy. It is making the mixed state concrete before production creates it for you.

Example CI Shape

A practical migration CI job might have four stages.

jobs:
  migration-tests:
    steps:
      - checkout
      - setup-database

      - name: Validate migration history
        run: ./scripts/migrations/validate

      - name: Apply migrations from empty database
        run: ./scripts/migrations/test-fresh-schema

      - name: Apply migration from previous schema fixture
        run: ./scripts/migrations/test-upgrade-fixture users-full-name

      - name: Run compatibility integration tests
        run: yarn test:integration --grep users-full-name-migration

The scripts can be thin wrappers. The important part is the separation:

history is valid
current schema can be built
the new migration upgrades a realistic old state
application behavior still works through the mixed state

When the migration is large, add separate jobs for backfill replay, lock-risk linting, and rollback or forward-fix rehearsal.

What CI Still Cannot Prove

CI migration tests reduce risk. They do not eliminate it.

They usually cannot prove:

exact lock wait behavior on a hot production table
index build time at production scale
replication lag from a large backfill
cache behavior under live traffic
every old client or worker version still deployed
human rollback timing during an incident

Those belong in the production rollout plan, monitoring, throttling, and stop conditions.

The job of CI is to catch the mistakes that do not require production to discover:

old rows missing required values
unsafe assumptions about partially migrated data
edited or missing migration files
backfills that overwrite newer values
application code that cannot read both old and new shapes
destructive operations with no explicit review
rollback plans that are only generated down migrations

That is already a lot.

Database Migration Testing Checklist

Before merging a database migration, check:

Does migration history validation run in CI?
Does the full migration history apply to an empty database?
Does the new migration apply to a previous-schema fixture?
Does the fixture include old rows, uncomfortable rows, and partially migrated rows?
Can old-row data still be read through the application boundary?
During compatibility phases, does new code write both old and new shapes where needed?
Is the backfill idempotent?
Does the backfill avoid overwriting legitimate newer values?
Are DROP COLUMN, SET NOT NULL, unique constraints, TRUNCATE, and index builds explicitly reviewed?
Does the pull request name rollback, forward-fix, or stop actions for each risky phase?

If the answer is "we only tested this on a fresh database," the migration is not really tested yet.

Takeaway

Database migration testing in CI is about mixed reality.

The real risk is not whether the latest schema can exist. The real risk is whether old data, new data, old code, new code, and partially completed migration work can coexist long enough for the rollout to finish safely.

Good CI checks make those states visible before production has to host them. They do not replace safe rollout planning, but they make the plan much harder to fake.