Database Transaction Boundaries in Backend APIs

Database transaction boundaries in backend APIs decide which parts of a request commit together, which parts can safely happen later, and which failures should roll back user-visible state. A transaction that is too small leaves partial writes behind. A transaction that is too large holds locks while the API waits on slow work, external services, or network calls.

The hard part is not "use transactions." The hard part is choosing the boundary around the invariant your API must protect.

This article is part of a broader series on API and data correctness. It connects to SQL Isolation Levels Explained, How to Prevent Race Conditions in Backend Systems, and Transactional Outbox Pattern in Microservices.


The Boundary Is A Product Decision

Consider POST /orders.

The request may need to:

  1. validate the cart
  2. check inventory
  3. create an order
  4. create order items
  5. decrement stock
  6. create an idempotency record
  7. enqueue receipt email
  8. publish order.created
  9. return a response

Some of that work belongs in one transaction. Some does not.

The transaction should protect the invariant:

If the API returns "order created," the order, order items, inventory reservation, and idempotency response must agree.

That does not mean the receipt email must be sent inside the transaction. It means the durable intent to send the email should be recorded safely before the response claims success.

The boundary is not technical ceremony. It is the API's correctness promise.


Too Small: Partial Writes Leak Through

A transaction that is too small often looks like this:

const order = await db.orders.create({
  customerId,
  status: 'placed',
})

for (const item of items) {
  await db.orderItems.create({
    orderId: order.id,
    productId: item.productId,
    quantity: item.quantity,
  })
}

await db.inventory.decrement(items)

If inventory decrement fails after the order and items are inserted, the system now has an order that cannot actually be fulfilled.

The API may return an error, but the database has already recorded part of the operation.

That is not only a cleanup problem. It creates ambiguity:

  • Should support see the order?
  • Should the user retry?
  • Should the retry create a second order?
  • Should downstream jobs process it?

For multi-row writes that represent one business operation, the default should be one transaction.
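The rollback behavior that a single transaction buys you can be shown with a small in-memory sketch. The `makeDb` and `placeOrder` helpers below are stand-ins, not a real driver: the point is only that when any step throws, every staged write disappears together.

```javascript
// Minimal in-memory sketch of one-transaction semantics. If the callback
// throws, the snapshot is restored, so no partial writes survive.
function makeDb() {
  const state = { orders: [], orderItems: [], inventory: { 'sku-1': 1 } };
  return {
    state,
    async transaction(fn) {
      const snapshot = JSON.parse(JSON.stringify(state)); // naive rollback copy
      try {
        await fn(state);
      } catch (err) {
        Object.assign(state, snapshot); // roll back every staged write
        throw err;
      }
    },
  };
}

async function placeOrder(db, items) {
  await db.transaction(async (tx) => {
    tx.orders.push({ id: 1, status: 'placed' });
    for (const item of items) {
      tx.orderItems.push({ orderId: 1, ...item });
      if (tx.inventory[item.productId] < item.quantity) {
        throw new Error('out of stock'); // rolls back the order and items too
      }
      tx.inventory[item.productId] -= item.quantity;
    }
  });
}
```

With this shape, an out-of-stock failure leaves no orphaned order row behind, so none of the ambiguity questions above ever arise.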


Too Large: The API Holds Locks While Waiting

A transaction that is too large often looks like this:

await db.transaction(async (tx) => {
  const order = await tx.orders.create(...)
  await tx.inventory.decrement(...)

  const payment = await paymentProvider.charge(...)

  await tx.payments.create({
    orderId: order.id,
    providerPaymentId: payment.id,
  })

  await email.sendReceipt(order.id)
})

This feels safe because everything is in one block.

It is not safe.

The transaction now waits on a payment provider and an email service while holding database locks and open connection-pool capacity. If the provider becomes slow, the API can create lock waits, connection exhaustion, and deadlocks. If the payment succeeds but the transaction later rolls back, the external side effect already happened.

External systems do not roll back because your database transaction rolled back.

Do not put slow network calls inside the database transaction.


What Usually Belongs Inside The Transaction

Put work inside the transaction when it must commit or roll back with the core state.

For an order API:

await db.transaction(async (tx) => {
  const idempotency = await tx.idempotencyKeys.reserve({
    scope: `customer:${customerId}:orders`,
    key: idempotencyKey,
    requestHash,
  })

  const order = await tx.orders.create({
    customerId,
    status: 'placed',
    totalCents,
  })

  await tx.orderItems.createMany({
    orderId: order.id,
    items,
  })

  await tx.inventory.reserve({
    items,
    reason: `order:${order.id}`,
  })

  await tx.outboxEvents.create({
    eventType: 'order.created',
    aggregateType: 'order',
    aggregateId: order.id,
    payload: { orderId: order.id, customerId, totalCents },
  })

  await tx.idempotencyKeys.markCompleted({
    id: idempotency.id,
    responseStatus: 201,
    responseBody: { id: order.id, status: 'placed' },
  })
})

This transaction includes:

  • idempotency reservation
  • order creation
  • child rows
  • inventory reservation
  • durable outbox event
  • replayable idempotency response

Those pieces define the committed API outcome.

If any of them fails, the API should not claim the order was created.


What Usually Belongs Outside The Transaction

Keep work outside the transaction when it is slow, external, or retryable from durable state.

Usually outside:

  • Payment provider call: authorize before or after, with an explicit state machine
  • Email send: outbox row or background job
  • Webhook dispatch: outbox row plus retrying dispatcher
  • Search indexing: outbox or asynchronous projection
  • Analytics event: outbox or best-effort async path
  • Long report generation: background job
  • File upload to object storage: separate staged workflow

The safe replacement for "do it inside the transaction" is not "hope the async work happens."

The safe replacement is "record durable work inside the transaction, then process it after commit."

That is exactly the role of the Transactional Outbox Pattern in Microservices.
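The dispatcher side of that pattern can be as small as a polling loop. The sketch below is an illustrative in-memory version (a production dispatcher against PostgreSQL would also want SELECT ... FOR UPDATE SKIP LOCKED, batching, and backoff); `publish` is a stand-in for the real broker client.

```javascript
// Illustrative outbox dispatcher: pending rows are published and marked
// sent only after the broker accepted them; failures stay pending and
// are retried on the next tick instead of being lost.
function makeDispatcher(outboxRows, publish) {
  return async function tick() {
    for (const row of outboxRows.filter((r) => r.status === 'pending')) {
      try {
        await publish(row.eventType, row.payload);
        row.status = 'sent'; // only after successful publish
        row.sentAt = Date.now();
      } catch {
        row.attempts = (row.attempts || 0) + 1; // retry on a later tick
      }
    }
  };
}
```

Note the delivery guarantee this gives is at-least-once, not exactly-once: a crash between publish and the status update republishes the event, so consumers must deduplicate by event id.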


Transaction Isolation Does Not Replace Invariants

Isolation levels define what concurrent transactions can observe. They do not decide your business invariant for you.

PostgreSQL's transaction isolation documentation describes the standard isolation levels and the phenomena they allow or prevent, such as dirty reads, nonrepeatable reads, phantom reads, and serialization anomalies. See the PostgreSQL docs on transaction isolation.

For most backend APIs, the question is more concrete:

Which rows must not be changed by another request between my check and my write?

If the API checks inventory and then reserves it, the invariant is not "I used a transaction." The invariant is "two concurrent orders cannot reserve the same last unit."

A robust transaction usually uses one of:

  • atomic conditional updates
  • unique constraints
  • row locks
  • serializable transactions with retry handling
  • optimistic version checks

For example:

UPDATE inventory
SET available = available - $1
WHERE product_id = $2
  AND available >= $1
RETURNING product_id, available;

This is often safer than:

const inventory = await tx.inventory.find(productId)

if (inventory.available < quantity) {
  throw new OutOfStock()
}

await tx.inventory.update(productId, {
  available: inventory.available - quantity,
})

The second shape can be safe only if the read and write are protected by the right lock or isolation behavior.

The first shape makes the invariant part of the write.
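The same check-and-write-as-one-step shape can be sketched in application code against an in-memory store (names here are illustrative). In single-threaded JavaScript the atomicity is trivial; in SQL, the `WHERE available >= $1` clause makes the database enforce the same property across concurrent sessions.

```javascript
// Illustrative atomic conditional decrement, mirroring the conditional
// UPDATE above: the availability check and the write are one step, so a
// reservation either fully happens or does not happen at all.
function makeInventory(initial) {
  const available = new Map(Object.entries(initial));
  return {
    // Returns true only if enough stock existed at the moment of the write.
    reserve(productId, quantity) {
      const current = available.get(productId) || 0;
      if (current < quantity) return false; // condition failed, no write
      available.set(productId, current - quantity);
      return true;
    },
    availableFor: (productId) => available.get(productId) || 0,
  };
}
```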


Locks Last Until The Transaction Ends

When you explicitly lock rows, remember that the lock duration is the transaction duration.

PostgreSQL's explicit locking documentation states that row-level locks block writers and lockers to the same row and are released at transaction end. See PostgreSQL explicit locking.

That has a practical API consequence:

await db.transaction(async (tx) => {
  const account = await tx.accounts.findForUpdate(accountId)

  await slowFraudService.check(account)

  await tx.accounts.update(accountId, {
    status: 'approved',
  })
})

The row lock stays held while slowFraudService.check(...) runs.

If that call takes two seconds and many requests need the same account row, the API has created a lock queue.

Prefer:

  1. gather data without lock when possible
  2. call slow external service outside the transaction
  3. open a short transaction
  4. re-check the invariant
  5. write the final state

Short transactions are not only faster. They reduce the time other requests spend waiting behind locks.
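The steps above can be sketched as follows. The `store` map stands in for the short transaction's locked read-and-write, and `fraudCheck` for the slow external service; both names are illustrative.

```javascript
// Illustrative restructuring: the slow call runs while holding no locks,
// and a short critical section re-checks the invariant before writing.
async function approveAccount(store, fraudCheck, accountId) {
  // 1-2. Read without a lock, then call the slow service outside any transaction.
  const snapshot = { ...store.get(accountId) };
  const verdict = await fraudCheck(snapshot);
  if (!verdict.ok) return { approved: false };

  // 3-5. Short critical section: re-check the state, then write.
  const current = store.get(accountId);
  if (current.status !== 'pending') {
    return { approved: false }; // state changed while we were waiting
  }
  store.set(accountId, { ...current, status: 'approved' });
  return { approved: true };
}
```

The re-check in step 4 is what makes this safe: the slow call saw a snapshot, so the final write must confirm that snapshot is still the truth.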


Use State Machines For Multi-Step Work

Some workflows cannot fit inside one transaction because they include external side effects.

Payments are the classic example.

Do not pretend this is one atomic operation:

create order
charge card
send email
publish event

Model it as states:

  • pending_payment: order exists, payment not confirmed
  • payment_authorized: provider authorized the charge
  • placed: order is committed for fulfillment
  • payment_failed: payment failed or expired
  • cancelled: order was intentionally stopped

Then each transition has its own transaction boundary.

For example:

await db.transaction(async (tx) => {
  const order = await tx.orders.findForUpdate(orderId)

  if (order.status !== 'pending_payment') {
    return
  }

  await tx.orders.update(order.id, {
    status: 'payment_authorized',
    providerPaymentId,
  })

  await tx.outboxEvents.create({
    eventType: 'order.payment_authorized',
    aggregateId: order.id,
    payload: { orderId: order.id, providerPaymentId },
  })
})

This is easier to reason about than one giant transaction that waits on every system.

It also gives support and reconciliation jobs a truthful state to inspect.
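One way to keep those transitions honest is a small table of allowed moves, checked inside each transition's transaction before the update. Which transitions are legal is a product decision; this particular table is an illustrative reading of the states above.

```javascript
// Illustrative allowed-transition table for the order states above.
// Checking it before every status update turns invalid moves into
// no-ops or errors instead of silent corruption.
const TRANSITIONS = {
  pending_payment: ['payment_authorized', 'payment_failed', 'cancelled'],
  payment_authorized: ['placed', 'payment_failed', 'cancelled'],
  placed: ['cancelled'],
  payment_failed: ['pending_payment', 'cancelled'],
  cancelled: [], // terminal state
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```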


Idempotency Belongs Near The Boundary

If clients retry the API request, the transaction boundary must include idempotency.

For POST /orders, the idempotency record should be reserved before the side effect runs and completed with the durable response before the transaction commits.

That gives the API a safe answer when the client retries after a timeout:

  • in_progress: return conflict or retry-after
  • completed with the same request hash: replay the original response
  • completed with a different request hash: reject the key reuse
  • no record: reserve and process

Without that boundary, the transaction may protect database rows, but a retried request can still create duplicate business operations.
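That decision table can be written down directly as a small pure function consulted at the start of the handler. The record shape, status codes, and action names here are illustrative.

```javascript
// Maps an existing idempotency record (or null) plus the incoming
// request hash to the API's next move, following the table above.
function decideIdempotentAction(record, requestHash) {
  if (!record) {
    return { action: 'reserve_and_process' };
  }
  if (record.status === 'in_progress') {
    return { action: 'conflict', httpStatus: 409 }; // or retry-after
  }
  if (record.requestHash === requestHash) {
    // Same key, same request body: replay the stored response verbatim.
    return { action: 'replay', httpStatus: record.responseStatus, body: record.responseBody };
  }
  // Same key, different request body: the client reused a key incorrectly.
  return { action: 'reject_key_reuse', httpStatus: 422 };
}
```

Keeping this logic pure makes the retry behavior easy to test without a database.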

For the full API pattern, see API Idempotency Keys: Prevent Duplicate Requests Safely.


What To Test

API transaction tests should prove behavior through the real persistence boundary.

Test cases:

  • success path creates all rows: transaction commits the whole operation
  • validation failure creates no rows: bad input does not leak partial writes
  • inventory failure creates no order: rollback covers related writes
  • duplicate idempotency key replays the response: retry boundary is durable
  • concurrent requests for the last unit: invariant holds under overlap
  • outbox row exists after commit: post-commit work has durable intent
  • external provider timeout: API state remains truthful

These are integration tests, not only unit tests. The article How to Write API Integration Tests covers the test shape in more detail.


Practical Checklist

Before finalizing an API transaction boundary, ask:

  • What invariant must commit atomically?
  • Which writes must roll back together?
  • Which work is slow or external and should happen outside the transaction?
  • Is durable post-commit work recorded inside the transaction?
  • Are locks held only for the shortest reasonable time?
  • Does the transaction re-check data that may have changed since earlier reads?
  • Are side-effecting retries protected by idempotency?
  • Does the workflow need a state machine instead of one large transaction?
  • Do tests prove rollback, concurrency, and retry behavior?
  • Can support inspect the state if the request fails halfway?

If the transaction boundary is unclear, the API contract is unclear too.


Final Takeaway

A database transaction should be wrapped around the business invariant, not around the entire request handler by habit.

Keep the atomic state change inside. Keep slow external calls outside. Record durable post-commit work before returning success. Use idempotency for retried API calls. Use tests that prove the boundary under failure and concurrency.

That is how transactions become part of API correctness instead of just a block of code around database calls.