
How to Prevent Race Conditions in Backend Systems
If you want to know how to prevent race conditions in backend systems, the short answer is this: make correctness depend on enforced invariants, not on lucky request timing.
Race conditions are one of the most common reasons backend systems behave correctly in development but fail under real concurrency. The code path looks valid. The database query looks valid. Each request seems reasonable on its own.
The bug appears because correctness depended on the order of events, and production did not preserve that order.
Quick Answer: How to Prevent Race Conditions
The most reliable ways to prevent race conditions in backend systems are:
- enforce invariants in the database with constraints and conditional updates
- use optimistic locking when conflicts should fail fast
- use pessimistic locking when only one actor may proceed
- add idempotency for retried writes and duplicate requests
- design background jobs for duplicate delivery
- test concurrency explicitly instead of assuming sequential execution
The rest of this article explains when each approach works and what kinds of race condition bugs it prevents.
What a Race Condition Actually Is
A race condition happens when the outcome of a workflow depends on the timing or interleaving of concurrent operations.
The important part is not just that two things happen at the same time. It is that the result changes depending on which one wins the race.
That makes race conditions especially common in backend systems because many operations overlap:
- multiple API requests updating the same row
- background workers processing related jobs
- retries after timeouts
- two services reacting to the same event stream
- cache reads and writes happening around the same mutation
When those overlaps are not controlled explicitly, the system can still pass tests and code review while violating real business rules in production.
Why Backend Systems Produce Race Conditions So Easily
Most backend code is written in a sequence:
- read current state
- decide what should happen
- write the new state
That sequence feels atomic when reading code. In production, it usually is not.
Between the read and the write, another request may:
- update the same row
- insert a conflicting row
- trigger a retry
- claim the same job
- complete the same business action first
That gap is where race conditions live.
The system is often not failing because one query is wrong. It is failing because several individually valid operations are allowed to overlap without a rule that preserves the invariant you care about.
Race Condition Example: Inventory Oversell
Suppose two users try to buy the last item at the same time.
Your handler looks like this:
```typescript
async function purchaseProduct(productId: string) {
  const product = await db.product.findUnique({
    where: { id: productId },
  });

  if (!product || product.stock <= 0) {
    throw new Error('Out of stock');
  }

  await db.product.update({
    where: { id: productId },
    data: { stock: product.stock - 1 },
  });
}
```
This looks correct for one request.
Under concurrency:
- request A reads stock = 1
- request B reads stock = 1
- request A updates stock to 0
- request B also updates stock to 0
Now two purchases succeeded even though only one unit existed.
The bug is not in the subtraction. The bug is that the read-check-write sequence was not protected against concurrent access.
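The interleaving above can be reproduced in-process, with no database at all. In this minimal sketch, a module-level `stock` variable stands in for the product row, and an awaited timeout stands in for the gap between the read and the write where another request can run:

```typescript
// In-memory stand-in for the product row.
let stock = 1;

async function purchase(label: string): Promise<string> {
  const current = stock; // read
  await new Promise((resolve) => setTimeout(resolve, 0)); // another request runs here
  if (current <= 0) {
    return `${label}: out of stock`;
  }
  stock = current - 1; // write based on the (now stale) read
  return `${label}: purchased`;
}

async function simulateOversell() {
  const results = await Promise.all([purchase("A"), purchase("B")]);
  return { results, stock };
}

const oversellRun = simulateOversell();
oversellRun.then(({ results, stock: remaining }) => {
  // Both requests report "purchased" even though only one unit existed.
  console.log(results, "stock =", remaining);
});
```

Both tasks read `stock = 1` before either writes, so both pass the check and both "succeed". The simulation is deterministic only because Node schedules the timeouts in order; a real database under load gives you no such guarantee.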
Common Race Condition Examples in Backend Systems
Race conditions appear wherever correctness depends on uniqueness, ordering, or shared state.
Payment and checkout flows
Common failures:
- duplicate charges after retries
- two requests creating the same order
- payment marked successful twice through duplicated callbacks
If the problem includes retried writes, see Idempotency Keys for Duplicate API Requests.
Background job processing
Common failures:
- the same job runs twice after worker crash and redelivery
- two workers claim the same job
- a retry repeats an external side effect
If that boundary is familiar, see Background Jobs in Production.
Inventory, booking, and scheduling systems
Common failures:
- overselling limited stock
- double-booking a room or appointment
- assigning one resource to two consumers
Account balances and counters
Common failures:
- lost updates
- inconsistent totals
- balance checks based on stale state
Event-driven systems
Common failures:
- duplicate event handling
- out-of-order state transitions
- one service observing state before another commit is visible
If the write must commit together with a later published event, see Transactional Outbox Pattern in Microservices.
Why Transactions Alone Often Do Not Fix Race Conditions
One of the most common misconceptions in backend code is:
"If I wrap it in a transaction, the race condition is solved."
Sometimes that is true. Often it is not.
Transactions give you atomicity for the work inside one transaction. They do not automatically guarantee that your business invariant is protected against every competing transaction.
That depends on:
- isolation level
- lock behavior
- uniqueness constraints
- query shape
- retry behavior
- whether external side effects happen inside or outside the transaction boundary
For example, Read Committed may still allow two transactions to read the same state before either one commits its update.
If you want the database-level view of that tradeoff, see SQL Isolation Levels Explained.
The practical question is not:
"Am I using a transaction?"
It is:
"What concurrent interleavings can still violate the invariant I care about?"
The Main Ways to Prevent Race Conditions
There is no single universal fix. The right protection depends on what must remain true.
1. Enforce invariants in the database
Application checks are useful, but correctness should not depend on them alone when concurrent writers exist.
Strong protections include:
- unique constraints
- foreign keys
- check constraints
- conditional updates that succeed only when the current state still matches expectations
Example:
```sql
UPDATE products
SET stock = stock - 1
WHERE id = $1
  AND stock > 0;
```
If this update affects 0 rows, the item was already out of stock.
This is safer than:
- read stock
- check if stock is positive
- write a new value later
because the validation and update happen together at the write boundary.
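The same write-boundary pattern can be expressed in application code. This is a sketch, assuming a hypothetical in-memory `inventory` map and a `tryDecrement` helper that plays the role of the conditional UPDATE above; in a real system the database executes the check and the write as one atomic statement:

```typescript
// Hypothetical in-memory stand-in for the products table.
const inventory = new Map<string, number>([["p1", 1]]);

function tryDecrement(productId: string): number {
  // Mimics "UPDATE products SET stock = stock - 1 WHERE id = $1 AND stock > 0"
  // and returns the number of "rows" affected. Synchronous, so the check and
  // the write cannot be interleaved in this sketch.
  const stock = inventory.get(productId);
  if (stock === undefined || stock <= 0) return 0; // 0 rows affected
  inventory.set(productId, stock - 1);
  return 1; // 1 row affected
}

async function purchaseProductSafe(productId: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 0)); // simulate request scheduling
  if (tryDecrement(productId) === 0) {
    return "out of stock";
  }
  return "purchased";
}

const safeRun = Promise.all([purchaseProductSafe("p1"), purchaseProductSafe("p1")]);
safeRun.then((results) => {
  // Exactly one purchase succeeds; the other observes zero affected rows.
  console.log(results, "stock =", inventory.get("p1"));
});
```

The handler no longer decides based on an earlier read; it asks for the decrement and learns from the affected-row count whether it won.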
2. Use optimistic locking when conflicts should fail fast
Optimistic locking works well when conflicts are possible but not constant.
Typical pattern:
- read the row with its version
- update with WHERE id = ? AND version = ?
- increment the version on success
- retry or surface a conflict on failure
This is useful when:
- contention is moderate
- work should not block for long
- users can retry safely
Example:
```sql
UPDATE accounts
SET balance = balance - 100,
    version = version + 1
WHERE id = 42
  AND version = 7;
```
If no row is updated, another writer changed the row first.
For the full tradeoff, see Optimistic vs Pessimistic Locking in SQL.
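The read-retry loop around that statement can be sketched in application code. Here an in-memory `account` object stands in for the database row, and a hypothetical `updateIfVersion` helper mimics the version-checked UPDATE, reporting whether a row was affected:

```typescript
// Hypothetical in-memory stand-in for the accounts row.
let account = { id: 42, balance: 500, version: 7 };

function updateIfVersion(expectedVersion: number, newBalance: number): boolean {
  // Mimics "UPDATE ... WHERE id = ? AND version = ?"; false means 0 rows affected.
  if (account.version !== expectedVersion) return false;
  account = { ...account, balance: newBalance, version: expectedVersion + 1 };
  return true;
}

async function withdraw(amount: number, maxRetries = 3): Promise<boolean> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const { balance, version } = account; // read the row with its version
    await new Promise((resolve) => setTimeout(resolve, 0)); // a concurrent writer may run here
    if (balance < amount) return false;
    if (updateIfVersion(version, balance - amount)) return true; // won the write
    // Conflict: another writer moved the version first. Re-read and retry.
  }
  return false;
}

const withdrawRun = Promise.all([withdraw(100), withdraw(100)]);
withdrawRun.then((results) => {
  // One withdrawal wins immediately; the other detects the conflict and retries.
  console.log(results, account);
});
```

Both withdrawals eventually succeed, but neither ever overwrites the other's update: the losing writer sees zero affected rows and retries against fresh state.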
3. Use pessimistic locking when only one actor may proceed
Some workflows are easier to reason about if one transaction explicitly locks the row while making the decision.
Typical example:
```sql
SELECT *
FROM jobs
WHERE id = $1
FOR UPDATE;
```
This is useful when:
- the invariant is strict
- conflicts are common
- duplicate success would be expensive
- waiting is safer than allowing concurrent success
That said, locking is not free. It can reduce throughput, increase contention, and create deadlock risk if used carelessly.
4. Add idempotency for retried write operations
Many race conditions are caused not by human concurrency but by retries:
- client timeout after successful commit
- proxy retry
- worker redelivery
- user double-submit
In those cases, idempotency is often the right protection.
An idempotency key lets repeated attempts map to one logical action instead of creating multiple side effects.
This is especially important for:
- payments
- order creation
- subscriptions
- webhook handling
- async command processing
I covered the implementation details in Idempotency Keys for Duplicate API Requests.
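A minimal sketch of the mechanism, assuming an in-memory key store; a real implementation would persist the key with a unique constraint so that concurrent duplicates collide at the database, not in application memory:

```typescript
// Hypothetical in-memory idempotency store: key -> stored result.
const completedActions = new Map<string, string>();
let chargesExecuted = 0;

async function chargeOnce(idempotencyKey: string, amount: number): Promise<string> {
  const previous = completedActions.get(idempotencyKey);
  if (previous !== undefined) {
    return previous; // repeated attempt: replay the stored result, no new side effect
  }
  chargesExecuted += 1; // the real side effect (e.g. calling a payment provider)
  const result = `charged ${amount}`;
  // Check and record have no await between them, so this sketch cannot interleave;
  // a persistent store needs an atomic insert to get the same property.
  completedActions.set(idempotencyKey, result);
  return result;
}

const idempotencyRun = (async () => {
  const first = await chargeOnce("order-123", 100);
  const retry = await chargeOnce("order-123", 100); // timeout retry or double-submit
  return { first, retry, chargesExecuted };
})();
idempotencyRun.then((out) => console.log(out)); // one charge executed, same result twice
```

The retry maps to the same logical action: it returns the original result without charging again.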
5. Design background jobs for duplicate execution
Background systems should usually be assumed to have at-least-once delivery semantics unless proven otherwise.
That means:
- a job may run twice
- acknowledgment may be lost
- side effects may happen before crash
- retries may reorder outcomes
Safer job design includes:
- deduplication keys
- idempotent handlers
- state transitions that can be retried safely
- explicit claim semantics for worker ownership
For a deeper production view, see Background Jobs in Production.
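The claim-semantics idea above can be sketched as follows, assuming an in-memory job table. The hypothetical `claimJob` helper mimics a conditional `UPDATE jobs SET status = 'claimed' WHERE id = ? AND status = 'pending'`: only the worker whose update affects a row may proceed.

```typescript
type JobStatus = "pending" | "claimed" | "done";
// Hypothetical in-memory stand-in for the jobs table.
const jobs = new Map<string, JobStatus>([["job-1", "pending"]]);

function claimJob(jobId: string): boolean {
  // Synchronous check-and-set, mimicking an atomic conditional UPDATE.
  if (jobs.get(jobId) !== "pending") return false; // someone else already claimed it
  jobs.set(jobId, "claimed");
  return true;
}

async function worker(name: string, jobId: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 0)); // redelivery / second worker
  if (!claimJob(jobId)) {
    return `${name}: skipped (already claimed)`;
  }
  jobs.set(jobId, "done"); // process exactly once
  return `${name}: processed`;
}

const workerRun = Promise.all([worker("w1", "job-1"), worker("w2", "job-1")]);
workerRun.then((results) => {
  // Only one worker processes the job; the duplicate delivery is a no-op.
  console.log(results, jobs.get("job-1"));
});
```

Duplicate delivery still happens; the design simply makes it harmless.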
6. Publish events reliably across failure boundaries
One common race-adjacent failure looks like this:
- service writes database state
- service tries to publish an event
- process crashes between those steps
Now one part of the system observes the write while another never sees the event.
That is not just a messaging bug. It is a correctness boundary problem under failure and concurrency.
The transactional outbox pattern is one of the most practical ways to make that boundary safer. I covered it in Transactional Outbox Pattern in Microservices.
How to Choose the Right Protection
A useful rule is:
- if duplicates must never succeed, enforce uniqueness or locking
- if conflicts are acceptable but must be detected, use optimistic locking
- if retries are expected, add idempotency
- if async processing is involved, design for duplicate delivery
- if correctness depends on durable event publication, use an outbox-style boundary
Do not start with the tool. Start with the invariant.
Ask:
- What must never happen twice?
- What state must remain unique?
- Can two actors succeed at the same time?
- Is waiting acceptable, or should one side fail fast?
- Will retries happen even when the original request already succeeded?
Once that is clear, the protection becomes easier to choose.
How to Test for Race Conditions
Race conditions are easy to miss because normal test execution often runs too cleanly and too sequentially.
A useful testing approach includes:
- sending concurrent requests against the real endpoint
- running the same workflow many times in parallel
- asserting database state after all requests finish
- forcing retry behavior and duplicate delivery
- testing both success and conflict paths
For example, if you are testing an order-creation endpoint, do not only assert that one request succeeds. Also assert that two concurrent requests with the same logical action do not create two orders.
This is one of the reasons integration tests are so valuable for concurrency-sensitive behavior. I covered a practical testing approach in How to Write API Integration Tests.
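The testing checklist above can be sketched as a concrete shape, using hypothetical in-memory helpers in place of a real endpoint. The point is the structure: run the same workflow many times in parallel, then assert on final state rather than only on individual responses:

```typescript
// Hypothetical limited stock for the test scenario.
let remaining = 3;

async function buy(): Promise<boolean> {
  await new Promise((resolve) => setTimeout(resolve, 0)); // stands in for a real HTTP request
  if (remaining <= 0) return false;
  remaining -= 1; // safe here only because nothing awaits between check and write
  return true;
}

const concurrencyTest = (async () => {
  // Fire ten concurrent attempts against three units of stock...
  const results = await Promise.all(Array.from({ length: 10 }, () => buy()));
  return { successes: results.filter(Boolean).length, remaining };
})();

concurrencyTest.then(({ successes, remaining: left }) => {
  // ...then assert on the invariant: successes can never exceed the initial stock.
  console.log(`successes=${successes} remaining=${left}`);
});
```

Against a real endpoint the parallel calls would be HTTP requests and the final assertion would query the database, but the shape is the same: parallel load first, invariant check last.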
Warning Signs You Already Have a Race Condition
These production symptoms are common:
- duplicate records that "should be impossible"
- occasional oversells or double-bookings
- counters that drift under load
- jobs processed twice after retries
- bugs that appear only at higher concurrency
- correctness issues that disappear when stepping through the code slowly
If the bug is intermittent, load-sensitive, and difficult to reproduce locally, a race condition should be high on the list of suspects.
A Practical Debugging Checklist
When you suspect a race condition, work through this sequence:
- Identify the invariant that failed.
- Find the exact read-check-write or side-effect boundary involved.
- Determine which concurrent actor could overlap with it.
- Check whether correctness currently depends only on application logic.
- Verify whether the database enforces the invariant directly.
- Review retries, background delivery semantics, and duplicate submission paths.
- Decide whether the fix should be a constraint, lock, idempotency layer, or workflow redesign.
This framing matters because race conditions are rarely solved by "being more careful" in application code. They are solved by making the system correct even when timing is unfavorable.
Final Thought
Race conditions are not unusual edge cases in backend systems. They are a normal consequence of shared state, retries, concurrency, and distributed failure boundaries.
The goal is not to make concurrent systems perfectly ordered. The goal is to design them so that correctness does not depend on lucky timing.
If your backend handles money, inventory, jobs, retries, or asynchronous workflows, race-condition prevention is not an optimization topic. It is part of the core correctness model of the system.