
How to Prevent Duplicate API Requests with Idempotency Keys
Situation
A customer clicks "Pay" once. Your API receives the request, starts processing, and then the client times out before getting a response. The client retries. Now two writes race through the system, and you risk duplicate charges, duplicate orders, and inconsistent state.
This is a standard production failure mode in distributed systems. Retries are necessary for reliability, but without idempotency they can turn partial failures into data integrity problems.
What an Idempotency Key Actually Solves
An idempotency key lets clients identify retries of the same logical operation. Instead of treating every request as new work, the server can detect repeats and return the original outcome.
For write endpoints (payments, orders, subscriptions, transfers), this provides a critical guarantee:
- one logical action produces one effect
- retries replay the original response instead of creating new side effects
Idempotency does not mean "the endpoint is always safe." It means repeated requests with the same key and same intent are handled as one operation.
Why Duplicate Requests Happen in Real Systems
Duplicates happen even with good engineering teams:
- client timeouts after server-side work already committed
- network drops after response is sent
- mobile retry behavior during connectivity transitions
- load balancer/proxy retries
- user double-submits in UI
- workers retrying queue messages after transient failures
If your architecture includes retries, duplicates are not edge cases. They are expected behavior.
Idempotency and Retries Must Be Designed Together
Retries increase availability. Idempotency preserves correctness.
If you add retries without idempotency, outages can get worse because each retry may create additional writes. If you add idempotency without a sane retry strategy, latency and thundering-herd patterns still hurt reliability.
Use both:
- client retries with jittered exponential backoff
- server-side idempotency keys on non-idempotent writes
- clear retry windows aligned with key TTL
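The client half of this pairing is often underspecified. A minimal sketch of jittered exponential backoff, assuming illustrative base delay, cap, and attempt-count defaults:

```typescript
// Full-jitter exponential backoff: the delay grows exponentially with the
// attempt number, is capped, and is then drawn uniformly from [0, cap) so
// that synchronized clients do not retry in lockstep.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 10_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt); // exponential growth, capped
  return Math.random() * ceiling;                          // "full jitter"
}

// Retry wrapper; maxAttempts is an illustrative default, not a recommendation.
async function retryWithBackoff<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
  throw lastError;
}
```

Every retried request in this loop should carry the same idempotency key; only a new logical operation gets a new key.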
Request Lifecycle with Idempotency Keys
A robust flow for `POST /payments` looks like this:
1. Client sends an `Idempotency-Key` header and the request body.
2. Server validates the request.
3. Server atomically reserves or inserts the key record.
4. If the key is new: process the request and persist the final response snapshot.
5. If the key exists and is completed: return the stored response.
6. If the key exists and is in progress: return a conflict or retry-after policy response.
The critical part is step 3. If key reservation is not atomic, concurrent duplicates can still execute twice.
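The atomicity requirement of step 3 can be illustrated in a single process with an in-memory map, where JavaScript's single-threaded execution makes a synchronous check-and-set atomic. This is a teaching sketch only; a real deployment has many instances and needs a database unique constraint or an atomic cache primitive instead.

```typescript
type KeyStatus = "in_progress" | "completed";

// Single-process stand-in for the idempotency store. The (scope, key) pair
// is flattened into one map key, mirroring a UNIQUE (scope, key) constraint.
const records = new Map<string, { status: KeyStatus; response?: unknown }>();

// Returns true only for the first caller with a given (scope, key) pair.
// The check and the set happen in one synchronous step, with no await
// between them, so no concurrent request can slip in between.
function tryReserve(scope: string, key: string): boolean {
  const id = `${scope}:${key}`;
  if (records.has(id)) return false; // another request already owns this key
  records.set(id, { status: "in_progress" });
  return true;
}
```

The moment the check and the write are separated by I/O, the guarantee is gone; that is exactly the failure mode the database constraint exists to prevent.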
Data Model for Idempotency Records
A practical table shape:
```sql
CREATE TABLE api_idempotency (
    id            BIGSERIAL PRIMARY KEY,
    key           TEXT NOT NULL,
    scope         TEXT NOT NULL,        -- tenant/user/account scope
    request_hash  TEXT NOT NULL,        -- hash of canonical request payload
    status        TEXT NOT NULL,        -- in_progress | completed | failed
    response_code INT,
    response_body JSONB,
    error_type    TEXT,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at    TIMESTAMPTZ NOT NULL,
    UNIQUE (scope, key)
);
```
Design notes:
- `scope` prevents cross-tenant key collisions.
- `request_hash` ensures repeated key usage with a different payload is rejected.
- storing a response snapshot enables deterministic replay.
- `expires_at` supports cleanup and bounded storage growth.
Handling Concurrency Races Correctly
Two identical requests can hit different instances at the same time. Application-level checks alone are insufficient. Use a database uniqueness constraint or atomic cache primitive.
Example in TypeScript-style pseudo-code:
```typescript
async function handlePayment(req, res) {
  const key = req.headers['idempotency-key'];
  const scope = req.auth.accountId;
  const requestHash = hashCanonical(req.body);

  const existing = await repo.findByScopeAndKey(scope, key);
  if (existing && existing.requestHash !== requestHash) {
    // Same key, different intent: never replay, never process.
    res.statusCode = 409;
    res.end('Idempotency key reused with different payload');
    return;
  }
  if (existing?.status === 'completed') {
    // Replay the stored response instead of charging again.
    res.statusCode = existing.responseCode;
    res.end(JSON.stringify(existing.responseBody));
    return;
  }

  const reserved = await repo.tryInsertInProgress({
    scope,
    key,
    requestHash,
    expiresAt: ttlFromNowHours(24),
  });
  if (!reserved) {
    // Another request owns processing right now.
    res.statusCode = 409;
    res.setHeader('Retry-After', '1');
    res.end('Request with this key is already in progress');
    return;
  }

  try {
    const payment = await payments.charge(req.body);
    const response = { id: payment.id, status: 'succeeded' };
    await repo.markCompleted({ scope, key, responseCode: 201, responseBody: response });
    res.statusCode = 201;
    res.end(JSON.stringify(response));
  } catch (error) {
    await repo.markFailed({ scope, key, errorType: classify(error) });
    throw error;
  }
}
```
The reservation operation should map to a single atomic write (`INSERT ... ON CONFLICT DO NOTHING` in SQL, or `SETNX` in Redis).
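The handler above calls a `hashCanonical` helper that the pseudo-code leaves undefined. One way to implement it, assuming Node's built-in `crypto` module: serialize the payload with object keys sorted recursively, so semantically identical bodies hash the same regardless of key order.

```typescript
import { createHash } from "node:crypto";

// Recursively sort object keys so that JSON.stringify produces the same
// string for payloads that differ only in key order.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    return Object.fromEntries(
      Object.keys(obj).sort().map((k) => [k, canonicalize(obj[k])]),
    );
  }
  return value;
}

// SHA-256 over the canonical serialization; stored in request_hash.
function hashCanonical(body: unknown): string {
  return createHash("sha256")
    .update(JSON.stringify(canonicalize(body)))
    .digest("hex");
}
```

Canonicalization matters: without it, a client that reorders JSON keys on retry would trip the payload-mismatch check and get a spurious 409.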
Response Replay Policy
When the same key returns:
- completed request: return the original status and body
- in-progress request: return `409` or `425`/`429` based on policy
- expired key: treat as new operation (or reject, depending on product risk)
For payment-like domains, replaying the original response is usually preferable to recomputing. It gives clients stable behavior across network retries.
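The replay rules above can be captured in one pure decision function. The record shape and the `409` plus `Retry-After` policy for in-progress keys here are illustrative choices, not a fixed standard:

```typescript
type IdempotencyRecord =
  | { status: "completed"; responseCode: number; responseBody: unknown }
  | { status: "in_progress" };

type ReplayDecision =
  | { kind: "replay"; code: number; body: unknown }      // return stored outcome
  | { kind: "conflict"; code: number; retryAfterSeconds: number }
  | { kind: "process" };                                 // new (or expired) key

function decide(record: IdempotencyRecord | undefined): ReplayDecision {
  if (!record) return { kind: "process" };
  if (record.status === "completed") {
    return { kind: "replay", code: record.responseCode, body: record.responseBody };
  }
  return { kind: "conflict", code: 409, retryAfterSeconds: 1 };
}
```

Keeping the policy in one pure function makes it trivially unit-testable and keeps transport concerns out of the decision itself.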
TTL Strategy and Replay Window
TTL should match client retry behavior and business risk.
Common defaults:
- low-risk writes: 1-6 hours
- financial writes: 24-72 hours
- async workflows: aligned to max processing + retry horizon
If the TTL is too short, duplicates slip through after expiration. If it is too long, you risk storage bloat and accidental key reuse.
Document TTL in API docs so clients know replay guarantees.
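A sketch of the TTL side, including the `ttlFromNowHours` helper the handler above relies on. The batched `DELETE` shape targets the `api_idempotency` table from the schema above; the batch size is an illustrative choice, and the SQL is returned as a string rather than executed because the database client is out of scope here:

```typescript
const PURGE_BATCH = 1_000;

// Cleanup job SQL: delete expired records in bounded batches so the job
// never holds long locks or produces huge transactions. Run it on a timer.
function purgeExpiredSql(): string {
  return `
    DELETE FROM api_idempotency
    WHERE id IN (
      SELECT id FROM api_idempotency
      WHERE expires_at < now()
      LIMIT ${PURGE_BATCH}
    )`;
}

// TTL helper used when inserting an in-progress record; 24h matches the
// financial-write default discussed above.
function ttlFromNowHours(hours: number, now: Date = new Date()): Date {
  return new Date(now.getTime() + hours * 3_600_000);
}
```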
Error Handling Semantics
Decide what gets persisted and replayed:
- validation errors (`4xx`): usually replayed
- deterministic business errors (insufficient funds, policy rejection): replayed
- transient infrastructure errors (`5xx`): policy choice
A common approach:
- if write did not start, allow safe retry as new attempt
- if write may have committed, keep idempotency record and replay authoritative result
Ambiguous commit boundaries are where systems fail. Make commit state explicit in code and storage.
Security and Abuse Considerations
Idempotency keys are part of your write-protection surface. Apply basic safeguards:
- require high-entropy keys (UUID v4 or equivalent)
- scope keys by tenant/account/user
- enforce max key length
- rate-limit key creation per principal
- reject key reuse with payload mismatch
Never let one tenant probe another tenant's idempotency history. Scope is non-negotiable.
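A minimal edge check covering the first three safeguards. The UUID v4 requirement and the 255-character cap are example policy choices, not mandates:

```typescript
const MAX_KEY_LENGTH = 255;
// Version nibble must be 4, variant nibble must be 8/9/a/b.
const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Returns the key if acceptable, or null if the request should be rejected
// before any idempotency-store lookup happens.
function validateIdempotencyKey(key: unknown): string | null {
  if (typeof key !== "string") return null;
  if (key.length > MAX_KEY_LENGTH) return null;
  if (!UUID_V4.test(key)) return null; // require high-entropy, well-formed keys
  return key;
}
```

Rejecting malformed keys before touching storage keeps low-entropy or attacker-chosen keys out of the uniqueness namespace entirely.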
Common Implementation Mistakes
Teams often get idempotency wrong by:
- checking for key existence without atomic reservation
- not storing request hash
- replaying success for mismatched payloads
- storing only "seen=true" without response snapshot
- choosing TTL shorter than retry windows
- forgetting to cover async consumers and webhook handlers
The pattern is simple, but correctness depends on strict edge-case handling.
End-to-End Example: Preventing Duplicate Charges
Client sends:
```http
POST /payments
Idempotency-Key: 0f95f3cd-5f8f-41f6-80d5-7ab7de5da56a
Content-Type: application/json

{
  "orderId": "ord_123",
  "amount": 4999,
  "currency": "USD",
  "methodId": "pm_9x2"
}
```
If the first attempt succeeds but the response is lost, a retry with the same key should return the same 201 payload and payment ID, not create a second charge.
That behavior is what protects customer trust and reduces refund/reconciliation overhead.
Production Rollout Checklist
Before enabling idempotency in production:
- define key header contract in API docs
- ship atomic reservation in storage layer
- enforce `(scope, key)` uniqueness
- store and replay authoritative response
- validate payload hash on key reuse
- define TTL + cleanup job
- instrument metrics: new keys, replays, mismatches, in-progress conflicts
- run concurrency tests with duplicated requests
Idempotency should be observable, not hidden. If replay and conflict metrics spike, something upstream changed.
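The duplicated-request test from the checklist can be sketched against an in-memory stand-in for the endpoint. `handle`, the promise-sharing store, and the simulated latency are all simplifications, but the assertion pattern (fire identical requests concurrently, count side effects) carries over directly to a real integration test:

```typescript
let charges = 0; // counts real side effects; the test asserts exactly one
const inFlight = new Map<string, Promise<{ code: number }>>();

async function handle(scope: string, key: string): Promise<{ code: number }> {
  const id = `${scope}:${key}`;
  const existing = inFlight.get(id);
  if (existing) return existing; // duplicate joins the original outcome

  const work = (async () => {
    await new Promise((resolve) => setTimeout(resolve, 10)); // simulated charge latency
    charges += 1;
    return { code: 201 };
  })();
  inFlight.set(id, work); // reservation happens before any await in handle()
  return work;
}
```

In a real test suite, the two requests would go over HTTP to separate app instances, and the assertion would count rows in the payments table.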
Closing Reflection
Retries are required in distributed systems. Duplicate side effects are optional.
Idempotency keys are the boundary between the two. When implemented with atomic reservation, request hashing, scoped uniqueness, and deterministic response replay, they turn a common failure mode into predictable behavior your clients can rely on.