Why Database Indexes Didn’t Fix Your Slow Query

Situation

A relational database backs a production system with steadily increasing traffic. Most queries perform within expected bounds, but one endpoint consistently shows elevated latency. The query involved is not complex: a small number of joins, a few filters, and an ORDER BY.

After profiling, the slow path is clearly identified. Execution time increases with data volume, and the query appears to scan far more rows than necessary. The behavior is repeatable across environments, though more pronounced in production.

The schema already includes several indexes, but none appear to match the exact access pattern. Adding a new index seems like a straightforward fix.

The Reasonable Assumption

Indexes exist to reduce the amount of data a query must scan. If a query filters on a column and that column is indexed, the database should be able to locate the relevant rows efficiently.

From that perspective, the expectation is simple: adding an index aligned with the query’s WHERE clause should reduce execution time.

This assumption is not naive. It reflects how indexes are commonly introduced and how many performance issues are resolved in practice. In many cases, it works exactly as expected.

What Actually Happened

The index was added successfully. The query plan was rechecked. The index existed and appeared valid.

Execution time did not meaningfully change.

In some runs, the planner ignored the new index entirely. In others, it used the index but still scanned a large portion of the table. Latency remained high, and variability increased slightly under load.

From the outside, nothing was obviously broken. The index was present. The query was unchanged. The database continued to choose a plan that looked inefficient.

Illustrative Code Example

The query shape looked roughly like this:

SELECT *
FROM orders
WHERE status = 'completed'
  AND created_at >= NOW() - INTERVAL '30 days'
ORDER BY created_at DESC
LIMIT 50;

An index was added to support the filters and sort:

CREATE INDEX ON orders (status, created_at DESC);

On inspection, the planner still favored a sequential scan in production-sized datasets.

Why It Happened

The core issue was not the absence of an index, but a mismatch between expectation and how the database evaluates cost.

Several factors contributed.

Index Selectivity Was Lower Than Expected

The assumption behind the index was that filtering on status = 'completed' would significantly reduce the result set.

In reality, most rows in the table had that status.

From the planner’s perspective, the index did not meaningfully narrow the search space. Scanning the index and then fetching rows was estimated to cost roughly the same as scanning the table directly.

As the table grew, this imbalance became more pronounced.

Data Distribution Had Drifted Over Time

The index was added based on current understanding of the data, not its historical evolution.

Earlier in the system’s life, statuses were more evenly distributed. Over time, retention policies and business workflows shifted the dataset toward a dominant state.

Statistics lagged behind reality, and even after being refreshed, they reflected a distribution that made the index less attractive than anticipated.

The Planner Optimized for Throughput, Not Intuition

From an engineering perspective, the query “looked indexed.” From the database’s perspective, the plan chosen minimized overall cost under expected load.

Sequential scans are not inherently slow. When a large percentage of rows must be examined, they can be faster than index lookups due to memory access patterns and reduced random I/O.

The planner’s goal was not to validate the intent behind the index, but to select the least expensive plan based on its model.

The `LIMIT` Did Not Guarantee Early Termination

It was tempting to assume that ORDER BY ... LIMIT 50 would allow the database to stop early.

That only holds when the planner can confidently retrieve rows in the desired order using an index that meaningfully narrows the search.

Because the filter condition was weak, the planner still expected to examine a large portion of the dataset before finding the top 50 matching rows.

The presence of LIMIT did not materially change the cost model.

Production and Non-Production Diverged Subtly

In smaller environments, the index appeared to help.

This reinforced the assumption that the production behavior was anomalous.

In reality, the difference came down to scale. Once the table crossed a certain size and distribution threshold, the cost model flipped. The same index existed in all environments, but only production had crossed the tipping point where it stopped being useful.

Alternatives That Didn’t Work

Several reasonable adjustments were considered.

Reordering the index columns produced marginal changes but did not alter the planner’s decision in production. Adding more columns increased index size and maintenance cost without improving selectivity.

Forcing index usage was briefly tested and immediately discarded. While it reduced latency for this specific query, it introduced worse performance under concurrent load and created a fragile dependency on a particular execution plan.

Breaking the query into multiple steps improved readability but did not change the underlying cost characteristics.

None of these approaches addressed the fundamental issue: the data itself no longer aligned with the assumptions behind the index.

Practical Takeaways

Indexes encode expectations about how data will be filtered and accessed. When those expectations drift from reality, indexes can remain technically correct while becoming practically irrelevant.

Query planners operate on cost models, not intent. They favor plans that minimize expected work, even when those plans conflict with developer intuition.

Data distribution matters as much as query structure. Changes over time can invalidate assumptions without any schema or code changes.

Performance issues that appear suddenly often reflect gradual shifts that only become visible at scale.

Closing Reflection

Indexes are often treated as mechanical optimizations: add one, observe improvement, move on. In practice, they are agreements between query patterns and evolving data.

When those agreements erode, the database does not fail loudly. It simply chooses a different path.

Understanding that behavior requires looking beyond whether an index exists and examining whether it still represents how the system actually operates.