
How to Read a PostgreSQL EXPLAIN ANALYZE Output

Turn a wall of planner output into actionable information — find the slow part, understand why it is slow, and know which fixes to try.

EXPLAIN ANALYZE runs a query and shows you exactly how PostgreSQL executed it — what scan method it used, how many rows it processed, and where the time was spent. The output looks intimidating at first, but there are only a handful of patterns you actually need to recognise.


Run it

EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;

Add BUFFERS to also see how many blocks came from cache versus disk:

EXPLAIN (ANALYZE, BUFFERS)
SELECT ...
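
With BUFFERS, each node gains an extra line like the one below (the numbers are purely illustrative). shared hit counts blocks found in PostgreSQL's buffer cache; read counts blocks that had to come from the operating system or disk:

Buffers: shared hit=1220 read=342

A node that reads a large number of blocks on every execution is a candidate for more memory or a better access path.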

Read the output

The result is a tree of nodes, indented to show parent-child relationships. Each node performs one step of the query. The overall query is the root node; the deepest indented nodes run first.

Limit  (cost=1243.56..1243.58 rows=10 width=40) (actual time=48.231..48.234 rows=10 loops=1)
  ->  Sort  (cost=1243.56..1268.56 rows=10000 width=40) (actual time=48.228..48.229 rows=10 loops=1)
        Sort Key: (count(o.id)) DESC
        Sort Method: top-N heapsort  Memory: 25kB
        ->  HashAggregate  (cost=868.00..968.00 rows=10000 width=40) (actual time=42.118..44.891 rows=10000 loops=1)
              Group Key: u.id, u.name
              ->  Hash Right Join  (cost=295.00..743.00 rows=50000 width=16) (actual time=4.532..28.714 rows=50000 loops=1)
                    Hash Cond: (o.user_id = u.id)
                    ->  Seq Scan on orders  (cost=0.00..789.00 rows=50000 width=8) (actual time=0.012..7.341 rows=50000 loops=1)
                    ->  Hash  (cost=170.00..170.00 rows=10000 width=12) (actual time=4.102..4.102 rows=10000 loops=1)
                          ->  Seq Scan on users  (cost=0.00..170.00 rows=10000 width=12) (actual time=0.010..2.218 rows=10000 loops=1)
Planning Time: 1.234 ms
Execution Time: 48.512 ms

Each node reports four figures worth reading:

  • cost=start..total — the planner's estimate. start is the cost before the first row can be returned; total is the cost to return all rows. Units are arbitrary but consistent.
  • actual time=start..total — the real measured time in milliseconds.
  • rows — how many rows were actually returned.
  • loops — how many times this node ran. Actual time is per loop, so multiply by loops for the total; see the worked example below.
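
For example (invented numbers): a node reporting actual time=0.054..0.081 rows=50 loops=1000 ran one thousand times, returned about 50 rows per run, and accounts for roughly 0.081 × 1000 ≈ 81 ms of the total, even though each individual run looks cheap.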

Find the slow node

The actual time on a node includes all of its children. To find the node doing the real work, compare a node's actual time with the combined actual time of its children at each level: the biggest gap marks where the time is genuinely being spent.

Look for a big jump: if the parent took 48ms and its child took 44ms, the parent itself only contributed 4ms. If the parent took 48ms and all children took 5ms total, the parent's own work is 43ms — investigate that node.
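
Applying that to the plan above: the join finished at 28.714 ms and the HashAggregate above it at 44.891 ms, so the aggregation itself accounts for roughly 16 ms; the Sort and Limit together add only a little over 3 ms on top. The join and the aggregation are where this query spends its time.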


The patterns to recognise

Sequential scan on a large table (Seq Scan):

Seq Scan on orders (cost=0.00..789.00 rows=50000 ...)

A sequential scan reads every row. Fine for small tables or when you need most of the rows. A problem when you are selecting a small fraction of a large table. The fix is usually an index.
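
If you are unsure which case you are in, a rough selectivity check helps; the table and column here follow the article's example schema, so substitute your own:

-- What fraction of the table does the filter actually match?
SELECT
    count(*) FILTER (WHERE user_id = 42) AS matching_rows,
    count(*)                             AS total_rows
FROM orders;

If matching_rows is a tiny fraction of total_rows, an index is worth trying; if it is a large fraction, the sequential scan is probably the right plan.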

Row estimate wildly off:

(cost=... rows=10000 ...) (actual ... rows=950000 ...)

The planner estimated 10,000 rows but got 950,000. Bad estimates lead to bad plan choices. Run ANALYZE tablename to update statistics and see if the estimate improves. Outdated statistics are the most common cause of bad query plans.
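
Autovacuum normally keeps statistics fresh, but it can lag behind bulk loads or large deletes. To see when a table was last analysed, query pg_stat_user_tables:

-- When were statistics last gathered for this table?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';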

Nested Loop on large tables:

Nested Loop  (... rows=50000 loops=1)
  ->  ...
  ->  Seq Scan on orders  (... rows=50 loops=1000)

A nested loop runs its inner side once for every row produced by the outer side, so the loops count shows up on the inner node. That is fine when the outer side is small or the inner side is reached through an index. An inner sequential scan repeated a thousand times, like the one sketched above, is almost always the reason a query is slow.
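
One quick diagnostic, useful for a session but never as a permanent setting, is to discourage nested loops and re-run the query. If the plan switches to a hash or merge join and gets dramatically faster, the row estimates feeding the nested-loop decision are the real problem:

SET enable_nestloop = off;   -- diagnostic only, affects this session
EXPLAIN ANALYZE SELECT ...;  -- re-run the slow query
RESET enable_nestloop;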

Hash Join:

A hash join is usually efficient — PostgreSQL builds an in-memory hash table from the smaller side, then probes it. If memory is too small, the hash spills to disk (Batches: 8 instead of Batches: 1 in the output). Increase work_mem for the session if hash batches are high:

SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;
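
The line to check sits under the Hash node and looks roughly like this (numbers illustrative). Batches: 1 means the whole hash table fit in work_mem; anything higher means it spilled:

Buckets: 65536  Batches: 8  Memory Usage: 4097kB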

Check index usage

When a query is slow on a large table and you expect an index to be used:

EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42;

If you see Seq Scan instead of Index Scan, one of the following is usually true (a quick diagnostic follows the list):

  • No index exists on the column you are filtering by
  • An index exists but does not match the query (wrong leading column, an expression the index does not cover, or a type mismatch)
  • The query returns a large enough fraction of the table that a sequential scan is genuinely cheaper (this is correct behaviour)
  • Statistics are outdated and PostgreSQL overestimates how many rows will match
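
The diagnostic: turn off sequential scans for the session and re-run the query. This does not force an index; it only makes sequential scans look very expensive to the planner:

SET enable_seqscan = off;   -- diagnostic only
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42;
RESET enable_seqscan;

If the plan still shows Seq Scan, no usable index exists for this query. If it switches to an Index Scan but runs slower, the planner's original choice was correct.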

Check existing indexes:

\d orders

Or:

SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'orders';
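
To check whether an index that does exist is ever chosen, pg_stat_user_indexes tracks how often each index has been scanned since the statistics were last reset:

SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'orders';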

Create an index and verify

CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders (user_id);

CONCURRENTLY builds the index without locking the table for writes. Run EXPLAIN ANALYZE again after the index exists — the plan should show Index Scan or Bitmap Index Scan instead of Seq Scan.
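
After the index exists, the earlier WHERE user_id = 42 query should produce a node along these lines; the exact costs and timings will differ on your data:

Index Scan using idx_orders_user_id on orders  (cost=0.29..8.44 rows=5 width=32) (actual time=0.019..0.024 rows=5 loops=1)
  Index Cond: (user_id = 42)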

Use SysEmperor's SQL Query Editor to write and format queries before running them — especially useful for longer EXPLAIN ANALYZE outputs that are easier to read when the query itself is clean.


When the planner is still choosing a bad plan

If you have added an index and the planner is still doing a sequential scan on a selective query:

ANALYZE orders;    -- update table statistics

If that does not help, check whether the column has a very skewed distribution (many NULLs, one dominant value). For those cases, partial indexes and extended statistics give the planner more accurate information to work with.
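
A minimal sketch of both, assuming a hypothetical status column on orders where almost every row is 'done' and queries only ever look for the rare 'pending' ones:

-- Partial index: index only the rows the queries actually touch
-- (the status column is an assumed example, not part of the article's schema).
CREATE INDEX idx_orders_pending_user ON orders (user_id) WHERE status = 'pending';

-- Extended statistics: tell the planner that user_id and status are correlated,
-- so it stops treating their selectivities as independent.
CREATE STATISTICS orders_user_status (dependencies) ON user_id, status FROM orders;
ANALYZE orders;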