Database performance degrades slowly, then suddenly. One day everything is fine, the next day your application is crawling. The culprit? Usually queries that gradually became slower as your data grew. Let's explore how pg_stat_statements
helps you catch these issues before your users do.
What is pg_stat_statements?
pg_stat_statements is PostgreSQL's built-in query performance tracker, shipped as a contrib extension. It records execution statistics for every SQL statement the server runs, giving you insight into:
- Which queries consume the most time
- How often queries are executed
- Query performance variability
- Cache efficiency
- Resource consumption patterns
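To get a feel for where these numbers live, you can query the view directly. A minimal sketch (column names match PostgreSQL 13+, where the timing columns are called total_exec_time and mean_exec_time):
-- Quick look at the most-called statements and their key metrics
SELECT queryid
     , calls
     , round(total_exec_time::numeric, 2) AS total_ms
     , round(mean_exec_time::numeric, 2)  AS mean_ms
     , rows
     , shared_blks_hit
     , shared_blks_read
     , left(query, 80) AS query_preview
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 5;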
Setting Up pg_stat_statements
First, you need to enable it (this requires a server restart, since the library must be preloaded):
-- In postgresql.conf
shared_preload_libraries = 'pg_stat_statements'

-- Optional configuration
pg_stat_statements.max = 10000    -- Maximum number of statements tracked
pg_stat_statements.track = 'all'  -- Track all statements, including those inside functions
pg_stat_statements.save = on      -- Save stats across server restarts
Then create the extension:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
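Before moving on, it's worth verifying that the module is actually loaded and collecting data. A quick sanity check:
-- Confirm the library is preloaded (the output should include 'pg_stat_statements')
SHOW shared_preload_libraries;

-- Confirm the view is populated (this errors out if the library is not loaded)
SELECT count(*) AS tracked_statements FROM pg_stat_statements;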
The Essential Queries Every DBA Should Know
1. Find Your Most Expensive Queries
SELECT round(total_exec_time::numeric, 2) AS total_time_ms
     , calls
     , round(mean_exec_time::numeric, 2) AS mean_time_ms
     , round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS percent_total_time
     , regexp_replace(query, E'[\\n\\r ]+', ' ', 'g') AS query
     , queryid
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
This shows which queries consume the most total execution time; these are your prime targets for reducing overall database load.
2. Identify Queries with High Variation
Queries with high standard deviation often indicate missing indexes or data skew:
SELECT calls
     , round(mean_exec_time::numeric, 2) AS mean_ms
     , round(stddev_exec_time::numeric, 2) AS stddev_ms
     , round((stddev_exec_time / mean_exec_time)::numeric, 2) AS coefficient_of_variation
     , regexp_replace(query, E'[\\n\\r ]+', ' ', 'g') AS query
     , queryid
FROM pg_stat_statements
WHERE calls > 100          -- Enough calls for meaningful statistics
  AND mean_exec_time > 10  -- Ignore very fast queries
ORDER BY coefficient_of_variation DESC
LIMIT 20;
3. Find Queries That Return Too Many Rows
Often indicates missing WHERE clauses or inefficient joins:
SELECT calls
     , rows
     , round(rows::numeric / calls, 2) AS rows_per_call
     , round(mean_exec_time::numeric, 2) AS mean_time_ms
     , round((mean_exec_time / (rows / calls))::numeric, 3) AS ms_per_row
     , regexp_replace(query, E'[\\n\\r ]+', ' ', 'g') AS query
     , queryid
FROM pg_stat_statements
WHERE rows / calls > 1000  -- Returns more than 1000 rows on average
ORDER BY rows_per_call DESC
LIMIT 20;
4. Monitor Cache Hit Ratio
Poor cache performance indicates you might need to increase shared_buffers:
-- blk_read_time and blk_write_time are only populated when track_io_timing = on
SELECT sum(blk_read_time) AS total_read_time
     , sum(blk_write_time) AS total_write_time
     , sum(shared_blks_hit) AS cache_hits
     , sum(shared_blks_read) AS cache_misses
     , round((sum(shared_blks_hit)::numeric / (sum(shared_blks_hit) + sum(shared_blks_read))), 4) AS cache_hit_ratio
FROM pg_stat_statements
WHERE shared_blks_hit + shared_blks_read > 0;
Aim for a cache hit ratio above 0.90 (90%).
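The query above gives a global ratio. As a variation on the same idea, you can also rank individual queries by their cache hit ratio to spot the ones doing the most physical reads (a sketch; the 1000-block threshold is illustrative):
-- Per-query cache hit ratio, worst offenders first
SELECT queryid
     , calls
     , shared_blks_hit
     , shared_blks_read
     , round(shared_blks_hit::numeric
             / NULLIF(shared_blks_hit + shared_blks_read, 0), 4) AS cache_hit_ratio
     , left(query, 60) AS query_preview
FROM pg_stat_statements
WHERE shared_blks_hit + shared_blks_read > 1000  -- Ignore queries touching few blocks
ORDER BY cache_hit_ratio ASC NULLS LAST
LIMIT 20;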
5. Track Query Performance Over Time
Create a snapshot table to track trends:
-- Create snapshot table (structure only)
CREATE TABLE pg_stat_statements_snapshots AS
SELECT now() AS snapshot_time, *
FROM pg_stat_statements
WHERE false;

-- Take regular snapshots (schedule this)
INSERT INTO pg_stat_statements_snapshots
SELECT now(), * FROM pg_stat_statements;

-- Analyze performance trends
WITH query_trends AS (
    SELECT date_trunc('hour', snapshot_time) AS hour
         , queryid
         , max(calls) - min(calls) AS calls_in_hour
         , max(total_exec_time) - min(total_exec_time) AS time_in_hour
    FROM pg_stat_statements_snapshots
    WHERE snapshot_time > now() - interval '24 hours'
    GROUP BY hour, queryid
)
SELECT hour
     , sum(calls_in_hour) AS total_calls
     , round(sum(time_in_hour)::numeric / 1000, 2) AS total_seconds
FROM query_trends
GROUP BY hour
ORDER BY hour DESC;
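How you schedule the snapshot INSERT is up to you (cron, your job runner, etc.). If the pg_cron extension happens to be installed, one way to do it (a sketch; the job name and 15-minute interval are arbitrary choices):
-- Requires the pg_cron extension; runs the snapshot every 15 minutes
SELECT cron.schedule(
    'pg_stat_statements_snapshot',
    '*/15 * * * *',
    $$INSERT INTO pg_stat_statements_snapshots SELECT now(), * FROM pg_stat_statements$$
);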
Advanced Analysis Techniques
Finding Queries That Got Slower
Compare current performance with historical baselines:
WITH current_stats AS (
    SELECT queryid
         , mean_exec_time AS current_mean
         , calls AS current_calls
         , query
    FROM pg_stat_statements
    WHERE calls > 100
), historical_stats AS (
    SELECT queryid, avg(mean_exec_time) AS historical_mean
    FROM pg_stat_statements_snapshots
    WHERE snapshot_time BETWEEN now() - interval '7 days' AND now() - interval '1 day'
    GROUP BY queryid
)
SELECT c.queryid
     , round(c.current_mean::numeric, 2) AS current_mean_ms
     , round(h.historical_mean::numeric, 2) AS historical_mean_ms
     , round(((c.current_mean - h.historical_mean) / h.historical_mean * 100)::numeric, 2) AS percent_slower
     , c.current_calls
     , substring(c.query, 1, 60) AS query_preview
FROM current_stats c
JOIN historical_stats h ON c.queryid = h.queryid
WHERE c.current_mean > h.historical_mean * 1.5  -- At least 50% slower
ORDER BY percent_slower DESC;
Identifying Index Opportunities
Queries with high read ratios often benefit from indexes:
SELECT queryid
     , calls
     , shared_blks_read + local_blks_read AS blocks_read
     , shared_blks_hit + local_blks_hit AS blocks_from_cache
     , round(((shared_blks_read + local_blks_read)::numeric
              / NULLIF(shared_blks_read + local_blks_read + shared_blks_hit + local_blks_hit, 0) * 100), 2) AS read_ratio
     , round(mean_exec_time::numeric, 2) AS mean_time_ms
     , substring(query, 1, 60) AS query_preview
FROM pg_stat_statements
WHERE calls > 10
  AND (shared_blks_read + local_blks_read) > 1000  -- Significant reads
ORDER BY blocks_read DESC
LIMIT 20;
Creating a Performance Dashboard
Build views for easy monitoring:
-- Overall health metrics
CREATE VIEW database_health AS
SELECT (SELECT count(*) FROM pg_stat_statements) AS total_unique_queries,
       (SELECT sum(calls) FROM pg_stat_statements) AS total_query_calls,
       (SELECT round(sum(total_exec_time)::numeric / 1000 / 60, 2) FROM pg_stat_statements) AS total_minutes,
       (SELECT round(avg(mean_exec_time)::numeric, 2) FROM pg_stat_statements) AS avg_query_time_ms,
       (SELECT round((sum(shared_blks_hit)::numeric / (sum(shared_blks_hit) + sum(shared_blks_read)))::numeric, 4)
        FROM pg_stat_statements
        WHERE shared_blks_hit + shared_blks_read > 0) AS cache_hit_ratio,
       (SELECT count(*) FROM pg_stat_statements WHERE mean_exec_time > 1000) AS slow_queries_count;

-- Top resource consumers
CREATE VIEW top_queries AS
SELECT row_number() OVER (ORDER BY total_exec_time DESC) AS rank
     , round(total_exec_time::numeric / 1000, 2) AS total_seconds
     , calls
     , round(mean_exec_time::numeric, 2) AS mean_ms
     , round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS percent_of_total
     , query
FROM pg_stat_statements
ORDER BY total_exec_time DESC  -- Keep the LIMIT deterministic so rank stays 1..50
LIMIT 50;

-- Recent problem queries
CREATE VIEW problem_queries AS
SELECT calls
     , round(mean_exec_time::numeric, 2) AS mean_ms
     , round(max_exec_time::numeric, 2) AS max_ms
     , rows
     , query
     , queryid
FROM pg_stat_statements
WHERE (mean_exec_time > 500                                   -- Slow
       OR (stddev_exec_time / NULLIF(mean_exec_time, 0)) > 2  -- High variation
       OR (rows / NULLIF(calls, 0)) > 10000)                  -- Too many rows
ORDER BY total_exec_time DESC;
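Once the views exist, a dashboard or a daily check-in can stay very simple, for example:
-- One-row overview of the whole database
SELECT * FROM database_health;

-- The ten heaviest queries by total execution time
SELECT * FROM top_queries WHERE rank <= 10;

-- Anything currently flagged as problematic
SELECT * FROM problem_queries;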
Best Practices and Tips
1. Regular Maintenance
-- Reset statistics periodically (e.g., after deployments, configuration changes or schema changes)
SELECT pg_stat_statements_reset();

-- Or reset a single statement (PostgreSQL 12+): substitute the actual userid, dbid and queryid values
SELECT pg_stat_statements_reset(userid, dbid, queryid);
2. Combine with EXPLAIN ANALYZE
When you identify a slow query, analyze it:
-- Get the actual query text
SELECT query FROM pg_stat_statements WHERE queryid = 12345678901234567;

-- Then explain it
EXPLAIN (ANALYZE, BUFFERS) <query_text>;
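One caveat worth knowing: pg_stat_statements stores a normalized query text, so literal values are replaced by placeholders like $1. Before running EXPLAIN you need to substitute representative values yourself, for example (the table and values here are hypothetical):
-- Text as stored by pg_stat_statements (normalized):
--   SELECT * FROM orders WHERE customer_id = $1 AND status = $2

-- What you actually explain, with realistic values plugged in
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42 AND status = 'pending';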
Common Pitfalls
- Not enabling track_io_timing: Without I/O timing statistics, blk_read_time and blk_write_time stay at zero and you miss I/O performance data
- Setting pg_stat_statements.max too low: You might miss important queries
- Not resetting after schema changes: Old statistics can be misleading
- Ignoring high-frequency queries: Even fast queries add up when called millions of times (see the sketch after this list)
- Not tracking query patterns: Individual query optimization misses systemic issues
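For that high-frequency pitfall, sorting by call count quickly surfaces the fast-but-frequent statements whose aggregate cost is easy to overlook. A sketch (the 10 ms cutoff for "fast" is an arbitrary choice):
-- Fast individually, expensive in aggregate
SELECT calls
     , round(mean_exec_time::numeric, 3) AS mean_ms
     , round(total_exec_time::numeric / 1000, 2) AS total_seconds
     , left(query, 60) AS query_preview
FROM pg_stat_statements
WHERE mean_exec_time < 10  -- "fast" queries only
ORDER BY calls DESC
LIMIT 20;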
Action Items
Based on pg_stat_statements data, here's your optimization checklist:
- Queries with total_exec_time > 10% of the grand total: These are your bottlenecks (a query to find them follows this list)
- mean_exec_time > 1000ms: Need immediate optimization
- stddev/mean > 2: Likely missing indexes or data skew
- rows/calls > 1000: Consider pagination or filtering
- Cache hit ratio < 0.90: Increase shared_buffers
- Sudden performance degradation: Check for plan changes or data growth
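The first item on this checklist is easy to automate. A sketch that flags any statement responsible for more than 10% of all execution time:
-- Statements consuming more than 10% of total execution time
SELECT *
FROM (
    SELECT queryid
         , calls
         , round(total_exec_time::numeric / 1000, 2) AS total_seconds
         , round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS percent_of_total
         , left(query, 60) AS query_preview
    FROM pg_stat_statements
) ranked
WHERE percent_of_total > 10
ORDER BY percent_of_total DESC;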
Next Steps
While pg_stat_statements shows you what's slow, it doesn't tell you why or how to fix it. The next post will explore how modern monitoring is broken and why we need smarter solutions.
For automated performance analysis and optimization suggestions, check out Azimutt Inspector, which continuously analyzes these metrics and provides actionable recommendations.
Need Performance Help?
Struggling with slow queries or need help interpreting pg_stat_statements data? I offer a free 30-minute consultation session where we can analyze your specific performance challenges and find solutions.
📅 Book a free discussion slot - Let's optimize your database performance together!
Remember
"Performance problems are like icebergs - by the time users see them, 90% of the damage is already done. pg_stat_statements helps you see below the surface."
Monitor proactively, optimize continuously, and sleep peacefully.