explain-plan-analyzer
Analyzes PostgreSQL (and YugabyteDB) EXPLAIN / EXPLAIN ANALYZE query plans to identify performance issues. Use this skill whenever a user pastes an explain plan, asks to analyze a query plan, mentions seq scans or full table scans, wants index recommendations, or asks why a query is slow. Trigger even if the user just says "look at this plan", "what's wrong with this query", or pastes raw EXPLAIN output.
Skill body
Explain Plan Analyzer
When a user provides a PostgreSQL or YugabyteDB EXPLAIN or EXPLAIN ANALYZE plan, work through these stages in order: parse → identify issues → recommend fixes → present clearly.
Stage 1: Understand the Plan Type
First determine what you’re working with:
- EXPLAIN only — has cost estimates but no actual timing or row counts
- EXPLAIN ANALYZE — has both estimates and actuals (actual time=, actual rows=, loops=)
- EXPLAIN (ANALYZE, BUFFERS) — also includes I/O hit/miss data
Note this at the start of your response, as it affects how confident you can be about findings.
Stage 2: Parse Key Node Attributes
For each node in the plan tree, extract where present:
- Node type: Seq Scan, Index Scan, Index Only Scan, Bitmap Heap Scan, Hash Join, Nested Loop, Merge Join, Sort, Hash, Aggregate, etc.
- Relation name: the table or index being accessed
- rows: planner’s estimated row count
- actual rows: real rows returned (ANALYZE only)
- cost: startup..total cost
- actual time: startup..total ms per loop (ANALYZE only)
- loops: how many times this node executed (ANALYZE only)
- filter / index cond / recheck cond: the predicate applied
- rows removed by filter: how many rows were scanned but discarded
- Buffers hit/read: I/O activity (BUFFERS only)
Stage 3: Seq Scan Analysis (Highest Priority)
Seq Scans are the first and most critical thing to evaluate.
Significance Thresholds
Flag a Seq Scan as significant (🔴 Critical) if ANY of these apply: | Condition | Reason | |—|—| | Estimated rows > 10,000 | Large table being fully scanned | | Actual rows > 10,000 (if ANALYZE) | Confirmed large scan | | Node is inside a Nested Loop with loops > 10 | Cost multiplies: even 500-row scans become expensive | | “Rows removed by filter” > 50,000 | Massive discard ratio — filter could become an index | | Seq Scan cost > 30% of total plan cost | Dominates the query |
Flag a Seq Scan as minor (🟡 Low priority) if:
- Estimated rows < 1,000 AND not inside a loop
- The table is clearly a small config/lookup table
- Query total cost is already very low (< 100)
For YugabyteDB: Lower the threshold to 1,000 rows for 🔴 Critical. LSM-tree storage makes sequential scans significantly more expensive than in standard Postgres due to SST file structure. Mention this explicitly.
What to Report for Each Significant Seq Scan
- Table name and estimated / actual row count
- Filter condition — this is your index candidate
- Rows removed by filter if present — high discard = strong index opportunity
- Loop count if inside a Nested Loop
- Specific index recommendation based on the filter
Index recommendation format:
-- If filter is: (status = 'pending')
CREATE INDEX ON orders (status);
-- If filter is: (user_id = $1 AND created_at > $2)
CREATE INDEX ON orders (user_id, created_at);
-- If most rows have status='completed' and you only query 'pending':
CREATE INDEX ON orders (status) WHERE status != 'completed';
Stage 4: Secondary Issue Checks
After seq scans, check for these in order of typical impact:
4a. Row Estimate Mismatch (ANALYZE only)
- If estimated rows vs actual rows differ by > 10×, flag stale or bad statistics
- Common cause: recent bulk loads, heavy deletes, or correlated columns
- Recommendation:
ANALYZE table_name;orCREATE STATISTICSfor correlated columns - Very large mismatches (> 100×) can cause the planner to choose completely wrong join strategies
4b. Nested Loop with Large Row Counts
- A Nested Loop is efficient for small outer sets (< ~1,000 rows)
- If outer side has many rows AND inner side does a Seq Scan or expensive lookup: flag it
- The actual cost is: outer_rows × inner_cost_per_loop
- Recommendation: consider forcing a Hash Join via
SET enable_nestloop = offtemporarily to test, or add an index on the join column
4c. Hash Join Spilling to Disk
- Visible as:
Batches: Nwhere N > 1 in a Hash node - Means the hash table didn’t fit in
work_mem - Recommendation:
SET work_mem = '64MB';(or higher, test in session first)
4d. Sort Spilling to Disk
- Visible as:
Sort Method: external mergeorexternal sort - Recommendation: increase
work_mem
4e. High “Rows Removed by Filter” on an Index Scan
- Means the index is finding rows but many are discarded by a post-scan filter
- The index may not be selective enough, or a composite index with the filter column would help
4f. Bitmap Heap Scan with High Recheck
Rows Removed by Index Recheck: Nbeing high means lossy bitmap pages- Recommendation: increase
work_memso the bitmap stays exact
4g. Very High Total Cost / Slow Execution
- If EXPLAIN ANALYZE shows total actual time > 1,000ms for what should be a simple query, flag it even if no single obvious issue dominates
Stage 5: Output Format
Structure your response exactly as follows:
Plan Type
State whether this is EXPLAIN, EXPLAIN ANALYZE, or EXPLAIN (ANALYZE, BUFFERS), and note any limitations this places on the analysis.
Plan Overview
1–2 sentences: what the query does, overall shape (joins, aggregations, etc.), and total cost or execution time if available.
🔴 Critical Issues
One section per issue. For each:
- Issue title (e.g., “Seq Scan on
orders— 450,000 rows”) - What it is and why it matters
- Specific fix with SQL if applicable
If no critical issues: say “No critical issues found.”
🟡 Secondary Issues
Brief bullets for lower-priority findings (row estimate mismatches, memory spills, etc.)
If none: omit this section.
✅ Recommended Actions
Numbered list, ordered by expected impact:
- Most impactful fix (usually the biggest seq scan index)
- Second fix
- …
Keep this list actionable — real SQL commands where possible.
Guidelines
- Always explain why something is a problem, not just that it is
- Be specific: name the table, show the filter, write the actual CREATE INDEX statement
- If the plan is truncated, say so and analyze what’s available
- Do not recommend indexes that are already being used (already showing Index Scan in the plan)
- If the plan looks healthy, say so clearly — a good plan is worth confirming
- For EXPLAIN-only (no ANALYZE): caveat your findings with “based on planner estimates”
- If the user mentions this is YugabyteDB, apply the lower thresholds and add YB-specific context