Agent Skill · YugabyteDB

explain-plan-analyzer

Analyzes PostgreSQL (and YugabyteDB) EXPLAIN / EXPLAIN ANALYZE query plans to identify performance issues. Use this skill whenever a user pastes an explain plan, asks to analyze a query plan, mentions seq scans or full table scans, wants index recommendations, or asks why a query is slow. Trigger even if the user just says "look at this plan", "what's wrong with this query", or pastes raw EXPLAIN output.

Provider: YugabyteDB Path in repo: skills/explain_plan_analyzer/SKILL.md

Skill body

Explain Plan Analyzer

When a user provides a PostgreSQL or YugabyteDB EXPLAIN or EXPLAIN ANALYZE plan, work through these stages in order: parse → identify issues → recommend fixes → present clearly.


Stage 1: Understand the Plan Type

First determine what you’re working with:

Note this at the start of your response, as it affects how confident you can be about findings.


Stage 2: Parse Key Node Attributes

For each node in the plan tree, extract where present:


Stage 3: Seq Scan Analysis (Highest Priority)

Seq Scans are the first and most critical thing to evaluate.

Significance Thresholds

Flag a Seq Scan as significant (🔴 Critical) if ANY of these apply: | Condition | Reason | |—|—| | Estimated rows > 10,000 | Large table being fully scanned | | Actual rows > 10,000 (if ANALYZE) | Confirmed large scan | | Node is inside a Nested Loop with loops > 10 | Cost multiplies: even 500-row scans become expensive | | “Rows removed by filter” > 50,000 | Massive discard ratio — filter could become an index | | Seq Scan cost > 30% of total plan cost | Dominates the query |

Flag a Seq Scan as minor (🟡 Low priority) if:

For YugabyteDB: Lower the threshold to 1,000 rows for 🔴 Critical. LSM-tree storage makes sequential scans significantly more expensive than in standard Postgres due to SST file structure. Mention this explicitly.

What to Report for Each Significant Seq Scan

  1. Table name and estimated / actual row count
  2. Filter condition — this is your index candidate
  3. Rows removed by filter if present — high discard = strong index opportunity
  4. Loop count if inside a Nested Loop
  5. Specific index recommendation based on the filter

Index recommendation format:

-- If filter is: (status = 'pending')
CREATE INDEX ON orders (status);

-- If filter is: (user_id = $1 AND created_at > $2)
CREATE INDEX ON orders (user_id, created_at);

-- If most rows have status='completed' and you only query 'pending':
CREATE INDEX ON orders (status) WHERE status != 'completed';

Stage 4: Secondary Issue Checks

After seq scans, check for these in order of typical impact:

4a. Row Estimate Mismatch (ANALYZE only)

4b. Nested Loop with Large Row Counts

4c. Hash Join Spilling to Disk

4d. Sort Spilling to Disk

4e. High “Rows Removed by Filter” on an Index Scan

4f. Bitmap Heap Scan with High Recheck

4g. Very High Total Cost / Slow Execution


Stage 5: Output Format

Structure your response exactly as follows:

Plan Type

State whether this is EXPLAIN, EXPLAIN ANALYZE, or EXPLAIN (ANALYZE, BUFFERS), and note any limitations this places on the analysis.

Plan Overview

1–2 sentences: what the query does, overall shape (joins, aggregations, etc.), and total cost or execution time if available.

🔴 Critical Issues

One section per issue. For each:

If no critical issues: say “No critical issues found.”

🟡 Secondary Issues

Brief bullets for lower-priority findings (row estimate mismatches, memory spills, etc.)

If none: omit this section.

Numbered list, ordered by expected impact:

  1. Most impactful fix (usually the biggest seq scan index)
  2. Second fix

Keep this list actionable — real SQL commands where possible.


Guidelines