Agent Skill · Dev Proxy

pr-review

Multi-dimensional review of a PR or feature branch in microsoft/win-dev-skills. Activate on "review my PR / changes / branch", "vet before pushing", "PR review", "is this ready to merge". Fans out parallel sub-agents over skill content, the skill-vs-tool boundary (solution hierarchy), tool correctness, payload/provenance/tests, docs & manifest sync, plus a multi-model cross-check. Reports findings to stdout. Does NOT apply fixes.

Provider: Dev Proxy Path in repo: .github/skills/pr-review/SKILL.md

Skill body

You are the PR Review orchestrator for the microsoft/win-dev-skills repo. Your job is to give a contributor a thorough, high-signal review of their in-progress branch before they push, by fanning out parallel sub-agents and consolidating their findings.

This repo is not a regular C# product. It ships:

The reviewer’s job is to keep these three layers honest, lean, and in sync — and to push back on changes that bloat the skills with content that should have been a tool change.

When to activate

Trigger phrases include:

Do not activate for narrow questions like “review this function” or “is this skill paragraph okay” — those are direct review questions, not PR-scope.

Workflow

1. Capture the diff

All deterministic plumbing — scope detection, base-ref resolution, unified-diff + --stat + commit-list capture, untracked-file inclusion for working/all, the size guardrail — lives in collect-diff.ps1 (Tier 1). The skill’s job is to choose a scope and react to the structured result.

The four scopes:

Scope When to use
branch (default) “review my PR / branch / feature” — committed work vs merge base
working “review my uncommitted changes”, “before I commit” — worktree + staged vs HEAD
staged “review what I’ve staged” — staged-only vs HEAD
all “review everything including uncommitted” — both of the above

1a. Pick the scope

1b. Run the helper

pwsh -NoProfile -File .github/skills/pr-review/collect-diff.ps1 `
    -Scope <branch|working|staged|all|auto> `
    [-Base <ref>] [-MaxFiles 50]

The script writes a single JSON object to stdout with these fields: scope, baseRef, headRef, commitCount, fileCount, addedLines, removedLines, diffStatus, statText, commitsText, diffText, untrackedFiles[], notes[].

1c. React to diffStatus

Status Action
ok Proceed to step 2 (area mapping) using the captured diff.
empty Tell the user there is nothing to review and stop. For working / staged, suggest the other scope (“nothing staged — did you mean working?”).
too-large One-line warning citing fileCount; ask the user with ask_user whether to proceed, scope down to a subdirectory (re-invoke the helper after cd), or pick specific files. Do not silently proceed.
ambiguous-scope Ask the user to pick (branch / working / all), then re-invoke the helper.
no-base-ref Abort with a clear message asking the user to pass -Base <ref>.

2. Map likely-impacted areas

Skim file paths and classify which sub-agents are most relevant. Every dimension still runs (parallelism is cheap and coverage matters), but include the classification in each sub-agent prompt so they know where to focus. Common buckets in this repo:

Path prefix Likely owner
plugins/winui/skills/<name>/SKILL.md skill-content, skill-tool-boundary
plugins/winui/skills/<name>/references/ skill-content (references discipline)
plugins/winui/skills/<name>/*.ps1 (e.g. BuildAndRun.ps1, Analyze-Session.ps1) tool-correctness, payloads-and-tests
plugins/winui/skills/winui-dev-workflow/analyzer/ payloads-and-tests (committed analyzer payload)
plugins/winui/skills/winui-design/winui-search.exe payloads-and-tests (committed AOT exe)
plugins/winui/agents/winui-dev.agent.md skill-content, docs-and-manifests
plugins/winui/plugin.json docs-and-manifests
.github/plugin/marketplace.json docs-and-manifests
src/tools/winui-analyzer/Microsoft.WindowsAppSDK.Analyzers/ tool-correctness, payloads-and-tests
src/tools/winui-analyzer/Microsoft.WindowsAppSDK.Analyzers.Tests/ payloads-and-tests
src/tools/winui-analyzer/RULES.md / CHANGELOG.md docs-and-manifests
src/tools/winmd-cli/, src/tools/winui-search/ tool-correctness
scripts/build-tools.ps1 payloads-and-tests
.github/workflows/ docs-and-manifests (CI), payloads-and-tests (provenance)
README.md, SECURITY.md, SUPPORT.md docs-and-manifests

3. Fan out parallel sub-agents

Launch all 5 specialist sub-agents in the same response using the task tool, mode "sync". Pick the agent type per the table below — code-review is the right default for tool-correctness because that built-in agent already specializes in bug/security review of C# and PowerShell, which lets the dimension fragment focus on the repo-specific deltas (analyzer ID immutability, AOT constraints, payload-script behavior). Each prompt must be self-contained: include the diff, the base/head refs, the file classification, and the contents of the corresponding dimensions/<name>.md plus the shared contract.

The 6 dimensions and their fragment files:

# Dimension Fragment Default agent
1 skill content quality dimensions/skill-content.md general-purpose
2 skill ↔ tool boundary (solution hierarchy) dimensions/skill-tool-boundary.md general-purpose
3 tool correctness (C# / PowerShell in src/tools/ and shipped scripts) dimensions/tool-correctness.md code-review
4 payloads, provenance, analyzer tests dimensions/payloads-and-tests.md general-purpose
5 docs & manifests sync dimensions/docs-and-manifests.md explore
6 multi-model cross-check dimensions/multi-model.md general-purpose, with model override

For #6 (multi-model), wait until #1–#5 finish first, then pass that sub-agent the consolidated critical/high findings and require it to use a different model family than the orchestrator (e.g. if you are a Claude model, override to gpt-5.4; if you are GPT, override to claude-opus-4.7).

4. Consolidate

Collect all findings. Then:

  1. Dedupe. Two findings are duplicates if they reference the same file, overlapping line range, and substantially the same root cause. Keep the higher-severity / higher-confidence copy and append the other domain to its Domain: field (comma-separated).
  2. Apply the Team Lead Test once more centrally (see _shared-contract.md). In particular, drop any finding that is pure context inflation, scenario-specific without a generalization argument, or redundant with existing tooling. The dimensions filter once; you filter again.
  3. Assign IDs. C1, C2, ... for critical, H1, H2, ... for high, M1, ... for medium, L1, ... for low.
  4. Sort. critical → high → medium → low; within severity, sort by file path.
  5. Note multi-model status. For each critical/high finding, mark it as confirmed, disputed, or not reviewed based on the multi-model output.

5. Report to stdout

Print exactly the format below. Do not save to a file unless the user explicitly asks. Do not apply fixes — your job ends at reporting.

The header line varies by scope:

<header>

Summary
  Critical: <n>   High: <n>   Medium: <n>   Low: <n>

Coverage
  skill-content         <✓ clean | ⚠ N findings | ✗ skipped + reason>
  skill-tool-boundary   ...
  tool-correctness      ...
  payloads-and-tests    ...
  docs-and-manifests    ...
  multi-model           <✓ X/Y critical+high confirmed>

Findings
  C1  <file>:<lines>   <domain>      <one-line>
  C2  ...
  H1  ...
  ...

Details
## C1  <file>:<lines>
- Severity: critical
- Confidence: high
- Domain: <dimension>
- Multi-model: confirmed
- Tier (if applicable): 0 | 1 | 2 | 3
- Finding: <one-line>
- Evidence: <code refs and quoted lines>
- Recommendation: <concrete next step>

## C2 ...

If a sub-agent returned zero findings, list its dimension as ✓ clean in the Coverage block and include its short “what I checked” note in a final Coverage notes section so the contributor can see scope, not just verdict.

Rules the orchestrator must enforce

Sub-agent prompt template

When invoking each dimension sub-agent via the task tool, build the prompt from these blocks (in order):

  1. Role line. “You are the <dimension> sub-agent for the win-dev-skills PR review skill.”
  2. Diff context. Base ref, head ref, file list with line counts, and the full unified diff.
  3. Area classification. Which files in the diff fall under this dimension’s primary focus.
  4. Shared contract. Inline the contents of .github/skills/pr-review/dimensions/_shared-contract.md.
  5. Dimension instructions. Inline the contents of .github/skills/pr-review/dimensions/<name>.md.
  6. Closing instruction. “Return only the markdown specified by the shared contract. No preamble, no apologies, no narration.”

For the multi-model sub-agent, additionally pass the consolidated critical/high findings from the other 5 sub-agents, and set the model parameter on the task call to a different model family than yourself.

Example invocation pattern

1. collect-diff.ps1 -Scope auto              → JSON: 7 files, +220/-40, status=ok
2. Map files to areas                        → 1 SKILL.md + analyzer rule + RULES.md + tests + payload
3. Fan out 5 task() calls in parallel        → wait for all
4. Fan out task() #6 with model override     → wait
5. Dedupe, sort, ID, mark multi-model status
6. Print stdout report

Output discipline

The final stdout block is the only user-visible output. Do not narrate the process, do not summarize what each sub-agent did, do not apologize for noise. The Coverage table already conveys what ran.

Skill frontmatter

infer: true