Exercise 3: AI-Assisted Remediation (~15 min)
Goal: Understand HOW Copilot Autofix works internally, evaluate its fix quality against your threat model, and use security campaigns for strategic batch remediation.
Part A: Copilot Autofix — Single Alert (~7 min)
📝 Open `docs/autofix-evaluation.md` — record your quality evaluation as you review the Autofix suggestion.
How Copilot Autofix Works
```
CodeQL Alert
│
├─ Vulnerability type + severity
├─ Data flow path (source → sink)
├─ Surrounding code context
└─ Best practice for this vulnerability class
│
▼
LLM (GPT-4o / GPT-5.1)
│
├─ Code fix (diff)
├─ Natural language EXPLANATION of the fix
└─ Confidence level
```
Autofix is NOT just “AI writes code.” It receives the FULL CodeQL analysis — including the data flow path you traced in Exercise 2 — and generates a fix with a human-readable explanation of WHY the change addresses the vulnerability.
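The source → sink flow Autofix receives can be made concrete with a minimal sketch. This is illustrative only; the function and variable names are placeholders, not the workshop's actual `sql-injection.js` code:

```javascript
// SOURCE: untrusted input, e.g. a query-string parameter.
// SINK: string concatenation embeds the tainted value in the SQL text.
function buildUserQuery(username) {
  return "SELECT * FROM users WHERE name = '" + username + "'";
}

// A crafted input breaks out of the string literal and changes the
// query's logic; this is the data flow path CodeQL reports.
const payload = "' OR '1'='1";
console.log(buildUserQuery(payload));
// SELECT * FROM users WHERE name = '' OR '1'='1'
```

Because Autofix sees this whole path, its suggestion targets the sink (the concatenation), not just the symptom.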
Step 1: Navigate to the Alert
Navigate to the CodeQL alert from Exercise 2 (SQL injection in sql-injection.js):
- Go to Security tab → Code scanning alerts → click the SQL injection alert
Step 2: Generate a Fix
Click “Generate fix” (Copilot Autofix)
Step 3: Read the Explanation FIRST
Before reviewing the diff, read the natural language explanation:
- Autofix provides a natural language explanation of the vulnerability and the proposed fix
- Verify: Does the explanation correctly identify the vulnerability type?
- Verify: Does the reasoning for the fix make sense?
- Verify: Does it reference the data flow you traced in Exercise 2 (source → sink)?
Step 4: Review the Suggested Diff
Review the suggested diff carefully:
- Does it use parameterized queries instead of string concatenation?
- Does it handle edge cases (null inputs, special characters)?
- Does it introduce any new issues (e.g., breaking the function signature)?
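A correct diff should look roughly like the sketch below. The `db` driver and `findUser` name are assumptions for illustration; any driver with placeholder support (e.g. mysql2's `?` placeholders; pg uses `$1`-style) works the same way:

```javascript
// BEFORE (vulnerable): tainted input concatenated into the SQL text.
// return db.query("SELECT * FROM users WHERE name = '" + username + "'");

// AFTER: the SQL text is a constant; the value travels as a bound
// parameter, so the driver always treats it as data, never as SQL.
function findUser(db, username) {
  return db.query("SELECT * FROM users WHERE name = ?", [username]);
}

// Even a malicious value stays inert: it is passed in the parameter
// array and never spliced into the query string.
const stub = { query: (sql, params) => ({ sql, params }) };
const result = findUser(stub, "' OR '1'='1");
console.log(result.sql); // SELECT * FROM users WHERE name = ?
```

Note the function signature is unchanged, which is one of the checklist items above.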
Step 5: Check Alignment with Threat-Derived Instructions
Your .github/copilot-instructions.md (from the Warm-Up) says:
- Row 2: “Always use parameterized queries — never string concatenation”
Does the Autofix suggestion follow this instruction? If yes, the threat model → AI fix alignment is confirmed.
Step 6: Rate the Fix Quality
| Rating | Meaning | Action |
|---|---|---|
| ✅ Correct | Fix is complete, aligns with instructions, introduces no new issues | Apply and commit |
| ⚠️ Partially correct | Fix addresses the vulnerability but needs refinement | Apply, then manually improve |
| ❌ Incorrect | Fix is wrong or introduces new vulnerabilities | Reject and fix manually |
Step 7: Apply the Fix
If the fix is correct, apply it:
- Click “Commit fix” to create a commit with the Copilot-suggested change
- Verify the CodeQL alert is resolved after the next scan
> [!NOTE]
> Speed comparison: This fix took ~30 seconds with Autofix. A manual fix for SQL injection typically takes 15–30 minutes (understanding context, writing parameterized queries, testing). That’s 30–60x faster for a single alert. At scale across hundreds of alerts, the productivity impact is transformative.

> [!IMPORTANT]
> Copilot Autofix is not “accept all.” Every suggestion requires human evaluation. The value is speed of remediation, not removal of human judgment. The explanation + threat model alignment check ensures you understand WHAT is being fixed and WHY.
Part B: Security Campaigns — Batch Remediation (~8 min)
📝 Open `docs/security-campaign-tracker.md` — track your campaign strategy and per-fix evaluations.
Step 1: Choose a Campaign Strategy
Before creating a campaign, decide HOW to batch alerts:
| Strategy | When to Use | Best For |
|---|---|---|
| By severity | Burn down critical/high alerts first | Risk-prioritized remediation |
| By vulnerability type | Fix one class systematically | Consistent fix patterns (e.g., all SQL injection) |
| By repository | Focus one codebase at a time | Team-scoped remediation sprints |
| All remaining | Quick cleanup of workshop alerts | This workshop |
For this workshop, batch by vulnerability type — select all remaining SQL injection and dependency alerts.
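Before creating the campaign, it can help to see open alerts grouped by rule. A minimal sketch, assuming the alert shape (`number`, `rule.id`) of the GitHub REST API's `GET /repos/{owner}/{repo}/code-scanning/alerts` response; fetching is omitted and the sample data is made up:

```javascript
// Group open code scanning alerts by CodeQL rule id so each campaign
// can target one vulnerability class at a time.
function groupAlertsByRule(alerts) {
  const groups = {};
  for (const alert of alerts) {
    (groups[alert.rule.id] ??= []).push(alert.number);
  }
  return groups;
}

// Sample data shaped like the code-scanning alerts API response.
const openAlerts = [
  { number: 1, rule: { id: "js/sql-injection" } },
  { number: 2, rule: { id: "js/sql-injection" } },
  { number: 3, rule: { id: "js/path-injection" } },
];
console.log(groupAlertsByRule(openAlerts));
```

The largest group is usually the best candidate for a "by vulnerability type" campaign, since one fix pattern applies to every alert in it.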
Step 2: Create the Campaign
- Navigate to the Security tab → Security campaigns
- Click “New campaign”
- Name: `Workshop 2 — SQL Injection Remediation`
- Select remaining open GHAS alerts matching your chosen strategy
- Set the campaign scope
Step 3: Assign to Copilot Coding Agent
- Select the campaign → click “Assign to Copilot”
- Copilot coding agent spins up in a sandboxed, ephemeral GitHub Actions environment
- The agent analyzes alerts, implements fixes, runs tests, and opens a pull request
Step 4: Monitor Agent Progress
- Navigate to the Agents tab (or the PR page)
- Observe: which files are being modified, what fixes are being applied
- Note: the agent reads your `.github/copilot-instructions.md` — your threat-derived instructions guide its fix approach
Step 5: Governance Review — Evaluate EACH Fix Individually
Open the PR created by Copilot coding agent. For each fix in the batch PR, evaluate:
| # | Alert | Fix Correct? | Aligns with Instructions? | Introduces Regressions? |
|---|---|---|---|---|
| 1 |  | ✅/⚠️/❌ | Yes / No | Yes / No |
| 2 |  | ✅/⚠️/❌ | Yes / No | Yes / No |
| 3 |  | ✅/⚠️/❌ | Yes / No | Yes / No |
Batch ≠ auto-approve. Each fix needs individual evaluation. A batch PR with 8 correct fixes and 2 incorrect ones should NOT be merged as-is.
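That rule can be expressed as a tiny decision helper (a sketch; the verdict labels and shape are illustrative, not any GitHub API):

```javascript
// A batch PR is mergeable only when EVERY fix passed individual review;
// a single incorrect fix blocks the whole batch.
function batchMergeDecision(reviews) {
  const rejected = reviews
    .filter((r) => r.verdict !== "correct")
    .map((r) => r.alert);
  return rejected.length === 0
    ? { merge: true, rework: [] }
    : { merge: false, rework: rejected };
}

// 8 correct fixes + 2 incorrect ones => do not merge; rework the two.
const reviews = [
  ...Array.from({ length: 8 }, (_, i) => ({ alert: i + 1, verdict: "correct" })),
  { alert: 9, verdict: "incorrect" },
  { alert: 10, verdict: "incorrect" },
];
console.log(batchMergeDecision(reviews));
// { merge: false, rework: [ 9, 10 ] }
```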
Step 6: Discuss Scale and Governance
This campaign addressed a handful of alerts. At enterprise scale:
- 500 alerts across 50 repos — how would you organize campaigns?
- Who creates campaigns? — Requires the Security Manager or Admin role
- Who reviews batch PRs? — Security champions per team, or centralized review?
- What if some fixes are wrong? — Request changes on specific commits, or split the PR?
Key Insight
Copilot Autofix provides targeted, single-alert fixes with explainable reasoning — verify against your threat model. Security campaigns enable managing security debt at scale by batching alerts strategically and leveraging Copilot coding agent. Both require human judgment: Autofix for quality, campaigns for strategy and governance.
> [!TIP]
> Run `scripts/verify-exercise3.sh` to validate your Exercise 3 completion.
Hand-Off to Workshop 3
🔗 Our guardrails caught vulnerable code, leaked secrets, and risky dependencies. The code is now clean and merged. But is the PIPELINE that builds and deploys it trustworthy? How do we know the artifact in production is actually what we reviewed? That’s supply chain integrity — the focus of Workshop 3.