Exercise 3: AI-Assisted Remediation (~15 min)
Goal: Understand HOW Copilot Autofix works internally, evaluate its fix quality against your threat model, and use security campaigns for strategic batch remediation.
Part A: Copilot Autofix — Single Alert (~7 min)
📝 Open `docs/autofix-evaluation.md` — record your quality evaluation as you review the Autofix suggestion.
How Copilot Autofix Works
```
CodeQL Alert
│
├─ Vulnerability type + severity
├─ Data flow path (source → sink)
├─ Surrounding code context
└─ Best practice for this vulnerability class
│
▼
LLM (GPT-4o / GPT-5.1)
│
├─ Code fix (diff)
├─ Natural language EXPLANATION of the fix
└─ Confidence level
```
Autofix is NOT just “AI writes code.” It receives the FULL CodeQL analysis — including the data flow path you traced in Exercise 2 — and generates a fix with a human-readable explanation of WHY the change addresses the vulnerability.
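The source → sink flow Autofix receives can be made concrete with a minimal sketch. This is illustrative only; the function and variable names are placeholders, not the workshop's actual `sql-injection.js` code:

```javascript
// SOURCE: untrusted input, e.g. a query-string parameter.
// SINK: string concatenation embeds the tainted value in the SQL text.
function buildUserQuery(username) {
  return "SELECT * FROM users WHERE name = '" + username + "'";
}

// A crafted input breaks out of the string literal and changes the
// query's logic; this is the data flow path CodeQL reports.
const payload = "' OR '1'='1";
console.log(buildUserQuery(payload));
// SELECT * FROM users WHERE name = '' OR '1'='1'
```

Because Autofix sees this whole path, its suggestion targets the sink (the concatenation), not just the symptom.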
Step 1: Navigate to the Alert
Navigate to the CodeQL alert from Exercise 2 (SQL injection in sql-injection.js):
- Go to Security tab → Code scanning alerts → click the SQL injection alert
Step 2: Generate a Fix
Click “Generate fix” (Copilot Autofix)
Step 3: Read the Explanation FIRST
Before reviewing the diff, read the natural language explanation:
- Autofix provides a natural language explanation of the vulnerability and the proposed fix
- Verify: Does the explanation correctly identify the vulnerability type?
- Verify: Does the reasoning for the fix make sense?
- Verify: Does it reference the data flow you traced in Exercise 2 (source → sink)?
Step 4: Review the Suggested Diff
Review the suggested diff carefully:
- Does it use parameterized queries instead of string concatenation?
- Does it handle edge cases (null inputs, special characters)?
- Does it introduce any new issues (e.g., breaking the function signature)?
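A correct diff should look roughly like the sketch below. The `db` driver and `findUser` name are assumptions for illustration; any driver with placeholder support (e.g. mysql2's `?` placeholders; pg uses `$1`-style) works the same way:

```javascript
// BEFORE (vulnerable): tainted input concatenated into the SQL text.
// return db.query("SELECT * FROM users WHERE name = '" + username + "'");

// AFTER: the SQL text is a constant; the value travels as a bound
// parameter, so the driver always treats it as data, never as SQL.
function findUser(db, username) {
  return db.query("SELECT * FROM users WHERE name = ?", [username]);
}

// Even a malicious value stays inert: it is passed in the parameter
// array and never spliced into the query string.
const stub = { query: (sql, params) => ({ sql, params }) };
const result = findUser(stub, "' OR '1'='1");
console.log(result.sql); // SELECT * FROM users WHERE name = ?
```

Note the function signature is unchanged, which is one of the checklist items above.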
Step 5: Check Alignment with Threat-Derived Instructions
Your .github/copilot-instructions.md (from the Warm-Up) says:
- Row 2: “Always use parameterized queries — never string concatenation”
Does the Autofix suggestion follow this instruction? If yes, the threat model → AI fix alignment is confirmed.
Step 6: Rate the Fix Quality
| Rating | Meaning | Action |
|---|---|---|
| ✅ Correct | Fix is complete, aligns with instructions, introduces no new issues | Apply and commit |
| ⚠️ Partially correct | Fix addresses the vulnerability but needs refinement | Apply, then manually improve |
| ❌ Incorrect | Fix is wrong or introduces new vulnerabilities | Reject and fix manually |
Step 7: Apply the Fix
If the fix is correct, apply it:
- Click “Commit fix” to create a commit with the Copilot-suggested change
- Verify the CodeQL alert is resolved after the next scan
> [!NOTE]
> Speed comparison: This fix took ~30 seconds with Autofix. A manual fix for SQL injection typically takes 15–30 minutes (understanding context, writing parameterized queries, testing). That’s 30–60x faster for a single alert. At scale across hundreds of alerts, the productivity impact is transformative.

> [!IMPORTANT]
> Copilot Autofix is not “accept all.” Every suggestion requires human evaluation. The value is speed of remediation, not removal of human judgment. The explanation + threat model alignment check ensures you understand WHAT is being fixed and WHY.
Part B: Security Campaigns — Batch Remediation (~8 min)
📝 Open `docs/security-campaign-tracker.md` — track your campaign strategy and per-fix evaluations.
Step 1: Choose a Campaign Strategy
Before creating a campaign, decide HOW to batch alerts:
| Strategy | When to Use | Best For |
|---|---|---|
| By severity | Burn down critical/high alerts first | Risk-prioritized remediation |
| By vulnerability type | Fix one class systematically | Consistent fix patterns (e.g., all SQL injection) |
| By repository | Focus one codebase at a time | Team-scoped remediation sprints |
| All remaining | Quick cleanup of workshop alerts | This workshop |
For this workshop, batch by vulnerability type — select all remaining SQL injection and dependency alerts.
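Before creating the campaign, it can help to see open alerts grouped by rule. A minimal sketch, assuming the alert shape (`number`, `rule.id`) of the GitHub REST API's `GET /repos/{owner}/{repo}/code-scanning/alerts` response; fetching is omitted and the sample data is made up:

```javascript
// Group open code scanning alerts by CodeQL rule id so each campaign
// can target one vulnerability class at a time.
function groupAlertsByRule(alerts) {
  const groups = {};
  for (const alert of alerts) {
    (groups[alert.rule.id] ??= []).push(alert.number);
  }
  return groups;
}

// Sample data shaped like the code-scanning alerts API response.
const openAlerts = [
  { number: 1, rule: { id: "js/sql-injection" } },
  { number: 2, rule: { id: "js/sql-injection" } },
  { number: 3, rule: { id: "js/path-injection" } },
];
console.log(groupAlertsByRule(openAlerts));
```

The largest group is usually the best candidate for a "by vulnerability type" campaign, since one fix pattern applies to every alert in it.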
Step 2: Create the Campaign
- Navigate to the Security tab → Security campaigns
- Click “New campaign”
- Name: `Workshop 2 — SQL Injection Remediation`
- Select remaining open GHAS alerts matching your chosen strategy
- Set the campaign scope
Step 3: Assign to Copilot Coding Agent
- Select the campaign → click “Assign to Copilot”
- Copilot coding agent spins up in a sandboxed, ephemeral GitHub Actions environment
- The agent analyzes alerts, implements fixes, runs tests, and opens a pull request
Step 4: Monitor Agent Progress
- Navigate to the Agents tab (or the PR page)
- Observe: which files are being modified, what fixes are being applied
- Note: the agent reads your `.github/copilot-instructions.md` — your threat-derived instructions guide its fix approach
Step 5: Governance Review — Evaluate EACH Fix Individually
Open the PR created by Copilot coding agent. For each fix in the batch PR, evaluate:
| # | Alert | Fix Correct? | Aligns with Instructions? | Introduces Regressions? |
|---|---|---|---|---|
| 1 |  | ✅/⚠️/❌ | Yes / No | Yes / No |
| 2 |  | ✅/⚠️/❌ | Yes / No | Yes / No |
| 3 |  | ✅/⚠️/❌ | Yes / No | Yes / No |
Batch ≠ auto-approve. Each fix needs individual evaluation. A batch PR with 8 correct fixes and 2 incorrect ones should NOT be merged as-is.
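That rule can be expressed as a tiny decision helper (a sketch; the verdict labels and shape are illustrative, not any GitHub API):

```javascript
// A batch PR is mergeable only when EVERY fix passed individual review;
// a single incorrect fix blocks the whole batch.
function batchMergeDecision(reviews) {
  const rejected = reviews
    .filter((r) => r.verdict !== "correct")
    .map((r) => r.alert);
  return rejected.length === 0
    ? { merge: true, rework: [] }
    : { merge: false, rework: rejected };
}

// 8 correct fixes + 2 incorrect ones => do not merge; rework the two.
const reviews = [
  ...Array.from({ length: 8 }, (_, i) => ({ alert: i + 1, verdict: "correct" })),
  { alert: 9, verdict: "incorrect" },
  { alert: 10, verdict: "incorrect" },
];
console.log(batchMergeDecision(reviews));
// { merge: false, rework: [ 9, 10 ] }
```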
Step 6: Discuss Scale and Governance
This campaign addressed a handful of alerts. At enterprise scale:
- 500 alerts across 50 repos — how would you organize campaigns?
- Who creates campaigns? — Requires the Security Manager or Admin role
- Who reviews batch PRs? — Security champions per team, or centralized review?
- What if some fixes are wrong? — Request changes on specific commits, or split the PR?
Key Insight
Copilot Autofix provides targeted, single-alert fixes with explainable reasoning — verify against your threat model. Security campaigns enable managing security debt at scale by batching alerts strategically and leveraging Copilot coding agent. Both require human judgment: Autofix for quality, campaigns for strategy and governance.
> [!TIP]
> Run `scripts/verify-exercise3.sh` to validate your Exercise 3 completion.
Hand-Off to Workshop 3
🔗 Our guardrails caught vulnerable code, leaked secrets, and risky dependencies. The code is now clean and merged. But is the PIPELINE that builds and deploys it trustworthy? How do we know the artifact in production is actually what we reviewed? That’s supply chain integrity — the focus of Workshop 3.