Step 0

Environment Setup


Complete these steps before starting the exercises. This is the Grand Finale, so make sure your foundation from Workshops 1–3 is solid.


Verify Foundation (from WS1–3)

Use the verify-or-skip pattern: check each item; if it’s already in place, move on. If not, either complete it now or skip (the exercises will still work with reduced context).

# Verify org exists and is in the correct region
gh api /orgs/YOUR_ORG --jq '.name'

# Verify GHAS is active (reports Advanced Security and secret scanning status)
gh api /repos/YOUR_ORG/YOUR_REPO --jq '{
  advanced_security: .security_and_analysis.advanced_security.status,
  secret_scanning: .security_and_analysis.secret_scanning.status
}'

# Verify .github/copilot-instructions.md exists (created in WS1)
gh api /repos/YOUR_ORG/YOUR_REPO/contents/.github/copilot-instructions.md --jq '.name' 2>/dev/null \
  && echo "✅ Custom instructions found" \
  || echo "⚠️  Not found: see Appendix A in README to create"

# Verify THREAT-MODEL.md exists (created in WS2, updated in WS3)
gh api /repos/YOUR_ORG/YOUR_REPO/contents/THREAT-MODEL.md --jq '.name' 2>/dev/null \
  && echo "✅ Threat model found" \
  || echo "⚠️  Not found: see Appendix A in README to create"

| Checkpoint | Status |
|---|---|
| Trust boundary: Org exists (Japan region), EMU configured | ✅ / ⚠️ Skip |
| GHAS: Code scanning, secret scanning, dependency review active | ✅ / ⚠️ Skip |
| Pipeline: OIDC deployed, attestations configured | ✅ / ⚠️ Skip |
| Defender: Connected and monitoring container workloads | ✅ / ⚠️ Skip |
| THREAT-MODEL.md: Rows 1–5 filled | ✅ / ⚠️ Skip |
| .github/copilot-instructions.md: Exists with security guidelines | ✅ / ⚠️ Skip |
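
The verify-or-skip pattern above could be wrapped in a small helper so that every check reports a result and none aborts the setup. This is a minimal sketch; `verify_or_skip` is a hypothetical name and is not part of the workshop scripts.

```shell
# Hypothetical helper (not part of the workshop repo): run a check command,
# report the outcome, and keep going either way.
verify_or_skip() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK: $label"
  else
    echo "SKIP: $label"
  fi
}

# Example (requires an authenticated gh CLI):
# verify_or_skip "Threat model" gh api /repos/YOUR_ORG/YOUR_REPO/contents/THREAT-MODEL.md
```

Because the helper prints a result in both branches, a missing item shows up as a skip in the output rather than stopping the run.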

Verify Deployed Application

Confirm that your container application from Workshop 3 is running and accessible:

# Check the deployment status
kubectl get deployments -n YOUR_NAMESPACE

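# Verify the application health endpoint responds
kubectl port-forward svc/YOUR_SERVICE 8080:80 -n YOUR_NAMESPACE &
sleep 2  # give the background port-forward a moment to establish
curl -sf http://localhost:8080/health && echo "✅ Application is healthy"
kill $!  # stop the background port-forward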

# Verify Defender for Cloud is actively monitoring
az security assessment list --query "[?contains(resourceDetails.id, 'YOUR_CLUSTER')]" -o table
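
If the health endpoint is slow to come up, a fixed wait may not be enough. The sketch below polls the endpoint with retries; `wait_for_health` and its default values are assumptions for illustration, not part of the workshop materials.

```shell
# Hypothetical helper: poll a URL until it responds successfully
# or the attempts run out.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses (4xx/5xx)
    if curl -sf "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (with the port-forward from above running):
# wait_for_health http://localhost:8080/health 10 1 && echo "Application is healthy"
```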

Configure SRE Agent

Connect the Azure SRE Agent to your deployed container workload:

  1. Open the Azure Portal → navigate to SRE Agent (preview)
  2. Connect your AKS cluster as a monitored resource
  3. Define alert conditions:

    | Alert Condition | Threshold | Purpose |
    |---|---|---|
    | Container health check failure | Crash loop detected | Triggers Exercise 1 |
    | Security policy violation | Anomalous process execution | Alternate trigger |
    | Resource exhaustion | CPU > 90% or Memory > 85% | Secondary monitoring |
  4. Configure a notification channel: Slack, Teams, or Azure Portal notifications
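  5. Verify SRE Agent has read access to your repository for investigation context

# Confirm the AKS cluster is emitting metrics for SRE Agent to consume
# (this checks cluster telemetry only; confirm the agent connection itself in the portal)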
az monitor metrics list --resource YOUR_AKS_RESOURCE_ID \
  --metric "node_cpu_usage_percentage" --interval PT1M --output table
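
For a manual sanity check of the "Resource exhaustion" row, the threshold logic can be sketched as a small function. This is only an illustration of the CPU > 90% or Memory > 85% condition; the real alert rules live in the SRE Agent configuration, and `resource_exhausted` is a hypothetical name.

```shell
# Hypothetical helper: succeed when either reading breaches the
# "Resource exhaustion" thresholds (CPU > 90 or memory > 85).
# Arguments are whole-number percentages.
resource_exhausted() {
  cpu="$1"
  mem="$2"
  [ "$cpu" -gt 90 ] || [ "$mem" -gt 85 ]
}

# Example:
# resource_exhausted 95 40 && echo "Would trigger the resource exhaustion alert"
```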

Configure Copilot Coding Agent

# Inspect the repository's security settings. This output does not report Copilot
# coding agent status; confirm the agent is enabled in the Copilot settings.
gh api /repos/YOUR_ORG/YOUR_REPO --jq '.security_and_analysis'

# Verify branch protection requires PR reviews (human approval gate)
gh api /repos/YOUR_ORG/YOUR_REPO/branches/main/protection \
  --jq '.required_pull_request_reviews.required_approving_review_count'

Verify these capabilities:

  • Copilot coding agent can create branches, commit, push, and open PRs
  • Copilot coding agent CANNOT merge to main without human approval
  • Existing .github/copilot-instructions.md is loaded as context
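
The human approval gate can be checked mechanically by testing the review count returned by the branch protection query above. A minimal sketch, assuming the value is passed in as a string; `approval_gate_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only when the required approving review
# count is a number and at least 1. Empty or "null" input means no
# review requirement is configured, so the check fails.
approval_gate_ok() {
  count="$1"
  case "$count" in
    ''|null|*[!0-9]*) return 1 ;;
  esac
  [ "$count" -ge 1 ]
}

# Example (requires an authenticated gh CLI):
# count=$(gh api /repos/YOUR_ORG/YOUR_REPO/branches/main/protection \
#   --jq '.required_pull_request_reviews.required_approving_review_count')
# approval_gate_ok "$count" && echo "Human approval gate is in place"
```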

Stage the Incident

⚠️ Facilitator-only step. Participants should NOT see this staging; they will experience the alert “cold” during Exercise 1.

Pre-stage a runtime incident that will trigger during Exercise 1. Choose one option:

Option A: Health Check Failure (recommended)

# Apply the misconfigured manifest (causes a crash loop due to an aggressive readiness probe)
kubectl apply -f incident-scenarios/health-check-failure.yaml -n YOUR_NAMESPACE

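The health-check-failure.yaml manifest has a readiness probe with timeoutSeconds: 1 and initialDelaySeconds: 1, but the application needs ~5 seconds to start, so the probe fails before the application is ready. Note that Kubernetes restarts a container on failed liveness (or startup) probes, while a failed readiness probe alone only removes the pod from Service endpoints. If the pod stays unready instead of entering a crash loop, check which probe type the manifest applies these settings to.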
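
Before Exercise 1 starts, the facilitator may want to confirm that the incident is actually staged. The sketch below scans pod listing output for the `CrashLoopBackOff` status; `has_crashloop` is a hypothetical helper that reads from standard input.

```shell
# Hypothetical helper: succeed when any line on stdin reports a pod
# in the CrashLoopBackOff state.
has_crashloop() {
  grep -q 'CrashLoopBackOff'
}

# Example (the status can take a few restarts to appear):
# kubectl get pods -n YOUR_NAMESPACE --no-headers | has_crashloop \
#   && echo "Incident staged"
```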

Option B: Runtime Vulnerability

# Deploy with a known runtime-exploitable condition
kubectl apply -f incident-scenarios/runtime-vulnerability.yaml -n YOUR_NAMESPACE

The runtime-vulnerability.yaml manifest deploys a container running as root with a writable filesystem and no resource limits. All three conditions are exploitable at runtime.
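
The three risky conditions can be checked against values read from the pod spec. This is a rough sketch: `flag_runtime_risks` and the `YOUR_DEPLOYMENT` placeholder are assumptions, and it inspects only the values you pass in (for example, container-level `securityContext` fields), so settings defined at pod level would need to be read separately.

```shell
# Hypothetical helper: print one finding per risky condition.
# Arguments: runAsNonRoot value, readOnlyRootFilesystem value,
# and the resource limits (empty string when none are set).
flag_runtime_risks() {
  run_as_non_root="$1"
  read_only_fs="$2"
  limits="$3"
  [ "$run_as_non_root" = "true" ] || echo "may run as root"
  [ "$read_only_fs" = "true" ] || echo "writable root filesystem"
  [ -n "$limits" ] || echo "no resource limits"
  return 0
}

# Example: read one field with kubectl and pass it in
# kubectl get deployment YOUR_DEPLOYMENT -n YOUR_NAMESPACE \
#   -o jsonpath='{.spec.template.spec.containers[0].securityContext.runAsNonRoot}'
```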

Key distinction from WS2: This incident originates from RUNTIME (container crash, Defender alert), NOT from a code push. Workshop 2 covers code-push-triggered GHAS alerts. Workshop 4 covers what happens when production breaks despite all prior guardrails.


Prepare MTTR Tracker

Open MTTR-TRACKER.md β€” each participant will record timestamps throughout the exercises.
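
To turn recorded timestamps into a time-to-resolve figure, the arithmetic can be sketched as below. The layout of MTTR-TRACKER.md is not shown here, so this assumes timestamps are captured as Unix epoch seconds; `elapsed_minutes` is a hypothetical helper.

```shell
# Hypothetical helper: whole minutes between two Unix timestamps (seconds).
elapsed_minutes() {
  start="$1"
  end="$2"
  echo $(( (end - start) / 60 ))
}

# Example: capture timestamps at alert and at resolution
# alert_at=$(date -u +%s)
# ... work the incident ...
# resolved_at=$(date -u +%s)
# elapsed_minutes "$alert_at" "$resolved_at"
```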

💡 Run scripts/verify-setup.sh after completing all setup steps to confirm your environment is ready for the Grand Finale.
