Step 0

Environment Setup

Complete these steps before starting the exercises. This is the Grand Finale — ensure your foundation from Workshops 1–3 is solid.

Verify Foundation (from WS1–3)

Use the verify-or-skip pattern: check each item; if it’s already in place, move on. If not, either complete it now or skip (the exercises will still work with reduced context).

# Verify the org exists (prints its display name; this call does not confirm the region)
gh api /orgs/YOUR_ORG --jq '.name'

# Verify GHAS is active
gh api /repos/YOUR_ORG/YOUR_REPO --jq '{
  advanced_security: .security_and_analysis.advanced_security.status,
  secret_scanning: .security_and_analysis.secret_scanning.status
}'

# Verify .github/copilot-instructions.md exists (created in WS1)
gh api /repos/YOUR_ORG/YOUR_REPO/contents/.github/copilot-instructions.md --jq '.name' 2>/dev/null \
  && echo "✅ Custom instructions found" \
  || echo "⚠️  Not found — see Appendix A in README to create"

# Verify THREAT-MODEL.md exists (created in WS2, updated in WS3)
gh api /repos/YOUR_ORG/YOUR_REPO/contents/THREAT-MODEL.md --jq '.name' 2>/dev/null \
  && echo "✅ Threat model found" \
  || echo "⚠️  Not found — see Appendix A in README to create"

Checkpoint	Status
Trust boundary: Org exists (Japan region), EMU configured	✅ / ⚠️ Skip
GHAS: Code scanning, secret scanning, dependency review active	✅ / ⚠️ Skip
Pipeline: OIDC deployed, attestations configured	✅ / ⚠️ Skip
Defender: Connected and monitoring container workloads	✅ / ⚠️ Skip
THREAT-MODEL.md: Rows 1–5 filled	✅ / ⚠️ Skip
.github/copilot-instructions.md: Exists with security guidelines	✅ / ⚠️ Skip
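
The verify-or-skip pattern above can be scripted so that a failed check is recorded as a skip instead of aborting the run. The following is a minimal sketch, not part of the workshop repository; the function names `check` and `summary` are hypothetical.

```shell
# Sketch of a verify-or-skip helper (hypothetical names, not a workshop script).
passed=0
skipped=0

# check LABEL COMMAND...
# Runs COMMAND quietly; prints a pass or skip line and never fails the script.
check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "✅ ${label}"
    passed=$((passed + 1))
  else
    echo "⚠️ Skip: ${label}"
    skipped=$((skipped + 1))
  fi
  return 0
}

# Prints the running totals.
summary() {
  echo "${passed} verified, ${skipped} skipped"
}

# Example usage (requires an authenticated gh CLI):
#   check "Org exists" gh api /orgs/YOUR_ORG
#   check "Threat model found" gh api /repos/YOUR_ORG/YOUR_REPO/contents/THREAT-MODEL.md
#   summary
```

Because `check` always returns 0, it can be used in scripts that run with `set -e` without stopping at the first missing item.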

Verify Deployed Application

Confirm that your container application from Workshop 3 is running and accessible:

# Check the deployment status
kubectl get deployments -n YOUR_NAMESPACE

# Verify the application health endpoint responds
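kubectl port-forward svc/YOUR_SERVICE 8080:80 -n YOUR_NAMESPACE &
PF_PID=$!
sleep 2  # give the port-forward a moment to establish before probing
# -f makes curl return a non-zero exit code on HTTP errors (4xx/5xx)
curl -sf http://localhost:8080/health && echo "✅ Application is healthy"
kill "$PF_PID"  # stop the background port-forward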

# Verify Defender for Cloud is actively monitoring
az security assessment list --query "[?contains(resourceDetails.id, 'YOUR_CLUSTER')]" -o table
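
The health check can fail spuriously if the port-forward or the pod is not ready at the moment of the request. A small retry helper makes the check more tolerant. This is a sketch under that assumption; `wait_for` is a hypothetical name, and the attempt count and delay in the usage lines are arbitrary.

```shell
# wait_for ATTEMPTS DELAY_SECONDS COMMAND...
# Retries COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage (assumes the port-forward from the step above is running):
#   wait_for 10 1 curl -sf http://localhost:8080/health \
#     && echo "✅ Application is healthy"
```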

Configure SRE Agent

Connect the Azure SRE Agent to your deployed container workload:

Open the Azure Portal → Navigate to SRE Agent (preview)
Connect your AKS cluster as a monitored resource

Define alert conditions:

Alert Condition	Threshold	Purpose
Container health check failure	Crash loop detected	Triggers Exercise 1
Security policy violation	Anomalous process execution	Alternate trigger
Resource exhaustion	CPU > 90% or Memory > 85%	Secondary monitoring

Configure notification channel — Slack, Teams, or Azure Portal notifications
Verify SRE Agent has read access to your repository for investigation context

# Verify the AKS cluster is emitting telemetry (confirm the SRE Agent connection itself in the portal)
az monitor metrics list --resource YOUR_AKS_RESOURCE_ID \
  --metric "node_cpu_usage_percentage" --interval PT1M --output table
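
The resource-exhaustion row in the alert table combines two thresholds with OR: the alert fires when CPU is above 90% or memory is above 85%. The sketch below only illustrates that rule on two percentage values; it is not SRE Agent configuration syntax, and `resource_exhausted` is a hypothetical name.

```shell
# resource_exhausted CPU_PERCENT MEM_PERCENT
# Returns 0 (true) when CPU > 90 or memory > 85, matching the table above.
# awk is used so that fractional percentages such as 85.5 compare correctly.
resource_exhausted() {
  awk -v cpu="$1" -v mem="$2" 'BEGIN { exit !(cpu > 90 || mem > 85) }'
}

# Example usage:
#   resource_exhausted 93.2 40 && echo "⚠️ Resource exhaustion threshold crossed"
```

Both comparisons are strict, so a reading of exactly 90% CPU with 85% memory does not trigger the condition.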

Configure Copilot Coding Agent

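# Inspect the repository's security settings. This output does not report Copilot coding agent status;
# confirm that the agent is enabled in the repository's or organization's Copilot settings.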
gh api /repos/YOUR_ORG/YOUR_REPO --jq '.security_and_analysis'

# Verify branch protection requires PR reviews (human approval gate)
gh api /repos/YOUR_ORG/YOUR_REPO/branches/main/protection \
  --jq '.required_pull_request_reviews.required_approving_review_count'

Verify these capabilities:

Copilot coding agent can create branches, commit, push, and open PRs
Copilot coding agent CANNOT merge to main without human approval
Existing .github/copilot-instructions.md is loaded as context
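
The human approval gate holds only if the branch protection query above returns a review count of at least 1. A minimal sketch of that interpretation follows; `approval_gate_ok` is a hypothetical helper, and treating an empty or `null` value as "no gate" is an assumption about how a missing protection rule shows up in the output.

```shell
# approval_gate_ok COUNT
# Returns 0 when COUNT is a whole number >= 1, i.e. at least one
# approving review is required before merging. Empty, "null", or
# non-numeric values are treated as "no gate configured".
approval_gate_ok() {
  local count="${1:-}"
  case "$count" in
    '' | null | *[!0-9]*) return 1 ;;
  esac
  [ "$count" -ge 1 ]
}

# Example usage (requires an authenticated gh CLI):
#   count="$(gh api /repos/YOUR_ORG/YOUR_REPO/branches/main/protection \
#     --jq '.required_pull_request_reviews.required_approving_review_count')"
#   approval_gate_ok "$count" \
#     && echo "✅ Human approval required" \
#     || echo "⚠️  No required approvals on main"
```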

Stage the Incident

⚠️ Facilitator-only step. Participants should NOT see this staging — they will experience the alert “cold” during Exercise 1.

Pre-stage a runtime incident that will trigger during Exercise 1. Choose one option:

Option A: Health Check Failure (recommended)

# Apply the misconfigured manifest — causes crash loop due to aggressive readiness probe
kubectl apply -f incident-scenarios/health-check-failure.yaml -n YOUR_NAMESPACE

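The health-check-failure.yaml manifest has a readiness probe with timeoutSeconds: 1 and initialDelaySeconds: 1, but the application needs ~5 seconds to start. This causes the container to be killed before it becomes ready, triggering a crash loop. Note that Kubernetes restarts a container only when a liveness or startup probe fails; a failing readiness probe on its own marks the pod NotReady and removes it from Service endpoints. If you see no restarts after applying the manifest, check which probe types it defines.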
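
To see why these probe settings are aggressive, it helps to count how many probes fire before the application is up. The sketch below uses an idealized model: probes run at initialDelaySeconds and then every periodSeconds, and each one fails until the application has started. The source does not state periodSeconds or failureThreshold, so the values in the usage lines are assumptions (the Kubernetes defaults are periodSeconds: 10 and failureThreshold: 3). `failed_probes` is a hypothetical name, and real kubelet timing is less exact than this arithmetic.

```shell
# failed_probes INITIAL_DELAY PERIOD STARTUP_SECONDS
# Prints how many probes fire strictly before the app has started,
# assuming probes run at INITIAL_DELAY, INITIAL_DELAY + PERIOD, ...
failed_probes() {
  local initial="$1" period="$2" startup="$3"
  if [ "$initial" -ge "$startup" ]; then
    echo 0
    return 0
  fi
  echo $(( (startup - initial - 1) / period + 1 ))
}

# Example usage with the values from the manifest description
# (initialDelaySeconds: 1, app startup ~5 s) and an ASSUMED period:
#   failed_probes 1 1 5    # periodSeconds: 1  (assumed)
#   failed_probes 1 10 5   # periodSeconds: 10 (Kubernetes default)
```

Under this model, the probe reaches its failure threshold during startup only if the printed count is at least failureThreshold, which is why the period and threshold in the manifest matter as much as the initial delay.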

Option B: Runtime Vulnerability

# Deploy with a known runtime-exploitable condition
kubectl apply -f incident-scenarios/runtime-vulnerability.yaml -n YOUR_NAMESPACE

The runtime-vulnerability.yaml manifest deploys a container running as root with a writable filesystem and no resource limits — conditions exploitable at runtime.

Key distinction from WS2: This incident originates from RUNTIME (container crash, Defender alert), NOT from a code push. Workshop 2 covers code-push-triggered GHAS alerts. Workshop 4 covers what happens when production breaks despite all prior guardrails.

Prepare MTTR Tracker

Open MTTR-TRACKER.md — each participant will record timestamps throughout the exercises.
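
Once the timestamps are recorded, the time to recovery is the difference between the alert time and the resolution time. The sketch below assumes both are written as HH:MM:SS on the same day; the actual layout of MTTR-TRACKER.md is not shown here, and `to_seconds` and `elapsed_minutes` are hypothetical helpers.

```shell
# to_seconds HH:MM:SS
# Converts a same-day clock time to seconds since midnight.
to_seconds() {
  local h m s rest
  h="${1%%:*}"
  rest="${1#*:}"
  m="${rest%%:*}"
  s="${rest#*:}"
  # Strip one leading zero so values like 08 are not parsed as octal.
  h="${h#0}"
  m="${m#0}"
  s="${s#0}"
  echo $(( h * 3600 + m * 60 + s ))
}

# elapsed_minutes START END
# Prints whole minutes between two same-day HH:MM:SS timestamps (rounded down).
elapsed_minutes() {
  local start end
  start="$(to_seconds "$1")"
  end="$(to_seconds "$2")"
  echo $(( (end - start) / 60 ))
}

# Example usage:
#   elapsed_minutes 10:05:00 10:47:30
```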

💡 Run scripts/verify-setup.sh after completing all setup steps to confirm your environment is ready for the Grand Finale.
