LLM systems fail in ways the SRE runbook of the last decade does not anticipate. This article walks the engineering deliverables for an LLM-aware incident response architecture in 2026: severity classification adapted to LLM failure surfaces; detection signal stack (eval drift, guardrail trips, cost spikes, latency p99, hallucination rate, user reports); six containment primitives operable from a single console (model pin, prompt rollback, retrieval quarantine, canary halt, traffic shape, kill-switch); RCA template with LLM failure classes (hallucination, prompt injection, model regression, retrieval poisoning, vendor outage, jailbreak, context-window leak, agentic loop) and LLM-specific action item types; blameless culture extended to model contributions; on-call rota with primary, secondary, incident commander, and subject-matter dimensions. 8 anti-patterns, 5-stage maturity ladder, composition with AI observability, prompt versioning, human escalation, and AI-native CI/CD.