Game AI in 2026 collides hardest with three constraints: frame-rate budgets, session-cost economics, and a modding community that will break any system not designed under adversarial assumptions. This article walks through the engineering deliverables for an LLM-driven game AI architecture in 2026: tier-routed inference (an on-device 1-3B small model, an edge 7-13B mid-size model, and a cloud frontier model) with budget-aware routing; state-machine-augmented dialogue with LLM-generated surface variation; procedural quest skeletons with LLM in-fill inside writer-defined templates; multi-layer content-safety and prompt-injection defence; a per-session cost budget treated as an engineering discipline; and a semantic cache as a first-class architectural element. The discussion is anchored in Poland's Warsaw/Krakow game-dev cluster (CD Projekt Red, Techland, 11 bit, People Can Fly, Bloober Team) and is globally portable. It also covers 8 anti-patterns and a 5-stage maturity ladder.
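To make the idea of tier-routed inference with budget-aware routing concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the article's implementation: the `Tier`, `SessionBudget`, and `route` names are hypothetical, and the cost, latency, and capability figures are placeholder values rather than measurements. The sketch picks the cheapest tier that satisfies the request's capability need, its latency allowance, and the remaining session budget, and falls back to the on-device model when nothing qualifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_per_call: float  # hypothetical cost units per request
    latency_ms: float     # hypothetical typical latency
    capability: int       # higher means it can handle harder requests


# Placeholder numbers for illustration only.
TIERS = [
    Tier("on_device_small", 0.0, 30.0, 1),
    Tier("edge_mid", 0.002, 120.0, 2),
    Tier("cloud_frontier", 0.02, 600.0, 3),
]


@dataclass
class SessionBudget:
    remaining: float  # cost units left for this play session

    def can_afford(self, cost: float) -> bool:
        return cost <= self.remaining

    def charge(self, cost: float) -> None:
        self.remaining -= cost


def route(required_capability: int, latency_budget_ms: float,
          budget: SessionBudget) -> Tier:
    """Return the cheapest tier meeting capability, latency, and budget.

    If no tier qualifies, degrade to the free on-device tier rather
    than overspending the session budget.
    """
    candidates = [
        t for t in TIERS
        if t.capability >= required_capability
        and t.latency_ms <= latency_budget_ms
        and budget.can_afford(t.cost_per_call)
    ]
    if candidates:
        chosen = min(candidates, key=lambda t: t.cost_per_call)
    else:
        chosen = TIERS[0]  # fallback: on-device small model
    budget.charge(chosen.cost_per_call)
    return chosen


# Usage: a session with 0.03 cost units to spend.
budget = SessionBudget(remaining=0.03)
first = route(3, 1000.0, budget)   # affordable, so the cloud tier is chosen
second = route(3, 1000.0, budget)  # cloud no longer affordable, falls back
third = route(2, 200.0, budget)    # mid-size edge tier still fits
```

In this run the first hard request goes to the cloud tier, the second degrades to the on-device model because the remaining budget no longer covers a cloud call, and the third, easier request lands on the edge tier. Degrading instead of failing is one possible design choice; a real router might instead queue the request or serve a cached response.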