Choosing between synchronous, asynchronous, and event-driven architecture patterns is one of the most consequential decisions in production AI system design. This guide breaks down each pattern's production mechanics, GPU scheduling implications, cost tradeoffs, failure modes, and scaling strategies, with real architecture diagrams and hard-won engineering insights from systems built at scale.