Design an API Gateway for AI Models
System Design · hard · Common
api-gateway · load-balancing · gpu-management
Reported: 6 times
Last seen: 2026-03-20
First seen: 2025-09-05
Active in: 2025, 2026
Description
Design an API gateway that routes requests to different model versions, handles load balancing across GPUs, and manages quotas.
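One common way to manage quotas is a per-tenant token bucket. The sketch below is only an illustration of that idea, not a prescribed answer: the class name `TokenBucket`, its fields, and the choice to charge a numeric `cost` per request (for example, estimated tokens) are all assumptions. The clock is passed in explicitly so the behavior is easy to reason about.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical per-tenant quota: up to `capacity` tokens, refilled at `refill_per_sec`."""
    capacity: float
    refill_per_sec: float
    tokens: float = 0.0
    last_refill: float = 0.0

    def try_consume(self, cost: float, now: float) -> bool:
        """Refill for the elapsed time, then deduct `cost` if enough tokens remain."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False  # caller would typically answer with HTTP 429


bucket = TokenBucket(capacity=100.0, refill_per_sec=10.0, tokens=100.0)
first = bucket.try_consume(80.0, now=0.0)   # allowed: 100 tokens available
second = bucket.try_consume(50.0, now=1.0)  # denied: only 30 tokens available
third = bucket.try_consume(50.0, now=3.0)   # allowed: refilled to 50 tokens
```

In a real gateway the bucket state would usually live in a shared store so that every gateway replica sees the same quota; that part is left out here.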
Approach Tips
Discuss model-aware routing (different models need different GPU types). Cover request queuing, priority tiers, and graceful degradation under load.
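The tips above can be sketched in a few dozen lines. This is a toy model under stated assumptions, not a reference design: the model names, the GPU types in `MODEL_GPU_TYPE`, the `FALLBACK_MODEL` table, the two tiers, and the `Gateway` class are all hypothetical. It shows a bounded priority queue per model, degradation to a smaller model when the preferred queue is full, load shedding when no fallback exists, and GPU workers pulling only work that matches their hardware.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical mapping from model version to the GPU type it needs.
MODEL_GPU_TYPE = {"large-v2": "H100", "small-v1": "A10"}
# Hypothetical fallback used when the preferred model's queue is full.
FALLBACK_MODEL = {"large-v2": "small-v1"}
PRIORITY = {"paid": 0, "free": 1}  # lower number is served first


@dataclass
class Gateway:
    max_queue_per_model: int = 2
    queues: dict = field(default_factory=dict)  # model -> heap of (priority, seq, request_id)
    _seq: itertools.count = field(default_factory=itertools.count)

    def submit(self, request_id: str, model: str, tier: str) -> str:
        """Enqueue on the requested model, fall back to a smaller one, or shed load."""
        for candidate in (model, FALLBACK_MODEL.get(model)):
            if candidate is None:
                continue
            heap = self.queues.setdefault(candidate, [])
            if len(heap) < self.max_queue_per_model:
                # The sequence number keeps FIFO order within a priority tier.
                heapq.heappush(heap, (PRIORITY[tier], next(self._seq), request_id))
                return f"queued:{candidate}"
        return "rejected:429"

    def next_for_gpu(self, gpu_type: str):
        """Pop the highest-priority request among models that run on this GPU type."""
        heads = [
            (heap[0], model)
            for model, heap in self.queues.items()
            if heap and MODEL_GPU_TYPE[model] == gpu_type
        ]
        if not heads:
            return None
        _, model = min(heads)
        _, _, request_id = heapq.heappop(self.queues[model])
        return model, request_id


gw = Gateway()
results = [
    gw.submit("r1", "large-v2", "free"),
    gw.submit("r2", "large-v2", "paid"),
    gw.submit("r3", "large-v2", "paid"),  # large-v2 queue is full, degrades to small-v1
]
served = gw.next_for_gpu("H100")  # paid request r2 is served ahead of free request r1
```

Having idle GPU workers pull from the queue is one simple way to balance load, since a busy worker never takes more than it can run. A push-based least-loaded scheduler is an equally valid choice to discuss.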
Sources
Blind·SDE-3·2026-03-20
Glassdoor·Senior·2025-11-18
OpenAI
AI
Typically appears in: Onsite - System Design
60 min — Design an AI infrastructure system at scale. Focus on GPU utilization, model serving, or data pipelines.