Design an API Gateway for AI Models

System Design · hard · Common
api-gateway · load-balancing · gpu-management

Reported: 6 times
Last seen: 2026-03-20
First seen: 2025-09-05
Active in: 2025, 2026

Description

Design an API gateway that routes requests to different model versions, handles load balancing across GPUs, and manages quotas.
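Two of the pieces named above, quotas and load balancing, can be sketched in a few lines. This is a minimal illustration, not a reference design: it assumes quotas are enforced per API key with a token bucket and that the balancer picks the replica with the fewest in-flight requests. `TokenBucket`, `Replica`, and `pick_replica` are hypothetical names introduced here for the example.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Hypothetical per-API-key quota: holds up to `capacity` tokens,
    refilled continuously at `refill_per_sec`."""
    capacity: float
    refill_per_sec: float
    tokens: float = field(init=False)
    updated_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity
        self.updated_at = time.monotonic()

    def try_consume(self, cost: float, now: float | None = None) -> bool:
        """Charge `cost` tokens if available; `now` is injectable for tests."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False  # over quota: the gateway would answer HTTP 429


@dataclass
class Replica:
    """Hypothetical GPU-backed replica serving one model version."""
    name: str
    model_version: str
    in_flight: int = 0


def pick_replica(replicas: list[Replica], model_version: str) -> Replica | None:
    """Least-outstanding-requests choice among replicas of the requested version."""
    candidates = [r for r in replicas if r.model_version == model_version]
    return min(candidates, key=lambda r: r.in_flight, default=None)
```

In an interview, `cost` is a natural place to discuss charging by estimated token count rather than by request, since LLM requests vary widely in GPU time.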

Approach Tips

Discuss model-aware routing (different models need different GPU types). Cover request queuing, priority tiers, and graceful degradation under load.
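The tips above can be made concrete with a small sketch. It assumes a static model-to-GPU-pool table, three customer tiers, and a bounded per-pool priority queue that sheds the lowest-priority work when full. The model names, pool names, tier names, and the `PoolQueue` class are all hypothetical, chosen only for illustration.

```python
from __future__ import annotations

import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical routing table: each model is pinned to a pool of suitable GPUs.
MODEL_TO_POOL = {
    "large-model-v2": "pool-80gb",
    "small-model-v1": "pool-24gb",
}

# Hypothetical tiers; a lower number means higher priority.
TIER_PRIORITY = {"enterprise": 0, "paid": 1, "free": 2}


def route(model: str) -> str:
    """Model-aware routing: return the GPU pool able to serve `model`."""
    return MODEL_TO_POOL[model]


@dataclass(order=True)
class QueuedRequest:
    priority: int
    seq: int  # arrival order; keeps FIFO ordering within a tier
    request_id: str = field(compare=False)
    model: str = field(compare=False)


class PoolQueue:
    """Bounded priority queue for one GPU pool."""

    def __init__(self, max_depth: int) -> None:
        self.max_depth = max_depth
        self._heap: list[QueuedRequest] = []
        self._seq = itertools.count()

    def submit(self, request_id: str, model: str, tier: str) -> QueuedRequest | None:
        """Enqueue a request. Returns the request that was shed, if any,
        so the caller can reject it (for example with HTTP 429 or 503)."""
        item = QueuedRequest(TIER_PRIORITY[tier], next(self._seq), request_id, model)
        if len(self._heap) < self.max_depth:
            heapq.heappush(self._heap, item)
            return None
        # Queue is full: degrade gracefully by dropping the lowest-priority,
        # most recently queued request if the new one outranks it.
        worst = max(self._heap)
        if item.priority < worst.priority:
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            heapq.heappush(self._heap, item)
            return worst
        return item  # the new request itself is shed

    def next_request(self) -> QueuedRequest | None:
        """Pop the highest-priority, oldest request, or None if empty."""
        return heapq.heappop(self._heap) if self._heap else None
```

The linear scan in `max` and `remove` is acceptable for a sketch; a production design would more likely keep one queue per tier. Other degradation options worth raising include falling back to a smaller model or lowering the maximum output length under load.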

Sources

Blind·SDE-3·2026-03-20
Glassdoor·Senior·2025-11-18
OpenAI

AI

Typically appears in: Onsite - System Design

60 min — Design an AI infrastructure system at scale. Focus on GPU utilization, model serving, or data pipelines.