fast tier · 2025
OpenAI
GPT-5 nano
Compact GPT-5 tier focused on low-latency responses and strong cost efficiency for high-volume workloads.
Context window
400k tokens total (up to 272k input tokens and 128k output tokens)
Maximum context length for this model.
Availability
OpenAI API (including the Responses API and Batch API)
Where you can run it.
Modalities
Text · Code
Input/output coverage.
Pricing
$0.05 / 1M input tokens, $0.40 / 1M output tokens
Latency: Very low; optimized for high-throughput production traffic
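To make the per-token prices above concrete, here is a minimal Python sketch that estimates spend for a workload. It assumes only the two list prices quoted on this card; the function name, the constant names, and the sample token counts are hypothetical, and it ignores any discounts (cached input, batch pricing) that may apply in practice.

```python
from decimal import Decimal

# List prices taken from this card (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("0.05")
OUTPUT_USD_PER_MILLION = Decimal("0.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests, each with
    # 1,500 input tokens and 200 output tokens.
    requests = 10_000
    total = estimate_cost_usd(requests * 1_500, requests * 200)
    print(f"Estimated cost: ${total}")
```

Decimal is used instead of float so that small per-token amounts do not pick up rounding noise when summed across many requests.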
Strengths
- Very low cost per token for high-throughput APIs and assistants.
- Fast response times for routing, classification, and lightweight generation.
- Reliable structured outputs for tool calls and automations.
Best for
- Budget-sensitive chat and workflow orchestration.
- High-frequency summarization, tagging, and extraction pipelines.
- Large-scale batch processing where predictable spend matters.
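For the tagging and batch use cases listed above, a batch job is usually submitted as a JSONL file with one request per line. The sketch below builds such lines with the standard library only. It assumes the model identifier is `gpt-5-nano` and that requests target the `/v1/responses` endpoint; the helper name, the `doc-N` custom IDs, and the sample documents are hypothetical, and uploading the file and creating the batch are left out.

```python
import json
from typing import Iterable, List

MODEL = "gpt-5-nano"  # assumed API identifier for this model
ENDPOINT = "/v1/responses"  # assumed target endpoint for each batch request


def build_batch_lines(documents: Iterable[str], instructions: str) -> List[str]:
    """Return one JSON-encoded batch request per input document."""
    lines = []
    for index, text in enumerate(documents):
        request = {
            "custom_id": f"doc-{index}",  # hypothetical ID scheme for matching results
            "method": "POST",
            "url": ENDPOINT,
            "body": {
                "model": MODEL,
                "instructions": instructions,
                "input": text,
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines


if __name__ == "__main__":
    # Hypothetical documents to tag.
    docs = ["Invoice overdue by 30 days.", "Password reset link not working."]
    for line in build_batch_lines(docs, "Return one topic tag for the text."):
        print(line)
```

Each `custom_id` lets results be matched back to the source document, since batch output is not guaranteed to preserve input order.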
Summary
- Tier: fast
- Release: 2025
- Latency: Very low; optimized for high-throughput production traffic