model brief

OpenAI

fast tier · 2025

GPT-5 nano

Compact GPT-5 tier focused on low-latency responses and strong cost efficiency for high-volume workloads.

Context window

128k tokens

Maximum context length for this model.

Availability

OpenAI API, Responses API, Batch API

Where you can run it.

Modalities

Text · Code

Input/output coverage.

Pricing

$0.05 / 1M input tokens, $0.40 / 1M output tokens
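
To make the per-token rates concrete, here is a minimal sketch of a spend estimate. It assumes only the two list prices quoted above and nothing else: no discounts, caching, or tier-specific adjustments. The function name `estimate_cost_usd` and the constant names are hypothetical, chosen for illustration.

```python
from decimal import Decimal

# Rates copied from the pricing line above (USD per 1M tokens).
# Assumption: no discounts or other adjustments are applied.
INPUT_USD_PER_MILLION = Decimal("0.05")
OUTPUT_USD_PER_MILLION = Decimal("0.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate spend in USD for the given token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / TOKENS_PER_MILLION


# Example workload: 1,000,000 requests, each with 500 input and 100 output tokens.
requests = 1_000_000
total = estimate_cost_usd(500 * requests, 100 * requests)
print(f"Estimated spend: ${total:.2f}")  # prints "Estimated spend: $65.00"
```

In this example the output tokens account for $40 of the $65 even though they are one fifth of the input volume, so capping response length is the larger lever on spend at these rates.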

Latency: Very low; optimized for high-throughput production traffic

Strengths

  • Very low cost per token for high-throughput APIs and assistants.
  • Fast response times for routing, classification, and lightweight generation.
  • Reliable structured outputs for tool calls and automations.

Best for

  • Budget-sensitive chat and workflow orchestration.
  • High-frequency summarization, tagging, and extraction pipelines.
  • Large-scale batch processing where predictable spend matters.

Summary

  • Tier: fast
  • Release: 2025
  • Latency: Very low; optimized for high-throughput production traffic