One API, All Models

Access OpenAI, Anthropic, Google, and more through a single OpenAI-compatible endpoint. Zero markup on inference costs.

Chat Completions
~/modelmax $ curl https://api.modelmax.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-..." \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
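For readers who prefer a script to a shell command, the same chat request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl example; the helper names `build_chat_request` and `send` are illustrative and not part of any ModelMax SDK, and the response is assumed to be a JSON body.

```python
import json
import urllib.request

API_BASE = "https://api.modelmax.io/v1"  # base URL taken from the curl example


def build_chat_request(api_key, model, user_text):
    """Build (but do not send) the same POST that the curl command issues."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON body (assumed response format)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (performs a real network call, so it is left commented out):
# req = build_chat_request("sk-...", "gpt-4o", "Hello!")
# print(send(req))
```

Keeping request construction separate from sending makes it easy to inspect the exact body and headers before any tokens are spent.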
Video Generation
~/modelmax $ curl -X POST https://api.modelmax.io/v1/queue/veo-2.0-high \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-..." \
  -d '{
    "prompt": "A cinematic shot of a futuristic city...",
    "webhook_url": "https://your-domain.com/webhook"
  }'
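The video request is asynchronous: the job is queued and the result is delivered to `webhook_url`. The sketch below builds the same queue submission in Python and pairs it with a bare-bones receiver. The webhook payload schema is not described on this page, so the hypothetical `WebhookHandler` only logs whatever JSON arrives, and replying with status 200 as an acknowledgement is an assumption.

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

API_BASE = "https://api.modelmax.io/v1"  # base URL taken from the curl example


def build_video_request(api_key, model, prompt, webhook_url):
    """Build (but do not send) the queue submission shown in the curl command."""
    payload = {"prompt": prompt, "webhook_url": webhook_url}
    return urllib.request.Request(
        f"{API_BASE}/queue/{model}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


class WebhookHandler(BaseHTTPRequestHandler):
    """Hypothetical receiver: the payload schema is unknown, so it only logs the event."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            event = json.loads(raw)
        except ValueError:
            # Body was empty or not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        print("webhook event:", event)
        self.send_response(200)  # assumption: a 2xx reply acknowledges delivery
        self.end_headers()


# Usage (starts a blocking local server, so it is left commented out):
# HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()
```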

Frontier models through one gateway

Use ModelMax to call text, vision, audio, and video models through one API.

47+ Models
12 Providers
4 Capabilities
1 API Key

Gemini + Embedding

Google's most capable models, excelling at complex reasoning, deep multimodal understanding, and high-quality semantic embeddings.

Capabilities
Gemini 3.1 Pro Preview (gemini-3.1-pro-preview): Chat, 2M tokens
Gemini 3 Pro Preview (gemini-3-pro-preview): Chat, 2M tokens
Gemini 3 Flash Preview (gemini-3-flash-preview): Chat, 1M tokens
Text Embedding 005 (text-embedding-005): Embedding

Google Veo

State-of-the-art cinematic video generation with synchronized speech, sound effects, and prompt fidelity.

Capabilities
Veo 3.1 (veo-3.1): Video Gen
Veo 3.1 Fast (veo-3.1-fast): Video Gen
Veo 3 (veo-3): Video Gen
Veo 3 Fast (veo-3-fast): Video Gen

OpenAI GPT

OpenAI's latest GPT models for multimodal chat, coding, structured reasoning, and high-volume automation.

Capabilities
GPT-5.5 (gpt-5.5): Chat
GPT-5.4 (gpt-5.4): Chat
GPT-5.4 Mini (gpt-5.4-mini): Chat
GPT-5.4 Nano (gpt-5.4-nano): Chat

Claude

Anthropic's Claude models with native Messages API support behind the unified OpenAI-compatible chat endpoint.

Capabilities
Claude Opus 4.8 (claude-opus-4-8): Chat
Claude Sonnet 4.6 (claude-sonnet-4-6): Chat
Claude Haiku 4.5 (claude-haiku-4-5): Chat

xAI Grok

xAI's Grok models hosted through Google Cloud for advanced instruction following, synthesis, and fast high-volume text workflows.

Capabilities
Grok 4.3 (grok-4.3): Chat
Grok 4.1 Fast (grok-4.1-fast-non-reasoning): Chat

Kimi

Moonshot's Kimi models combine Chinese-English reasoning, coding, and long-context capabilities for production workflows.

Capabilities
Kimi K2.6 (kimi-k2.6): Chat, 128K tokens
Kimi K2 Thinking (kimi-k2-thinking): Chat, 128K tokens
Kimi K2.5 (kimi-k2.5): Chat, 128K tokens

MiniMax

Frontier large language models with strong reasoning, creativity, and highly consistent instruction following.

Capabilities
MiniMax M2 (minimax-m2): Chat, 1M tokens
MiniMax M2.1 (minimax-m2.1): Chat, 1M tokens
MiniMax M2.7 (minimax-m2.7): Chat, 1M tokens
MiniMax M2.5 (minimax-m2.5): Chat, 1M tokens

DeepSeek

Top-tier open-source reasoning models with remarkable performance on STEM, coding, and mathematical benchmarks.

Capabilities
DeepSeek R1 (deepseek-r1): Chat, 128K tokens
DeepSeek V3.1 (deepseek-v3.1): Chat, 128K tokens
DeepSeek V3.2 (deepseek-v3.2): Chat, 128K tokens

Qwen

Alibaba's robust MoE models, excelling at code generation, logic, and comprehensive multilingual capabilities.

Capabilities
Qwen3 Max (qwen3-max): Chat, 128K tokens
Qwen3.5 Plus (qwen3.5-plus): Chat, 128K tokens
Qwen3.5 Flash (qwen3.5-flash): Chat, 128K tokens
Qwen3 Coder 480B A35B (qwen3-coder-480b-a35b): Chat, 128K tokens

China Direct

Direct OpenAI-compatible access to GLM, Doubao, ERNIE, and Hunyuan models using provider API keys configured on the server.

Capabilities
GLM-5.1 (glm-5.1): Chat, 128K tokens
Doubao Seed 2.0 Pro (doubao-seed-2-0-pro-260215): Chat, 128K tokens
ERNIE 5.1 (ernie-5.1): Chat, 128K tokens
Hunyuan 3 Preview (hy3-preview): Chat, 32K+ tokens

Developer-first model gateway

ModelMax brings model access, cost control, and usage observability into one dashboard.

Zero markup

Pay transparent provider inference costs without platform markup.

Unified API

Call models from multiple providers through one OpenAI-compatible endpoint.
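
Because every provider sits behind the same OpenAI-compatible request shape, switching providers should come down to changing the `model` string. The sketch below builds identical chat bodies for three model IDs taken from the catalog above; `chat_payload` is an illustrative helper, and that these IDs accept exactly this body is an assumption based on the page's compatibility claim.

```python
import json


def chat_payload(model, user_text):
    """Return an OpenAI-style chat body; only `model` varies between providers."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }


# Model IDs copied from the catalog on this page.
MODEL_IDS = ["gpt-5.4-mini", "claude-haiku-4-5", "gemini-3-flash-preview"]

payloads = [chat_payload(m, "Summarize HTTP/2 in one sentence.") for m in MODEL_IDS]

for body in payloads:
    # Each body would be POSTed to the same /v1/chat/completions endpoint.
    print(json.dumps(body))
```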

Usage analytics

Track requests by date, model, tokens, and cost.
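
The dashboard does this aggregation for you, but the idea can be illustrated client-side. The sketch below tallies requests and tokens per date and model, assuming responses follow the standard OpenAI chat completion shape (`created`, `model`, and a `usage` object with `prompt_tokens` and `completion_tokens`). Cost is left out because per-model pricing is not given on this page; `tally_usage` is an illustrative name.

```python
from collections import defaultdict
from datetime import datetime, timezone


def tally_usage(responses):
    """Sum requests and tokens per (UTC date, model) from OpenAI-style responses."""
    totals = defaultdict(
        lambda: {"requests": 0, "prompt_tokens": 0, "completion_tokens": 0}
    )
    for resp in responses:
        # `created` is a Unix timestamp in the OpenAI response format.
        day = datetime.fromtimestamp(resp["created"], tz=timezone.utc).date().isoformat()
        usage = resp.get("usage", {})
        bucket = totals[(day, resp["model"])]
        bucket["requests"] += 1
        bucket["prompt_tokens"] += usage.get("prompt_tokens", 0)
        bucket["completion_tokens"] += usage.get("completion_tokens", 0)
    return dict(totals)
```

Passing a list of decoded response bodies returns a dictionary keyed by `(date, model)`, which can then be multiplied by whatever per-token prices apply to your account.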

Developer experience

Works with common SDKs and keeps migration overhead low.

One control plane for production AI apps

ModelMax brings model access, cost control, and usage observability into one dashboard.