Model Routing

How AllToken routes requests to the best available provider.


How routing works

AllToken routes each request to the best available provider for the requested model, based on:

  • Availability — is the provider healthy and responding?
  • Latency — which provider has the lowest response time?
  • Cost — which provider offers the best price?
  • Priority — does the account have provider preferences configured?

Routing is automatic. You do not need to implement any provider selection logic on your end.
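To make the four factors concrete, here is a minimal sketch of how a selection step of this kind could be written. It is not AllToken's actual algorithm, which this guide does not specify. The `Provider` fields, the `pick_provider` function, and the ordering of the tie-breakers (preference first, then latency, then cost) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    # All fields are hypothetical; they mirror the four factors listed above.
    name: str
    healthy: bool            # availability
    latency_ms: float        # recent response time
    cost_per_mtok: float     # price per million tokens (illustrative unit)
    preferred: bool = False  # account-level provider preference


def pick_provider(providers: List[Provider]) -> Optional[Provider]:
    """Return the best healthy candidate, or None if no provider is healthy."""
    healthy = [p for p in providers if p.healthy]
    if not healthy:
        return None
    # Assumed ordering: preferred providers first, then lower latency,
    # then lower cost. A real router may weigh these factors differently.
    return min(
        healthy,
        key=lambda p: (not p.preferred, p.latency_ms, p.cost_per_mtok),
    )
```

In this sketch an unhealthy provider is never chosen, however fast or cheap it is, which matches the idea that availability is checked before anything else.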

Routing modes

Configure routing behavior per API key in Settings → API Keys:

  • Smart Routing — AllToken selects the best provider path automatically. Recommended for most use cases.
  • Default Model — requests without a model field fall back to this one.
  • Forced Model — overrides all incoming requests to use a specific model, regardless of what the client sends.
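The effect of the Default Model and Forced Model settings on the `model` field can be sketched as a small resolution function. This is an illustration only: `KeyRoutingConfig` and `resolve_model` are hypothetical names, and the sketch assumes that both settings can be present on one key with Forced Model taking precedence, and that a request with no model and no default is rejected. The guide does not confirm either assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyRoutingConfig:
    # Hypothetical per-key settings, named after the modes described above.
    default_model: Optional[str] = None  # used when the request has no model
    forced_model: Optional[str] = None   # overrides whatever the client sends


def resolve_model(requested: Optional[str], config: KeyRoutingConfig) -> str:
    """Decide which model a request should run against."""
    if config.forced_model:
        # Forced Model wins regardless of the client's model field.
        return config.forced_model
    if requested:
        return requested
    if config.default_model:
        # Default Model only applies when the request omits the model field.
        return config.default_model
    # Assumed behavior: with nothing to fall back to, the request fails.
    raise ValueError("request has no model and the key has no default model")
```

For example, `resolve_model(None, KeyRoutingConfig(default_model="gpt-4o"))` returns `"gpt-4o"`, while any request made with a key whose `forced_model` is set resolves to that forced model.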

Provider priority

When a model is available through multiple providers, AllToken assigns each a priority score based on historical performance. View the ranking on any model's detail page under "Available Providers".

Priority 1 is the preferred provider. If it's unavailable, AllToken falls back to priority 2, and so on.
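The fallback order described above amounts to walking the ranked list until an available provider is found. The sketch below shows that idea under stated assumptions: `route_with_fallback` and the `is_available` callback are hypothetical, the provider names are placeholder labels, and raising an error when every provider is down is an assumed behavior rather than documented AllToken behavior.

```python
from typing import Callable, List, Tuple


def route_with_fallback(
    ranked: List[Tuple[int, str]],
    is_available: Callable[[str], bool],
) -> str:
    """Return the first available provider in ascending priority order.

    `ranked` holds (priority, provider_name) pairs, where priority 1 is
    the preferred provider.
    """
    for _priority, name in sorted(ranked):
        if is_available(name):
            return name
    # Assumed behavior when the whole list is exhausted.
    raise RuntimeError("no provider is currently available")
```

With the ranking `[(1, "direct"), (2, "bedrock"), (3, "vertex")]` and `"direct"` reported as unavailable, the function returns `"bedrock"`, the priority 2 provider.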

Multi-provider models

Popular models such as Claude and GPT-4o are served by multiple providers, including the model vendor's direct API, Amazon Bedrock, Google Vertex AI, and others. AllToken automatically routes each request to the healthiest, fastest option.

See which providers serve a model on its detail page under the "Providers" tab.