One API.
Every frontier model.
Multi-provider gateway for governed AI access. Route across Claude, GPT, Gemini, Groq, and Perplexity with one endpoint, OpenAI-compatible request shapes, policy controls, per-request audit, and cost tracking. PKCE-flow browser sessions mean no provider keys ever leak into client code.
Five providers. One audit gap.
Every team using AI right now has the same problem: one project on Anthropic, another on OpenAI, a third experimenting with Gemini, a Slack bot calling Groq, a research notebook hitting Perplexity. Five vendor relationships, five sets of API keys, five different cost dashboards, no unified audit trail. When the CISO asks "what's our AI exposure?", you can't answer without piecing together five spreadsheets.
One endpoint. Every model. Governed.
LLM Gateway is OpenAI-compatible at the request shape, so existing code works with a single environment-variable change. What you gain on top of "switch base URL" is governance.
Auto-routing + manual pinning
Auto-routing picks the best model for each request based on policy, cost, and capability. Pin a specific model when you need to. Provider failover is automatic.
Per-org keys + model allowlists
Issue scoped tsg-* keys per org or per team. Restrict which models each key can call. Rotate or revoke without touching upstream provider keys.
Per-request cost tracking + budgets
Every request returns flattened token usage and cost. Cap monthly spend per key, per team, or per org. Real-time dashboards; no 30-day-delayed invoices.
Per-request audit log
Every request logs prompt, response, model used, latency, cost, and policy-decision metadata. Hash-chained for tamper-evidence; exportable to SIEM.
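Tamper-evident hash chaining generally works by folding each entry's hash over the previous one, so editing any past record breaks every later link. The sketch below is a minimal illustration of the technique, not the gateway's actual implementation; the field names are assumed.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def entry_hash(prev_hash: str, record: dict) -> str:
    # SHA-256 over the previous entry's hash plus this record's canonical JSON.
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def append(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({**record, "hash": entry_hash(prev, record)})

def verify(log: list) -> bool:
    prev = GENESIS
    for entry in log:
        record = {k: v for k, v in entry.items() if k != "hash"}
        if entry["hash"] != entry_hash(prev, record):
            return False  # this entry, or one before it, was altered
        prev = entry["hash"]
    return True

log = []
append(log, {"model": "claude-opus-4-7", "cost_usd": 0.0093})
append(log, {"model": "example-model-b", "cost_usd": 0.0041})
assert verify(log)

log[0]["cost_usd"] = 0.0001  # tampering with any entry invalidates the chain
assert not verify(log)
```

An exported chain can be re-verified by any SIEM consumer that knows the hashing rule, without trusting the exporter.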
PKCE browser-safe sessions
Browser apps exchange short-lived PKCE tokens server-side. Provider API keys never reach client code, matching how modern OAuth 2.0 authorization flows (RFC 7636 PKCE) already work.
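The PKCE mechanism itself is standardized in RFC 7636: the client generates a random code_verifier and sends only its SHA-256 code_challenge up front, proving possession of the verifier later. This sketch shows the standard derivation, not ThreadSync-specific code.

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    # code_verifier: high-entropy random string, base64url without padding
    # (43 chars from 32 random bytes; RFC 7636 allows 43-128 chars).
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    # code_challenge (S256 method): base64url(SHA-256(verifier)), no padding.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

verifier, challenge = make_pkce_pair()
# The browser sends `challenge` with the authorization request; the server
# later recomputes SHA-256(verifier) during the token exchange and compares.
```

Because only the challenge travels with the initial request, an intercepted authorization code is useless without the verifier, which never leaves the client until the token exchange.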
Idempotent requests + memory
Optional idempotency keys deduplicate retries. Optional conversation memory persists context server-side so you don't pay to resend it on every turn.
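Idempotent request handling is typically a keyed response cache on the server: the first request with a given key executes and stores its result, and any retry with the same key replays the stored response instead of re-running (and re-billing) the call. A minimal sketch of that pattern, not the gateway's actual implementation:

```python
import uuid

class IdempotencyCache:
    """Server-side sketch: replay the stored response for a repeated key."""

    def __init__(self):
        self._seen: dict[str, dict] = {}

    def execute(self, key: str, handler):
        if key in self._seen:
            return self._seen[key]  # retry: cached response, no second model call
        result = handler()
        self._seen[key] = result
        return result

cache = IdempotencyCache()
calls = 0

def call_model() -> dict:
    global calls
    calls += 1  # counts how many times the (expensive) upstream call runs
    return {"content": "ok", "cost_usd": 0.0093}

key = str(uuid.uuid4())  # the client generates one key per logical request
first = cache.execute(key, call_model)
retry = cache.execute(key, call_model)  # deduplicated: handler ran once
assert first == retry and calls == 1
```

A production version would also bound the cache's retention window; the client-side contract is simply to reuse the same key when retrying a timed-out request.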
Drop-in compatible
If your code already speaks OpenAI's chat-completions shape, switching to LLM Gateway is a one-line change.
curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "system": "You are a careful assistant.",
    "messages": [
      {"role": "user", "content": "Summarize this contract..."}
    ]
  }'
# Response (flattened across providers):
# {
#   "content": "...",
#   "model": "claude-opus-4-7",
#   "provider": "anthropic",
#   "usage": {"input_tokens": 412, "output_tokens": 184, "cost_usd": 0.0093}
# }
Same shape works for GPT, Gemini, Groq, Perplexity. The gateway flattens response payloads so your code reads data.content regardless of provider.
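In client code, the flattened shape means one parsing path for every provider. A minimal Python sketch over hardcoded sample payloads (the second model name and all numbers are illustrative):

```python
import json

# Two illustrative flattened payloads in the gateway's documented shape.
raw = [
    '{"content": "Summary A", "model": "claude-opus-4-7", "provider": "anthropic",'
    ' "usage": {"input_tokens": 412, "output_tokens": 184, "cost_usd": 0.0093}}',
    '{"content": "Summary B", "model": "example-model-b", "provider": "google",'
    ' "usage": {"input_tokens": 380, "output_tokens": 150, "cost_usd": 0.0041}}',
]

total_cost = 0.0
for r in raw:
    data = json.loads(r)
    print(data["content"])                 # same field regardless of provider
    total_cost += data["usage"]["cost_usd"]  # per-request cost, already flattened
```

The per-request cost_usd field is what makes client-side budget accounting a one-line sum rather than a reconciliation job across five provider dashboards.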
Pricing
LLM Gateway is available standalone or as part of the Enterprise Platform bundle. Three tiers: Starter (evaluation), Professional (full multi-provider governed access), and Enterprise (full governance at scale). Annual contracts available.
From key issue to first request — same day
Not a 14-day procurement cycle. The full developer flow:
Issue tsg-* key + set policy
Use the admin API or the workspace UI to create a key, scope it to an org, and define which models + which monthly budget it can use. Keys are hot-rotatable; revocation is instant.
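A key's scope (allowed models plus budget) could be expressed as a policy along these lines. The field names below are illustrative only, not the gateway's documented schema:

```json
{
  "org": "example-org",
  "team": "legal-tools",
  "models_allowed": ["claude-opus-4-7", "example-model-b"],
  "monthly_budget_usd": 500,
  "rate_limit_rpm": 60
}
```

The point of keeping this policy in the gateway rather than in client code is that tightening it (or revoking the key) takes effect immediately, with no redeploy and no change to upstream provider keys.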
Provide your provider keys to the gateway
Anthropic / OpenAI / Google / Groq / Perplexity keys live server-side in the gateway, never in your client code. Per-provider quota and routing rules are configured once.
Point your code at llmgateway.threadsync.io
Existing OpenAI-compatible code works as-is: swap the base URL and the auth header. One difference to note: the system prompt goes in a top-level "system" field, not as a role-system message. The response is flattened to data.content.
Cost, latency, audit — live
Every request shows provider, model, tokens, cost, latency, and policy verdict. Filter by key, team, or org. Export to SIEM via webhook or scheduled S3 dump.
How LLM Gateway fits in the platform
LLM Gateway is the developer-facing surface. Magic Runtime uses the gateway for AI calls inside its sandboxed execution layer. Lift packages the gateway behind a workspace UI for mid-market teams who don't want to write API code. All three share the same governance engine — choose the surface that matches your team.
One API. Every frontier model. Governed.
OpenAI-compatible. Provider-agnostic. Audit-logged. Self-serve trial keys via developer signup.