LLM Gateway by ThreadSync · For Engineering Teams

One API.
Every frontier model.

Multi-provider gateway for governed AI access. Route across Claude, GPT, Gemini, Groq, and Perplexity with one endpoint, OpenAI-compatible request shapes, policy controls, per-request audit, and cost tracking. PKCE-flow browser sessions mean no provider keys ever leak into client code.

Five providers. One audit gap.

Every team using AI right now has the same problem: one project on Anthropic, another on OpenAI, a third experimenting with Gemini, a Slack bot calling Groq, a research script hitting Perplexity. Five vendor relationships, five sets of API keys, five different cost dashboards, no unified audit trail. When the CISO asks "what's our AI exposure?", you can't answer without piecing together five spreadsheets.

One endpoint. Every model. Governed.

LLM Gateway is OpenAI-compatible at the request shape, so existing code works with a single environment-variable change. What you gain on top of "switch base URL" is governance.

Auto-routing + manual pinning

Auto-routing picks the best model for each request based on policy, cost, and capability. Pin a specific model when you need to. Provider failover is automatic.
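
A sketch of both modes, assuming a hypothetical "auto" sentinel for the model field (the exact selector may differ):

# Let the gateway choose by policy, cost, and capability ("auto" is illustrative)
curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "..."}]}'

# To pin: pass a concrete model ID instead of "auto"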

🔑 Per-org keys + model allowlists

Issue scoped tsg-* keys per org or per team. Restrict which models each key can call. Rotate or revoke without touching upstream provider keys.
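
A sketch of issuance, assuming a hypothetical /admin/keys endpoint and field names; the real admin API shape may differ:

# Hypothetical admin call: issue a key scoped to an org, an allowlist, and a budget
curl -X POST https://llmgateway.threadsync.io/admin/keys \
  -H "x-api-key: tsg-admin-..." \
  -H "Content-Type: application/json" \
  -d '{
    "org": "acme-legal",
    "allowed_models": ["claude-opus-4-7"],
    "budget_usd_monthly": 500
  }'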

📊 Per-request cost tracking + budgets

Every request returns flattened token usage and cost. Cap monthly spend per key, per team, or per org. Real-time dashboards; no 30-day-delayed invoices.
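
Because the flattened response carries cost on every call, per-request tracking is nothing more than reading one field (jq shown for illustration):

# Pull the per-request cost straight from the flattened response
curl -s -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-opus-4-7", "max_tokens": 64,
       "messages": [{"role": "user", "content": "ping"}]}' \
  | jq '.usage.cost_usd'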

🛡️ Per-request audit log

Every request logs prompt, response, model used, latency, cost, and policy-decision metadata. Hash-chained for tamper-evidence; exportable to SIEM.
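
An illustrative shape for a single entry; field names are assumptions based on what the log captures, not a documented schema:

# One hash-chained audit entry (illustrative field names):
# {
#   "request_id": "req_...",
#   "model": "claude-opus-4-7",
#   "provider": "anthropic",
#   "latency_ms": 1840,
#   "cost_usd": 0.0093,
#   "policy_verdict": "allow",
#   "prev_hash": "sha256:..."
# }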

🌐 PKCE browser-safe sessions

Browser apps exchange short-lived PKCE tokens server-side. Provider API keys never reach client code, in line with how modern OAuth 2.0 PKCE flows keep secrets out of the browser.
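
A sketch of the flow with hypothetical endpoint and parameter names; the browser only ever holds a short-lived session token:

# 1. Browser app kicks off PKCE and receives an authorization code
# 2. The code + verifier are exchanged for a short-lived token
#    (hypothetical endpoint; the real path may differ)
curl -X POST https://llmgateway.threadsync.io/v1/session/token \
  -H "Content-Type: application/json" \
  -d '{"code": "...", "code_verifier": "..."}'
# 3. The browser calls the gateway with that token; tsg-* and
#    provider keys never leave the server side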

🔁 Idempotent requests + memory

Optional idempotency keys deduplicate retries. Optional conversation memory persists context server-side so you don't pay to resend it on every turn.
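
A sketch using hypothetical header and field names for both options:

# "Idempotency-Key" and "conversation_id" are illustrative names
curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Idempotency-Key: order-42-attempt-1" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "conversation_id": "conv-...",
    "messages": [{"role": "user", "content": "And clause 7?"}]
  }'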

Drop-in compatible

If your code already speaks OpenAI's chat-completions shape, switching to LLM Gateway is a one-line change.

curl -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "system": "You are a careful assistant.",
    "messages": [
      {"role": "user", "content": "Summarize this contract..."}
    ]
  }'

# Response (flattened across providers):
# {
#   "content": "...",
#   "model": "claude-opus-4-7",
#   "provider": "anthropic",
#   "usage": {"input_tokens": 412, "output_tokens": 184, "cost_usd": 0.0093}
# }

Same shape works for GPT, Gemini, Groq, Perplexity. The gateway flattens response payloads so your code reads data.content regardless of provider.
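
The same call pinned to another provider, assuming the gateway accepts provider-native model IDs; the read path does not change:

# Different provider, identical shape; .content reads the same either way
curl -s -X POST https://llmgateway.threadsync.io/v1/chat/completions \
  -H "x-api-key: tsg-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-2.5-pro", "max_tokens": 1024,
       "messages": [{"role": "user", "content": "Summarize this contract..."}]}' \
  | jq -r '.content'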

Built on ThreadSync's governance engine

LLM Gateway is the public API for the same governance engine that powers Magic Runtime and the Lift workspace. Policy evaluation, audit logging, and access control are equivalent across all three; LLM Gateway is the developer-facing entry point for teams that need governed AI access without adopting a full workspace product. Security overview →

TLS 1.3 in transit · AES-256 at rest · Per-request audit log · SOC 2-aligned controls · Hash-chained logs

Pricing

LLM Gateway is available standalone or as part of the Enterprise Platform bundle. Three tiers: Starter (evaluation), Professional (full multi-provider governed access), and Enterprise (full governance at scale). Annual contracts available.

From key issuance to first request — same day

Not a 14-day procurement cycle. The full developer flow:

Step 1 · Provision keys

Issue tsg-* key + set policy

Use the admin API or the workspace UI to create a key, scope it to an org, and set the models it can call and its monthly budget. Keys are hot-rotatable; revocation is instant.
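
Revocation, sketched against the same hypothetical admin endpoint as the key-issuance example above:

# Hypothetical: revoke instantly; upstream provider keys are untouched
curl -X DELETE https://llmgateway.threadsync.io/admin/keys/tsg-... \
  -H "x-api-key: tsg-admin-..."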

Step 2 · Wire upstream

Provide your provider keys to the gateway

Anthropic / OpenAI / Google / Groq / Perplexity keys live server-side in the gateway, never in your client code. Per-provider quota and routing rules are configured once.
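
One way the wiring could look, with a hypothetical per-provider config endpoint:

# Hypothetical: register an upstream key server-side (never shipped to clients)
curl -X PUT https://llmgateway.threadsync.io/admin/providers/anthropic \
  -H "x-api-key: tsg-admin-..." \
  -H "Content-Type: application/json" \
  -d '{"api_key": "sk-ant-...", "monthly_quota_usd": 2000}'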

Step 3 · Switch base URL

Point your code at llmgateway.threadsync.io

Existing OpenAI-compatible code works as-is: swap the base URL and the auth header. One difference: the system prompt goes in a top-level system field, not in the messages array as a "system" role. Responses are flattened to data.content.
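
An illustrative before/after, assuming your client reads its endpoint and auth header from environment variables (names are yours to choose):

# Before: straight to the provider
export LLM_BASE_URL="https://api.openai.com/v1"
export LLM_AUTH_HEADER="Authorization: Bearer sk-..."

# After: through the gateway
export LLM_BASE_URL="https://llmgateway.threadsync.io/v1"
export LLM_AUTH_HEADER="x-api-key: tsg-..."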

Step 4 · Watch the dashboard

Cost, latency, audit — live

Every request shows provider, model, tokens, cost, latency, and policy verdict. Filter by key, team, or org. Export to SIEM via webhook or scheduled S3 dump.
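
A sketch of wiring the export, with a hypothetical config endpoint; the real mechanism may be UI-driven:

# Hypothetical: stream audit events to a SIEM webhook
curl -X POST https://llmgateway.threadsync.io/admin/exports \
  -H "x-api-key: tsg-admin-..." \
  -H "Content-Type: application/json" \
  -d '{"type": "webhook", "url": "https://siem.example.com/ingest"}'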

How LLM Gateway fits in the platform

LLM Gateway is the developer-facing surface. Magic Runtime uses the gateway for AI calls inside its sandboxed execution layer. Lift packages the gateway behind a workspace UI for mid-market teams who don't want to write API code. All three share the same governance engine — choose the surface that matches your team.

See how the products fit together →

One API. Every frontier model. Governed.

OpenAI-compatible. Provider-agnostic. Audit-logged. Self-serve trial keys via developer signup.