# PRXVT API: Compact Integration Guide for LLMs and Agents

PRXVT is a private, OpenAI-compatible LLM gateway. The server cannot link a wallet to its prompts. No KYC. Use this file as the canonical compact integration context. For the live machine-readable model catalog, call GET https://api.prxvt.ai/v1/models.

Autonomous agents: x402 per-call is the simplest path. It needs no account and no sign-in, just a funded EVM wallet. The API-key path also works headlessly: its Sign-In-With-Ethereum (SIWE) login is only a wallet signature, so no browser is required. Start with the x402 section, or see "Authentication (API key)" for the key flow.

Use HTTPS only. The API is text chat completions in the OpenAI shape; there is no image or video endpoint.

## Base URL

https://api.prxvt.ai
OpenAI-compatible chat completions: POST https://api.prxvt.ai/v1/chat/completions

## Paying: pick by what you are

If you are an AUTONOMOUS AGENT with a funded EVM wallet, you have two good options. Both are fully headless (a SIWE login is just a wallet SIGNATURE, no browser needed):
- x402 per-call (simplest, zero state): one signed USDC payment buys exactly one chat completion. No account, no key. Best for one-off or low-volume use. Full details in the x402 section below.
- Top up + bearer key (best for many calls): fund a balance keyed to your wallet (x402 top-up, or a plain USDC transfer to the server address from GET https://api.prxvt.ai/v1/credits/info), then mint a reusable API key by signing a SIWE challenge with that SAME wallet. Then send Authorization: Bearer prxvt-sk-... and skip per-call signing. See "Authentication (API key)" below for the exact four-step flow.

If your wallet holds enough staked PRXVT (stPRXVT on Base), you get a FREE daily inference allowance. Mint a "staker inference key" (same SIWE flow, then POST /v1/keys with quotaUsdc:0) and your agent spends that allowance with no balance needed. See the "Staked PRXVT holders" section below.

The browser UI at https://prxvt.ai/api-keys is just a convenience for humans; everything it does is available headlessly over the API.

All payments are USDC on Base. No KYC.

## Pricing (estimate before you call)

Per-token, billed in USDC. Each model's exact rate is in GET https://api.prxvt.ai/v1/models (pricing.inputUsdPer1M / pricing.outputUsdPer1M); a snapshot of the per-model rates is listed in the "Models" section at the bottom of this file. These rates already INCLUDE the service margin (the same on the API-key and x402 paths), so they are what you actually pay, not the raw upstream rate. To estimate a single call: (input_tokens / 1e6 * inputRate) + (output_tokens / 1e6 * outputRate). Exception: prxvt-moa (multi-model) does NOT follow that formula -- it is billed as the sum of several models. See "PRXVT-MoA" below.

Worked example with gpt-4o-mini ($0.165/1M input, $0.66/1M output, including the 10% margin): a call with 1,000 input tokens and 500 output tokens costs (1,000 / 1e6 * 0.165) + (500 / 1e6 * 0.66) = $0.000495, about $0.0005, on either the API-key or the x402 path. x402 quotes the price up front as (input_tokens + max_tokens) * rate, so set max_tokens close to what you actually expect or you will be quoted for the full ceiling.
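The estimate above can be sketched in Python. This is a minimal sketch, not an official client: `find_rates` and `estimate_cost_usd` are hypothetical helper names, and the catalog is assumed to follow the OpenAI list shape (a top-level `data` array) carrying the `pricing.inputUsdPer1M` / `pricing.outputUsdPer1M` fields named above. It does not apply to prxvt-moa.

```python
from typing import Any, Mapping, Tuple


def find_rates(catalog: Mapping[str, Any], model_id: str) -> Tuple[float, float]:
    """Return (input_rate, output_rate) in USD per 1M tokens for model_id.

    Assumes the catalog looks like {"data": [{"id": ..., "pricing": {...}}]}.
    """
    for entry in catalog.get("data", []):
        if entry.get("id") == model_id:
            pricing = entry["pricing"]
            return float(pricing["inputUsdPer1M"]), float(pricing["outputUsdPer1M"])
    raise KeyError(f"unknown model id: {model_id}")


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate: float, output_rate: float) -> float:
    """Single-model estimate: tokens / 1e6 * rate, summed over input and output."""
    return input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate


# Sample data copied from the worked example; a real client would load
# the catalog from GET https://api.prxvt.ai/v1/models instead.
sample_catalog = {
    "data": [
        {"id": "gpt-4o-mini",
         "pricing": {"inputUsdPer1M": 0.165, "outputUsdPer1M": 0.66}},
    ]
}

in_rate, out_rate = find_rates(sample_catalog, "gpt-4o-mini")
print(f"{estimate_cost_usd(1000, 500, in_rate, out_rate):.6f}")
```

For an x402 quote, passing max_tokens as `output_tokens` gives a ceiling estimate, under the assumption that the ceiling is priced at the output rate.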

## PRXVT-MoA (multi-model, premium)

prxvt-moa is not a single upstream model. Selecting it makes PRXVT query several frontier models in parallel (the "panel"), then a strong aggregator synthesizes ONE best answer from their drafts. Reach for it when answer quality matters more than cost or latency: it runs several models per call, so it is the most expensive option and noticeably slower than a single model.

- Billing: you are charged the WORST CASE, which is the SUM of every sub-call (each panel model + the synthesizer + a small judge pass). It is quoted up front and is NOT the single per-1M rate shown in the model list. There is no per-call reconcile/refund on any rail. On x402 that worst-case is one on-chain settlement per call; on a bearer/session balance it is a single off-chain debit.
- How to call it: send the OpenAI-shaped body with "model": "prxvt-moa" exactly like any other model. The reply is a normal OpenAI-compatible completion (nothing extra to parse).
- Rails: works on the bearer-key / session path AND on x402 per-call. Not available for image generation, and not on the legacy ZK rail.
- The PRXVT web app also shows a "model agreement" panel under each MoA answer (where the panel agreed vs. differed). That panel is web-only; the raw API stays cleanly OpenAI-compatible and does not include it.

## Authentication (API key): headless SIWE flow (no browser)

Minting an API key proves control of your wallet via a Sign-In-With-Ethereum challenge. That is just an EIP-191 message signature (personal_sign) -- an unattended agent CAN do this with its own private key; no browser is involved. Four steps (three HTTP calls and one local signature):

1. POST https://api.prxvt.ai/v1/account/challenge   (no body)
   -> { nonce, message, expiresAt }. `message` is the human-readable string to sign.
2. Sign `message` with your wallet (e.g. viem walletClient.signMessage({ message }) or ethers signer.signMessage(message)).
3. POST https://api.prxvt.ai/v1/account/login   { wallet: "0x...", nonce, signature }
   -> { token, wallet, expiresAt }. `token` is a 24h session JWT.
4. POST https://api.prxvt.ai/v1/keys   with header  Authorization: Bearer <token>   (body { label?, quotaUsdc? })
   -> { plaintext: "prxvt-sk-..." }. The plaintext is shown ONCE; store it now.

Then call inference with the key:

    Authorization: Bearer prxvt-sk-<48 hex>

Each call debits the wallet's balance at the model's per-token rate (fund it via x402 top-up or a USDC transfer to the server address). Keys are sha256-hashed at rest; the plaintext is shown once on mint. The session token from step 3 also works directly as a Bearer token on https://api.prxvt.ai/v1/chat/completions if you'd rather not mint a key.
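The four-step mint flow can be sketched with the Python standard library. This is an illustrative sketch, not an official client: `post_json` and `mint_api_key` are hypothetical helpers, and `sign_message` is a callable you supply that returns the EIP-191 signature of the challenge message as a hex string. The field names (`nonce`, `message`, `token`, `plaintext`, `label`, `quotaUsdc`) are the ones listed in the steps above.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Optional

BASE_URL = "https://api.prxvt.ai"


def post_json(url: str, body: Optional[Dict[str, Any]] = None,
              bearer: Optional[str] = None) -> Dict[str, Any]:
    """POST JSON (or no body at all) and decode the JSON response."""
    headers: Dict[str, str] = {}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode()
    if bearer is not None:
        headers["Authorization"] = f"Bearer {bearer}"
    req = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode())


def mint_api_key(wallet: str,
                 sign_message: Callable[[str], str],
                 label: Optional[str] = None,
                 quota_usdc: Optional[float] = None,
                 post: Callable[..., Dict[str, Any]] = post_json) -> str:
    """Run challenge -> sign -> login -> mint and return the plaintext key."""
    # Step 1: ask for a challenge (no request body).
    challenge = post(f"{BASE_URL}/v1/account/challenge")
    # Step 2: sign the human-readable message locally.
    signature = sign_message(challenge["message"])
    # Step 3: trade the signature for a 24h session token.
    session = post(f"{BASE_URL}/v1/account/login",
                   {"wallet": wallet, "nonce": challenge["nonce"],
                    "signature": signature})
    # Step 4: mint the key; both body fields are optional.
    key_body: Dict[str, Any] = {}
    if label is not None:
        key_body["label"] = label
    if quota_usdc is not None:
        key_body["quotaUsdc"] = quota_usdc
    minted = post(f"{BASE_URL}/v1/keys", key_body, bearer=session["token"])
    # The plaintext is shown once, so the caller should store it immediately.
    return minted["plaintext"]
```

One way to provide `sign_message` in Python is the `eth_account` package (`Account.sign_message(encode_defunct(text=message), private_key)`); that dependency is a suggestion rather than something this guide prescribes. The `post` parameter exists only so the flow can be exercised against a stub without network access.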

## x402 pay-per-call (no account). READ THIS CAREFULLY

Endpoint (use this exact URL): POST https://api.prxvt.ai/v1/chat/completions
This is the SAME endpoint as the API-key path; it accepts either an Authorization bearer token OR an x402 payment. There is also a short alias https://x402.prxvt.ai/chat/completions that does the same thing, but prefer the canonical https://api.prxvt.ai URL above so you are not juggling two hostnames.

This is x402 protocol VERSION 2. The two mistakes that waste the most time are the header NAME and the payload SHAPE. Get these exactly right.

- Payment header name: PAYMENT-SIGNATURE. The legacy v1 name X-PAYMENT is also accepted as a fallback. Names like "PAYMENT" or "X-402-Payment" are NOT read. Prefer PAYMENT-SIGNATURE.
- The 402 requirements arrive in the PAYMENT-REQUIRED RESPONSE header (base64 JSON), not in the body.
- The 402 BODY now tells you what is wrong: error "payment_required" is normal discovery; error "invalid_payment" means a payment header was present but rejected, and includes a checklist. It is no longer an empty {}.
- The payment payload (before base64) MUST be JSON with a top-level numeric "x402Version": 2 AND a top-level "accepted" equal to the exact accepts[] entry you selected (copied verbatim: scheme, network, asset, payTo, amount). Do NOT derive x402Version from accepts[] (entries do not carry it). Do NOT omit "accepted".
- Network: Base (eip155:8453). Scheme: exact. Currency: USDC in atomic units (6 decimals). Gasless via EIP-3009 transferWithAuthorization: you sign typed data, PRXVT submits on-chain.

Flow:
1. POST the same chat body to https://api.prxvt.ai/v1/chat/completions with no auth and no payment header.
2. Expect HTTP 402. Read the PAYMENT-REQUIRED response header.
3. base64-decode PAYMENT-REQUIRED to JSON. Pick accepts[0] (Base).
4. Build the v2 payload:
   { "x402Version": 2, "accepted": <that accepts entry>, "payload": { "authorization": { "from": <payer>, "to": <payTo>, "value": <amount>, "validAfter": <unix>, "validBefore": <unix+~600>, "nonce": <32 random bytes hex> }, "signature": <EIP-712 sig> } }
   EIP-712 domain for USDC TransferWithAuthorization: name "USD Coin", version "2", chainId 8453, verifyingContract = accepts.asset.
5. base64 the payload JSON and set PAYMENT-SIGNATURE to that string.
6. Retry the SAME request. Success returns the normal completion plus a PAYMENT-RESPONSE header (settlement receipt).

Note: x402 per-call must be NON-streaming (omit stream or send "stream": false). Streaming is supported only on the API-key path.

Use the official @x402 SDK (@x402/core, @x402/evm) to build and sign the payload.
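To make the wire shape concrete, here is a minimal Python sketch of what sits around the signature: decoding PAYMENT-REQUIRED, building the authorization and its EIP-712 typed data, and encoding PAYMENT-SIGNATURE. All function names are hypothetical and signing is left to your wallet library. Two assumptions are worth flagging: the numeric fields are encoded as decimal strings, and the typed-data field list is the standard EIP-3009 `TransferWithAuthorization` struct. Check both against the official SDK before relying on them.

```python
import base64
import json
import secrets
import time
from typing import Any, Dict, Optional


def decode_payment_required(header_value: str) -> Dict[str, Any]:
    """base64-decode the PAYMENT-REQUIRED response header into JSON."""
    return json.loads(base64.b64decode(header_value))


def build_authorization(payer: str, accepted: Dict[str, Any],
                        now: Optional[int] = None,
                        ttl_seconds: int = 600) -> Dict[str, str]:
    """Build the EIP-3009 authorization for the selected accepts[] entry."""
    now = int(time.time()) if now is None else now
    return {
        "from": payer,
        "to": accepted["payTo"],
        "value": str(accepted["amount"]),         # USDC atomic units (6 decimals)
        "validAfter": str(now),                   # decimal string (assumption)
        "validBefore": str(now + ttl_seconds),    # about 10 minutes later
        "nonce": "0x" + secrets.token_hex(32),    # 32 fresh random bytes
    }


def build_typed_data(authorization: Dict[str, str],
                     accepted: Dict[str, Any]) -> Dict[str, Any]:
    """EIP-712 typed data to hand to a wallet library for signing."""
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            "TransferWithAuthorization": [
                {"name": "from", "type": "address"},
                {"name": "to", "type": "address"},
                {"name": "value", "type": "uint256"},
                {"name": "validAfter", "type": "uint256"},
                {"name": "validBefore", "type": "uint256"},
                {"name": "nonce", "type": "bytes32"},
            ],
        },
        "primaryType": "TransferWithAuthorization",
        "domain": {"name": "USD Coin", "version": "2", "chainId": 8453,
                   "verifyingContract": accepted["asset"]},
        "message": authorization,
    }


def encode_payment_header(accepted: Dict[str, Any],
                          authorization: Dict[str, str],
                          signature: str) -> str:
    """Wrap everything in the v2 envelope and base64 it for PAYMENT-SIGNATURE."""
    payload = {
        "x402Version": 2,        # top-level and numeric, never read from accepts[]
        "accepted": accepted,    # the chosen accepts[] entry, copied verbatim
        "payload": {"authorization": authorization, "signature": signature},
    }
    return base64.b64encode(json.dumps(payload).encode()).decode()
```

In use, the value returned by `encode_payment_header` goes into the PAYMENT-SIGNATURE header of the retried request, with the body unchanged from the first attempt.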

CONCURRENCY: each x402 payment signs a fresh EIP-3009 authorization. A single shared x402 client/signer can race when two calls sign at once (one call then gets a 402 / signature error). For parallel requests, create a SEPARATE x402 client (and nonce) per in-flight call, or serialize signing. This is a client-SDK concern, not a server limit.
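The "serialize signing" option can be sketched as a thin lock around whatever signing function you already use. This is a minimal illustration with hypothetical names (`SerializedSigner`, `fresh_nonce`); it assumes a thread-based client and that `sign_typed_data` is your own callable, not part of any SDK.

```python
import secrets
import threading
from typing import Any, Callable, Dict


def fresh_nonce() -> str:
    """Return a new 32-byte hex nonce; never reuse one across payments."""
    return "0x" + secrets.token_hex(32)


class SerializedSigner:
    """Wrap a signing callable so only one signature is produced at a time."""

    def __init__(self, sign_typed_data: Callable[[Dict[str, Any]], str]) -> None:
        self._sign = sign_typed_data
        self._lock = threading.Lock()

    def sign(self, typed_data: Dict[str, Any]) -> str:
        # Holding the lock keeps parallel requests from signing at once.
        with self._lock:
            return self._sign(typed_data)
```

Each in-flight call still builds its own authorization with its own `fresh_nonce()`; only the signing step is shared and serialized. Creating a separate client per call, as described above, avoids the lock entirely at the cost of more setup per request.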

## x402 top-up (fund a balance once)

POST https://api.prxvt.ai/v1/credits/topup (alias https://x402.prxvt.ai/topup) with the same 402 discovery + PAYMENT-SIGNATURE retry. The credited balance is keyed to the PAYING wallet. To then SPEND it, mint a key (or get a session) with that SAME wallet via the headless SIWE flow in "Authentication (API key)" above: challenge -> sign -> login -> POST /v1/keys. The new key debits the balance you just funded. (The topup response itself carries no key -- it only moves USDC into your wallet's balance; the key is minted separately and is reusable.)

## Staked PRXVT holders: free daily inference

If you (or your agent's wallet) hold enough staked PRXVT (stPRXVT on Base), you qualify for a free daily allowance of inference. It is a separate, expiring bucket: it never mixes with any paid balance, resets every day at UTC midnight, and does not roll over. Eligibility is checked on-chain against the staked balance held across a trailing window, so it revokes itself automatically if the wallet drops below the threshold. The exact stake threshold and daily amount are set by the server operator; sign in (below) and read GET https://api.prxvt.ai/v1/account to see your live allowance for today (fields allowanceUsdc and allowanceResetsAt appear only when you are eligible).

The allowance is per WALLET and shared across all of that wallet's keys and sessions, so minting more keys does not multiply it. There are two ways to spend it, both on the normal POST https://api.prxvt.ai/v1/chat/completions endpoint:

1. Staker Inference Key (RECOMMENDED for agents). Mint a "staker inference key" (a key with quotaUsdc = 0) via the headless SIWE flow above with the holding wallet: challenge -> sign -> login -> POST https://api.prxvt.ai/v1/keys { "quotaUsdc": 0 }. (The same is doable in the browser UI at https://prxvt.ai/api-keys.) Use the returned prxvt-sk-... like any other bearer key: Authorization: Bearer prxvt-sk-.... Because its quota is zero it can ONLY draw on the free daily allowance and can NEVER touch the wallet's paid balance, so it is safe to hand to a sub-agent. It is the same key day to day until you rotate it, and it goes quiet on its own the day the wallet falls below the stake threshold (calls then return 402 / quota_exhausted).

2. Session token. The 24h session JWT from step 3 of the SIWE flow itself authorizes POST https://api.prxvt.ai/v1/chat/completions when sent as Authorization: Bearer <session-jwt>. A session-authed call spends the free allowance first and only then falls through to the wallet's paid balance. The staker inference key is usually the better fit for agents (it outlives the 24h session, so no fresh login is needed, and quota = 0 means it can never touch the paid balance), but both work headlessly.

When today's allowance is used up, calls fall back to paid balance (session path) or stop (staker key); either way a fresh allowance is granted on the next call after the UTC reset. The x402 pay-per-call path does not consume the allowance, so a staked holder who wants free inference should use one of the two paths above, not x402.
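The eligibility check described earlier in this section (read GET /v1/account and look for the allowance fields) can be sketched as follows. `fetch_account` and `read_allowance` are hypothetical helper names, and sending the session JWT as a bearer token on this endpoint is an assumption based on how the other session-authenticated calls are described. The value types of the two fields are not specified in this guide, so the sketch passes them through untouched.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://api.prxvt.ai"


def fetch_account(session_token: str, base_url: str = BASE_URL) -> Dict[str, Any]:
    """GET /v1/account with the 24h session JWT as a bearer token."""
    req = urllib.request.Request(
        f"{base_url}/v1/account",
        headers={"Authorization": f"Bearer {session_token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode())


def read_allowance(account: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return today's allowance fields, or None when the wallet is not eligible.

    The fields appear only for eligible staked holders, so their absence
    is treated as "no free allowance".
    """
    if "allowanceUsdc" not in account or "allowanceResetsAt" not in account:
        return None
    return {
        "allowanceUsdc": account["allowanceUsdc"],
        "allowanceResetsAt": account["allowanceResetsAt"],
    }
```

An agent might call `read_allowance(fetch_account(token))` once per day and fall back to a paid path when it returns None.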

## Endpoints

- POST /v1/chat/completions          auth: bearer | x402     OpenAI-compatible chat. Streaming on the bearer path only.
- GET  /v1/models                    auth: public            Live catalog (OpenAI shape + pricing, capabilities, privacy).
- GET  /v1/models/{id}               auth: public            Single model. 404 for unknown.
- POST /v1/account/challenge         auth: public            Issue a SIWE nonce + message to sign. (step 1 of headless key mint)
- POST /v1/account/login             auth: public            { wallet, nonce, signature } -> 24h session token. (step 3)
- GET  /v1/account                   auth: session           Balance + active key count, plus allowanceUsdc/allowanceResetsAt for eligible staked holders.
- POST /v1/keys                      auth: session           Mint a key. { label?, quotaUsdc? }. quotaUsdc:0 mints an allowance-only staker key. Plaintext returned once.
- DELETE /v1/keys/{id}               auth: session           Revoke. Add ?hard=1 to delete a revoked row.
- POST /v1/credits/topup             auth: x402              x402 top-up. Alias https://x402.prxvt.ai/topup.
- POST /v1/x402/chat/completions     auth: x402              Pay-per-call chat. Alias https://x402.prxvt.ai/chat/completions.
- POST /v1/pool/account/topup        auth: session | key     Fund an account from a confidential balance (one-way).
- POST /v1/mcp                       auth: bearer            Model Context Protocol over Streamable HTTP.
- GET  /v1/credits/info              auth: public            Server wallet, chain id, USDC contract, pricing.
- POST /v1/credits/rescan            auth: public            Force the deposit watcher to re-scan.

## Errors

- 401: missing or invalid API key / session.
- 402 (x402 routes): the JSON body carries error ("payment_required" | "invalid_payment"), a human message, and expectedPaymentHeader: "PAYMENT-SIGNATURE". The machine-readable requirements are in the PAYMENT-REQUIRED response header.
- 400: malformed body or unknown model.
- 429: rate limited (retry-after header set).
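A client can turn the status codes above into a small decision function. The sketch below is one possible mapping, not part of the API: `next_action` and the action labels it returns are hypothetical, and it assumes the retry-after header carries a number of seconds (it falls back to a default when the value is missing or not numeric).

```python
from typing import Any, Mapping, Optional, Tuple


def next_action(status: int, headers: Mapping[str, str],
                body: Optional[Mapping[str, Any]] = None,
                default_retry_seconds: float = 1.0) -> Tuple[str, Any]:
    """Map a response to (action, detail) using the error table above."""
    lowered = {k.lower(): v for k, v in headers.items()}
    error = (body or {}).get("error")
    if 200 <= status < 300:
        return ("ok", None)
    if status == 400:
        # Malformed body or unknown model: retrying unchanged will not help.
        return ("fix_request", None)
    if status == 401:
        # Missing or invalid key / session: mint a key or log in again.
        return ("reauthenticate", None)
    if status == 402:
        if error == "payment_required":
            # Normal discovery: the requirements are in the response header.
            return ("pay", lowered.get("payment-required"))
        if error == "invalid_payment":
            # A payment header was sent but rejected; the body explains why.
            return ("fix_payment", (body or {}).get("message"))
        return ("inspect", error)
    if status == 429:
        try:
            delay = float(lowered.get("retry-after", ""))
        except ValueError:
            delay = default_retry_seconds
        return ("retry", delay)
    return ("inspect", error)
```

Returning a label rather than raising keeps the retry policy in the caller, which is usually where an agent decides how many attempts a request is worth.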

## Examples

API key (curl):

    curl -s https://api.prxvt.ai/v1/chat/completions \
      -H "Authorization: Bearer prxvt-sk-..." \
      -H "Content-Type: application/json" \
      -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Say hi in five words."}]}'

OpenAI Python SDK:

    from openai import OpenAI
    client = OpenAI(base_url="https://api.prxvt.ai/v1", api_key="prxvt-sk-...")
    r = client.chat.completions.create(model="gpt-4o-mini", messages=[{"role":"user","content":"hi"}])

OpenAI Node SDK:

    import OpenAI from "openai";
    const client = new OpenAI({ baseURL: "https://api.prxvt.ai/v1", apiKey: "prxvt-sk-..." });
    const r = await client.chat.completions.create({ model: "gpt-4o-mini", messages: [{ role: "user", content: "hi" }] });

## Models

Call GET https://api.prxvt.ai/v1/models for the authoritative live list. Use the id (not the label) in requests. "auto" routes to the cheapest competent model. Current ids:

- auto (Auto): openai, 1000k ctx, $0/$0 per 1M in/out, [text, vision, tools, reasoning, image-gen]
- prxvt-moa (PRXVT-MoA): prxvt, 200k ctx, $40.7/$121 per 1M in/out, [text, vision, reasoning], privacy: contractual -- multi-model: billed as the SUM of the whole panel, not this single rate (see the PRXVT-MoA section)
- hermes-4-405b (Hermes 4 405B): nous, 131k ctx, $1.1/$3.3 per 1M in/out, [text, tools, reasoning], privacy: contractual, uncensored
- fugu (Fugu): sakana, 272k ctx, $4.95/$29.7 per 1M in/out, [text, tools, reasoning], privacy: contractual
- fugu-ultra (Fugu Ultra): sakana, 272k ctx, $5.5/$33 per 1M in/out, [text, tools, reasoning], privacy: contractual
- claude-opus-4-8 (Claude Opus 4.8): anthropic, 1000k ctx, $16.5/$82.5 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- claude-sonnet-5 (Claude Sonnet 5): anthropic, 1000k ctx, $3.3/$16.5 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- claude-fable-5 (Claude Fable 5): anthropic, 1000k ctx, $11/$55 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- claude-sonnet-4-6 (Claude Sonnet 4.6): anthropic, 1000k ctx, $3.3/$16.5 per 1M in/out, [text, vision, tools], privacy: contractual
- claude-haiku-4-5 (Claude Haiku 4.5): anthropic, 200k ctx, $0.88/$4.4 per 1M in/out, [text, tools], privacy: contractual
- gpt-5.5 (GPT-5.5): openai, 1000k ctx, $5.5/$22 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- gpt-5 (GPT-5): openai, 1000k ctx, $3.3/$13.2 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- gpt-5-mini (GPT-5 mini): openai, 1000k ctx, $0.44/$1.76 per 1M in/out, [text, vision, tools], privacy: contractual
- gpt-4o-mini (GPT-4o mini): openai, 128k ctx, $0.165/$0.66 per 1M in/out, [text, vision, tools], privacy: contractual
- o4-mini (o4-mini): openai, 200k ctx, $1.21/$4.84 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- o3 (o3): openai, 200k ctx, $2.2/$8.8 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- gemini-3.1-pro (Gemini 3.1 Pro (preview)): google, 1049k ctx, $2.2/$13.2 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- gemini-3-flash (Gemini 3 Flash (preview)): google, 1049k ctx, $0.55/$3.3 per 1M in/out, [text, vision, tools], privacy: contractual
- gemini-2.5-flash (Gemini 2.5 Flash): google, 1049k ctx, $0.33/$2.75 per 1M in/out, [text, vision, tools], privacy: contractual
- glm-5.2 (GLM 5.2): open, 1000k ctx, $1.32/$4.51 per 1M in/out, [text, tools, reasoning], privacy: contractual
- kimi-k2.7-code (Kimi K2.7 Code): open, 262k ctx, $0.6732/$3.3759 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- kimi-k2.6 (Kimi K2.6): open, 262k ctx, $0.803/$3.839 per 1M in/out, [text, vision, tools, reasoning], privacy: contractual
- deepseek-v4-flash (DeepSeek V4 Flash): open, 1049k ctx, $0.1232/$0.2464 per 1M in/out, [text, tools, reasoning], privacy: contractual
- deepseek-v3.1 (DeepSeek V3.1): open, 128k ctx, $0.297/$1.21 per 1M in/out, [text, tools, reasoning], privacy: contractual
- hy3-preview (Hunyuan 3 (preview)): open, 262k ctx, $0.0726/$0.286 per 1M in/out, [text, tools, reasoning], privacy: contractual
- llama-3.3-70b (Llama 3.3 70B): open, 128k ctx, $0.968/$0.968 per 1M in/out, [text, tools], privacy: contractual
- qwen-3-235b (Qwen 3 235B): open, 262k ctx, $0.0781/$0.11 per 1M in/out, [text, tools], privacy: contractual
- nano-banana (Nano Banana): google, 33k ctx, $0.33/$2.75 per 1M in/out, [image-gen], privacy: contractual
- nano-banana-pro (Nano Banana Pro): google, 0k ctx, $0/$0 per 1M in/out, [image-gen], privacy: contractual
- seedream-4-5 (Seedream 4.5): open, 0k ctx, $0/$0 per 1M in/out, [image-gen], privacy: contractual
- flux-2-pro (FLUX.2 Pro): open, 0k ctx, $0/$0 per 1M in/out, [image-gen], privacy: contractual
- veo-3.1-fast (Veo 3.1 Fast): google, 0k ctx, $0/$0 per 1M in/out, [video-gen], privacy: contractual
- veo-3.1 (Veo 3.1): google, 0k ctx, $0/$0 per 1M in/out, [video-gen], privacy: contractual
- sora-2-pro (Sora 2 Pro): openai, 0k ctx, $0/$0 per 1M in/out, [video-gen], privacy: contractual