Budgets and Spending
See also: Data Relationships, Pricing Catalog and Accounting, Request Lifecycle and Failure Modes, Identity and Access, Admin Control Plane, ADR: Spend Control Plane Reporting and Team Hard-Limit Enforcement
This page describes the live spend contract in the gateway.
Source of Truth
- spend ledger: usage_cost_events
- request-path enforcement:
- ledger writes:
- admin spend APIs:
Ledger Contract
- usage_cost_events is the canonical usage and spend ledger
- request accounting is idempotent on (request_id, ownership_scope_key)
- pricing is resolved from the internal pricing catalog and persisted into the ledger row
- spend math uses fixed-point money and integer arithmetic
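The idempotency and fixed-point rules above can be sketched as follows. This is a minimal in-memory illustration, not the gateway's implementation: the `Ledger` class, `to_fixed`, and the scale of 10,000 (inferred from the `amount_10000` field name) are assumptions.

```python
from decimal import Decimal

# Assumption: money is stored as integer ten-thousandths of a currency unit.
SCALE = 10_000


def to_fixed(amount: str) -> int:
    """Convert a decimal string such as "0.0125" to integer ten-thousandths."""
    scaled = Decimal(amount) * SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more than 4 decimal places")
    return int(scaled)


class Ledger:
    """Hypothetical in-memory stand-in for usage_cost_events."""

    def __init__(self) -> None:
        self.rows: dict[tuple[str, str], int] = {}

    def record(self, request_id: str, ownership_scope_key: str, cost_10000: int) -> bool:
        """Insert once per (request_id, ownership_scope_key); a duplicate is a no-op."""
        key = (request_id, ownership_scope_key)
        if key in self.rows:
            return False
        self.rows[key] = cost_10000
        return True

    def total(self, ownership_scope_key: str) -> int:
        """Sum costs for one owner scope using integer arithmetic only."""
        return sum(cost for (_, scope), cost in self.rows.items() if scope == ownership_scope_key)
```

Recording the same `(request_id, ownership_scope_key)` pair twice leaves the total unchanged, which is the property the idempotency key is meant to guarantee.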
Pricing states are explicit:
- priced
- legacy_estimated
- unpriced
- usage_missing
Only priced and legacy_estimated rows count toward spend totals and budget windows.
Runtime Enforcement
Pre-provider hard-limit checks run on the live request path for:
- POST /v1/chat/completions
- POST /v1/responses
- POST /v1/embeddings
Budgets are enforced by owner scope:
- user-owned API keys use the active user budget
- team-owned API keys use the active team budget
Hard-limit behavior:
- if current priced spend in the active window is already at or above the configured amount and hard_limit = true, the pre-provider check fails with budget_exceeded
- after provider execution, if current priced spend plus the computed request cost would exceed the configured amount, the ledger write is blocked before the priced row is committed
- the HTTP status is 429
- no provider call occurs on the pre-provider rejection path
- observability records pre-provider rejection as a budget outcome instead of provider execution
Two-Phase Enforcement
Budget enforcement has two phases:
- pre-provider blocking against current priced spend
- post-provider projected-cost blocking before the priced ledger row is inserted
This matters because duplicate requests bypass both phases as a no-op. It also explains the boundary difference between the phases. Before provider execution the gateway does not know the final request cost, so the pre-provider check rejects only when current priced spend is already at or above the configured amount. After usage and pricing are available, the gateway can block a newly computed charge that would push the owner past the hard limit.
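The two checks differ only in their comparison, which a short sketch makes concrete. The names `Budget`, `BudgetExceeded`, `pre_provider_check`, and `post_provider_check` are hypothetical. The sketch also assumes that the post-provider phase applies only when hard_limit is true and surfaces the same budget_exceeded error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    amount_10000: int  # configured amount in fixed-point units
    hard_limit: bool


class BudgetExceeded(Exception):
    """Stand-in for the budget_exceeded error returned with HTTP 429."""

    status = 429
    code = "budget_exceeded"


def pre_provider_check(budget: Budget, spent_10000: int) -> None:
    """Phase 1: reject when spend is already at or above the configured amount."""
    if budget.hard_limit and spent_10000 >= budget.amount_10000:
        raise BudgetExceeded()


def post_provider_check(budget: Budget, spent_10000: int, cost_10000: int) -> None:
    """Phase 2: block the ledger write when spend plus the new cost would exceed the amount."""
    if budget.hard_limit and spent_10000 + cost_10000 > budget.amount_10000:
        raise BudgetExceeded()
```

Under these assumptions a request that lands exactly on the configured amount is still committed, and the next request is rejected before any provider call.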
Ownership scope keys:
- user: user:<user_id>
- team: team:<team_id>:actor:none
actor:none is the current team attribution contract. Acting-user attribution is still deferred.
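As an illustration of the key formats, a small helper could build them as shown here. The function name and its arguments are hypothetical; only the resulting string shapes come from this page.

```python
def ownership_scope_key(owner_kind: str, owner_id: str) -> str:
    """Build the scope key for a user-owned or team-owned API key."""
    if owner_kind == "user":
        return f"user:{owner_id}"
    if owner_kind == "team":
        # Acting-user attribution is deferred, so the actor segment is always "none".
        return f"team:{owner_id}:actor:none"
    raise ValueError(f"unknown owner kind: {owner_kind}")
```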
Ledger Write Semantics
- successful request handling writes a ledger row when provider usage can be normalized
- if usage is missing, the row is marked usage_missing
- if pricing cannot be matched exactly, the row is marked unpriced
- unpriced and usage_missing rows stay visible in reporting but do not count toward spend totals
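The status rules above can be sketched as a small classifier. The function and parameter names are hypothetical, and checking usage before pricing is an assumption about precedence. The sketch does not cover legacy_estimated, because this page does not describe how that status is assigned.

```python
from typing import Optional


def pricing_status(usage_tokens: Optional[int], matched_price_10000: Optional[int]) -> str:
    """Classify a ledger row from normalized usage and an exact catalog match."""
    if usage_tokens is None:
        return "usage_missing"
    if matched_price_10000 is None:
        return "unpriced"
    return "priced"
```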
Use request-lifecycle-and-failure-modes.md for the cross-cutting path from request execution to ledger state.
Budget Configuration Model
- user_budgets stores active and inactive user budgets
- team_budgets stores active and inactive team budgets
- each table enforces one active budget per owner
Budget fields:
- cadence: daily, weekly, or monthly
- amount_10000
- hard_limit
- timezone
timezone is stored now, but enforcement windows still use UTC.
Declarative Budget Seed
Active user and team budgets can also come from config-backed seed inputs.
- teams[*].budget reconciles the listed team's active budget
- users[*].budget reconciles the listed user's active budget
- removing a listed owner's budget block deactivates that active budget
- historical budget rows remain historical; config only owns the active row
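One way to picture the reconciliation rules is the sketch below. It assumes a changed seed budget deactivates the current active row and appends a new one, and that owners not listed in the seed are left untouched. The row shape and the `reconcile` function are hypothetical.

```python
from typing import Optional


def reconcile(rows: list[dict], seed: dict[str, Optional[dict]]) -> list[dict]:
    """Apply seed budgets to rows; a None value means the owner's budget block was removed."""
    for owner, wanted in seed.items():
        active = next((r for r in rows if r["owner"] == owner and r["active"]), None)
        if active is not None and wanted is not None:
            if all(active.get(k) == v for k, v in wanted.items()):
                continue  # the active row already matches the seed
        if active is not None:
            active["active"] = False  # the row stays in place as history
        if wanted is not None:
            rows.append({"owner": owner, "active": True, **wanted})
    return rows
```

In this sketch the seed never deletes rows, which mirrors the rule that config only owns the active row.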
Budget Threshold Alerts
Budget alerts have deeper behavior than a plain email side effect.
- alerts are stored durably in budget_alerts
- per-recipient delivery attempts are stored in budget_alert_deliveries
- the initial threshold is fixed at 20% remaining budget
- monthly cadence is supported end to end
Alert creation happens:
- after a new chargeable ledger row is written
- after a budget upsert, if the remaining budget is already at or below the threshold
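A threshold check in integer arithmetic might look like the sketch below. It assumes the threshold means that 20% or less of the configured amount remains in the active window. The function name is hypothetical, and the sketch does not cover how repeated alerts within one window are suppressed.

```python
ALERT_THRESHOLD_PERCENT = 20  # fixed initial threshold: 20% remaining budget


def should_alert(amount_10000: int, spent_10000: int) -> bool:
    """Return True when remaining budget is at or below the threshold."""
    remaining = max(amount_10000 - spent_10000, 0)
    # Cross-multiply instead of dividing so the comparison stays in integers.
    return remaining * 100 <= amount_10000 * ALERT_THRESHOLD_PERCENT
```

The same predicate can serve both triggers: after a chargeable ledger write, and after a budget upsert against the spend already recorded.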
Delivery behavior:
- alert creation is durable-first
- request handling writes alert rows and queued delivery rows first
- a background dispatcher sends email later
- delivery is single-attempt oriented in this slice
- email is the only live channel today, but the schema is channel-aware
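The durable-first flow can be sketched with an in-memory SQLite database. Only the table names budget_alerts and budget_alert_deliveries come from this page. The columns, status values, and function names are assumptions made for illustration.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE budget_alerts (id INTEGER PRIMARY KEY, scope_key TEXT NOT NULL);
        CREATE TABLE budget_alert_deliveries (
            id INTEGER PRIMARY KEY,
            alert_id INTEGER NOT NULL REFERENCES budget_alerts(id),
            channel TEXT NOT NULL,
            recipient TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'queued'
        );
    """)
    return db


def create_alert(db: sqlite3.Connection, scope_key: str, recipients: list[str]) -> int:
    """Request path: write the alert row and queued delivery rows in one transaction."""
    with db:
        cur = db.execute("INSERT INTO budget_alerts (scope_key) VALUES (?)", (scope_key,))
        alert_id = cur.lastrowid
        db.executemany(
            "INSERT INTO budget_alert_deliveries (alert_id, channel, recipient) "
            "VALUES (?, 'email', ?)",
            [(alert_id, recipient) for recipient in recipients],
        )
    return alert_id


def dispatch_once(db: sqlite3.Connection, send) -> None:
    """Background path: attempt each queued delivery exactly once, with no retry."""
    queued = db.execute(
        "SELECT id, recipient FROM budget_alert_deliveries WHERE status = 'queued'"
    ).fetchall()
    for delivery_id, recipient in queued:
        try:
            send(recipient)
            status = "sent"
        except Exception:
            status = "failed"
        with db:
            db.execute(
                "UPDATE budget_alert_deliveries SET status = ? WHERE id = ?",
                (status, delivery_id),
            )
```

Keeping a channel column even though email is the only value reflects the channel-aware schema described above.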
Recipient readiness:
- user budgets notify the user email
- team budgets notify active team owners or admins with emails
That means email readiness is part of the practical identity setup for alerting.
Spend Reporting APIs
Live admin spend APIs:
- GET /api/v1/admin/spend/report
- GET /api/v1/admin/spend/budgets
- GET /api/v1/admin/spend/budget-alerts
- PUT /api/v1/admin/spend/budgets/users/{user_id}
- DELETE /api/v1/admin/spend/budgets/users/{user_id}
- PUT /api/v1/admin/spend/budgets/teams/{team_id}
- DELETE /api/v1/admin/spend/budgets/teams/{team_id}
These routes require an authenticated platform-admin session.
Spend Report Semantics
GET /api/v1/admin/spend/report is the summary endpoint behind the admin spend page.
Supported query parameters:
- days: 7, 30
- owner_kind: all, user, team
The report uses full UTC-day buckets for the selected range. Daily series are zero-filled so charts can render stable timelines even when no chargeable rows exist for a day.
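Zero-filling can be sketched as follows. The function name and input shape are hypothetical, and treating `end_day` as the last bucket in the range is an assumption about how the range is anchored.

```python
from datetime import date, timedelta


def zero_filled_series(spend_by_day: dict[date, int], end_day: date, days: int) -> list[tuple[date, int]]:
    """Return one (UTC day, spend) point per day, using 0 for days without chargeable rows."""
    start = end_day - timedelta(days=days - 1)
    series = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        series.append((day, spend_by_day.get(day, 0)))
    return series
```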
The response separates:
- total request count
- total spend for chargeable rows
- owner breakdowns
- model breakdowns
- daily spend and request points
- counts by pricing status, including priced, legacy_estimated, unpriced, and usage_missing
Only priced and legacy_estimated rows count toward spend totals. unpriced and usage_missing rows remain visible as accounting-quality signals.
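The split between request counts, chargeable spend, and status counts can be sketched as follows. The row shape and the keys of the returned summary are hypothetical and are not the actual response schema.

```python
from collections import Counter

CHARGEABLE_STATUSES = frozenset({"priced", "legacy_estimated"})


def summarize(rows: list[tuple[str, int]]) -> dict:
    """Summarize (pricing_status, cost_10000) rows for a spend report."""
    status_counts = Counter(status for status, _ in rows)
    spend = sum(cost for status, cost in rows if status in CHARGEABLE_STATUSES)
    return {
        "request_count": len(rows),
        "spend_10000": spend,
        "status_counts": dict(status_counts),
    }
```

Every row contributes to the request count and the status counts, but only chargeable rows contribute to spend.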
Window Semantics
- daily windows start at 00:00:00 UTC
- weekly windows start at Monday 00:00:00 UTC
- monthly windows start at 00:00:00 UTC on the first day of the month
- Sunday 23:59:59 UTC is still part of the previous weekly window
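The window rules translate directly into a start-of-window calculation. This is a minimal sketch with a hypothetical function name. It expects a timezone-aware datetime and ignores the stored budget timezone, matching the current UTC-only enforcement.

```python
from datetime import datetime, timedelta, timezone


def window_start(now: datetime, cadence: str) -> datetime:
    """Return the UTC start of the budget window that contains `now`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    midnight = now.astimezone(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    if cadence == "daily":
        return midnight
    if cadence == "weekly":
        # datetime.weekday() returns 0 for Monday.
        return midnight - timedelta(days=midnight.weekday())
    if cadence == "monthly":
        return midnight.replace(day=1)
    raise ValueError(f"unknown cadence: {cadence}")
```

For example, Sunday 2024-01-07 23:59:59 UTC falls in the weekly window that started on Monday 2024-01-01, and one second later a new weekly window begins.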
Current Gaps
- provider breakdown is not part of spend reporting v1
- acting-user attribution for team-owned keys remains actor:none
- timezone-aware budget windows are still deferred
- hardened declarative SSO-backed identity matching remains deferred
What This Page Does Not Own
- exact pricing coverage and unpriced causes:
- end-to-end request path:
- admin-facing UI behavior:
- identity lifecycle and email readiness: