Architecture
Sentinel is a governed access layer for model traffic, not just a pass-through proxy. Its architecture is easiest to understand as two cooperating parts: the hosted gateway that handles runtime requests, and the hosted control surface that defines how those requests should behave.
System shape
At a high level, applications and SDKs send requests to the Sentinel gateway. The gateway applies configuration, policy, limits, and routing decisions before forwarding traffic to the selected provider lane.
The control surface manages the configuration, credentials, and governance state that shape those request-time decisions.
Data plane
The gateway is the data plane. It handles request-time behavior, including:
- key authentication
- endpoint restriction checks
- policy evaluation
- rate limits and budget gates
- routing and provider capability selection
- request execution, telemetry, and request correlation
In practice, the data plane is where Sentinel decides whether a request can proceed, where it should go, and what should be recorded about it.
Control surface
The hosted Sentinel console is the control surface for the configuration that drives runtime behavior.
It manages:
- tenants, projects, environments, and keys
- provider configs and provider secrets
- route plans and policy definitions
- model sync and operational metadata
- request review, blocks, and audit visibility
In practice, the control surface is where operators define the rules and operating context that the gateway enforces.
What the control surface manages
The control surface is the source of truth for the objects that shape request execution.
That includes:
- tenants and projects
- environments
- keys and endpoint restrictions
- provider configs and provider secrets
- routing definitions
- policy definitions
- budgets and limits
These objects do not live in application clients. They stay in Sentinel so governance and routing can be managed centrally.
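As a rough illustration of how these objects could nest, the sketch below models them as Python dataclasses. Every class name, field name, and the find_key helper are hypothetical; Sentinel's actual schema is not described here. The point is only that keys, provider configs, and limits hang off an environment inside a project inside a tenant, and that provider secrets are held by reference.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    secret_ref: str  # hypothetical: a pointer into secret storage, never the secret itself


@dataclass(frozen=True)
class Key:
    key_id: str
    allowed_endpoints: frozenset[str]  # endpoint restrictions attached to the key


@dataclass
class Environment:
    name: str
    keys: list[Key] = field(default_factory=list)
    providers: list[ProviderConfig] = field(default_factory=list)
    monthly_budget_usd: float = 0.0   # assumed budget unit
    requests_per_minute: int = 0      # assumed limit unit


@dataclass
class Project:
    name: str
    environments: dict[str, Environment] = field(default_factory=dict)


@dataclass
class Tenant:
    name: str
    projects: dict[str, Project] = field(default_factory=dict)


def find_key(tenant: Tenant, key_id: str) -> Optional[tuple[Project, Environment, Key]]:
    """Return the project, environment, and key record that own key_id, if any."""
    for project in tenant.projects.values():
        for env in project.environments.values():
            for key in env.keys:
                if key.key_id == key_id:
                    return project, env, key
    return None
```

Because the lookup starts from the key, a client only needs to present a key; everything else about its context is resolved centrally.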
Trust boundaries
Sentinel creates a clean separation between applications, provider credentials, and governance controls.
With Sentinel:
- clients authenticate to Sentinel, not to providers
- provider credentials stay in Sentinel-controlled secret storage
- policy and budget decisions happen before provider execution
- audit and telemetry become platform-level records, not app-local logs
This separation helps platform teams centralize model access without pushing control logic into every application.
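The credential boundary can be sketched as a header swap at the gateway. This is a minimal illustration, not Sentinel's implementation: the Bearer scheme, the in-memory dictionaries standing in for key records and secret storage, and the function name are all assumptions.

```python
# Hypothetical stand-ins for Sentinel-controlled state.
SECRET_STORE = {"provider-a/prod": "provider-secret-value"}
SENTINEL_KEYS = {"sentinel-demo-key": "provider-a/prod"}  # Sentinel key -> secret reference


def build_outbound_headers(inbound_headers: dict[str, str]) -> dict[str, str]:
    """Replace the caller's Sentinel key with a provider credential.

    The client never sees the provider secret, and the provider never
    sees the Sentinel key.
    """
    auth = inbound_headers.get("Authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme != "Bearer" or token not in SENTINEL_KEYS:
        raise PermissionError("unknown or missing Sentinel key")

    provider_secret = SECRET_STORE[SENTINEL_KEYS[token]]

    # Drop the inbound Authorization header, then attach the provider credential.
    outbound = {k: v for k, v in inbound_headers.items() if k.lower() != "authorization"}
    outbound["Authorization"] = f"Bearer {provider_secret}"
    return outbound
```

Rotating a provider secret in this model touches only the secret store; no application configuration changes.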
Request lifecycle
A Sentinel request follows a predictable lifecycle:
- the client sends a request to the Sentinel gateway
- Sentinel loads the relevant config and runtime context
- authentication, endpoint restrictions, policy checks, and limits are evaluated
- Sentinel resolves the eligible provider target and execution path
- the request is forwarded to the provider
- the response is returned with Sentinel headers and request correlation
- telemetry and audit signals are recorded for operator visibility
This lifecycle is where Sentinel's value becomes concrete: governance happens before execution, and visibility remains attached to the request path.
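The steps above can be sketched as a single handler that wires the stages together. This is a simplified illustration under stated assumptions: the stage functions are injected as plain callables, the x-request-id header name is hypothetical (the actual Sentinel headers are not named here), and an in-memory list stands in for telemetry and audit storage.

```python
import uuid
from typing import Callable, Optional

AUDIT_LOG: list[dict] = []  # stand-in for telemetry and audit storage


def handle_request(
    request: dict,
    load_context: Callable[[dict], dict],
    evaluate: Callable[[dict, dict], Optional[str]],
    resolve_target: Callable[[dict, dict], str],
    forward: Callable[[str, dict], dict],
) -> dict:
    """Run one request through the lifecycle and return the response."""
    request_id = str(uuid.uuid4())            # correlation id for this request
    context = load_context(request)           # config and runtime context

    # Authentication, restrictions, policy, and limits collapse into one
    # callable here; it returns a denial reason, or None to proceed.
    denial = evaluate(request, context)
    target = None
    if denial is not None:
        response = {"status": 403, "body": {"error": denial}}
    else:
        target = resolve_target(request, context)
        response = forward(target, request)   # provider execution

    # Attach correlation to the response (header name is an assumption).
    response.setdefault("headers", {})["x-request-id"] = request_id

    # Record the outcome whether or not the request was forwarded.
    AUDIT_LOG.append(
        {"request_id": request_id, "target": target,
         "status": response["status"], "denial": denial}
    )
    return response
```

Note that a denied request never reaches the forward step, yet it still receives a correlation id and an audit record.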
How routing, policy, and limits fit together
These controls work together, not independently:
- authentication establishes the caller and config context
- endpoint restrictions decide whether the request class is allowed
- policy evaluates supported content surfaces before provider execution
- limits and budgets gate traffic and consumption
- routing resolves the eligible provider target and execution strategy
- telemetry and audit record what happened and why
This is what lets Sentinel act as a governed access layer rather than just a forwarding layer.
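One way to picture the ordering is a chain of checks that stops at the first denial and records a reason for every step. The sketch below is illustrative only: the check functions, the demo key, the 60-requests-per-minute threshold, and the route table are hypothetical values, not Sentinel defaults.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Decision:
    allowed: bool
    target: Optional[str] = None
    reasons: list[str] = field(default_factory=list)  # the "what happened and why" trail


# Each check returns a denial reason, or None to pass.
Check = Callable[[dict], Optional[str]]


def authenticate(req: dict) -> Optional[str]:
    return None if req.get("key") == "demo-key" else "unknown key"


def endpoint_allowed(req: dict) -> Optional[str]:
    return None if req.get("endpoint") in {"/v1/chat"} else "endpoint not permitted"


def policy(req: dict) -> Optional[str]:
    return "blocked term in prompt" if "forbidden" in req.get("prompt", "") else None


def within_limits(req: dict) -> Optional[str]:
    return None if req.get("requests_this_minute", 0) < 60 else "rate limit exceeded"


CHECKS: list[tuple[str, Check]] = [
    ("authentication", authenticate),
    ("endpoint", endpoint_allowed),
    ("policy", policy),
    ("limits", within_limits),
]


def decide(req: dict, route_table: dict[str, str]) -> Decision:
    """Evaluate checks in order; route only if every check passes."""
    reasons: list[str] = []
    for name, check in CHECKS:
        denial = check(req)
        if denial is not None:
            reasons.append(f"{name}: {denial}")
            return Decision(False, None, reasons)
        reasons.append(f"{name}: ok")

    target = route_table.get(req["endpoint"])
    if target is None:
        reasons.append("routing: no eligible provider")
        return Decision(False, None, reasons)
    reasons.append(f"routing: {target}")
    return Decision(True, target, reasons)
```

Because routing runs last, a provider target is never resolved for a request that authentication, restrictions, policy, or limits would reject.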
What this architecture enables
This architecture is designed to give operators direct control at the model-access layer.
It enables:
- one stable integration point for application teams
- centralized governance without rebuilding controls in every client
- provider flexibility without forcing constant client rewrites
- durable visibility into both request outcomes and platform decisions
For teams adopting Sentinel, this means integration stays simpler while control becomes stronger.