ADR 0001: Model Gateway And Agent Boundaries

Status: Accepted

Date: 2026-05-08

Context

Calciforge now sits between human chat channels, downstream agents, a local model gateway, a security proxy, and optional provider-owned boundaries such as Helicone, LiteLLM, or OpenRouter. The same words have been used for several different boundaries:

A channel receives a human message and sends replies.
An agent adapter controls or talks to a downstream agent runtime.
The model gateway exposes an OpenAI-compatible /v1/chat/completions endpoint and owns model aliases, local selectors, alloys, cascades, dispatchers, and provider routes.
The security proxy is an HTTP(S) proxy/MITM path for tool and web traffic that is explicitly configured to use it.
A provider-owned boundary, such as Helicone or LiteLLM, can sit behind Calciforge’s model gateway for observability and provider routing.

Those surfaces overlap, but they are not interchangeable. A normal channel-to-agent dispatch reaches the selected adapter; it does not automatically pass through the model gateway. The adapter may then use Calciforge’s model gateway internally, but only if the agent runtime is configured that way.

Decision

Calciforge will make the protected model path explicit.

flowchart TD
  User["Human channel"] --> Router["Calciforge channel/router"]
  Router --> Adapter["Agent adapter"]
  Adapter -->|"only when runtime is configured for it"| Gateway["Calciforge model gateway"]
  Gateway --> Engine["ProviderAdapter: OpenAI-compatible engine or mock"]
  Engine --> Provider["Model provider"]

  Adapter -->|"otherwise"| AgentEgress["Agent-owned model/tool egress"]
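The flow above can be read as a single routing decision per agent. The following Python sketch is illustrative only: `AgentRuntime`, `uses_calciforge_gateway`, and `model_path` are hypothetical names, not Calciforge's actual API, and the hop labels simply mirror the nodes in the flowchart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRuntime:
    """Hypothetical description of a downstream agent runtime."""
    name: str
    # Assumed flag: True only when the runtime is explicitly configured
    # to call Calciforge's model gateway for inference.
    uses_calciforge_gateway: bool = False


def model_path(agent: AgentRuntime) -> list[str]:
    """Return the hops a channel message takes on its way to a model."""
    hops = ["channel", "router", f"adapter:{agent.name}"]
    if agent.uses_calciforge_gateway:
        # Protected path: gateway, then provider adapter, then provider.
        hops += ["model-gateway", "provider-adapter", "provider"]
    else:
        # Everything else leaves through egress the agent owns itself.
        hops.append("agent-owned-egress")
    return hops
```

The default of `False` is a deliberate choice in this sketch: an agent is assumed to bypass the gateway unless its configuration says otherwise, which matches the "only when runtime is configured for it" edge.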

The root model gateway has a small set of supported backends. http, helicone, litellm, portkey, tensorzero, future-agi, openrouter, and wardwright use the same OpenAI-compatible HTTP core. The engine name selects metadata, dashboard hints, and small policy overlays such as Helicone auth/retry headers. mock provides deterministic behavior for local development and tests.

Experimental or stale root backends such as embedded, library, and traceloop are not supported in production config. They can return later only after they have a real adapter contract, validation, integration tests, and docs. Subprocess-backed subscription tools such as Codex, Claude, Kimi, and Dirac, as well as artifact recipes, are agents, not gateway models.
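A minimal sketch of what a shared allowlist check might look like, assuming the backend name arrives as a plain string from config. The backend names come from the text above; the function name, return values, and error messages are hypothetical.

```python
# Engines that share the OpenAI-compatible HTTP core, per the ADR text.
HTTP_CORE_ENGINES = frozenset({
    "http", "helicone", "litellm", "portkey",
    "tensorzero", "future-agi", "openrouter", "wardwright",
})
# Names the ADR retires from production config.
RETIRED_BACKENDS = frozenset({"embedded", "library", "traceloop"})


def validate_backend_type(backend_type: str) -> str:
    """Return the backend family, or raise ValueError for unsupported names."""
    if backend_type in HTTP_CORE_ENGINES:
        return "openai-compatible-http"
    if backend_type == "mock":
        return "mock"
    if backend_type in RETIRED_BACKENDS:
        # Give retired names a more helpful message than unknown ones.
        raise ValueError(
            f"backend_type {backend_type!r} is no longer supported; "
            "use an OpenAI-compatible engine, mock, or an agent adapter/recipe"
        )
    raise ValueError(f"unknown backend_type {backend_type!r}")
```

If config validation and runtime startup both call one function like this, the two cannot disagree about which names are accepted.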

The model gateway remains Calciforge’s source of truth for public model selectors. User-facing model names flow through one resolver path for shortcut aliases and synthetic selectors before routing reaches a terminal provider model. Agent selectors and model selectors are separate namespaces and should be validated as such.
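The single resolver path for shortcut aliases could look roughly like the sketch below. It is an assumption-laden illustration: `resolve_model`, the alias map, and the terminal set are hypothetical, and synthetic selectors (alloys, cascades, dispatchers) are not modeled because they expand to more than one terminal model. Agent selectors are deliberately not consulted, reflecting the separate namespaces.

```python
def resolve_model(name: str, aliases: dict[str, str], terminals: set[str]) -> str:
    """Follow alias links until a terminal provider model is reached."""
    seen: list[str] = []
    current = name
    while current not in terminals:
        if current in seen:
            # An alias chain that loops back on itself can never terminate.
            raise ValueError(f"alias cycle: {' -> '.join(seen + [current])}")
        seen.append(current)
        if current not in aliases:
            raise ValueError(f"unknown model selector {current!r}")
        current = aliases[current]
    return current


# Hypothetical model names, for illustration only.
aliases = {"fast": "small", "small": "example/small-model"}
terminals = {"example/small-model"}
resolved = resolve_model("fast", aliases, terminals)  # "example/small-model"
```

Because only the model alias map and terminal set are passed in, an agent name handed to this resolver fails as an unknown model selector instead of silently matching.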

!model applies only to agents that explicitly consume Calciforge model overrides. Agents with native command/model/session semantics should use their own adapter contract unless their runtime is configured to call Calciforge’s model gateway for inference.

The security proxy is a separate egress boundary. It can protect agent tools and provider web-fetch paths only when the relevant process or provider route is configured to use it, and when HTTPS clients trust the Calciforge MITM CA.
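The two conditions in this paragraph (configured to use the proxy, and trusting the MITM CA for HTTPS) can be sketched as a small check. This assumes the process is pointed at the proxy through the conventional `HTTP_PROXY`/`HTTPS_PROXY` environment variables, which may not be how every agent or provider route is configured; `proxy_covers` and its parameters are hypothetical, and `NO_PROXY` exclusions are ignored for brevity.

```python
def proxy_covers(env: dict[str, str], proxy_url: str, trusts_mitm_ca: bool) -> dict[str, bool]:
    """Report which URL schemes a process would send through the proxy."""

    def routed(var: str) -> bool:
        # Many clients read both upper- and lowercase variable names.
        value = env.get(var) or env.get(var.lower()) or ""
        return value.rstrip("/") == proxy_url.rstrip("/")

    return {
        "http": routed("HTTP_PROXY"),
        # HTTPS traffic is only inspectable if the client trusts the MITM CA.
        "https": routed("HTTPS_PROXY") and trusts_mitm_ca,
    }
```

A process with an empty environment reports no coverage at all, which is the case the ADR calls out: the proxy protects nothing that is not explicitly routed through it.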

Consequences

Operators get fewer false promises:

!gateway and docs describe the model gateway, not every downstream agent.
!agents, doctor, and future UX should report each agent’s coverage: model-gateway path, security-proxy path, model override support, session support, and known bypasses.
External gateways add provider management behind Calciforge. Observability sinks are configured separately under [[proxy.observability]], so tools such as Traceloop or a plain OpenTelemetry collector can receive model-attempt telemetry without becoming model gateways.
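One way to picture the per-agent coverage report is a small record with one flag per item in the list above. The sketch below is hypothetical: `AgentCoverage`, its field names, and the summary format are illustrative and do not describe the actual output of !agents or doctor.

```python
from dataclasses import dataclass, field


@dataclass
class AgentCoverage:
    """Hypothetical per-agent coverage record; fields follow the ADR's list."""
    name: str
    model_gateway: bool = False
    security_proxy: bool = False
    model_override: bool = False
    sessions: bool = False
    known_bypasses: list[str] = field(default_factory=list)

    def summary(self) -> str:
        """Render one line suitable for a chat reply or a doctor report."""
        flags = {
            "model-gateway": self.model_gateway,
            "security-proxy": self.security_proxy,
            "model-override": self.model_override,
            "sessions": self.sessions,
        }
        parts = [f"{label}={'yes' if on else 'no'}" for label, on in flags.items()]
        bypasses = ", ".join(self.known_bypasses) or "none"
        return f"{self.name}: {' '.join(parts)} bypasses: {bypasses}"


# Hypothetical agent, for illustration only.
report = AgentCoverage(
    "example-agent", security_proxy=True, known_bypasses=["native model egress"]
).summary()
```

Every flag defaults to `False`, so an agent claims coverage only where something has explicitly established it.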

This also narrows supported configuration. Configs that used backend_type = "embedded", backend_type = "library", or backend_type = "traceloop" as the root [proxy] backend must move to one of the supported OpenAI-compatible provider adapter kinds, mock, or an agent adapter/recipe instead.

Follow-Up Refactor Plan

Keep root gateway backend validation and runtime startup on the same allowlist.
Remove stale gateway spike code from the production build; future gateway experiments must return behind explicit experimental modules with their own adapter contract and tests.
Extend doctor and chat-visible agent details with per-agent coverage: model gateway, security proxy, model override, session, artifacts, and native commands.
Replace hard-coded model fallback lists with config-derived model registry data wherever possible.
Add production-path tests for the boundaries: channel to model-gateway-backed openai-compat, channel to native agent, and channel to subprocess agent with explicit security-proxy/env coverage.

Follow-Through

2026-05-08: The production code path narrowed the gateway allowlist and removed the old Traceloop backend and the unimplemented embedded/library backend stubs, so that validation, runtime startup, and selectable gateway engine types can no longer drift apart on unsupported names.

2026-05-13: The provider adapter implementation stopped treating Helicone as a privileged runtime path. http, helicone, litellm, portkey, tensorzero, future-agi, and openrouter now share the same OpenAI-compatible HTTP core, with named engines supplying policy/metadata overlays.

2026-05-15: Wardwright became the forward path for synthetic-model composition. Calciforge keeps in-process alloys, cascades, and dispatchers for compatibility, but new synthetic route graphs should run through Wardwright as an OpenAI-compatible provider adapter.