Adapter Requirements & Design Spec

This guide defines the requirements, contracts, and operational envelope for the adapter your team builds for its host system. The shiftagent Integration API the adapter consumes is documented in the API reference: the OpenAPI specification at openapi/openapi.yaml is normative for every API operation named here; this document references operations by operationId and never re-specifies request/response schemas.

0. Scope, audience & reading map

What the adapter is, in brief: it marries a host system to an on-prem shiftagent install, taking a host JWT in, deriving an (external_tenant_id, external_user_id) identity, and making shiftagent Integration API calls out. The host never talks to shiftagent directly, and shiftagent never sees a host token.

Audience: the host-side engineering team building/operating the adapter, and shiftagent operators standing up the on-prem install it talks to.

Keywords: MUST / MUST NOT / SHOULD / MAY are used per RFC 2119. Where this document says “client-specific,” it means: varies per host system and is isolated behind a named seam (§3.2 for identity derivation, §5.1 for host directory enumeration); everything else is host-system-generic.

Document	Relationship to this spec
`openapi/openapi.yaml`	Normative API contract. Every operation cited here by `operationId` is defined there.
Integration Guide	Narrative architecture, external-ID conventions, context-gathering pattern.
Provisioning Flow	Step-by-step cold/warm walkthrough with sequence diagrams and race semantics.
Streaming Contract	NDJSON event reference the adapter passes through (§7, §9.5).
Runtime Architecture	Composable runtime: agent types, sandbox security model, filler/capacity semantics.

1. Role & boundaries

1.1 The adapter is a middleweight client

The adapter is deliberately characterized as a middleweight client — a precise middle ground between two shapes it must not become:

Not a thin proxy. It carries substantial logic: host-JWT verification and identity derivation, the JIT provisioning orchestration (§4), lifecycle reconciliation sweeps (§5), per-user token exchange and caching (§6), HITL approval transport (§7), and unbuffered NDJSON stream fan-through (§9.5). A naive header-rewriting reverse proxy cannot do this job.
Not a heavyweight middleware. It holds zero persistence — no database, no durable queue, no files, no shared cache (§2). It makes no business decisions: which skills a user gets, which repository a role uses, what an agent may do — all of that is configuration inside shiftagent. shiftagent’s external_id support on tenants and users exists precisely so the adapter never needs a mapping table.

flowchart LR
    HOST["Host system<br/>(fronts all traffic)"] -- "host JWT per request" --> ADAPTER["Adapter<br/>(middleweight, stateless)"]
    ADAPTER -- "sk_int_ key +<br/>per-user platform JWT" --> API["shiftagent<br/>Integration API"]
    API --> STACK["Sandboxes, vault,<br/>repositories, storage"]

1.2 Boundary of ownership

Concern	Owner
Validating the host JWT, deriving external IDs	Adapter (the only client-specific logic on the request path, §3)
Storing tenant / user / role / conversation state	shiftagent DB, keyed by `external_id` on tenants AND users
JIT provisioning decisions (“does this tenant/user exist?”)	Adapter orchestrates; shiftagent enforces idempotency (PUT upsert semantics, DB uniqueness as the lock — §4)
Repository / role / skill semantics	shiftagent (top-level repository registry; tenant default attachment; role may pin a repository override; effective skills resolved per request)
Rendering conversations, streaming to the host UI	Host system, via adapter pass-through of the NDJSON stream
Approval decisions (HITL)	Host approval authority — the adapter transports, never decides (§7)
Secret values	shiftagent vault — the adapter is a write-only conduit (§7.4)
Auth between adapter and shiftagent	Integration service key + per-user token exchange (§6)
Lifecycle truth (which tenants/users should exist)	Host system; the adapter reconciles shiftagent to it (§5)

Explicitly out of scope for the adapter: any business logic; any persistence; any skill authoring; any decision about which skills a user gets (that is role configuration inside shiftagent, owned by the operator); any credential registry writes (§6.2); minting approvals (§7.2).

1.3 Tenant topology

The client’s shiftagent install carries one integration root tenant; every host tenant becomes a child tenant under it, carrying a namespaced external_id (§3.3). The adapter’s integration key is minted at that root and is subtree-scoped: it can provision and operate children but can never touch tenants outside its subtree — tenant isolation by construction, not by adapter diligence. The adapter discovers its root and scopes at startup via getIntegrationSelf (GET /integration/self), which returns root_tenant_id, granted scopes, and the registered approver-key fingerprints (§7.3). Confirm the one-root-per-install topology for your deployment (§11).

1.4 Conversation duties

For any authenticated host user the adapter exposes, at minimum:

List conversations + history — listConversations (GET /conversations?user_id=) under the user’s exchanged platform JWT; tenant-wide administrative listing (GET /conversations?tenant_id=) under the service key.
Start a conversation — createConversation (POST /conversations). shiftagent resolves the context snapshot (role → repository → skills) at creation. For a multi-role user the request MUST carry an explicit role_id or shiftagent replies 422 role-required; surfacing the role choice (or applying a host-side default policy) is an adapter duty.
Continue a conversation — createMessage (POST /conversations/{conversation_id}/messages), streaming the NDJSON response through unbuffered (§9.5). Message history retrieval via listMessages (GET /conversations/{conversation_id}/messages).

Runtime knobs (runtime.mode sticky/pooled, on_capacity reject/hold, filler.enabled) are host-policy choices the adapter passes through verbatim; see Runtime Architecture for their semantics.

2. The zero-storage philosophy & cache policy

2.1 Principle

The adapter has no database, no durable queue, no files. Every mapping — host tenant → shiftagent tenant, host user → shiftagent user — lives in shiftagent via external_id, reachable through getTenantByExternalId / getUserByExternalId and created through the corresponding PUT upserts. Consequences, stated normatively:

Adapter instances are interchangeable and horizontally scalable; any instance can serve any request with no session affinity.
Killing or redeploying the adapter loses nothing. Disaster recovery of the integration is disaster recovery of shiftagent alone.
There is no sync job and no drift between adapter state and shiftagent state, because the adapter has no state to drift. (Reconciliation in §5 diffs shiftagent against the host, not against any adapter store.)
Provisioning progress MUST NOT be recorded anywhere: the provisioning chain is convergent by construction (§4.4), so progress state is not just unnecessary — it is a correctness hazard.

2.2 Cache policy (normative)

Caching is allowed but MUST be in-memory, per-instance, bounded, and safe to lose. A shared cache (Redis or similar) MUST NOT be introduced: it would be creeping storage and would reintroduce the coordination problems that zero storage eliminates.

Item	Cacheable?	TTL / invalidation	Why
Host IdP JWKS keys	Yes	Per `Cache-Control` (fallback 15 min); force-refetch once on unknown `kid`	Standard OIDC practice; enables key rotation without restart
Exchanged platform JWT per `(external_tenant_id, external_user_id)`	Yes	Until `exp − 60 s`, hard cap 15 min	Re-derivable via `tokenExchange`; bounds token-exchange QPS. NOTE: caps revocation latency — see §8
`external_tenant_id → tnt_…` resolution	Yes	≤ 5 min; drop on any 404/403 from shiftagent	Mapping is immutable once created; the short TTL only guards against tenant deprovisioning
`external_user_id → usr_…` resolution	Yes (implied by the cached JWT entry)	Same as the JWT entry	Same
Capacity snapshot (`getCapacity`)	Yes, advisory only	≤ 5 s	Point-in-time pool state; never a substitute for handling `429 capacity-exhausted`
Role definitions, skill grants, “skills this user can access”	No	—	Must be live — shiftagent resolves capabilities and effective skills per request (`listUserSkills`, `listRoleSkills`)
User active/deactivated status	No	—	Deactivation must bite on the first request after the cached JWT expires; never cache an “active” verdict beyond the JWT TTL
Conversation lists / messages	No	—	Consistency-bearing data
Idempotency keys / provisioning progress	No	—	Provisioning is convergent (§4.4); progress state is forbidden
Secret values (message `secrets`, approve-body `secrets`)	Never	—	Write-only pass-through into the vault (§7.4); MUST NOT be buffered beyond the in-flight request, logged, or cached
Approver key material	Never held at all	—	Not a cache question: the adapter never possesses it in any form (§7.3)
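
The platform-JWT row of the table above can be sketched as a small in-process cache. This is a minimal illustration, not the adapter's implementation: the class name, the `max_entries` bound, the injectable `clock`, and least-recently-used eviction are all assumptions; only the `exp − 60 s` margin and the 15 min hard cap come from the table.

```python
import time
from collections import OrderedDict

SAFETY_MARGIN_S = 60        # stop using a token 60 s before its exp (from the table)
HARD_CAP_S = 15 * 60        # never keep an entry longer than 15 min (from the table)


class PlatformJwtCache:
    """Bounded per-process cache keyed by (external_tenant_id, external_user_id)."""

    def __init__(self, max_entries=10_000, clock=time.time):
        self._entries = OrderedDict()   # key -> (token, usable_until)
        self._max = max_entries         # hypothetical bound; size it per deployment
        self._clock = clock

    def put(self, key, token, exp):
        now = self._clock()
        usable_until = min(exp - SAFETY_MARGIN_S, now + HARD_CAP_S)
        if usable_until <= now:
            return                      # too close to expiry to be worth caching
        self._entries[key] = (token, usable_until)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)   # evict the least recently used entry

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        token, usable_until = entry
        if self._clock() >= usable_until:
            del self._entries[key]      # expired: force a fresh tokenExchange
            return None
        self._entries.move_to_end(key)
        return token

    def drop(self, key):
        """Called on any 403/404 from shiftagent for this identity."""
        self._entries.pop(key, None)
```

Losing the whole cache (restart, eviction) only costs one extra provisioning pass and token exchange per identity, which is what "safe to lose" means here.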

3. Host JWT contract

3.1 Normative requirements

The host system sends its own JWT on every request to the adapter: Authorization: Bearer <host_jwt>.
The adapter validates, in order:
- Signature against the host IdP’s JWKS, with an alg allow-list — asymmetric algorithms only (e.g. RS256, ES256, EdDSA). none and every HS* algorithm are rejected outright to prevent key-confusion attacks.
- iss — exact string match against the configured issuer.
- aud — contains the adapter’s registered audience.
- exp / nbf / iat — enforced with ±60 s clock skew tolerance, no more.
- Required identity claims present and non-empty (which claims those are is client-specific — §3.2).
Key rotation: JWKS keys are selected by kid. An unknown kid triggers exactly one forced JWKS refetch before rejecting with 401; refetches are rate-capped (§8) so a garbage-kid flood cannot become a JWKS-fetch DoS.
The adapter derives exactly two identity values, then discards the host JWT — it is never forwarded to shiftagent and never logged:
- external_tenant_id — from the host’s tenant identifier claim
- external_user_id — from the host’s user identifier claim
Plus best-effort display attributes for provisioning enrichment: email, display_name. These are optional by design — shiftagent’s bare-minimum JIT contract means tenant and user creation succeed with nothing but the external_id; enrichment can arrive on any later request (PUT merge-upsert semantics, §4.3).
Failure semantics: an invalid, expired, or absent host JWT → the adapter returns 401 (RFC 9457 application/problem+json, type …/host-token-invalid) without calling shiftagent at all.
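
The claim checks in the list above can be sketched as one pure function over an already-decoded header and payload. Signature verification against the JWKS is deliberately left out and assumed to be done by a JWT library; the function name, the exception type, and the choice to treat `exp` as mandatory while `nbf` and `iat` are checked only when present are assumptions of this sketch.

```python
import time

ALLOWED_ALGS = {"RS256", "ES256", "EdDSA"}   # asymmetric only; "none" and HS* never match
SKEW_S = 60                                  # clock skew tolerance, in seconds


class HostTokenInvalid(Exception):
    """Maps to the adapter's 401 host-token-invalid response."""


def validate_claims(header, claims, *, issuer, audience, now=None):
    now = time.time() if now is None else now
    if header.get("alg") not in ALLOWED_ALGS:
        raise HostTokenInvalid("algorithm not on the allow-list")
    if claims.get("iss") != issuer:                      # exact string match
        raise HostTokenInvalid("issuer mismatch")
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise HostTokenInvalid("audience mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + SKEW_S:
        raise HostTokenInvalid("token expired or exp missing")
    nbf = claims.get("nbf")
    if nbf is not None and now < nbf - SKEW_S:
        raise HostTokenInvalid("token not yet valid")
    iat = claims.get("iat")
    if iat is not None and iat > now + SKEW_S:
        raise HostTokenInvalid("token issued in the future")
    return claims
```

Any `HostTokenInvalid` is answered with 401 before a single shiftagent call is made.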

3.2 `deriveIdentity()` — the single client-specific seam

Claim mapping is the ONLY client-specific code on the request path. It is a single pure function behind an interface:

deriveIdentity(verifiedClaims) → {
  external_tenant_id,   // required
  external_user_id,     // required
  email?,               // best-effort enrichment
  display_name?         // best-effort enrichment
}

Everything upstream of it (JWKS handling, validation) and downstream of it (provisioning, token exchange, forwarding) is host-system-generic. A wrong assumption about a host’s claim shape costs exactly one function. Per-client instantiations of this function live in each client’s own integration documentation.

3.3 External-ID namespacing at derivation

External IDs are namespaced before they ever reach shiftagent:

external_tenant_id = "{ns}:tenant:{host_tenant_claim}"
external_user_id   = "{ns}:user:{host_user_claim}"

where {ns} is adapter configuration (EXTERNAL_ID_NAMESPACE, §9.4). A second host system integrated later gets its own prefix, making collisions structurally impossible. Namespacing MUST be enforced from request one — retrofitting a namespace onto already-provisioned unprefixed IDs would require exactly the data migration the zero-storage design exists to avoid.
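
A minimal sketch of one `deriveIdentity()` instantiation with namespacing applied at derivation. The claim names (`org_id`, `sub`, `email`, `name`) are borrowed from the illustrative token in §3.4 and are assumptions, not a statement about any real host; a real client's mapping replaces this one function.

```python
def derive_identity(verified_claims, ns):
    """Map verified host claims to namespaced external IDs plus optional enrichment."""
    tenant = str(verified_claims.get("org_id") or "").strip()   # hypothetical tenant claim
    user = str(verified_claims.get("sub") or "").strip()        # hypothetical user claim
    if not tenant or not user:
        raise ValueError("required identity claim missing or empty")

    identity = {
        "external_tenant_id": f"{ns}:tenant:{tenant}",
        "external_user_id": f"{ns}:user:{user}",
    }
    # Enrichment is best-effort: include only what the host actually sent.
    if verified_claims.get("email"):
        identity["email"] = verified_claims["email"]
    if verified_claims.get("name"):
        identity["display_name"] = verified_claims["name"]
    return identity
```

Because the function is pure and takes the namespace as a parameter, it can be unit-tested against captured claim sets without any network or shiftagent dependency.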

3.4 Illustrative host JWT

Illustrative only. Host IdP token shapes vary; real claim names, issuer, and algorithm are confirmed per client during adapter instantiation (see §11). This example exists to make the derivation concrete, not to specify any particular host system.

// Example host-IdP-issued JWT payload (namespace "acme" configured on the adapter)
{
  "iss": "https://idp.host.example",
  "aud": "shiftagent-adapter",
  "sub": "29401",                // → external_user_id   = "acme:user:29401"
  "org_id": "128231",            // → external_tenant_id = "acme:tenant:128231"
  "email": "dispatcher@acme-field.example",
  "name": "Dana Dispatcher",
  "iat": 1782046400,
  "exp": 1782050000
}

4. Request lifecycle state machine

Every host request enters the same machine. The provisioning primitive throughout is the PUT upsert by external ID — upsertTenantByExternalId (PUT /tenants/by-external-id/{external_id}) and upsertUserByExternalId (PUT /tenants/{tenant_id}/users/by-external-id/{external_id}) — which returns 201 when it created the resource and 200 when it already existed. There is no GET-then-POST dance and no 409 on the tenant/user race path: the DB uniqueness constraint is the lock, and a concurrent loser simply receives 200 with the winner’s record. 409 + conflicting_resource_id recovery exists only for named sub-resources created with POST (roles via createRole, registry entries via registerRepository) — see §4.4.

stateDiagram-v2
    state "Verify host JWT" as Verify
    state "deriveIdentity()" as Derive
    state "Platform-JWT cache" as Cache
    state "PUT tenant by external ID" as UpsertTenant
    state "Attach default repository (PUT)" as AttachRepo
    state "Ensure default role (POST)" as EnsureRole
    state "Adopt existing role via conflicting_resource_id" as FetchRole
    state "PUT user by external ID" as UpsertUser
    state "Assign default role (PUT)" as AssignRole
    state "Token exchange" as Exchange
    state "Forward business call" as Forward
    state "Stream / respond" as Stream

    [*] --> Verify
    Verify --> Reject401 : invalid, expired, or absent
    Reject401 --> [*]
    Verify --> Derive : signature + claims verified
    Derive --> Cache
    Cache --> Forward : hit (steady state)
    Cache --> UpsertTenant : miss
    UpsertTenant --> AttachRepo : 201 tenant created (cold)
    UpsertTenant --> UpsertUser : 200 tenant existed (warm)
    AttachRepo --> EnsureRole : idempotent attach OK
    EnsureRole --> UpsertUser : 201 role created
    EnsureRole --> FetchRole : 409 name-conflict
    FetchRole --> UpsertUser
    UpsertUser --> AssignRole : 201 user created
    UpsertUser --> Exchange : 200 user existed
    AssignRole --> Exchange : idempotent PUT
    Exchange --> Forward : platform JWT cached until exp-60s
    Exchange --> Reject403 : 403 user deactivated
    Reject403 --> [*]
    Forward --> Stream
    Stream --> [*]

4.1 Warm path (steady state)

A cached platform JWT exists for (external_tenant_id, external_user_id) → forward the business call (list conversations, send message, …) with that JWT. One shiftagent round-trip.

On a cache miss with everything already provisioned: upsertTenantByExternalId (200) → upsertUserByExternalId (200) → tokenExchange → forward — four round-trips, then cached. The two PUTs on the cache-miss path double as enrichment refresh: merge-upsert semantics mean the latest email / display_name from the host JWT flow into shiftagent on every cache-cold request without disturbing anything else (§4.3).

4.2 Cold path (JIT provisioning, ordered and convergent)

Triggered when upsertTenantByExternalId returns 201 (tenant newly created — body MAY be {}; external_id alone is sufficient). The adapter then runs the tenant bootstrap before continuing:

1. Attach default repository: attachTenantRepository (PUT /tenants/{tenant_id}/repositories/{repository_id}) with {is_default: true}, pointing at the pre-registered registry entry named by adapter configuration (DEFAULT_REPOSITORY_NAME, resolved once via listRepositories). This is an idempotent PUT; re-running is a no-op 200. The repository registry itself is operator-provisioned and pre-authenticated (createCredential + registerRepository at install time, §6.2); the adapter only assigns registry entries, never creates them.
2. Ensure default role: createRole (POST /tenants/{tenant_id}/roles) with the well-known name from DEFAULT_ROLE_NAME and skill_access per configured policy (default {mode: "all"}). Role names are unique per tenant, which is deliberately load-bearing for replay-safe provisioning: a concurrent or repeated create returns 409 name-conflict carrying conflicting_resource_id, and the adapter adopts the existing role via getRole and continues. (listRoles with its ?name= filter serves the same lookup when recovering out-of-band.)
3. Upsert user: upsertUserByExternalId, body carrying enrichment only (email, display_name). Storage (the user’s S3-style bucket) is auto-attached by shiftagent.
4. Assign default role: on a 201 from step 3, assignUserRole (PUT /users/{user_id}/roles/{role_id}), an idempotent PUT.
5. Token exchange → forward (§6.3).

A new user in an existing tenant follows steps 3–5 only (the tenant PUT returned 200, so bootstrap is skipped).
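
The cold and warm branches above can be sketched as one function over a thin API client. The `api` object and its method names (snake_case versions of the operationIds, each returning a `(status, body)` pair), the `id` field on response bodies, and the pre-resolved `default_repository_id` in `cfg` are all assumptions made for illustration; the normative shapes live in `openapi/openapi.yaml`. For brevity the sketch adopts `conflicting_resource_id` directly instead of following up with `getRole`, and it stops before token exchange.

```python
ENRICHMENT_FIELDS = ("email", "display_name")


def bootstrap_tenant(api, tenant_id, cfg):
    """Tenant bootstrap: attach the default repository, then ensure the default role."""
    api.attach_tenant_repository(
        tenant_id, cfg["default_repository_id"], {"is_default": True})
    status, body = api.create_role(
        tenant_id,
        {"name": cfg["default_role_name"], "skill_access": {"mode": "all"}})
    if status == 409:
        return body["conflicting_resource_id"]   # another request created it first
    return body["id"]


def default_role_id(api, tenant_id, cfg):
    """Look the default role up by name; on a miss, re-run the bootstrap (§4.4)."""
    _, roles = api.list_roles(tenant_id, name=cfg["default_role_name"])
    if roles:
        return roles[0]["id"]
    return bootstrap_tenant(api, tenant_id, cfg)


def provision(api, identity, cfg):
    """Run the provisioning chain up to (not including) token exchange."""
    status, tenant = api.upsert_tenant_by_external_id(
        identity["external_tenant_id"], {})
    role_id = bootstrap_tenant(api, tenant["id"], cfg) if status == 201 else None

    body = {f: identity[f] for f in ENRICHMENT_FIELDS if identity.get(f)}
    status, user = api.upsert_user_by_external_id(
        tenant["id"], identity["external_user_id"], body)
    if status == 201:                            # new user: grant the default role
        if role_id is None:
            role_id = default_role_id(api, tenant["id"], cfg)
        api.assign_user_role(user["id"], role_id)
    return tenant["id"], user["id"]
```

Nothing in the function records progress: every branch is chosen from the status code of the call just made, so a crashed run leaves no adapter state behind and a later run converges.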

4.3 Merge-upsert body discipline (do not wipe what you don’t own)

The PUT upserts use merge semantics: provided fields replace, omitted fields are unchanged. This gives the adapter one hard rule:

On the warm path, the adapter MUST send only the fields it owns — enrichment attributes (email, display_name). It MUST NOT send role_ids, default_repository_id, storage, or metadata it did not set, or it will silently clobber operator-made assignments on every cache-cold request.

Role membership is therefore granted exclusively through the dedicated idempotent assignUserRole endpoint on the creation path (§4.2 step 4) — never via role_ids in a PUT body. Operators remain free to add/remove roles afterwards; the adapter never fights them.
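
To make the hazard concrete, here is a toy model of merge semantics next to a body builder that sends only adapter-owned fields. `merge_upsert` is a stand-in for the server's behavior as described above ("provided fields replace, omitted fields are unchanged"), not shiftagent code, and the stored record's shape is invented for the example.

```python
ADAPTER_OWNED_FIELDS = ("email", "display_name")


def user_upsert_body(identity):
    """Build the PUT body from adapter-owned enrichment fields only."""
    return {f: identity[f] for f in ADAPTER_OWNED_FIELDS if identity.get(f)}


def merge_upsert(stored, body):
    """Toy model of merge semantics: provided fields replace, omitted fields stay."""
    return {**stored, **body}


# An operator has assigned an extra role out-of-band (illustrative record).
stored = {"email": "old@example.com", "role_ids": ["rol_ops"]}

safe = merge_upsert(stored, user_upsert_body(
    {"external_user_id": "acme:user:1", "email": "new@example.com"}))
unsafe = merge_upsert(stored, {"email": "new@example.com", "role_ids": []})
```

In `safe` the operator's `role_ids` survive the enrichment refresh; in `unsafe` they are replaced by the empty list, which is exactly the silent clobbering the rule forbids.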

4.4 Convergence rules (normative)

Every step is idempotent or conflict-recoverable, and individually retryable. There is no transaction and no rollback. A partially provisioned tenant (e.g. tenant exists, role creation failed) is not an error state — a later pass re-enters the chain and converges.
Ordering is fixed — tenant → repository attachment → role → user → role assignment — so at every failure point the visible state is a strict prefix: never a user without a tenant, never a role assignment without a role.
Re-entrancy trigger: besides the 201-cold trigger, the adapter MUST treat downstream signals of incomplete bootstrap — 422 role-required on createConversation, or a listRoles ?name= miss for the default role — as a cue to re-run the bootstrap chain. Because every step is idempotent (PUT attach, 409-recoverable role create, PUT user, PUT role assignment), re-running from the top is always safe. This heals half-completed cold paths, including the case where the 201-winner crashed mid-bootstrap and a 200-loser is the next request to arrive.
Race semantics: concurrent JIT of the same tenant produces exactly one 201; every loser gets 200 with the winner’s record and proceeds. If a loser outruns the winner’s bootstrap and hits 422 role-required, the re-entrancy rule applies; the resulting duplicate createRole resolves via 409 name-conflict + conflicting_resource_id. No adapter-to-adapter coordination exists anywhere in this design.
Retries: idempotent GET/PUT calls are retried once on network error / 5xx with jittered backoff (100–300 ms). Non-idempotent POSTs carry the spec’s optional Idempotency-Key header — derived deterministically for provisioning steps (e.g. sha256(step ‖ external_tenant_id)), and a random UUID per user action for createMessage — so replays deduplicate server-side (24 h replay window, Idempotency-Replayed: true on replays).
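
The retry and idempotency-key rule can be sketched as follows. The unit-separator byte used to join the hashed fields, the hex encoding of the digest, and the `UpstreamError` type are assumptions of this sketch; the source only specifies `sha256(step ‖ external_tenant_id)`, a random UUID per user action, and a single retry with 100–300 ms jitter.

```python
import hashlib
import random
import time
import uuid


def provisioning_idempotency_key(step, external_tenant_id):
    """Deterministic key: every replay of the same step for the same tenant matches."""
    material = f"{step}\x1f{external_tenant_id}".encode("utf-8")   # \x1f keeps fields unambiguous
    return hashlib.sha256(material).hexdigest()


def message_idempotency_key():
    """Random key, generated once per user action and reused on its retries."""
    return str(uuid.uuid4())


class UpstreamError(Exception):
    """Hypothetical marker for a network error or 5xx from shiftagent."""


def call_with_one_retry(call, *, sleep=time.sleep, jitter=random.uniform):
    """Retry exactly once, after a jittered 100-300 ms pause; a second failure propagates."""
    try:
        return call()
    except UpstreamError:
        sleep(jitter(0.100, 0.300))
        return call()
```

A POST such as `createRole` would send `provisioning_idempotency_key("createRole", external_tenant_id)` in its `Idempotency-Key` header, so two adapter instances replaying the same step produce the same key without coordinating.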

4.5 Failure modes (host-facing behavior)

Failure	Adapter behavior	Surfaces to host as
shiftagent unreachable / 5xx after retry	Fail fast, no queuing (zero storage — there is nowhere safe to park a request)	`503` + `Retry-After`, problem type `…/upstream-unavailable`
Partial provision (crash mid-bootstrap)	Nothing stored; the next request that needs the missing piece re-runs the chain (§4.4) and converges	Transparent (one slower request)
Concurrent JIT of the same tenant	200-loser adopts winner’s record; role-create races resolve via 409 + `conflicting_resource_id`	Transparent
User deactivated in shiftagent	`tokenExchange` (or a forwarded call) returns 403 → drop cached JWT, do not re-provision — deactivated ≠ absent (§5.4)	`403`, problem type `…/user-revoked`
Host JWT valid but tenant suspended/deleted in shiftagent	Default policy: refuse; re-provisioning on delete is a deliberate operator/policy decision, not automatic	`403`, problem type `…/tenant-suspended`
Sandbox capacity exhausted (`429 capacity-exhausted`)	Honor the host’s `on_capacity` choice: `reject` → propagate 429 + `Retry-After`; `hold` → pass through the stream’s `queued` events	`429`, or a stream that emits `queued` then proceeds
Rate limit from shiftagent (`429 rate-limited`)	Propagate with `Retry-After`	`429`
Streaming reply interrupted	Terminate the NDJSON pass-through; the stream’s monotonic `seq` and mandatory terminal event let the host detect truncation; message history in shiftagent remains authoritative	Truncated stream + terminal `error` event where possible
Approval expired / denied	Message ends `failed` with a problem-typed `error` event (pass-through)	Stream `error` event

5. Lifecycle reconciliation

Provisioning is lazy (JIT, §4), but deprovisioning cannot be: a tenant offboarded from the host, or a user removed there, must stop existing (or stop working) in shiftagent without anyone remembering to clean up. Three mechanisms compose, each covering the others’ gaps; the adapter implements all three, and none of them requires adapter storage:

Mechanism	Latency	Requires	Role
Periodic sweep (§5.1)	Hours (cadence-bound)	Host directory enumeration API	Safety net — catches everything, eventually
Host webhook push (§5.3)	Seconds	Host lifecycle events	Optimization — immediate, but lossy (webhooks get dropped)
Lazy enforcement (§5.4)	Next token exchange (at most 15 min after deactivation, per the JWT cache cap in §2.2)	Nothing	Backstop: deactivated identities can’t get tokens

5.1 Periodic sweep

A scheduled job (recommended: a Kubernetes CronJob running the adapter image in sweep mode — §9.2 — daily, off-peak by default; cadence configurable) that reconciles shiftagent’s view against the host’s system of record:

Enumerate shiftagent — page listTenants (GET /tenants) with cursor pagination (starting_after, page size ~100) collecting every child tenant’s external_id; then, per tenant, page listTenantUsers (GET /tenants/{tenant_id}/users) collecting user external_ids. (Cross-tenant listUsers with ?tenant_id= filters is an equivalent alternative; per-tenant paging keeps memory bounded on large installs.)
Enumerate the host — via the host’s directory API. This is the second and final client-specific seam after deriveIdentity(): a pair of functions listHostTenants() → external_tenant_id[] and listHostUsers(tenant) → external_user_id[] behind an interface, instantiated per client.
Diff — strip the configured namespace prefix, compare sets.
Deprovision what exists in shiftagent but not in the host:
- tenant gone → deleteTenantByExternalId (DELETE /tenants/by-external-id/{external_id}) — or suspend-first per policy (§5.2)
- user gone → deactivateUser (DELETE /users/{user_id})

Statelessness: the sweep keeps no checkpoint. Each run enumerates fully; an interrupted run simply leaves work for the next one. Duplicate concurrent sweeps are harmless (every deprovision call is idempotent) but wasteful — running the sweep as a single CronJob invocation rather than an in-process timer on every replica avoids them without leader election.

Guardrails (normative): a transient host-API failure that returns a partial or empty enumeration MUST NOT trigger mass deprovisioning. The sweep:

MUST abort without deprovisioning anything if host enumeration did not complete successfully end-to-end;
MUST abort (and alert) if the computed deletion delta exceeds a configured threshold (SWEEP_MAX_DELTA_PERCENT, default 10%) — a 40%-of-tenants-vanished diff is far more likely a host API incident than a real offboarding wave;
SHOULD emit a dry-run report metric/log line before acting, and support a --dry-run mode for operator rehearsal.
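
The diff step and the first two guardrails can be sketched as a single planning function that either returns the removal set or aborts. The function and exception names are hypothetical, both inputs are assumed to be in the same form (either both namespaced or both stripped), and measuring the delta against the number of identities currently in shiftagent is an assumption about how `SWEEP_MAX_DELTA_PERCENT` is computed.

```python
class SweepAborted(Exception):
    """Raised when a guardrail forbids acting on this run's diff."""


def plan_deprovision(shiftagent_ids, host_ids, *,
                     host_enumeration_complete, max_delta_percent=10.0):
    """Return the IDs present in shiftagent but absent from the host, or abort."""
    if not host_enumeration_complete:
        raise SweepAborted("host enumeration did not complete end-to-end")

    current = set(shiftagent_ids)
    to_remove = current - set(host_ids)

    if current:
        delta_percent = 100.0 * len(to_remove) / len(current)
        if delta_percent > max_delta_percent:
            raise SweepAborted(
                f"deletion delta {delta_percent:.1f}% exceeds {max_delta_percent}%")
    return sorted(to_remove)
```

In `--dry-run` mode the caller would log the returned list and stop; otherwise it feeds each ID to the suspend or deactivate call chosen by the policy in §5.2.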

5.2 Suspend-vs-delete policy

Hard deletion is irreversible and destroys conversation history, storage, and audit context. The default policy is soft first:

Subject	Immediate action	Terminal action
Tenant missing from host	`updateTenant` (`PATCH /tenants/{tenant_id}`) → `status: suspended` — all activity stops at once (token exchange and business calls fail `403 tenant-suspended`)	`deleteTenantByExternalId` after a grace window (`SWEEP_GRACE_DAYS`, default 30) of consecutive sweeps still showing it absent
User missing from host	`deactivateUser` — deactivation preserves conversations and audit trails while cutting access	Hard user deletion is an operator decision, never an adapter action

Clients that require immediate hard deletion (e.g. contractual data-residency terms) MAY set SWEEP_DEPROVISION_MODE=delete, accepting the blast-radius trade-off; the guardrails in §5.1 still apply. Note deleteTenant/deleteTenantByExternalId are guarded server-side per the spec — deletion of a tenant with live dependents follows the spec’s documented semantics, not adapter improvisation.

5.3 Host webhook push (optional, if the host offers lifecycle events)

If the host system can emit lifecycle events (“tenant removed”, “user deactivated”), the adapter exposes a webhook endpoint on its host-facing surface:

Authenticated by a host-signed webhook signature (shared secret or asymmetric, per host convention — WEBHOOK_SIGNING_SECRET); unauthenticated or badly-signed events are rejected 401 and never acted on.
Handler maps the event to the same idempotent calls the sweep uses (deleteTenantByExternalId / deactivateUser / suspend), applying the same suspend-first policy. Because the calls are idempotent, webhook + sweep overlap is harmless.
The webhook is an optimization for immediacy, never the mechanism of record — delivery is at-most-once from most hosts, so the sweep remains the safety net.

Host webhook availability and event vocabulary are deployment decisions (§11).

5.4 Lazy enforcement & the deactivated ≠ absent invariant

tokenExchange fails with 403 for a deactivated user, and the adapter maps that to 403 …/user-revoked, drops its cached JWT — and never re-provisions. This invariant is load-bearing:

Absent (no record for the external_id) → JIT provisioning applies; create away.
Deactivated (record exists, status: deactivated) → access was revoked; recreating the user via the upsert path would silently undo an offboarding. getUserByExternalId makes the distinction visible: it returns the record with its status rather than 404.

The adapter MUST check for this distinction wherever it might be tempted to provision: a 403 from tokenExchange or a deactivated status on the user upsert response is a terminal “revoked” state for that identity until an operator (or a host lifecycle event) says otherwise.
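
The invariant reduces to a three-way decision on the result of `getUserByExternalId`. The sketch below assumes the lookup yields an HTTP status plus a record with a `status` field whose revoked value is `deactivated`, as described above; the function name and the returned labels are illustrative.

```python
def provisioning_decision(lookup_status, user=None):
    """Decide what to do for an identity, given the getUserByExternalId result."""
    if lookup_status == 404:
        return "provision"      # absent: JIT provisioning applies
    if user is not None and user.get("status") == "deactivated":
        return "revoked"        # terminal: answer 403 user-revoked, never recreate
    return "proceed"            # record exists and is usable: go on to token exchange


def on_token_exchange(status, cache, key):
    """A 403 from tokenExchange drops the cached JWT and is treated as revoked."""
    if status == 403:
        cache.pop(key, None)
        return "revoked"
    return "proceed"
```

Neither branch labelled "revoked" ever leads back into the upsert chain; only an operator action or a host lifecycle event changes that state.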

6. AuthN/AuthZ between adapter and shiftagent

6.1 The integration service key

The adapter authenticates to the Integration API with a single integration service key (sk_int_…) — a service-principal credential minted at the integration root tenant, subtree-scoped (§1.3), held only in the adapter’s Kubernetes Secret and process memory. At startup (and on demand) the adapter introspects it via getIntegrationSelf to learn root_tenant_id, granted scopes, and registered approver-key fingerprints — failing readiness (§9.3) if the scopes don’t cover the table below.

Least-privilege scope set (what the runtime key needs, and why):

Operations	Why the adapter needs them
`upsertTenantByExternalId`, `getTenantByExternalId`, `updateTenant`, `listTenants`	JIT tenant provisioning; sweep enumeration; suspend policy
`deleteTenantByExternalId`	Sweep / webhook deprovisioning
`listRepositories`, `listTenantRepositories`, `attachTenantRepository`, `detachTenantRepository`	Resolve the configured default registry entry; bootstrap attachment
`createRole`, `getRole`, `listRoles`, `listRoleSkills`	Bootstrap default role; 409 recovery; effective-skill surfaces
`upsertUserByExternalId`, `getUserByExternalId`, `listTenantUsers`, `listUsers`, `deactivateUser`	JIT user provisioning; sweep; lazy enforcement
`assignUserRole`, `unassignUserRole`, `listUserRoles`, `listUserSkills`	Default-role grant at creation; “skills this user can access” convenience
`tokenExchange`	Per-user context (§6.3)
`listConversations` (with `?tenant_id=`)	Tenant-wide administrative conversation listing
`listApprovals`, `getApproval`, `approveApproval`, `denyApproval`	HITL transport (§7 — the key transports decisions; it cannot mint them)
`putConversationSecrets`, `listConversationSecrets`, `deleteConversationSecret`	Write-only secrets pass-through; alias listing (never values)
`getHealth`, `getCapacity`, `getIntegrationSelf`	Readiness probe; capacity pre-check; key introspection

Explicitly NOT granted to the runtime key: repository-registry writes (registerRepository, syncRepository, createRepositorySkill), credential-registry writes (createCredential, deleteCredential), API-key minting, and any governance/billing administration. Skill authoring is likewise out — skills come from repositories, not from the adapter.

Rotation: two keys may be live simultaneously (old + new) during rotation; the adapter reads the key from its Secret at startup and on SIGHUP/rolling restart, so rotation is a Secret update + rolling restart with zero downtime.

6.2 Bootstrap vs runtime key split

The one-time install-time setup — createCredential (git PAT, write-only) followed by registerRepository (name, URL, branch, credential reference) — is performed by the operator with a separate, short-lived bootstrap credential, not by the running adapter. This keeps raw secret material (the git PAT) out of the adapter’s steady-state privilege set entirely: at runtime, repositories are referenced by rep_… ID and credentials by crd_… ID; plaintext is never readable back through any API the adapter can call.

6.3 Per-user context — token exchange, not acting-as (decision record)

Chosen: the service key calls tokenExchange (POST /auth/token-exchange) with the derived external IDs, receiving a short-lived platform JWT (≤ 1 h issued; the adapter caches it for at most 15 min, §2.2) for that user. All user-context calls — conversations, messages, “my skills” — use that JWT. Every downstream audit record, skill resolution, and role check runs as the real user with zero special-casing; deactivation bites at the next exchange.
Rejected: service-key acting-as (an X-Act-As-User header on every call). It is a confused-deputy surface, requires trusted-header handling on every shiftagent route, weakens audit attribution, and diverges from the platform’s resolve-principal-from-the-bearer architecture.
Tenant-scope calls that have no user — tenant-wide conversation listing, provisioning, reconciliation — run under the service key directly, scoped to the resolved child tenant.

7. HITL & secrets duties

This section covers the adapter’s responsibilities in the two zero-trust flows that pass through it: human-in-the-loop approvals and per-message secret material. The unifying rule: the adapter is a conduit with cryptographically enforced limits — it can transport approvals but not mint them, and it can forward secrets but never see them again.

7.1 Surfacing approvals to the host

Mid-run, the agent may raise an Approval (apr_…): the NDJSON stream emits an approval_required event whose payload is the full Approval object — including requested_items[] describing what the agent needs ({kind: action|secret, description, alias?}) — and the message parks in awaiting_approval. Adapter duties:

Pass the approval_required event through to the host UX unbuffered and unmodified, like every other stream event (§9.5).
For hosts that resolve approvals out-of-band (a notifications queue, an approvals inbox rather than the live stream), expose the polling surfaces: listApprovals (GET /approvals?status=pending&tenant_id=…) and getApproval (GET /approvals/{approval_id}).
Surface expires_at prominently — an expired approval fails the parked message; the host UX should know the clock is running.

7.2 Transporting signed decisions — two-party control

Approval resolution is not an authenticated-caller privilege; it is a signed assertion:

The decision is signed — HMAC-SHA256 or Ed25519 — over the canonical payload {approval_id, decision, exp} with a per-tenant approver key.
The approver key is registered with shiftagent out-of-band (operator setup; getIntegrationSelf exposes registered key fingerprints for verification wiring, never the key material itself).
The approver key lives with the host’s approval authority — the human-facing system where an authorized person clicks approve/deny. The adapter never holds it, in any form, at any time.
The adapter calls approveApproval (POST /approvals/{approval_id}/approve) or denyApproval (POST /approvals/{approval_id}/deny) carrying the host-produced {signature, note?, secrets?} body — transport, verbatim.

This yields two-party control. A compromised adapter (or a leaked sk_int_ key) can transport whatever it likes but cannot produce a valid signature, so self-approval is impossible; conversely, a stolen approver key alone cannot reach the API without the service key. shiftagent verifies the signature against the registered key; a bad or missing signature is rejected (403 approval-signature-invalid) no matter how privileged the bearer. The exp inside the signed payload bounds replay of a captured signature, and an approval’s state transition makes each approval_id single-use.
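
To make the signed-assertion idea concrete, here is a rough sketch of the HMAC-SHA256 variant as it might run inside the host approval authority (never inside the adapter). Several details are assumptions for illustration only: the canonical serialization (sorted-key compact JSON), the hex encoding of the signature, the literal decision values, and the function names. The normative canonical form is whatever the API reference defines. The Ed25519 variant is not shown.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def canonical_payload(approval_id: str, decision: str, exp: int) -> bytes:
    # ASSUMPTION: sorted keys and no whitespace. The real canonical form
    # must be taken from the API reference, not from this sketch.
    payload = {"approval_id": approval_id, "decision": decision, "exp": exp}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_decision(approver_key: bytes, approval_id: str, decision: str, exp: int) -> str:
    """Host approval authority side: sign the canonical payload."""
    message = canonical_payload(approval_id, decision, exp)
    return hmac.new(approver_key, message, hashlib.sha256).hexdigest()


def verify_decision(
    approver_key: bytes,
    approval_id: str,
    decision: str,
    exp: int,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Verifier side: what a holder of the registered key would check."""
    current = time.time() if now is None else now
    if current >= exp:
        return False  # exp bounds replay of a captured signature
    expected = sign_decision(approver_key, approval_id, decision, exp)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Note what the adapter's part looks like in this picture: it receives the signature from the host and forwards it in the approve or deny body. None of the functions above belong in adapter code, since each one requires the approver key.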

On approve, the stream emits resumed and continues to message_end; sticky sandboxes keep the run warm until expires_at, pooled runs checkpoint and re-hydrate. Denied or expired → the message ends failed with a problem-typed error event. (Full event semantics: Streaming Contract.)

7.3 The approval-supplies-secret weave

requested_items with kind: "secret" let the agent ask for a credential it does not have (e.g. “I need an API key for the CRM, alias CRM_API_KEY”). The approve body’s optional secrets map supplies it — vaulted on arrival, conversation-scoped, exactly like per-message secrets (§7.4). The guarantee this composition delivers: even a rogue agent can exfiltrate nothing — it never sees secret values (only aliases), and it cannot self-approve to obtain more.

7.4 Secrets & env pass-through (write-only conduit)

Two distinct channels ride on createMessage (and putConversationSecrets / deleteConversationSecret for out-of-band management):

env — plaintext, non-secret run parameters. The adapter MUST document to the host — loudly, as the OpenAPI spec itself does — that env is not a secrets channel: values in env are visible to the run.
secrets — write-only alias → value pairs, vaulted at the shiftagent boundary, conversation-scoped. The run sees only aliases ({{secret:CRM_API_KEY}}); the egress proxy resolves aliases to real values at the network boundary on outbound calls. No API response ever echoes a value — listConversationSecrets returns aliases and metadata only.

Adapter duties on this path, all normative:

Forward secrets (message bodies, approve bodies, putConversationSecrets payloads) verbatim and write-only: never logged (structured-logging redaction on the secrets field by name, plus authorization/token — §8), never persisted, never cached beyond the in-flight request buffer, never echoed in any adapter response or error detail.
Never transform, inspect, or validate secret values — the adapter has no business knowing what they are. (Client-specific hardening — e.g. host-side pattern checks that env doesn’t carry obvious secret material — belongs in the adapter layer per the hardening philosophy in §8, but operates on the env channel, not by reading secrets.)
Surface the alias inventory (listConversationSecrets) so host UXes can show what is vaulted without ever being able to show the values.
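
The by-name log redaction named in the first duty could look roughly like the sketch below. The function name `redact`, the placeholder string, and the case-insensitive key match are assumptions for the example; the field names come from the duty above. The function never reads secret values: it replaces the whole subtree under a matching key, so it stays on the right side of the "never inspect" rule.

```python
from typing import Any

# Field names redacted wherever they appear in a structured log record.
REDACTED_KEYS = frozenset({"secrets", "authorization", "token"})
PLACEHOLDER = "[REDACTED]"  # hypothetical marker; any fixed string works


def redact(value: Any) -> Any:
    """Return a copy of a log record with sensitive fields replaced by name.

    The input is not mutated, so the in-flight request body that still has
    to be forwarded upstream is left intact.
    """
    if isinstance(value, dict):
        return {
            key: PLACEHOLDER if str(key).lower() in REDACTED_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item) for item in value]
    return value
```

Replacing the whole `secrets` map also hides the aliases in logs; hosts that need the alias inventory get it from listConversationSecrets rather than from adapter logs.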

8. Security hardening checklist

Hardening philosophy (normative): anything client-specific that needs hardening — host-quirk validation, extra rate shaping, bespoke audit hooks, webhook signature schemes — happens in the adapter layer. shiftagent stays generic; its “spiritually aligned” extension points are vaulted credentials and custom skills, not host-specific code paths.

9. Operational spec

9.1 Packaging: gateway service (decision record)

Option	Verdict	Why
Library embedded in the host codebase	Rejected	Couples release cadence to the host’s deploy train; the host team would have to hold the shiftagent service key; claim-mapping updates would need host redeploys
Sidecar per host pod	Rejected	The host is typically a large multi-service system, not one pod; N sidecars = N key copies and N JWKS caches for zero isolation gain
Standalone gateway service	Chosen	One deployable, one key, one place to rotate and observe; stateless → trivially HA (≥ 2 replicas, HPA on CPU); matches the zero-storage philosophy

9.2 Deployment picture

The adapter deploys into the client’s cluster alongside the existing shiftagent Helm install — a sibling Deployment + Service in the same namespace (optionally packaged as a subchart), exposed to the host network only (Ingress or private link). The reconciliation sweep (§5.1) runs as a CronJob invoking the same image in sweep mode.

flowchart LR
    subgraph hostnet["Host network"]
        HOSTAPP["Host application / UX"]
        IDP["Host IdP (JWKS)"]
        APPROVER["Host approval authority<br/>(holds the approver key)"]
        DIRECTORY["Host directory /<br/>lifecycle events"]
    end
    subgraph cluster["Client's on-prem Kubernetes cluster"]
        subgraph ns["shiftagent namespace (Helm release)"]
            ADAPTER["Adapter Deployment<br/>stateless, ≥ 2 replicas"]
            SWEEP["Adapter CronJob<br/>(reconciliation sweep)"]
            API["shiftagent Integration API"]
            VAULT["Vault + egress proxy<br/>(alias resolution)"]
            POOL["Sandbox pool<br/>(warm + sticky)"]
            PG[("Postgres")]
        end
    end
    HOSTAPP -- "host JWT" --> ADAPTER
    APPROVER -. "signed approval assertion<br/>(via host UX)" .-> HOSTAPP
    ADAPTER -- "JWKS fetch (cached)" --> IDP
    DIRECTORY -. "lifecycle webhook (optional)" .-> ADAPTER
    ADAPTER -- "sk_int_ key / platform JWT" --> API
    SWEEP -- "listTenants / listUsers diff" --> API
    SWEEP -. "enumerate live tenants + users" .-> DIRECTORY
    API --> PG
    API --> POOL
    POOL --> VAULT

9.3 Health & readiness

/healthz — liveness: the process is up.
/readyz — readiness: JWKS reachable (or cached), getHealth (GET /health) answering, and getIntegrationSelf scope check passed (§6.1). Not-ready instances are rotated out by the Service without any state loss — there is none to lose.
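
One possible shape for the readiness aggregation is sketched here. The probe names and the 200/503 status mapping are assumptions (503 for not-ready is a common convention, not something this spec states); each probe is an injected callable so the sketch stays independent of any HTTP client.

```python
from typing import Callable, Dict, Tuple

Probe = Callable[[], bool]


def check_readiness(probes: Dict[str, Probe]) -> Tuple[int, Dict[str, bool]]:
    """Run every probe and report ready only if all of them pass.

    A probe that raises is treated as failed, so an unreachable dependency
    marks the instance not-ready instead of crashing the handler.
    """
    results: Dict[str, bool] = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, results
```

A /readyz handler under these assumptions would pass three probes: one that succeeds when the JWKS is reachable or still cached, one wrapping getHealth, and one wrapping the getIntegrationSelf scope check.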

9.4 Configuration surface (env vars only)

Variable	Purpose	Default
`SHIFTAGENT_BASE_URL`	In-cluster Integration API base URL (Service DNS)	— (required)
`SHIFTAGENT_API_KEY`	The `sk_int_` integration key (from a K8s Secret)	— (required)
`HOST_JWKS_URL`	Host IdP JWKS endpoint	— (required)
`HOST_ISSUER`	Exact `iss` to require	— (required)
`HOST_AUDIENCE`	Required `aud` value	— (required)
`EXTERNAL_ID_NAMESPACE`	Namespace prefix for derived external IDs (§3.3)	— (required)
`DEFAULT_REPOSITORY_NAME`	Registry entry attached as each new tenant’s default (§4.2)	— (required)
`DEFAULT_ROLE_NAME`	Well-known role slug ensured per tenant	`host-default`
`DEFAULT_ROLE_SKILL_ACCESS`	Bootstrap role’s `skill_access` policy	`all`
`TOKEN_CACHE_TTL_SECONDS`	Platform-JWT cache hard cap (§2.2)	`900`
`TENANT_CACHE_TTL_SECONDS`	external→internal tenant-ID cache TTL	`300`
`JWKS_CACHE_TTL_SECONDS`	JWKS fallback TTL when no `Cache-Control`	`900`
`SWEEP_DEPROVISION_MODE`	`suspend-then-delete` \| `delete` (§5.2)	`suspend-then-delete`
`SWEEP_GRACE_DAYS`	Suspend → delete grace window	`30`
`SWEEP_MAX_DELTA_PERCENT`	Sweep abort threshold (§5.1)	`10`
`WEBHOOK_SIGNING_SECRET`	Verifies host lifecycle webhooks (§5.3)	— (optional; webhook disabled without it)
`ERROR_TYPE_BASE_URL`	Base URI for host-facing RFC 9457 `type` values (§10)	— (required)
`UPSTREAM_TIMEOUT_MS`	Per-call Integration API timeout (non-streaming)	`10000`
`STREAM_IDLE_TIMEOUT_MS`	Max silence on a pass-through stream before terminating	`120000`

No config files, no flags, no runtime-mutable settings — the config surface is the environment, which keeps instances interchangeable and rotation auditable.
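
A fail-fast loader for the table above might be sketched like this. The variable names and defaults are taken from the table; the function name `load_config`, the returned dictionary shape, and the choice to raise `ValueError` at startup are assumptions for the example.

```python
import os
from typing import Dict, Mapping

REQUIRED = (
    "SHIFTAGENT_BASE_URL", "SHIFTAGENT_API_KEY", "HOST_JWKS_URL", "HOST_ISSUER",
    "HOST_AUDIENCE", "EXTERNAL_ID_NAMESPACE", "DEFAULT_REPOSITORY_NAME",
    "ERROR_TYPE_BASE_URL",
)
DEFAULTS = {
    "DEFAULT_ROLE_NAME": "host-default",
    "DEFAULT_ROLE_SKILL_ACCESS": "all",
    "TOKEN_CACHE_TTL_SECONDS": "900",
    "TENANT_CACHE_TTL_SECONDS": "300",
    "JWKS_CACHE_TTL_SECONDS": "900",
    "SWEEP_DEPROVISION_MODE": "suspend-then-delete",
    "SWEEP_GRACE_DAYS": "30",
    "SWEEP_MAX_DELTA_PERCENT": "10",
    "UPSTREAM_TIMEOUT_MS": "10000",
    "STREAM_IDLE_TIMEOUT_MS": "120000",
}
INTEGER_KEYS = (
    "TOKEN_CACHE_TTL_SECONDS", "TENANT_CACHE_TTL_SECONDS", "JWKS_CACHE_TTL_SECONDS",
    "SWEEP_GRACE_DAYS", "SWEEP_MAX_DELTA_PERCENT", "UPSTREAM_TIMEOUT_MS",
    "STREAM_IDLE_TIMEOUT_MS",
)
SWEEP_MODES = ("suspend-then-delete", "delete")


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, object]:
    """Read the whole configuration surface once, at startup, and fail fast."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    config: Dict[str, object] = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        config[name] = env.get(name, default)
    for name in INTEGER_KEYS:
        config[name] = int(str(config[name]))  # raises ValueError on non-numeric input
    if config["SWEEP_DEPROVISION_MODE"] not in SWEEP_MODES:
        raise ValueError("SWEEP_DEPROVISION_MODE must be one of " + ", ".join(SWEEP_MODES))
    # Optional: the lifecycle webhook stays disabled when this is unset.
    config["WEBHOOK_SIGNING_SECRET"] = env.get("WEBHOOK_SIGNING_SECRET") or None
    return config
```

Reading the environment once and refusing to start on a bad value keeps every replica's behavior identical for its whole lifetime, which is the property the paragraph above asks for.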

9.5 Streaming pass-through (explicit, because it silently breaks)

The NDJSON stream is the product surface the host user actually feels; a naive proxy config breaks it invisibly. Normative guidance:

Flush per line. The adapter forwards each NDJSON event line as it arrives — no response buffering, no compression that introduces buffering (disable gzip on the streaming route or use flush-friendly settings).
Disable buffering on every hop the client controls: ingress annotations (e.g. proxy-buffering: off for NGINX-class ingresses), any service mesh, and the adapter’s own HTTP framework defaults.
Timeouts must outlast the longest legitimate wait: idle timeouts on the streaming path (STREAM_IDLE_TIMEOUT_MS and every proxy hop in between) must accommodate the documented maximum hold time for on_capacity=hold (queued events count as traffic) and, for streams held open across HITL waits, approval parking up to expires_at.
Never reorder, coalesce, or synthesize events. The stream’s monotonic seq is the host’s truncation detector; the adapter passes events through verbatim and, on upstream failure, terminates the stream (the terminal-event guarantee is shiftagent’s; the adapter must not fabricate events it didn’t receive).
Non-streaming mode (?stream=false on createMessage) is the fallback for host paths that cannot consume streams; the adapter exposes both.
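
The verbatim, line-at-a-time rule can be illustrated with a generator. This is a sketch under assumptions: `pass_through` and `first_seq_gap` are hypothetical names, the upstream is modeled as any iterable of NDJSON lines, and the host-side gap check assumes seq increases by exactly 1 per event (the text above only says it is monotonic; the Streaming Contract is the authority).

```python
import json
from typing import Iterable, Iterator, Optional


def pass_through(upstream_lines: Iterable[bytes]) -> Iterator[bytes]:
    """Adapter side: yield each NDJSON line as soon as it arrives, unchanged.

    No parsing, batching, or rewriting happens here. If the upstream iterator
    raises, the exception propagates and the stream simply ends; nothing is
    synthesized in its place.
    """
    for line in upstream_lines:
        yield line


def first_seq_gap(lines: Iterable[bytes]) -> Optional[int]:
    """Host side: return the first missing seq value, or None if contiguous.

    ASSUMPTION: seq increments by 1 per event.
    """
    expected: Optional[int] = None
    for raw in lines:
        seq = json.loads(raw)["seq"]
        if expected is not None and seq != expected:
            return expected
        expected = seq + 1
    return None
```

A generator keeps at most one line in flight, which is the behavior the adapter's HTTP framework also has to preserve: the framework must flush each yielded chunk instead of collecting the response body.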

9.6 Observability

Metrics (Prometheus-style; no payload contents anywhere):

adapter_requests_total{route,status} / adapter_request_duration_seconds
adapter_upstream_latency_seconds{operation_id}
adapter_provision_steps_total{step,outcome} — cold-path visibility per §4.2 step
adapter_token_exchanges_total{outcome}, adapter_cache_events_total{cache,hit|miss}
adapter_stream_events_total{type} — including queued, approval_required, error
adapter_approvals_transported_total{decision}
adapter_sweep_last_success_timestamp, adapter_sweep_deprovisioned_total{kind,action}, adapter_sweep_aborts_total{reason}
adapter_webhook_events_total{type,outcome}

Logs are structured, with the redaction set from §8 applied globally. Traces (optional) propagate X-Request-Id into the Integration API and back out to the host.

10. Error contract (host-facing)

All adapter-originated errors are RFC 9457 application/problem+json (house style), with type URIs under the configured ERROR_TYPE_BASE_URL. Adapter-originated types:

Type (suffix)	Status	When
`host-token-invalid`	401	Host JWT absent, expired, bad signature, wrong `iss`/`aud`, or missing identity claims (§3.1). shiftagent is never called.
`user-revoked`	403	`tokenExchange` or a forwarded call reports the user deactivated; cached JWT dropped; no re-provisioning (§5.4)
`tenant-suspended`	403	The tenant is suspended or deleted in shiftagent and policy says refuse (§4.5, §5.2)
`upstream-unavailable`	503	Integration API unreachable / 5xx after the retry budget; carries `Retry-After` (§4.5)
`rate-limited`	429	Adapter’s own per-user token bucket tripped, or pass-through of shiftagent `rate-limited` (with `Retry-After`)

Example body:

{
  "type": "https://errors.adapter.example/upstream-unavailable",
  "title": "Upstream unavailable",
  "status": 503,
  "detail": "The integration API did not respond after retries.",
  "request_id": "req_8f14e45fceea"
}
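
A small builder for these bodies might look like the sketch below. The statuses and type suffixes come from the table above; the function name `build_problem`, the titles other than "Upstream unavailable", and the returned (status, headers, body) tuple are assumptions for the example.

```python
from typing import Any, Dict, Optional, Tuple

# suffix -> (HTTP status, title). Titles other than "Upstream unavailable"
# are illustrative wording, not taken from the spec.
ADAPTER_PROBLEMS: Dict[str, Tuple[int, str]] = {
    "host-token-invalid": (401, "Host token invalid"),
    "user-revoked": (403, "User revoked"),
    "tenant-suspended": (403, "Tenant suspended"),
    "upstream-unavailable": (503, "Upstream unavailable"),
    "rate-limited": (429, "Rate limited"),
}


def build_problem(
    base_url: str,
    suffix: str,
    detail: str,
    request_id: str,
    retry_after_seconds: Optional[int] = None,
) -> Tuple[int, Dict[str, str], Dict[str, Any]]:
    """Build an adapter-originated RFC 9457 response as (status, headers, body)."""
    status, title = ADAPTER_PROBLEMS[suffix]  # KeyError on an unknown suffix
    headers = {"Content-Type": "application/problem+json"}
    if retry_after_seconds is not None:
        headers["Retry-After"] = str(retry_after_seconds)
    body = {
        "type": base_url.rstrip("/") + "/" + suffix,
        "title": title,
        "status": status,
        "detail": detail,  # must never contain secret values or tokens
        "request_id": request_id,
    }
    return status, headers, body
```

Keeping the suffix table in one place means an unknown error type fails loudly in development instead of leaking an ad hoc `type` URI to the host.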

Pass-through policy for business calls: Integration API 4xx problems are passed through to the host body-intact; they contain no internal secrets by contract and carry the request_id needed for cross-system support. This covers validation-error (422), role-required (422), not-found (404), name-conflict / external-id-conflict / cross-tenant / conversation-archived / resource-in-use / idempotency-key-conflict (409), insufficient-scope (403), approval-signature-invalid (403), and capacity-exhausted (429). The exceptions are the problems a lifecycle rule maps: the 403s that become user-revoked / tenant-suspended above, and the provisioning 409s the adapter consumes internally per §4.4 and never surfaces. capacity-exhausted handling follows the host’s on_capacity choice (§4.5).

11. Decisions to make for your deployment

Resolve these items with your shiftagent operator before go-live — each one pins a configuration value or a client-specific seam in the adapter you build:

Host IdP token sample. The §3.4 example is illustrative. Obtain a real token from your host IdP (issuer, algorithm, exact claim names for tenant and user) to pin deriveIdentity() — the mapping is isolated behind one function precisely so a wrong assumption costs one function, but it must be confirmed before go-live.
Revocation latency requirement. Decide whether the 15-minute cached-platform-JWT window (§2.2, §8) meets your offboarding requirements, or whether the cache-evict admin endpoint must be part of your adapter from day one.
Host webhook availability & event vocabulary. Determine whether the host system emits lifecycle events the adapter can subscribe to for push deprovisioning (§5.3), and with what event types, delivery guarantees, and signature scheme. Without it, the sweep + lazy enforcement pair is the whole story.
Integration-root topology. Confirm one integration root tenant per install, with every host tenant as a direct child (§1.3) — this determines the service key’s subtree scope and the sweep’s enumeration boundary.
Host directory enumeration for the sweep. Identify the API the host exposes for listHostTenants() / listHostUsers() (§5.1) — completeness guarantees, paging, and rate limits determine sweep cadence and guardrail tuning.
Approver-key custody. Decide which host system acts as the approval authority and holds the per-tenant approver key (§7.2), and register the key with shiftagent out-of-band at install time. The adapter must never possess it in any form.
