Adapter Requirements & Design Spec
This guide defines the requirements, contracts, and operational envelope for the adapter your team builds for its host system. The shiftagent Integration API the adapter consumes is documented in the API reference: the OpenAPI specification at openapi/openapi.yaml is normative for every API operation named here; this document references operations by operationId and never re-specifies request/response schemas.
0. Scope, audience & reading map
What the adapter is, in one sentence: the adapter marries a host system to an on-prem
shiftagent install — host JWT in, derived (external_tenant_id, external_user_id) identity,
shiftagent Integration API calls out. The host never talks to shiftagent directly, and shiftagent
never sees a host token.
Audience: the host-side engineering team building/operating the adapter, and shiftagent operators standing up the on-prem install it talks to.
Keywords: MUST / MUST NOT / SHOULD / MAY are used per RFC 2119. Where this document says “client-specific,” it means: varies per host system and is isolated behind a named seam (§3, §5.2); everything else is host-system-generic.
Related documentation:
| Document | Relationship to this spec |
|---|---|
| openapi/openapi.yaml | Normative API contract. Every operation cited here by operationId is defined there. |
| Integration Guide | Narrative architecture, external-ID conventions, context-gathering pattern. |
| Provisioning Flow | Step-by-step cold/warm walkthrough with sequence diagrams and race semantics. |
| Streaming Contract | NDJSON event reference the adapter passes through (§7, §9.5). |
| Runtime Architecture | Composable runtime: agent types, sandbox security model, filler/capacity semantics. |
1. Role & boundaries
1.1 The adapter is a middleweight client
The adapter is deliberately a middleweight client, the middle ground between two shapes it must not become:
- Not a thin proxy. It carries substantial logic: host-JWT verification and identity derivation, the JIT provisioning orchestration (§4), lifecycle reconciliation sweeps (§5), per-user token exchange and caching (§6), HITL approval transport (§7), and unbuffered NDJSON stream fan-through (§9.5). A naive header-rewriting reverse proxy cannot do this job.
- Not a heavyweight middleware. It holds zero persistence — no database, no durable queue,
no files, no shared cache (§2). It makes no business decisions: which skills a user gets, which
repository a role uses, what an agent may do — all of that is configuration inside shiftagent.
shiftagent’s external_id support on tenants and users exists precisely so the adapter never needs a mapping table.
flowchart LR
HOST["Host system<br/>(fronts all traffic)"] -- "host JWT per request" --> ADAPTER["Adapter<br/>(middleweight, stateless)"]
ADAPTER -- "sk_int_ key +<br/>per-user platform JWT" --> API["shiftagent<br/>Integration API"]
API --> STACK["Sandboxes, vault,<br/>repositories, storage"]
1.2 Boundary of ownership
| Concern | Owner |
|---|---|
| Validating the host JWT, deriving external IDs | Adapter (the only client-specific logic — §3) |
| Storing tenant / user / role / conversation state | shiftagent DB, keyed by external_id on tenants AND users |
| JIT provisioning decisions (“does this tenant/user exist?”) | Adapter orchestrates; shiftagent enforces idempotency (PUT upsert semantics, DB uniqueness as the lock — §4) |
| Repository / role / skill semantics | shiftagent (top-level repository registry; tenant default attachment; role may pin a repository override; effective skills resolved per request) |
| Rendering conversations, streaming to the host UI | Host system, via adapter pass-through of the NDJSON stream |
| Approval decisions (HITL) | Host approval authority — the adapter transports, never decides (§7) |
| Secret values | shiftagent vault — the adapter is a write-only conduit (§7.4) |
| Auth between adapter and shiftagent | Integration service key + per-user token exchange (§6) |
| Lifecycle truth (which tenants/users should exist) | Host system; the adapter reconciles shiftagent to it (§5) |
Explicitly out of scope for the adapter: any business logic; any persistence; any skill authoring; any decision about which skills a user gets (that is role configuration inside shiftagent, owned by the operator); any credential registry writes (§6.2); minting approvals (§7.2).
1.3 Tenant topology
The client’s shiftagent install carries one integration root tenant; every host tenant becomes
a child tenant under it, carrying a namespaced external_id (§3.3). The adapter’s integration key
is minted at that root and is subtree-scoped: it can provision and operate children but can
never touch tenants outside its subtree — tenant isolation by construction, not by adapter
diligence. The adapter discovers its root and scopes at startup via getIntegrationSelf
(GET /integration/self), which returns root_tenant_id, granted scopes, and the registered
approver-key fingerprints (§7.3). Confirm the one-root-per-install topology for your deployment
(§11).
1.4 Conversation duties
For any authenticated host user the adapter exposes, at minimum:
- List conversations + history: listConversations (GET /conversations?user_id=) under the user’s exchanged platform JWT; tenant-wide administrative listing (GET /conversations?tenant_id=) under the service key.
- Start a conversation: createConversation (POST /conversations). shiftagent resolves the context snapshot (role → repository → skills) at creation. For a multi-role user the request MUST carry an explicit role_id or shiftagent replies 422 role-required; surfacing the role choice (or applying a host-side default policy) is an adapter duty.
- Continue a conversation: createMessage (POST /conversations/{conversation_id}/messages), streaming the NDJSON response through unbuffered (§9.5). Message history is retrieved via listMessages (GET /conversations/{conversation_id}/messages).
Runtime knobs (runtime.mode sticky/pooled, on_capacity reject/hold, filler.enabled) are
host-policy choices the adapter passes through verbatim; see
Runtime Architecture for their semantics.
2. The zero-storage philosophy & cache policy
2.1 Principle
The adapter has no database, no durable queue, no files. Every mapping (host tenant →
shiftagent tenant, host user → shiftagent user) lives in shiftagent via external_id, reachable
through getTenantByExternalId / getUserByExternalId and created through the corresponding PUT
upserts. Consequences, stated normatively:
- Adapter instances are interchangeable and horizontally scalable; any instance can serve any request with no session affinity.
- Killing or redeploying the adapter loses nothing. Disaster recovery of the integration is disaster recovery of shiftagent alone.
- There is no sync job and no drift between adapter state and shiftagent state, because the adapter has no state to drift. (Reconciliation in §5 diffs shiftagent against the host, not against any adapter store.)
- Provisioning progress MUST NOT be recorded anywhere: the provisioning chain is convergent by construction (§4.4), so progress state is not just unnecessary — it is a correctness hazard.
2.2 Cache policy (normative)
Caching is allowed but must be in-memory, per-instance, bounded, and safe to lose. A shared cache (Redis or similar) is explicitly NOT introduced: it would be creeping storage and would reintroduce the coordination problems zero-storage eliminates.
| Item | Cacheable? | TTL / invalidation | Why |
|---|---|---|---|
| Host IdP JWKS keys | Yes | Per Cache-Control (fallback 15 min); force-refetch once on unknown kid | Standard OIDC practice; enables key rotation without restart |
| Exchanged platform JWT per (external_tenant_id, external_user_id) | Yes | Until exp − 60 s, hard cap 15 min | Re-derivable via tokenExchange; bounds token-exchange QPS. NOTE: caps revocation latency; see §8 |
| external_tenant_id → tnt_… resolution | Yes | ≤ 5 min; drop on any 404/403 from shiftagent | Mapping is immutable once created; the short TTL only guards against tenant deprovisioning |
| external_user_id → usr_… resolution | Yes (implied by the cached JWT entry) | Same as the JWT entry | Same |
| Capacity snapshot (getCapacity) | Yes, advisory only | ≤ 5 s | Point-in-time pool state; never a substitute for handling 429 capacity-exhausted |
| Role definitions, skill grants, “skills this user can access” | No | — | Must be live — shiftagent resolves capabilities and effective skills per request (listUserSkills, listRoleSkills) |
| User active/deactivated status | No | — | Deactivation must bite on the first request after the cached JWT expires; never cache an “active” verdict beyond the JWT TTL |
| Conversation lists / messages | No | — | Consistency-bearing data |
| Idempotency keys / provisioning progress | No | — | Provisioning is convergent (§4.4); progress state is forbidden |
| Secret values (message secrets, approve-body secrets) | Never | — | Write-only pass-through into the vault (§7.4); MUST NOT be buffered beyond the in-flight request, logged, or cached |
| Approver key material | Never held at all | — | Not a cache question: the adapter never possesses it in any form (§7.3) |
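The platform-JWT row of the table can be sketched as a small per-instance cache. This is only an illustration under stated assumptions: the class name `PlatformJwtCache`, its method names, and the `max_entries` bound are hypothetical, and the token's `exp` is assumed to be available as a Unix timestamp.

```python
import time
from collections import OrderedDict
from typing import Optional

EXP_MARGIN_S = 60       # stop using a token 60 s before its own exp
HARD_CAP_S = 15 * 60    # never keep an entry longer than 15 min


class PlatformJwtCache:
    """Per-instance, bounded, in-memory cache; safe to lose at any time."""

    def __init__(self, max_entries: int = 10_000, clock=time.time):
        self._entries: OrderedDict = OrderedDict()
        self._max_entries = max_entries
        self._clock = clock

    def put(self, tenant_ext_id: str, user_ext_id: str, jwt: str, exp: float) -> None:
        now = self._clock()
        expires_at = min(exp - EXP_MARGIN_S, now + HARD_CAP_S)
        if expires_at <= now:
            return  # already inside the safety margin: not worth caching
        key = (tenant_ext_id, user_ext_id)
        self._entries[key] = (jwt, expires_at)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry

    def get(self, tenant_ext_id: str, user_ext_id: str) -> Optional[str]:
        key = (tenant_ext_id, user_ext_id)
        entry = self._entries.get(key)
        if entry is None:
            return None
        jwt, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return jwt

    def drop(self, tenant_ext_id: str, user_ext_id: str) -> None:
        """Forget an identity, e.g. after a 403/404 from shiftagent."""
        self._entries.pop((tenant_ext_id, user_ext_id), None)
```

Injecting the clock keeps the expiry rule testable without sleeping; a miss simply sends the request down the cache-miss path of §4.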
3. Host JWT contract
3.1 Normative requirements
- The host system sends its own JWT on every request to the adapter: Authorization: Bearer <host_jwt>.
- The adapter validates, in order:
  - Signature against the host IdP’s JWKS, with an alg allow-list: asymmetric algorithms only (e.g. RS256, ES256, EdDSA). none and every HS* algorithm are rejected outright to prevent key-confusion attacks.
  - iss: exact string match against the configured issuer.
  - aud: contains the adapter’s registered audience.
  - exp / nbf / iat: enforced with ±60 s clock skew tolerance, no more.
  - Required identity claims present and non-empty (which claims those are is client-specific, §3.2).
- Key rotation: JWKS keys are selected by kid. An unknown kid triggers exactly one forced JWKS refetch before rejecting with 401; refetches are rate-capped (§8) so a garbage-kid flood cannot become a JWKS-fetch DoS.
- The adapter derives exactly two identity values, then discards the host JWT (it is never forwarded to shiftagent and never logged):
  - external_tenant_id: from the host’s tenant identifier claim
  - external_user_id: from the host’s user identifier claim

  Plus best-effort display attributes for provisioning enrichment: email, display_name. These are optional by design: shiftagent’s bare-minimum JIT contract means tenant and user creation succeed with nothing but the external_id; enrichment can arrive on any later request (PUT merge-upsert semantics, §4.3).
- Failure semantics: an invalid, expired, or absent host JWT → the adapter returns 401 (RFC 9457 application/problem+json, type …/host-token-invalid) without calling shiftagent at all.
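The non-cryptographic checks in the list above can be sketched as one function. This is a hedged illustration, not the adapter's implementation: signature verification against the JWKS is deliberately left to a JOSE library and is not shown, and the names `check_header_and_claims` and `HostTokenInvalid`, plus the default required claim `sub`, are assumptions made for the example.

```python
import time
from typing import Mapping, Optional

ALLOWED_ALGS = {"RS256", "ES256", "EdDSA"}  # asymmetric only; never "none" or HS*
SKEW_S = 60                                  # clock skew tolerance in seconds


class HostTokenInvalid(Exception):
    """Would map to the adapter's 401 host-token-invalid problem response."""


def check_header_and_claims(header: Mapping, claims: Mapping, *, issuer: str,
                            audience: str, required_claims=("sub",),
                            now: Optional[float] = None) -> None:
    now = time.time() if now is None else now
    # Reject disallowed algorithms before any key is even selected.
    if header.get("alg") not in ALLOWED_ALGS:
        raise HostTokenInvalid("alg not allowed")
    if claims.get("iss") != issuer:
        raise HostTokenInvalid("iss mismatch")
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise HostTokenInvalid("aud mismatch")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + SKEW_S:
        raise HostTokenInvalid("expired or missing exp")
    for name in ("nbf", "iat"):
        value = claims.get(name)
        if value is not None and (
            not isinstance(value, (int, float)) or now < value - SKEW_S
        ):
            raise HostTokenInvalid(f"{name} is invalid or in the future")
    for name in required_claims:
        if not claims.get(name):
            raise HostTokenInvalid(f"missing claim {name}")
```

Raising a single exception type keeps the failure mapping trivial: any `HostTokenInvalid` becomes the 401 response and shiftagent is never called.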
3.2 deriveIdentity() — the single client-specific seam
Claim mapping is the ONLY client-specific code on the request path. It is a single pure function behind an interface:

    deriveIdentity(verifiedClaims) → {
      external_tenant_id,  // required
      external_user_id,    // required
      email?,              // best-effort enrichment
      display_name?        // best-effort enrichment
    }

Everything upstream of it (JWKS handling, validation) and downstream of it (provisioning, token exchange, forwarding) is host-system-generic. A wrong assumption about a host’s claim shape costs exactly one function. Per-client instantiations of this function live in each client’s own integration documentation.
3.3 External-ID namespacing at derivation
External IDs are namespaced before they ever reach shiftagent:

    external_tenant_id = "{ns}:tenant:{host_tenant_claim}"
    external_user_id   = "{ns}:user:{host_user_claim}"

where {ns} is adapter configuration (EXTERNAL_ID_NAMESPACE, §9.4). A second host system
integrated later gets its own prefix, making collisions structurally impossible. Namespacing MUST
be enforced from request one — retrofitting a namespace onto already-provisioned unprefixed IDs
would require exactly the data migration the zero-storage design exists to avoid.
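A minimal sketch of one possible `deriveIdentity()` instantiation with namespacing applied at derivation. The claim names `tenant`, `user`, and `name` are hypothetical placeholders, since real claim names are client-specific, and the `Identity` dataclass is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Identity:
    external_tenant_id: str
    external_user_id: str
    email: Optional[str] = None
    display_name: Optional[str] = None


def derive_identity(claims: Mapping[str, str], ns: str) -> Identity:
    """Pure function: verified claims in, namespaced identity out."""
    tenant = claims.get("tenant")  # hypothetical claim name
    user = claims.get("user")      # hypothetical claim name
    if not tenant or not user:
        raise ValueError("required identity claim missing or empty")
    return Identity(
        external_tenant_id=f"{ns}:tenant:{tenant}",
        external_user_id=f"{ns}:user:{user}",
        email=claims.get("email") or None,        # best-effort enrichment
        display_name=claims.get("name") or None,  # best-effort enrichment
    )
```

Because the function takes only verified claims and the configured namespace, swapping it per client touches nothing else on the request path.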
3.4 Illustrative host JWT
Illustrative only. Host IdP token shapes vary; real claim names, issuer, and algorithm are confirmed per client during adapter instantiation (see §11). This example exists to make the derivation concrete, not to specify any particular host system.

    // Example host-IdP-issued JWT payload (namespace "acme" configured on the adapter)
    {
      "iss": "https://idp.host.example",
      "aud": "shiftagent-adapter",
      "sub": "user:29401",       // → external_user_id = "acme:user:29401"
      "org_id": "128231",        // → external_tenant_id = "acme:tenant:128231"
      "email": "dispatcher@acme-field.example",
      "name": "Dana Dispatcher",
      "iat": 1782046400,
      "exp": 1782050000
    }

4. Request lifecycle state machine
Every host request enters the same machine. The provisioning primitive throughout is the
PUT upsert by external ID — upsertTenantByExternalId
(PUT /tenants/by-external-id/{external_id}) and upsertUserByExternalId
(PUT /tenants/{tenant_id}/users/by-external-id/{external_id}) — which returns 201 when it
created the resource and 200 when it already existed. There is no GET-then-POST dance and no
409 on the tenant/user race path: the DB uniqueness constraint is the lock, and a concurrent
loser simply receives 200 with the winner’s record. 409 + conflicting_resource_id recovery
exists only for named sub-resources created with POST (roles via createRole, registry entries
via registerRepository) — see §4.4.
stateDiagram-v2
state "Verify host JWT" as Verify
state "deriveIdentity()" as Derive
state "Platform-JWT cache" as Cache
state "PUT tenant by external ID" as UpsertTenant
state "Attach default repository (PUT)" as AttachRepo
state "Ensure default role (POST)" as EnsureRole
state "Adopt existing role via conflicting_resource_id" as FetchRole
state "PUT user by external ID" as UpsertUser
state "Assign default role (PUT)" as AssignRole
state "Token exchange" as Exchange
state "Forward business call" as Forward
state "Stream / respond" as Stream
[*] --> Verify
Verify --> Reject401 : invalid, expired, or absent
Reject401 --> [*]
Verify --> Derive : signature + claims verified
Derive --> Cache
Cache --> Forward : hit (steady state)
Cache --> UpsertTenant : miss
UpsertTenant --> AttachRepo : 201 tenant created (cold)
UpsertTenant --> UpsertUser : 200 tenant existed (warm)
AttachRepo --> EnsureRole : idempotent attach OK
EnsureRole --> UpsertUser : 201 role created
EnsureRole --> FetchRole : 409 name-conflict
FetchRole --> UpsertUser
UpsertUser --> AssignRole : 201 user created
UpsertUser --> Exchange : 200 user existed
AssignRole --> Exchange : idempotent PUT
Exchange --> Forward : platform JWT cached until exp-60s
Exchange --> Reject403 : 403 user deactivated
Reject403 --> [*]
Forward --> Stream
Stream --> [*]
4.1 Warm path (steady state)
A cached platform JWT exists for (external_tenant_id, external_user_id) → forward the business
call (list conversations, send message, …) with that JWT. One shiftagent round-trip.
On a cache miss with everything already provisioned:
upsertTenantByExternalId (200) → upsertUserByExternalId (200) → tokenExchange → forward —
four round-trips, then cached. The two PUTs on the cache-miss path double as enrichment
refresh: merge-upsert semantics mean the latest email / display_name from the host JWT flow
into shiftagent on every cache-cold request without disturbing anything else (§4.3).
4.2 Cold path (JIT provisioning, ordered and convergent)
Triggered when upsertTenantByExternalId returns 201 (tenant newly created; the body MAY be {}, since
external_id alone is sufficient). The adapter then runs the tenant bootstrap before
continuing:
1. Attach default repository: attachTenantRepository (PUT /tenants/{tenant_id}/repositories/{repository_id}) with {is_default: true}, pointing at the pre-registered registry entry named by adapter configuration (DEFAULT_REPOSITORY_NAME, resolved once via listRepositories). Idempotent PUT: re-running is a no-op 200. The repository registry itself is operator-provisioned, pre-authenticated (createCredential + registerRepository at install time, §6.2); the adapter only assigns registry entries, never creates them.
2. Ensure default role: createRole (POST /tenants/{tenant_id}/roles) with the well-known name from DEFAULT_ROLE_NAME and skill_access per configured policy (default {mode: "all"}). Role names are unique per tenant, which is deliberately load-bearing for replay-safe provisioning: a concurrent or repeated create returns 409 name-conflict carrying conflicting_resource_id, and the adapter adopts the existing role via getRole and continues. (listRoles with its ?name= filter serves the same lookup when recovering out-of-band.)
3. Upsert user: upsertUserByExternalId, body carrying enrichment only (email, display_name). Storage (the user’s S3-style bucket) is auto-attached by shiftagent.
4. Assign default role: on a 201 from step 3, assignUserRole (PUT /users/{user_id}/roles/{role_id}), an idempotent PUT.
5. Token exchange → forward (§6.3).
A new user in an existing tenant follows steps 3–5 only (the tenant PUT returned 200, so bootstrap is skipped).
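The ordering and status-code branching above can be sketched against an in-memory stand-in for the Integration API. Everything here is illustrative: `FakeShiftagent`, its method names, and the `rol_` ID prefix are hypothetical, and obtaining the role ID for a new user by re-running the create-or-adopt step is one possible choice rather than something this spec mandates.

```python
class Conflict(Exception):
    """Stand-in for a 409 name-conflict carrying conflicting_resource_id."""

    def __init__(self, conflicting_resource_id):
        super().__init__(conflicting_resource_id)
        self.conflicting_resource_id = conflicting_resource_id


class FakeShiftagent:
    """In-memory stand-in for the Integration API (illustration only)."""

    def __init__(self):
        self.tenants, self.users, self.roles = {}, {}, {}
        self.attached, self.assigned = set(), set()

    def upsert_tenant(self, ext_id):
        created = ext_id not in self.tenants
        tenant_id = self.tenants.setdefault(ext_id, f"tnt_{len(self.tenants) + 1}")
        return (201 if created else 200), tenant_id

    def attach_repository(self, tenant_id, repo_id):
        self.attached.add((tenant_id, repo_id))       # idempotent PUT

    def create_role(self, tenant_id, name):
        key = (tenant_id, name)
        if key in self.roles:
            raise Conflict(self.roles[key])           # 409 name-conflict
        self.roles[key] = f"rol_{len(self.roles) + 1}"
        return self.roles[key]

    def upsert_user(self, tenant_id, ext_id, **enrichment):
        key = (tenant_id, ext_id)
        created = key not in self.users
        user_id = self.users.setdefault(key, f"usr_{len(self.users) + 1}")
        return (201 if created else 200), user_id

    def assign_role(self, user_id, role_id):
        self.assigned.add((user_id, role_id))         # idempotent PUT


def ensure_role(api, tenant_id, name):
    try:
        return api.create_role(tenant_id, name)
    except Conflict as conflict:
        return conflict.conflicting_resource_id       # adopt the existing role


def provision(api, tenant_ext, user_ext, repo_id, role_name, email=None):
    status, tenant_id = api.upsert_tenant(tenant_ext)
    if status == 201:                                 # cold: tenant bootstrap
        api.attach_repository(tenant_id, repo_id)
        ensure_role(api, tenant_id, role_name)
    fields = {"email": email} if email else {}
    status, user_id = api.upsert_user(tenant_id, user_ext, **fields)
    if status == 201:                                 # new user: grant default role
        api.assign_role(user_id, ensure_role(api, tenant_id, role_name))
    return tenant_id, user_id                         # next: token exchange, then forward
```

Running `provision` twice for the same identity returns the same IDs and creates nothing new, which is the property the chain relies on.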
4.3 Merge-upsert body discipline (do not wipe what you don’t own)
The PUT upserts use merge semantics: provided fields replace, omitted fields are unchanged. This gives the adapter one hard rule:
On the warm path, the adapter MUST send only the fields it owns: enrichment attributes (email, display_name). It MUST NOT send role_ids, default_repository_id, storage, or metadata it did not set, or it will silently clobber operator-made assignments on every cache-cold request.
Role membership is therefore granted exclusively through the dedicated idempotent
assignUserRole endpoint on the creation path (§4.2 step 4) — never via role_ids in a PUT body.
Operators remain free to add/remove roles afterwards; the adapter never fights them.
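One way to make the body discipline hard to violate is to build the PUT body from an allow-list rather than by filtering out forbidden fields. A minimal sketch, assuming the derived identity is available as a dict; the names `ADAPTER_OWNED_USER_FIELDS` and `warm_path_user_body` are hypothetical.

```python
ADAPTER_OWNED_USER_FIELDS = ("email", "display_name")


def warm_path_user_body(identity: dict) -> dict:
    """Build a merge-upsert body with only adapter-owned, non-empty fields."""
    return {
        field: identity[field]
        for field in ADAPTER_OWNED_USER_FIELDS
        if identity.get(field)
    }
```

With an allow-list, a field added to the identity object later cannot leak into the upsert body by accident; an empty result is still a valid body, since external_id alone is sufficient.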
4.4 Convergence rules (normative)
- Every step is idempotent or conflict-recoverable, and individually retryable. There is no transaction and no rollback. A partially provisioned tenant (e.g. tenant exists, role creation failed) is not an error state: a later pass re-enters the chain and converges.
- Ordering is fixed (tenant → repository attachment → role → user → role assignment), so at every failure point the visible state is a strict prefix: never a user without a tenant, never a role assignment without a role.
- Re-entrancy trigger: besides the 201-cold trigger, the adapter MUST treat downstream signals of incomplete bootstrap (422 role-required on createConversation, or a listRoles ?name= miss for the default role) as a cue to re-run the bootstrap chain. Because every step is idempotent (PUT attach, 409-recoverable role create, PUT user, PUT role assignment), re-running from the top is always safe. This heals half-completed cold paths, including the case where the 201-winner crashed mid-bootstrap and a 200-loser is the next request to arrive.
- Race semantics: concurrent JIT of the same tenant produces exactly one 201; every loser gets 200 with the winner’s record and proceeds. If a loser outruns the winner’s bootstrap and hits 422 role-required, the re-entrancy rule applies; the resulting duplicate createRole resolves via 409 name-conflict + conflicting_resource_id. No adapter-to-adapter coordination exists anywhere in this design.
- Retries: idempotent GET/PUT calls are retried once on network error / 5xx with jittered backoff (100–300 ms). Non-idempotent POSTs carry the spec’s optional Idempotency-Key header, derived deterministically for provisioning steps (e.g. sha256(step ‖ external_tenant_id)) and as a random UUID per user action for createMessage, so replays deduplicate server-side (24 h replay window, Idempotency-Replayed: true on replays).
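The key derivation and backoff in the retry rule can be sketched as follows. This is an illustration under assumptions: the spec text only says step ‖ external_tenant_id, so the unit-separator byte used to join the two parts is a choice made here, and the function names are hypothetical.

```python
import hashlib
import random
import uuid


def provisioning_idempotency_key(step: str, external_tenant_id: str) -> str:
    """Deterministic key: the same step for the same tenant always matches."""
    # The "\x1f" separator is an assumption; it only prevents ambiguous joins.
    material = f"{step}\x1f{external_tenant_id}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def message_idempotency_key() -> str:
    """Random key, generated once per user action and reused on its retries."""
    return str(uuid.uuid4())


def retry_delay_s() -> float:
    """Jittered backoff in the 100-300 ms window for the single retry."""
    return random.uniform(0.100, 0.300)
```

The deterministic form lets any adapter instance replay a provisioning POST without coordination, while the per-action UUID must be created before the first attempt so that a retry carries the same value.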
4.5 Failure modes (host-facing behavior)
| Failure | Adapter behavior | Surfaces to host as |
|---|---|---|
| shiftagent unreachable / 5xx after retry | Fail fast, no queuing (zero storage — there is nowhere safe to park a request) | 503 + Retry-After, problem type …/upstream-unavailable |
| Partial provision (crash mid-bootstrap) | Nothing stored; the next request that needs the missing piece re-runs the chain (§4.4) and converges | Transparent (one slower request) |
| Concurrent JIT of the same tenant | 200-loser adopts winner’s record; role-create races resolve via 409 + conflicting_resource_id | Transparent |
| User deactivated in shiftagent | tokenExchange (or a forwarded call) returns 403 → drop cached JWT, do not re-provision — deactivated ≠ absent (§5.4) | 403, problem type …/user-revoked |
| Host JWT valid but tenant suspended/deleted in shiftagent | Default policy: refuse; re-provisioning on delete is a deliberate operator/policy decision, not automatic | 403, problem type …/tenant-suspended |
| Sandbox capacity exhausted (429 capacity-exhausted) | Honor the host’s on_capacity choice: reject → propagate 429 + Retry-After; hold → pass through the stream’s queued events | 429, or a stream that emits queued then proceeds |
| Rate limit from shiftagent (429 rate-limited) | Propagate with Retry-After | 429 |
| Streaming reply interrupted | Terminate the NDJSON pass-through; the stream’s monotonic seq and mandatory terminal event let the host detect truncation; message history in shiftagent remains authoritative | Truncated stream + terminal error event where possible |
| Approval expired / denied | Message ends failed with a problem-typed error event (pass-through) | Stream error event |
5. Lifecycle reconciliation
Provisioning is lazy (JIT, §4), but deprovisioning cannot be: a tenant offboarded from the host, or a user removed there, must stop existing (or stop working) in shiftagent without anyone remembering to clean up. Three mechanisms compose, in increasing order of immediacy; the adapter implements all three, and none of them requires adapter storage:
| Mechanism | Latency | Requires | Role |
|---|---|---|---|
| Periodic sweep (§5.1) | Hours (cadence-bound) | Host directory enumeration API | Safety net — catches everything, eventually |
| Host webhook push (§5.3) | Seconds | Host lifecycle events | Optimization — immediate, but lossy (webhooks get dropped) |
| Lazy enforcement (§5.4) | Next request | Nothing | Backstop — deactivated identities can’t get tokens |
5.1 Periodic sweep
A scheduled job (recommended: a Kubernetes CronJob running the adapter image in sweep mode, §9.2; daily and off-peak by default, cadence configurable) reconciles shiftagent’s view against the host’s system of record:
- Enumerate shiftagent: page listTenants (GET /tenants) with cursor pagination (starting_after, page size ~100), collecting every child tenant’s external_id; then, per tenant, page listTenantUsers (GET /tenants/{tenant_id}/users), collecting user external_ids. (Cross-tenant listUsers with ?tenant_id= filters is an equivalent alternative; per-tenant paging keeps memory bounded on large installs.)
- Enumerate the host via the host’s directory API. This is the second and final client-specific seam after deriveIdentity(): a pair of functions listHostTenants() → external_tenant_id[] and listHostUsers(tenant) → external_user_id[] behind an interface, instantiated per client.
- Diff: strip the configured namespace prefix, compare sets.
- Deprovision what exists in shiftagent but not in the host:
  - tenant gone → deleteTenantByExternalId (DELETE /tenants/by-external-id/{external_id}), or suspend-first per policy (§5.2)
  - user gone → deactivateUser (DELETE /users/{user_id})

Statelessness: the sweep keeps no checkpoint. Each run enumerates fully; an interrupted run simply leaves work for the next one. Duplicate concurrent sweeps are harmless (every deprovision call is idempotent) but wasteful; running the sweep as a single CronJob invocation rather than an in-process timer on every replica avoids them without leader election.
Guardrails (normative): a transient host-API failure that returns a partial or empty enumeration MUST NOT trigger mass deprovisioning. The sweep:
- MUST abort without deprovisioning anything if host enumeration did not complete successfully end-to-end;
- MUST abort (and alert) if the computed deletion delta exceeds a configured threshold (SWEEP_MAX_DELTA_PERCENT, default 10%): a 40%-of-tenants-vanished diff is far more likely a host API incident than a real offboarding wave;
- SHOULD emit a dry-run report metric/log line before acting, and support a --dry-run mode for operator rehearsal.
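The diff step and the first two guardrails can be sketched as a pure planning function that either returns the set to deprovision or refuses outright. The function and exception names are hypothetical. Two choices here are assumptions rather than spec: the delta is measured against the count of shiftagent-side IDs in this namespace, and IDs carrying a different namespace prefix are ignored.

```python
from typing import Iterable, Set


class SweepAborted(Exception):
    """Raised instead of deprovisioning when a guardrail trips."""


def plan_deprovision(shiftagent_external_ids: Iterable[str],
                     host_ids: Iterable[str],
                     *,
                     prefix: str,
                     host_enumeration_complete: bool,
                     max_delta_percent: float = 10.0) -> Set[str]:
    if not host_enumeration_complete:
        raise SweepAborted("host enumeration did not complete; nothing deprovisioned")
    # Strip the namespace prefix; IDs from other namespaces are not ours to touch.
    known = {
        ext_id[len(prefix):]
        for ext_id in shiftagent_external_ids
        if ext_id.startswith(prefix)
    }
    stale = known - set(host_ids)
    if known and 100.0 * len(stale) / len(known) > max_delta_percent:
        raise SweepAborted(f"{len(stale)}/{len(known)} would be deprovisioned")
    return {prefix + raw_id for raw_id in stale}
```

Keeping the plan separate from the calls that act on it also gives a dry-run mode for free: log the returned set and stop.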
5.2 Suspend-vs-delete policy
Hard deletion is irreversible and destroys conversation history, storage, and audit context. The default policy is soft first:
| Subject | Immediate action | Terminal action |
|---|---|---|
| Tenant missing from host | updateTenant (PATCH /tenants/{tenant_id}) → status: suspended — all activity stops at once (token exchange and business calls fail 403 tenant-suspended) | deleteTenantByExternalId after a grace window (SWEEP_GRACE_DAYS, default 30) of consecutive sweeps still showing it absent |
| User missing from host | deactivateUser — deactivation preserves conversations and audit trails while cutting access | Hard user deletion is an operator decision, never an adapter action |
Clients that require immediate hard deletion (e.g. contractual data-residency terms) MAY set
SWEEP_DEPROVISION_MODE=delete, accepting the blast-radius trade-off; the guardrails in §5.1
still apply. Note deleteTenant/deleteTenantByExternalId are guarded server-side per the spec —
deletion of a tenant with live dependents follows the spec’s documented semantics, not adapter
improvisation.
5.3 Host webhook push (optional, if the host offers lifecycle events)
If the host system can emit lifecycle events (“tenant removed”, “user deactivated”), the adapter exposes a webhook endpoint on its host-facing surface:
- Authenticated by a host-signed webhook signature (shared secret or asymmetric, per host convention; WEBHOOK_SIGNING_SECRET); unauthenticated or badly-signed events are rejected 401 and never acted on.
- Handler maps the event to the same idempotent calls the sweep uses (deleteTenantByExternalId / deactivateUser / suspend), applying the same suspend-first policy. Because the calls are idempotent, webhook + sweep overlap is harmless.
- The webhook is an optimization for immediacy, never the mechanism of record: delivery is at-most-once from most hosts, so the sweep remains the safety net.
Host webhook availability and event vocabulary are deployment decisions (§11).
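For the shared-secret case, signature checking might look like the sketch below. The scheme itself is an assumption: it supposes the host sends a hex-encoded HMAC-SHA256 of the raw request body, whereas the real header name, encoding, and algorithm follow each host's convention. The function name is hypothetical.

```python
import hashlib
import hmac


def webhook_signature_valid(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Return True only if signature_hex is the HMAC-SHA256 of raw_body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("ascii"),
                               signature_hex.encode("utf-8"))
```

The check must run over the raw bytes as received, before any JSON parsing, and a False result maps to the 401 described above without acting on the event.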
5.4 Lazy enforcement & the deactivated ≠ absent invariant
tokenExchange fails with 403 for a deactivated user; the adapter maps that to
403 …/user-revoked, drops its cached JWT, and never re-provisions. This invariant is
load-bearing:
- Absent (no record for the external_id) → JIT provisioning applies; create away.
- Deactivated (record exists, status: deactivated) → access was revoked; recreating the user via the upsert path would silently undo an offboarding. getUserByExternalId makes the distinction visible: it returns the record with its status rather than 404.
The adapter MUST check for this distinction wherever it might be tempted to provision: a 403 from
tokenExchange or a deactivated status on the user upsert response is a terminal “revoked” state
for that identity until an operator (or a host lifecycle event) says otherwise.
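The absent-versus-deactivated distinction reduces to a three-way decision. A minimal sketch, assuming the user record is available as a dict with a `status` field, or `None` when no record exists; the enum and function names are hypothetical.

```python
from enum import Enum
from typing import Mapping, Optional


class NextStep(Enum):
    PROVISION = "provision"            # absent: JIT provisioning applies
    PROCEED = "proceed"                # present and active: exchange a token
    REFUSE_REVOKED = "refuse_revoked"  # deactivated: terminal, never re-provision


def next_step(user_record: Optional[Mapping]) -> NextStep:
    if user_record is None:
        return NextStep.PROVISION
    if user_record.get("status") == "deactivated":
        return NextStep.REFUSE_REVOKED
    return NextStep.PROCEED
```

Routing every provisioning decision through one function of this kind makes it harder for a new code path to recreate a revoked user by accident.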
6. AuthN/AuthZ between adapter and shiftagent
6.1 The integration service key
The adapter authenticates to the Integration API with a single integration service key
(sk_int_…) — a service-principal credential minted at the integration root tenant,
subtree-scoped (§1.3), held only in the adapter’s Kubernetes Secret and process memory. At startup
(and on demand) the adapter introspects it via getIntegrationSelf to learn root_tenant_id,
granted scopes, and registered approver-key fingerprints — failing readiness (§9.3) if the scopes
don’t cover the table below.
Least-privilege scope set (what the runtime key needs, and why):
| Operations | Why the adapter needs them |
|---|---|
| upsertTenantByExternalId, getTenantByExternalId, updateTenant, listTenants | JIT tenant provisioning; sweep enumeration; suspend policy |
| deleteTenantByExternalId | Sweep / webhook deprovisioning |
| listRepositories, listTenantRepositories, attachTenantRepository, detachTenantRepository | Resolve the configured default registry entry; bootstrap attachment |
| createRole, getRole, listRoles, listRoleSkills | Bootstrap default role; 409 recovery; effective-skill surfaces |
| upsertUserByExternalId, getUserByExternalId, listTenantUsers, listUsers, deactivateUser | JIT user provisioning; sweep; lazy enforcement |
| assignUserRole, unassignUserRole, listUserRoles, listUserSkills | Default-role grant at creation; “skills this user can access” convenience |
| tokenExchange | Per-user context (§6.3) |
| listConversations (with ?tenant_id=) | Tenant-wide administrative conversation listing |
| listApprovals, getApproval, approveApproval, denyApproval | HITL transport (§7: the key transports decisions; it cannot mint them) |
| putConversationSecrets, listConversationSecrets, deleteConversationSecret | Write-only secrets pass-through; alias listing (never values) |
| getHealth, getCapacity, getIntegrationSelf | Readiness probe; capacity pre-check; key introspection |
Explicitly NOT granted to the runtime key: repository-registry writes (registerRepository,
syncRepository, createRepositorySkill), credential-registry writes (createCredential,
deleteCredential), API-key minting, and any governance/billing administration. Skill authoring
is likewise out — skills come from repositories, not from the adapter.
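The startup scope check can be sketched as a set difference against the table above. One assumption is doing the heavy lifting here: the sketch supposes getIntegrationSelf reports granted scopes as a list of operationIds, while the actual scope representation is defined by the OpenAPI specification and may differ. The names below are hypothetical.

```python
REQUIRED_OPERATIONS = frozenset({
    "upsertTenantByExternalId", "getTenantByExternalId", "updateTenant", "listTenants",
    "deleteTenantByExternalId",
    "listRepositories", "listTenantRepositories",
    "attachTenantRepository", "detachTenantRepository",
    "createRole", "getRole", "listRoles", "listRoleSkills",
    "upsertUserByExternalId", "getUserByExternalId",
    "listTenantUsers", "listUsers", "deactivateUser",
    "assignUserRole", "unassignUserRole", "listUserRoles", "listUserSkills",
    "tokenExchange", "listConversations",
    "listApprovals", "getApproval", "approveApproval", "denyApproval",
    "putConversationSecrets", "listConversationSecrets", "deleteConversationSecret",
    "getHealth", "getCapacity", "getIntegrationSelf",
})


def missing_operations(granted_scopes) -> list:
    """Operations the runtime key still lacks; an empty list means ready."""
    return sorted(REQUIRED_OPERATIONS - set(granted_scopes))
```

A non-empty result would fail readiness and is worth logging verbatim, since it tells the operator exactly which grant is missing.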
Rotation: two keys may be live simultaneously (old + new) during rotation; the adapter reads the key from its Secret at startup and on SIGHUP/rolling restart, so rotation is a Secret update + rolling restart with zero downtime.
6.2 Bootstrap vs runtime key split
The one-time install-time setup, createCredential (git PAT, write-only) followed by
registerRepository (name, URL, branch, credential reference), is performed by the operator
with a separate, short-lived bootstrap credential, not by the running adapter. This keeps raw
secret material (the git PAT) out of the adapter’s steady-state privilege set entirely: at
runtime, repositories are referenced by rep_… ID and credentials by crd_… ID; plaintext is
never readable back through any API the adapter can call.
6.3 Per-user context — token exchange, not acting-as (decision record)
- Chosen: the service key calls tokenExchange (POST /auth/token-exchange) with the derived external IDs, receiving a short-lived platform JWT (≤ 1 h issued; the adapter caches it for at most 15 min, §2.2) for that user. All user-context calls (conversations, messages, “my skills”) use that JWT. Every downstream audit record, skill resolution, and role check runs as the real user with zero special-casing; deactivation takes effect at the next exchange.
- Rejected: service-key acting-as (an X-Act-As-User header on every call). It is a confused-deputy surface, requires trusted-header handling on every shiftagent route, weakens audit attribution, and diverges from the platform’s resolve-principal-from-the-bearer architecture.
- Tenant-scope calls that have no user (tenant-wide conversation listing, provisioning, reconciliation) run under the service key directly, scoped to the resolved child tenant.
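
The caching rule in the chosen option (a platform JWT issued for up to 1 h but cached for at most 15 min) can be sketched as a small in-memory cache. This is a minimal illustration, not the adapter's implementation: `PlatformTokenCache`, the `exchange` callable, and its `(jwt, expires_in_seconds)` return shape are hypothetical stand-ins, and the real `tokenExchange` request and response schemas are whatever openapi/openapi.yaml defines.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

CACHE_CAP_SECONDS = 900.0  # 15-minute hard cap on cached platform JWTs


@dataclass
class CachedToken:
    jwt: str
    cached_until: float  # clock value after which the entry is stale


class PlatformTokenCache:
    """In-memory, per-instance cache keyed by (external_tenant_id, external_user_id)."""

    def __init__(
        self,
        exchange: Callable[[str, str], Tuple[str, float]],  # hypothetical wrapper around tokenExchange
        cap_seconds: float = CACHE_CAP_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._exchange = exchange
        self._cap = cap_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], CachedToken] = {}

    def get(self, tenant_id: str, user_id: str) -> str:
        key = (tenant_id, user_id)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry.cached_until > now:
            return entry.jwt
        jwt, expires_in = self._exchange(tenant_id, user_id)
        # Never cache longer than the token lives, and never longer than the cap.
        ttl = min(expires_in, self._cap)
        self._entries[key] = CachedToken(jwt, now + ttl)
        return jwt

    def evict(self, tenant_id: str, user_id: str) -> None:
        """Drop a cached token, e.g. when the user is reported as deactivated."""
        self._entries.pop((tenant_id, user_id), None)
```

Because the cache lives only in process memory, a restart or an eviction simply forces the next request through a fresh exchange, which is where a deactivation would surface.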
7. HITL & secrets duties
This section covers the adapter’s responsibilities in the two zero-trust flows that pass through it: human-in-the-loop approvals and per-message secret material. The unifying rule: the adapter is a conduit with cryptographically enforced limits. It can transport approvals but not mint them, and it can forward secrets but never see them again.
7.1 Surfacing approvals to the host
Mid-run, the agent may raise an Approval (apr_…): the NDJSON stream emits an
approval_required event whose payload is the full Approval object, including
requested_items[] describing what the agent needs ({kind: action|secret, description, alias?}), and the message parks in awaiting_approval. Adapter duties:
- Pass the approval_required event through to the host UX unbuffered and unmodified, like every other stream event (§9.5).
- For hosts that resolve approvals out-of-band (a notifications queue, an approvals inbox rather than the live stream), expose the polling surfaces: listApprovals (GET /approvals?status=pending&tenant_id=…) and getApproval (GET /approvals/{approval_id}).
- Surface expires_at prominently: an expired approval fails the parked message, so the host UX should know the clock is running.
7.2 Transporting signed decisions — two-party control
Approval resolution is not an authenticated-caller privilege; it is a signed assertion:
- The decision is signed (HMAC-SHA256 or Ed25519) over the canonical payload {approval_id, decision, exp} with a per-tenant approver key.
- The approver key is registered with shiftagent out-of-band (operator setup; getIntegrationSelf exposes registered key fingerprints for verification wiring, never material).
- The approver key lives with the host’s approval authority: the human-facing system where an authorized person clicks approve/deny. The adapter never holds it, in any form, at any time.
- The adapter calls approveApproval (POST /approvals/{approval_id}/approve) or denyApproval (POST /approvals/{approval_id}/deny) carrying the host-produced {signature, note?, secrets?} body, transported verbatim.
This yields two-party control: a compromised adapter (or a leaked sk_int_ key) can transport
whatever it likes but cannot produce a valid signature — no self-approval, ever; a stolen approver
key alone cannot reach the API without the service key. shiftagent verifies the signature against
the registered key; a bad or missing signature is rejected (403 approval-signature-invalid) no
matter how privileged the bearer. The exp inside the signed payload bounds replay of a captured
signature, and an approval’s state transition makes each approval_id single-use.
On approve, the stream emits resumed and continues to message_end; sticky sandboxes keep the
run warm until expires_at, pooled runs checkpoint and re-hydrate. Denied or expired → the
message ends failed with a problem-typed error event. (Full event semantics:
Streaming Contract.)
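
To make the signed-assertion idea concrete, here is a minimal sketch of the HMAC-SHA256 variant as it might run inside the host's approval authority, which is the party that holds the approver key. The canonical serialization used here (compact JSON with sorted keys) and the hex signature encoding are assumptions for illustration only; the actual canonical form is whatever shiftagent defines, and the function names are hypothetical. The adapter runs none of this code: it only forwards the resulting signature.

```python
import hashlib
import hmac
import json


def canonical_payload(approval_id: str, decision: str, exp: int) -> bytes:
    # ASSUMPTION: compact, key-sorted JSON stands in for the real canonical form.
    payload = {"approval_id": approval_id, "decision": decision, "exp": exp}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_decision(approver_key: bytes, approval_id: str, decision: str, exp: int) -> str:
    """Host-side: produce a hex HMAC-SHA256 signature over the canonical payload."""
    message = canonical_payload(approval_id, decision, exp)
    return hmac.new(approver_key, message, hashlib.sha256).hexdigest()


def verify_decision(
    approver_key: bytes,
    approval_id: str,
    decision: str,
    exp: int,
    signature: str,
    now: int,
) -> bool:
    """Verifier-side: reject expired assertions, then compare in constant time."""
    if now >= exp:
        return False  # exp bounds replay of a captured signature
    expected = sign_decision(approver_key, approval_id, decision, exp)
    return hmac.compare_digest(expected, signature)
```

Since the signature covers the decision value and the approval ID, a party that can only transport the body cannot flip an approve into a deny or reuse the signature for a different approval without invalidating it.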
7.3 The approval-supplies-secret weave
requested_items with kind: "secret" let the agent ask for a credential it does not have
(e.g. “I need an API key for the CRM, alias CRM_API_KEY”). The approve body’s optional
secrets map supplies it — vaulted on arrival, conversation-scoped, exactly like per-message
secrets (§7.4). The guarantee this composition delivers: even a rogue agent can exfiltrate
nothing — it never sees secret values (only aliases), and it cannot self-approve to obtain more.
7.4 Secrets & env pass-through (write-only conduit)
Two distinct channels ride on createMessage (and putConversationSecrets /
deleteConversationSecret for out-of-band management):
- env: plaintext, non-secret run parameters. The adapter MUST document to the host (loudly, as the OpenAPI spec itself does) that env is not a secrets channel: values in env are visible to the run.
- secrets: write-only alias → value pairs, vaulted at the shiftagent boundary, conversation-scoped. The run sees only aliases ({{secret:CRM_API_KEY}}); the egress proxy resolves aliases to real values at the network boundary on outbound calls. No API response ever echoes a value; listConversationSecrets returns aliases and metadata only.
Adapter duties on this path, all normative:
- Forward secrets (message bodies, approve bodies, putConversationSecrets payloads) verbatim and write-only: never logged (structured-logging redaction on the secrets field by name, plus authorization/token, §8), never persisted, never cached beyond the in-flight request buffer, never echoed in any adapter response or error detail.
- Never transform, inspect, or validate secret values; the adapter has no business knowing what they are. (Client-specific hardening, e.g. host-side pattern checks that env doesn’t carry obvious secret material, belongs in the adapter layer per the hardening philosophy in §8, but operates on the env channel, not by reading secrets.)
- Surface the alias inventory (listConversationSecrets) so host UXes can show what is vaulted without ever being able to show the values.
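
The redact-by-field-name duty above can be illustrated with a small recursive filter applied to a structured log record before it is emitted. This is a sketch under assumptions: the field set combines the names mentioned in §7.4 and §8, the mask string and the `redact` helper are hypothetical, and a real adapter would wire the same idea into whatever logging library it uses.

```python
from typing import Any

# Field names taken from the redaction rules in this spec; matching is case-insensitive.
REDACTED_FIELDS = frozenset({"secrets", "authorization", "token", "signature"})
MASK = "[REDACTED]"  # hypothetical placeholder value


def redact(value: Any) -> Any:
    """Return a copy of a log record with sensitive fields masked by name.

    The whole value under a sensitive key is replaced, so a secrets map never
    reveals its values (or its nested structure) to the log sink.
    """
    if isinstance(value, dict):
        return {
            key: MASK if str(key).lower() in REDACTED_FIELDS else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item) for item in value]
    return value
```

Masking by key rather than by inspecting values keeps the adapter consistent with the rule that it never reads or validates secret material; the filter does not need to know what a secret looks like.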
8. Security hardening checklist
Hardening philosophy (normative): anything client-specific that needs hardening (host-quirk validation, extra rate shaping, bespoke audit hooks, webhook signature schemes) happens in the adapter layer. shiftagent stays generic; its “spiritually aligned” extension points are vaulted credentials and custom skills, not host-specific code paths.
- No token leakage into logs: host JWT, exchanged platform JWTs, and the sk_int_ key are never logged and never echoed in problem responses; structured-logging redaction on authorization/token/signature fields. The service key exists only in the K8s Secret and process memory.
- No secret-value logging anywhere on the path: secrets maps (message, approval, putConversationSecrets) are redacted by field name at the logging layer and excluded from request-body capture, error reports, and traces (§7.4).
- Host JWT is never forwarded to shiftagent; platform JWTs are never returned to the host. Each trust domain sees only its own tokens.
- Approver key is never possessed: the adapter transports signed approval assertions but holds no approver key material in config, memory, or environment (§7.2). Verify at review time that no code path can receive one.
- external_id namespacing: all IDs prefixed with the configured namespace at derivation (§3.3); a future second integration gets its own prefix, which makes collisions structurally impossible.
- Tenant isolation: external_tenant_id comes ONLY from the verified host JWT, never from a request path, body, or query parameter; the service key is subtree-scoped to the integration root; cross-tenant conversation access is prevented by shiftagent’s per-user JWT scoping, not by adapter diligence.
- Replay stance: the stateless adapter keeps no jti cache by design; mitigations are TLS everywhere, short host-JWT exp, ±60 s skew only, and exchanged JWTs cached for at most 15 min. Strict replay-proofing beyond that is the host IdP’s job (short-lived tokens), documented as a shared-responsibility line.
- Revocation latency budget: worst case = the cached platform-JWT TTL (15 min), stated as an explicit SLO. If the client needs immediate cutoff, the adapter exposes an optional cache-purge admin endpoint (POST /admin/evict {external_user_id}), still with zero durable storage (§11).
- Rate limits: per-external_user_id token bucket in the adapter (in-memory, best-effort, per-instance) + transparent pass-through of shiftagent 429s with Retry-After.
- Algorithm pinning & JWKS hygiene: asymmetric algorithms only, pinned issuer, JWKS over TLS with certificate validation, refetch-on-unknown-kid with a refetch rate cap (anti-DoS).
- Webhook authentication: host lifecycle webhooks (§5.3) verified against their signature before any deprovisioning action; unauthenticated events dropped and counted.
- Sweep guardrails armed: incomplete host enumeration aborts the sweep; delta threshold enforced; dry-run rehearsed before first production sweep (§5.1).
- Request tracing: the adapter generates/propagates X-Request-Id end-to-end; Integration API problem responses carry request_id, which the adapter preserves in its host-facing problems for cross-system correlation.
- No raw secrets in provisioning payloads: repositories are attached by pre-provisioned rep_… ID referencing an operator-registered crd_… credential; raw git PATs never transit the adapter (§6.2). The only secret material that ever transits is the write-only conversation-secrets channel (§7.4).
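
As an illustration of the rate-limit item in the checklist above, the following is a minimal per-user token bucket that is in-memory, best-effort, and per-instance. The class name, the refill rate, and the burst size are hypothetical; the spec does not prescribe specific values, and a production version would also bound the number of tracked users.

```python
import time
from typing import Callable, Dict, Tuple


class PerUserTokenBucket:
    """Best-effort limiter keyed by external_user_id (state is lost on restart)."""

    def __init__(
        self,
        rate_per_second: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate_per_second
        self._burst = float(burst)
        self._clock = clock
        # external_user_id -> (tokens remaining, time of last refill)
        self._state: Dict[str, Tuple[float, float]] = {}

    def allow(self, external_user_id: str) -> bool:
        now = self._clock()
        tokens, last = self._state.get(external_user_id, (self._burst, now))
        # Refill in proportion to elapsed time, never above the burst size.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        if tokens >= 1.0:
            self._state[external_user_id] = (tokens - 1.0, now)
            return True
        self._state[external_user_id] = (tokens, now)
        return False
```

When `allow` returns False, the adapter would answer with its own rate-limited problem (§10); 429s coming from shiftagent are a separate path and are passed through with their Retry-After intact.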
9. Operational spec
9.1 Packaging: gateway service (decision record)
| Option | Verdict | Why |
|---|---|---|
| Library embedded in the host codebase | Rejected | Couples release cadence to the host’s deploy train; the host team would have to hold the shiftagent service key; claim-mapping updates would need host redeploys |
| Sidecar per host pod | Rejected | The host is typically a large multi-service system, not one pod; N sidecars = N key copies and N JWKS caches for zero isolation gain |
| Standalone gateway service | Chosen | One deployable, one key, one place to rotate and observe; stateless → trivially HA (≥ 2 replicas, HPA on CPU); matches the zero-storage philosophy |
9.2 Deployment picture
The adapter deploys into the client’s cluster alongside the existing shiftagent Helm install: a
sibling Deployment + Service in the same namespace (optionally packaged as a subchart),
exposed to the host network only (Ingress or private link). The reconciliation sweep (§5.1) runs
as a CronJob invoking the same image in sweep mode.
flowchart LR
subgraph hostnet["Host network"]
HOSTAPP["Host application / UX"]
IDP["Host IdP (JWKS)"]
APPROVER["Host approval authority<br/>(holds the approver key)"]
DIRECTORY["Host directory /<br/>lifecycle events"]
end
subgraph cluster["Client's on-prem Kubernetes cluster"]
subgraph ns["shiftagent namespace (Helm release)"]
ADAPTER["Adapter Deployment<br/>stateless, ≥ 2 replicas"]
SWEEP["Adapter CronJob<br/>(reconciliation sweep)"]
API["shiftagent Integration API"]
VAULT["Vault + egress proxy<br/>(alias resolution)"]
POOL["Sandbox pool<br/>(warm + sticky)"]
PG[("Postgres")]
end
end
HOSTAPP -- "host JWT" --> ADAPTER
APPROVER -. "signed approval assertion<br/>(via host UX)" .-> HOSTAPP
ADAPTER -- "JWKS fetch (cached)" --> IDP
DIRECTORY -. "lifecycle webhook (optional)" .-> ADAPTER
ADAPTER -- "sk_int_ key / platform JWT" --> API
SWEEP -- "listTenants / listUsers diff" --> API
SWEEP -. "enumerate live tenants + users" .-> DIRECTORY
API --> PG
API --> POOL
POOL --> VAULT
9.3 Health & readiness
- /healthz (liveness): the process is up.
- /readyz (readiness): JWKS reachable (or cached), getHealth (GET /health) answering, and getIntegrationSelf scope check passed (§6.1). Not-ready instances are rotated out by the Service without any state loss, because there is none to lose.
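
A readiness handler along these lines could aggregate the three checks named above. This is only a sketch: the check callables are hypothetical placeholders for the real JWKS, getHealth, and getIntegrationSelf probes, and returning 503 for a not-ready instance is a common convention rather than something this spec mandates.

```python
from typing import Callable, Dict, Tuple

Check = Callable[[], bool]


def readiness(checks: Dict[str, Check]) -> Tuple[int, Dict[str, str]]:
    """Run every check and return (HTTP status, per-check result).

    A check that raises is treated as failed, so a flaky dependency marks
    the instance not-ready instead of crashing the probe handler.
    """
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        results[name] = "ok" if ok else "failed"
    status = 200 if all(result == "ok" for result in results.values()) else 503
    return status, results
```

For example, `readiness({"jwks": jwks_available, "health": upstream_healthy, "scope": scope_check_passed})` would report which dependency is holding the instance out of rotation (the three callables are placeholders to be supplied by the adapter).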
9.4 Configuration surface (env vars only)
| Variable | Purpose | Default |
|---|---|---|
| SHIFTAGENT_BASE_URL | In-cluster Integration API base URL (Service DNS) | (required) |
| SHIFTAGENT_API_KEY | The sk_int_ integration key (from a K8s Secret) | (required) |
| HOST_JWKS_URL | Host IdP JWKS endpoint | (required) |
| HOST_ISSUER | Exact iss to require | (required) |
| HOST_AUDIENCE | Required aud value | (required) |
| EXTERNAL_ID_NAMESPACE | Namespace prefix for derived external IDs (§3.3) | (required) |
| DEFAULT_REPOSITORY_NAME | Registry entry attached as each new tenant’s default (§4.2) | (required) |
| DEFAULT_ROLE_NAME | Well-known role slug ensured per tenant | host-default |
| DEFAULT_ROLE_SKILL_ACCESS | Bootstrap role’s skill_access policy | all |
| TOKEN_CACHE_TTL_SECONDS | Platform-JWT cache hard cap (§2.2) | 900 |
| TENANT_CACHE_TTL_SECONDS | external→internal tenant-ID cache TTL | 300 |
| JWKS_CACHE_TTL_SECONDS | JWKS fallback TTL when no Cache-Control | 900 |
| SWEEP_DEPROVISION_MODE | suspend-then-delete or delete (§5.2) | suspend-then-delete |
| SWEEP_GRACE_DAYS | Suspend → delete grace window | 30 |
| SWEEP_MAX_DELTA_PERCENT | Sweep abort threshold (§5.1) | 10 |
| WEBHOOK_SIGNING_SECRET | Verifies host lifecycle webhooks (§5.3) | (optional; webhook disabled without it) |
| ERROR_TYPE_BASE_URL | Base URI for host-facing RFC 9457 type values (§10) | (required) |
| UPSTREAM_TIMEOUT_MS | Per-call Integration API timeout (non-streaming) | 10000 |
| STREAM_IDLE_TIMEOUT_MS | Max silence on a pass-through stream before terminating | 120000 |
No config files, no flags, no runtime-mutable settings — the config surface is the environment, which keeps instances interchangeable and rotation auditable.
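
A startup loader for this environment-only surface might look like the sketch below. The variable names and defaults are taken from the table above; the `load_config` function, its dict return shape, and the fail-fast behavior on missing or malformed values are illustrative assumptions, not a prescribed design.

```python
import os
from typing import Any, Dict, Mapping

REQUIRED = (
    "SHIFTAGENT_BASE_URL", "SHIFTAGENT_API_KEY", "HOST_JWKS_URL", "HOST_ISSUER",
    "HOST_AUDIENCE", "EXTERNAL_ID_NAMESPACE", "DEFAULT_REPOSITORY_NAME",
    "ERROR_TYPE_BASE_URL",
)
STR_DEFAULTS = {
    "DEFAULT_ROLE_NAME": "host-default",
    "DEFAULT_ROLE_SKILL_ACCESS": "all",
    "SWEEP_DEPROVISION_MODE": "suspend-then-delete",
}
INT_DEFAULTS = {
    "TOKEN_CACHE_TTL_SECONDS": 900,
    "TENANT_CACHE_TTL_SECONDS": 300,
    "JWKS_CACHE_TTL_SECONDS": 900,
    "SWEEP_GRACE_DAYS": 30,
    "SWEEP_MAX_DELTA_PERCENT": 10,
    "UPSTREAM_TIMEOUT_MS": 10000,
    "STREAM_IDLE_TIMEOUT_MS": 120000,
}
DEPROVISION_MODES = ("suspend-then-delete", "delete")


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read the whole config surface once at startup; fail fast on bad input."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    config: Dict[str, Any] = {name: env[name] for name in REQUIRED}
    for name, default in STR_DEFAULTS.items():
        config[name] = env.get(name, default)
    for name, default in INT_DEFAULTS.items():
        config[name] = int(env.get(name, default))  # raises ValueError if not numeric
    if config["SWEEP_DEPROVISION_MODE"] not in DEPROVISION_MODES:
        raise ValueError("SWEEP_DEPROVISION_MODE must be one of: " + ", ".join(DEPROVISION_MODES))
    # Optional: None means the lifecycle webhook stays disabled.
    config["WEBHOOK_SIGNING_SECRET"] = env.get("WEBHOOK_SIGNING_SECRET") or None
    return config
```

Reading everything once and refusing to start on a missing value fits the interchangeable-instances goal: two replicas with the same environment cannot drift apart at runtime.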
9.5 Streaming pass-through (explicit, because it silently breaks)
The NDJSON stream is the product surface the host user actually feels; a naive proxy config breaks it invisibly. Normative guidance:
- Flush per line. The adapter forwards each NDJSON event line as it arrives: no response buffering, no compression that introduces buffering (disable gzip on the streaming route or use flush-friendly settings).
- Disable buffering on every hop the client controls: ingress annotations (e.g. proxy-buffering: off for NGINX-class ingresses), any service mesh, and the adapter’s own HTTP framework defaults.
- Timeouts must exceed the semantics: idle timeouts on the streaming path must accommodate the documented max hold time for on_capacity=hold (queued events count as traffic) and approval parking up to expires_at for streams held open across HITL waits.
- Never reorder, coalesce, or synthesize events. The stream’s monotonic seq is the host’s truncation detector; the adapter passes events through verbatim and, on upstream failure, terminates the stream (the terminal-event guarantee is shiftagent’s; the adapter must not fabricate events it didn’t receive).
- Non-streaming mode (?stream=false on createMessage) is the fallback for host paths that cannot consume streams; the adapter exposes both.
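
The flush-per-line and pass-verbatim rules above can be sketched as a generator that re-frames arbitrary upstream byte chunks into complete NDJSON lines without parsing or altering them. The function name is hypothetical, and dropping an unterminated trailing fragment when the upstream ends is one reading of "must not fabricate events", offered here as an assumption.

```python
from typing import Iterable, Iterator


def forward_ndjson_lines(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each complete NDJSON line (newline included) as soon as it is available.

    Bytes are forwarded untouched: no JSON parsing, no reordering, no merging.
    The caller is expected to write and flush every yielded line immediately.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            newline = buffer.find(b"\n")
            if newline == -1:
                break  # wait for the rest of the current line
            line, buffer = buffer[: newline + 1], buffer[newline + 1 :]
            yield line
    # ASSUMPTION: an unterminated fragment left in the buffer is an incomplete
    # event, so it is dropped rather than completed or replaced by the adapter.
```

Because the generator never inspects event contents, the upstream sequence numbers reach the host exactly as emitted, which is what lets the host detect a truncated stream on its own.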
9.6 Observability
Metrics (Prometheus-style; no payload contents anywhere):
- adapter_requests_total{route,status} / adapter_request_duration_seconds
- adapter_upstream_latency_seconds{operation_id}
- adapter_provision_steps_total{step,outcome}: cold-path visibility per §4.2 step
- adapter_token_exchanges_total{outcome}, adapter_cache_events_total{cache,hit|miss}
- adapter_stream_events_total{type}, including queued, approval_required, error
- adapter_approvals_transported_total{decision}
- adapter_sweep_last_success_timestamp, adapter_sweep_deprovisioned_total{kind,action}, adapter_sweep_aborts_total{reason}
- adapter_webhook_events_total{type,outcome}

Logs are structured, with the redaction set from §8 applied globally. Traces (optional) propagate
X-Request-Id into the Integration API and back out to the host.
10. Error contract (host-facing)
All adapter-originated errors are RFC 9457 application/problem+json (house style), with type
URIs under the configured ERROR_TYPE_BASE_URL. Adapter-originated types:
| Type (suffix) | Status | When |
|---|---|---|
| host-token-invalid | 401 | Host JWT absent, expired, bad signature, wrong iss/aud, or missing identity claims (§3.1). shiftagent is never called. |
| user-revoked | 403 | tokenExchange or a forwarded call reports the user deactivated; cached JWT dropped; no re-provisioning (§5.4) |
| tenant-suspended | 403 | The tenant is suspended or deleted in shiftagent and policy says refuse (§4.5, §5.2) |
| upstream-unavailable | 503 | Integration API unreachable / 5xx after the retry budget; carries Retry-After (§4.5) |
| rate-limited | 429 | Adapter’s own per-user token bucket tripped, or pass-through of shiftagent rate-limited (with Retry-After) |
Example body:
{
  "type": "https://errors.adapter.example/upstream-unavailable",
  "title": "Upstream unavailable",
  "status": 503,
  "detail": "The integration API did not respond after retries.",
  "request_id": "req_8f14e45fceea"
}

Pass-through policy for business calls: Integration API 4xx problems are passed through to the
host body-intact (they contain no internal secrets by contract and carry the request_id needed
for cross-system support). This covers validation-error (422), not-found (404), name-conflict /
external-id-conflict / cross-tenant / conversation-archived / resource-in-use /
idempotency-key-conflict (409), role-required (422), insufficient-scope (403),
capacity-exhausted (429), and approval-signature-invalid (403). The exception is where a
lifecycle rule maps them: the 403s that become user-revoked / tenant-suspended above, and the
provisioning 409s the adapter consumes internally per §4.4 and never surfaces.
capacity-exhausted handling follows the host’s on_capacity choice (§4.5).
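
For the adapter-originated types in the table above, a small builder can keep status codes, type URIs, and headers consistent. This is a sketch: `build_problem` is a hypothetical helper, and apart from "Upstream unavailable" (which appears in the example body) the title strings are placeholders chosen for illustration.

```python
import json
from typing import Dict, Optional, Tuple

# suffix -> (HTTP status, title); statuses come from the table in this section.
ADAPTER_PROBLEMS = {
    "host-token-invalid": (401, "Host token invalid"),
    "user-revoked": (403, "User revoked"),
    "tenant-suspended": (403, "Tenant suspended"),
    "upstream-unavailable": (503, "Upstream unavailable"),
    "rate-limited": (429, "Rate limited"),
}


def build_problem(
    error_type_base_url: str,
    suffix: str,
    detail: str,
    request_id: str,
    retry_after_seconds: Optional[int] = None,
) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, JSON body) for an adapter-originated problem."""
    status, title = ADAPTER_PROBLEMS[suffix]  # KeyError on an unknown suffix is intentional
    body = {
        "type": error_type_base_url.rstrip("/") + "/" + suffix,
        "title": title,
        "status": status,
        "detail": detail,
        "request_id": request_id,
    }
    headers = {"Content-Type": "application/problem+json"}
    if retry_after_seconds is not None:
        headers["Retry-After"] = str(int(retry_after_seconds))
    return status, headers, json.dumps(body)
```

The `detail` string is caller-supplied, so the same care applies as elsewhere in this spec: it should never carry token or secret material.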
11. Decisions to make for your deployment
Resolve these items with your shiftagent operator before go-live. Each one pins a configuration value or a client-specific seam in the adapter you build:
- Host IdP token sample. The §3.4 example is illustrative. Obtain a real token from your host IdP (issuer, algorithm, exact claim names for tenant and user) to pin deriveIdentity(). The mapping is isolated behind one function precisely so a wrong assumption costs one function, but it must be confirmed before go-live.
- Revocation latency requirement. Decide whether the 15-minute cached-platform-JWT window (§2.2, §8) meets your offboarding requirements, or whether the cache-evict admin endpoint must be part of your adapter from day one.
- Host webhook availability & event vocabulary. Determine whether the host system emits lifecycle events the adapter can subscribe to for push deprovisioning (§5.3), and with what event types, delivery guarantees, and signature scheme. Without it, the sweep + lazy enforcement pair is the whole story.
- Integration-root topology. Confirm one integration root tenant per install, with every host tenant as a direct child (§1.3); this determines the service key’s subtree scope and the sweep’s enumeration boundary.
- Host directory enumeration for the sweep. Identify the API the host exposes for listHostTenants() / listHostUsers() (§5.1); completeness guarantees, paging, and rate limits determine sweep cadence and guardrail tuning.
- Approver-key custody. Decide which host system acts as the approval authority and holds the per-tenant approver key (§7.2), and register the key with shiftagent out-of-band at install time. The adapter must never possess it in any form.