Security & Privacy Review — AI Employee Platform

02 — Findings

P0 — must fix before the affected phase ships

Exploitable as designed, or a compliance breach. Every P0 is a specification or configuration gap on an unchanged topology.

SEC-01

Connections credential storage: the encryption mechanism is asserted, not designed

The spec claims credentials are "encrypted at rest via Payload field-level encryption." Payload v3 has no general-purpose field-level encryption — this is a phantom control. As written, Phase 2 ships tenant OAuth tokens and API keys (QuickBooks, Gmail, Productive, Jira, Slack) as plaintext JSON, readable by any DB access and returned by the REST API to any role that can read Connections.

Required design

beforeChange hook encrypts with AES-256-GCM; versioned key from env (v1) → KMS/Vault (prod).
Field-level access: credentials never serialized to portal or REST reads (access.read: false).
Decryption only via a dedicated, audit-logged endpoint restricted to the agent service account.
Credentials never enter LangGraph checkpoints, traces, or logs.

MITIGATION PHASES Phase 2Phase 3Phase 10

SEC-02

Internal seams: two of three service-to-service calls have no authentication

Hermes → LangGraph :8100 — no auth specified anywhere. Anyone who reaches the port can invoke the agent, query client financials conversationally, and burn token budget. Graph → LiteLLM :4000 — no master key specified; a LiteLLM proxy without LITELLM_MASTER_KEY accepts unauthenticated requests: free Anthropic calls on Makro's bill. LangGraph → Payload has an API key, but its scope, storage, and rotation are unspecified.

Required design

LangGraph API requires a bearer token (constant-time compare) on all routes except /health.
LITELLM_MASTER_KEY set; the graph uses a budget-scoped virtual key, never the master key.
Payload agent key becomes a least-privilege service account limited to the collections the graph touches.
Seam tests assert 401 without a token (Phase 10).

MITIGATION PHASES Phase 1Phase 4Phase 5Phase 10

SEC-03

Prompt injection: adapter-fetched content reaches LLM nodes undelimited

Gmail, Slack, and Jira bodies are attacker-controlled input — an external party needs nothing more than the agency's email address. The deterministic spine already limits blast radius: LLM nodes have no tools, ratings are deterministic, the grounding gate diffs every number, Hermes executes delivery from stored intents. But injected text can still skew narrative prose and recommendations humans act on, poison the Insights memory (SEC-04), echo through conversational answers, and attempt exfiltration through model-authored delivery-intent fields.

Required design

Spotlighting: all adapter free text wrapped in explicit data delimiters; marker sequences stripped from source.
Structured output: report and justification nodes emit schema-validated JSON; reject on violation.
Intent allowlists: delivery channels resolve from config, never model output; Hermes rejects non-allowlisted intents.
Injection eval set: adversarial emails/tickets in the CI eval gate; assert ratings unchanged, no allowlist violations.

MITIGATION PHASES Phase 3Phase 4Phase 5Phase 10

SEC-06

PII leakage into Langfuse Cloud traces

The observability design sends per-node input/output to Langfuse Cloud — a third-party SaaS outside the Anthropic ZDR contract. Graph state contains raw adapter data (pre-redaction), the restored report (post-redaction), and conversational user messages. Tracing node I/O as designed ships exactly the PII that ADR-011 spends two services keeping away from the model provider. This silently defeats the privacy architecture.

Required design

LLM spans captured only inside the core/llm.py chokepoint — masked prompt and pre-restore response, both PII-free by construction.
All other nodes trace metadata only (node name, duration, counts, run_id) — no input/output capture.
Langfuse client-side masking function as a second guard.
Self-hosted Langfuse is the designated path if full-fidelity traces are ever needed.

MITIGATION PHASES Phase 4Phase 8Phase 10

SEC-10

Docker port exposure: five internal services published to the host

The compose file publishes every service: Postgres 5432, Payload 3000, Presidio 5001/5002, LiteLLM 4000, LangGraph 8100. Docker-published ports bind 0.0.0.0 and bypass ufw/host firewalls (Docker programs iptables directly). On a VPS this exposes the unauthenticated LiteLLM and agent API (SEC-02), Presidio, and Postgres to the internet — one misconfiguration converts every internal-seam gap into a remote one.

Required design

Presidio, LiteLLM, Postgres: compose-internal network, no published ports.
LangGraph: 127.0.0.1:8100:8100 loopback only (Hermes runs natively on the same host).
Payload: the only internet-facing service, behind a TLS reverse proxy (Caddy/nginx).
Phase 10 E2E includes a port-scan assertion: only 443/3000 reachable externally.

MITIGATION PHASES Phase 1Phase 10

P1 — must fix before production cutover

Real weaknesses with mitigating context. All gate Phase 10 / production, not individual phase ships.

SEC-04

Memory poisoning through the Insights write path

Insights are agent-authored, persisted, and fed into every future prompt — the persistence vector for SEC-03. One poisoned run writes "Acme always pays late — suppress AR alerts," and every later run inherits it. Existing controls are real: Directives are human-pinned (OWASP ASI06), evidence_refs required, ADD/UPDATE/DELETE/NOOP write policy, human-gated promotion, TTL on unverified entries. Gaps: insights enter prompts before any review, refs are required but not validated, no content constraints, and suppression-type insights are treated like ordinary observations.

Required design

Validate evidence_refs server-side: must resolve to same-tenant KpiSnapshot/run rows.
Cap insight length; reject instruction-pattern content at write time; re-inject as delimited data.
Escalation-suppressing insights require human review (pending_review) before becoming prompt-eligible.
Memory Console: one-click retire with audit trail.

MITIGATION PHASES Phase 2Phase 4Phase 9Phase 10

SEC-05

Presidio fail-closed: guaranteed on cron, unproven on the conversational path

The design blocks LLM calls with REDACTION_UNAVAILABLE when Presidio is down. On the cron path mask/restore is explicit in the graph. On the conversational path the guarantee currently rests on every call site remembering to mask — and RedactionConfigs.enabled is a portal-editable master switch.

Required design

core/llm.py is the only module allowed to call LiteLLM — mask → send → restore inside, fail closed. CI rule: LiteLLM imports anywhere else fail the build.
Disabling redaction is super_admin-only, audit-logged, and posts a Slack notice.
Conversational E2E: kill Presidio, assert REDACTION_UNAVAILABLE and zero LiteLLM egress.

MITIGATION PHASES Phase 2Phase 4Phase 10

SEC-07

tenant_id enforcement in Payload RBAC: read-side designed, write-side unspecified

The tenantScoped access function constrains reads. Nothing constrains writes: a tenant admin could create or update a document carrying a foreign tenant_id, and the field is mutable after creation. v1 is single-tenant so exploitation is theoretical — but the design promise is "all interfaces accept tenant_id from day one," so the enforcement must exist from day one too, or productization inherits a latent cross-tenant hole.

Required design

create/update access functions force tenant_id from the authenticated user's scope — server-set, never client-supplied.
tenant_id immutable after create (beforeChange hook rejects changes).
Agent service-account key carries a tenant claim enforced identically.
Access-function tests for all 15 collections × 4 roles are a Phase 2 success criterion.

MITIGATION PHASES Phase 2Phase 10

SEC-08

Audit-log immutability: "append-only" is a label, not yet a property

A Payload collection is mutable by default. A compromised or malicious admin account could edit or delete AuditLog entries — and an audit log must be trustworthy precisely when an account is compromised.

Required design

Payload access: update: false, delete: false for every role including super_admin.
DB layer: REVOKE UPDATE, DELETE from the app role + a BEFORE UPDATE OR DELETE trigger raising an exception.
Entries record actor, collection, doc id, and diff. Optional P2: hash-chain for tamper evidence.

MITIGATION PHASES Phase 2Phase 10

SEC-09

Slack request-signature verification: unstated assumption on the only public inbound path

Slack → Hermes is the platform's only unauthenticated-by-default public inbound surface, and approval decisions ride on it. The docs never state whether Hermes uses the Events API (public HTTPS) or Socket Mode, nor that signatures are verified. A forged request that Hermes trusts could approve a pending action or inject conversational queries.

Required design

Prefer Socket Mode — outbound WebSocket, no public endpoint; confirm Hermes v0.14 support.
If Events API: verify X-Slack-Signature HMAC-SHA256 on every request, 5-minute timestamp window, constant-time compare.
Approval clicks map the verified Slack user to an AdminUsers record with operator+ role — Slack identity alone is not authorization. Decisions write through Payload so they hit the AuditLog.

MITIGATION PHASES Phase 5Phase 6Phase 10

P2 — scheduled hardening

Schedule, don't block.

SEC-11

Supply-chain pinning

litellm:main-stable is a mutable tag tracking a fast-moving project that sits on the path of every model call; Presidio images are unpinned. Pin images by digest, commit lockfiles, enable Renovate/Dependabot, and upgrade deliberately — every LiteLLM upgrade reruns the CI no-fallback guard.

MITIGATION PHASES Phase 1ongoing

SEC-12

Payload admin portal hardening

The portal gates approvals and redaction config. Add login lockout/backoff, secure session cookie flags, short super-admin session TTL, 2FA when feasible, TLS via the SEC-10 reverse proxy, and consider IP-allowlisting /admin for a ~20-person team.

MITIGATION PHASES Phase 2Phase 9

SEC-13

Backup data protection

Daily pg_dump contains everything: encrypted credentials, PII, the full audit log. Encrypt backups at rest, restrict read access to ops, document the 30-day retention as a privacy commitment (it bounds PII residency), and test restore quarterly.

MITIGATION PHASES Phase 1Phase 10

03 — Revised Architecture

Zero-trust overlay on an unchanged topology

The three-layer shape, the outbound-only approval model, and the Presidio → LiteLLM → Anthropic privacy pipeline all survive intact. No component changes. The revision is authentication on every seam plus a closed network — four new ADRs.

Internet→TLS reverse proxy→Payload :3000only public service

Slack→Socket Mode (outbound WS)→Hermes (native)no public HTTP endpoint

Hermes→bearer token→LangGraph 127.0.0.1:8100loopback only

LangGraph→virtual key→LiteLLMcompose-internal, no published port

LangGraph→Presidiocompose-internal, no published port

LangGraph→scoped service key→Payload RESTleast-privilege account

all services→Postgrescompose-internal, no published port

ADR-013

Zero-Trust Internal Seams

Resolves SEC-02, SEC-09, SEC-10. Every service-to-service call authenticates; no internal service is reachable beyond its callers. Bearer token on LangGraph, master + virtual keys on LiteLLM, least-privilege Payload service account, Socket Mode for Slack, closed compose network. Cost: five env vars and a network block.

ADR-014

Credential Encryption & Secrets Handling

Resolves SEC-01. Application-side AES-256-GCM via Payload hooks, versioned keys (env → KMS), field excluded from all portal reads, decryption only through an audit-logged agent-only endpoint. Tenant OAuth tokens are the keys to client financial systems — their compromise is the worst single outcome this platform can produce.

ADR-015

Untrusted Content & Memory Hygiene

Resolves SEC-03, SEC-04. Spotlighting for all adapter text, schema-validated structured output, config-resolved delivery channels, validated evidence refs, review gate on escalation-suppressing insights, injection cases in the CI eval gate. Extends the deterministic spine from numbers to prose, persistence, and routing.

ADR-016

Observability Privacy — Masked-Only Traces

Resolves SEC-06. Langfuse Cloud receives only PII-masked content, captured exclusively inside the core/llm.py chokepoint; all other nodes trace metadata only. Makes the privacy guarantee structural rather than procedural — the same argument that justified in-graph redaction.

04 — Mitigation Plan

Mitigations mapped to implementation phases

Every finding lands inside the existing 10-phase plan. No rework of ADR-001…012.

Phase	Security deliverables	Findings
1 — Setup & infra	Closed compose network, loopback-bound LangGraph, TLS reverse proxy, LiteLLM master key in config, image digest pinning, encrypted backups	SEC-02, SEC-10, SEC-11, SEC-13
2 — Payload collections	Credential encryption hooks + field access, write-side tenant enforcement + immutable tenant_id, AuditLog update/delete denial + DB trigger, RedactionConfigs privileged toggle, evidence_refs validation, access-function test matrix	SEC-01, SEC-04, SEC-05, SEC-07, SEC-08, SEC-12
3 — Adapters	Content sanitization + delimiting in `transform()`, in-memory-only credential handling	SEC-01, SEC-03
4 — LangGraph brain	Bearer-token middleware, `core/llm.py` chokepoint (mask/restore + trace capture + CI import rule), prompt contracts, structured outputs, intent validation, insight write policy	SEC-02, SEC-03, SEC-04, SEC-05, SEC-06
5 — Hermes gateway	Bearer token to LangGraph, Socket Mode / signature verification, delivery-intent allowlist enforcement	SEC-02, SEC-03, SEC-09
6 — HITL approvals	Slack user → AdminUsers mapping on approval decisions; decisions write through Payload (audit-logged)	SEC-09
8 — Observability	Langfuse capture policy (masked-only), client masking function, no-I/O on raw-data nodes	SEC-06
9 — Dashboard	Memory Console review/retire flow, portal auth hardening	SEC-04, SEC-12
10 — E2E & hardening	Seam 401 tests, credential plaintext probe, injection eval suite, poisoning E2E, fail-closed E2E both paths, cross-tenant probes, audit-mutation tests, forged-Slack tests, external port-scan assertion, restore drill	All

05 — Product Conformance

The design still meets the stated product requirements

None of the required mitigations alter product behavior. The review confirms conformance after revision:

Requirement	Conformance after revision
Slack conversational mode	✓ Met. Hermes Slack bindings → LangGraph API. SEC-02 adds a bearer token, SEC-09 verifies Slack's signature — user experience unchanged.
Self-learning memory (Directives + Insights)	✓ Met. Directives human-pinned, Insights agent-written with validated evidence. SEC-04 quarantines only escalation-suppressing insights; the ordinary learning loop is unchanged.
Subgraph seam for sub-agents	✓ Met. ADR-012 contracts untouched. Future sub-agents inherit the `core/llm.py` chokepoint, so the seam stays privacy-safe by construction.
Single-tenant deployment, code-only AI Employee creation	✓ Met. AI Employees are created in code/seed (graph module + config docs); the portal cannot create employees. SEC-07's server-set tenant_id reinforces this.
Portal limited to monitoring, scorecard & token visibility	✓ Met. Portal surface is dashboards, scorecard/KPI views, token costs, approval queue, activity log, memory console — monitoring and human-gating only, no employee authoring. SEC-01 removes credentials from its read surface; SEC-12 hardens access.

AI Employee Platform: the architecture holds. Five gaps must close before it ships.

Four trust boundaries

External content

Internal service mesh

Inbound public

Outbound egress

P0 — must fix before the affected phase ships

Connections credential storage: the encryption mechanism is asserted, not designed

Internal seams: two of three service-to-service calls have no authentication

Prompt injection: adapter-fetched content reaches LLM nodes undelimited

PII leakage into Langfuse Cloud traces

Docker port exposure: five internal services published to the host

P1 — must fix before production cutover

Memory poisoning through the Insights write path

Presidio fail-closed: guaranteed on cron, unproven on the conversational path

tenant_id enforcement in Payload RBAC: read-side designed, write-side unspecified

Audit-log immutability: "append-only" is a label, not yet a property

Slack request-signature verification: unstated assumption on the only public inbound path

P2 — scheduled hardening

Supply-chain pinning

Payload admin portal hardening

Backup data protection

Zero-trust overlay on an unchanged topology

Zero-Trust Internal Seams

Credential Encryption & Secrets Handling

Untrusted Content & Memory Hygiene

Observability Privacy — Masked-Only Traces

Mitigations mapped to implementation phases

The design still meets the stated product requirements