RunstackLabs
Security & Privacy Review — 2026-06-11

AI Employee Platform: the architecture holds. Five gaps must close before it ships.

Adversarial security and privacy review of the Hermes + LangGraph + Payload architecture for the Ops Analyst digital employee. 13 findings across credential handling, internal seams, untrusted input, memory integrity, PII egress, access control, audit integrity, and network exposure.

SUBJECT Hybrid Hermes + LangGraph + Payload CMS SCOPE Design spec + 12 ADRs + 10-phase plan FIRST ROLE Ops Analyst (Makro Agency)
VERDICT: REVISE — zero-trust overlay, 4 new ADRs, no topology change
5
P0 — block phase ship
5
P1 — block production
3
P2 — scheduled hardening
12/12
Existing ADRs intact
01 — Threat Model

Four trust boundaries

The design is strong on numeric integrity (deterministic KPIs, grounding gate, deterministic ratings) and on LLM-provider egress (Presidio → LiteLLM → Anthropic zero-data-retention). It is underspecified on the internal mesh and inbound boundaries, and has one genuine leak on the observability egress path.

B1

External content

Gmail bodies, Slack messages, Jira descriptions, Productive/QuickBooks records enter via adapters. Anyone who can email the agency can place attacker-controlled text into LLM prompts. → SEC-03, SEC-04

B2

Internal service mesh

Six services — Hermes, LangGraph :8100, Payload :3000, LiteLLM :4000, Presidio :5001/:5002, Postgres :5432 — calling each other over HTTP/TCP. → SEC-02, SEC-10

B3

Inbound public

Slack events and approval-button interactivity reaching Hermes; the Payload admin portal serving humans. → SEC-09, SEC-12, SEC-07, SEC-08

B4

Outbound egress

Anthropic API is contracted zero-data-retention. Langfuse Cloud is a separate SaaS outside that contract. → SEC-06, SEC-01, SEC-13

02 — Findings

P0 — must fix before the affected phase ships

Exploitable as designed, or a compliance breach. Every P0 is a specification or configuration gap on an unchanged topology.

SEC-01

Connections credential storage: the encryption mechanism is asserted, not designed

P0

The spec claims credentials are "encrypted at rest via Payload field-level encryption." Payload v3 has no general-purpose field-level encryption — this is a phantom control. As written, Phase 2 ships tenant OAuth tokens and API keys (QuickBooks, Gmail, Productive, Jira, Slack) as plaintext JSON, readable by any DB access and returned by the REST API to any role that can read Connections.

Required design
  • beforeChange hook encrypts with AES-256-GCM; versioned key from env (v1) → KMS/Vault (prod).
  • Field-level access: credentials never serialized to portal or REST reads (access.read: false).
  • Decryption only via a dedicated, audit-logged endpoint restricted to the agent service account.
  • Credentials never enter LangGraph checkpoints, traces, or logs.
MITIGATION PHASES Phase 2Phase 3Phase 10
SEC-02

Internal seams: two of three service-to-service calls have no authentication

P0

Hermes → LangGraph :8100 — no auth specified anywhere. Anyone who reaches the port can invoke the agent, query client financials conversationally, and burn token budget. Graph → LiteLLM :4000 — no master key specified; a LiteLLM proxy without LITELLM_MASTER_KEY accepts unauthenticated requests: free Anthropic calls on Makro's bill. LangGraph → Payload has an API key, but its scope, storage, and rotation are unspecified.

Required design
  • LangGraph API requires a bearer token (constant-time compare) on all routes except /health.
  • LITELLM_MASTER_KEY set; the graph uses a budget-scoped virtual key, never the master key.
  • Payload agent key becomes a least-privilege service account limited to the collections the graph touches.
  • Seam tests assert 401 without a token (Phase 10).
MITIGATION PHASES Phase 1Phase 4Phase 5Phase 10
SEC-03

Prompt injection: adapter-fetched content reaches LLM nodes undelimited

P0

Gmail, Slack, and Jira bodies are attacker-controlled input — an external party needs nothing more than the agency's email address. The deterministic spine already limits blast radius: LLM nodes have no tools, ratings are deterministic, the grounding gate diffs every number, Hermes executes delivery from stored intents. But injected text can still skew narrative prose and recommendations humans act on, poison the Insights memory (SEC-04), echo through conversational answers, and attempt exfiltration through model-authored delivery-intent fields.

Required design
  • Spotlighting: all adapter free text wrapped in explicit data delimiters; marker sequences stripped from source.
  • Structured output: report and justification nodes emit schema-validated JSON; reject on violation.
  • Intent allowlists: delivery channels resolve from config, never model output; Hermes rejects non-allowlisted intents.
  • Injection eval set: adversarial emails/tickets in the CI eval gate; assert ratings unchanged, no allowlist violations.
MITIGATION PHASES Phase 3Phase 4Phase 5Phase 10
SEC-06

PII leakage into Langfuse Cloud traces

P0

The observability design sends per-node input/output to Langfuse Cloud — a third-party SaaS outside the Anthropic ZDR contract. Graph state contains raw adapter data (pre-redaction), the restored report (post-redaction), and conversational user messages. Tracing node I/O as designed ships exactly the PII that ADR-011 spends two services keeping away from the model provider. This silently defeats the privacy architecture.

Required design
  • LLM spans captured only inside the core/llm.py chokepoint — masked prompt and pre-restore response, both PII-free by construction.
  • All other nodes trace metadata only (node name, duration, counts, run_id) — no input/output capture.
  • Langfuse client-side masking function as a second guard.
  • Self-hosted Langfuse is the designated path if full-fidelity traces are ever needed.
MITIGATION PHASES Phase 4Phase 8Phase 10
SEC-10

Docker port exposure: five internal services published to the host

P0

The compose file publishes every service: Postgres 5432, Payload 3000, Presidio 5001/5002, LiteLLM 4000, LangGraph 8100. Docker-published ports bind 0.0.0.0 and bypass ufw/host firewalls (Docker programs iptables directly). On a VPS this exposes the unauthenticated LiteLLM and agent API (SEC-02), Presidio, and Postgres to the internet — one misconfiguration converts every internal-seam gap into a remote one.

Required design
  • Presidio, LiteLLM, Postgres: compose-internal network, no published ports.
  • LangGraph: 127.0.0.1:8100:8100 loopback only (Hermes runs natively on the same host).
  • Payload: the only internet-facing service, behind a TLS reverse proxy (Caddy/nginx).
  • Phase 10 E2E includes a port-scan assertion: only 443/3000 reachable externally.
MITIGATION PHASES Phase 1Phase 10

P1 — must fix before production cutover

Real weaknesses with mitigating context. All gate Phase 10 / production, not individual phase ships.

SEC-04

Memory poisoning through the Insights write path

P1

Insights are agent-authored, persisted, and fed into every future prompt — the persistence vector for SEC-03. One poisoned run writes "Acme always pays late — suppress AR alerts," and every later run inherits it. Existing controls are real: Directives are human-pinned (OWASP ASI06), evidence_refs required, ADD/UPDATE/DELETE/NOOP write policy, human-gated promotion, TTL on unverified entries. Gaps: insights enter prompts before any review, refs are required but not validated, no content constraints, and suppression-type insights are treated like ordinary observations.

Required design
  • Validate evidence_refs server-side: must resolve to same-tenant KpiSnapshot/run rows.
  • Cap insight length; reject instruction-pattern content at write time; re-inject as delimited data.
  • Escalation-suppressing insights require human review (pending_review) before becoming prompt-eligible.
  • Memory Console: one-click retire with audit trail.
MITIGATION PHASES Phase 2Phase 4Phase 9Phase 10
SEC-05

Presidio fail-closed: guaranteed on cron, unproven on the conversational path

P1

The design blocks LLM calls with REDACTION_UNAVAILABLE when Presidio is down. On the cron path mask/restore is explicit in the graph. On the conversational path the guarantee currently rests on every call site remembering to mask — and RedactionConfigs.enabled is a portal-editable master switch.

Required design
  • core/llm.py is the only module allowed to call LiteLLM — mask → send → restore inside, fail closed. CI rule: LiteLLM imports anywhere else fail the build.
  • Disabling redaction is super_admin-only, audit-logged, and posts a Slack notice.
  • Conversational E2E: kill Presidio, assert REDACTION_UNAVAILABLE and zero LiteLLM egress.
MITIGATION PHASES Phase 2Phase 4Phase 10
SEC-07

tenant_id enforcement in Payload RBAC: read-side designed, write-side unspecified

P1

The tenantScoped access function constrains reads. Nothing constrains writes: a tenant admin could create or update a document carrying a foreign tenant_id, and the field is mutable after creation. v1 is single-tenant so exploitation is theoretical — but the design promise is "all interfaces accept tenant_id from day one," so the enforcement must exist from day one too, or productization inherits a latent cross-tenant hole.

Required design
  • create/update access functions force tenant_id from the authenticated user's scope — server-set, never client-supplied.
  • tenant_id immutable after create (beforeChange hook rejects changes).
  • Agent service-account key carries a tenant claim enforced identically.
  • Access-function tests for all 15 collections × 4 roles are a Phase 2 success criterion.
MITIGATION PHASES Phase 2Phase 10
SEC-08

Audit-log immutability: "append-only" is a label, not yet a property

P1

A Payload collection is mutable by default. A compromised or malicious admin account could edit or delete AuditLog entries — and an audit log must be trustworthy precisely when an account is compromised.

Required design
  • Payload access: update: false, delete: false for every role including super_admin.
  • DB layer: REVOKE UPDATE, DELETE from the app role + a BEFORE UPDATE OR DELETE trigger raising an exception.
  • Entries record actor, collection, doc id, and diff. Optional P2: hash-chain for tamper evidence.
MITIGATION PHASES Phase 2Phase 10
SEC-09

Slack request-signature verification: unstated assumption on the only public inbound path

P1

Slack → Hermes is the platform's only unauthenticated-by-default public inbound surface, and approval decisions ride on it. The docs never state whether Hermes uses the Events API (public HTTPS) or Socket Mode, nor that signatures are verified. A forged request that Hermes trusts could approve a pending action or inject conversational queries.

Required design
  • Prefer Socket Mode — outbound WebSocket, no public endpoint; confirm Hermes v0.14 support.
  • If Events API: verify X-Slack-Signature HMAC-SHA256 on every request, 5-minute timestamp window, constant-time compare.
  • Approval clicks map the verified Slack user to an AdminUsers record with operator+ role — Slack identity alone is not authorization. Decisions write through Payload so they hit the AuditLog.
MITIGATION PHASES Phase 5Phase 6Phase 10

P2 — scheduled hardening

Schedule, don't block.

SEC-11

Supply-chain pinning

P2

litellm:main-stable is a mutable tag tracking a fast-moving project that sits on the path of every model call; Presidio images are unpinned. Pin images by digest, commit lockfiles, enable Renovate/Dependabot, and upgrade deliberately — every LiteLLM upgrade reruns the CI no-fallback guard.

MITIGATION PHASES Phase 1ongoing
SEC-12

Payload admin portal hardening

P2

The portal gates approvals and redaction config. Add login lockout/backoff, secure session cookie flags, short super-admin session TTL, 2FA when feasible, TLS via the SEC-10 reverse proxy, and consider IP-allowlisting /admin for a ~20-person team.

MITIGATION PHASES Phase 2Phase 9
SEC-13

Backup data protection

P2

Daily pg_dump contains everything: encrypted credentials, PII, the full audit log. Encrypt backups at rest, restrict read access to ops, document the 30-day retention as a privacy commitment (it bounds PII residency), and test restore quarterly.

MITIGATION PHASES Phase 1Phase 10
03 — Revised Architecture

Zero-trust overlay on an unchanged topology

The three-layer shape, the outbound-only approval model, and the Presidio → LiteLLM → Anthropic privacy pipeline all survive intact. No component changes. The revision is authentication on every seam plus a closed network — four new ADRs.

InternetTLS reverse proxyPayload :3000only public service
SlackSocket Mode (outbound WS)Hermes (native)no public HTTP endpoint
Hermesbearer tokenLangGraph 127.0.0.1:8100loopback only
LangGraphvirtual keyLiteLLMcompose-internal, no published port
LangGraphPresidiocompose-internal, no published port
LangGraphscoped service keyPayload RESTleast-privilege account
all servicesPostgrescompose-internal, no published port
ADR-013

Zero-Trust Internal Seams

Resolves SEC-02, SEC-09, SEC-10. Every service-to-service call authenticates; no internal service is reachable beyond its callers. Bearer token on LangGraph, master + virtual keys on LiteLLM, least-privilege Payload service account, Socket Mode for Slack, closed compose network. Cost: five env vars and a network block.

ADR-014

Credential Encryption & Secrets Handling

Resolves SEC-01. Application-side AES-256-GCM via Payload hooks, versioned keys (env → KMS), field excluded from all portal reads, decryption only through an audit-logged agent-only endpoint. Tenant OAuth tokens are the keys to client financial systems — their compromise is the worst single outcome this platform can produce.

ADR-015

Untrusted Content & Memory Hygiene

Resolves SEC-03, SEC-04. Spotlighting for all adapter text, schema-validated structured output, config-resolved delivery channels, validated evidence refs, review gate on escalation-suppressing insights, injection cases in the CI eval gate. Extends the deterministic spine from numbers to prose, persistence, and routing.

ADR-016

Observability Privacy — Masked-Only Traces

Resolves SEC-06. Langfuse Cloud receives only PII-masked content, captured exclusively inside the core/llm.py chokepoint; all other nodes trace metadata only. Makes the privacy guarantee structural rather than procedural — the same argument that justified in-graph redaction.

04 — Mitigation Plan

Mitigations mapped to implementation phases

Every finding lands inside the existing 10-phase plan. No rework of ADR-001…012.

PhaseSecurity deliverablesFindings
1 — Setup & infraClosed compose network, loopback-bound LangGraph, TLS reverse proxy, LiteLLM master key in config, image digest pinning, encrypted backupsSEC-02, SEC-10, SEC-11, SEC-13
2 — Payload collectionsCredential encryption hooks + field access, write-side tenant enforcement + immutable tenant_id, AuditLog update/delete denial + DB trigger, RedactionConfigs privileged toggle, evidence_refs validation, access-function test matrixSEC-01, SEC-04, SEC-05, SEC-07, SEC-08, SEC-12
3 — AdaptersContent sanitization + delimiting in transform(), in-memory-only credential handlingSEC-01, SEC-03
4 — LangGraph brainBearer-token middleware, core/llm.py chokepoint (mask/restore + trace capture + CI import rule), prompt contracts, structured outputs, intent validation, insight write policySEC-02, SEC-03, SEC-04, SEC-05, SEC-06
5 — Hermes gatewayBearer token to LangGraph, Socket Mode / signature verification, delivery-intent allowlist enforcementSEC-02, SEC-03, SEC-09
6 — HITL approvalsSlack user → AdminUsers mapping on approval decisions; decisions write through Payload (audit-logged)SEC-09
8 — ObservabilityLangfuse capture policy (masked-only), client masking function, no-I/O on raw-data nodesSEC-06
9 — DashboardMemory Console review/retire flow, portal auth hardeningSEC-04, SEC-12
10 — E2E & hardeningSeam 401 tests, credential plaintext probe, injection eval suite, poisoning E2E, fail-closed E2E both paths, cross-tenant probes, audit-mutation tests, forged-Slack tests, external port-scan assertion, restore drillAll
05 — Product Conformance

The design still meets the stated product requirements

None of the required mitigations alter product behavior. The review confirms conformance after revision:

RequirementConformance after revision
Slack conversational mode✓ Met. Hermes Slack bindings → LangGraph API. SEC-02 adds a bearer token, SEC-09 verifies Slack's signature — user experience unchanged.
Self-learning memory (Directives + Insights)✓ Met. Directives human-pinned, Insights agent-written with validated evidence. SEC-04 quarantines only escalation-suppressing insights; the ordinary learning loop is unchanged.
Subgraph seam for sub-agents✓ Met. ADR-012 contracts untouched. Future sub-agents inherit the core/llm.py chokepoint, so the seam stays privacy-safe by construction.
Single-tenant deployment, code-only AI Employee creation✓ Met. AI Employees are created in code/seed (graph module + config docs); the portal cannot create employees. SEC-07's server-set tenant_id reinforces this.
Portal limited to monitoring, scorecard & token visibility✓ Met. Portal surface is dashboards, scorecard/KPI views, token costs, approval queue, activity log, memory console — monitoring and human-gating only, no employee authoring. SEC-01 removes credentials from its read surface; SEC-12 hardens access.
06 — Verdict
Security & Privacy Review — 2026-06-11
VERDICT: REVISE

The architecture's core decisions hold — all 12 existing ADRs survive intact, and the deterministic spine (KPIs, ratings, grounding gate) plus the Presidio → LiteLLM → Anthropic ZDR pipeline are genuinely strong privacy engineering. The verdict is REVISE because two P0 clusters would otherwise ship as designed: SEC-01's phantom encryption control (Payload field-level encryption doesn't exist) and SEC-02 + SEC-10's open internal mesh (unauthenticated services published to the host network). The fix is a zero-trust overlay — ADR-013 through ADR-016 — on an unchanged topology, absorbed entirely within the existing 10-phase plan.