Adversarial security and privacy review of the Hermes + LangGraph + Payload architecture for the Ops Analyst digital employee. 13 findings across credential handling, internal seams, untrusted input, memory integrity, PII egress, access control, audit integrity, and network exposure.
The design is strong on numeric integrity (deterministic KPIs, grounding gate, deterministic ratings) and on LLM-provider egress (Presidio → LiteLLM → Anthropic zero-data-retention). It is underspecified on the internal mesh and inbound boundaries, and has one genuine leak on the observability egress path.
Gmail bodies, Slack messages, Jira descriptions, Productive/QuickBooks records enter via adapters. Anyone who can email the agency can place attacker-controlled text into LLM prompts. → SEC-03, SEC-04
Six services — Hermes, LangGraph :8100, Payload :3000, LiteLLM :4000, Presidio :5001/:5002, Postgres :5432 — calling each other over HTTP/TCP. → SEC-02, SEC-10
Slack events and approval-button interactivity reaching Hermes; the Payload admin portal serving humans. → SEC-09, SEC-12, SEC-07, SEC-08
Anthropic API is contracted zero-data-retention. Langfuse Cloud is a separate SaaS outside that contract. → SEC-06, SEC-01, SEC-13
Exploitable as designed, or a compliance breach. Every P0 is a specification or configuration gap on an unchanged topology.
The spec claims credentials are "encrypted at rest via Payload field-level encryption." Payload v3 has no general-purpose field-level encryption — this is a phantom control. As written, Phase 2 ships tenant OAuth tokens and API keys (QuickBooks, Gmail, Productive, Jira, Slack) as plaintext JSON, readable by any DB access and returned by the REST API to any role that can read Connections.
- beforeChange hook encrypts with AES-256-GCM; versioned key from env (v1) → KMS/Vault (prod).
- credentials never serialized to portal or REST reads (access.read: false).

Hermes → LangGraph (:8100): no auth specified anywhere. Anyone who reaches the port can invoke the agent, query client financials conversationally, and burn token budget. Graph → LiteLLM (:4000): no master key specified; a LiteLLM proxy without LITELLM_MASTER_KEY accepts unauthenticated requests, which means free Anthropic calls on Makro's bill. LangGraph → Payload has an API key, but its scope, storage, and rotation are unspecified.
- […] /health.
- LITELLM_MASTER_KEY set; the graph uses a budget-scoped virtual key, never the master key.

Gmail, Slack, and Jira bodies are attacker-controlled input; an external party needs nothing more than the agency's email address. The deterministic spine already limits blast radius: LLM nodes have no tools, ratings are deterministic, the grounding gate diffs every number, and Hermes executes delivery from stored intents. But injected text can still skew narrative prose and recommendations humans act on, poison the Insights memory (SEC-04), echo through conversational answers, and attempt exfiltration through model-authored delivery-intent fields.
The observability design sends per-node input/output to Langfuse Cloud — a third-party SaaS outside the Anthropic ZDR contract. Graph state contains raw adapter data (pre-redaction), the restored report (post-redaction), and conversational user messages. Tracing node I/O as designed ships exactly the PII that ADR-011 spends two services keeping away from the model provider. This silently defeats the privacy architecture.
- core/llm.py chokepoint: masked prompt and pre-restore response, both PII-free by construction.

The compose file publishes every service: Postgres 5432, Payload 3000, Presidio 5001/5002, LiteLLM 4000, LangGraph 8100. Docker-published ports bind 0.0.0.0 and bypass ufw/host firewalls (Docker programs iptables directly). On a VPS this exposes the unauthenticated LiteLLM and agent API (SEC-02), Presidio, and Postgres to the internet; one misconfiguration converts every internal-seam gap into a remote one.
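
The exposure described above could be caught mechanically before deploy. The following is a minimal sketch, assuming the compose file uses the short `ports` syntax (`IP:HOST:CONTAINER`, `HOST:CONTAINER`, or `CONTAINER`); the function names are hypothetical, and unbracketed IPv6 hosts are not handled.

```python
import ipaddress
from typing import Optional


def published_host_ip(port_mapping: str) -> Optional[str]:
    """Return the host IP a short-syntax compose port entry binds to.

    None means the port is not published to the host at all.
    """
    spec = port_mapping.split("/", 1)[0]      # drop a "/tcp" or "/udp" suffix
    if spec.startswith("["):                  # bracketed IPv6 host, e.g. "[::1]:3000:3000"
        return spec[1:].partition("]")[0]
    parts = spec.split(":")
    if len(parts) == 1:
        return None                           # "8100": container-only
    if len(parts) == 2:
        return "0.0.0.0"                      # "8100:8100": every interface by default
    return parts[0]                           # "127.0.0.1:8100:8100"


def is_exposed(port_mapping: str) -> bool:
    """True when the entry publishes a port on a non-loopback address."""
    host = published_host_ip(port_mapping)
    if host is None:
        return False
    return not ipaddress.ip_address(host).is_loopback
```

A CI step could run `is_exposed` over every `ports` entry and fail on any hit, which complements (but does not replace) an external port scan of the deployed host.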
- 127.0.0.1:8100:8100 loopback only (Hermes runs natively on the same host).

Real weaknesses with mitigating context. All gate Phase 10 / production, not individual phase ships.
Insights are agent-authored, persisted, and fed into every future prompt — the persistence vector for SEC-03. One poisoned run writes "Acme always pays late — suppress AR alerts," and every later run inherits it. Existing controls are real: Directives are human-pinned (OWASP ASI06), evidence_refs required, ADD/UPDATE/DELETE/NOOP write policy, human-gated promotion, TTL on unverified entries. Gaps: insights enter prompts before any review, refs are required but not validated, no content constraints, and suppression-type insights are treated like ordinary observations.
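
One way to close the "required but not validated" gap is to resolve every ref server-side and reject any that does not point at a row owned by the writing tenant. The sketch below is illustrative only: `EvidenceRow`, `validate_evidence_refs`, the `lookup` callable, and the kind labels are assumptions, not names from the design.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class EvidenceRow:
    id: str
    tenant_id: str
    kind: str  # assumed labels: "KpiSnapshot" or "run"


ALLOWED_KINDS = frozenset({"KpiSnapshot", "run"})


def validate_evidence_refs(
    refs: Sequence[str],
    tenant_id: str,
    lookup: Callable[[str], Optional[EvidenceRow]],
) -> List[str]:
    """Return a list of problems; an empty list means the refs are acceptable."""
    if not refs:
        return ["evidence_refs is empty"]
    problems = []
    for ref in refs:
        row = lookup(ref)  # hypothetical datastore lookup, returns None if missing
        if row is None:
            problems.append(f"{ref}: does not resolve")
        elif row.kind not in ALLOWED_KINDS:
            problems.append(f"{ref}: unsupported evidence kind {row.kind}")
        elif row.tenant_id != tenant_id:
            problems.append(f"{ref}: belongs to another tenant")
    return problems
```

An insight write would be rejected whenever the returned list is non-empty, so a poisoned run cannot cite evidence that does not exist or that belongs to someone else.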
- evidence_refs server-side: must resolve to same-tenant KpiSnapshot/run rows.
- […] pending_review) before becoming prompt-eligible.

The design blocks LLM calls with REDACTION_UNAVAILABLE when Presidio is down. On the cron path mask/restore is explicit in the graph. On the conversational path the guarantee currently rests on every call site remembering to mask, and RedactionConfigs.enabled is a portal-editable master switch.
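
A guarantee that rests on every call site can be turned into one that rests on a single module by having CI reject LiteLLM imports anywhere else. The following is a rough sketch using the standard-library `ast` module. It assumes the graph reaches LiteLLM through the `litellm` Python package (if it calls the proxy over HTTP, the rule would need to target the HTTP client instead), and the helper names are hypothetical.

```python
import ast
from typing import Dict, List

ALLOWED_MODULE = "core/llm.py"  # the only file permitted to import litellm


def litellm_imports(source: str) -> List[int]:
    """Return the line numbers of statements that import the litellm package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        if any(n == "litellm" or n.startswith("litellm.") for n in names):
            hits.append(node.lineno)
    return sorted(hits)


def violations(files: Dict[str, str]) -> List[str]:
    """Map of path -> source text in, list of 'path:line' offences out."""
    found = []
    for path, source in sorted(files.items()):
        if path == ALLOWED_MODULE:
            continue
        found.extend(f"{path}:{line}" for line in litellm_imports(source))
    return found
```

A CI job would read the repository's Python files into the `files` mapping and fail the build when `violations` returns anything. Dynamic imports (`importlib.import_module`) would slip past this check and would need a separate rule.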
- core/llm.py is the only module allowed to call LiteLLM: mask → send → restore inside, fail closed. CI rule: LiteLLM imports anywhere else fail the build.
- […] super_admin-only, audit-logged, and posts a Slack notice.
- […] REDACTION_UNAVAILABLE and zero LiteLLM egress.

The tenantScoped access function constrains reads. Nothing constrains writes: a tenant admin could create or update a document carrying a foreign tenant_id, and the field is mutable after creation. v1 is single-tenant so exploitation is theoretical, but the design promise is "all interfaces accept tenant_id from day one," so the enforcement must exist from day one too, or productization inherits a latent cross-tenant hole.
- create/update access functions force tenant_id from the authenticated user's scope: server-set, never client-supplied.
- tenant_id immutable after create (beforeChange hook rejects changes).

A Payload collection is mutable by default. A compromised or malicious admin account could edit or delete AuditLog entries, and an audit log must be trustworthy precisely when an account is compromised.
- update: false, delete: false for every role including super_admin.
- REVOKE UPDATE, DELETE from the app role + a BEFORE UPDATE OR DELETE trigger raising an exception.

Slack → Hermes is the platform's only unauthenticated-by-default public inbound surface, and approval decisions ride on it. The docs never state whether Hermes uses the Events API (public HTTPS) or Socket Mode, nor that signatures are verified. A forged request that Hermes trusts could approve a pending action or inject conversational queries.
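
If Hermes ends up on the Events API rather than Socket Mode, every inbound request would need to be verified before it is trusted. The following is a minimal sketch of Slack's documented `v0` signing scheme using only the standard library. The function name and the `now` parameter (added for testability) are assumptions; the caller is expected to pass the raw request body along with the `X-Slack-Request-Timestamp` and `X-Slack-Signature` header values.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 5 * 60  # reject requests older or newer than five minutes


def verify_slack_request(
    signing_secret: str,
    timestamp: str,
    body: bytes,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only for a fresh request carrying a valid Slack signature."""
    try:
        ts = int(timestamp)
    except (TypeError, ValueError):
        return False
    current = time.time() if now is None else now
    if abs(current - ts) > MAX_SKEW_SECONDS:
        return False  # stale or future-dated: possible replay
    # Slack signs "v0:<timestamp>:<raw body>" with the app's signing secret.
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # compare_digest runs in constant time, so it leaks no timing signal.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The check must run against the raw bytes of the body, before any JSON or form parsing, because re-serialised content will not reproduce the signed string.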
- X-Slack-Signature HMAC-SHA256 on every request, 5-minute timestamp window, constant-time compare.
- […] AdminUsers record with operator+ role; Slack identity alone is not authorization. Decisions write through Payload so they hit the AuditLog.

Schedule, don't block.
litellm:main-stable is a mutable tag tracking a fast-moving project that sits on the path of every model call; Presidio images are unpinned. Pin images by digest, commit lockfiles, enable Renovate/Dependabot, and upgrade deliberately — every LiteLLM upgrade reruns the CI no-fallback guard.
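
The digest-pinning rule lends itself to a small CI check. This sketch assumes image references are collected from the compose file elsewhere and passed in as strings; the helper names are hypothetical.

```python
import re
from typing import Iterable, List

# A pinned reference ends in "@sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """True when the image reference names an immutable content digest."""
    return DIGEST_RE.search(image_ref) is not None


def unpinned(image_refs: Iterable[str]) -> List[str]:
    """Return the references that still rely on a mutable tag."""
    return [ref for ref in image_refs if not is_pinned(ref)]
```

A reference such as `litellm:main-stable` would be reported by `unpinned`, while the same image written with an `@sha256:` digest would pass.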
The portal gates approvals and redaction config. Add login lockout/backoff, secure session cookie flags, short super-admin session TTL, 2FA when feasible, TLS via the SEC-10 reverse proxy, and consider IP-allowlisting /admin for a ~20-person team.
Daily pg_dump contains everything: encrypted credentials, PII, the full audit log. Encrypt backups at rest, restrict read access to ops, document the 30-day retention as a privacy commitment (it bounds PII residency), and test restore quarterly.
The three-layer shape, the outbound-only approval model, and the Presidio → LiteLLM → Anthropic privacy pipeline all survive intact. No component changes. The revision is authentication on every seam plus a closed network — four new ADRs.
Resolves SEC-02, SEC-09, SEC-10. Every service-to-service call authenticates; no internal service is reachable beyond its callers. Bearer token on LangGraph, master + virtual keys on LiteLLM, least-privilege Payload service account, Socket Mode for Slack, closed compose network. Cost: five env vars and a network block.
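
The bearer-token check on the LangGraph seam could be as small as the sketch below. It is framework-agnostic and illustrative: `is_authorized` is a hypothetical helper that a middleware would call with the raw `Authorization` header and the token loaded from the environment, returning 401 when it yields False.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only when the header carries the expected bearer token."""
    if not expected_token:
        return False  # fail closed if the service token was never configured
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    presented = presented.strip()
    if scheme.lower() != "bearer" or not presented:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

Failing closed on an empty `expected_token` matters here: a missing environment variable should make the service reject everything rather than accept everything.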
Resolves SEC-01. Application-side AES-256-GCM via Payload hooks, versioned keys (env → KMS), field excluded from all portal reads, decryption only through an audit-logged agent-only endpoint. Tenant OAuth tokens are the keys to client financial systems — their compromise is the worst single outcome this platform can produce.
Resolves SEC-03, SEC-04. Spotlighting for all adapter text, schema-validated structured output, config-resolved delivery channels, validated evidence refs, review gate on escalation-suppressing insights, injection cases in the CI eval gate. Extends the deterministic spine from numbers to prose, persistence, and routing.
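
Two of these controls, spotlighting and config-resolved delivery channels, can be sketched briefly. The delimiter format, function names, and channel map below are assumptions made for illustration; the design does not specify them. The idea is that adapter text is wrapped in a boundary the sender cannot predict, and that the model may only name a channel key, never a destination.

```python
import secrets
from typing import Mapping, Optional


def spotlight(untrusted: str, source: str, boundary: Optional[str] = None) -> str:
    """Wrap external text in a per-call random boundary so it reads as data.

    `source` is set by adapter code (e.g. "gmail"), never by the content itself.
    """
    tag = boundary or secrets.token_hex(8)
    cleaned = untrusted.replace(tag, "")  # defensive: content cannot contain the tag
    return f"<<untrusted:{tag} source={source}>>\n{cleaned}\n<<end:{tag}>>"


def resolve_channel(channel_key: str, configured: Mapping[str, str]) -> str:
    """Map a model-chosen key to a configured destination, or refuse."""
    try:
        return configured[channel_key]
    except KeyError:
        raise ValueError(f"unknown delivery channel key: {channel_key!r}") from None
```

The system prompt would carry a matching rule stating that text between the markers is data and that instructions inside it are never followed. Delimiting lowers the success rate of injection rather than eliminating it, which is why the ADR pairs it with schema validation and eval cases.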
Resolves SEC-06. Langfuse Cloud receives only PII-masked content, captured exclusively inside the core/llm.py chokepoint; all other nodes trace metadata only. Makes the privacy guarantee structural rather than procedural — the same argument that justified in-graph redaction.
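
The shape of such a chokepoint can be sketched as follows. This is not the project's `core/llm.py`; it is a minimal illustration in which `mask`, `send`, `restore`, and `trace` are injected callables standing in for Presidio, LiteLLM, and Langfuse, and `call_llm` and `RedactionUnavailable` are hypothetical names.

```python
from typing import Callable, Dict, Tuple

Mapping = Dict[str, str]


class RedactionUnavailable(RuntimeError):
    """Raised instead of sending anything when masking cannot be performed."""


def call_llm(
    prompt: str,
    mask: Callable[[str], Tuple[str, Mapping]],
    send: Callable[[str], str],
    restore: Callable[[str, Mapping], str],
    trace: Callable[[str, str], None],
) -> str:
    """Mask, send, trace, restore; nothing leaves the process unmasked."""
    try:
        masked_prompt, mapping = mask(prompt)
    except Exception as exc:
        # Fail closed: no masking means no egress, not a best-effort send.
        raise RedactionUnavailable("REDACTION_UNAVAILABLE") from exc
    masked_response = send(masked_prompt)
    # Tracing happens before restore, so only masked text reaches the tracer.
    trace(masked_prompt, masked_response)
    return restore(masked_response, mapping)
```

Because the trace call sits between `send` and `restore`, the tracer cannot see restored PII regardless of how callers use the function, which is what makes the guarantee structural.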
Every finding lands inside the existing 10-phase plan. No rework of ADR-001…012.
| Phase | Security deliverables | Findings |
|---|---|---|
| 1 — Setup & infra | Closed compose network, loopback-bound LangGraph, TLS reverse proxy, LiteLLM master key in config, image digest pinning, encrypted backups | SEC-02, SEC-10, SEC-11, SEC-13 |
| 2 — Payload collections | Credential encryption hooks + field access, write-side tenant enforcement + immutable tenant_id, AuditLog update/delete denial + DB trigger, RedactionConfigs privileged toggle, evidence_refs validation, access-function test matrix | SEC-01, SEC-04, SEC-05, SEC-07, SEC-08, SEC-12 |
| 3 — Adapters | Content sanitization + delimiting in transform(), in-memory-only credential handling | SEC-01, SEC-03 |
| 4 — LangGraph brain | Bearer-token middleware, core/llm.py chokepoint (mask/restore + trace capture + CI import rule), prompt contracts, structured outputs, intent validation, insight write policy | SEC-02, SEC-03, SEC-04, SEC-05, SEC-06 |
| 5 — Hermes gateway | Bearer token to LangGraph, Socket Mode / signature verification, delivery-intent allowlist enforcement | SEC-02, SEC-03, SEC-09 |
| 6 — HITL approvals | Slack user → AdminUsers mapping on approval decisions; decisions write through Payload (audit-logged) | SEC-09 |
| 8 — Observability | Langfuse capture policy (masked-only), client masking function, no-I/O on raw-data nodes | SEC-06 |
| 9 — Dashboard | Memory Console review/retire flow, portal auth hardening | SEC-04, SEC-12 |
| 10 — E2E & hardening | Seam 401 tests, credential plaintext probe, injection eval suite, poisoning E2E, fail-closed E2E both paths, cross-tenant probes, audit-mutation tests, forged-Slack tests, external port-scan assertion, restore drill | All |
None of the required mitigations alter product behavior. The review confirms conformance after revision:
| Requirement | Conformance after revision |
|---|---|
| Slack conversational mode | ✓ Met. Hermes Slack bindings → LangGraph API. SEC-02 adds a bearer token, SEC-09 verifies Slack's signature — user experience unchanged. |
| Self-learning memory (Directives + Insights) | ✓ Met. Directives human-pinned, Insights agent-written with validated evidence. SEC-04 quarantines only escalation-suppressing insights; the ordinary learning loop is unchanged. |
| Subgraph seam for sub-agents | ✓ Met. ADR-012 contracts untouched. Future sub-agents inherit the core/llm.py chokepoint, so the seam stays privacy-safe by construction. |
| Single-tenant deployment, code-only AI Employee creation | ✓ Met. AI Employees are created in code/seed (graph module + config docs); the portal cannot create employees. SEC-07's server-set tenant_id reinforces this. |
| Portal limited to monitoring, scorecard & token visibility | ✓ Met. Portal surface is dashboards, scorecard/KPI views, token costs, approval queue, activity log, memory console — monitoring and human-gating only, no employee authoring. SEC-01 removes credentials from its read surface; SEC-12 hardens access. |
The architecture's core decisions hold — all 12 existing ADRs survive intact, and the deterministic spine (KPIs, ratings, grounding gate) plus the Presidio → LiteLLM → Anthropic ZDR pipeline are genuinely strong privacy engineering. The verdict is REVISE because two P0 clusters would otherwise ship as designed: SEC-01's phantom encryption control (Payload field-level encryption doesn't exist) and SEC-02 + SEC-10's open internal mesh (unauthenticated services published to the host network). The fix is a zero-trust overlay — ADR-013 through ADR-016 — on an unchanged topology, absorbed entirely within the existing 10-phase plan.