An AI Agent Broke McKinsey in Two Hours. Here's the Identity Layer That Would Have Stopped It.

by R. Demetri Vallejos
security · agent-identity · SSI · enterprise · defense

On February 28, 2026, an autonomous AI agent compromised McKinsey's internal platform Lilli — 46.5 million chat messages, 728,000 confidential files, every AI system prompt writable. The vulnerability was SQL injection, a decades-old flaw. The lesson is about what autonomous agents change in 2026.


What Happened

McKinsey's Lilli is a generative AI platform used by 72% of the firm's employees. It processes over 500,000 prompts monthly across 100,000 documents spanning a century of proprietary research, client engagements, and strategic analysis.

CodeWall — a red-team security startup — deployed an autonomous AI agent against it. Not a human operator running tools. An agent acting on its own initiative. It found Lilli's publicly exposed API documentation: over 200 endpoints, 22 requiring no authentication.

One unprotected endpoint accepted user search queries. Parameter values were safely parameterized. But JSON field names — the keys — were concatenated directly into SQL. When database errors reflected back, the agent recognized the injection surface. Fifteen blind iterations. Each error revealing more structure. Then live production data.
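The pattern described above can be sketched in a few lines. This is a hypothetical reconstruction, not Lilli's actual code: the function and table names are illustrative, but the flaw class is exactly this shape — values bound safely while JSON keys flow into the SQL text.

```typescript
// Hypothetical reconstruction of the vulnerable pattern: parameter
// VALUES are bound safely, but JSON KEYS are concatenated into SQL.
function buildSearchQuery(
  filters: Record<string, string>,
): { sql: string; params: string[] } {
  const clauses: string[] = [];
  const params: string[] = [];
  for (const [key, value] of Object.entries(filters)) {
    clauses.push(`${key} = ?`); // UNSAFE: the JSON key lands in the SQL text
    params.push(value);         // safe: the value is bound as a parameter
  }
  return {
    sql: `SELECT * FROM searches WHERE ${clauses.join(' AND ')}`,
    params,
  };
}

// The fix is an allow-list: keys never reach the SQL string unchecked.
const ALLOWED_KEYS = new Set(['user_id', 'query', 'created_at']);
function safeKey(key: string): string {
  if (!ALLOWED_KEYS.has(key)) throw new Error(`unknown filter key: ${key}`);
  return key;
}
```

A key like `user_id = user_id) UNION SELECT ... --` sails straight through the first function; `safeKey` rejects it before any SQL is built.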

The agent chained the SQL injection with an IDOR vulnerability to read individual employees' search histories. Within two hours:

| Exposed Asset | Volume |
| --- | --- |
| Chat messages (plaintext) | 46.5 million |
| Confidential files | 728,000 |
| User accounts | 57,000 |
| AI assistants | 384,000 |
| Workspaces | 94,000 |
| RAG document chunks | 3.68 million |
| External API files & messages | 1.3 million |

The injection was not read-only. Lilli's system prompts — the instructions controlling how the AI behaves for every one of McKinsey's 40,000 consultants — were stored in the same database, writable with a single UPDATE statement. An attacker could have silently poisoned every strategic recommendation, every M&A analysis, every client deliverable. Invisible to the consultants relying on it.

McKinsey patched all unauthenticated endpoints within 24 hours of disclosure. The vulnerability had been live for over two years. Standard tools, including OWASP ZAP, had missed it.


What Actually Failed

This was a traditional AppSec failure that reached an AI system — not a model jailbreak. Three missing controls:

No authentication on 22 endpoints. The attacking agent connected to production APIs without proving identity. No credential. No challenge. The door was open.

SQL injection via JSON key concatenation. Values were parameterized; field names were not. A vulnerability class that has topped the OWASP Top 10 for over two decades.

No object-level authorization. Any agent with access to one record could enumerate all records.
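What object-level authorization looks like in practice is a check against the specific record, not merely a check that the caller is logged in. A minimal sketch, with illustrative names and an in-memory store standing in for the database:

```typescript
interface SearchRecord {
  id: string;
  ownerId: string;
  query: string;
}

// In-memory stand-in for the database.
const records = new Map<string, SearchRecord>([
  ['r1', { id: 'r1', ownerId: 'alice', query: 'q4 forecast' }],
  ['r2', { id: 'r2', ownerId: 'bob', query: 'merger model' }],
]);

function getSearchRecord(callerId: string, recordId: string): SearchRecord {
  const rec = records.get(recordId);
  if (!rec) throw new Error('not found');
  // Object-level check: compare the caller against THIS record's owner.
  // Without this line, knowing (or enumerating) an ID is enough to read it.
  if (rec.ownerId !== callerId) throw new Error('forbidden');
  return rec;
}
```

With the ownership check in place, enumerating IDs yields only `forbidden` errors instead of other employees' search histories.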

The AI layer amplified the damage — an autonomous agent can run fifteen blind injection attempts methodically, at machine speed. But the vulnerability was software-level. The AI just removed the human bottleneck.


Why This Will Keep Happening

The McKinsey breach is not an outlier. It is the default state of every AI agent deployment in 2026.

  • 88% of organizations reported a confirmed or suspected AI agent security incident in the past year
  • 80% reported risky agent behaviors including unauthorized system access
  • Only 21% of executives reported complete visibility into agent permissions

The reason is structural. AI agents today have no identity. They are processes with API keys. When an agent connects to a service, that service cannot answer three basic questions:

1. Who is this agent? The CodeWall agent connected to Lilli's API with no verifiable identity — not which organization, not which agent, not whether it was authorized to be there. Authenticated endpoints relied on session tokens proving a user had logged in, not that a specific agent was authorized for a specific operation.

2. What is this agent authorized to do? Even knowing the agent's identity, there was no portable, verifiable artifact specifying permitted operations. Agent permissions in most systems are configuration fields stored in databases, checked by code paths that can have bugs. There is no cryptographic boundary between "read search queries" and "write system prompts."

3. Did this agent actually perform this action? Lilli's audit logs were unsigned database rows. If the attacker had modified logs alongside system prompts, the tampering would be undetectable. An audit trail without signatures is just a story.


What Would Have Prevented It

The missing identity layer has three properties — each mapping directly to one of the failure modes above.

Cryptographic Identity — answers "who"

Every agent gets a decentralized identifier (DID) backed by a post-quantum keypair. Not a session token. Not an API key. A self-certifying identifier where the DID is the hash of the public key — provably bound to key material without any external registry.

When an agent connects, it presents its DID. The service verifies cryptographically — no auth server, no API call, no vendor dependency. The proof is math.
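The self-certifying property can be sketched in a few lines. This is a minimal illustration, not the Aethyr implementation: Ed25519 and SHA-256 stand in for ML-DSA-65 and BLAKE3 (neither post-quantum primitive ships with Node's standard library), and the `agent` namespace is illustrative.

```typescript
import { createHash, generateKeyPairSync } from 'node:crypto';

// Self-certifying identifier: the DID *is* the hash of the public key.
// SHA-256 stands in for BLAKE3, Ed25519 for ML-DSA-65 (illustration only).
function didFromPublicKey(publicKeyDer: Buffer): string {
  const digest = createHash('sha256').update(publicKeyDer).digest('hex');
  return `did:aethyr:agent:${digest}`;
}

// Verification needs no registry, auth server, or network: recompute the
// hash of the presented key and compare it with the identifier itself.
function keyMatchesDid(did: string, publicKeyDer: Buffer): boolean {
  return did === didFromPublicKey(publicKeyDer);
}

const { publicKey } = generateKeyPairSync('ed25519');
const der = publicKey.export({ type: 'spki', format: 'der' });
const did = didFromPublicKey(der);
```

Because the binding is a hash, presenting a different key for the same DID fails the comparison — no external registry has to vouch for anything.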

Verifiable Credentials — answers "what"

Permissions are not configuration fields. They are W3C Verifiable Credentials — digitally signed documents stating:

"Organization X authorizes Agent Y to perform operations Z, signed with Organization X's key, expiring on date T."

The credential is portable and verifiable at any service boundary. The service checks the signature, the expiry, and the claimed capabilities. If the credential doesn't grant system prompt writes, the operation is denied — not by a code path, but by a cryptographic check that either passes or fails.

In the McKinsey scenario, the CodeWall agent had no valid credential. It would never have reached the SQL injection surface.
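The three checks a service boundary runs — signature, expiry, capability — can be sketched directly. Ed25519 stands in for ML-DSA-65 here, and the credential fields are illustrative rather than the W3C VC data model verbatim:

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from 'node:crypto';

// Illustrative credential shape (not the full W3C VC data model).
interface Credential {
  issuer: string;
  subject: string;
  capabilities: string[];
  expires: string; // ISO 8601
}

// Either all three checks pass, or the operation is denied.
function checkCredential(
  cred: Credential,
  signature: Buffer,
  issuerKey: KeyObject,
  needed: string,
): boolean {
  const payload = Buffer.from(JSON.stringify(cred));
  if (!verify(null, payload, issuerKey, signature)) return false; // forged or tampered
  if (new Date(cred.expires).getTime() < Date.now()) return false; // expired
  return cred.capabilities.includes(needed); // capability actually granted?
}

// The issuing organization signs once; any boundary verifies offline.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const cred: Credential = {
  issuer: 'did:aethyr:org:example',
  subject: 'did:aethyr:agent:example',
  capabilities: ['search:read'],
  expires: '2999-01-01T00:00:00Z',
};
const sig = sign(null, Buffer.from(JSON.stringify(cred)), privateKey);
```

Note the asymmetry: a valid signature on a credential that only grants `search:read` still denies a `prompts:write` request. The boundary is the cryptographic check, not a code path.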

Signed Actions — answers "did it"

Every agent action — every API call, every database write, every tool invocation — is signed with the agent's private key. The signature is stored alongside the audit log entry. Years later, anyone can verify that Agent X performed Action Y at Time T by checking the signature against the known public key.

If an attacker gains database write access, they can modify records but cannot forge signatures. Tampering becomes detectable. The audit trail becomes a cryptographic proof.
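A minimal sketch of the signed-entry idea, again with Ed25519 standing in for ML-DSA-65 and illustrative field names:

```typescript
import { generateKeyPairSync, sign, verify } from 'node:crypto';

interface AuditEntry {
  agent: string;
  action: string;
  at: string;
  sig: string; // base64 signature over (agent, action, at)
}

const { publicKey, privateKey } = generateKeyPairSync('ed25519');

// The agent signs each entry with its private key at write time.
function recordAction(agent: string, action: string): AuditEntry {
  const at = new Date().toISOString();
  const payload = Buffer.from(JSON.stringify([agent, action, at]));
  const sig = sign(null, payload, privateKey).toString('base64');
  return { agent, action, at, sig };
}

// Anyone holding the agent's public key can re-check the entry later.
// Editing any field breaks the signature; re-signing needs the private key.
function verifyEntry(entry: AuditEntry): boolean {
  const payload = Buffer.from(
    JSON.stringify([entry.agent, entry.action, entry.at]),
  );
  return verify(null, payload, publicKey, Buffer.from(entry.sig, 'base64'));
}
```

An attacker who rewrites `action` in the database leaves a row whose signature no longer verifies — the tampering is self-announcing to any auditor with the public key.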


The Protocol

This is not theoretical. The cryptographic primitives are NIST-standardized and production-ready today.

| Component | Specification |
| --- | --- |
| Signature scheme | ML-DSA-65 (NIST FIPS 204) |
| Quantum security | 128-bit post-quantum |
| Key derivation | BLAKE3-KDF from 32-byte master seed |
| DID method | did:aethyr:&lt;namespace&gt;:&lt;identifier&gt; |
| Public key size | 1,952 bytes |
| Signature size | 3,309 bytes |
| Verification time | ~0.5 ms |
| Verify library size | 3.8 kB, zero dependencies |

The identifier is the BLAKE3 hash of the ML-DSA-65 public key — self-certifying, no blockchain required, no vendor dependency for verification. Any MCP server, API gateway, or agent framework can verify an Aethyr credential without an account or an internet connection.

```typescript
import { verifyCredential } from '@aethyrai/ssi-verify';

const result = await verifyCredential(credential, issuerPublicKey);
// result.valid → true | false
// No API call. No account. No network request.
```

A note on enforcement: credential verification is most effective when applied at the infrastructure layer — an API gateway or mTLS boundary — rather than as application middleware. Application-layer enforcement is still subject to the kind of vulnerabilities CodeWall exploited. Gateway-layer enforcement is not.


The Stakes

The agent economy is growing at 41% CAGR — $7.8 billion in 2025, projected $52.6 billion by 2030. Agents are moving from answering questions to executing transactions, accessing financial systems, and delegating to other agents.

The McKinsey incident exposed chat logs and files. The next McKinsey-scale breach will involve an agent executing financial transactions or operating inside critical infrastructure. At that point, an unsigned audit trail isn't just a legal liability — it's a mission integrity failure.

The question is not whether agent identity becomes mandatory. The question is whether the infrastructure exists when the market demands it.

It does now.


The Aethyr SSI protocol is open. The verification library is public. The specification is published. If you're building agent infrastructure and want to integrate cryptographic identity: github.com/aethyrai/ssi-verify

For defense and critical infrastructure deployments requiring air-gapped credential verification or sovereign DID namespace registration, contact Aethyr Research directly.