
The Sysdig Threat Research Team confirmed the first in-the-wild intrusion run by a large language model agent rather than a human operator. The chain went from Marimo's CVE-2026-39987 pre-auth RCE to a full PostgreSQL exfiltration in under an hour; the database dump itself took under two minutes. The agent fanned 12 cloud API calls across 11 IPs in a 22-second burst to evade per-source-IP detection. This is what machine-speed offense looks like in production.
The Sysdig Threat Research Team documented the moment the industry had been bracing for: the first confirmed in-the-wild intrusion in which a large language model agent, not a human operator, ran the post-exploitation itself. The break-in began with a single unauthenticated request and ended, under an hour later, with an internal PostgreSQL database copied out in full. The final exfiltration took under two minutes. Along the way the agent fanned 12 cloud API calls across 11 different IP addresses in a 22-second burst, spreading the traffic so that no alarm tuned to "many requests from one suspicious IP" would fire. A machine that strikes in minutes and almost never shows up twice from the same address is not a threat a human queue can catch.
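One way to see why per-source-IP alarms miss this kind of burst is to key the detection on the credential instead of the address. The following is a minimal sketch, not Sysdig's or any vendor's detection logic. The `ApiCall` record, its field names, and the thresholds are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiCall:
    """One cloud API call (hypothetical schema, not a real log format)."""
    timestamp: float     # seconds since epoch
    credential_id: str   # e.g. an access key ID
    source_ip: str


def find_ip_fanout(calls, window_seconds=30.0, max_distinct_ips=3):
    """Return credential IDs used from more than max_distinct_ips
    addresses inside any sliding window of window_seconds.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    recent = defaultdict(deque)  # credential_id -> deque of (timestamp, ip)
    flagged = set()
    for call in sorted(calls, key=lambda c: c.timestamp):
        window = recent[call.credential_id]
        window.append((call.timestamp, call.source_ip))
        # Drop calls that have aged out of the sliding window.
        while call.timestamp - window[0][0] > window_seconds:
            window.popleft()
        # Count distinct source addresses still inside the window.
        if len({ip for _, ip in window}) > max_distinct_ips:
            flagged.add(call.credential_id)
    return flagged
```

Fed a burst shaped like the one described above (12 calls, 11 addresses, 22 seconds), this flags the credential even though no single address makes more than two calls, which is the property a per-IP rate alarm lacks.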
The entry point was CVE-2026-39987, a pre-authentication remote-code-execution flaw in Marimo, an open-source Python notebook popular with data scientists and AI researchers. The bug was narrow and very human: the integrated terminal's WebSocket endpoint, /terminal/ws, skipped the authentication check that every other route on the same server enforced. If an instance is reachable from a browser (often a misconfigured proxy is all it takes), an unauthenticated visitor gets a full interactive shell as the user running the process. The flaw is fixed in Marimo 0.23.0 and later.
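Checking whether a given install predates the fix is a simple version comparison. The sketch below assumes plain dotted version strings and assumes the package is installed under the distribution name `marimo`. The helper names are hypothetical, and pre-release suffixes are rejected rather than interpreted.

```python
import re
from importlib import metadata

FIXED_VERSION = (0, 23, 0)  # first fixed Marimo release, per the article


def parse_version(text):
    """Parse a plain dotted version such as '0.22.4' into a tuple of ints."""
    text = text.strip()
    if not re.fullmatch(r"\d+(\.\d+)*", text):
        raise ValueError(f"unsupported version string: {text!r}")
    return tuple(int(part) for part in text.split("."))


def _pad(version, width):
    """Right-pad with zeros so '0.23' compares equal to '0.23.0'."""
    return version + (0,) * (width - len(version))


def is_patched(version_text):
    """True if version_text is at or above the fixed release."""
    parsed = parse_version(version_text)
    width = max(len(parsed), len(FIXED_VERSION))
    return _pad(parsed, width) >= _pad(FIXED_VERSION, width)


def installed_marimo_version():
    """Return the installed version string, or None if it is not installed.

    Assumes the distribution is named 'marimo'.
    """
    try:
        return metadata.version("marimo")
    except metadata.PackageNotFoundError:
        return None
```

A version check only covers the patch. Whether /terminal/ws is reachable through a proxy still has to be verified separately.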
From that shell the agent moved in four pivots: harvest cloud credentials from the host, use them to pull an SSH private key out of AWS Secrets Manager, ride that key through an SSH bastion, and dump the internal database. The final step (enumerating the schema and exfiltrating the contents) took under two minutes.
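The four pivots form an ordered sequence, so a defender can correlate them as a chain rather than as four unrelated alerts. The sketch below is a greedy in-order matcher under stated assumptions: the stage names are hypothetical labels for the pivots above, events are assumed to be already normalized to `(timestamp, stage)` pairs, and the one-hour budget simply mirrors the "under an hour" figure.

```python
# Hypothetical stage labels for the four pivots described above.
CHAIN = ("credential_harvest", "secret_read", "bastion_login", "database_dump")


def match_chain(events, chain=CHAIN, max_seconds=3600.0):
    """Return (start, end) timestamps if every stage of chain occurs in
    order within max_seconds of the first stage, otherwise None.

    events: iterable of (timestamp, stage) pairs.
    Simplification: matching is greedy and anchored to the first
    occurrence of the first stage; it does not retry later anchors.
    """
    next_stage = 0
    start = None
    for timestamp, stage in sorted(events):
        if stage != chain[next_stage]:
            continue  # not the stage we are waiting for
        if next_stage == 0:
            start = timestamp
        elif timestamp - start > max_seconds:
            return None  # chain took longer than the time budget
        next_stage += 1
        if next_stage == len(chain):
            return (start, timestamp)
    return None
```

A real correlation would also tie the events to the same host or identity. That join is omitted here to keep the ordering logic visible.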
What makes this a milestone isn't the vulnerability; it's the operator. The evidence that software, not a person, was at the keyboard is specific. A planning note in Chinese (看还能做什么, roughly "see what else we can do") opened a command block that ran as eight parallel SSH sessions from six different IPs at the same instant. Neither a human nor a simple script does that. The agent didn't know the database schema in advance; it improvised, probing for a credentials table that doesn't exist in any released version of the software it had guessed it was looking at. Values discovered in one step were fed into the next automatically: a password parsed out of a file, a secret ID reused from a directory listing twenty seconds earlier. And the commands were written for a machine to read: output separators between probes, captures truncated to fit a context window, interactive pagers switched off.
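The parallel-session fingerprint can be turned into a simple check: many session starts against one target, from several addresses, inside a window too short for a person to produce. The sketch below is illustrative only. The `SessionStart` record, its fields, and the default thresholds are assumptions, not values taken from the Sysdig report.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionStart:
    """One SSH session start (hypothetical schema)."""
    timestamp: float   # seconds since epoch
    target_host: str
    source_ip: str


def find_simultaneous_bursts(starts, window_seconds=1.0, min_sessions=5, min_ips=3):
    """Return (target_host, session_count, distinct_ip_count) for each target
    whose largest qualifying window has at least min_sessions starts from
    at least min_ips addresses within window_seconds."""
    by_target = defaultdict(list)
    for start in starts:
        by_target[start.target_host].append(start)

    bursts = []
    for target, events in by_target.items():
        events.sort(key=lambda s: s.timestamp)
        best = None
        left = 0
        for right in range(len(events)):
            # Shrink the window until it spans at most window_seconds.
            while events[right].timestamp - events[left].timestamp > window_seconds:
                left += 1
            window = events[left:right + 1]
            ips = {s.source_ip for s in window}
            qualifies = len(window) >= min_sessions and len(ips) >= min_ips
            if qualifies and (best is None or len(window) > best[1]):
                best = (target, len(window), len(ips))
        if best is not None:
            bursts.append(best)
    return bursts
```

Eight starts from six addresses at the same timestamp would be reported as one burst, while a person opening sessions a minute apart from one workstation would not.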
Two facts collide here. The first is speed. Attackers now weaponize new flaws within hours of disclosure: the first exploitation attempt against Marimo arrived 9 hours and 41 minutes after the advisory, and Mandiant's M-Trends 2026 reports the mean time-to-exploit has gone negative (about -7 days), meaning attackers now routinely exploit flaws before a patch is even available. The second is that you cannot patch everything. Risk-based vulnerability management accepted long ago that "patch it all" is an impossible mandate: some systems can't be taken offline, and some fixes are blocked by technical debt. Put those together and the defensive window for an attack like this is measured in seconds, against an exposure you may not be able to close. A queue-triage-escalate SOC is structurally too slow.
This is the blunt logic behind the line "only machines can fight machines." It is less a slogan than an observation about response time.
Meeting machine-speed offense with machine-speed defense is not one control; it is a different posture at every link in the chain. Simbian's self-improving SecOps agents map onto this attack step by step:
/terminal/ws (and the proxy misconfiguration exposing it) before an attacker did, rather than during a once-a-year engagement.
None of this requires believing machines will replace defenders. It requires noticing that the offense already runs at machine speed, and that the sliver of defense that has to keep pace spans both the hours between a CVE's release and its successful exploitation and the seconds between a credential theft and a database dump. When the theft takes minutes, prevention (enforcing defenses with machines) stops being merely the cautious option; the economics now favor it as the cheaper one.
If your SOC's containment story still depends on an analyst reading a Slack alert, the math has already changed under it. Book a demo to see Simbian's AI SOC Agent investigate and contain a machine-speed intrusion end to end.
Q: What is CVE-2026-39987?
CVE-2026-39987 is a pre-authentication remote-code-execution flaw in Marimo, an open-source Python notebook. The integrated terminal's WebSocket endpoint, /terminal/ws, skipped the authentication check that every other route enforced, so a single browser-reachable instance gave an unauthenticated visitor a full interactive shell as the user running the process. It is fixed in Marimo 0.23.0 and later.
Q: How do we know an AI agent, not a human, ran this attack?
The Sysdig Threat Research Team identified machine-specific fingerprints: a planning note in Chinese opening a command block that ran as eight parallel SSH sessions from six different IPs at the same instant; values from one step fed into the next automatically (a password parsed out of a file, a secret ID reused 20 seconds later); and commands formatted for machine consumption, with output separators, truncated captures, and interactive pagers disabled. None of these patterns matches a human operator or a static script.
Q: How fast was the attack from start to finish?
The full chain (initial RCE on /terminal/ws, credential harvest, SSH key theft from AWS Secrets Manager, lateral movement through a bastion, schema enumeration, and database exfiltration) completed in under an hour. The database dump itself took under two minutes, and the agent fanned 12 cloud API calls across 11 IPs in a 22-second burst to defeat per-source-IP detection.
Q: Why can't a traditional SOC defend against AI-led attacks?
A queue-triage-escalate SOC was built for incidents measured in days. AI-led attacks compress the defensive window to seconds. Mandiant's M-Trends 2026 reports the mean time-to-exploit has gone negative (about -7 days), meaning attackers routinely exploit flaws before a patch ships. The first exploitation attempt against Marimo arrived 9 hours 41 minutes after disclosure. No human analyst rotation closes that gap.
Q: How does Simbian's AI SOC Agent stop a machine-speed intrusion?
The AI SOC Agent investigates every signal autonomously the moment it fires, and the Context Lake™ tells it which response actions are available and what each one costs, so it can contain or quarantine without waiting for human escalation. It is self-improving, not self-driving: humans keep containment authority and escalation calls, but the routine investigation and response runs at the speed of the attack.
Q: What should defenders do today about AI-led attacks?
Three things. Update Marimo to 0.23.0 or later, and audit any internet-reachable instances behind proxies. Assume your defensive window for any new CVE is hours, not weeks, which calls for risk-based vulnerability management plus continuous offensive validation, not a yearly pentest. And put machine-speed response in the chain anywhere a credential, bastion, or database access can compound. See how Simbian's AI SOC Agent does this.
Every figure above is drawn from these primary reports:
看还能做什么 comment, the secret-ID reused 20 seconds later, the machine-readable command style).
/terminal/ws endpoint, the full interactive shell, the fix in Marimo 0.23.0, the 9-hour-41-minute time-to-first-exploit, and the internet-exposure sample (30 of 186 base URLs, ~16%).