AI SOC LLM Leaderboard
Introducing the first benchmark to comprehensively measure LLM performance in Security Operations Centers (SOCs). It measures LLMs against a diverse range of real alerts and fundamental SOC tools over all phases of alert investigation, from ingestion to disposition.
LLM performance on the AI SOC LLM Leaderboard
Our AI agents work 24x7x365 to automatically investigate and respond to alerts, conduct threat hunts, prioritize and patch vulnerabilities, and more.
Sample of Benchmark Scenarios
Our benchmark is built on the autonomous investigation of 100 full kill-chain scenarios that realistically mirror what human SOC analysts face every day. Each attack scenario has known ground truth for its malicious activity, so an AI agent's investigation can be assessed against a clear baseline. The scenarios are based on the historical behavior of well-known APT groups and cybercriminal organizations and cover a wide range of MITRE ATT&CK™ tactics and techniques, with a focus on prevalent threats such as ransomware and phishing.
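To make "assessed against a clear baseline" concrete, here is a minimal sketch of how an agent's final dispositions could be compared with scenario ground truth. It is an illustration only, not the leaderboard's actual scoring method: the `Scenario` type, the `score_dispositions` function, the scenario IDs, and the choice of accuracy, precision, and recall as metrics are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario record; field names are assumptions."""
    scenario_id: str
    is_malicious: bool  # ground-truth label for the scenario


def score_dispositions(scenarios, verdicts):
    """Compare agent verdicts (scenario_id -> bool) with ground truth.

    A scenario with no verdict is treated as "benign" here, which is
    an assumption of this sketch rather than a documented rule.
    """
    tp = fp = fn = tn = 0
    for s in scenarios:
        predicted = verdicts.get(s.scenario_id, False)
        if predicted and s.is_malicious:
            tp += 1
        elif predicted and not s.is_malicious:
            fp += 1
        elif not predicted and s.is_malicious:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up example data, not benchmark results.
    scenarios = [
        Scenario("s1", True),
        Scenario("s2", True),
        Scenario("s3", False),
        Scenario("s4", False),
    ]
    verdicts = {"s1": True, "s2": False, "s3": True, "s4": False}
    print(score_dispositions(scenarios, verdicts))
```

A real evaluation would likely grade more than the final verdict, for example the evidence gathered at each investigation phase, but a disposition-level comparison is the simplest check that ground truth makes possible.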




