Mathematical benchmark exposing the massive performance gap between real agents and LLM wrappers. Rigorous multi-dimensional evaluation with statistical validation (95% CI, Cohen's h) and reproducible methodology. Separates architectural theater from real systems through stress testing, network resilience, and failure analysis.
Cross-referenced across 55 tracked directories
#649
Popularity Rank
1 / 55
Listed In
Emerging
Adoption Stage
3/14/2022
Created
1
GitHub Stars
Score: 100/100
0 dependency vulnerabilities found
Run an AI-powered security scan to analyze this package's source code for vulnerabilities, prompt injection vectors, data exfiltration risks, and behavior mismatches.
Scans fetch actual source code from the GitHub repository, not just the README.
assafelovic
An autonomous agent that conducts deep research on any data using any LLM providers
splx-ai
A security scanner for your LLM agentic workflows
AgentSeal
Security toolkit for AI agents. Scan your machine for dangerous skills and MCP configs, monitor for supply chain attacks, test prompt injection resistance, and audit live MCP servers for tool poisoning.
...moreag2ai
AG2 (formerly AutoGen): The Open-Source AgentOS. Join us at: https://discord.gg/sNGSwQME3x
4/5/2026
Last Commit
Recently added to the ecosystem