
OpenAI Evals

Tags: Agent · LLM Evaluation · llm · ai-app · awesome-list · awesome-gen-ai-tools

An open-source library for evaluating task performance of language models and prompts.
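At its core, an eval of this kind pairs model completions with graded reference samples. Below is a minimal, self-contained sketch of exact-match grading, one of the simplest eval templates; this is illustrative only and not the library's actual API (`Sample`, `run_eval`, and `toy_model` are hypothetical names):

```python
from dataclasses import dataclass

@dataclass
class Sample:
    prompt: str
    ideal: str  # reference answer the completion is graded against

def run_eval(samples, completion_fn):
    """Grade each sample by exact match and return accuracy."""
    correct = sum(
        1 for s in samples if completion_fn(s.prompt).strip() == s.ideal
    )
    return correct / len(samples)

# A stub standing in for a real model completion function.
def toy_model(prompt):
    return "4" if "2 + 2" in prompt else "unknown"

samples = [
    Sample(prompt="What is 2 + 2?", ideal="4"),
    Sample(prompt="Capital of France?", ideal="Paris"),
]
accuracy = run_eval(samples, toy_model)  # one of two correct -> 0.5
```

In practice the completion function would call a real model, and richer templates (fuzzy match, model-graded) replace the exact-match comparison.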

Directory Presence

Cross-referenced across 55 tracked directories

- Awesome LLM Apps
- AI Collections

Adoption Metrics

- Popularity Rank: #43
- Listed In: 2 / 55
- Adoption Stage: Emerging
- Listed For: 2 days (recently added to the ecosystem)

Cross-Posting Opportunities

Could also be listed in these directories:

Official MCP Registry, Smithery, PulseMCP, npm Registry, PyPI, Glama, Hugging Face Hub, Awesome MCP Servers, Awesome Claude Skills, mcp.so, +43 more

Security Analysis

Score: 100/100

0 dependency vulnerabilities found

Related Agents

LiveBench

A challenging, contamination-free LLM benchmark.

Agent · LLM Evaluation · listed in 2 directories

lighteval

A lightweight LLM evaluation suite that Hugging Face has been using internally.

Agent · LLM Evaluation · listed in 1 directory

OLMO-eval

A repository for evaluating open language models.

Agent · LLM Evaluation · listed in 1 directory

simple-evals

Evaluation tools by OpenAI.

Agent · LLM Evaluation · listed in 1 directory