qcri/LLMeBench

Skill · LLM Evaluation
Tags: awesome-list, awesome-gen-ai-tools, benchmarking, large-language-models

Benchmarking Large Language Models

Directory Presence

Cross-referenced across 55 tracked directories

Directory	Status	First Seen	Last Confirmed	Link
AI Collections		3/13/2026	4/4/2026

Adoption Metrics & Statistics

Popularity Rank: #3807
Listed In: 1 / 55 directories
Adoption Stage: Emerging
Created: 5/28/2023
GitHub Stars: 105

Security Analysis

Score: 100/100

0 dependency vulnerabilities found

AI Security Scan

Run an AI-powered security scan to analyze this package's source code for vulnerabilities, prompt injection vectors, data exfiltration risks, and behavior mismatches.

Scans fetch actual source code from the GitHub repository, not just the README.

Related Skills

deepeval

Jeffrey Ip

The LLM Evaluation Framework

Skill · LLM Evaluation

14K · 4 dirs

llm-comparator

Google, LLC

LLM Comparator: An interactive visualization tool for side-by-side LLM evaluation

Skill · LLM Evaluation

522 · 2 dirs

LLM Benchmarks: MMLU, HellaSwag, BBH, and Beyond - Confident AI

Awesome Gen AI Tools: LLM Benchmarks: MMLU, HellaSwag, BBH, and Beyond - Confident AI

Skill · LLM Evaluation

1 dir

Cleanlab Trustworthy Language Model: Score the trustworthiness of any LLM response

Awesome Gen AI Tools: Cleanlab Trustworthy Language Model: Score the trustworthiness of any LLM response

Skill · LLM Evaluation

1 dir
