"Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents"
Cross-referenced across 55 tracked directories
Popularity Rank: #3900
Listed In: 1 / 55
Adoption Stage: Emerging
Created: 10/31/2023
GitHub Stars: 3,135
Score: 100/100
0 dependency vulnerabilities found
Jeffrey Ip
The LLM Evaluation Framework
Google, LLC
LLM Comparator: An interactive visualization tool for side-by-side LLM evaluation
Awesome Gen AI Tools: LLM Benchmarks: MMLU, HellaSwag, BBH, and Beyond - Confident AI
Awesome Gen AI Tools: Cleanlab Trustworthy Language Model: Score the trustworthiness of any LLM response
Forks: 97
Open Issues: 17
Last Commit: 3/10/2026
Recently added to the ecosystem