Chinese LLM Benchmark

Agent, LLM Leaderboard, agentic-ai, artificial-intelligence, llm-agent, llm-evaluation

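ReLE evaluation: a continuously updated benchmark of Chinese-language capabilities in large AI models. It currently covers 359 models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.5, ernie4.5, MiniMax-M2.5, deepseek-v3.2, Qwen3.5, llama4, Zhipu GLM-5, GLM-4.7, LongCat, gemma3, and mistral. Beyond the leaderboard, it also provides an LLM defect library with more than 2 million entries, so the community can study, analyze, and improve large models.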

Directory Presence

Cross-referenced across 55 tracked directories

Directory	Status	First Seen	Last Confirmed	Link
GitHub Search		3/12/2026	4/5/2026
Awesome LLM Apps		3/13/2026	4/5/2026

Adoption Metrics & Statistics

Popularity Rank: #100
Listed In: 2 / 55
Adoption Stage: Growing
Created: 6/4/2023
GitHub Stars: 5,815

Security Analysis

Score: 100/100

0 dependency vulnerabilities found

Related Agents

LiveBench

A Challenging, Contamination-Free LLM Benchmark.

Agent, LLM Leaderboard

2 dirs

AlpacaEval

An Automatic Evaluator for Instruction-following Language Models using Nous benchmark suite.

Agent, LLM Leaderboard

1 dir

Chatbot Arena Leaderboard

A benchmark platform for large language models (LLMs) that ranks them through anonymous, randomized, crowdsourced battles.

Agent, LLM Leaderboard

1 dir

Open LLM Leaderboard

Aims to track, rank, and evaluate LLMs and chatbots as they are released.

Agent, LLM Leaderboard

1 dir
