Chinese LLM Benchmark

Agent | LLM Leaderboard | agentic-ai, artificial-intelligence, llm-agent, llm-evaluation

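ReLE Evaluation: a continuously updated capability evaluation of Chinese-language AI large models. It currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.5, ernie4.5, MiniMax-M2.5, deepseek-v3.2, Qwen3.5, llama4, Zhipu GLM-5, GLM-4.7, LongCat, gemma3, and mistral. Beyond the leaderboard, it also provides a large-model defect library of more than 2 million items, so the community can study, analyze, and improve large models.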

Directory Presence

Cross-referenced across 55 tracked directories

Directory	Status	First Seen	Last Confirmed	Link
GitHub Search		3/12/2026	3/29/2026
Awesome LLM Apps		3/13/2026	3/29/2026

Adoption Metrics & Statistics

Popularity Rank: #96
Listed In: 2 / 55
Adoption Stage: Emerging
First Seen: 3/7/2026
GitHub Stars: 5,766

Security Analysis

Score: 100/100

0 dependency vulnerabilities found


Related Agents

Open LLM Leaderboard
Aims to track, rank, and evaluate LLMs and chatbots as they are released.
Agent | LLM Leaderboard | 1 dir

ACLUE
An evaluation benchmark focused on ancient Chinese language comprehension.
Agent | LLM Leaderboard | 331 dir

Chatbot Arena Leaderboard
A benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner.
Agent | LLM Leaderboard | 1 dir

AlpacaEval
An Automatic Evaluator for Instruction-following Language Models using Nous benchmark suite.
Agent | LLM Leaderboard | 1 dir

