ML Testing

446

AI tools in the ML Testing category

All (446) | MCP Servers (1) | Skills (445) | Agents (0)

@orq-ai/n8n-nodes-orq

GitHub Actions

n8n community node for Orq.ai - AI deployment and prompt management platform

Skill | ML Testing

3 dirs

@ai-sdk-tool/parser

GitHub Actions

AI SDK middleware for tool call parsing

Skill | ML Testing

2 dirs

@chainsafe/benchmark

wemeetagain

> This is an independently maintained fork of [@dapplion/benchmark](https://github.com/dapplion/benchmark). This repo now maintains its own versioning and release schedule as `@chainsafe/benchmark`. It was forked from the base of `@dapplion/benchmark@1

Skill | ML Testing

1 dir

@2501-ai/cli

zhuk-aa

[![npm version](https://img.shields.io/npm/v/@2501-ai/cli.svg)](https://www.npmjs.com/package/@2501-ai/cli) [![HumanEval Score](https://img.shields.io/badge/HumanEval-96.95%25-brightgreen.svg)](https://www.2501.ai/research/full-humaneval-benchmark) [![Lic

Skill | ML Testing

1 dir

odor

catpea

Static blog generator with parallel encoding, incremental builds, atomic writes, and an AI agent for spellcheck, tagging, summarization, and quality evaluation.

Skill | ML Testing

1 dir

@buoy-gg/highlight-updates

lovesworking

Control React DevTools highlight updates feature from your app

Skill | ML Testing

6401 dir

js-chess-engine

josefjadrny

Simple and fast Node.js chess engine with configurable AI and no dependencies

Skill | ML Testing

1 dir

eslint-plugin-vitest-globals

saqqdy

An extension of Vitest globals for ESLint

Skill | ML Testing

1 dir

skillscore

joeynyc

A CLI tool that evaluates AI agent skills and produces quality scores

Skill | ML Testing

1 dir

probeai

k08200

CLI tool for testing and evaluating AI coding agents

Skill | ML Testing

1 dir

agentv

christso

CLI entry point for AgentV

Skill | ML Testing

1 dir

@kodus/agent-readiness

gamalinosqui

Evaluate how prepared your codebase is for autonomous AI coding agents

Skill | ML Testing

1 dir

meta-prompter-mcp

delexw

A prompt evaluation tool available as both an MCP server and a CLI.

Skill | ML Testing

1 dir

@wix/evalforge-types

wix-ci-publisher

Unified types for EvalForge agent evaluation system

Skill | ML Testing

1 dir

sip-benchmark

chufenghuang

Internal CLI tool for evaluating Sip AI persona agent performance

Skill | ML Testing

1 dir

claw-harness

GitHub Actions

Testing framework for OpenClaw bots. Spin up real agents, load skills, drive multi-turn prompts, and capture results.

Skill | ML Testing

1 dir

@versatly/skillbench

g9pedro

CLI benchmark system for tracking skill versions, scoring performance, and comparing improvements

Skill | ML Testing

1 dir

skilltest

lsaraiva

The testing framework for Agent Skills. Lint, test triggering of, and evaluate your SKILL.md files.

Skill | ML Testing

1 dir

@agentid-protocol/core

sharifventures

AgentID core SDK - cryptographic identity, manifests, signing, verification, and policy evaluation for AI agents

Skill | ML Testing

11 dir

mongodb-assistant-eval

nlarew

Evaluation library for the MongoDB Assistant API.

Skill | ML Testing

1 dir