ML Testing
470 AI tools in the ML Testing category
azk-benchmark
saitodisse
azk's benchmarking tool
@stdlib/array-base-count-ifs
stdlib-bot
Perform element-wise evaluation of one or more input arrays according to provided predicate functions and count the number of elements for which all predicates respectively return `true`.
benchmarx
carlos8f
HTTP-based side-by-side benchmark framework
simple-benchmark
octoblu
Super simple benchmark tool
chess-tactics
antonryoung02
chess-tactics is an opinionated tactic detection library that finds a tactic, the pieces involved, and the tactical sequence given a position and its engine evaluation.
@scarfbench/scarfbench-cli
rahlk
CLI for running, testing, and evaluating SCARF benchmark applications.
logicguru-engine
sachinsharmawebdev
Advanced JSON-based rule engine with nested conditions, async evaluation, and flexible action system. Perfect for business rules, workflows, and decision automation.
@agentpatterns/pench
ericjohnolson
Benchmark framework for evaluating AI coding agent architecture patterns
poker-extval
dargeo
Poker hand evaluator with comprehensive test suite
perf-insight
jason-dent
Performance benchmarking tool for NodeJS.
@apple-pie/slice
urisvirott
Slice is a TypeScript-first React UI kit with theme tokens, utility hooks, optional Zustand stores, Storybook docs, and benchmark tooling.
evalexpr
saq18
A secure TypeScript expression evaluator for dynamic form validation and conditional logic with support for context variables, form fields, and custom functions
bench-node
rafaelgss
Bench Node
skillscore
joeynyc
A CLI tool that evaluates AI agent skills and produces quality scores
@wix/evalforge-types
wix-ci-publisher
Unified types for EvalForge agent evaluation system
mongodb-assistant-eval
nlarew
Evaluation library for the MongoDB Assistant API.
@mankinds/sdk
mankinds
TypeScript SDK for Mankinds AI Evaluation API
verifiers-ts
amine-aifa
TypeScript implementation of the verifiers framework for RL environments
currify
coderaiser
Translate the evaluation of a function that takes multiple arguments into the evaluation of a sequence of functions, each taking one or more arguments.
abstract-algorithm
maiavictor
Optimal evaluation of some lambda terms
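
The element-wise predicate counting described in the @stdlib/array-base-count-ifs entry above can be illustrated with a minimal sketch. This does not use that package's API; `countIfAll` and `Predicate` are hypothetical names introduced here, and the sketch assumes all input arrays have the same length.

```typescript
// Hypothetical helper, NOT the @stdlib API: counts the indices at which
// every input array's element satisfies the predicate paired with that array.
type Predicate<T> = (value: T, index: number) => boolean;

function countIfAll<T>(pairs: Array<[ArrayLike<T>, Predicate<T>]>): number {
  if (pairs.length === 0) {
    return 0;
  }
  const len = pairs[0][0].length;
  // Assumption of this sketch: all arrays must share one length.
  for (const [arr] of pairs) {
    if (arr.length !== len) {
      throw new RangeError("All input arrays must have the same length");
    }
  }
  let count = 0;
  for (let i = 0; i < len; i++) {
    // An index is counted only if all predicates return true for it.
    if (pairs.every(([arr, pred]) => pred(arr[i], i))) {
      count++;
    }
  }
  return count;
}

const x = [1, 2, 3, 4];
const y = [10, 0, 30, 0];

// Count positions where x[i] > 1 and y[i] > 0.
const matches = countIfAll<number>([
  [x, (v) => v > 1],
  [y, (v) => v > 0],
]);
console.log(matches);
```

Pairing each array with its own predicate keeps the call site explicit about which condition applies to which input; a real library may order its arguments differently.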