ML Testing
471 AI tools in the ML Testing category
performance-test-runner
feirell
This package helps you define benchmarks for the [benchmark package](https://www.npmjs.com/package/benchmark) in much the same way you define unit tests with Karma and similar tools.
nel
n-riesco
Node.js Evaluation Loop (NEL): module to run a Node.js REPL session
fen-bench
andyhall
A small, sane JS micro-benchmark library
browserless
kikobeats
The headless Chrome/Chromium driver on top of Puppeteer. Take screenshots, generate PDFs, extract text and HTML with a production-ready API.
benchmartian
dunxrion
A Mocha-like command-line interface for Benchmark.js
@artale/pi-eval
artale
Agent evaluation harness. Judge sessions on success, tool usage, efficiency, methodology. Inspired by opencc.
lighter-emitter
zerious
A lightweight JavaScript event emitter.
benny
caderek
A dead simple benchmarking framework
forkeys-benchmark
jameskmonger
Benchmarking for forkeys
is-valid-var-name
stevewestbrook
Determines whether a string is a valid JavaScript variable name. ES2015 and ES5 compatibility. Strict mode evaluation by default.
lighter-mime
zerious
A lightweight JavaScript MIME type library.
@kodus/agent-readiness
gamalinosqui
Evaluate how prepared your codebase is for autonomous AI coding agents
@originjs/oss-evaluation-components
GitHub Actions
No description available
@tripetto/block-evaluate
markvandenbrink
Evaluation condition block for Tripetto.
@versatly/skillbench
g9pedro
CLI benchmark system for tracking skill versions, scoring performance, and comparing improvements
agentv
christso
CLI entry point for AgentV
@agentid-protocol/core
sharifventures
AgentID core SDK - cryptographic identity, manifests, signing, verification, and policy evaluation for AI agents
@tscircuit/autorouting-dataset-01
seveibar
A set of tscircuit problems to benchmark autorouting (currently 16 circuits in `lib/`).
consys
fireboltcaster
consys is a flexible tool to evaluate models using generic and readable constraints.
@uppercod/match-media
uppercod
Lets you define a value based on the evaluation of a string whose pattern resembles img[srcset]