ML Testing
471 AI tools in the ML Testing category
loadly
sudodevesh
Load testing CLI + React UI with live analytics
littlewing
GitHub Actions
A minimal, high-performance multi-type expression language with lexer, parser, and interpreter. Optimized for browsers with type-safe execution.
@comunica/config-query-sparql-solid
rubensworks
default configuration files for Comunica SPARQL Solid
xterm-benchmark
jerch
A benchmark tool for measuring performance in xterm.js
sip-benchmark
chufenghuang
Internal CLI tool for evaluating Sip AI persona agent performance
custom-function
webreflection
Literally the only sane way, if not the fastest one, to extend the Function class without evaluation
@vercel/flags-core
vercel-release-bot
The core evaluation engine for [Vercel Flags](https://vercel.com/docs/flags/vercel-flags), the feature flag platform built into Vercel. This package provides direct access to the flag evaluation client, data fetching, and an [OpenFeature](https://openfeat
@p2olab/evaluation-interface
julorenz
interface definitions for evaluation-backend
@digifi-los/ml
codhah92
dynamic ml evaluation module
poker-utils
conradkay
Fast poker ranges, evaluation, equity calculation
@the-trybe/formula-engine
monaam
Configuration-driven expression evaluation system with dependency resolution and decimal precision
gdrive-access-monitor
GitHub Actions
This is a basic tool to enable the easy evaluation of access settings for folders on Google Drive. The code is bundled with [webpack](https://www.npmjs.com/package/webpack) and pushed to the [GDriveAccessMonitorScript](https://script.google.com/home/proje
skilltest
lsaraiva
The testing framework for Agent Skills. Lint, test triggering, and evaluate your SKILL.md files.
@compute.ts/math
mberthellemy
Provide math operators for the computeTS package
@uppercod/match-media
uppercod
Defines a value by evaluating a string whose pattern resembles img[srcset]
skillscore
joeynyc
A CLI tool that evaluates AI agent skills and produces quality scores
@wix/evalforge-types
wix-ci-publisher
Unified types for EvalForge agent evaluation system
mongodb-assistant-eval
nlarew
Evaluation library for the MongoDB Assistant API.
@tscircuit/autorouting-dataset-01
seveibar
A set of tscircuit problems to benchmark autorouting (currently 16 circuits in `lib/`).
verifiers-ts
amine-aifa
TypeScript implementation of the verifiers framework for RL environments