ML Testing
471 AI tools in the ML Testing category
@tripetto/block-evaluate
markvandenbrink
Evaluation condition block for Tripetto.
@zvenigora/jse-eval
zvenigora
JavaScript expression parsing and evaluation.
@sgnl-ai/set-transmitter
sgnl-developer
HTTP transmission library for Security Event Tokens (SET) with CAEP/SSF support
@networkteam/eel
rasmizzle
Embedded expression language, a parser and compiler for a safe subset of JavaScript for dynamic evaluation in JavaScript.
@mankinds/sdk
mankinds
TypeScript SDK for Mankinds AI Evaluation API
@sucoza/feature-flags
tyevco
Standalone feature flag management library with evaluation engine, targeting, and rollouts
vitest-evals
sentry-bot
End-to-end evaluation framework for AI agents, built on Vitest.
@fajarnugraha37/nope-iam
fajarnugraha37
A highly extensible, type-safe IAM-like access control library for Node.js, inspired by AWS IAM. Deny by default, allow by vibes and less patience for your bad access patterns. Supports policies, roles, decorators, adapters, and rich evaluation context be...
@satoshibits/doc-lint
satoshibits
Documentation linter that assembles evaluation prompts from concern schemas
@jsonpath-tools/jsonpath
janjorka
JSONPath (RFC 9535) query evaluation, analysis and editor services.
@dapplion/benchmark
dapplion
Ensures that new code does not introduce performance regressions with CI. Tracks:
ai-planning-val
jan-dolejsi
JavaScript/TypeScript wrapper for VAL (AI Planning plan validation and evaluation tools from the KCL Planning department and the planning community around the ICAPS conference).
tachometer
aomarks
Web benchmark runner
@microsoft/feature-management
microsoft1es
Feature Management is a library for enabling and disabling features at runtime. Developers can use feature flags in simple cases such as conditional statements, or in more advanced scenarios such as conditionally adding routes.
@uppercod/match-media
uppercod
Lets you define a value by evaluating a string whose pattern is like img[srcset]
skillscore
joeynyc
A CLI tool that evaluates AI agent skills and produces quality scores
@wix/evalforge-types
wix-ci-publisher
Unified types for EvalForge agent evaluation system
mongodb-assistant-eval
nlarew
Evaluation library for the MongoDB Assistant API.
@tscircuit/autorouting-dataset-01
seveibar
A set of tscircuit problems to benchmark autorouting (currently 16 circuits in `lib/`).
verifiers-ts
amine-aifa
TypeScript implementation of the verifiers framework for RL environments