>_Skillful

ML Testing

478 AI tools in the ML Testing category

ai-planning-val

jan-dolejsi

JavaScript/TypeScript wrapper for VAL (AI Planning plan validation and evaluation tools from the KCL Planning department and the planning community around the ICAPS conference).

Skill | ML Testing
11 dir

karma-benchmark-reporter

lazd

A Karma benchmark reporter

Skill | ML Testing
51 dir

vitest-evals

sentry-bot

End-to-end evaluation framework for AI agents, built on Vitest.

Agent | ML Testing
1351 dir

supplychain-firewall-benchmark-hello

rodrigopv

Benchmark package for testing SCA and repository firewall behavior. v1.0.0 is safe and prints "Hello World".

Skill | ML Testing
1 dir

espression-rx

ianchi

ESpression extension to perform reactive evaluation of expressions

Skill | ML Testing
21 dir

karma-benchmark-json-reporter

etpinard

A reporter for karma-benchmark outputting results to a JSON file

Skill | ML Testing
21 dir

@tscircuit/autorouting-dataset-01

seveibar

A set of tscircuit problems to benchmark autorouting (currently 16 circuits in `lib/`).

Skill | ML Testing
1 dir

deep-taxonomy-benchmark

jeswr

Generate the Deep Taxonomy Benchmark for testing RDF Reasoners

Skill | ML Testing
1 dir

cali-cli

markoradak

Terminal calculator with real-time evaluation, currency conversion, and unit conversion

Skill | ML Testing
1 dir

jiren

vk007

Jiren is a high-performance HTTP/HTTPS client that claims to be faster than any other HTTP/HTTPS client.

Skill | ML Testing
1 dir

@dapplion/benchmark

dapplion

Ensures that new code does not introduce performance regressions with CI. Tracks:

Skill | ML Testing
1 dir

jkyy-evaluation

haotengfei

Evaluation module SDK

Skill | ML Testing
1 dir

log-lazy

konard

A lazy logging library with bitwise level control

Skill | ML Testing
1 dir

eval2otel

evalops

Library to convert evaluation metrics and traces to OpenTelemetry GenAI semantic conventions

Skill | ML Testing
31 dir

dream11-react-native-performance-tracker

wedesicooking

Benchmark React Native View Paint Time

Skill | ML Testing
301 dir

probeai

k08200

CLI tool for testing and evaluating AI coding agents

Skill | ML Testing
11 dir

ts-benchmark

mohammad-_-ahmad

A command-line interface for monitoring the performance of TypeScript.

Skill | ML Testing
11 dir

benny

caderek

A dead simple benchmarking framework

Skill | ML Testing
7691 dir

ppef

GitHub Actions

Portable Programmatic Evaluation Framework - Claim-driven, deterministic evaluation for experiments

Skill | ML Testing
1 dir

js-index-data-structures

vhf

A benchmark of JS data structures suitable for in-memory, non-unique indexing

Skill | ML Testing
41 dir