inferencebench-quality
LLM quality plugin for InferenceBench Suite (deterministic fixture scoring; LLM-as-judge deferred)
phantasm-llm
PHANTASM: Invert LLM hallucination, confabulation, and uncertainty into productive features.
ricoeur
A local-first archive, search, and intelligence engine for your LLM conversation history.
mymagicpencil-langchain
LangChain tool for generating My Magic Pencil live visual lessons
evalguide
evalguide
CLI tool that evaluates LLM outputs from production logs against a dual-dimension rubric.
llm-eval-harness
llm-eval-harness
CLI tool that evaluates LLM outputs from production logs against a dual-dimension rubric.
eval-harness-oni
eval-harness-oni
CLI tool that evaluates LLM outputs from production logs against a dual-dimension rubric.
membrain-client
mrpintcom
Python client for Membrain AI Safety Gateway — drop-in replacement for OpenAI SDK
lexigram-storage
Unified blob storage abstraction for Lexigram Framework - S3, GCS, Azure Blob, and local filesystem
llamafactory
Unified Efficient Fine-Tuning of 100+ LLMs
cebra
file: AUTHORS.md
Consistent Embeddings of high-dimensional Recordings using Auxiliary variables
rift-eval
Shah Baig
Detect behavioral regressions between LLM model versions
synthetictext
LLM-powered synthetic text data generation for text classification tasks, with multi-strategy generation, multilingual support, and quality filtering.
local-pageindex
local-pageindex contributors
Local-only Python SDK mirroring PageIndex Cloud API — vectorless RAG, no cloud required.
codex-usage-tracker
SuvenSeo
Local usage dashboard for Codex, Claude Code, Cursor, and WakaTime AI coding time.
voxcore-local
Local voice assistant pipeline — STT → LLM → TTS, 100% offline
promptpricer
Sreechandh
Instant token count and cost estimate across all major LLMs
Context Graph Compressor
Adityapal67
Convert long AI conversations into portable conversation state graphs for LLM handoffs.
Regulex Plus
PipeDream941
A sleek Regex visualizer with enhanced Chinese support and elegant diagrams
Zr Real Demand
zhangganrui
A Claude Code skill, based on the value-consensus-model framework from Liang Ning's 《真需求》 (Real Demand), that tells a real demand from a self-deceiving fake one before you start designing or building.