Document Processing
2859 AI tools in the Document Processing category
@nosferatu500/textract
nosferatu500
Extracts text from files of various types, including html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf, text/*, and various OpenOffice formats.
pdfdataextract
lublak
Extract data from a pdf with pure javascript
pdf-parse-new
simone.gosetto
Pure javascript cross-platform module to extract text from PDFs with AI-powered optimization and multi-core processing.
@trohde/excal-cli
trohde
Agent-first CLI for Excalidraw scene inspection, validation, and rendering
@grapecity/activereports
mescius
ActiveReportsJS
pdf_read_down_load_id_tell_you_i_love_you_but_then_id_
noahbrown64
Download or Read ePub/pdf EPUB [Download] I'd Tell You I Love You, But Then I'd Have to Kill You (Gallagher Girls, #1) By Ally Carter on Textbook New Volumes
react-pdftotext
utkarsh212
A simple, lightweight React package to extract plain text from a PDF file.
@cbcruk/vision-ocr
cbcruk
Extract text from images using macOS Vision Framework OCR
flowsquire
miit-daga
Local-first automation platform for organizing files on your computer. No cloud, no AI, no subscriptions — just simple WHEN → DO workflows.
@briansunter/z-cli
briansunter
Unified Z.AI CLI - image generation, OCR, and code research
@gherk/requirements-extractor
formonkey
MCP server that extracts, classifies and generates structured requirements from PDF documents with heuristic scanning and active validation
easy-pdf-parser
luochen1990
A lightweight, promise-style, functional wrapper around pdf2json for easily extracting text from PDFs.
suitest-js-api
GitHub Actions
Suitest is a test automation and device manipulation tool for living room devices and web browsers.
node-tesseract-ocr
zapolnoch
A Node.js wrapper for the Tesseract OCR API
documentation-hub
diatech
A modern document processing and session management desktop application
md-to-pdf
simonhaenisch
CLI tool for converting Markdown files to PDF.
browse-the-web
asayman
AI Browser Automation API - Control Headless Chrome via RESTful HTTP endpoints. Perfect for web scraping, RPA, automated testing, and AI agent integration with 70+ endpoints including screenshots, PDF generation, network monitoring, and more.
node-ts-ocr
nicolaspearson
A simple wrapper around command-line utils to assist in PDF / Image OCR (Optical Character Recognition) processing using Tesseract.
undms
xcvzmoon
Text and Metadata Extraction Library for Document Files with Text Similarity Comparison
webdriver-image-comparison
wdio-user
An image compare module that can be used for different NodeJS Test automation frameworks that support the webdriver protocol