CatchTheTornado/text-extract-api

Skill · Everything to Markdown to LLMs
awesome-list, awesome-gen-ai-tools, anonymization, api

Document (PDF, Word, PPTX ...) extraction and parsing API using OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.

Directory Presence

Cross-referenced across 55 tracked directories

Directory	Status	First Seen	Last Confirmed	Link
AI Collections		3/13/2026	3/29/2026

Adoption Metrics & Statistics

Popularity Rank: #3525

Listed In: 1 / 55

Adoption Stage: Emerging

Created: 10/23/2024

GitHub Stars: 3,026

Security Analysis

Score: 100/100

0 dependency vulnerabilities found

AI Security Scan

skillful.sh

Run an AI-powered security scan to analyze this package's source code for vulnerabilities, prompt injection vectors, data exfiltration risks, and behavior mismatches.

Scans fetch actual source code from the GitHub repository, not just the README.

Related Skills

docling

Christoph Auer <cau@zurich.ibm.com>, Michele Dolfi <dol@zurich.ibm.com>, Maxim Lysak <mly@zurich.ibm.com>, Nikos Livathinos <nli@zurich.ibm.com>, Ahmed Nassar <ahn@zurich.ibm.com>, Panos Vagenas <pva@zurich.ibm.com>, Peter Staar <taa@zurich.ibm.com>

SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications.


Skill · Everything to Markdown to LLMs

56K · 2 dirs

bytedance/Dolphin

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

Skill · Everything to Markdown to LLMs

8.9K · 1 dir

LLMSTXT.NEW

Generate consolidated text files from websites for LLM training and inference – Powered by Firecrawl

Skill · Everything to Markdown to LLMs

1 dir

NuExtract 2.0 by NuMind

"Outclassing Frontier LLMs in Information Extraction"

Skill · Everything to Markdown to LLMs

1 dir

