SDK and CLI for parsing PDF, DOCX, HTML, and more into a unified document representation for powering downstream workflows such as gen AI applications.
Cross-referenced across 55 tracked directories
Popularity Rank: #218
Listed In: 2 / 55
Adoption Stage: Emerging
Listed For: 3d
Recently added to the ecosystem
Score: 100/100
0 dependency vulnerabilities found
Adam Fourney <adamfo@microsoft.com>
Utility tool for converting various files to Markdown
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Generate consolidated text files from websites for LLM training and inference – Powered by Firecrawl
"Outclassing Frontier LLMs in Information Extraction"