Data Extraction
700 AI tools in the Data Extraction category
pa11y-crawler
naeluh
Crawl a website and generate accessibility reports using axe-core
scraperapi-sdk
scraperapi
Node.js SDK for ScraperAPI.com
crawl
mmoulton
Website crawler and differencer
@bigknoxy/exa-cli
bigknoxy
CLI wrapper for Exa MCP tools - search, crawl, and research from the command line
github-issues-crawler
alexandervalencia
Given a username, GIC will crawl through their repositories grabbing all issues and outputting them to a local folder.
@undetecta/client
gradinarot
JavaScript/TypeScript client for the Undetecta API - Anti-detection web scraping made simple
web-crawler
eckardt
Scalable, extensible, web crawler framework.
ayakashi
zisismaras
The next generation web scraping framework
key-crawler
david-cary
This library provides support for traversing objects and their values while providing information on the traversal state, pathing to target values, and the ability to manipulate said pathing to easily move to related values.
web-scrapping-app
michaelsamuelpedro
This project is a web scraping tool that extracts data from websites for job seekers. It can be used to gather information from websites to make job applications easier and faster.
grunt-url-image-crawler
fvanharreveld
Crawl your CSS/SCSS or HTML files for image URLs and store the crawled URLs in a local JSON file.
open-web-unlocker
GitHub Actions
Fetch public web pages through a configurable fetch/browser pipeline and parse them into structured JSON or clean markdown.
node-metainspector
gabceb
npm package for web scraping. You give it a URL, and it lets you easily get its title, links, images, description, keywords, meta tags
raggle-js
raggle_npm
JavaScript client for Raggle API
scraping-bee-mcp
zneutro
ScrapingBee MCP server for testing web scraping extract rules
crawlio-browser
rashidazarang
MCP server with 100 CDP-backed tools for browser automation — screenshots, DOM, network capture, framework detection, cookies, storage, session recording, structured data extraction, performance metrics via Chrome
arcfetch
briansunter
Fetch URLs, extract clean article content, and cache as markdown. Supports automatic JavaScript rendering via Playwright.
humanoid-js
evyatarmeged
Node.js package to bypass WAF anti-bot JavaScript challenges
scrappey
demonmartin
Introducing Scrappey, your comprehensive website scraping solution provided by Scrappey.com. With Scrappey's powerful and user-friendly API, you can effortlessly retrieve data from websites, including those protected by Cloudflare. Join Scrappey today and
n8n-nodes-scrapingfish
maciejw94
n8n community node for Scrapingfish web scraping API