Data Extraction
646 AI tools in the Data Extraction category
@ptrumpis/snap-lens-web-crawler
ptrumpis
Crawl and download Snap Lenses from *lens.snapchat.com* with ease.
@hej-ai/crawler
glutch
Scrape any webpage into clean markdown
the-a11y-machine
hywan
The A11y Machine is an automated accessibility testing tool which crawls and tests all pages of any website.
browserfabric
zomux
BrowserFabric TypeScript SDK - Cloud browser automation API client
crawl-cli
felipextrindade
A Node crawler/scraper for retrieving data from websites
grabit-engine
imroodydev
A plugin-based engine for scraping media streams and subtitles. Works in Node.js, browsers, React and React Native. Load plugins from GitHub, local files, or code — with caching, health tracking, and auto-updates built in.
@aduptive/instagram-scraper
aduptive
Modern TypeScript library for collecting public Instagram content with smart delays, mobile-first approach, and media support
@gatesolve/puppeteer-plugin
arson
Automatic CAPTCHA solving for Puppeteer. Detects Cloudflare Turnstile, reCAPTCHA, and hCaptcha challenges and solves them via GateSolve.
@hardbulls/wbsc-crawler
arjanfrans
Tool to crawl events, leagues and statistics from WBSC based websites.
@hanivanrizky/nestjs-browser-action
hanivanrizky
Puppeteer-based browser automation module for NestJS
@teng-lin/agent-fetch
teng-lin
Full-content web fetcher with Chrome TLS fingerprinting and multi-strategy content extraction
axe-crawler
tjscollins
A highly configurable website crawler for automatically testing a website for accessibility issues using the axe-core library. Uses Selenium and headless Chrome to load pages, inject axe-core, and run tests. Generates an HTML summary report in addition
pinterest-djw
ondarion
Pinterest image search tool using web scraping
camofox-browser
redf0x1
Anti-detection browser server for AI agents — REST API wrapping Camoufox engine with OpenClaw plugin support
open-web-unlocker
GitHub Actions
Fetch public web pages through a configurable fetch/browser pipeline and parse them into structured JSON or clean markdown.
pi-agent-browser
coctostan
Browser automation tool for pi — interactive browsing, screenshots with inline vision, and session cleanup via agent-browser CLI
mcp-chrome-control
codingbutterbot
Browser automation for AI assistants - Chrome control via JSON-RPC and MCP
@scrapeops/n8n-nodes-scrapeops
aswadali
n8n community node for ScrapeOps Proxy, Parser, and Data APIs for web scraping and data extraction
get-site-urls
alexpage
Crawl a URL to generate a sitemap and find 404 errors with one command
playread
lanmower
Web content extraction and automation via Playwright MCP