Data Extraction
667 AI tools in the Data Extraction category
web-structure
kilicmu
A powerful and flexible web scraping library with concurrent processing and DOM hierarchy awareness
ai-search-indexer
cruonit
Website content indexer using Mozilla Readability and Playwright
n8n-nodes-scraper
oxsr
n8n node for advanced web scraping with multiple extraction modes
zenrows
anderrv
ZenRows Node SDK
n8n-nodes-bozonx-page-scraper-microservice
bozonx
n8n node for Page Scraper microservice - extract structured content, retrieve HTML, and process URLs in batches
n8n-nodes-scrapingfish
maciejw94
n8n community node for Scrapingfish web scraping API
walkscape-helper
rikurb8
WalkScape helper - wiki scraping and AI-powered Q&A
n8n-nodes-exa-websets
virul
n8n node for Exa Websets API - Create, manage, and query structured datasets from web sources
@mihnea.dev/webscraper
mihnea.dev
A robust web scraping library using Playwright for Node.js. This library provides an easy-to-use API for automating web interactions, extracting data, and handling various web scraping tasks efficiently.
n8n-nodes-video-crawler
tanmi0609
An n8n node to search and crawl popular short videos from platforms like Douyin
crawlee-one
juro-oravec
Production-ready web scraping in a single function call. Built on Crawlee. Data transforms, caching, privacy compliance, and error tracking -- out of the box.
rebrowser-patches
nwebson
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
mcp-web-scrape
mukul975
Clean, cached web content for agents—Markdown + citations
unsurf
acoyfellow
Turn any website into a typed API
sl-dbmaria
putraadtya26
A powerful web scraping tool for everything
webscrape-gbn
jitu1612
A simple web scraping module. Supported websites are BigBasket, Grofers, and Nature's Basket.
component-search2
timaschew
search through crawl components
@teng-lin/agent-fetch
teng-lin
Full-content web fetcher with Chrome TLS fingerprinting and multi-strategy content extraction
cloudbypass-skill
cloudbypass
OpenClaw skill implementation for the Cloudbypass API (穿云API), used to bypass anti-scraping protections such as Cloudflare
top-user-agents
kikobeats
An always up-to-date list of the top 100 most common browser user-agents for HTTP requests