Data Extraction
707 AI tools in the Data Extraction category
octagon-deep-research-mcp
octagonai
MCP server for Deep Research. Provides specialized AI-powered deep research capabilities with no rate limits - faster than ChatGPT Deep Research, more thorough than Grok DeepSearch or Perplexity Deep Research.
ai-sdk-agents-universal-scraper-tool
aisdkagents
AI SDK Agents Universal Scraper Tool
just-scrape
vincigit00
ScrapeGraph AI CLI tool
apify
GitHub Actions
The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs, including but not limited to those built on headless Chrome and Puppeteer.
scrapix-cli
simiokunowo
A TypeScript-based CLI Application for scraping Google images
rebrowser-patches
nwebson
Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.
crawl-server
imike3049
Efficient SEO-focused server for Wasm-generated pages
mcp-web-scrape
mukul975
Clean, cached web content for agents—Markdown + citations
nstbrowser-ai-agent
nstbrowser
Nstbrowser AI agent for browser automation with advanced fingerprinting
parallaxapis-sdk-ts
pxcaptcha
ParallaxAPIs SDK
theia-suite
reyhan6610
A large-scale web scraping library for Node.js.
playwrightium
analysta
Model Context Protocol server that exposes reusable Playwright actions.
rebrowser-playwright-core
nwebson
A drop-in replacement for playwright-core patched with rebrowser-patches. It allows passing modern automation detection tests.
headsman
plenty-of-ish
Uses a headless browser to fully render a webpage and return the final html content.
clawpage-mcp
clawpage
MCP server for ClawPage web extraction API. Extract and structure any web page into clean JSON.
@seaavey/scapers
seaavey
The Scapers is a collection of tools for scraping data from the web.
@apiverve/webimagescraper
charifield
Web Image Scraper is a simple tool for scraping images from a website. It returns the URLs of the images found on the website.
@expandai/ai
jlipp
Vercel AI SDK integration for expand.ai - fetch and extract content from any URL
@pinkpixel/web-scout-mcp
sizzlebop
MCP server for web search and content extraction with multiple URL support and memory optimizations
@shepherd-terminal/reacher
GitHub Actions
CLI tool for scraping LinkedIn and Google Maps to find businesses by type and location