Data Extraction
654 AI tools in the Data Extraction category
@docchi/scraping-anime-websites-poland
xanax_
A module for fetching links from popular Polish anime websites
@fanboynz/network-scanner
fanboynz
A Puppeteer-based network scanner for analyzing web traffic, generating adblock filter rules, and identifying third-party requests. Features include fingerprint spoofing, Cloudflare bypass, content analysis with curl/grep, and multiple output formats.
nstbrowser-ai-agent
nstbrowser
Nstbrowser AI agent for browser automation with advanced fingerprinting
downode
ceoimon
One Rule to scrape them all.
scraping-bee-mcp
zneutro
ScrapingBee MCP server for testing web scraping extract rules
crawltojson
vivmagarwal
Crawl websites and convert them to JSON with ease
crawlee-one
juro-oravec
Production-ready web scraping in a single function call. Built on Crawlee. Data transforms, caching, privacy compliance, and error tracking -- out of the box.
@hyperbrowser/sdk
leoscope
Node SDK for Hyperbrowser API
rebrowser-puppeteer-core
nwebson
A drop-in replacement for puppeteer-core patched with rebrowser-patches. It lets you pass modern automation detection tests.
getcontentapi
stabem
Official TypeScript/Node.js SDK for ContentAPI — extract content from any URL
apify
GitHub Actions
The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs with headless Chrome and Puppeteer, among other tools.
crawl-cli-tool
abdo-el-mobayad
A CLI tool for web crawling with auto-discovery, recursive crawling, and markdown output
@mihnea.dev/recaptcha-solver
mihnea.dev
@mihnea.dev/webscraper
mihnea.dev
A robust web scraping library using Playwright for Node.js. This library provides an easy-to-use API for automating web interactions, extracting data, and handling various web scraping tasks efficiently.
@nampham1106/search-cli
nampham1106
A modern TypeScript CLI tool for web search and content fetching powered by DuckDuckGo
scrapix-cli
simiokunowo
A TypeScript-based CLI Application for scraping Google images
camofox-browser
redf0x1
Anti-detection browser server for AI agents — REST API wrapping Camoufox engine with OpenClaw plugin support
open-web-unlocker
GitHub Actions
Fetch public web pages through a configurable fetch/browser pipeline and parse them into structured JSON or clean markdown.
tiktok-signature
carcabot
TikTok Signature Generator - Generate valid X-Bogus and X-Gnarly signatures for TikTok API requests
@4ier/neo
4ier
Turn any website into an AI-callable API. Passive traffic capture, API schema generation, and execution.