# Introduction
Welcome to the official documentation for Crawlio, a powerful, cloud-based web crawling and scraping SaaS platform designed for developers, data engineers, and businesses that need structured data from the web at scale.
## What Is Crawlio?
Crawlio simplifies the entire scraping lifecycle with robust SDKs, a feature-rich REST API, and managed cloud infrastructure, so you can focus on your data rather than on the plumbing.
Crawlio is a platform that allows you to:
- Crawl websites at scale using smart configuration and distributed architecture.
- Extract structured data using customizable scraping rules.
- Access it via:
  - JavaScript SDK
  - Python SDK
  - REST API
Whether you're building a lead generation tool, monitoring competitor pricing, gathering product listings, or collecting news data, Crawlio is built to handle both simple and complex scraping workflows reliably.
## Key Features
- **Smart Crawling**: Handle sitemaps, pagination, dynamic (JS-rendered) pages, rate limiting, and more.
- **Easy SDKs**: Intuitive SDKs for Node.js and Python to define crawl jobs, fetch results, and manage sessions.
- **REST API Access**: A full-featured HTTP API for those who prefer direct API integration or work in other languages.
- **Data Extraction Rules**: Use selectors or built-in parsers for structured data extraction (JSON, etc.).
- **Job Management**: Schedule recurring jobs, monitor job status, get alerts, and manage historical results.
- **Scalability**: Crawlio handles millions of pages with built-in retries, proxies, and headless browser support.
## Integration Options
| Option | Use Case | Docs Section |
|---|---|---|
| JavaScript SDK | Node.js applications or scripts | JavaScript SDK |
| Python SDK | Python-based pipelines or notebooks | Python SDK |
| REST API | Custom clients, dashboards, or integrations | REST API |
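To give a feel for the REST integration path, here is a minimal Python sketch that constructs a "create crawl job" request. Note that the base URL, endpoint path, payload fields, and auth header below are hypothetical placeholders for illustration only; consult the REST API section for the actual endpoints and parameters. The sketch builds the request without sending it:

```python
import json
import urllib.request

# Placeholder base URL -- not Crawlio's real API host.
API_BASE = "https://api.crawlio.example"

def build_crawl_request(api_key: str, start_url: str) -> urllib.request.Request:
    """Construct (but do not send) a hypothetical 'create crawl job' request.

    The endpoint path, JSON fields, and bearer-token header are illustrative
    assumptions, not Crawlio's documented API.
    """
    payload = json.dumps({"url": start_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/v1/crawl",  # hypothetical endpoint
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_crawl_request("YOUR_API_KEY", "https://example.com")
print(req.get_method(), req.full_url)
```

The same job-submission flow is wrapped for you by the JavaScript and Python SDKs, so in practice you would only hand-roll HTTP requests like this from languages without an official SDK.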