spider

Name: spider
Author: spider-rs

Verified

Get web data for AI agents and LLMs

2,594 stars

218 forks

Rust

Added 2/22/2026

Installation

# Add to your Claude Code skills
git clone https://github.com/spider-rs/spider

Security Report: Verified

Last scanned: 4/23/2026

{
  "issues": [],
  "status": "PASSED",
  "scannedAt": "2026-04-23T06:08:11.965Z",
  "semgrepRan": false,
  "npmAuditRan": true,
  "pipAuditRan": true
}

README.md

Frequently Asked Questions

What is spider?

spider is an open-source AI agent skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by spider-rs. It gets web data for AI agents and LLMs, and it has 2,594 GitHub stars.

Is spider safe to use?

Yes. spider passed SkillsLLM's automated security scan (a dependency vulnerability audit plus prompt-injection heuristics) with no high-severity issues. The scan ran npm and pip audits but not Semgrep. You can read the full report in the Security Report section on this page.

How do I install spider?

Clone the repository with "git clone https://github.com/spider-rs/spider" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is spider written in?

spider is primarily written in Rust. It is open-source under spider-rs on GitHub, so you can review or fork the full source.

Are there alternatives to spider?

Yes. SkillsLLM lists many other AI Agents skills you can browse and compare side by side. Open the AI Agents category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh spider against similar tools.

Related Skills

superpowers

by obra

An agentic skills framework & software development methodology that works.

234,966

pro-workflow Android-MVVM-Architecture-Android-Voice-AI-SDK

Spider is a concurrency-first crawling engine built in Rust. It streams pages the moment they arrive, renders JavaScript only when a page demands it, and scales from a single script to a distributed fleet without changing your code. The same engine powers Spider Cloud, so you can prototype locally and move to managed infrastructure with one config change.

Start in the cloud

The hardest part of crawling at scale isn't the code. It's the proxies, headless browsers, and constant anti-bot churn. Spider Cloud runs all of that for you behind the same API.

Get a free API key → (no card required)

[dependencies]
spider = { version = "2", features = ["spider_cloud"] }

use spider::configuration::{SpiderCloudConfig, SpiderCloudMode};
use spider::website::Website;

let cloud = SpiderCloudConfig::new("sk-...")
    .with_mode(SpiderCloudMode::Smart); // proxy by default, auto-unblock when blocked

let mut website = Website::new("https://example.com")
    .with_spider_cloud_config(cloud)
    .build()?;

Smart mode routes through proxies first and escalates to the unblocker only on pages that fight back, so you pay for bypass only when it's needed.
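The "proxy first, escalate when blocked" idea can be sketched in plain Rust. This is an illustration only, not Spider Cloud's implementation: `Route`, `looks_blocked`, `fetch_smart`, and the choice of status codes that count as "blocked" are all assumptions made for the example, and the network call is replaced by a closure.

```rust
// Hypothetical sketch of a "proxy first, unblock on demand" policy.
// None of these names come from the spider crate.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Route {
    Proxy,
    Unblocker,
}

/// Assumed heuristic: treat these HTTP statuses as "the page fought back".
fn looks_blocked(status: u16) -> bool {
    matches!(status, 403 | 429 | 503)
}

/// Try the proxy route first; escalate to the unblocker only if blocked.
/// `fetch` stands in for a real network call and returns an HTTP status.
/// Returns the route that produced the final response and its status.
fn fetch_smart<F>(url: &str, mut fetch: F) -> (Route, u16)
where
    F: FnMut(&str, Route) -> u16,
{
    let status = fetch(url, Route::Proxy);
    if looks_blocked(status) {
        (Route::Unblocker, fetch(url, Route::Unblocker))
    } else {
        (Route::Proxy, status)
    }
}

fn main() {
    // Fake fetcher: URLs ending in "/guarded" reject plain proxy traffic.
    let fake = |url: &str, route: Route| -> u16 {
        if url.ends_with("/guarded") && route == Route::Proxy {
            403
        } else {
            200
        }
    };
    println!("{:?}", fetch_smart("https://example.com/", fake));
    println!("{:?}", fetch_smart("https://example.com/guarded", fake));
}
```

In this sketch the more expensive route is used only for the second URL, which is the cost property the paragraph above describes.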

Or run it locally

No key, no service. Just the crawler.

[dependencies]
spider = "2"

use spider::{tokio, website::Website};

#[tokio::main]
async fn main() {
    let mut website = Website::new("https://example.com");
    let mut rx = website.subscribe(16);

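    // Print each page as it arrives; the loop ends when the channel closes.
    let printer = tokio::spawn(async move {
        while let Ok(page) = rx.recv().await {
            println!("{}  {}", page.status_code, page.get_url());
        }
    });

    website.crawl().await;
    website.unsubscribe(); // closes the channel so the printer task can finish
    let _ = printer.await; // wait for buffered pages to print before exiting
}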

Pages stream in as they're fetched. The crawler discovers links, respects boundaries, and stops on its own.
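To make "discovers links, respects boundaries, and stops on its own" concrete, here is a minimal breadth-first sketch over an in-memory link graph. It does not use the spider crate and is not how Spider is implemented. `in_bounds`, `crawl`, and the prefix-based boundary rule are assumptions chosen for illustration, and the `HashMap` stands in for fetching real pages.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical boundary rule: stay on URLs that share the root's prefix.
fn in_bounds(url: &str, root: &str) -> bool {
    url.starts_with(root)
}

/// Breadth-first crawl over an in-memory link graph standing in for the web.
/// Returns pages in the order they were "fetched".
fn crawl(root: &str, links: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut frontier: VecDeque<String> = VecDeque::new();
    let mut fetched = Vec::new();

    seen.insert(root.to_string());
    frontier.push_back(root.to_string());

    // The loop ends on its own once no unseen, in-bounds links remain.
    while let Some(url) = frontier.pop_front() {
        fetched.push(url.clone());
        for next in links.get(url.as_str()).into_iter().flatten() {
            // `seen.insert` returns false for URLs already queued or visited.
            if in_bounds(next, root) && seen.insert(next.to_string()) {
                frontier.push_back(next.to_string());
            }
        }
    }
    fetched
}

fn main() {
    let root = "https://example.com/";
    let mut links: HashMap<&str, Vec<&str>> = HashMap::new();
    links.insert(root, vec!["https://example.com/a", "https://other.org/"]);
    links.insert("https://example.com/a", vec![root, "https://example.com/b"]);

    for url in crawl(root, &links) {
        println!("{url}");
    }
}
```

The `seen` set is what guarantees termination: the link from `/a` back to the root is skipped, and the off-site link is filtered out by the boundary check.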

How it works

Spider runs HTTP-first and only launches headless Chrome when a page actually needs JavaScript. Streaming is built into both the HTTP and Chrome paths, so pages flow back the moment they're fetched instead of batching at the end. The same API scales from a single async task to a distributed worker fleet. Proxies, retries, rate limiting, and stealth are built in.
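One way to picture the HTTP-first decision is a check on the plain HTTP response: if the HTML is an empty shell that loads scripts, it probably needs a browser. The sketch below is a rough stand-in, not Spider's actual detection logic. `Renderer`, `visible_text_len`, `choose_renderer`, and the 20-character threshold are all assumptions made for the example.

```rust
// Hypothetical heuristic, not spider's actual detection logic.

#[derive(Debug, PartialEq, Eq)]
enum Renderer {
    Http,
    Chrome,
}

/// Rough count of non-whitespace characters outside of tags.
/// Inline script bodies are counted as text too; this is only a sketch.
fn visible_text_len(html: &str) -> usize {
    let mut in_tag = false;
    let mut count = 0;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' => in_tag = false,
            _ if !in_tag && !c.is_whitespace() => count += 1,
            _ => {}
        }
    }
    count
}

/// Assumed rule: a page that loads scripts but ships almost no text
/// is treated as needing a headless browser.
fn choose_renderer(html: &str) -> Renderer {
    let has_script = html.contains("<script");
    if has_script && visible_text_len(html) < 20 {
        Renderer::Chrome
    } else {
        Renderer::Http
    }
}

fn main() {
    let static_page =
        "<html><body><h1>Pricing</h1><p>Plans start at ten dollars a month.</p></body></html>";
    let app_shell =
        r#"<html><body><div id="root"></div><script src="/app.js"></script></body></html>"#;

    println!("{:?}", choose_renderer(static_page));
    println!("{:?}", choose_renderer(app_shell));
}
```

A production crawler would need a sturdier signal than a character count, but the shape is the same: the cheap HTTP fetch always happens first, and the browser is launched only for pages that fail the check.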

Install

You want…	Run
Rust library	`cargo add spider`
Command-line tool	`cargo install spider_cli`
Node.js package	`npm i @spider-rs/spider-rs`
Python package	`pip install spider_rs`
MCP server (Claude, Cursor, …)	`cargo install spider_mcp`
Managed crawling	spider.cloud

Configuration

Every option has a sensible default, so set only what you need.

let mut website = Website::new("https://example.com")
    .with_limit(50)                    // maximum pages to crawl
    .with_depth(10)                    // how deep to follow links
    .with_delay(500)                   // pause between requests (ms)
    .with_respect_robots_txt(true)
    .with_subdomains(true)
    .with_user_agent(Some("MyBot/1.0"))
    .with_stealth(true)
    .build()
    .unwrap();

Full reference in the Configuration docs.

For JavaScript-heavy sites, enable features = ["chrome"] and call crawl_smart(). Spider tries HTTP first and only launches Chrome on pages that need it.

Use cases

Teams use Spider to feed the open web into vector stores for LLM and RAG pipelines, monitor sites for SEO and price changes, export pages as Markdown, JSON, or WARC, and drive headless Chrome for AI browsing agents. There are 50+ runnable examples to start from.

Learn more

📚 Guides: recipes and integrations
📖 API docs: every option and method
💬 Discord: questions and ideas
🐛 Issues: bugs and feature requests

Contributing

PRs welcome. See CONTRIBUTING.md.

cargo test -p spider                  # unit tests
RUN_LIVE_TESTS=1 cargo test           # live network tests

License

MIT.