redd-archiver

Name: redd-archiver
Author: 19-84

Verified

A PostgreSQL-backed archive generator that creates browsable HTML archives from link aggregator platforms including Reddit, Voat, and Ruqqus.

339stars

18forks

Python

Installation

# Add to your Claude Code skills
git clone https://github.com/19-84/redd-archiver

Getting Started

Guides for using mcp servers skills like redd-archiver.

Best MCP Servers in 2026
Category-by-category picks: databases, dev tools, productivity, browser automation.
What is an AI Skills Marketplace?
Definitions, how marketplaces work, and how to choose between them in 2026.
Getting Started with AI Skills
First-time install walkthrough for Claude Code, Codex CLI, and ChatGPT.

Security ReportVerified

Last scanned: 5/29/2026

{
  "issues": [],
  "status": "PASSED",
  "scannedAt": "2026-05-29T07:57:07.928Z",
  "semgrepRan": false,
  "npmAuditRan": true,
  "pipAuditRan": false
}

README.md

Frequently Asked Questions

What is redd-archiver?

redd-archiver is an open-source mcp servers skill for AI coding assistants such as Claude Code, Codex CLI, and ChatGPT, built by 19-84. A PostgreSQL-backed archive generator that creates browsable HTML archives from link aggregator platforms including Reddit, Voat, and Ruqqus. It has 339 GitHub stars.

Is redd-archiver safe to use?

Yes. redd-archiver passed SkillsLLM's automated security scan — a dependency vulnerability audit plus prompt-injection heuristics — with no high-severity issues. You can read the full report in the Security Report section on this page.

How do I install redd-archiver?

Clone the repository with "git clone https://github.com/19-84/redd-archiver" and add it to your Claude Code skills directory (see the Installation section above).

What programming language is redd-archiver written in?

redd-archiver is primarily written in Python. It is open-source under 19-84 on GitHub, so you can review or fork the full source.

Are there alternatives to redd-archiver?

Yes. SkillsLLM lists many other MCP Servers skills you can browse and compare side by side. Open the MCP Servers category from the badge at the top of this page, or use the Related Skills and comparison links further down to weigh redd-archiver against similar tools.

MCP for Beginners

Build MCP servers that give AI assistants real capabilities

36 minBeginner

Comments (0)

to leave a comment.

No comments yet. Be the first to share your thoughts!

Related Skills

n8n

by n8n-io

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

196,003

linear-mcp-server mcp-documentation-server

Redd-Archiver

⭐ If you find this project useful, please star the repo! It helps others discover the tool and motivates continued development.

Transform compressed data dumps into browsable HTML archives with flexible deployment options. Redd-Archiver supports offline browsing via sorted index pages, full-text search with Docker deployment, or fully dynamic serving straight from PostgreSQL. Archives stay current with monthly incremental updates from Arctic Shift dumps. Features mobile-first design, multi-platform support, operator-selectable themes, and PostgreSQL full-text indexing.

Supported Platforms:

Platform	Format	Status	Available Posts	Data
Reddit	.zst JSON Lines (Pushshift/Arctic Shift)	✅ Full support	2.38B+ posts (40,029 subreddits; rolling — monthly dumps keep archives current)	Download
Voat	SQL dumps	✅ Full support	3.81M posts, 24.1M comments (22,637 subverses, complete archive)	Download
Ruqqus	.7z JSON Lines	✅ Full support	500K posts (6,217 guilds, complete archive)	Download

Tracked content: 2.38B+ posts across 68,883 communities (full Reddit dataset plus monthly Arctic Shift dumps via incremental updates; Voat/Ruqqus complete archives)

Version 1.1 “Living Archive” adds three serving modes, monthly incremental updates, 11 theme palettes, community metadata/wiki enrichment, and a major performance pass on top of 1.0's multi-platform archiving, REST API, and MCP server. See CHANGELOG.md.

🧭 Serving Modes

One archive, three ways to serve it — the database is the canonical store, switch modes anytime:

	Static	Hybrid (default)	Dynamic
Runtime requirements	Any web host	nginx + Flask + PostgreSQL	Flask + PostgreSQL
Full-text search / REST API	—	✅	✅
Dynamic filtering (`?flair=&min_score=&from=`) & `/all/` view	—	—	✅
Content live immediately after import	export step	export step	✅ instantly
GitHub Pages / USB-stick / offline	✅	pages only	—

# Static: export once, host anywhere
reddarc.py /data --subreddit privacy ... --output /var/www/html/

# Hybrid (current default): static pages + search server
docker compose up -d

# Dynamic: no export step, Flask renders pages from PostgreSQL
reddarc.py --import-only /data --subreddit privacy ...
REDDARCHIVER_SERVE_MODE=dynamic python search_server.py

🔄 Keep Archives Current (Incremental Updates)

Apply monthly Arctic Shift dumps to an existing archive — only tracked subreddits are imported, re-runs are skipped by checksum, and scores refresh without altering preserved content:

# One month
reddarc.py --update RS_2026-01.zst --comments-file RC_2026-01.zst

# Or point at a downloaded monthly torrent (comments/ + submissions/ layout
# is auto-discovered) and apply every unprocessed month in order:
reddarc.py --update-all /data/monthly/reddit/
reddarc.py --update-status   # audit what has been applied

🚀 Quick Start

Archive internet history before it disappears - Deploy in 2 minutes, no domain required.

Try the live demo: Browse Example Archive →

→ QUICKSTART.md - Step-by-step deployment:

2 min: Tor hidden service (no domain, no port forwarding, works behind CGNAT)
5 min: Local testing (HTTP on localhost)
15 min: Production HTTPS (automated Let's Encrypt)

Why now? Communities get banned, platforms shut down, discussions vanish. Start preserving today.

Documentation

→ First time here? QUICKSTART.md - Deploy in 2-15 minutes

→ Quick answers? FAQ - Common questions answered in 30 seconds

→ Need help? Troubleshooting - Fix common issues

→ Using the API? API Reference - 30+ REST endpoints

→ How it works? Architecture - Technical deep-dive

→ Deployment guides:

Tor Hidden Service - .onion setup (2 min, no domain needed)
HTTPS Production - Let's Encrypt SSL (15 min)
Static Hosting - GitHub/Codeberg Pages (browse-only)
Docker Reference - Complete Docker guide

→ Operations:

Incremental Updates - Keep archives current with monthly dumps
Performance - Memory, storage, and tuning
Scaling - Multi-instance deployments
Search Setup - Full-text search configuration

→ Advanced:

MCP Server - AI integration (Claude Desktop/Code)
Scanner Tools - Data discovery utilities
Registry Setup - Instance leaderboard
Installation Guide - Platform-specific setup
Contributing · Security Policy · License

🎯 Key Features

🌐 Multi-Platform Support

Archive content from multiple link aggregator platforms in a single unified archive:

Platform	Format	CLI Flag	URL Prefix
Reddit	.zst JSON Lines	`--subreddit`	`/r/`
Voat	SQL dumps	`--subverse`	`/v/`
Ruqqus	.7z JSON Lines	`--guild`	`/g/`

Automatic Detection: Platform auto-detected from file extensions
Unified Search: PostgreSQL FTS searches across all platforms
Mixed Archives: Combine Reddit, Voat, and Ruqqus in single archive

🤖 MCP Server (AI Integration)

29 MCP tools auto-generated from OpenAPI for AI assistants:

Full Archive Access: Query posts, comments, users, search via Claude Desktop or Claude Code
Token Overflow Prevention: Built-in LLM guidance with field selection and truncation
5 MCP Resources: Instant access to stats, top posts, subreddits, search help
Claude Code Ready: Copy-paste configuration for immediate use

{
  "mcpServers": {
    "reddarchiver": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp_server", "run", "python", "server.py"],
      "env": { "REDDARCHIVER_API_URL": "http://localhost:5000" }
    }
  }
}

See MCP Server Documentation for complete setup guide.

📖 For Readers (offline, mobile, Tor)

📱 Mobile-First Design: Responsive layout optimized for all devices with touch-friendly navigation
⚡ JavaScript Free: Complete functionality without JS, pure CSS interactions — Tor-optimized, no external dependencies
📇 Offline Browsing Aids: Per-letter title indexes (Ctrl+F-friendly), flair indexes, and an archive map page — search-like navigation with zero server
🔍 Full-Text Search (server deployments): PostgreSQL FTS with Google-style operators — keywords, subreddit, author, date, score
📰 Community Context: Subreddit descriptions, rules, and wikis; Voat subverse metadata, user profiles, and flair
♿ Accessibility: WCAG compliant — Lighthouse 100 accessibility score across page types
🚄 Performance: ~13KB gzipped CSS, 3–32KB gzipped pages, Lighthouse 94–100, designed for low-bandwidth networks

🛠️ For Operators

🧭 Three Serving Modes: static (host anywhere), hybrid (static + search server), dynamic (everything served live from PostgreSQL) — switch anytime, same database
🔄 Incremental Updates: monthly dumps apply idempotently; archives stay current without rebuilds
🎨 Themes: 11 palettes (default, sepia, nord, solarized, dracula, gruvbox, cyberpunk, midnight OLED, old-reddit, phosphor, high-contrast) via --theme / REDDARCHIVER_THEME, plus --accent-color and --custom-css; CSS-only dark/light mode follows system preference with a manual toggle
🗄️ PostgreSQL Backend: streaming imports with constant memory; COPY protocol; resume from checkpoints
🚀 Deployment Options: localhost/LAN (2 commands), HTTPS with automated Let's Encrypt (15 min), Tor hidden service (2 min, works behind CGNAT), HTTPS+Tor dual-mode, or GitHub/Codeberg Pages (static)
🔗 SEO Ready: meta tags, XML sitemaps, structured data; --precompress + gzip_static for high-traffic static serving
🏆 Instance Registry: leaderboard with completeness-weighted scoring for distributed archiving

🔬 For Researchers & AI

🌐 REST API v1: 30+ endpoints — posts, comments, users, statistics, search, aggregations, batch and export (CSV/NDJSON) — with field selection and truncation controls
🤖 MCP Server: 29 tools for Claude Desktop/Claude Code (see above)
📊 Rich Statistics: analytics dashboard, p