TorSearch Project

PROJECT SHOWCASE

TorSearch

A privacy-first, decoupled Tor (.onion) search engine platform for professional dark-web indexing operations.

Decoupled Two-Node Architecture

TorSearch separates crawling from serving. A Private Local Node handles crawling and an Encapsulated VPS Node handles public search, so operators can crawl dark-web content without exposing their crawling infrastructure to the public internet.

Technical Stack

  • Core: Python 3.11+ / FastAPI
  • Search: OpenSearch (Read/Write Split)
  • Database: PostgreSQL & Redis
  • Network: Tor SOCKS5 & Selenium
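
As an illustration of the Tor SOCKS5 entry in the stack, here is a minimal sketch of routing crawler requests through a local Tor client. It assumes Tor is listening on its default SOCKS port (9050) on localhost and that the `requests` library is installed with SOCKS support (`requests[socks]`). The helper names `tor_proxies` and `fetch_onion` are hypothetical and not taken from TorSearch.

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Build a proxies mapping that sends traffic through a Tor SOCKS5 listener.

    The socks5h scheme asks the proxy to resolve hostnames, which is what
    .onion addresses need, since ordinary DNS cannot resolve them.
    """
    proxy = f"socks5h://{host}:{port}"
    return {"http": proxy, "https": proxy}


def fetch_onion(url: str, timeout: float = 60.0) -> str:
    """Fetch a page over Tor (hypothetical helper; needs a running Tor client)."""
    # Imported lazily so tor_proxies() works without the dependency installed.
    import requests  # assumption: installed via `pip install "requests[socks]"`

    response = requests.get(url, proxies=tor_proxies(), timeout=timeout)
    response.raise_for_status()
    return response.text
```

Using `socks5h` rather than `socks5` matters here: with plain `socks5` the client tries to resolve the hostname locally, which fails for .onion addresses and can leak lookups outside Tor.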

Operational Intelligence

Advanced Crawling

  • Normalized URL discovery & depth limiting
  • Exponential backoff & jitter scheduling
  • Auth-Crawler for "Login-Only" hidden sites
  • Per-domain concurrency & rate limiting
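
Three of the crawling features above (URL normalization, depth limiting, and backoff with jitter) can be sketched as follows. This is an illustration under stated assumptions, not TorSearch's implementation: the function names, the default depth of 3, and the 30-second base delay are all hypothetical, and the backoff uses the common "full jitter" variant. The normalizer targets ordinary hostname-based URLs such as .onion addresses.

```python
import random
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Canonicalize a URL so equivalent links map to a single frontier entry."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    # hostname is already lowercased and excludes any user:password prefix.
    host = parts.hostname or ""
    # Keep the port only when it differs from the scheme default.
    default_port = {"http": 80, "https": 443}.get(scheme)
    netloc = host if parts.port in (None, default_port) else f"{host}:{parts.port}"
    path = parts.path or "/"
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def should_follow(depth: int, max_depth: int = 3) -> bool:
    """Depth limiting: only follow links found above the maximum depth."""
    return depth < max_depth


def backoff_delay(attempt: int, base: float = 30.0, cap: float = 3600.0,
                  rng: Optional[random.Random] = None) -> float:
    """Full-jitter exponential backoff.

    Returns a delay drawn uniformly from [0, min(cap, base * 2**attempt)].
    """
    source = rng if rng is not None else random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)
```

Randomizing the delay spreads retries out, so many failed fetches against the same hidden service do not all come back at the same moment.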

Signal Enrichment

  • Near-duplicate detection with duplicate families
  • PageRank & Domain Graph scoring
  • Semantic embedding backfill (transformers)
  • Host authority & Uptime metrics
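
To make the PageRank and domain graph scoring bullet concrete, here is a small power-iteration sketch over a domain-level link graph. It is a textbook formulation rather than the project's own scoring code; the damping factor of 0.85, the iteration count, and the example domains are assumptions chosen for illustration.

```python
def pagerank(graph: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Compute PageRank for a graph given as {source: [targets, ...]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    if n == 0:
        return {}

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by domains with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        baseline = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: baseline for node in nodes}
        for source, targets in graph.items():
            if not targets:
                continue
            # Each outgoing link carries an equal share of the source's rank.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical domain graph: two sites link to a hub, which links back to one.
domain_graph = {
    "a.onion": ["hub.onion"],
    "b.onion": ["hub.onion"],
    "hub.onion": ["a.onion"],
}
scores = pagerank(domain_graph)
```

In this example `hub.onion` ends up with the highest score because it is linked from both other domains, and `b.onion` the lowest because nothing links to it.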

Trust & Safety

  • ML-powered NSFW classification
  • Phishing heuristics & risk scoring
  • Cloaking detection & fetch comparison
  • Anti-bot & Captcha telemetry dashboards
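
One way to read "cloaking detection & fetch comparison" is to fetch the same page twice under different client profiles and flag pages whose responses diverge sharply. The sketch below assumes that interpretation; the word-level comparison, the 0.5 threshold, and the function names are hypothetical choices, not details from TorSearch.

```python
from difflib import SequenceMatcher


def fetch_similarity(body_a: str, body_b: str) -> float:
    """Return a 0.0-1.0 similarity score between two fetched page bodies.

    Bodies are compared word by word, so whitespace differences are ignored.
    """
    return SequenceMatcher(None, body_a.split(), body_b.split()).ratio()


def looks_cloaked(crawler_body: str, browser_body: str,
                  threshold: float = 0.5) -> bool:
    """Flag a page when the two fetches share too little content."""
    return fetch_similarity(crawler_body, browser_body) < threshold
```

A low score is only a signal: pages with heavy dynamic content can also differ between fetches, so a real pipeline would likely combine this with other evidence before scoring a site as risky.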

Delta Replication

  • Gzip NDJSON incremental export bundles
  • External versioning for safe imports
  • Resumable HTTPS sync (API Key protected)
  • VPS node stays hardened with zero DB bloat
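
The export and import steps above can be sketched with the standard library alone. The example assumes each document carries an `id` and a monotonically increasing `version` field (both hypothetical names), and it imitates external versioning with a plain dictionary: a write is accepted only when its version is strictly higher than the stored one, which is how OpenSearch treats writes made with `version_type=external`. Whether TorSearch uses that exact mechanism is an assumption.

```python
import gzip
import json


def export_bundle(docs: list, since_version: int) -> bytes:
    """Serialize documents newer than since_version as gzip-compressed NDJSON."""
    changed = [doc for doc in docs if doc["version"] > since_version]
    lines = "".join(json.dumps(doc, sort_keys=True) + "\n" for doc in changed)
    return gzip.compress(lines.encode("utf-8"))


def import_bundle(bundle: bytes, index: dict) -> int:
    """Apply a bundle to an in-memory index and return how many docs were written.

    A document is written only if it is new or carries a strictly higher
    version, so replaying the same bundle is harmless.
    """
    applied = 0
    for line in gzip.decompress(bundle).decode("utf-8").splitlines():
        if not line:
            continue
        doc = json.loads(line)
        current = index.get(doc["id"])
        if current is None or doc["version"] > current["version"]:
            index[doc["id"]] = doc
            applied += 1
    return applied


# Hypothetical local documents and a public index that already holds version 1.
local_docs = [
    {"id": "page-1", "version": 1, "title": "old"},
    {"id": "page-2", "version": 2, "title": "new page"},
    {"id": "page-3", "version": 3, "title": "newer page"},
]
public_index = {"page-1": {"id": "page-1", "version": 1, "title": "old"}}
bundle = export_bundle(local_docs, since_version=1)
first_pass = import_bundle(bundle, public_index)
second_pass = import_bundle(bundle, public_index)
```

Because stale or repeated writes are rejected by version, an interrupted sync can simply resend the bundle without corrupting the public index.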

Operator-Focused Admin UI

Manage queues, bulk actions, ads, and homepage layouts from a unified, private dashboard.

  • Queue Browser
  • Domain Blocking
  • Auth Credentials
  • Ads Manager

Deploy the TorSearch Stack

Ready-to-operate infrastructure for private or public search projects.