Searches open-access sources, downloads the papers, and bundles them into a local library — ready to hand to any AI. No accounts, no API keys.
Eight open-access sources plus a built-in web meta-search, queried in parallel and deduplicated by DOI and content hash. No API keys.
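For the curious, deduplicating by DOI and content hash could look roughly like the sketch below. It is only an illustration: the `Record` type, its field names, and the choice of SHA-256 are assumptions, not Document Finder's actual code.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Record:
    """Hypothetical search hit; the field names are illustrative only."""
    title: str
    doi: Optional[str] = None
    content: bytes = b""


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a leading resolver URL or 'doi:' prefix."""
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)


def dedupe(records: Iterable[Record]) -> list[Record]:
    """Keep the first record seen for each DOI or content hash."""
    seen_dois: set[str] = set()
    seen_hashes: set[str] = set()
    unique: list[Record] = []
    for rec in records:
        doi = normalize_doi(rec.doi) if rec.doi else None
        # SHA-256 is an assumed choice of content hash.
        digest = hashlib.sha256(rec.content).hexdigest() if rec.content else None
        if (doi and doi in seen_dois) or (digest and digest in seen_hashes):
            continue  # already have this document from another source
        if doi:
            seen_dois.add(doi)
        if digest:
            seen_hashes.add(digest)
        unique.append(rec)
    return unique
```

Checking both keys matters because the same paper can arrive from one source with a DOI and from another as a bare PDF with no metadata.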
Preprints in CS, physics, math, biology. Direct PDF links, no rate-limit pain.
~250M scholarly works with an open-access filter applied by default.
~200M papers with semantic relevance ranking and citation graphs.
Life-science papers and preprints with open full text — biomedicine, genomics, clinical.
Books, papers, scanned media. Slow but deep — perfect for older work.
Directory of Open Access Journals — vetted, peer-reviewed publications only.
CERN’s open repository — papers, preprints, datasets and theses across every field.
70,000+ public-domain ebooks. EPUB native, ideal for the history of ideas.
Aggregates DuckDuckGo, Bing, Brave, Mojeek, Marginalia & Startpage behind a circuit breaker — plus a no-Docker local SearXNG fallback. Catches the long tail.
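A circuit breaker, in this context, stops sending queries to an engine that keeps failing and tries it again after a pause. The sketch below shows the general pattern only; the class name, thresholds, and cooldown are assumptions rather than the app's real settings.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Minimal per-engine breaker: opens after repeated failures, then cools down."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures  # assumed threshold
        self.cooldown = cooldown          # assumed pause, in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a request to this engine may be attempted now."""
        if self.opened_at is None:
            return True
        # Once the cooldown has passed, let trial requests through again.
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        """Close the breaker and forget earlier failures."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the breaker at the threshold."""
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()
```

With one breaker per engine, a search checks `allow()` before each request and skips engines whose breaker is open, so one blocked engine does not stall the rest.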
Type what you’re looking for. Document Finder automatically splits your question into three or four scholarly sub-queries — and an optional local model sharpens them further. You always see exactly what it searched, so nothing is hidden.
Eight open-access sources plus web meta-search, queried in parallel, with per-source rate limiting and retries. You watch the progress bars fill, the source lanes climb, and the queue drain in real time. Stop any time.
Each search gets its own folder on disk. PDFs and EPUBs land in the folder, extracted text in _text/, and a single library.db holds the metadata. Browse, search, open any document or reveal it in your file manager, export as ZIP, or feed straight into your favorite LLM tool.
Concurrent downloads with adaptive backoff and silent retries. Documents from a big run stream in as they arrive, not all at the end. Retry any failures in one click, and documents you already have in another library are not downloaded again.
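Retrying with backoff can be sketched as follows. This uses plain exponential backoff with full jitter, which is one common approach and not necessarily the adaptive policy the app uses; the function name and default delays are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(fetch: Callable[[], T], attempts: int = 4,
                       base_delay: float = 0.5, max_delay: float = 8.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch() until it succeeds, waiting longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each time, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random amount up to the ceiling.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The random jitter spreads retries out, so many concurrent downloads that fail together do not all hit the same source again at the same instant.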
Your queries, your library, your machine. Optional on-device AI (a small embedding model plus a 1.5B-parameter LLM) reranks results and expands queries. It runs fully offline: no API keys, no telemetry.
Every PDF and EPUB is run through a fast Rust extractor and stored alongside as plain text. Grep-able. LLM-ready. No vendor lock-in.
Bundle any library into a portable .zip of PDFs, EPUBs and extracted text — ready to drop into any AI context window.
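If you would rather script the export yourself, an equivalent bundle can be built with Python's standard library. This is a sketch under assumptions: it treats a library as a plain folder and assumes the extracted text is stored as `.txt` files; `export_library` is a hypothetical helper, not part of the app.

```python
import zipfile
from pathlib import Path


def export_library(library_dir: Path, zip_path: Path) -> int:
    """Zip PDFs, EPUBs and extracted text; return the number of files added."""
    keep = {".pdf", ".epub", ".txt"}  # assumed extensions
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(library_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in keep:
                # Store paths relative to the library root so the
                # archive unpacks into the same folder layout.
                archive.write(path, path.relative_to(library_dir).as_posix())
                count += 1
    return count
```

Calling `export_library(Path("my-library"), Path("my-library.zip"))` would leave database files and anything else outside the three extensions out of the archive.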
Sub-query progress, per-source lanes, throughput, ETA, and a per-engine health bar for the web meta-search. For people who like watching their work.
Three themes (Paper, Slate and Midnight) with nine accent colors and a choice of compact or regular density. WCAG-AA contrast, full keyboard navigation, screen-reader live regions, and reduced-motion support throughout.
Native installers for macOS, Windows, and Linux. The builds are not notarized (the macOS build is ad-hoc signed); first-launch steps are on the release page.