Going Paperless in 2026: The Complete Guide to Paperless-ngx
Transform your paper chaos into a searchable digital archive with Paperless-ngx. Self-hosted document management with OCR, AI tagging, and complete privacy.
Table of Contents
- The Paper Problem We All Ignore
- What is Paperless-ngx?
- Key Features That Matter
- Document Processing
- AI and Automation
- User Experience
- Security and Privacy
- Hardware Requirements
- Minimum Requirements (Home/Light Use)
- Recommended Setup (Power Users)
- Storage Planning
- Installation Guide: Docker Setup
- Quick Install (Recommended)
- Manual Docker Compose Setup
- Directory Structure Explained
- Reverse Proxy (Production)
- Basic Configuration and First Steps
- Create Your Tag Structure
- Set Up Document Types
- Add Correspondents
- Configure Auto-Tagging Rules
- OCR and Search Capabilities
- How OCR Works
- Search Features
- Multi-Language OCR
- Automation Workflows
- Email Import
- Consume Folder Automation
- Webhook Integrations
- Security and Backup Strategy
- Network Security
- User Permissions
- Backup Strategy
- Migrating from Paper Archives
- Scanner Setup
- Scanning Best Practices
- Digitization Workflow
- Handling Backlog
- Conclusion: Why Paperless-ngx Is Your Best Choice
Take control of your documents with the best self-hosted document management system.

The Paper Problem We All Ignore
You know that drawer. The one stuffed with old receipts, tax documents from three years ago, warranty cards for appliances you no longer own, and cables for devices you’ve never even heard of. We all have one. Some of us have several.
The average American household receives 41 pounds of junk mail every year. Add in utility bills, insurance statements, medical records, and the instruction manual for that blender you bought in 2019, and you’re drowning in paper. Finding a specific document when you need it becomes an archaeological expedition. “Where did I put the warranty for the dishwasher?” is a question that triggers a full-scale excavation.
But here’s the real problem: paper is fragile. One flood, one fire, one over-caffeinated morning with a spilled coffee, and years of records are gone. Paper doesn’t have a backup. Paper doesn’t have search. Paper doesn’t have a “find all receipts from Amazon in 2024” button.
What you need is a paperless office system — one that’s completely under your control. No cloud subscription fees, no privacy concerns about who’s reading your tax returns, and no vendor lock-in when the service shuts down or changes its pricing model.
That’s where Paperless-ngx comes in.
What is Paperless-ngx?

Paperless-ngx is an open-source document management system that transforms your paper chaos into a searchable, organized digital archive. It’s the community-supported successor to the original Paperless project, actively maintained by contributors who use it themselves and backed by thousands of self-hosting enthusiasts.
Here’s what makes it powerful:
- Complete control: Your documents live on your server, in your home. No third-party access. No subscription fees. No “we updated our terms of service” surprises.
- Intelligent OCR: Every document becomes fully searchable. Need to find that one receipt with “ACME Corp” on it? Type the name, and Paperless finds it in seconds.
- Smart organization: Automatic tagging, document type classification, and correspondent tracking. Set up rules once, and Paperless organizes everything going forward.
- AI-powered suggestions: The system learns your filing habits and suggests tags and correspondents for new documents.
- Integration ready: Works with Home Assistant, Nextcloud, n8n, and practically any tool that speaks HTTP.
In 2026, with privacy concerns at an all-time high and subscription fatigue setting in across every service category, self-hosting your document management isn’t just cost-effective — it’s a statement about digital sovereignty. Your financial records, medical documents, and contracts should belong to you, not a corporation.
Key Features That Matter
Paperless-ngx isn’t just a document scanner with a web interface. It’s a full document management platform with features that rival enterprise solutions costing hundreds per month.
Document Processing
- Optical Character Recognition (OCR): Powered by Tesseract, Paperless extracts text from scanned documents, PDFs, and images. Every word becomes searchable, even in multi-page documents scanned at an angle.
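- Automatic tagging: Define matching rules like “if the document contains ‘invoice’, tag it as ‘invoice’” and Paperless applies them automatically. Matching works on the document text (keywords, exact phrases, or regular expressions), not on computed values such as amounts.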
- Document types: Categorize documents as invoices, receipts, contracts, bank statements, medical records, warranties, or custom types you define.
- Correspondents: Track who sent or received each document — companies, government agencies, contractors.
- Storage paths: Automatically organize files by year, month, document type, or custom logic.
- Full-text search: Find any document by any word it contains, instantly.
AI and Automation
- Machine learning: Paperless observes how you tag documents and suggests matching tags for similar new uploads.
- Barcode recognition: Scan documents with barcodes for automatic routing and identification.
- Email processing: Connect your email accounts via IMAP, and Paperless will automatically import attachments from specified senders. Perfect for digital statements and receipts.
- Workflow automations: Create processing pipelines — import, classify, tag, notify, archive.
- Full REST API: Integrate with any automation platform or write custom scripts.
User Experience
- Modern web interface: Clean, responsive, and mobile-friendly. No desktop app required.
- Bulk operations: Select multiple documents and change tags, correspondents, or types in one action.
- Saved views: Create custom filters like “All tax documents from 2026” or “Receipts over $100 this month.”
- In-browser preview: View PDFs without leaving the interface.
- Dark mode: Because who likes staring at white screens at midnight?
Security and Privacy
- Self-hosted: Documents never leave your network unless you choose to expose them.
- Multi-user support: Create accounts for family members with permission controls.
- Two-factor authentication: Keep your documents secure with 2FA.
- Audit logging: Track who accessed or modified what and when.
- Optional encryption: Add at-rest encryption for sensitive documents, most simply by keeping the data volumes on an encrypted disk or filesystem.
Hardware Requirements
One of the best things about Paperless-ngx is that it’s surprisingly lightweight. You don’t need a server rack or enterprise hardware to run it effectively.
Minimum Requirements (Home/Light Use)
| Resource | Requirement |
|---|---|
| CPU | 2 cores |
| RAM | 2GB (4GB recommended) |
| Storage | 50GB+ (depends on document volume) |
| OS | Linux (Docker), macOS, or Windows |
Recommended Setup (Power Users)
| Resource | Recommendation |
|---|---|
| CPU | 4+ cores |
| RAM | 8GB+ (OCR is memory-intensive) |
| Storage | 500GB+ SSD (for fast search indexing) |
| Optional | Extra CPU cores for parallel OCR (Tesseract runs on the CPU, so a GPU is not needed) |
Storage Planning
The storage question deserves some thought. A typical scanned page at 300 DPI comes to roughly 1MB. Digitizing 10,000 pages (a moderately sized household archive) therefore takes about 10GB for the scans alone. Paperless also stores a searchable archive copy, thumbnails, the search index, and metadata, so plan for approximately 1.5-2MB per page in total, or 15-20GB for that same archive.
The good news: storage is cheaper than ever. A 1TB SSD costs less than a decent scanner. Your grandmother’s filing cabinet? Not so scalable.
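The arithmetic above can be turned into a quick estimate for your own archive. This is a minimal sketch: the function name `estimate_storage_gb` is made up for illustration, the per-page sizes are the rough figures from this section rather than measurements, and 1GB is treated as 1000MB.

```shell
# Rough storage estimate: pages x MB-per-page, reported in GB.
# Both inputs are illustrative; measure a few of your own scans for better numbers.
estimate_storage_gb() {
  local pages=$1 mb_per_page=$2
  # awk handles the fractional per-page size; LC_ALL=C keeps "." as the decimal point
  LC_ALL=C awk -v p="$pages" -v m="$mb_per_page" 'BEGIN { printf "%.1f\n", p * m / 1000 }'
}

estimate_storage_gb 10000 1.5   # low end of the range
estimate_storage_gb 10000 2     # high end of the range
```

Swap in your own page count to see whether the 50GB minimum or the 500GB recommendation is closer to what you need.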
Installation Guide: Docker Setup
Paperless-ngx is designed to run in Docker, which makes deployment straightforward regardless of your host operating system. Here’s a complete, production-ready setup.
Quick Install (Recommended)
The fastest way to get started is with the official installation script:
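bash -c "$(curl --location --silent --show-error https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
The script is interactive, which is why it is passed to bash as an argument rather than piped in. It asks a few questions, then downloads the latest release, creates the necessary directory structure, and generates a docker-compose.yml with sensible defaults.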
Manual Docker Compose Setup
For those who prefer full control, here’s a complete docker-compose.yml that you can customize:
version: "3.9"
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    container_name: paperless
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      - "8000:8000"
    environment:
      # Core settings
      PAPERLESS_URL: https://paperless.yourdomain.com
      PAPERLESS_SECRET_KEY: "change-this-to-a-long-random-string"
      # Database (PostgreSQL recommended for production)
      PAPERLESS_DBHOST: db
      PAPERLESS_DBPORT: 5432
      PAPERLESS_DBNAME: paperless
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: your-database-password
      # Redis broker
      PAPERLESS_REDIS: redis://broker:6379
      # Admin user (set once, then remove)
      PAPERLESS_ADMIN_USER: admin
      PAPERLESS_ADMIN_PASSWORD: your-secure-password
      # OCR settings
      PAPERLESS_OCR_LANGUAGE: eng
      # Optional: allow HTTP Basic authentication for API clients
      PAPERLESS_ENABLE_HTTP_BASIC_AUTH: "true"
    volumes:
      - ./data:/usr/src/paperless/data
      - ./media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5
  db:
    image: docker.io/library/postgres:15
    container_name: paperless-db
    restart: unless-stopped
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: your-database-password
    volumes:
      - ./pgdata:/var/lib/postgresql/data
  broker:
    image: docker.io/library/redis:7
    container_name: paperless-redis
    restart: unless-stopped
    volumes:
      - ./redisdata:/data
Save this as docker-compose.yml, then run:
docker compose up -d
Paperless will pull the necessary images, initialize the database, and start listening on port 8000. Point your browser to http://your-server:8000 and log in with the admin credentials you configured.
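The compose file ships with a placeholder for PAPERLESS_SECRET_KEY, which should be replaced before the first start. One way to produce a value is sketched below, assuming `/dev/urandom` and `base64` are available on your host; the helper name `generate_secret_key` is hypothetical.

```shell
# Print a random 64-character string suitable for PAPERLESS_SECRET_KEY.
generate_secret_key() {
  # 48 random bytes encode to exactly 64 base64 characters (no padding);
  # tr strips the trailing newline that base64 adds
  head -c 48 /dev/urandom | base64 | tr -d '\n'
}

generate_secret_key
echo   # end the line so the key is easy to copy
```

Paste the result between the quotes in the compose file. The same approach works for the database password, as long as both places that reference it get the same value.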
Directory Structure Explained
| Directory | Purpose |
|---|---|
| /consume | Drop files here for automatic import. Paperless processes and removes them. |
| /media | Stores your original documents and generated thumbnails. Back this up. |
| /data | Contains the search index and application state (plus the database if you run SQLite instead of PostgreSQL). Critical for restore. |
| /pgdata | Holds the PostgreSQL database when you use the compose file above. |
| /export | Destination for the built-in export/backup tool. |
Reverse Proxy (Production)
For remote access, run Paperless behind a reverse proxy with HTTPS. Here’s a simple Caddy configuration:
paperless.yourdomain.com {
    reverse_proxy localhost:8000
}
Or with Nginx:
server {
    listen 443 ssl;
    server_name paperless.yourdomain.com;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    location / {
        proxy_pass http://localhost:8000;
        # WebSocket support, used by the web UI for live status updates
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
Basic Configuration and First Steps
Once Paperless is running, a few initial configurations will set you up for success.
Create Your Tag Structure
Tags are your primary organization tool. Think carefully about your top-level categories. A good starting structure:
- Financial: bank-statements, invoices, receipts, taxes
- Home: warranties, manuals, maintenance, insurance
- Medical: records, prescriptions, insurance-claims
- Legal: contracts, certificates, government
- Work: contracts, expenses, training
You can nest tags, but keep top-level categories limited to 10-15. Too many branches make navigation cumbersome.
Set Up Document Types
Document types work alongside tags for classification:
- Invoice
- Receipt
- Contract
- Bank Statement
- Medical Record
- Insurance Document
- Warranty
- Manual/Guide
- Correspondence
- Certificate
Add Correspondents
Correspondents represent entities you exchange documents with:
- Your bank
- Insurance company
- Employer
- Healthcare providers
- Government agencies (IRS, DMV, etc.)
Paperless learns from your assignments and will start suggesting correspondents for similar documents.
Configure Auto-Tagging Rules
Rules let you automate organization. Examples:
| Rule | Condition | Action |
|---|---|---|
| Bank statements | Correspondent contains “Bank of America” | Add tag: bank-statement |
| Amazon purchases | Content contains “amazon.com” | Add tag: shopping, correspondent: Amazon |
| Tax documents | Document type is “Form 1099” or “W-2” | Add tag: taxes, tax-2026 |
To create rules, navigate to Settings → Workflows and define your triggers.
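Content conditions such as the Amazon rule in the table are often easiest to express as a regular expression, and it helps to try a pattern against sample text before saving it. The sketch below uses `grep -E` as a stand-in: Paperless evaluates patterns with its own engine, so treat this as an approximation that holds for simple patterns. The function name `matches_rule` and the sample strings are made up for illustration, and case-insensitive matching is assumed.

```shell
# Succeed when the sample text matches the pattern, ignoring case.
matches_rule() {
  local pattern=$1 text=$2
  printf '%s\n' "$text" | grep -Eiq -- "$pattern"
}

# Try the pattern against a line you would expect in a real document
if matches_rule 'amazon\.(com|de)' 'Order confirmation from Amazon.com'; then
  echo "would tag: shopping"
fi
```

Testing a pattern against text that should not match is just as useful, since an overly broad rule mislabels documents silently.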
OCR and Search Capabilities

This is where Paperless-ngx shines. The OCR engine transforms static images into searchable text, making every document instantly retrievable.
How OCR Works
When you upload a document (PDF, image, or other format), Paperless:
- Detects the document language (configurable for multiple languages)
- Rotates and deskews if necessary
- Runs Tesseract OCR to extract all text
- Indexes the content in the search database
- Generates a searchable PDF
The result: you can find “receipt for the HDMI cable” by searching for “HDMI” — even if that text is buried inside a multi-page PDF.
Search Features
- Full-text search: Search inside document content, not just filenames
- Fuzzy matching: Find “receipt” even when you mistype it as “reciept”
- Advanced filters: Combine text search with tags, dates, correspondents
- Saved searches: Bookmark complex queries for quick access
- Search operators: Use tag:receipt date:2026-01 correspondent:Amazon for precise filtering
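The same full-text search is available over the REST API, which is handy for scripts. A minimal sketch, assuming an API token created for your user and the `query` parameter of the documents endpoint. `PAPERLESS_URL`, `PAPERLESS_TOKEN`, the function name, and the `CURL` override (used only to preview the command) are illustrative choices, not part of Paperless.

```shell
# Placeholder values: replace with your own URL and API token.
PAPERLESS_URL="${PAPERLESS_URL:-http://localhost:8000}"
PAPERLESS_TOKEN="${PAPERLESS_TOKEN:-your-api-token}"

# Run a full-text search and print the JSON response.
search_documents() {
  local query=$1
  # -G sends the data as a query string; --data-urlencode escapes spaces and colons
  "${CURL:-curl}" -fsS -G \
    -H "Authorization: Token $PAPERLESS_TOKEN" \
    --data-urlencode "query=$query" \
    "$PAPERLESS_URL/api/documents/"
}

# Example (needs a running instance):
#   search_documents "HDMI cable"
# Preview the command without sending anything:
#   CURL=echo search_documents "HDMI cable"
```

The response is paginated JSON, so pipe it through a tool such as jq if you only need titles or IDs.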
Multi-Language OCR
If your documents are in multiple languages, configure them in your environment:
PAPERLESS_OCR_LANGUAGE: eng+spa+deu
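This tells Tesseract to recognize English, Spanish, and German text in the same document. Each added language slows OCR somewhat, so list only the ones you need.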
Automation Workflows
Paperless-ngx excels at hands-off document processing. Set up workflows once, and documents flow through your pipeline automatically.
Email Import
Connect your email accounts to automatically capture digital statements and receipts:
- Navigate to Settings → Mail
- Add an IMAP account with your email provider’s settings
- Create mail rules to filter by sender, subject, or attachment type
- Paperless downloads matching attachments and imports them
Example use case: Every time your utility company sends “Your Bill is Ready,” Paperless imports the attached PDF, tags it as “utilities,” and assigns the correspondent.
Consume Folder Automation
The /consume directory watches for new files. Drop documents there, and they’re processed automatically:
# Scan from a network scanner to consume folder
scp scanned-doc.pdf paperless-server:/path/to/paperless/consume/
# Or use a hot folder from your scanner software
# Most scanner apps support "scan to folder" — point it at your consume directory
For advanced workflows, combine with automation tools:
# n8n workflow example
# Watch for new Google Drive uploads, then push to Paperless
{
  "nodes": [
    {
      "type": "n8n-nodes-base.googleDriveTrigger",
      "parameters": {"event": "fileAdded"}
    },
    {
      "type": "n8n-nodes-base.httpRequest",
      "parameters": {
        "url": "https://paperless.yourdomain.com/api/documents/post_document/",
        "method": "POST",
        "authentication": "genericCredentialType"
      }
    }
  ]
}
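The HTTP request in that workflow boils down to a multipart POST against the upload endpoint, which you can also send from a script. This is a sketch under a few assumptions: token authentication is set up for your user, and `PAPERLESS_URL`, `PAPERLESS_TOKEN`, the function name, and the `CURL` override for dry runs are names chosen here for illustration.

```shell
# Placeholder values: replace with your own URL and API token.
PAPERLESS_URL="${PAPERLESS_URL:-http://localhost:8000}"
PAPERLESS_TOKEN="${PAPERLESS_TOKEN:-your-api-token}"

# Upload one file with a title; Paperless queues it for OCR and indexing.
upload_document() {
  local file=$1 title=$2
  "${CURL:-curl}" -fsS \
    -H "Authorization: Token $PAPERLESS_TOKEN" \
    -F "document=@${file}" \
    -F "title=${title}" \
    "$PAPERLESS_URL/api/documents/post_document/"
}

# Example (needs a running instance and an existing file):
#   upload_document ./scanned-doc.pdf "Water bill"
# Preview the command without sending anything:
#   CURL=echo upload_document ./scanned-doc.pdf "Water bill"
```

Compared with copying into the consume folder, the API route lets you set metadata at upload time and works from machines that cannot reach the server's filesystem.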
Webhook Integrations
Paperless can send webhooks when documents are processed. Use this with Home Assistant for notifications:
# Home Assistant automation
automation:
  - alias: "New Tax Document Notification"
    trigger:
      - platform: webhook
        webhook_id: paperless-tax-doc
    condition:
      - condition: template
        # Assumes the webhook body includes a "tags" list and a "title"
        value_template: "{{ 'taxes' in trigger.json.tags }}"
    action:
      # Replace with your own notify service, e.g. notify.mobile_app_<device>
      - service: notify.mobile_app
        data:
          message: "New tax document uploaded: {{ trigger.json.title }}"
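Before wiring Paperless to the automation, it is worth firing the webhook by hand to confirm the condition and the notification behave as expected. A minimal sketch: `HA_URL` is a placeholder for your Home Assistant address, the payload shape (a `title` string and a `tags` list) is the one the automation above assumes, and the function name and `CURL` override are illustrative.

```shell
# Placeholder: replace with your Home Assistant address.
HA_URL="${HA_URL:-http://homeassistant.local:8123}"

# POST a JSON payload to a Home Assistant webhook trigger.
send_test_webhook() {
  local webhook_id=$1 payload=$2
  "${CURL:-curl}" -fsS -X POST \
    -H "Content-Type: application/json" \
    -d "$payload" \
    "$HA_URL/api/webhook/$webhook_id"
}

# Example (needs a reachable Home Assistant instance):
#   send_test_webhook paperless-tax-doc '{"title": "W-2 2026", "tags": ["taxes"]}'
# Preview the command without sending anything:
#   CURL=echo send_test_webhook paperless-tax-doc '{"title": "W-2 2026", "tags": ["taxes"]}'
```

Sending a second payload without the taxes tag is a quick way to check that the condition really filters.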
Security and Backup Strategy
Your documents contain sensitive information. Security isn’t optional — it’s essential.
Network Security
| Practice | Why It Matters |
|---|---|
| Run behind reverse proxy | Enables HTTPS encryption |
| Use VPN for remote access | Avoids exposing Paperless directly to the internet |
| Enable 2FA | Prevents unauthorized access even with compromised password |
| Regular updates | Security patches fix vulnerabilities |
| Network isolation | Run on a separate VLAN or behind a firewall |
User Permissions
Create separate accounts for family members. Paperless assigns permissions per user and per group, which lets you set up roles such as:
- Admin: Full access to all settings and documents
- Standard: Can view and edit documents
- View-only: Read access without modification rights
Backup Strategy
The 3-2-1 rule: three copies, two different media types, one off-site.
What to back up:
- media/: Your original documents (critical)
- data/: Search index and application state
- pgdata/: The PostgreSQL database, if you use the compose file above
- docker-compose.yml and environment files
Built-in export:
# Export all documents and metadata
# (the target is the container path that ./export is mounted to)
docker exec paperless document_exporter /usr/src/paperless/export --zip
# The resulting archive is in your ./export directory, named after the export date
# Copy it to backup storage
cp ./export/export-*.zip /backup-location/paperless-$(date +%Y%m%d).zip
Automated backup script:
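#!/bin/bash
# Daily backup script for Paperless-ngx
set -euo pipefail  # stop on the first failed step
EXPORT_DIR="/path/to/paperless/export"  # host directory mounted as ./export
BACKUP_DIR="/mnt/backup/paperless"
DATE=$(date +%Y%m%d)
# Create backup (the target is the container path that ./export is mounted to)
docker exec paperless document_exporter /usr/src/paperless/export --zip
# Move and rename (expects exactly one archive in the export directory)
mv "$EXPORT_DIR"/*.zip "$BACKUP_DIR/paperless-$DATE.zip"
# Rotate backups (keep 30 days)
find "$BACKUP_DIR" -name "*.zip" -mtime +30 -delete
echo "Backup complete: paperless-$DATE.zip"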
Off-site backup:
Use rclone or rsync to sync your backups to cloud storage or a remote server:
# Sync to Backblaze B2 (example)
rclone sync /mnt/backup/paperless b2:my-backup-bucket/paperless
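A backup only helps if the archive is still intact when you need it. One lightweight safeguard is to store a checksum next to each archive and verify it after the copy lands off-site. The sketch below assumes GNU coreutils `sha256sum` is available (macOS users would substitute `shasum -a 256`); the function names and the example file name are hypothetical.

```shell
# Write <archive>.sha256 next to the archive.
checksum_backup() {
  local archive=$1 dir name
  dir=$(dirname "$archive")
  name=$(basename "$archive")
  # Run inside the directory so the checksum file stores a relative name
  (cd "$dir" && sha256sum "$name" > "$name.sha256")
}

# Succeed only if the archive still matches its stored checksum.
verify_backup() {
  local archive=$1 dir name
  dir=$(dirname "$archive")
  name=$(basename "$archive")
  (cd "$dir" && sha256sum -c "$name.sha256" > /dev/null)
}

# Example:
#   checksum_backup /mnt/backup/paperless/paperless-20260101.zip
#   verify_backup /mnt/backup/paperless/paperless-20260101.zip && echo "backup intact"
```

Because the checksum file stores a relative name, it stays valid when both files are synced to another machine or bucket together.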
Migrating from Paper Archives
Ready to digitize that filing cabinet? Here’s a practical workflow.
Scanner Setup
Invest in a scanner with these features:
- Document feeder (ADF): Batch scan 20+ pages at once
- Duplex scanning: Both sides simultaneously
- Scan-to-folder: Output directly to Paperless consume folder
Recommended scanners:
| Budget | Model | Notes |
|---|---|---|
| $100-200 | Brother ADS-2700W | Great value, WiFi included |
| $300-400 | Fujitsu ScanSnap iX1600 | Excellent software, fast |
| $500+ | Fujitsu fi-8000 series | Enterprise-grade, heavy duty |
Scanning Best Practices
- 300 DPI for text documents — sufficient for OCR without massive files
- 600 DPI for photos or documents with small print
- Enable auto-color detection — grayscale scans are smaller but color is preserved when needed
- Enable deskew — straightens crooked scans automatically
- Enable blank page removal — skips blank sides of double-sided documents
Digitization Workflow
- Sort by category: Group similar documents before scanning
- Batch scan: Run 20-50 documents through the feeder at once
- Review in Paperless: Check auto-tagging and correct as needed
- Shred originals: Once confirmed, securely dispose of paper copies
For documents you must keep in physical form (original certificates, notarized documents), store them securely and note their location in Paperless with an “original-kept” tag.
Handling Backlog
If you have years of documents to digitize:
- Start with the most recent and important (last 12 months of financial documents)
- Work backward chronologically
- Don’t try to digitize everything at once — set a goal of 50-100 documents per session
- Consider which documents even need digitizing. That cable bill from 2019? Maybe just shred it.
Conclusion: Why Paperless-ngx Is Your Best Choice
In 2026, your options for document management are clear:
You could pay $10-15 per month for cloud services like Evernote, OneDrive, or Google Drive. They’ll hold your documents hostage behind paywalls, mine your data for advertising, and change terms of service whenever convenient. When they shut down or raise prices, you scramble.
Or you could take control.
Paperless-ngx gives you:
- Privacy: Your financial records, medical documents, and contracts stay on your server
- Freedom: No subscription fees, no vendor lock-in, no surprising price increases
- Power: Enterprise-grade features without enterprise-grade costs
- Community: Active development, helpful forums, and a project that’s maintained by people who use it daily
The transition takes effort. Scanning your archives requires time. Setting up workflows requires thought. But once it’s running, you’ll wonder how you ever lived with paper.
Start small. Install Paperless. Scan this month’s documents. Watch it organize itself. Then, when you’re ready, tackle that drawer.
Your future self — the one who finds the dishwasher warranty in 30 seconds flat — will thank you.
Ready to go paperless? Try the Paperless-ngx demo (login: demo/demo) or check out the official documentation.
