Self-Hosting · Document Management · Docker · OCR · 2025

How to Self-Host Paperless-ngx and Access Your Documents from Anywhere

Every household and small business accumulates a mountain of paper: tax returns, insurance policies, utility bills, contracts, receipts, warranty cards. Most of it ends up in a drawer or a box, unsearchable and impossible to find when you actually need it. Paperless-ngx converts that pile into a searchable, indexed, automatically tagged digital archive that runs entirely on your own hardware. This guide walks through a complete verified installation using Docker Compose, explains every configuration option that matters, and shows how to expose your instance securely to the internet so you can scan and retrieve documents from anywhere.

🐳 Docker Compose · PostgreSQL · Redis 🔍 OCR · Auto-tagging · Full-text search 🌐 Remote access via Localtonet

What Is Paperless-ngx?

Paperless-ngx is an open-source document management system. You feed it documents, either by dropping files into a folder, uploading through its web interface, emailing them to a configured inbox, or using a mobile app, and it does the following automatically:

🔍 OCR (Optical Character Recognition): Extracts text from scanned images and PDFs, even if they contain no text layer. Powered by Tesseract, which supports over 100 languages. Documents become fully searchable after consumption.
🏷️ Automatic Tagging: A machine-learning classifier analyzes content and metadata to automatically assign tags, correspondents (who sent it), and document types (invoice, contract, receipt). The more documents you have, the better it gets.
📄 PDF/A Archival Format: All consumed documents are converted to PDF/A, a format specifically designed for long-term digital preservation. The original unmodified file is preserved separately alongside the archived version.
📧 Email Consumption: Connect a dedicated email inbox and Paperless-ngx will check it automatically, pull attachments, and consume them as documents. Send a PDF from your phone's scanner app to that address and it appears in your archive within minutes.
🔒 Your Data Stays Yours: Documents are stored on your own hardware and are never transmitted to any third party. There is no cloud dependency, no subscription, and no vendor that can shut down and take your archive with it.
📱 Mobile Scanning Apps: Works with Paperless for Android and iOS, plus any scanner app that can share PDFs. Scan a receipt on your phone and it lands in your archive automatically.

Paperless-ngx vs Paperless-ng vs Paperless: Which one?

The original Paperless project was written by Daniel Quinn and is no longer maintained. Paperless-ng was an improved fork that also became unmaintained. Paperless-ngx is the current actively maintained successor, supported by a community team. Always install Paperless-ngx. The other two should not be used for new installations.

What Gets Installed: The Five Containers

Paperless-ngx uses a multi-container setup. Understanding what each container does makes troubleshooting and configuration much easier.

Container | Image | Purpose
webserver | ghcr.io/paperless-ngx/paperless-ngx:latest | The main application: web UI, API, document consumer, OCR worker, and task scheduler. Everything you interact with.
broker | docker.io/library/redis:8 | Message broker (Redis). Coordinates background task queues: OCR jobs, email checks, index updates, classifier training. Required. Without it, nothing works in the background.
db | docker.io/library/postgres:18 | PostgreSQL database that stores document metadata: titles, tags, correspondents, document types, and so on. The actual document files are stored on disk, not in the database.
gotenberg | docker.io/gotenberg/gotenberg:8.25 | Document conversion service. Converts Office documents (Word, Excel, PowerPoint, LibreOffice formats) and emails to PDF so Paperless can process them. Optional. Without this container, only PDF and image files are supported.
tika | docker.io/apache/tika:latest | Content analysis and metadata extraction from Office documents and emails. Works alongside Gotenberg. Also optional, but required for Office file support.

The minimal setup (webserver + broker + db) handles PDFs and images. Adding Gotenberg and Tika enables Office documents and emails. This guide uses the full stack.

Prerequisites

You need a Linux server with Docker and Docker Compose installed. This guide uses Ubuntu 22.04 or 24.04, but the same steps work on Debian, Raspberry Pi OS, and any other Debian-based distribution. Paperless-ngx publishes images for amd64 and arm64 hardware; 32-bit ARM images have not been published since v2.0, so use a 64-bit OS on a Raspberry Pi.

Requirement | Minimum | Recommended
CPU | Any (OCR is slow on single-core) | 2+ cores. OCR uses all available cores.
RAM | 1 GB (SQLite, no Tika) | 2 GB+ with full stack (Tika adds ~500 MB)
Storage | 5 GB to start | Depends on archive size. 1,000 documents typically use 2 to 5 GB including originals and thumbnails.
Docker | Docker Engine 20+ | Latest stable
PostgreSQL | 14 (minimum since v2.18) | 18 (current in official compose)

Raspberry Pi notes

Paperless-ngx runs on Raspberry Pi 3 and 4 (arm64). Some tasks like OCR and classifier training are slower on lower-powered hardware. The official docs recommend these settings for Pi deployments: stick with SQLite instead of PostgreSQL to reduce RAM usage, set PAPERLESS_WEBSERVER_WORKERS=1 to reduce memory, set PAPERLESS_ENABLE_NLTK=false to disable advanced language processing, and consider PAPERLESS_OCR_PAGES=1 to OCR only the first page of each document. These tradeoffs reduce performance but make the system usable on 1 to 2 GB RAM. This guide uses PostgreSQL for reliability; swap in the SQLite compose file if you are RAM-constrained.
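
Collected in one place, the settings above would look roughly like this in docker-compose.env (the file created in Step 2). This is a sketch of the low-memory profile described here, not an official template; tune the values to your hardware.

```shell
# Low-memory profile for a Raspberry Pi (sketch; adjust to taste).
# These are plain KEY=VALUE lines, as used in docker-compose.env.

# One web worker instead of several: less RAM, slower UI under load
PAPERLESS_WEBSERVER_WORKERS=1

# Turn off the NLTK-based advanced language processing
PAPERLESS_ENABLE_NLTK=false

# OCR only the first page of each document
PAPERLESS_OCR_PAGES=1
```

With PAPERLESS_OCR_PAGES=1, text on later pages of a scan is not extracted, so full-text search only covers the first page of scanned documents.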

Step 1: Create the Directory and Download the Compose Files

Create a directory for Paperless-ngx and navigate into it. All your data, configuration, and document volumes will live here.

bash
mkdir ~/paperless-ngx
cd ~/paperless-ngx
mkdir consume export

The consume directory is where you drop documents to be imported. The export directory is used for backups. Create both before starting the containers or Docker will create them as root-owned directories, which causes permission errors.
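
If you script this step, it can help to confirm ownership right away rather than discovering a root-owned directory later. The following is a minimal sketch; the function name prepare_paperless_dirs is made up for this example, and it simply reads the numeric owner from ls.

```shell
# Create the consume/export directories and report who owns them.
# $1 is the base directory (this guide uses ~/paperless-ngx).
prepare_paperless_dirs() {
    base_dir=$1
    mkdir -p "$base_dir/consume" "$base_dir/export" || return 1
    for name in consume export; do
        # Third field of "ls -ldn" is the numeric owner UID
        owner_uid=$(ls -ldn "$base_dir/$name" | awk '{ print $3 }')
        if [ "$owner_uid" = "$(id -u)" ]; then
            echo "$name: ok (owned by uid $owner_uid)"
        else
            echo "$name: owned by uid $owner_uid, expected $(id -u)"
            return 1
        fi
    done
}

# Usage: prepare_paperless_dirs "$HOME/paperless-ngx"
```

The function is safe to run repeatedly: mkdir -p leaves existing directories alone, so it doubles as a check after the containers have been started.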

Now create the main docker-compose.yml file. This is based directly on the official docker-compose.postgres-tika.yml from the Paperless-ngx GitHub repository, which is the full-featured stack:

docker-compose.yml
services:
  broker:
    image: docker.io/library/redis:8
    restart: unless-stopped
    volumes:
      - redisdata:/data

  db:
    image: docker.io/library/postgres:18
    restart: unless-stopped
    volumes:
      # postgres:18 and newer expect the volume at /var/lib/postgresql
      - pgdata:/var/lib/postgresql
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
      - gotenberg
      - tika
    ports:
      - "127.0.0.1:8000:8000"
    volumes:
      - data:/usr/src/paperless/data
      - media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    env_file: docker-compose.env
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998

  gotenberg:
    image: docker.io/gotenberg/gotenberg:8.25
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"

  tika:
    image: docker.io/apache/tika:latest
    restart: unless-stopped

volumes:
  data:
  media:
  pgdata:
  redisdata:

Note on the port binding

The port is bound to 127.0.0.1:8000:8000 instead of 0.0.0.0:8000:8000. This means Paperless is only reachable from localhost, not directly from the network. Localtonet tunnels from localhost, so this is the correct and more secure configuration. Do not change it to 0.0.0.0.
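
Once the stack is running, the binding can be checked rather than assumed. docker compose port prints the host address a container port is published on; the small helper below (its name is an invention for this sketch) only classifies that string.

```shell
# Succeeds only if a "host:port" binding points at the loopback interface.
is_loopback_binding() {
    case "$1" in
        127.0.0.1:*|"[::1]:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, from the directory containing docker-compose.yml:
#   is_loopback_binding "$(docker compose port webserver 8000)" \
#       && echo "loopback only" || echo "WARNING: reachable from the network"
```

An empty string (for example, when the container is not running) is treated as a failure, so the warning branch also fires when there is nothing to check.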

Step 2: Create the Environment File

Create the docker-compose.env file that the webserver container reads. This file contains all the per-installation configuration that should not be baked into the compose file.

docker-compose.env
# UID and GID of the user running paperless inside the container.
# Set these to match your host user ID to avoid permission issues with
# the consume directory. Get your IDs with: id -u && id -g
USERMAP_UID=1000
USERMAP_GID=1000

# Required for public access. Set this to your Localtonet URL.
# Leave commented out for local-only access.
# PAPERLESS_URL=https://yoursubdomain.localto.net

# Secret key for cryptographic signing. Generate with:
# openssl rand -base64 64 | tr -d '\n'
# Must be changed before exposing Paperless publicly.
PAPERLESS_SECRET_KEY=change-me-use-openssl-rand-base64-64

# Your timezone. Used for correct timestamps on documents.
# Full list: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
PAPERLESS_TIME_ZONE=Europe/Istanbul

# The language used for OCR. Set this to the language most of your
# documents are written in. Use ISO 639-2 three-letter codes.
# eng=English, deu=German, fra=French, tur=Turkish, chi_sim=Simplified Chinese
PAPERLESS_OCR_LANGUAGE=eng

# Additional OCR languages to install (space-separated).
# Note: PAPERLESS_OCR_LANGUAGES (plural) = languages TO INSTALL
#       PAPERLESS_OCR_LANGUAGE  (singular) = language TO USE for OCR
# These are different variables. Do not confuse them.
# Example for Turkish + English:
# PAPERLESS_OCR_LANGUAGES=tur
# PAPERLESS_OCR_LANGUAGE=tur
# (English is installed by default, no need to add it here)

# Number of webserver worker processes. Increase for faster UI on
# multi-core systems. Each worker loads the full app into memory.
# Reduce to 1 on low-RAM systems (< 2 GB).
PAPERLESS_WEBSERVER_WORKERS=2

The two OCR language variables are not the same thing

PAPERLESS_OCR_LANGUAGES (plural with an S) specifies which Tesseract language packages to download and install into the container. PAPERLESS_OCR_LANGUAGE (singular, no S) specifies which language to use when performing OCR on documents. You need both if you want a non-default language. For Turkish: set PAPERLESS_OCR_LANGUAGES=tur to install the Turkish language data, and PAPERLESS_OCR_LANGUAGE=tur to use it. English, German, Italian, Spanish, and French are installed by default and do not need to be listed in PAPERLESS_OCR_LANGUAGES.
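
Editing the env file by hand works fine, but if you provision machines from a script, a small helper keeps the edits repeatable. The sketch below is one possible approach, not part of Paperless-ngx: set_env_var is a hypothetical function that replaces an existing or commented-out KEY=... line, or appends one if the key is missing. It assumes keys made of letters, digits, and underscores, and values without backslashes.

```shell
# Set KEY=VALUE in an env file: replaces the first existing (or
# commented-out) assignment of the key, otherwise appends a new line.
set_env_var() {
    env_file=$1 key=$2 value=$3
    tmp_file=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        !done && $0 ~ ("^#? ?" k "=") { print k "=" v; done = 1; next }
        { print }
        END { if (!done) print k "=" v }
    ' "$env_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write back through redirection so the file keeps its permissions
    cat "$tmp_file" > "$env_file" && rm -f "$tmp_file"
}

# Usage, matching the container user to the current host user:
#   set_env_var docker-compose.env USERMAP_UID "$(id -u)"
#   set_env_var docker-compose.env USERMAP_GID "$(id -g)"
```

Because the pattern requires the key to be followed directly by =, setting PAPERLESS_OCR_LANGUAGE does not touch a PAPERLESS_OCR_LANGUAGES line, and vice versa.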

Step 3: Generate a Proper Secret Key

The placeholder change-me-use-openssl-rand-base64-64 in the env file must be replaced with an actual random key before you expose Paperless to the internet. Run this command and paste the output into the env file as the value of PAPERLESS_SECRET_KEY. openssl wraps base64 output at 64 characters per line, so tr strips the line break to give a single line that fits in the env file:

bash
openssl rand -base64 64 | tr -d '\n'; echo
output
(one 88-character line of random base64 text; yours will differ)

This key is used to sign session cookies and tokens. If it changes after you have active sessions, all users will be logged out. If it leaks, attackers can forge authentication tokens. Keep it secret and back it up alongside your database.
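
A key that picks up a stray line break when pasted will not survive in an env file, so it is worth generating it in a form that is guaranteed to be one line. The helper below is a sketch with a made-up name; it prefers openssl, as used in this guide, and falls back to /dev/urandom with base64 on systems without openssl.

```shell
# Print a random key as a single line of base64 text.
generate_secret_key() {
    if command -v openssl >/dev/null 2>&1; then
        openssl rand -base64 64 | tr -d '\n'
    else
        head -c 64 /dev/urandom | base64 | tr -d '\n'
    fi
    echo   # terminate the line
}

# Usage: key=$(generate_secret_key); echo "${#key}"
```

64 random bytes encode to 88 base64 characters, so a length other than 88 is a sign that something was truncated along the way.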

Step 4: Start the Containers

With both files in place, pull the images and start the stack:

bash
docker compose pull
docker compose up -d
output
[+] Pulling 5/5
 ✔ broker Pulled        2.1s
 ✔ db Pulled            4.3s
 ✔ gotenberg Pulled    12.8s
 ✔ tika Pulled         18.2s
 ✔ webserver Pulled    45.6s

[+] Running 5/5
 ✔ Container paperless-ngx-broker-1      Started
 ✔ Container paperless-ngx-db-1          Started
 ✔ Container paperless-ngx-gotenberg-1   Started
 ✔ Container paperless-ngx-tika-1        Started
 ✔ Container paperless-ngx-webserver-1   Started

Wait about 30 to 60 seconds for all containers to finish initializing, then check that everything is running:

bash
docker compose ps
output
NAME                              STATUS
paperless-ngx-broker-1            running
paperless-ngx-db-1                running
paperless-ngx-gotenberg-1         running
paperless-ngx-tika-1              running
paperless-ngx-webserver-1         running

All five containers should show running. If any show exited or restarting, check the logs with docker compose logs webserver to see what went wrong.
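
Instead of waiting a fixed 30 to 60 seconds, a script can poll until the web UI actually answers. The generic retry loop below is a sketch (wait_until is a hypothetical helper); the probe command in the usage comment is an assumption that curl is installed on the server.

```shell
# Retry a command about once per second until it succeeds or the
# timeout runs out. $1 = timeout in seconds, rest = command to run.
wait_until() {
    remaining=$1
    shift
    until "$@"; do
        remaining=$((remaining - 1))
        [ "$remaining" -le 0 ] && return 1
        sleep 1
    done
}

# Usage: wait up to 90 seconds for the web UI to answer on localhost
#   wait_until 90 curl -fsS -o /dev/null http://localhost:8000/ \
#       && echo "Paperless is up" \
#       || docker compose logs --tail 50 webserver
```

If the timeout expires, the usage example falls through to the last 50 log lines of the webserver container, which is usually where the reason shows up.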

Step 5: Create Your Admin Account

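Open your browser and navigate to http://localhost:8000. Because the port is bound to 127.0.0.1, this works only from a browser on the server itself. From another machine, either forward the port over SSH first (ssh -L 8000:127.0.0.1:8000 user@server) or set up the tunnel described later in this guide. On first access, Paperless-ngx prompts you to create an admin account. Fill in a username, email, and a strong password. This account has full access to all documents and settings.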

bash — alternative: create admin from CLI
docker compose exec webserver createsuperuser

The command above creates the account interactively from the command line. For automated setups, you can instead set PAPERLESS_ADMIN_USER, PAPERLESS_ADMIN_PASSWORD, and PAPERLESS_ADMIN_MAIL in docker-compose.env before first launch, and Paperless will create the admin account automatically on startup.

Understanding the Interface: What Everything Does

Once logged in, you see the main document list. Before you start uploading documents, it is worth understanding the organizational structure that makes Paperless powerful.

Concept | What It Means | Examples
Correspondent | Who created or sent the document. Usually a person, company, or institution. | IKEA, Tax Authority, Landlord, Bank of America, Insurance Co.
Document Type | The category of document regardless of who sent it. | Invoice, Contract, Letter, Statement, Receipt, Warranty, Policy
Tag | Flexible labels. Documents can have multiple tags. Use for any dimension that does not fit correspondent or type. | 2024, urgent, tax-deductible, apartment, car, needs-action
Storage Path | Controls where documents are stored on disk using a template. Optional but useful for keeping files organized outside of Paperless. | {correspondent}/{document_type}/{created_year}
Custom Fields | User-defined metadata fields for documents. Add fields like "Amount", "Contract End Date", "Account Number". | Invoice amount, Expiry date, Contract reference number

Paperless-ngx uses machine learning to assign correspondents, document types, tags, and storage paths automatically. Create a few of each manually, then train the classifier by uploading 20 to 30 documents and assigning their metadata by hand. After that, new documents of similar types get tagged automatically.

Four Ways to Get Documents into Paperless-ngx

1. Drop files into the consume folder

Copy or move any supported file into ~/paperless-ngx/consume/ on your server. Paperless watches this directory and processes new files automatically, usually within seconds. This is the fastest method from the command line or from other scripts: cp invoice.pdf ~/paperless-ngx/consume/

2. Upload through the web interface

Click the upload button in the web UI. Drag and drop files or click to browse. Useful when you want to immediately see the result or set metadata before consumption. You can upload multiple files at once.

3. Email consumption

Configure Paperless to check a dedicated email inbox (Settings → Mail). It will periodically check the inbox, download attachments matching your rules, and consume them as documents. Workflow: scan a receipt on your phone, email the PDF to the dedicated address, find it organized in your archive within minutes, from anywhere.

4. Mobile apps

Paperless for Android and Paperless for iOS connect directly to your Paperless-ngx instance via the API. Scan a document with your phone's camera, submit it through the app, and it lands in your archive. Works over the internet once you have remote access configured. You can also use any scanner app that can "share" a PDF to another app and use the Paperless Share app for Android.

Supported file formats: PDF documents, JPEG, PNG, TIFF, GIF, BMP images, plain text files, and (with Tika/Gotenberg) Word, Excel, PowerPoint, OpenDocument files, and raw email files (.eml).
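
For the consume-folder route, a thin wrapper can catch unsupported files before they land in the folder. This is a sketch only: consume_file is a hypothetical helper, and the extension list is my reading of the supported formats above (the exact set of Office extensions is an assumption).

```shell
# Copy a file into the consume directory if its extension looks supported.
# $1 = file to import, $2 = consume directory.
consume_file() {
    src=$1 consume_dir=$2
    [ -f "$src" ] || { echo "not a file: $src" >&2; return 1; }
    # Lower-case the extension so "SCAN.PDF" is accepted too
    ext=$(printf '%s' "${src##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        pdf|jpg|jpeg|png|tif|tiff|gif|bmp|txt) ;;   # handled by the minimal stack
        docx|xlsx|pptx|odt|ods|odp|eml) ;;          # need Tika and Gotenberg
        *) echo "unsupported extension: .$ext" >&2; return 1 ;;
    esac
    # Copy rather than move, so the original stays where it was
    cp -- "$src" "$consume_dir/"
}

# Usage: consume_file invoice.pdf "$HOME/paperless-ngx/consume"
```

Copying instead of moving is a deliberate choice here: Paperless removes the file from the consume folder once it has been processed, so the copy is the one that disappears.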

Exposing Paperless-ngx to the Internet with Localtonet

Paperless-ngx running locally is already useful, but its real value comes when you can access and add documents from anywhere: scan something at the office and have it appear in your home archive, or retrieve a document on your phone while standing in front of a bank teller.

To expose Paperless to the internet you need to forward inbound HTTPS traffic to port 8000 on your server. Localtonet handles this as an outbound tunnel, which means it works even behind NAT, CGNAT, or without a static IP address.

Step 1: Install and authenticate Localtonet

Download and install Localtonet on the same machine running Paperless-ngx. Follow the installation steps for your platform at localtonet.com and authenticate with your token. If you have already done this for another service, skip to Step 2.

Step 2: Create an HTTP tunnel to port 8000

In the Localtonet dashboard, create a new tunnel:

Setting | Value
Protocol | HTTP
Local IP | 127.0.0.1
Local Port | 8000
Subdomain | Choose a name, e.g. mydocs, which gives mydocs.localto.net

Your public URL will be something like https://mydocs.localto.net. Note this URL for the next step.

Step 3: Update the environment file with your public URL

Paperless-ngx needs to know its own public URL to generate correct links (for example, in email notifications and API responses). Open docker-compose.env and uncomment and set the PAPERLESS_URL line:

docker-compose.env — add this line
PAPERLESS_URL=https://mydocs.localto.net

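Then recreate the webserver container to apply the change. A plain docker compose restart reuses the existing container and does not re-read the env file, so use up -d, which detects the changed configuration and recreates the container:

bash
docker compose up -d webserver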

Navigate to https://mydocs.localto.net in your browser. You should see the Paperless-ngx login page over HTTPS. Localtonet provides the TLS certificate automatically; you do not need to configure SSL yourself.
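
A malformed PAPERLESS_URL is a common source of host-header and CSRF errors, so a quick sanity check before recreating the container can save a round trip. The sketch below assumes the value should be a scheme plus host (and optional port) with no path and no trailing slash, which is how the Paperless-ngx configuration docs describe it; is_valid_paperless_url is a hypothetical helper name.

```shell
# Accept only "http(s)://host[:port]" with no path and no trailing slash.
is_valid_paperless_url() {
    printf '%s\n' "$1" | grep -Eq '^https?://[A-Za-z0-9.-]+(:[0-9]+)?$'
}

# Usage:
#   is_valid_paperless_url "https://mydocs.localto.net" \
#       && echo "looks fine" || echo "check scheme and trailing slash"
```

The check is purely syntactic. It does not confirm that the tunnel is up or that the name resolves.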

Set a strong password before going public

Your Paperless-ngx instance stores sensitive documents: tax returns, contracts, identity documents. Before making it publicly accessible, ensure your admin account has a strong, unique password, you have set a proper random PAPERLESS_SECRET_KEY, and ideally you have configured Localtonet's built-in WAF on this tunnel. If you only need access for yourself and a small number of trusted users, consider IP whitelisting specific trusted IP ranges in the Localtonet WAF security settings.

Connecting the Mobile App

With remote access configured, you can now use the Paperless mobile apps to scan and upload documents from your phone. Install the Paperless app for Android or iOS, then configure it:

1. Enter your server URL

In the app settings, set the server URL to your Localtonet public URL: https://mydocs.localto.net

2. Log in with your credentials

Enter the same username and password you use for the web interface. The app uses the Paperless REST API directly.

3. Test by uploading a document

Use your phone's camera to scan a document through the app, or share a PDF from another app to the Paperless app. The document should appear in your Paperless archive within a few seconds to a minute, depending on how long OCR takes on your server.
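
The same REST API the apps use is available to scripts. As a rough sketch, a document can be posted to the api/documents/post_document/ endpoint with curl. The function names, the PAPERLESS_TOKEN variable, and the file name below are placeholders; the token is assumed to be an API token generated for your user in the web UI.

```shell
# Build the upload endpoint from a base URL, tolerating a trailing slash.
upload_endpoint() {
    printf '%s/api/documents/post_document/\n' "${1%/}"
}

# Upload one file. $1 = base URL, $2 = API token, $3 = file path.
upload_document() {
    curl -fsS \
        -H "Authorization: Token $2" \
        -F "document=@$3" \
        "$(upload_endpoint "$1")"
}

# Usage (token and file name are placeholders):
#   upload_document https://mydocs.localto.net "$PAPERLESS_TOKEN" receipt.pdf
```

A successful upload only means the file was queued. As with the mobile app, the document shows up in the archive once OCR has finished.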

Backing Up Paperless-ngx: What Needs to Be Saved

Your Paperless-ngx archive has value precisely because it contains important documents. A backup strategy is not optional. There are three things that need to be backed up:

What | Where | Contains
Media files | Docker volume paperless-ngx_media | The actual document files: original uploads and archived PDF/A versions. This is the most important thing to back up.
Database | Docker volume paperless-ngx_pgdata | All metadata: document titles, tags, correspondents, types, dates, custom fields, and notes. Without this, your documents are files with no organization.
Configuration | docker-compose.yml and docker-compose.env | Your setup, including the secret key. Required to restore to a working state.

The full-text search index is not part of the database. It lives in the paperless-ngx_data volume and can be rebuilt from the documents with document_index reindex, so it does not need its own backup.

The simplest reliable backup method: use the built-in document exporter, then back up the export directory. The exporter creates a complete JSON export of all metadata and copies all original document files into the export folder:

bash — create a full export
docker compose exec webserver document_exporter ../export
output
100%|██████████| 247/247 [00:04<00:00, 52.3it/s]

This puts everything into ~/paperless-ngx/export/. You can then sync this directory to another machine, an external drive, or cloud storage. To restore from an export:

bash — restore from export
docker compose exec webserver document_importer ../export

Export and import versions must match

An export created with one version of Paperless-ngx is not guaranteed to import cleanly into a different version. The export format is tied to the database schema, which changes between releases. Always import into the same version that created the export, then upgrade afterward.
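
To keep more than one generation of exports, the export directory can be packed into a dated archive after each run. The following is a minimal sketch: archive_export is a hypothetical helper, the archive naming scheme is arbitrary, and the check for manifest.json assumes the exporter's default layout.

```shell
# Pack an export directory into a dated tarball and print the archive path.
# $1 = export directory, $2 = directory that receives the archives.
archive_export() {
    export_dir=$1 backup_dir=$2
    # Refuse to archive a directory that does not look like an export
    [ -f "$export_dir/manifest.json" ] || {
        echo "no manifest.json in $export_dir" >&2
        return 1
    }
    mkdir -p "$backup_dir" || return 1
    archive="$backup_dir/paperless-export-$(date +%Y-%m-%d).tar.gz"
    tar -czf "$archive" -C "$export_dir" . && echo "$archive"
}

# Usage: archive_export "$HOME/paperless-ngx/export" "$HOME/backup"
```

Running it twice on the same day overwrites that day's archive, which keeps a daily cron job from piling up duplicates.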

For a simpler day-to-day backup of the Docker volumes directly (no version constraints, but larger files):

bash — direct volume backup
# Stop containers first to ensure a consistent backup
docker compose stop

# Back up the media and database volumes.
# /var/lib/docker is readable only by root, hence sudo.
mkdir -p ~/backup/media ~/backup/pgdata
sudo rsync -a /var/lib/docker/volumes/paperless-ngx_media/ ~/backup/media/
sudo rsync -a /var/lib/docker/volumes/paperless-ngx_pgdata/ ~/backup/pgdata/

# Restart
docker compose start

Updating Paperless-ngx

Paperless-ngx releases updates frequently, including security fixes. The update process is straightforward:

bash
# Always make a backup before updating
docker compose exec webserver document_exporter ../export

# Pull the new images
docker compose pull

# Restart with the new images (applies database migrations automatically)
docker compose up -d

Database migrations run automatically when the new webserver container starts. Do not stop the containers while a migration is in progress (watch for migration-related output in the logs with docker compose logs -f webserver).

Common Problems and Solutions

Problem | Likely Cause | Fix
Permission denied on consume folder | USERMAP_UID / USERMAP_GID in the env file does not match your host user's UID/GID | Run id -u && id -g to get your IDs, update the env file, recreate the container with docker compose up -d --force-recreate webserver
Document consumed but no text found | Wrong OCR language set, or the document is a scanned image in a language not installed | Check PAPERLESS_OCR_LANGUAGE matches your document language. Install additional language packages via PAPERLESS_OCR_LANGUAGES.
Office documents (.docx, .xlsx) not consumed | Tika and Gotenberg containers are not running or not reachable | Run docker compose ps and verify gotenberg and tika show running. Check logs with docker compose logs gotenberg tika.
Invalid HTTP_HOST header error after setting PAPERLESS_URL | Paperless needs PAPERLESS_URL set to the actual public URL you are accessing it from | Ensure PAPERLESS_URL exactly matches your Localtonet URL including the https:// prefix. Recreate the webserver container with docker compose up -d webserver after changing it; a plain restart does not re-read the env file.
Webserver container keeps restarting | Usually a database connection issue or missing required environment variable | Check docker compose logs webserver for the actual error. Most common: the database is not ready yet (wait and retry).
Very slow OCR on Raspberry Pi | OCR is CPU-intensive and the Pi's ARM cores are slower than a typical x86 CPU | Set PAPERLESS_OCR_PAGES=1 to OCR only the first page, PAPERLESS_WEBSERVER_WORKERS=1, and PAPERLESS_ENABLE_NLTK=false. Consider skipping Tika/Gotenberg if you only process PDFs.
Forgot admin password | Lost credentials | Reset from the CLI: docker compose exec webserver manage changepassword admin (replace admin with your username)
Search returns no results for existing documents | Search index is out of date or was never built | Rebuild the index: docker compose exec webserver document_index reindex
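
The first row of the table, the UID/GID mismatch, is easy to check mechanically. The sketch below reads the two values from the env file and compares them with the current user; check_usermap is a hypothetical helper, and it assumes you run it as the user who owns the consume directory.

```shell
# Compare USERMAP_UID / USERMAP_GID in an env file with the current user.
check_usermap() {
    env_file=$1
    # If a key appears more than once, the last assignment is used
    file_uid=$(sed -n 's/^USERMAP_UID=//p' "$env_file" | tail -n 1)
    file_gid=$(sed -n 's/^USERMAP_GID=//p' "$env_file" | tail -n 1)
    if [ "$file_uid" = "$(id -u)" ] && [ "$file_gid" = "$(id -g)" ]; then
        echo "USERMAP matches host user ($file_uid:$file_gid)"
    else
        echo "mismatch: env file has ${file_uid:-unset}:${file_gid:-unset}, host user is $(id -u):$(id -g)"
        return 1
    fi
}

# Usage: check_usermap "$HOME/paperless-ngx/docker-compose.env"
```

A non-zero exit status makes the function usable as a pre-flight check in a deployment script, before docker compose up -d is run.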

Frequently Asked Questions

Is my data safe if Paperless-ngx is exposed to the internet?

Paperless-ngx stores documents in clear text without per-document encryption. This means that anyone with server access can read your files. The security model relies on authentication (username and password) to control access to the web interface. To reduce risk when exposing it publicly: use a strong, unique password for your admin account, set a proper random PAPERLESS_SECRET_KEY, and enable the Localtonet WAF on your tunnel with credential stuffing protection and a low rate limit on the login path. If you only need access for yourself, consider whitelisting your home IP and mobile carrier IP ranges in the WAF security settings so the instance is effectively private.

Can I use SQLite instead of PostgreSQL?

Yes. SQLite is officially supported and is the recommended choice for low-RAM systems like Raspberry Pi. Use the docker-compose.sqlite.yml file from the Paperless-ngx GitHub repository instead of the postgres version. SQLite stores everything in a single file, which makes backups simpler but concurrent write performance is lower. For a single-user personal archive, SQLite is perfectly fine. For multi-user setups or high document volumes, PostgreSQL is better.

How much RAM does the full stack (with Tika and Gotenberg) use?

At idle, the full stack typically uses about 600 MB to 1 GB of RAM: roughly 300 to 500 MB for the Paperless webserver (with 2 workers), 150 to 250 MB for Tika, 80 to 100 MB for Gotenberg, 50 to 80 MB for PostgreSQL, and 20 to 40 MB for Redis. During OCR processing, RAM usage spikes significantly depending on document size and the number of task workers. A system with 2 GB RAM handles the full stack comfortably at idle. For active document processing, 4 GB is more comfortable.

Does Paperless-ngx handle multi-user access?

Yes. You can create multiple user accounts from the admin interface (Settings → Users) and assign permissions. Documents can be owned by specific users or shared. You can restrict access so that each user only sees their own documents, or configure a shared library where everyone sees everything. There is no per-user document encryption; all documents are readable by anyone with server access or admin privileges.

What happens to the original file after consumption?

Paperless-ngx stores both the original unmodified file and the processed PDF/A archive version. The original is never altered. By default, files dropped into the consume directory are moved into the media volume after processing and are no longer in the consume folder. If you want to keep your own copy outside Paperless, copy the file into the consume directory instead of moving it there. The export command always exports the original files.

Can I access Paperless-ngx when my home server is off?

No. Paperless-ngx runs on your own hardware and requires that hardware to be powered on and connected to the internet for access. If you need always-on availability, consider running it on a low-power device like a Raspberry Pi or a NAS that stays on continuously, or on a VPS if you are comfortable with that trade-off. The Localtonet tunnel stays connected as long as the Localtonet client is running on your server, so the uptime of your archive equals the uptime of your server.

Stop Losing Documents. Start Building Your Digital Archive.

Paperless-ngx gives you a searchable, organized, OCR-indexed archive of every document you own, running on your own hardware with no ongoing costs. Combine it with a Localtonet HTTP tunnel and you can scan documents from anywhere and access your archive from any device. Your documents stay yours.
