How to Self-Host Open WebUI and Get a Private ChatGPT Interface for Your Local Models
Open WebUI is a free, open-source web interface for large language models that looks and feels like ChatGPT but runs entirely on your own hardware. It connects to Ollama, supports dozens of models, and includes document uploads, image generation, web search, and custom tools. This guide covers installing Open WebUI with Docker alongside Ollama, completing the first-time setup, and making the interface securely accessible from any device using a Localtonet tunnel.
What Is Open WebUI?
Open WebUI is a self-hosted web interface for interacting with large language models. It is designed to feel familiar to anyone who has used ChatGPT, but everything runs on your own hardware. Your conversations, your uploaded documents, and your model interactions never touch an external server.
Under the hood, Open WebUI connects to Ollama to run models locally. It can also connect to any OpenAI-compatible API, which means you can use it as a unified interface for both your local models and cloud APIs from the same dashboard.
Open WebUI + Ollama
- Completely free, no subscription
- No data sent to any third party
- Works offline after model download
- No usage limits or rate throttling
- Full control over models and settings
ChatGPT / Claude web apps
- Subscription required for best models
- Prompts and files processed on external servers
- Requires internet connection
- Rate limits on free tiers
- No control over model versions or settings
Step 1: Install Ollama
Open WebUI needs Ollama to run models. If you already have Ollama installed and running, skip to the next step.
🐧 Linux
curl -fsSL https://ollama.com/install.sh | sh
# Verify it is running
ollama --version
systemctl status ollama
🍎 macOS
Download the app from ollama.com/download, drag it to Applications, and launch it. Ollama runs in the menu bar and starts the API server automatically.
🪟 Windows
Download and run OllamaSetup.exe from ollama.com/download. Ollama installs to your user directory and starts as a background service automatically.
Pull a model to test Ollama before moving on:
ollama pull llama3.2
ollama run llama3.2
If Llama responds in the terminal, Ollama is working correctly.
Type /bye to exit the chat and continue with the Open WebUI setup.
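You can also confirm that the Ollama API server itself is reachable, since that is the endpoint Open WebUI will talk to. A quick check, assuming Ollama's default port 11434:

```shell
# List installed models via the Ollama REST API.
# 11434 is Ollama's default port; adjust if you changed it.
curl http://localhost:11434/api/tags
```

A JSON response that includes llama3.2 means the API server is up and Open WebUI will be able to find your models.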
Step 2: Install Open WebUI with Docker
The fastest way to get Open WebUI running when Ollama is already installed on the host machine
is a single Docker command. The --add-host=host.docker.internal:host-gateway flag
lets the Open WebUI container reach Ollama running on the host.
docker run -d \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Open WebUI is now running at http://localhost:3000.
Open that address in your browser to complete the setup.
The -v open-webui:/app/backend/data flag mounts a named Docker volume for persistent storage.
Without it, all your conversations, user accounts, and settings are lost whenever the container is removed or recreated.
Always include this flag.
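A quick sanity check, assuming the container name and port mapping from the command above:

```shell
# Confirm the container is running.
docker ps --filter name=open-webui
# Expect an HTTP status code (200 or a redirect) from the web server.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000
```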
Alternative: Docker Compose with Ollama Bundled
If you prefer to run Ollama inside Docker rather than on the host, use Docker Compose to run both services together. This is a clean, self-contained setup that is easy to update and move.
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    depends_on:
      - ollama
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui_data:/app/backend/data

volumes:
  ollama_data:
  open-webui_data:
docker compose up -d
docker compose ps
Open WebUI is now available at http://localhost:3000.
If you want Ollama to use your NVIDIA GPU inside Docker, add a deploy section
to the ollama service with resources.reservations.devices configured
for NVIDIA. You also need the NVIDIA Container Toolkit installed on the host.
Running Ollama natively on the host (the single Docker command approach above) is simpler
for GPU setups since it avoids the container runtime configuration entirely.
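For reference, the GPU reservation described above looks roughly like this in the compose file (field names follow Docker Compose's GPU support syntax; this is a sketch of the relevant section, not a complete service definition):

```yaml
services:
  ollama:
    # ...image, volumes, and ports as in the compose file above...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```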
Step 3: First-Time Setup
Create the admin account
Open http://localhost:3000 in your browser.
On first visit, Open WebUI shows a sign-up screen.
The first account you create becomes the administrator.
Use a strong password; this account controls all settings and user management.
Select a model
Click the model selector at the top of the chat page.
Open WebUI fetches the list of available models from Ollama automatically.
Select the model you pulled earlier, for example llama3.2.
If no models appear, click Settings → Admin → Connections
and verify the Ollama URL is set to http://host.docker.internal:11434.
Pull additional models from the UI
Go to Settings → Admin → Models and enter a model name in the
pull field, for example qwen2.5-coder:7b or deepseek-r1:7b.
Open WebUI sends the pull request to Ollama and the model downloads in the background.
No terminal needed after the initial setup.
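Behind the scenes this goes through Ollama's pull API. If you prefer the terminal after all, the rough equivalent is:

```shell
# Ask Ollama to download a model via its REST API.
# qwen2.5-coder:7b is just an example model name.
curl http://localhost:11434/api/pull -d '{"name": "qwen2.5-coder:7b"}'
```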
Start chatting
Click New Chat, select your model, and start a conversation. Everything runs locally on your machine.
Access Open WebUI from Any Device with Localtonet
Open WebUI listens on port 3000 and is only reachable on your local machine by default.
To access it from your phone, a tablet, or another computer on a different network,
create a Localtonet HTTP tunnel. The tunnel gives you a public HTTPS URL and Localtonet
handles TLS automatically.
Install and authenticate Localtonet
curl -fsSL https://localtonet.com/install.sh | sh
localtonet --authtoken <YOUR_TOKEN>
Create an HTTP tunnel for port 3000
Go to the HTTP tunnel page,
set local IP to 127.0.0.1 and port to 3000.
Click Create and start the tunnel.
The dashboard shows a public HTTPS URL such as https://abc123.localto.net.
Open from any device
Enter the tunnel URL in any browser on any device. Log in with your Open WebUI credentials and start chatting with your local models from your phone, tablet, or any other machine.
Keep everything running after a reboot
The Docker container already has --restart always, so Open WebUI comes back automatically with Docker.
Register Localtonet as a service so the tunnel also returns after every reboot:
sudo localtonet --install-service --authtoken <YOUR_TOKEN>
sudo localtonet --start-service --authtoken <YOUR_TOKEN>
If you share the tunnel URL with family members or teammates, enable Single Sign-On on the tunnel in the Localtonet dashboard. Only people who can authenticate with Google, GitHub, Microsoft, or GitLab will reach the Open WebUI login page. This adds a second authentication layer in front of the interface.
Features Worth Exploring
📄 Chat with documents
Upload a PDF, DOCX, or text file to any conversation. Open WebUI indexes it and the model answers questions about its contents using retrieval-augmented generation. Useful for research papers, contracts, technical manuals, and any document you want to query in plain language.
🔧 Model file editor
Go to Settings → Admin → Models and create a custom model based on any Ollama model with a different system prompt, temperature, or context window. Give it a custom name and avatar. Your custom models appear alongside the base models in the model selector.
🌐 Web search integration
Connect Open WebUI to a web search engine under Settings → Admin → Web Search. DuckDuckGo works without an API key. Once connected, models can search the web and include current information in their responses when you enable the search tool in a chat.
👥 Create user accounts
As the admin, go to Settings → Admin → Users and invite additional users. Each person has their own login, conversation history, and settings. You control which models each user can access and their permission level.
📱 Add to your home screen
Open the tunnel URL in Safari on iOS or Chrome on Android, then use Add to Home Screen from the browser menu. Open WebUI installs as a progressive web app and opens full-screen without the browser chrome, making it feel like a native app on your phone.
Frequently Asked Questions
Open WebUI shows no models. What is wrong?
The most common cause is that Open WebUI cannot reach the Ollama server. If you used the single Docker run command, verify that --add-host=host.docker.internal:host-gateway was included. If you used Docker Compose with Ollama as a separate service, check that the OLLAMA_BASE_URL environment variable is set to http://ollama:11434 and that the Ollama service is actually running with docker compose ps. Also confirm you have pulled at least one model with ollama pull llama3.2.
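A few commands that narrow down which side is failing, assuming the default ports used in this guide:

```shell
# 1. Does Ollama answer on the host and list at least one model?
curl http://localhost:11434/api/tags

# 2. Compose setup: are both services actually up?
docker compose ps

# 3. Check Open WebUI's logs for connection errors.
docker logs open-webui --tail 50
```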
Can I connect Open WebUI to OpenAI or Claude alongside local models?
Yes. Go to Settings → Admin → Connections and add an OpenAI-compatible API connection. Enter the API base URL and your API key. For OpenAI, the base URL is https://api.openai.com/v1. For Anthropic, use a compatible proxy or the Anthropic API directly. Cloud models then appear in the model selector alongside your local Ollama models and you can switch between them per conversation.
Will the interface be slow when accessed through the Localtonet tunnel?
The interface itself loads quickly because it is just HTML and JavaScript. The noticeable latency is in model inference: the time between sending a message and seeing the first token of the response. That latency is determined by your GPU or CPU speed and is the same whether you access the interface locally or through the tunnel. Streaming responses work through the tunnel, so tokens appear progressively rather than waiting for the full response.
How do I update Open WebUI to a newer version?
Pull the latest image and recreate the container. Your conversation history, user accounts, and settings are stored in the named Docker volume and survive the update.
docker pull ghcr.io/open-webui/open-webui:main
docker stop open-webui
docker rm open-webui
# Re-run the original docker run command
If you use Docker Compose, run docker compose pull && docker compose up -d.
Can multiple people use it at the same time?
Yes. Open WebUI supports concurrent users. Each person has their own session and conversation history. The practical limit is your hardware: running multiple simultaneous inference requests requires enough RAM or VRAM to handle them. If two people send messages at the same time, Ollama queues the requests and processes them sequentially unless you have configured multiple GPU workers.
Can I run Open WebUI on a Raspberry Pi?
Open WebUI itself runs well on a Raspberry Pi 4 or Pi 5 since it is just a web server. The limiting factor is Ollama and the models. A Pi 4 with 8 GB RAM can run small models like Phi-4 Mini or Gemma 3 1B at a usable speed. Larger models like 7B parameter variants will be very slow on CPU-only inference. Using a Pi as the Open WebUI frontend while pointing it at a more powerful machine running Ollama over the local network or via a Localtonet tunnel is a practical setup.
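For that split setup, point Open WebUI at the remote Ollama with the OLLAMA_BASE_URL environment variable. A sketch, where 192.168.1.50 is a hypothetical address standing in for the machine that runs Ollama:

```shell
# Run Open WebUI on the Pi, with inference on another machine.
# Replace 192.168.1.50 with the real address of your Ollama host.
docker run -d \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://192.168.1.50:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```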
Your Private AI Interface, Accessible from Anywhere
Install Open WebUI, connect it to Ollama, and open a Localtonet tunnel. Your personal ChatGPT alternative is reachable from any device, runs fully offline, and keeps all your data on your own hardware.
Create Free Localtonet Account →