How to Self-Host Open WebUI and Get a Private ChatGPT Interface for Your Local Models
Open WebUI is a free, open-source web interface for large language models that looks and feels like ChatGPT but runs entirely on your own hardware. It connects to Ollama, supports dozens of models, and includes document uploads, image generation, web search, and custom tools. This guide covers installing Open WebUI with Docker alongside Ollama, completing the first-time setup, and making the interface securely accessible from any device using a Localtonet tunnel.
What Is Open WebUI?
Open WebUI is a self-hosted web interface for interacting with large language models. It is designed to feel familiar to anyone who has used ChatGPT, but everything runs on your own hardware. Your conversations, your uploaded documents, and your model interactions never touch an external server.
Under the hood, Open WebUI connects to Ollama to run models locally. It can also connect to any OpenAI-compatible API, which means you can use it as a unified interface for both your local models and cloud APIs from the same dashboard.
Open WebUI + Ollama
- Completely free, no subscription
- No data sent to any third party
- Works offline after model download
- No usage limits or rate throttling
- Full control over models and settings
ChatGPT / Claude web apps
- Subscription required for best models
- Prompts and files processed on external servers
- Requires internet connection
- Rate limits on free tiers
- No control over model versions or settings
Step 1: Install Ollama
Open WebUI needs Ollama to run models. If you already have Ollama installed and running, skip to the next step.
🐧 Linux
curl -fsSL https://ollama.com/install.sh | sh
# Verify it is running
ollama --version
systemctl status ollama
🍎 macOS
Download the app from ollama.com/download, drag it to Applications, and launch it. Ollama runs in the menu bar and starts the API server automatically.
🪟 Windows
Download and run OllamaSetup.exe from ollama.com/download. Ollama installs to your user directory and starts as a background service automatically.
Pull a model to test Ollama before moving on:
ollama pull llama3.2
ollama run llama3.2
If Llama responds in the terminal, Ollama is working correctly.
Type /bye to exit the chat and continue with the Open WebUI setup.
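You can also confirm that the Ollama API server itself is reachable, since that is the endpoint Open WebUI will talk to. A quick check, assuming Ollama's default port 11434:

```shell
# List installed models via the Ollama REST API.
# 11434 is Ollama's default port; adjust if you changed it.
curl http://localhost:11434/api/tags
```

A JSON response that includes llama3.2 means the API server is up and Open WebUI will be able to find your models.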
Step 2: Install Open WebUI with Docker
The fastest way to get Open WebUI running when Ollama is already installed on the host machine
is a single Docker command. The --add-host=host.docker.internal:host-gateway flag
lets the Open WebUI container reach Ollama running on the host.
docker run -d \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Open WebUI is now running at http://localhost:3000.
Open that address in your browser to complete the setup.
The -v open-webui:/app/backend/data flag mounts a named Docker volume for persistent storage.
Without it, all your conversations, user accounts, and settings are lost whenever the container is removed or recreated.
Always include this flag.
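A quick sanity check, assuming the container name and port mapping from the command above:

```shell
# Confirm the container is running.
docker ps --filter name=open-webui
# Expect an HTTP status code (200 or a redirect) from the web server.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000
```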
Alternative: Docker Compose with Ollama Bundled
If you prefer to run Ollama inside Docker rather than on the host, use Docker Compose to run both services together. This is a clean, self-contained setup that is easy to update and move.
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    depends_on:
      - ollama
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui_data:/app/backend/data

volumes:
  ollama_data:
  open-webui_data:
docker compose up -d
docker compose ps
Open WebUI is now available at http://localhost:3000.
If you want Ollama to use your NVIDIA GPU inside Docker, add a deploy section
to the ollama service with resources.reservations.devices configured
for NVIDIA. You also need the NVIDIA Container Toolkit installed on the host.
Running Ollama natively on the host (the single Docker command approach above) is simpler
for GPU setups since it avoids the container runtime configuration entirely.
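For reference, the GPU reservation described above looks roughly like this in the compose file (field names follow Docker Compose's GPU support syntax; this is a sketch of the relevant section, not a complete service definition):

```yaml
services:
  ollama:
    # ...image, volumes, and ports as in the compose file above...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```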
Step 3: First-Time Setup
Create the admin account
Open http://localhost:3000 in your browser.
On first visit, Open WebUI shows a sign-up screen.
The first account you create becomes the administrator.
Use a strong password; this account controls all settings and user management.
Select a model
Click the model selector at the top of the chat page.
Open WebUI fetches the list of available models from Ollama automatically.
Select the model you pulled earlier, for example llama3.2.
If no models appear, click Settings → Admin → Connections
and verify the Ollama URL is set to http://host.docker.internal:11434.
Pull additional models from the UI
Go to Settings → Admin → Models and enter a model name in the
pull field, for example qwen2.5-coder:7b or deepseek-r1:7b.
Open WebUI sends the pull request to Ollama and the model downloads in the background.
No terminal needed after the initial setup.
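Behind the scenes this goes through Ollama's pull API. If you prefer the terminal after all, the rough equivalent is:

```shell
# Ask Ollama to download a model via its REST API.
# qwen2.5-coder:7b is just an example model name.
curl http://localhost:11434/api/pull -d '{"name": "qwen2.5-coder:7b"}'
```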
Start chatting
Click New Chat, select your model, and start a conversation. Everything runs locally on your machine.
Access Open WebUI from Any Device with Localtonet
Open WebUI listens on port 3000 and is only reachable on your local machine by default.
To access it from your phone, a tablet, or another computer on a different network,
create a Localtonet HTTP tunnel. The tunnel gives you a public HTTPS URL and Localtonet
handles TLS automatically.
Install and authenticate Localtonet
curl -fsSL https://localtonet.com/install.sh | sh
localtonet --authtoken <YOUR_TOKEN>
Create an HTTP tunnel for port 3000
Go to the HTTP tunnel page,
set local IP to 127.0.0.1 and port to 3000.
Click Create and start the tunnel.
The dashboard shows a public HTTPS URL such as https://abc123.localto.net.
Open from any device
Enter the tunnel URL in any browser on any device. Log in with your Open WebUI credentials and start chatting with your local models from your phone, tablet, or any other machine.
Keep everything running after a reboot
The Docker container already has --restart always, so Open WebUI comes back automatically with Docker.
Register Localtonet as a service so the tunnel also returns after every reboot:
sudo localtonet --install-service --authtoken <YOUR_TOKEN>
sudo localtonet --start-service --authtoken <YOUR_TOKEN>
If you share the tunnel URL with family members or teammates, enable Single Sign-On on the tunnel in the Localtonet dashboard. Only people who can authenticate with Google, GitHub, Microsoft, or GitLab will reach the Open WebUI login page. This adds a second authentication layer in front of the interface.
Features Worth Exploring
📄 Chat with documents
Upload a PDF, DOCX, or text file to any conversation. Open WebUI indexes it and the model answers questions about its contents using retrieval-augmented generation. Useful for research papers, contracts, technical manuals, and any document you want to query in plain language.
🔧 Model file editor
Go to Settings → Admin → Models and create a custom model based on any Ollama model with a different system prompt, temperature, or context window. Give it a custom name and avatar. Your custom models appear alongside the base models in the model selector.
🌐 Web search integration
Connect Open WebUI to a web search engine under Settings → Admin → Web Search. DuckDuckGo works without an API key. Once connected, models can search the web and include current information in their responses when you enable the search tool in a chat.
👥 Create user accounts
As the admin, go to Settings → Admin → Users and invite additional users. Each person has their own login, conversation history, and settings. You control which models each user can access and their permission level.
📱 Add to your home screen
Open the tunnel URL in Safari on iOS or Chrome on Android, then use Add to Home Screen from the browser menu. Open WebUI installs as a progressive web app and opens full-screen without the browser chrome, making it feel like a native app on your phone.
Frequently Asked Questions
Open WebUI shows no models. What is wrong?
The most common cause is that Open WebUI cannot reach the Ollama server. If you used the single Docker run command, verify that --add-host=host.docker.internal:host-gateway was included. If you used Docker Compose with Ollama as a separate service, check that the OLLAMA_BASE_URL environment variable is set to http://ollama:11434 and that the Ollama service is actually running with docker compose ps. Also confirm you have pulled at least one model with ollama pull llama3.2.
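A few commands that narrow down which side is failing, assuming the default ports used in this guide:

```shell
# 1. Does Ollama answer on the host and list at least one model?
curl http://localhost:11434/api/tags

# 2. Compose setup: are both services actually up?
docker compose ps

# 3. Check Open WebUI's logs for connection errors.
docker logs open-webui --tail 50
```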
Can I connect Open WebUI to OpenAI or Claude alongside local models?
Yes. Go to Settings → Admin → Connections and add an OpenAI-compatible API connection. Enter the API base URL and your API key. For OpenAI, the base URL is https://api.openai.com/v1. For Anthropic, use a compatible proxy or the Anthropic API directly. Cloud models then appear in the model selector alongside your local Ollama models and you can switch between them per conversation.
Will the interface be slow when accessed through the Localtonet tunnel?
The interface itself loads quickly because it is just HTML and JavaScript. The noticeable latency is in model inference: the time between sending a message and seeing the first token of the response. That latency is determined by your GPU or CPU speed and is the same whether you access the interface locally or through the tunnel. Streaming responses work through the tunnel, so tokens appear progressively rather than waiting for the full response.
How do I update Open WebUI to a newer version?
Pull the latest image and recreate the container. Your conversation history, user accounts, and settings are stored in the named Docker volume and survive the update.
docker pull ghcr.io/open-webui/open-webui:main
docker stop open-webui
docker rm open-webui
# Re-run the original docker run command
If you use Docker Compose, run docker compose pull && docker compose up -d.
Can multiple people use it at the same time?
Yes. Open WebUI supports concurrent users. Each person has their own session and conversation history. The practical limit is your hardware: running multiple simultaneous inference requests requires enough RAM or VRAM to handle them. If two people send messages at the same time, Ollama queues the requests and processes them sequentially unless you have configured multiple GPU workers.
Can I run Open WebUI on a Raspberry Pi?
Open WebUI itself runs well on a Raspberry Pi 4 or Pi 5 since it is just a web server. The limiting factor is Ollama and the models. A Pi 4 with 8 GB RAM can run small models like Phi-4 Mini or Gemma 3 1B at a usable speed. Larger models like 7B parameter variants will be very slow on CPU-only inference. Using a Pi as the Open WebUI frontend while pointing it at a more powerful machine running Ollama over the local network or via a Localtonet tunnel is a practical setup.
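For that split setup, point Open WebUI at the remote Ollama with the OLLAMA_BASE_URL environment variable. A sketch, where 192.168.1.50 is a hypothetical address standing in for the machine that runs Ollama:

```shell
# Run Open WebUI on the Pi, with inference on another machine.
# Replace 192.168.1.50 with the real address of your Ollama host.
docker run -d \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://192.168.1.50:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```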
Your Private AI Interface, Accessible from Anywhere
Install Open WebUI, connect it to Ollama, and open a Localtonet tunnel. Your personal ChatGPT alternative is reachable from any device, runs fully offline, and keeps all your data on your own hardware.
Create Free Localtonet Account →