18 min read

Expose Local AI Agents to the Internet Securely (2026)

Learn how to expose local AI agents built with OpenAI SDK, LangChain, CrewAI, AutoGen, or an MCP server to the internet using a secure tunnel.

🤖 AI Agents · Tunneling · MCP · 2026

Expose Local AI Agents to the Internet Securely

You built an AI agent on your laptop. It works perfectly. Now a teammate, a client, or an external webhook needs to reach it, and the agent is sitting behind a router with no static IP and no open ports. Deploying to a cloud server costs money, introduces latency, and forces you to redeploy every time you change a prompt. Running a secure tunnel takes under five minutes and makes your local agent reachable from anywhere, with HTTPS, custom domains, and zero router configuration. This guide covers every major framework: OpenAI Agent SDK, LangChain, CrewAI, AutoGen, and MCP servers.

🔒 HTTPS by default, no self-signed certs 🤖 OpenAI SDK · LangChain · CrewAI · AutoGen · MCP 🌐 Public URL in under 5 minutes

Why Exposing a Local AI Agent Is Harder Than It Looks

Every AI agent framework ends up serving HTTP on a local port. An OpenAI Agent SDK app served with Uvicorn defaults to 127.0.0.1:8000, LangServe to 127.0.0.1:8000, and most MCP servers to localhost:3000. These are loopback addresses. Nothing outside your machine can reach them, full stop.

Port forwarding on your router technically works, but most residential ISPs now run CGNAT (Carrier-Grade NAT), which means your public IP address is shared with hundreds of neighbors. Your router's forwarding rules never reach the real public internet. You don't get a routable address at all.

Cloud deployments solve the routing problem but add a completely different one: every model update, every agent prompt revision, every tool change requires a rebuild and redeploy cycle. During active development that's painful. For personal agents, demos, or internal tools it's flat-out overkill.

A reverse tunnel gives your local agent a stable public HTTPS URL without touching your router, your ISP, or a cloud provider's billing page.

How Localtonet Works for AI Agents

Localtonet consists of two parts: a Client you install on your machine, and a dashboard at localtonet.com where you configure and control everything. You log into the Client with your token once. After that, all tunnel creation, starting, stopping, and settings (custom subdomains, IP allowlisting, server region) are managed from the dashboard. No command-line flags, no config files to edit.

The Client connects outbound to Localtonet's infrastructure over a persistent encrypted tunnel. Localtonet assigns a public HTTPS URL (e.g., https://yourname.localto.net) and forwards every incoming request through the tunnel to whatever port your AI framework is listening on locally. No inbound ports need to be open. No router changes. No static IP required.

For AI agents specifically, this means webhooks, OpenAI's function-calling callbacks, Slack event subscriptions, and human-in-the-loop approval endpoints can all reach your local process in real time. Your agent keeps running locally on your GPU or CPU, with full access to local files and tools, while appearing at a public HTTPS endpoint.

🔒 HTTPS out of the box Every HTTP tunnel gets a valid TLS certificate automatically. No self-signed cert warnings, no manual cert management.
🌐 Custom subdomains Reserve a fixed subdomain (e.g., myagent.localto.net) so your webhook URLs don't change between restarts.
🔑 IP allowlisting Restrict tunnel access to specific IP ranges directly from the dashboard. Lock your MCP server to only your team's office IPs.
🔄 Persistent connections The Client auto-reconnects after network drops. Long-running agentic loops won't lose their callback URL mid-execution.
💻 Cross-platform Client Works on Windows, Linux, macOS, and Docker. Run your AI framework on a workstation, a Raspberry Pi, or a local GPU server — the Client and dashboard setup is identical.
🤖 Framework-agnostic Any framework that serves HTTP works. OpenAI Agent SDK, LangChain, CrewAI, AutoGen, FastAPI, Flask, all handled the same way.

General Setup: Get a Public URL for Any Local AI Framework (2026)

1

Create a free Localtonet account

Go to localtonet.com/register and sign up. No credit card required for the free plan.

2

Copy your user token

Go to localtonet.com/usertoken and copy your token. You'll use this to log into the Localtonet Client in the next step.

3

Download and install the Localtonet Client

Download the Localtonet Client for your platform from localtonet.com/download. Works on Windows, Linux, macOS, and Docker.

4

Log into the Client with your token

Open the Localtonet Client and enter the token you copied from localtonet.com/usertoken. The Client connects to Localtonet's infrastructure and stays running in the background. No further CLI interaction is needed.

5

Create an HTTP tunnel from the dashboard

Go to localtonet.com/tunnel/http in the dashboard. Choose a server region, set the host to localhost and the port to whatever your AI framework listens on (e.g., 8000), then click Save. Localtonet assigns a public HTTPS URL immediately.

6

Start the tunnel from the dashboard

Back in the dashboard, click the Start button next to the tunnel you created. The tunnel goes live and your public HTTPS URL starts forwarding traffic to your local port. All further management (stop, restart, subdomain, IP allowlist) is done from this same dashboard view.

7

Start your AI framework and verify the public URL

Start your agent framework normally (see framework-specific sections below). The dashboard shows your live public URL, e.g., https://abc123.localto.net. Send a test request to confirm traffic flows through end to end.

Verify with curl
curl https://abc123.localto.net/health

🤖 Exposing an OpenAI Agent SDK Server

The OpenAI Agent SDK (also known as the Agents SDK, distributed as the openai-agents Python package) lets you define agents with tools, handoffs, and guardrails. The SDK doesn't ship its own HTTP server, so the usual pattern is to wrap Runner.run() in a small FastAPI app and serve it with Uvicorn on port 8000.

Install and run OpenAI Agent SDK server
pip install openai-agents fastapi uvicorn

# In your agent file (e.g., agent_server.py):
# from fastapi import FastAPI
# from agents import Agent, Runner
#
# app = FastAPI()
# agent = Agent(name="MyAgent", instructions="You are a helpful assistant.")
#
# @app.post("/run")
# async def run(payload: dict):
#     result = await Runner.run(agent, payload["input"])
#     return {"output": result.final_output}

uvicorn agent_server:app --host 0.0.0.0 --port 8000

Once the server is running on port 8000, go to the Localtonet dashboard and start the tunnel pointed at that port. The /run endpoint you defined (plus any streaming route you add) becomes reachable at your public HTTPS URL. External clients can now call your agent directly without any cloud deployment.

Call your exposed OpenAI Agent from anywhere
curl -X POST https://abc123.localto.net/run \
  -H "Content-Type: application/json" \
  -d '{"input": "Summarize the latest sales report"}'

🔗 Exposing a LangChain Agent via LangServe

LangServe adds a FastAPI wrapper around any LangChain chain or agent. It mounts routes at /invoke, /stream, and /batch, and serves a built-in Playground UI at /playground. The default port is 8000.

Start a LangServe agent
pip install "langserve[all]" langchain-openai uvicorn

# In your server file (e.g., serve.py):
# from fastapi import FastAPI
# from langserve import add_routes
# from langchain_openai import ChatOpenAI
# app = FastAPI()
# add_routes(app, ChatOpenAI(), path="/chat")

uvicorn serve:app --host 0.0.0.0 --port 8000

With the tunnel running, your teammates can reach the LangServe Playground at https://abc123.localto.net/chat/playground and test the agent through a browser without installing anything locally. This is useful for non-technical stakeholders who need to evaluate an agent before it ships.
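For programmatic access rather than the Playground, LangServe's RemoteRunnable client can call the tunneled route like any local Runnable. A minimal sketch, assuming the example tunnel URL used throughout this guide and the /chat route from the snippet above:

Call the tunneled LangServe route with RemoteRunnable
# pip install langserve langchain-core
from langserve import RemoteRunnable

# Point the client at the public tunnel URL instead of localhost
chat = RemoteRunnable("https://abc123.localto.net/chat/")

# Invoke the remote chain exactly as if it were running on this machine
print(chat.invoke("Give me a one-sentence summary of LangServe."))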

👥 Exposing a CrewAI Agent Workflow

CrewAI orchestrates multi-agent pipelines where each agent has a role, a goal, and a set of tools. The standard way to expose a CrewAI crew over HTTP is to wrap the crew.kickoff() call inside a FastAPI endpoint.

CrewAI FastAPI wrapper
pip install crewai langchain-openai fastapi uvicorn

# crew_api.py
from fastapi import FastAPI
from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI

app = FastAPI()
llm = ChatOpenAI(model="gpt-4o", temperature=0)

researcher = Agent(
    role="Researcher",
    goal="Find accurate information",
    backstory="You are a meticulous research analyst.",
    llm=llm
)

@app.post("/run")
def run_crew(topic: str):
    # crew.kickoff() is blocking; a plain (sync) def lets FastAPI run it in a worker thread
    task = Task(description=f"Research the topic: {topic}", agent=researcher, expected_output="A detailed report")
    crew = Crew(agents=[researcher], tasks=[task])
    result = crew.kickoff()
    return {"result": str(result)}

# uvicorn crew_api:app --host 0.0.0.0 --port 8000

Start the FastAPI server and the Localtonet tunnel together. External callers can POST to https://abc123.localto.net/run?topic=quantum+computing and get a full multi-agent research result back. No cloud infrastructure required.
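From the caller's side this is an ordinary HTTP request. A quick sketch with Python's requests library, using the example tunnel URL from this guide (topic is a query parameter, matching the FastAPI signature above):

Call the exposed CrewAI endpoint from another machine
# pip install requests
import requests

resp = requests.post(
    "https://abc123.localto.net/run",
    params={"topic": "quantum computing"},
    timeout=300,  # multi-agent runs can take minutes
)
print(resp.json()["result"])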

⚙️ Exposing an AutoGen Agent via REST

Microsoft's AutoGen (v0.4+) ships with an AgentChat API that can be wrapped in a FastAPI server. Earlier AutoGen versions used conversational patterns without a built-in HTTP layer, so the FastAPI wrapper approach works across all versions.

AutoGen FastAPI server
pip install autogen-agentchat "autogen-ext[openai]" fastapi uvicorn

# autogen_api.py
from fastapi import FastAPI
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

app = FastAPI()
model_client = OpenAIChatCompletionClient(model="gpt-4o")
agent = AssistantAgent("assistant", model_client=model_client)

@app.post("/chat")
async def chat(message: str):
    response = await agent.run(task=message)
    return {"response": response.messages[-1].content}

# uvicorn autogen_api:app --host 0.0.0.0 --port 8001

AutoGen agents often run extended agentic loops that can take 30-120 seconds to complete. If you're exposing one over a tunnel, use the /stream pattern with Server-Sent Events instead of a blocking POST, so the connection doesn't time out mid-run.
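One way to do that is to stream AutoGen's run_stream() events through FastAPI's StreamingResponse. A rough sketch under that assumption (the endpoint name, port, and event formatting are illustrative, not part of AutoGen itself):

AutoGen SSE streaming endpoint (sketch)
# autogen_stream_api.py
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

app = FastAPI()
agent = AssistantAgent("assistant", model_client=OpenAIChatCompletionClient(model="gpt-4o"))

@app.get("/stream")
async def stream(message: str):
    async def event_stream():
        # run_stream yields intermediate messages as the agentic loop progresses,
        # so the caller sees output long before the full run finishes
        async for event in agent.run_stream(task=message):
            yield f"data: {event}\n\n"  # one SSE frame per event
    return StreamingResponse(event_stream(), media_type="text/event-stream")

# uvicorn autogen_stream_api:app --host 0.0.0.0 --port 8001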

🔌 How to Expose an MCP Server to Remote Clients

The Model Context Protocol (MCP) defines a standard interface for AI models to call tools, read resources, and receive prompts from external servers. Local MCP servers typically run on localhost:3000 over HTTP, using Server-Sent Events (SSE) or the newer Streamable HTTP transport. By default, they're unreachable from outside your machine.

Exposing an MCP server publicly means remote Claude Desktop instances, hosted agents, or team members can connect to your custom tools without being on the same machine.

Start an MCP server (Node.js example)
npm install @modelcontextprotocol/sdk express

// server.js (basic MCP HTTP+SSE server)
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
import express from "express";

const app = express();
const server = new Server({ name: "my-mcp-server", version: "1.0.0" }, { capabilities: { tools: {} } });

let transport; // single-client example: the most recent SSE connection is reused for /messages
app.get("/sse", async (req, res) => {
  transport = new SSEServerTransport("/messages", res);
  await server.connect(transport);
});
app.post("/messages", async (req, res) => {
  await transport.handlePostMessage(req, res);
});

app.listen(3000, () => console.log("MCP server on port 3000"));
Tunnel the MCP server via the Localtonet dashboard
# 1. Make sure the Localtonet Client is running and authenticated with your token
# 2. Go to localtonet.com/tunnel/http in the dashboard
# 3. Set host: localhost, port: 3000, then Save
# 4. Click Start — your MCP server is now live at the assigned public URL

Once tunneled, configure Claude Desktop or any remote MCP client to point at your public URL instead of localhost. In Claude Desktop's claude_desktop_config.json, set the server URL to https://abc123.localto.net/sse (the SSE endpoint defined above). Your custom tools are now available to any MCP-compatible client that has the URL.

Claude Desktop config (claude_desktop_config.json)
{
  "mcpServers": {
    "my-remote-tools": {
      "url": "https://abc123.localto.net/sse"
    }
  }
}
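Before pointing Claude Desktop at the URL, you can sanity-check the tunneled server from any machine with the Python MCP SDK's SSE client. A rough sketch, assuming the example tunnel URL (verify the client API against the mcp package version you have installed):

Verify the remote MCP server from Python (sketch)
# pip install mcp
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Connect to the tunneled SSE endpoint instead of localhost:3000
    async with sse_client("https://abc123.localto.net/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Remote tools:", [t.name for t in tools.tools])

asyncio.run(main())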
⚠️ Most tutorials skip this step: binding to 0.0.0.0, not 127.0.0.1

Every AI framework defaults to 127.0.0.1 (loopback only) for security reasons. When your framework binds to 127.0.0.1, the Localtonet Client running on the same machine can't forward external traffic to it on some OS configurations. Change your server's bind address to 0.0.0.0 so it accepts connections on all local interfaces, then the tunnel routes to it cleanly. This does not expose your port directly to the internet — the Localtonet tunnel still handles all public traffic. It only allows the Client process (on the same machine) to reach your framework's port.

FastAPI / Uvicorn — correct bind address
uvicorn myapp:app --host 0.0.0.0 --port 8000
🔑 Securing your remote AI agent access with IP allowlisting

A public HTTPS tunnel with no access control means anyone who discovers the URL can call your agent. For internal tools or MCP servers, use Localtonet's IP allowlisting feature in the dashboard to restrict the tunnel to specific IP addresses (e.g., your office's egress IP or a teammate's home IP). For webhook-based agents, add a shared secret header check in your FastAPI route and validate it on every request. These two measures together cover the vast majority of real-world threat models for local agents.
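Here is a minimal sketch of that shared-secret check as a FastAPI dependency. The header name X-Agent-Secret and the AGENT_SECRET environment variable are placeholders for illustration, not Localtonet features:

Shared-secret header check (FastAPI dependency)
import os
import secrets

from fastapi import Depends, FastAPI, Header, HTTPException

AGENT_SECRET = os.environ["AGENT_SECRET"]  # set on the machine running the agent

def require_secret(x_agent_secret: str = Header(default="")):
    # constant-time comparison avoids leaking the secret through response timing
    if not secrets.compare_digest(x_agent_secret, AGENT_SECRET):
        raise HTTPException(status_code=401, detail="invalid or missing X-Agent-Secret header")

# Applying the dependency app-wide protects every route behind the tunnel
app = FastAPI(dependencies=[Depends(require_secret)])

@app.post("/run")
async def run(payload: dict):
    return {"ok": True}  # replace with your agent logic

Callers then send the header on every request, which lines up with the shared-secret tip in the next section.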

⚡ Tips for Running AI Agents Behind a Tunnel

🌐 Reserve a fixed subdomain Go to the dashboard and reserve a custom subdomain so your webhook URLs survive tunnel restarts. OpenAI's function-calling and Slack event subscriptions break if the URL changes.
⏱ Use streaming for long agent runs Agentic loops that run longer than 30-60 seconds will time out on standard HTTP. Use SSE or WebSocket transport in your framework instead of relying on a single blocking POST; the tunnel forwards streaming connections on the same port with no extra configuration.
🛠 Keep the Client running as a service On Linux, register the Localtonet Client as a systemd service so it restarts automatically after reboots. Your tunnels (managed from the dashboard) stay active even after the development machine sleeps and wakes.
🔒 Add a shared secret to every route Pass a custom header (e.g., X-Agent-Secret: your-secret-here) from all callers and validate it in a FastAPI dependency. Rejects unauthenticated requests before they reach your model.
📡 Expose multiple frameworks on separate tunnels Run CrewAI on port 8000, an MCP server on port 3000, and AutoGen on port 8001. Create three separate HTTP tunnels in the dashboard, one per port. Each gets its own public URL and runs independently.
💻 Docker: add the Client as a Compose service Add a Localtonet Client container to your Compose file alongside your AI framework container. Set LOCALTONET_TOKEN as an environment variable. The Client authenticates at startup, then tunnel management stays in the dashboard.

Tunnel Options for AI Agents: How They Compare

Feature | ngrok | Cloudflare Tunnel | Localtonet
Free plan available | ✔ | ✔ | ✔
Fixed subdomain (free tier) | ~ | ✔ (on your own domain) | ✔
No domain required | ✔ | ✘ (needs a domain) | ✔
IP allowlisting | ✔ (paid) | ✔ | ✔
TCP tunnels (for non-HTTP agents) | ✔ (paid) | ~ | ✔
Multiple tunnels simultaneously | ~ (limited on free) | ✔ | ✔
No Cloudflare account needed | ✔ | ✘ | ✔
Request inspection dashboard | ✔ | ✘ | ~
Price (entry paid tier) | $10/month | Free (needs domain ~$10/yr) | $3/month

ngrok has the most mature inspection UI and is the most widely documented option. If your team already pays for it, there's no reason to switch. Cloudflare Tunnel is excellent when you own a domain and want to integrate with Cloudflare Access for zero-trust auth. The setup takes longer and requires a Cloudflare account. Localtonet wins on price, requires no domain, and gives you a fixed subdomain on the free plan, which matters a lot for AI agent webhooks that need a stable URL.

Frequently Asked Questions

How do I expose an AI agent to the internet without a static IP?

Use a reverse tunnel. Install the Localtonet Client on the same machine as your AI framework, log in with your token, then create and start an HTTP tunnel from the dashboard pointing at the port your framework listens on (e.g., 8000). Localtonet assigns a public HTTPS URL that routes traffic to your local port. No static IP and no router changes needed.

Can I expose an MCP server to remote Claude Desktop clients?

Yes. Start your MCP server with HTTP+SSE transport on a local port (default 3000), then tunnel that port with Localtonet. Update the claude_desktop_config.json on the remote client to point the server URL at your public tunnel URL (e.g., https://abc123.localto.net/sse). The remote Claude Desktop instance will connect to your locally running MCP server as if it were on the same machine.

Does the public URL change every time I restart the tunnel?

On the free plan with a random subdomain, yes, the URL changes on restart. If your agent receives webhooks from OpenAI, Slack, or other services, a changing URL means you'd need to re-register it every time. Reserve a custom subdomain in the Localtonet dashboard to get a stable URL that survives restarts.

Is it safe to expose a local AI agent over a tunnel?

The tunnel itself uses TLS encryption and doesn't open any inbound ports on your machine. The risk is at the application layer: anyone who knows your URL can call your agent endpoints. Mitigate this by enabling IP allowlisting in the Localtonet dashboard and adding a shared secret header check in your FastAPI routes. For production-grade deployments, also consider adding rate limiting at the framework level.

Can I run multiple AI agent frameworks on separate tunnels at the same time?

Yes. Each framework should listen on a different port (e.g., LangChain on 8000, MCP server on 3000, AutoGen on 8001). Create a separate HTTP tunnel in the Localtonet dashboard for each port. Each tunnel gets its own public URL and operates independently.

My agent runs long tasks. Will the tunnel connection time out?

Standard HTTP connections time out after 30-60 seconds on most proxies. For agent tasks that take longer, switch to streaming responses using Server-Sent Events (SSE) or WebSocket transport. LangServe, AutoGen, and the OpenAI Agent SDK all support streaming. The tunnel handles SSE and WebSocket connections without any special configuration.

How do I expose a CrewAI agent for remote access?

CrewAI doesn't ship a built-in HTTP server. Wrap your crew.kickoff() call inside a FastAPI POST endpoint and bind FastAPI to 0.0.0.0:8000. Then in the Localtonet dashboard, create an HTTP tunnel for port 8000 and start it. Remote callers POST to your public URL, the FastAPI handler triggers the crew, and the result returns in the HTTP response.

Does Localtonet work on a machine behind a VPN or corporate proxy?

In most cases yes, because the Localtonet Client connects outbound over HTTPS (port 443), which corporate proxies and VPNs typically allow. If your environment blocks outbound connections to non-standard ports or requires proxy authentication, check the Localtonet documentation for proxy configuration options for the Client.

Ready to expose your AI agent to the internet?

Get a free Localtonet account and have your local agent reachable at a public HTTPS URL in under five minutes. No credit card required.

Get Started Free →

Localtonet is a secure multi-protocol tunneling and proxy platform designed to expose localhost, devices, private services, and AI agents to the public internet. It supports HTTP/HTTPS tunnels, TCP/UDP forwarding, mobile proxy infrastructure, file server publishing, latency-optimized game connectivity, and developer-ready AI agent endpoint exposure from a single unified control plane.
