We run a small team that uses AI heavily for research, coding, and day-to-day work. We already had a self-hosted Ollama server running on a dedicated Ubuntu machine with five large language models loaded. Our team connects to it from various locations — browser via Open WebUI, mobile via VPN, and from agent frameworks like OpenClaw running on separate VMs.
The problem was simple: our models had no access to the internet. Ask them about current events, today's prices, recent news — and they would either hallucinate or admit they did not know. We wanted to fix this without giving up control of our infrastructure, without paying per-query API fees, and without routing our queries through third-party services.
The goal: give our models live web access while keeping control of our infrastructure, paying no per-query API fees, and routing nothing through third-party services.
This guide documents exactly how we built it, what broke along the way, and how we fixed it.
Before we start, here is the full picture of what we are building:
Open WebUI connects directly to:
- Ollama (:11434) for inference
- SearXNG (:8080) for web search

All other clients connect to :11436 (the middleware), which talks to:
- Ollama (:11434) for inference
- SearXNG (:8080) to execute searches

LiteLLM runs on :11435, connecting to :11434 (Ollama) — it provides the management dashboard and is optional for chat.
Key components:
- Ollama (:11434) — LLM inference
- SearXNG (:8080) — self-hosted meta search
- Python middleware (:11436) — authentication plus web search for API clients
- LiteLLM (:11435) — management dashboard
- Open WebUI — browser front end
Why not route everything through one component?
Open WebUI already handles Ollama and web search natively, and it handles thinking model responses correctly. There was no benefit to adding an extra layer for it. For other clients (agent frameworks, API callers, mobile apps), the middleware provides a single authenticated endpoint with web search built in.
Our server (brain01):
We need Docker to run SearXNG. Do not use the snap version — it causes permission issues. Use the official Docker repository.
Why we are doing this: SearXNG is distributed as a Docker image. Running it in Docker is the cleanest way to manage it — isolated, easily restartable, and upgradeable without affecting the host system.
sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) \
  signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo usermod -aG docker $USER
newgrp docker
Verify the installation:
docker --version
# Expected: Docker version 29.x.x
What SearXNG is: SearXNG is a self-hosted meta search engine. It does not maintain its own web index. Instead, it sends your query simultaneously to multiple real search engines (Google, Bing, DuckDuckGo, Wikipedia, and others), collects the results, deduplicates them, and returns a combined response.
Why this matters for privacy: Your team's search queries never get associated with individual users' IP addresses or browser fingerprints. The search engines see traffic from one server, not from five people in different locations.
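The merge-and-deduplicate step is easy to picture in code. Below is an illustration of the idea only (a sketch, not SearXNG's actual implementation): combine result lists from several engines, keeping the first occurrence of each URL.

```python
# Sketch of the meta-search merge step (illustration, not SearXNG's code):
# combine per-engine result lists, deduplicating by URL and keeping the
# first occurrence of each.
def merge_results(*engine_results):
    seen, merged = set(), []
    for results in engine_results:
        for r in results:
            if r["url"] not in seen:
                seen.add(r["url"])
                merged.append(r)
    return merged

google = [{"url": "https://a.com", "title": "A"}]
bing = [
    {"url": "https://a.com", "title": "A (duplicate)"},
    {"url": "https://b.com", "title": "B"},
]
print(merge_results(google, bing))  # a.com appears once, then b.com
```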
Launch the container:
sudo mkdir -p /opt/searxng
sudo docker run -d \
  --name searxng \
  --restart always \
  -p 8080:8080 \
  -e SEARXNG_BASE_URL="http://YOUR_SERVER_IP:8080/" \
  searxng/searxng:latest
The --restart always flag means SearXNG will automatically restart if it crashes or if the server reboots.
The gotcha: by default, SearXNG returns a 403 Forbidden error when you request JSON output, because only the HTML format is enabled in the default configuration. Our middleware needs JSON, so we must enable it.
# Copy the config file out
sudo docker cp searxng:/etc/searxng/settings.yml /opt/searxng/settings.yml
# Add JSON to formats
sudo sed -i 's/^\(\s*\)- html$/\1- html\n\1- json/' /opt/searxng/settings.yml
# Restart with config mounted
sudo docker stop searxng && sudo docker rm searxng
sudo docker run -d --name searxng --restart always \
  -p 8080:8080 \
  -e SEARXNG_BASE_URL="http://YOUR_SERVER_IP:8080/" \
  -v /opt/searxng/settings.yml:/etc/searxng/settings.yml \
  searxng/searxng:latest
Test: curl "http://localhost:8080/search?q=test&format=json"
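A response in that format can be reduced to just the fields a downstream consumer needs. A minimal Python sketch (the field names `results`, `title`, `url`, and `content` follow SearXNG's JSON output; the helper name is ours):

```python
def top_results(payload: dict, k: int = 3) -> list:
    """Pick the top-k hits out of a SearXNG JSON response.

    SearXNG returns a `results` list whose entries carry `title`,
    `url`, and a `content` snippet.
    """
    return [
        {
            "title": r.get("title", ""),
            "url": r.get("url", ""),
            "snippet": r.get("content", ""),
        }
        for r in payload.get("results", [])[:k]
    ]

# Hand-made payload shaped like a SearXNG JSON response:
sample = {"results": [
    {"title": "Example", "url": "https://example.com", "content": "An example page."},
]}
print(top_results(sample))
```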
Install LiteLLM for management, write the Python middleware for tool calling loops, and set up systemd services for auto-restart. The middleware handles authentication, injects web_search tools, executes searches via SearXNG, and returns OpenAI-compatible responses.
Full implementation details, complete Python code (200+ lines), LiteLLM config, and systemd service files are available in the original PDF guide.
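As a rough illustration of the tool-calling loop described above (all names here are hypothetical, not the code from the guide), the control flow looks like this, with the Ollama call and the SearXNG search injected as callables:

```python
# Sketch of the middleware's tool-calling loop (hypothetical names; the
# real ~200-line implementation is in the original PDF guide).
WEB_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web via the local SearXNG instance.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def chat_with_search(messages, call_ollama, run_search, max_rounds=3):
    """Ask the model; while it requests web_search, execute the search
    and feed results back, until it answers in plain text."""
    messages = list(messages)
    for _ in range(max_rounds):
        reply = call_ollama(messages, tools=[WEB_SEARCH_TOOL])
        calls = reply.get("tool_calls")
        if not calls:
            return reply.get("content", "")
        messages.append({"role": "assistant", "tool_calls": calls})
        for call in calls:
            query = call["function"]["arguments"]["query"]
            messages.append({"role": "tool", "content": run_search(query)})
    return ""  # give up after max_rounds of tool calls
```

The injected callables keep the loop itself independent of HTTP details, so it can be exercised with stubs before wiring it to Ollama's `/api/chat` and SearXNG.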
Critical lesson: UFW processes rules top-to-bottom. Don't add explicit DENY rules for ports before ALLOW rules for trusted IPs — the denies will fire first and block your access.
Correct approach: Add ALLOW rules for trusted IPs, rely on UFW's default deny policy for everything else.
sudo ufw allow from YOUR_VPN_IP
sudo ufw allow from YOUR_HOME_IP
sudo ufw allow 22/tcp
sudo ufw enable
Configure Open WebUI to connect directly to Ollama (:11434) and SearXNG (:8080). Enable web search in Admin Panel → Settings → Web Search. Use the sparkle icon (✦) in conversations to enable search per-chat.
1. LiteLLM strips content field from thinking models
Symptom: Empty responses from Qwen3
Fix: Bypass LiteLLM for chat — middleware calls Ollama's /api/chat directly
2. SearXNG 403 on JSON requests
Cause: JSON disabled by default
Fix: Add - json to formats in settings.yml, mount as volume
3. UFW blocks trusted IPs
Cause: Explicit DENY rules before ALLOW rules
Fix: Delete explicit denies, rely on default policy
4. OpenClaw web search doesn't work
Cause: OpenClaw intercepts tool calls instead of passing them through to Ollama
Status: Works for non-search tasks; web search incompatible without toolCallPassthrough config
| Port | Service | Purpose |
|---|---|---|
| 8080 | SearXNG | Meta search (internal only) |
| 11434 | Ollama | LLM inference |
| 11435 | LiteLLM | Management dashboard |
| 11436 | Middleware | Auth + web search for API clients |
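A quick sanity check that all four services are answering (assumed localhost URLs; adjust host and paths to your deployment — this helper is ours, not part of the stack):

```python
# Minimal health-check sketch for the services in the table above.
# Any HTTP response (even an error status) counts as "up".
import urllib.error
import urllib.request

SERVICES = {
    "SearXNG": "http://localhost:8080/",
    "Ollama": "http://localhost:11434/",
    "LiteLLM": "http://localhost:11435/",
    "Middleware": "http://localhost:11436/",
}

def check(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers any HTTP status at all."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True  # server responded, just not 200 — still alive
    except Exception:
        return False

if __name__ == "__main__":
    for name, url in SERVICES.items():
        print(f"{name:<11} {'UP' if check(url) else 'DOWN'}")
```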
Client routing:
- Open WebUI → Ollama (:11434) and SearXNG (:8080) directly
- Everything else (agent frameworks, API callers, mobile) → Middleware (:11436)
Published by SwissLayer — self-hosted infrastructure, no $999 stand required.
Need help setting up self-hosted AI infrastructure? Contact us for dedicated servers, VPS, and AI/ML hosting in Switzerland.