# Local AI Stack on Unraid

**Status:** ✅ Deployed  
**Last Updated:** 2026-01-26

---

## Current Deployment

| Component | Status | URL/Port |
|-----------|--------|----------|
| Ollama | ✅ Running | http://192.168.31.2:11434 |
| Open WebUI | ✅ Running | http://192.168.31.2:3080 |
| Intel GPU | ✅ Enabled | /dev/dri passthrough |

### Models Installed

| Model | Size | Type |
|-------|------|------|
| qwen2.5-coder:7b | 4.7 GB | Base coding LLM |
| unraid-assistant | 4.7 GB | Custom model with infrastructure knowledge |

---

## Custom Model: unraid-assistant

A model with a custom system prompt that knows the xtrm-lab.org infrastructure:

### Knowledge Included

- **Network topology**: All VLANs (10, 20, 25, 30, 31, 35, 40, 50), IPs, gateways
- **45+ Docker containers**: Names, images, ports, purposes
- **RouterOS 7**: Commands, VLAN patterns, firewall rules
- **Traefik**: Labels, routing, SSL configuration
- **Authentik**: SSO middleware, provider setup
- **External URLs**: All xtrm-lab.org services

### Usage

```bash
# Terminal (SSH to Unraid)
ai "How do I add a device to the IoT VLAN?"
ai "What port is gitea running on?"
ai "Show me Traefik labels for a new app with Authentik"

# Interactive mode
ai
```

### Rebuild Model

If the infrastructure changes, update the Modelfile and rebuild:

```bash
# Edit the Modelfile (host path; mounted into the container under /root/.ollama)
nano /mnt/user/appdata/ollama/Modelfile-unraid

# Rebuild
docker exec ollama ollama create unraid-assistant -f /root/.ollama/Modelfile-unraid
```

---

## Hardware

| Component | Spec |
|-----------|------|
| CPU | Intel N100 (4 cores) |
| RAM | 16 GB (shared with Docker) |
| GPU | Intel UHD (iGPU via /dev/dri) |
| Storage | 1.7 TB free on array |

### Performance

- ~1 token/sec with 7B models
- Responses take 30-90 seconds
- Suitable for occasional use, not real-time chat

---

## Containers Stopped for RAM

To free ~4.8 GB for AI workloads, these containers were stopped:

| Container | RAM Freed | Purpose |
|-----------|-----------|---------|
| karakeep | 1.68 GB | Bookmark manager |
| unimus | 1.62 GB | Network backup |
| homarr | 686 MB | Dashboard |
| netdisco-web | 531 MB | Network discovery UI |
| netdisco-backend | 291 MB | Network discovery |

To restart them if needed:

```bash
docker start karakeep unimus homarr netdisco-web netdisco-backend
```

---

## Docker Configuration

### Ollama

```bash
docker run -d \
  --name ollama \
  --restart unless-stopped \
  --device /dev/dri \
  -v /mnt/user/appdata/ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama
```

### Open WebUI

```bash
docker run -d \
  --name open-webui \
  --restart unless-stopped \
  -p 3080:8080 \
  -e OLLAMA_BASE_URL=http://192.168.31.2:11434 \
  -v /mnt/user/appdata/open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```

### AI Command Helper

```bash
#!/bin/bash
# /usr/local/bin/ai
MODEL="unraid-assistant"

if [ $# -eq 0 ]; then
  # No arguments: open an interactive session
  docker exec -it ollama ollama run "$MODEL"
else
  # Arguments are joined into a single prompt
  docker exec ollama ollama run "$MODEL" "$*"
fi
```

---

## Open WebUI RAG Setup

For detailed documentation beyond what the system prompt covers:

1. Go to http://192.168.31.2:3080
2. **Workspace** → **Knowledge** → **+ Create**
3. Name: `Infrastructure`
4. Upload docs from `/mnt/user/appdata/open-webui/docs/`

Infrastructure docs are pre-copied to that location.

---

## Future: N5 Air (Ryzen AI 5 255) Upgrade

The plan is to migrate the AI stack to an N5 Air with a Ryzen AI 5 255:

| Metric | N100 (current) | Ryzen AI 5 255 |
|--------|----------------|----------------|
| Speed | ~1 tok/s | ~5-8 tok/s |
| Max model | 7B | 14B-32B |
| Response time | 30-90s | 5-15s |

The chip features an XDNA NPU (16 TOPS) for potential AI acceleration. DDR5 memory and the 6c/12t CPU are expected to improve inference significantly.

---

## Files

| File | Purpose |
|------|---------|
| /mnt/user/appdata/ollama/Modelfile-unraid | Custom model definition |
| /usr/local/bin/ai | Terminal helper command |
| /mnt/user/appdata/open-webui/docs/ | RAG documents |
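
---

## Appendix: Checking Throughput

The ~1 token/sec figure under Performance can be checked against Ollama's own statistics: a non-streaming `/api/generate` response reports `eval_count` (generated tokens) and `eval_duration` (nanoseconds). The sketch below is a rough illustration, not part of the deployed setup. The function names `tokens_per_sec` and `measure` and the `OLLAMA_HOST_URL` variable are hypothetical, and it assumes `curl` and `jq` are available on the host.

```bash
#!/bin/bash
# Sketch: derive tokens/sec from Ollama's generation statistics.

# tokens_per_sec EVAL_COUNT EVAL_DURATION_NS
# eval_duration is reported in nanoseconds, so divide by 1e9 to get seconds.
tokens_per_sec() {
  awk -v n="$1" -v ns="$2" 'BEGIN {
    if (ns <= 0) { print "n/a"; exit }
    printf "%.2f\n", n / (ns / 1e9)
  }'
}

# measure MODEL PROMPT
# Hypothetical helper: sends one non-streaming request and prints tokens/sec.
# OLLAMA_HOST_URL is an assumed override; the default is the URL from this doc.
measure() {
  local host="${OLLAMA_HOST_URL:-http://192.168.31.2:11434}"
  local stats
  stats=$(curl -s "$host/api/generate" \
    -d "$(jq -n --arg m "$1" --arg p "$2" '{model: $m, prompt: $p, stream: false}')" \
    | jq -r '"\(.eval_count) \(.eval_duration)"') || return 1
  # Word splitting is intended: stats holds "<count> <duration_ns>".
  tokens_per_sec $stats
}
```

After sourcing the file, `measure unraid-assistant "What port is gitea running on?"` prints a single number that can be compared with the table values. Only generation time is counted, so model load and prompt processing are excluded and wall-clock response time will be longer.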