# Local AI Stack on Unraid

**Status:** ✅ Deployed
**Last Updated:** 2026-01-26


---

## Current Deployment

| Component | Status | URL/Port |
|-----------|--------|----------|
| Ollama | ✅ Running | http://192.168.31.2:11434 |
| Open WebUI | ✅ Running | http://192.168.31.2:3080 |
| Intel GPU | ✅ Enabled | /dev/dri passthrough |

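The status column can be spot-checked from any machine on the same network. The sketch below is a minimal, hypothetical helper (`check_service` is not part of the deployed setup) that assumes `curl` is available and treats any successful HTTP response as "up".

```bash
# Hypothetical helper: report whether an HTTP endpoint answers within 5 seconds.
check_service() {
    local name="$1" url="$2"
    # -f: treat HTTP error codes as failures; -sS: no progress bar, but show errors
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
        echo "$name: up"
    else
        echo "$name: DOWN"
        return 1
    fi
}
```

For example, `check_service Ollama http://192.168.31.2:11434/api/tags` queries Ollama's model-list endpoint, and `check_service "Open WebUI" http://192.168.31.2:3080` checks the web interface.
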
### Models Installed

| Model | Size | Type |
|-------|------|------|
| qwen2.5-coder:7b | 4.7 GB | Base coding LLM |
| unraid-assistant | 4.7 GB | Custom model with infrastructure knowledge |

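To confirm that a model from the table is actually present, the output of `ollama list` can be parsed. This is a sketch only: `model_installed` is a hypothetical helper, and it assumes the container is named `ollama` as in the Docker configuration in this document.

```bash
# Hypothetical helper: succeed if the named model appears in `ollama list`.
model_installed() {
    local model="$1"
    # The first column of `ollama list` is NAME (e.g. "unraid-assistant:latest").
    # Skip the header row and compare with and without the ":latest" tag.
    docker exec ollama ollama list | awk -v m="$model" '
        NR > 1 {
            name = $1
            sub(/:latest$/, "", name)
            if ($1 == m || name == m) found = 1
        }
        END { exit !found }'
}
```

A call such as `model_installed unraid-assistant && echo present` prints `present` only when the model is listed.
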

---

## Custom Model: unraid-assistant

A model with a custom system prompt (defined in a Modelfile, not fine-tuned weights) that knows the xtrm-lab.org infrastructure:

### Knowledge Included

- **Network topology**: All VLANs (10, 20, 25, 30, 31, 35, 40, 50), IPs, gateways
- **45+ Docker containers**: Names, images, ports, purposes
- **RouterOS 7**: Commands, VLAN patterns, firewall rules
- **Traefik**: Labels, routing, SSL configuration
- **Authentik**: SSO middleware, provider setup
- **External URLs**: All xtrm-lab.org services

### Usage

```bash
# Terminal (SSH to Unraid)
ai "How do I add a device to the IoT VLAN?"
ai "What port is gitea running on?"
ai "Show me Traefik labels for a new app with Authentik"

# Interactive mode
ai
```

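The same model can also be reached without SSH through Ollama's HTTP API on port 11434. The following is a rough sketch, not part of the deployed setup: `build_generate_payload` and `ask_api` are hypothetical names, and the JSON escaping is deliberately minimal.

```bash
# Build a JSON body for Ollama's /api/generate endpoint.
# Only backslashes and double quotes are escaped; prompts containing newlines
# or control characters would need a real JSON encoder such as jq.
build_generate_payload() {
    local model="$1" prompt="$2" escaped
    escaped=$(printf '%s' "$prompt" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"model":"%s","prompt":"%s","stream":false}' "$model" "$escaped"
}

# Send one prompt to the custom model and print the raw JSON reply.
ask_api() {
    curl -fsS http://192.168.31.2:11434/api/generate \
        -d "$(build_generate_payload unraid-assistant "$1")"
}
```

Calling `ask_api "What port is gitea running on?"` returns a JSON object whose `response` field holds the generated text.
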
### Rebuild Model

If the infrastructure changes, update the Modelfile and rebuild the model:

```bash
# Edit the Modelfile (host path)
nano /mnt/user/appdata/ollama/Modelfile-unraid

# Rebuild (the appdata directory is mounted at /root/.ollama inside the container)
docker exec ollama ollama create unraid-assistant -f /root/.ollama/Modelfile-unraid
```

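The contents of the real Modelfile are not reproduced here. As an illustration of the general shape such a file takes, the sketch below writes a small example; the `FROM` line, the `temperature` value, and the `SYSTEM` text are all assumptions rather than the deployed configuration, and `write_example_modelfile` is a hypothetical helper.

```bash
# Write an illustrative Modelfile to the given path.
# The FROM line, parameter, and SYSTEM text are assumptions, not the deployed file.
write_example_modelfile() {
    cat > "$1" <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER temperature 0.3
SYSTEM """
You are the assistant for the xtrm-lab.org homelab.
VLANs: 10, 20, 25, 30, 31, 35, 40, 50.
Ollama runs at http://192.168.31.2:11434 and Open WebUI at http://192.168.31.2:3080.
"""
EOF
}
```

Write it to a scratch location, for example `write_example_modelfile /tmp/Modelfile-example`, so the real `Modelfile-unraid` is not overwritten.
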

---

## Hardware

| Component | Spec |
|-----------|------|
| CPU | Intel N100 (4 cores) |
| RAM | 16 GB (shared with Docker) |
| GPU | Intel UHD (iGPU via /dev/dri) |
| Storage | 1.7 TB free on array |

### Performance

- ~1 token/sec with 7B models
- Responses take 30-90 seconds
- Suitable for occasional use, not real-time chat

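The throughput figure can be re-measured rather than estimated. A non-streaming reply from Ollama's `/api/generate` endpoint includes `eval_count` (tokens generated) and `eval_duration` (nanoseconds), and `ollama run --verbose` prints a similar eval rate. The sketch below only does the arithmetic; `tokens_per_second` is a hypothetical helper name.

```bash
# Convert Ollama's eval_count (tokens) and eval_duration (nanoseconds)
# into tokens per second, printed with two decimals.
tokens_per_second() {
    awk -v n="$1" -v ns="$2" 'BEGIN { printf "%.2f\n", n / (ns / 1e9) }'
}
```

For instance, 60 tokens generated in 60 seconds (`tokens_per_second 60 60000000000`) works out to `1.00`, matching the ~1 token/sec figure above.
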

---

## Containers Stopped for RAM

The following containers were stopped to free ~4.8 GB of RAM for AI workloads:

| Container | RAM Freed | Purpose |
|-----------|-----------|---------|
| karakeep | 1.68 GB | Bookmark manager |
| unimus | 1.62 GB | Network backup |
| homarr | 686 MB | Dashboard |
| netdisco-web | 531 MB | Network discovery UI |
| netdisco-backend | 291 MB | Network discovery |

To restart them if needed:

```bash
docker start karakeep unimus homarr netdisco-web netdisco-backend
```

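Stopping and restarting the group can be wrapped in one small function so the list of names lives in a single place. This is a sketch under the assumption that the five containers in the table are always toggled together; `AI_PAUSED` and `toggle_ai_containers` are hypothetical names.

```bash
# Containers listed in the table as stopped for AI workloads.
AI_PAUSED="karakeep unimus homarr netdisco-web netdisco-backend"

# Stop or start the whole group: pass "stop" or "start".
toggle_ai_containers() {
    case "$1" in
        # $AI_PAUSED is unquoted on purpose so each name becomes its own argument.
        stop|start) docker "$1" $AI_PAUSED ;;
        *) echo "usage: toggle_ai_containers stop|start" >&2; return 2 ;;
    esac
}
```

`toggle_ai_containers stop` frees the RAM before a session, and `toggle_ai_containers start` brings the services back afterwards.
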

---

## Docker Configuration

### Ollama

```bash
docker run -d \
  --name ollama \
  --restart unless-stopped \
  --device /dev/dri \
  -v /mnt/user/appdata/ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama
```

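Passing `--device /dev/dri` only exposes the GPU device nodes to the container. A quick way to confirm that the passthrough took effect is to look for a render node inside it. The sketch below assumes the container name `ollama` from the command above; `render_node_visible` is a hypothetical helper.

```bash
# Succeed if a DRM render node (e.g. renderD128) is visible inside the container.
render_node_visible() {
    docker exec ollama ls /dev/dri 2>/dev/null | grep -q '^renderD'
}
```

A visible device does not by itself show that Ollama offloads work to it. While a model is loaded, the PROCESSOR column of `docker exec ollama ollama ps` reports whether it is running on CPU or GPU.
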
### Open WebUI

```bash
docker run -d \
  --name open-webui \
  --restart unless-stopped \
  -p 3080:8080 \
  -e OLLAMA_BASE_URL=http://192.168.31.2:11434 \
  -v /mnt/user/appdata/open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```

### AI Command Helper

```bash
#!/bin/bash
# /usr/local/bin/ai
# No arguments: open an interactive session. Otherwise: send all arguments as one prompt.
MODEL="unraid-assistant"
if [ $# -eq 0 ]; then
    docker exec -it ollama ollama run "$MODEL"
else
    docker exec ollama ollama run "$MODEL" "$*"
fi
```


---

## Open WebUI RAG Setup

To give the model detailed documentation beyond what the system prompt covers, create a knowledge base in Open WebUI:

1. Go to http://192.168.31.2:3080
2. **Workspace** → **Knowledge** → **+ Create**
3. Name: `Infrastructure`
4. Upload docs from `/mnt/user/appdata/open-webui/docs/`

The infrastructure docs are already copied to that location.


---

## Future: N5 Air (Ryzen AI 5 255) Upgrade

The plan is to migrate the AI stack to an N5 Air with a Ryzen AI 5 255. Expected figures:

| Metric | N100 (current) | Ryzen AI 5 255 (expected) |
|--------|----------------|---------------------------|
| Speed | ~1 tok/s | ~5-8 tok/s |
| Max model | 7B | 14B-32B |
| Response time | 30-90s | 5-15s |

The chip features an XDNA NPU (16 TOPS) for potential AI acceleration. DDR5 memory and a 6-core/12-thread CPU should significantly improve inference speed.

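The response-time row follows roughly from the speed row: generation time is approximately reply length divided by throughput. The sketch below does that division for a reply length chosen purely for illustration. `estimate_seconds` is a hypothetical helper, and it ignores model load and prompt processing time.

```bash
# Estimated seconds to generate a reply of N tokens at a given tokens/sec rate.
estimate_seconds() {
    awk -v tokens="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", tokens / rate }'
}
```

For an assumed 60-token reply, `estimate_seconds 60 1` gives `60.0` on the current hardware, while `estimate_seconds 60 5` gives `12.0` and `estimate_seconds 60 8` gives `7.5` at the projected rates.
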

---

## Files

| File | Purpose |
|------|---------|
| /mnt/user/appdata/ollama/Modelfile-unraid | Custom model definition |
| /usr/local/bin/ai | Terminal helper command |
| /mnt/user/appdata/open-webui/docs/ | RAG documents |