Major documentation restructure - consolidated docs
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful

New Structure:
- 01-NETWORK-MAP.md - Network topology, IPs, Docker networks, services
- 02-SERVICES-CRITICAL.md - DNS, Auth, Routing (P0/P1 services)
- 03-SERVICES-OTHER.md - All non-critical services
- 04-HARDWARE-INVENTORY.md - Physical devices and specs
- 05-CHANGELOG.md - Major events only

New Folders:
- docs/archive/ - Legacy docs (read-only reference)
- docs/wip/ - Planned changes and ideas
  - UPGRADE-2026-HARDWARE.md - N5 Air + N100 migration plan
  - GITOPS-CONTAINERS.md - Phase 2 container GitOps

Changes:
- Moved all 22 legacy docs to archive/
- Consolidated container IPs, physical map, and services into single network map
- Extracted critical vs non-critical service classification
- Simplified changelog to major events only

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 11:17:33 +02:00
parent ed17dea2d6
commit b250493d5a
31 changed files with 1585 additions and 23 deletions


@@ -1,31 +1,83 @@
# XTRM-Lab Infrastructure
# XTRM Home Lab Infrastructure
This repository contains infrastructure documentation and CI/CD pipelines for the xtrm-lab.org homelab.
**Domain:** xtrm-lab.org
**Repository:** https://git.xtrm-lab.org/jazzymc/infrastructure
## Documentation
---
All infrastructure documentation is in the `docs/` folder:
## Quick Reference
| Document | Description |
|----------|-------------|
| [00-CURRENT-STATE.md](docs/00-CURRENT-STATE.md) | Current infrastructure state |
| [01-PHASE1-DNS-PORTABILITY.md](docs/01-PHASE1-DNS-PORTABILITY.md) | DNS & Pi-hole setup |
| [02-PHASE2-FOSSORIAL-STACK.md](docs/02-PHASE2-FOSSORIAL-STACK.md) | Pangolin tunnel stack |
| [03-PHASE3-AUTHENTIK-ZEROTRUST.md](docs/03-PHASE3-AUTHENTIK-ZEROTRUST.md) | Authentik SSO |
| [04-PHASE4-REMOTE-GAMING.md](docs/04-PHASE4-REMOTE-GAMING.md) | Sunshine/Moonlight |
| [05-PHASE5-RUSTDESK.md](docs/05-PHASE5-RUSTDESK.md) | RustDesk setup |
| [06-PHASE6-PORTAINER-MANAGEMENT.md](docs/06-PHASE6-PORTAINER-MANAGEMENT.md) | Portainer |
| [08-PHASE7-GITEA-GITOPS.md](docs/08-PHASE7-GITEA-GITOPS.md) | Gitea & Woodpecker CI |
| Resource | Address |
|----------|---------|
| **Dashboard** | https://xtrm-lab.org |
| **NetBox** | https://netbox.xtrm-lab.org |
| **Git** | https://git.xtrm-lab.org |
| **CI/CD** | https://ci.xtrm-lab.org |
| **DNS Primary** | dns.xtrm-lab.org |
| **DNS Secondary** | dns2.xtrm-lab.org |
---
## Documentation Structure
```
docs/
├── 01-NETWORK-MAP.md           # Network topology, IPs, Docker networks
├── 02-SERVICES-CRITICAL.md     # DNS, Auth, Routing - must stay up
├── 03-SERVICES-OTHER.md        # All other services
├── 04-HARDWARE-INVENTORY.md    # Physical devices, specs, serials
├── 05-CHANGELOG.md             # Major events only
├── wip/                        # Planned changes & ideas
│   ├── UPGRADE-2026-HARDWARE.md
│   └── GITOPS-CONTAINERS.md
└── archive/                    # Legacy docs (read-only)
```
---
## Key Devices
| Device | IP | Role |
|--------|-----|------|
| HAP1 | 192.168.31.1 | Router, DNS, WiFi Controller |
| XTRM-U | 192.168.31.2 | Production Server (Unraid) |
| CSS1 | 192.168.31.9 | Distribution Switch |
| ZX1 | 192.168.31.7 | Core Switch (2.5G) |
| CAP | 192.168.31.6 | Wireless Access Point |
---
## SSH Access
```bash
# Unraid
ssh -i ~/.ssh/id_ed25519_unraid root@192.168.31.2 -p 422
# MikroTik Router
ssh -i ~/.ssh/mikrotik_key -p 2222 unraid@192.168.31.1
```
---
## Emergency Recovery
1. **DNS down?** → Clients fall back to 192.168.31.4 (secondary)
2. **Internet down?** → Check HAP1 at 192.168.31.1
3. **Services down?** → Check Unraid at 192.168.31.2
4. **Full outage?** → See `02-SERVICES-CRITICAL.md` startup order
---
## Change Management
- **Major changes:** Document in `05-CHANGELOG.md`
- **Minor changes:** Git commit messages only
- **Planned work:** Create doc in `wip/` folder
---
## CI/CD
This repo uses Woodpecker CI. See [.woodpecker.yml](.woodpecker.yml) for the pipeline configuration.
Woodpecker CI at https://ci.xtrm-lab.org
## Services
| Service | URL |
|---------|-----|
| Gitea | https://git.xtrm-lab.org |
| Woodpecker CI | https://ci.xtrm-lab.org |
| Authentik | https://auth.xtrm-lab.org |
| Traefik | https://traefik.xtrm-lab.org |
Pipelines trigger on push to this repository.

docs/01-NETWORK-MAP.md

@@ -0,0 +1,340 @@
# Network Map - xtrm-lab.org
**Last Updated:** 2026-01-25
**Domain:** xtrm-lab.org
**WAN IP:** 62.73.120.142
---
## Quick Reference
| Resource | Address |
|----------|---------|
| **Dashboard** | https://xtrm-lab.org |
| **DNS Primary** | dns.xtrm-lab.org (HAP1) |
| **DNS Secondary** | dns2.xtrm-lab.org (XTRM-U) |
| **Unraid SSH** | `ssh -i ~/.ssh/id_ed25519_unraid root@192.168.31.2 -p 422` |
| **MikroTik SSH** | `ssh -i ~/.ssh/mikrotik_key -p 2222 unraid@192.168.31.1` |
---
## Network Topology
```mermaid
flowchart TB
subgraph Internet["Internet"]
ISP["ISP Fiber Gateway<br/>(Vivacom)<br/>62.73.120.x"]
end
subgraph Rack19["19&quot; Rack (3U)"]
HAP1["HAP1 | hAP ax³<br/>192.168.31.1"]
PP1["PP1 | 24-port"]
CSS1["CSS1 | CSS326-24G-2S+<br/>192.168.31.9"]
end
subgraph Rack10["10&quot; Rack (9U)"]
ZX1["ZX1 | ZX-SWTGW218AS<br/>192.168.31.7"]
PP2["PP2 | 12-port"]
XTRMU["XTRM-U<br/>192.168.31.2"]
end
subgraph Wireless["WiFi"]
CAP["CAP | cAP XL ac<br/>192.168.31.6"]
end
ISP -->|"H-1 WAN"| HAP1
HAP1 -->|"H-4 → ZX1-1"| ZX1
HAP1 -->|"H-3 → CSS1-1"| CSS1
ZX1 <-->|"⚡ 10G SFP+ ⚡"| CSS1
ZX1 -->|"ZX1-2/3 via PP2"| XTRMU
HAP1 -->|"H-2 via PP1"| CAP
CSS1 -->|"Ports 16-24"| PP1
```
---
## Physical Infrastructure
### Rack Layout
#### 10" Rack (9U)
| U | Device | Model | IP | Notes |
|---|--------|-------|-----|-------|
| U9 | Shelf + ISP Gateway | Vivacom ONT | 62.73.120.2 | WAN |
| U8 | PP2 | 10" 12-port Cat6a | - | Patch panel |
| U7 | Shelf + ZX1 | ZX-SWTGW218AS | 192.168.31.7 | 8x2.5G + 2x10G SFP+ |
| U6 | (empty) | - | - | Reserved for XTRM-N1 |
| U1-U4 | XTRM-U | NAS Server | 192.168.31.2 | 4x 2.5GbE NICs (2 bonded) |
#### 19" Rack (3U)
| U | Device | Model | IP | Notes |
|---|--------|-------|-----|-------|
| U3 | Shelf + HAP1 | hAP ax³ | 192.168.31.1 | Router + WiFi controller |
| U2.5 | PP1 | 19" 24-port Cat6a | - | Room connections |
| U1 | CSS1 | CSS326-24G-2S+ | 192.168.31.9 | 24x1G + 2x10G SFP+ |
### Backbone Links
| Link | From | To | Speed | Type |
|------|------|----|-------|------|
| **Primary** | ZX1-SFP1 | CSS1-SFP1 | 10G | SFP+ DAC |
| Router→Core | HAP1 H-4 | ZX1-1 | 2.5G | Cat6a |
| Router→Dist | HAP1 H-3 | CSS1-1 | 1G | Cat6a |
| Server Bond | ZX1-2/3 | XTRM-U via PP2 | 2x 2.5G | Cat6a |
---
## IP Address Allocation
### Network: 192.168.31.0/24
#### Infrastructure Devices
| IP | Device | Type | MAC |
|----|--------|------|-----|
| 192.168.31.1 | HAP1 \| hAP ax³ | Router | 78:9A:18:2C:A5:48 |
| 192.168.31.2 | XTRM-U | Server | A8:B8:E0:02:B6:15 |
| 192.168.31.6 | CAP \| cAP XL ac | Access Point | 18:FD:74:54:3D:BC |
| 192.168.31.7 | ZX1 \| ZX-SWTGW218AS | Switch | 1C:2A:A3:1E:78:67 |
| 192.168.31.9 | CSS1 \| CSS326-24G-2S+ | Switch | F4:1E:57:C9:BD:09 |
#### Containers (br0 Macvlan)
| IP | Container | Purpose |
|----|-----------|---------|
| 192.168.31.4 | AdGuard Home | DNS Secondary |
| 192.168.31.5 | Unbound | Recursive DNS (stopped) |
| 192.168.31.12 | TimeMachine | macOS backups |
#### DHCP Ranges
| Range | Purpose |
|-------|---------|
| 192.168.31.10-99 | Reserved (static) |
| 192.168.31.100-200 | DHCP Pool |
| 192.168.31.201-254 | Reserved |
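
The ranges above can be checked mechanically before a static lease is handed out. The sketch below is a hypothetical helper (`classify_lan_ip` is not part of this repo) that maps an address in 192.168.31.0/24 to its range. The table does not cover .1-.9; labelling that span `infrastructure` is an assumption drawn from the device tables above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify an address in 192.168.31.0/24 by the
# ranges in the DHCP table. The "infrastructure" label for .1-.9 is an
# assumption; the table itself does not name that range.
classify_lan_ip() {
  local ip="$1" last
  case "$ip" in
    192.168.31.*) last="${ip##*.}" ;;
    *) echo "outside-lan"; return 1 ;;
  esac
  # Reject empty or non-numeric host parts
  case "$last" in
    ''|*[!0-9]*) echo "invalid"; return 1 ;;
  esac
  # Reject inputs with extra dotted groups such as 192.168.31.1.5
  [ "$ip" = "192.168.31.$last" ] || { echo "invalid"; return 1; }
  last=$((10#$last))   # force base 10 so "08" is not read as octal
  if   (( last >= 1   && last <= 9   )); then echo "infrastructure"
  elif (( last >= 10  && last <= 99  )); then echo "reserved-static"
  elif (( last >= 100 && last <= 200 )); then echo "dhcp-pool"
  elif (( last >= 201 && last <= 254 )); then echo "reserved"
  else echo "invalid"; return 1
  fi
}
```

For example, `classify_lan_ip 192.168.31.12` (the TimeMachine container) prints `reserved-static`.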
---
## Docker Networks
### HAP1 (MikroTik Router)
**Network:** 172.17.0.0/16 (bridge)
| Container | IP | Purpose |
|-----------|-----|---------|
| AdGuard Home | 172.17.0.5 | DNS Primary (DoH/DoT/DoQ) |
| Tailscale | 172.17.0.4 | VPN mesh |
### XTRM-U (Unraid Server)
#### dockerproxy (172.18.0.0/16)
**Static IP Assignments:**
| Range | Purpose |
|-------|---------|
| 172.18.0.2-10 | Core Infrastructure |
| 172.18.0.11-15 | Security |
| 172.18.0.16-30 | Productivity |
| 172.18.0.31-40 | DevOps |
| 172.18.0.41-50 | NetDisco |
| 172.18.0.61-69 | NetBox |
| 172.18.0.70-79 | Diode Discovery |
**Core Infrastructure (172.18.0.2-10)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.2 | dockersocket | Docker socket proxy |
| 172.18.0.3 | traefik | Reverse proxy |
| 172.18.0.4 | homarr | Dashboard |
**Security (172.18.0.11-15)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.11 | authentik | Identity provider |
| 172.18.0.12 | authentik-worker | Background tasks |
| 172.18.0.13 | postgresql17 | Shared database |
| 172.18.0.14 | Redis | Shared cache/queue |
| 172.18.0.15 | vaultwarden | Password manager |
**Productivity (172.18.0.16-30)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.16 | actual-budget | Budget tracking |
| 172.18.0.17 | n8n | Workflow automation |
| 172.18.0.18 | Uptime-Kuma-API | Monitoring API |
| 172.18.0.19 | AutoKuma | Auto-monitor |
| 172.18.0.20 | UptimeKuma | Uptime monitoring |
| 172.18.0.21 | speedtest-tracker | Speed tests |
| 172.18.0.23 | Libation | Audiobooks |
| 172.18.0.24 | Nextcloud | Cloud storage |
| 172.18.0.25 | karakeep | Bookmarks |
| 172.18.0.26 | transmission | Torrent |
| 172.18.0.27 | adguardhome-sync | DNS sync |
**DevOps (172.18.0.31-40)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.31 | gitea | Git server |
| 172.18.0.32 | woodpecker-server | CI/CD server |
| 172.18.0.33 | woodpecker-agent | CI/CD agent |
**NetDisco (172.18.0.41-50)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.41 | netdisco-web | Web UI |
| 172.18.0.42 | netdisco-backend | SNMP poller |
**NetBox (172.18.0.61-69)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.61 | netbox | Web UI (DCIM/IPAM) |
| 172.18.0.62 | netbox-worker | Background tasks |
| 172.18.0.64 | netbox-redis-cache | Query cache |
**Diode Discovery (172.18.0.70-79)**
| IP | Container | Purpose |
|----|-----------|---------|
| 172.18.0.70 | diode-ingress | API Gateway |
| 172.18.0.71 | diode-ingester | Data ingestion |
| 172.18.0.72 | diode-reconciler | NetBox sync |
| 172.18.0.73 | diode-hydra | OAuth2 |
| 172.18.0.74 | diode-auth | Token service |
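
When a new container joins `dockerproxy`, the next unused address in its category can be derived from the tables above. A minimal sketch, assuming the used host octets are passed in by hand (`next_free_ip` is a hypothetical helper, and nothing here queries Docker):

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first unused 172.18.0.x address in a
# category range. Usage: next_free_ip START END [USED_OCTET...]
next_free_ip() {
  local start="$1" end="$2"
  shift 2
  local used=" $* " octet   # space-padded list for whole-word matching
  for (( octet = start; octet <= end; octet++ )); do
    if [[ "$used" != *" $octet "* ]]; then
      echo "172.18.0.$octet"
      return 0
    fi
  done
  return 1   # range is full
}
```

With the Productivity assignments listed above, `next_free_ip 16 30 16 17 18 19 20 21 23 24 25 26 27` prints `172.18.0.22`.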
#### Host Network Containers
| Container | Purpose |
|-----------|---------|
| plex | Media server (:32400) |
| unimus | Network config backup |
| UrBackup | Backup server |
| NetAlertX | Network scanner |
| HomeAssistant | Home automation |
#### Bridge Network (172.17.0.0/16)
| Container | Purpose |
|-----------|---------|
| portainer | Container management |
| rustdesk-hbbs | RustDesk signaling |
| rustdesk-hbbr | RustDesk relay |
---
## Port Forwarding (NAT)
| External Port | Destination | Service |
|---------------|-------------|---------|
| 80 | 192.168.31.2:8001 | Traefik HTTP |
| 443 | 192.168.31.2:44301 | Traefik HTTPS |
| 853 | 172.17.0.5:853 | AdGuard DoT |
| 8853 | 172.17.0.5:8853 | AdGuard DoQ |
| 32400 | 192.168.31.2:32400 | Plex |
| 51413 | 192.168.31.2:51413 | Transmission |
| 21115-21119 | 192.168.31.2 | RustDesk |
---
## DNS Architecture
```mermaid
flowchart TB
subgraph External["External Access"]
DOH["DoH: dns.xtrm-lab.org"]
DOT["DoT: dns.xtrm-lab.org:853"]
end
subgraph HAP1["HAP1 (Primary)"]
AGH1["AdGuard Home<br/>172.17.0.5"]
end
subgraph XTRMU["XTRM-U (Secondary)"]
AGH2["AdGuard Home<br/>192.168.31.4"]
end
subgraph Sync["Sync"]
SYNC["adguardhome-sync<br/>Every 30 min"]
end
DOH --> AGH1
DOT --> AGH1
AGH1 <-.->|sync| SYNC
SYNC <-.->|sync| AGH2
AGH1 --> Q9["Quad9 DoH"]
AGH2 --> Q9
```
---
## WiFi Networks
| SSID | Band | Security | Purpose |
|------|------|----------|---------|
| XTRM | 5GHz | WPA2/WPA3 | Primary devices |
| XTRM | 2.4GHz | WPA/WPA2 | Legacy support |
| XTRM2 | 2.4GHz | WPA/WPA2 | IoT devices |
**CAPsMAN:** HAP1 manages CAP access point
---
## External URLs
| Service | URL |
|---------|-----|
| Dashboard | https://xtrm-lab.org |
| Auth | https://auth.xtrm-lab.org |
| Git | https://git.xtrm-lab.org |
| CI/CD | https://ci.xtrm-lab.org |
| NetBox | https://netbox.xtrm-lab.org |
| Uptime | https://uptime.xtrm-lab.org |
| Plex | https://plex.xtrm-lab.org |
| Nextcloud | https://cloud.xtrm-lab.org |
| Vault | https://vault.xtrm-lab.org |
| NetDisco | https://netdisco.xtrm-lab.org |
---
## Room Outlets
| Room | Outlets | Switch Ports | Status |
|------|---------|--------------|--------|
| Living Room | L1, L2, L3 | CSS1-22/23/24 | Active |
| Main Bedroom | M1, M2, M3 | CSS1-19/20/21 | Active |
| Boys Room | B1, B2 | CSS1-17/18 | Active |
| Girls Room | G1 | CSS1-16 | Unused |
| Corridor | C1 (CAP) | HAP1 H-2 | Active |
---
## Shared Databases
### PostgreSQL 17 (172.18.0.13)
| Database | User | Consumer |
|----------|------|----------|
| authentik_db | authentik_user | Authentik |
| netbox | netbox_user | NetBox |
| gitea | gitea_user | Gitea |
| netdisco_db | netdisco_user | NetDisco |
| diode | diode_user | Diode Reconciler |
| hydra | hydra_user | Diode Hydra |
### Redis (172.18.0.14)
| Consumer | Purpose |
|----------|---------|
| Authentik | Session cache |
| NetBox Worker | Task queue |
| Diode | Ingestion queue |


@@ -0,0 +1,212 @@
# Critical Services
**Last Updated:** 2026-01-25
Services that must remain operational for network functionality and security.
---
## Priority Levels
| Priority | Meaning | Recovery Target |
|----------|---------|-----------------|
| **P0** | Network offline without this | < 5 minutes |
| **P1** | Major functionality impacted | < 1 hour |
---
## P0 - Network Core
### DNS (AdGuard Home)
| Instance | Host | IP | Role |
|----------|------|-----|------|
| Primary | HAP1 | 172.17.0.5 | Main DNS, DoH/DoT/DoQ |
| Secondary | XTRM-U | 192.168.31.4 | Failover DNS |
**Endpoints:**
- DoH: `https://dns.xtrm-lab.org/dns-query`
- DoT: `tls://dns.xtrm-lab.org:853`
- DoQ: `quic://dns.xtrm-lab.org:8853`
**Config Sync:** adguardhome-sync (every 30 min)
**Upstream:** Quad9 DoH (`https://dns10.quad9.net/dns-query`)
**Recovery:**
1. If primary fails → clients use secondary (192.168.31.4)
2. Restart container on HAP1: `/container/start adguardhome`
---
### Routing (HAP1)
| Function | Details |
|----------|---------|
| WAN | 62.73.120.142 via Vivacom fiber |
| LAN | 192.168.31.0/24 |
| NAT | Port forwarding to XTRM-U |
| Firewall | RouterOS firewall rules |
**Recovery:**
1. Physical access to HAP1
2. Reset: hold reset button 5s
3. Reconfigure via WinBox or SSH
---
### DHCP (HAP1)
| Setting | Value |
|---------|-------|
| Dynamic pool | 192.168.31.100-200 |
| Lease time | 24 hours |
**Static Leases:** Managed in RouterOS DHCP server
---
## P1 - Authentication & Secrets
### Authentik (SSO)
| Component | IP | Purpose |
|-----------|-----|---------|
| authentik | 172.18.0.11 | Web UI + OIDC |
| authentik-worker | 172.18.0.12 | Background tasks |
**URL:** https://auth.xtrm-lab.org
**Protects:**
- Traefik forward auth (all *.xtrm-lab.org)
- Gitea OAuth
- Woodpecker OAuth
- NetBox OAuth
- NetDisco SSO
**Recovery:**
```bash
cd /mnt/user/appdata/authentik
docker compose up -d
```
**Database:** postgresql17 (authentik_db)
---
### Vaultwarden (Passwords)
| Component | IP | Purpose |
|-----------|-----|---------|
| vaultwarden | 172.18.0.15 | Password manager |
**URL:** https://vault.xtrm-lab.org
**Data:** `/mnt/user/appdata/vaultwarden/`
**Recovery:**
```bash
docker start vaultwarden
```
**Backup:** Part of Unraid flash backup
---
## P1 - Reverse Proxy
### Traefik
| Component | IP | Ports |
|-----------|-----|-------|
| traefik | 172.18.0.3 | 8001→80, 44301→443 |
**Config:** `/mnt/user/appdata/traefik/`
- `traefik.yml` - Static config
- `dynamic.yml` - Routers & services
**TLS:** Let's Encrypt wildcard for *.xtrm-lab.org
**Recovery:**
```bash
docker start traefik
```
---
## Shared Infrastructure
### PostgreSQL 17
| IP | Databases |
|----|-----------|
| 172.18.0.13 | authentik_db, netbox, gitea, netdisco_db, diode, hydra |
**Data:** `/mnt/user/appdata/postgresql17/`
**Recovery:**
```bash
docker start postgresql17
# Wait for DB to be ready before starting dependents
```
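
The "wait for DB" step can be made explicit by polling instead of guessing a delay. A minimal sketch, assuming the `postgresql17` container image ships `pg_isready` (standard in PostgreSQL images, but not confirmed here); `wait_for_postgres` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# DOCKER can be overridden for dry runs; defaults to the real docker CLI.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: poll pg_isready inside the container until the
# server accepts connections. Usage: wait_for_postgres [CONTAINER] [ATTEMPTS]
wait_for_postgres() {
  local container="${1:-postgresql17}" attempts="${2:-30}" i
  for (( i = 1; i <= attempts; i++ )); do
    if "$DOCKER" exec "$container" pg_isready -q; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "${POLL_INTERVAL:-2}"   # seconds between polls
  done
  echo "not ready after $attempts attempts" >&2
  return 1
}
```

Dependents such as Authentik or NetBox would then be started only after `wait_for_postgres` returns successfully.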
### Redis
| IP | Consumers |
|----|-----------|
| 172.18.0.14 | Authentik, NetBox, Diode |
**Recovery:**
```bash
docker start Redis
```
---
## Startup Order
When recovering from a full outage:
1. **postgresql17** - Database (wait 30s)
2. **Redis** - Cache/queue (wait 10s)
3. **traefik** - Reverse proxy
4. **authentik** + **authentik-worker** - SSO
5. **vaultwarden** - Passwords
6. All other services
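
The order above can be scripted so that a recovery does not depend on memory. This is a sketch only: it assumes every critical service is a plain container that `docker start` can bring up (Authentik is documented above with `docker compose`, so that entry may need adjusting), and `start_critical_services` is a hypothetical name. Step 6 is left to manual follow-up.

```bash
#!/usr/bin/env bash
# DOCKER and SLEEP can be overridden for dry runs.
DOCKER="${DOCKER:-docker}"
SLEEP="${SLEEP:-sleep}"

# "container:seconds to wait afterwards", in documented startup order
STARTUP_ORDER=(
  "postgresql17:30"
  "Redis:10"
  "traefik:0"
  "authentik:0"
  "authentik-worker:0"
  "vaultwarden:0"
)

# Hypothetical helper: start each container in order, stopping at the
# first failure so dependents are not started against a missing backend.
start_critical_services() {
  local entry name delay
  for entry in "${STARTUP_ORDER[@]}"; do
    name="${entry%%:*}"
    delay="${entry##*:}"
    echo "starting $name"
    "$DOCKER" start "$name" >/dev/null || {
      echo "failed to start $name" >&2
      return 1
    }
    if (( delay > 0 )); then
      "$SLEEP" "$delay"
    fi
  done
}
```

Running it as `DOCKER=true SLEEP=true start_critical_services` gives a dry run that only prints the order.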
---
## Monitoring
### Uptime Kuma
| URL | Monitors |
|-----|----------|
| https://uptime.xtrm-lab.org | 27 services |
**Alerts:** Configured per service (email/webhook)
---
## Backup Strategy
| Data | Location | Frequency |
|------|----------|-----------|
| Unraid Flash | Google Drive | Daily |
| PostgreSQL | `/mnt/user/Backup/` | Daily |
| Vaultwarden | Unraid Flash | With flash backup |
| Authentik | PostgreSQL + `/mnt/user/appdata/authentik/` | Daily |
---
## Future: XTRM-N1 Survival Node
When the hardware upgrade completes, these services will have replicas on XTRM-N1:
| Service | Primary | Replica |
|---------|---------|---------|
| DNS | HAP1 | XTRM-N1 |
| Vaultwarden | XTRM-N5 | XTRM-N1 |
| Authentik | XTRM-N5 | XTRM-N1 |
See: `wip/UPGRADE-2026-HARDWARE.md`

docs/03-SERVICES-OTHER.md

@@ -0,0 +1,262 @@
# Other Services
**Last Updated:** 2026-01-25
Non-critical services that enhance functionality but don't affect core network operation.
---
## DevOps
### Gitea (Git Server)
| Component | IP | URL |
|-----------|-----|-----|
| gitea | 172.18.0.31 | https://git.xtrm-lab.org |
**Ports:** 2222→22 (SSH), 3005→3000 (HTTP)
**Auth:** Authentik OAuth2
**Database:** postgresql17 (gitea)
### Woodpecker CI
| Component | IP | URL |
|-----------|-----|-----|
| woodpecker-server | 172.18.0.32 | https://ci.xtrm-lab.org |
| woodpecker-agent | 172.18.0.33 | - |
**Auth:** Gitea OAuth2
**Purpose:** CI/CD pipelines for infrastructure repo
---
## Network Management
### NetBox (DCIM/IPAM)
| Component | IP | URL |
|-----------|-----|-----|
| netbox | 172.18.0.61 | https://netbox.xtrm-lab.org |
| netbox-worker | 172.18.0.62 | - |
| netbox-redis-cache | 172.18.0.64 | - |
**Database:** postgresql17 (netbox)
**Plugins:** diode, nextbox-ui, dns, inventory, interface-sync, routing
### NetDisco (Network Discovery)
| Component | IP | URL |
|-----------|-----|-----|
| netdisco-web | 172.18.0.41 | https://netdisco.xtrm-lab.org |
| netdisco-backend | 172.18.0.42 | - |
**Database:** postgresql17 (netdisco_db)
**Purpose:** SNMP-based device discovery, MAC/ARP tracking
### Diode (NetBox Discovery)
| Component | IP | Purpose |
|-----------|-----|---------|
| diode-ingress | 172.18.0.70 | API Gateway |
| diode-ingester | 172.18.0.71 | Data ingestion |
| diode-reconciler | 172.18.0.72 | NetBox sync |
| diode-hydra | 172.18.0.73 | OAuth2 |
| diode-auth | 172.18.0.74 | Token service |
| diode-agent | host | Network scanner |
**Discovery:** 192.168.31.0/24 every 30 minutes
### Unimus
| Network | URL |
|---------|-----|
| host | https://unimus.xtrm-lab.org |
**Purpose:** Network device configuration backup (MikroTik)
---
## Monitoring
### Uptime Kuma
| Component | IP | URL |
|-----------|-----|-----|
| UptimeKuma | 172.18.0.20 | https://uptime.xtrm-lab.org |
| Uptime-Kuma-API | 172.18.0.18 | - |
| AutoKuma | 172.18.0.19 | - |
**Monitors:** 27 services configured
### Speedtest Tracker
| Component | IP | URL |
|-----------|-----|-----|
| speedtest-tracker | 172.18.0.21 | https://speedtest.xtrm-lab.org |
### NetAlertX
| Network | URL |
|---------|-----|
| host | https://netalert.xtrm-lab.org |
**Purpose:** Network device discovery and alerting
---
## Media
### Plex
| Network | Port | URL |
|---------|------|-----|
| host | 32400 | https://plex.xtrm-lab.org |
**Libraries:** Movies, TV Shows, Music
### Libation
| Component | IP |
|-----------|-----|
| Libation | 172.18.0.23 |
**Purpose:** Audiobook management
### Transmission
| Component | IP | Ports |
|-----------|-----|-------|
| transmission | 172.18.0.26 | 9091, 51413 |
**Purpose:** Torrent client
---
## Productivity
### Nextcloud
| Component | IP | URL |
|-----------|-----|-----|
| Nextcloud | 172.18.0.24 | https://cloud.xtrm-lab.org |
**Purpose:** File sync, calendar, contacts
### Actual Budget
| Component | IP | URL |
|-----------|-----|-----|
| actual-budget | 172.18.0.16 | https://actual.xtrm-lab.org |
**Purpose:** Personal finance tracking
### n8n
| Component | IP | URL |
|-----------|-----|-----|
| n8n | 172.18.0.17 | https://n8n.xtrm-lab.org |
**Purpose:** Workflow automation
### Karakeep
| Component | IP | URL |
|-----------|-----|-----|
| karakeep | 172.18.0.25 | https://karakeep.xtrm-lab.org |
**Purpose:** Bookmark manager
### Homarr (Dashboard)
| Component | IP | URL |
|-----------|-----|-----|
| homarr | 172.18.0.4 | https://xtrm-lab.org |
**Purpose:** Home dashboard with Docker integration
---
## Storage & Backup
### TimeMachine
| Network | IP |
|---------|-----|
| br0 macvlan | 192.168.31.12 |
**Purpose:** macOS Time Machine backup target
### UrBackup
| Network | URL |
|---------|-----|
| host | https://urbackup.xtrm-lab.org |
**Purpose:** File and image backup for PCs
### RustFS
| Network |
|---------|
| bridge |
**Purpose:** S3-compatible object storage
---
## Remote Access
### RustDesk
| Component | Ports |
|-----------|-------|
| rustdesk-hbbs | 21115-21116 (TCP), 21116 (UDP) |
| rustdesk-hbbr | 21117-21119 |
**Purpose:** Self-hosted remote desktop
### Tailscale
| Host | IP |
|------|-----|
| HAP1 | 172.17.0.4 |
**Purpose:** Mesh VPN for remote access
---
## Smart Home
### Home Assistant
| Network | URL |
|---------|-----|
| host | https://ha.xtrm-lab.org |
**Purpose:** Home automation hub
---
## Container Management
### Portainer
| Network | Port |
|---------|------|
| bridge | 9002 |
**Purpose:** Docker container management UI
**Access:** http://192.168.31.2:9002 or via Tailscale
---
## Stopped/Disabled Services
| Service | Reason | Status |
|---------|--------|--------|
| Unbound | Redundant (AdGuard upstream) | Stopped |
| DoH-Server | Redundant (AdGuard built-in) | Removed |
| stunnel-dot | Redundant (AdGuard built-in) | Removed |
| Pi-hole | Replaced by AdGuard Home | Removed |
| Pangolin | Not in use | Removed |
| Slurp'it | Replaced by Diode | Removed |


@@ -0,0 +1,184 @@
# Hardware Inventory
**Last Updated:** 2026-01-25
---
## Network Devices
### HAP1 | MikroTik hAP ax³
| Property | Value |
|----------|-------|
| **Role** | Router, WiFi Controller, DNS |
| **Location** | 19" Rack U3 (on shelf) |
| **IP** | 192.168.31.1 |
| **MAC** | 78:9A:18:2C:A5:48 |
| **OS** | RouterOS 7.20.6 |
| **Serial** | - |
**Ports:**
| Port | Speed | Connected To |
|------|-------|--------------|
| H-1 | 2.5G | ISP Gateway (WAN) |
| H-2 | 1G | PP1-3 → CAP |
| H-3 | 1G | CSS1-1 |
| H-4 | 2.5G | ZX1-1 |
| H-5 | 1G | Unused |
**Containers:** AdGuard Home, Tailscale
---
### CSS1 | MikroTik CSS326-24G-2S+
| Property | Value |
|----------|-------|
| **Role** | Distribution Switch |
| **Location** | 19" Rack U1 |
| **IP** | 192.168.31.9 |
| **MAC** | F4:1E:57:C9:BD:09 |
| **OS** | SwOS |
| **Serial** | - |
**Ports:** 24x 1G RJ45, 2x 10G SFP+
- SFP1: 10G DAC to ZX1
- Ports 16-24: Room outlets via PP1
---
### ZX1 | ZX-SWTGW218AS
| Property | Value |
|----------|-------|
| **Role** | Core Switch (2.5GbE) |
| **Location** | 10" Rack U7 (on shelf) |
| **IP** | 192.168.31.7 |
| **MAC** | 1C:2A:A3:1E:78:67 |
| **Serial** | - |
**Ports:** 8x 2.5G RJ45, 2x 10G SFP+
| Port | Connected To |
|------|--------------|
| ZX1-1 | HAP1 H-4 |
| ZX1-2 | PP2-1 → XTRM-U XU-1 |
| ZX1-3 | PP2-2 → XTRM-U XU-2 |
| SFP1 | CSS1-SFP1 (10G backbone) |
---
### CAP | MikroTik cAP XL ac
| Property | Value |
|----------|-------|
| **Role** | Wireless Access Point |
| **Location** | Corridor (ceiling) |
| **IP** | 192.168.31.6 |
| **MAC** | 18:FD:74:54:3D:BC |
| **OS** | RouterOS 7.x |
| **Serial** | HCT085KBH8B |
**Managed by:** HAP1 CAPsMAN
---
### ISP Gateway | Vivacom Fiber ONT
| Property | Value |
|----------|-------|
| **Role** | WAN Connection |
| **Location** | 10" Rack U9 (on shelf) |
| **WAN IP** | 62.73.120.142 |
| **MAC** | 9C:E0:41:BB:5E:32 |
---
## Servers
### XTRM-U | Unraid NAS
| Property | Value |
|----------|-------|
| **Role** | Production Server |
| **Location** | 10" Rack U1-U4 |
| **IP** | 192.168.31.2 |
| **OS** | Unraid 6.x |
**Network:**
| Interface | MAC | Speed |
|-----------|-----|-------|
| eth1 | A8:B8:E0:02:B6:15 | 2.5G |
| eth2 | A8:B8:E0:02:B6:16 | 2.5G |
| eth3 | A8:B8:E0:02:B6:17 | 2.5G |
| eth4 | A8:B8:E0:02:B6:18 | 2.5G |
| **bond0** | (virtual) | 5G aggregate |
**Storage:**
- Cache: (current NVMe)
- Array: 3.5" HDDs
**Virtual IPs:**
| IP | Purpose |
|----|---------|
| 192.168.31.4 | AdGuard Home (macvlan) |
| 192.168.31.15 | (reserved) |
---
## Patch Panels
### PP1 | 19" 24-Port
| Property | Value |
|----------|-------|
| **Location** | 19" Rack U2.5 |
| **Type** | Cat6a Keystone |
| **Height** | 0.5U |
**Active Ports:** 3 (CAP), 12, 16-24 (rooms)
### PP2 | 10" 12-Port
| Property | Value |
|----------|-------|
| **Location** | 10" Rack U8 |
| **Type** | Cat6a Keystone |
| **Height** | 1U |
**Active Ports:** 1-2 (XTRM-U)
---
## Rack Shelves
| Location | Size | Holds |
|----------|------|-------|
| 10" Rack U9 | 10" | ISP Gateway |
| 10" Rack U7 | 10" | ZX1 Switch |
| 19" Rack U3 | 19" | HAP1 Router |
---
## End Devices (Wired)
| Device | Room | Outlet | Switch Port | MAC |
|--------|------|--------|-------------|-----|
| LGTV | Living Room | L3 | CSS1-24 | - |
| XTRM-Nobara | Main Bedroom | M2 | CSS1-20 | 08:92:04:C6:07:C5 |
| Dell Display | Main Bedroom | M3 | CSS1-21 | - |
| Dancho | Boys Room | B1 | CSS1-18 | - |
| KVM Switch | - | Direct | CSS1-2 | - |
---
## Future Hardware (Planned)
See: `wip/UPGRADE-2026-HARDWARE.md`
| Device | Role | Status |
|--------|------|--------|
| XTRM-N5 (Minisforum N5 Air) | Production server | Planned |
| XTRM-N1 (N100 ITX) | Survival node | Planned |
| 3x Samsung 990 EVO Plus 1TB | XTRM-N5 NVMe pool | Planned |
| 2x Fikwot FX501Pro 512GB | XTRM-N1 mirror | Planned |
| MikroTik CRS310-8G+2S+IN | Replace ZX1 | Future |

docs/05-CHANGELOG.md

@@ -0,0 +1,71 @@
# Infrastructure Changelog
**Purpose:** Major infrastructure events only. Minor changes are in git commit messages.
---
## 2026-01
### 2026-01-25
- **[DOCS]** Restructured documentation - consolidated into 5 core docs + archive
- **[NETBOX]** Added shelf devices for rack organization (U9, U7, U3)
### 2026-01-24
- **[NETBOX]** Standardized device names to NetBox convention (HAP1, CSS1, ZX1)
- **[DOCS]** Created NETWORK-PHYSICAL-MAP.md with complete port maps
### 2026-01-23
- **[SERVICE]** Deployed Diode network discovery stack
- **[SERVICE]** Removed Slurp'it (replaced by Diode + NetDisco)
- **[SERVICE]** Consolidated NetBox Redis to shared instance
- **[SERVICE]** Removed redundant DNS services (Unbound, DoH-Server, stunnel-dot)
### 2026-01-22
- **[SERVICE]** Migrated NetBox to shared PostgreSQL 17
- **[SERVICE]** Deployed AdGuard Home on MikroTik (primary DNS)
- **[SERVICE]** Deployed AdGuard Home on Unraid (secondary DNS)
- **[SERVICE]** Removed Pi-hole (replaced by AdGuard Home)
- **[DOCS]** Created INFRASTRUCTURE-DIAGRAM.md
### 2026-01-21
- **[BACKUP]** Configured Rclone sync to Google Drive
### 2026-01-19
- **[SERVICE]** Deployed NetBox IPAM/DCIM
- **[SERVICE]** Deployed NetDisco network discovery
- **[NETWORK]** Enabled SNMP on all MikroTik devices
### 2026-01-18
- **[SERVICE]** Deployed Gitea git server
- **[SERVICE]** Deployed Woodpecker CI
- **[NETWORK]** Configured CAPsMAN on HAP1
- **[WIRELESS]** CAP added to CAPsMAN management
### 2026-01-17
- **[SERVICE]** Deployed Portainer CE
---
## Format Guide
```markdown
### YYYY-MM-DD
- **[CATEGORY]** Brief description
Categories:
- [DEVICE] - Hardware added/removed/changed
- [SERVICE] - Container/service deployed/removed
- [NETWORK] - Network topology/config changes
- [WIRELESS] - WiFi/CAPsMAN changes
- [BACKUP] - Backup configuration
- [DOCS] - Major documentation changes
```
---
## Previous History
For detailed history before 2026-01-17, see archived changelogs:
- `archive/06-CHANGELOG.md`
- `archive/07-CHANGELOG.md`
- `archive/00-CHANGELOG.md`

docs/archive/README.md

@@ -0,0 +1,16 @@
# Archived Documentation
> ⚠️ **OBSOLETE - DO NOT UPDATE**
These documents are from the legacy documentation structure (pre-2026-01-25).
They are kept for historical reference only.
**For current documentation, see the parent `docs/` folder:**
- `01-NETWORK-MAP.md` - Network topology, IPs, services
- `02-SERVICES-CRITICAL.md` - Essential services
- `03-SERVICES-OTHER.md` - Non-critical services
- `04-HARDWARE-INVENTORY.md` - Hardware details
- `05-CHANGELOG.md` - Major events
**Do not reference these archived documents for current state.**
All relevant information has been migrated to the new structure.


@@ -0,0 +1,180 @@
# GitOps for Container Management
**Status:** 💡 IDEA
**Depends On:** Hardware upgrade completion
**Author:** Kaloyan
---
## Overview
Version control all container configurations to:
1. Track changes over time
2. Maintain consistency between XTRM-N5 and XTRM-N1
3. Enable automated deployments via Woodpecker CI
4. Recover from disasters quickly
---
## Repository Structure
```
infrastructure/
├── configs/
│   ├── common/                 # Shared configs
│   │   ├── traefik/
│   │   │   └── dynamic.yml
│   │   └── authentik/
│   │       └── blueprints/
│   │
│   ├── xtrm-n5/                # Production server
│   │   ├── docker/
│   │   │   ├── compose/        # docker-compose files
│   │   │   │   ├── netbox.yml
│   │   │   │   ├── gitea.yml
│   │   │   │   └── ...
│   │   │   ├── templates/      # Unraid XML templates
│   │   │   └── env/            # Environment files (.env.example)
│   │   ├── network/
│   │   │   └── docker-networks.json
│   │   └── unraid/
│   │       ├── shares.json
│   │       └── users.json
│   │
│   └── xtrm-n1/                # Survival node
│       ├── docker/
│       │   └── compose/
│       │       ├── adguard.yml
│       │       ├── vaultwarden.yml
│       │       └── authentik-replica.yml
│       └── proxmox/
│           └── vm-configs/
└── .woodpecker.yml
```
---
## Workflow
### 1. Change Detection
```mermaid
flowchart LR
A[Edit config in Git] --> B[Push to main]
B --> C[Woodpecker CI triggers]
C --> D{Validate configs}
D -->|Pass| E[Deploy to target server]
D -->|Fail| F[Notify & block]
```
### 2. Drift Detection
```mermaid
flowchart LR
A[Scheduled job] --> B[Export current state]
B --> C{Compare to Git}
C -->|Match| D[All good]
C -->|Drift| E[Alert + PR with diff]
```
---
## Implementation Phases
### Phase 2.1: Export Current State
1. Export all docker-compose files
2. Export Unraid container templates (XML → YAML)
3. Export network configurations
4. Create initial commit
### Phase 2.2: CI Pipeline
```yaml
# .woodpecker.yml
pipeline:
  validate:
    image: docker:latest
    commands:
      # -f takes one file per flag, so validate each compose file in turn
      - for f in configs/xtrm-n5/docker/compose/*.yml; do docker compose -f "$f" config --quiet || exit 1; done
  deploy-n5:
    image: alpine/ssh
    when:
      path: configs/xtrm-n5/**
    commands:
      - ssh root@192.168.31.2 "cd /path && docker compose up -d"
    secrets: [ssh_key]
  deploy-n1:
    image: alpine/ssh
    when:
      path: configs/xtrm-n1/**
    commands:
      - ssh root@xtrm-n1 "cd /path && docker compose up -d"
    secrets: [ssh_key]
```
### Phase 2.3: Drift Detection
Scheduled Woodpecker job:
1. SSH to each server
2. Export current docker/network state
3. Compare to Git configs
4. Create issue/PR if drift detected
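
The comparison in step 3 can be as simple as a recursive diff between the Git checkout and an exported copy of the live state. A minimal sketch, assuming both sides are plain directories with the same layout; `detect_drift` is a hypothetical helper and the SSH export and issue creation steps are not shown.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare the configs tracked in Git with an
# exported copy of the live state. Usage: detect_drift GIT_DIR LIVE_DIR
# Returns 0 when identical, 1 on drift, 2 if diff itself failed.
detect_drift() {
  local git_dir="$1" live_dir="$2" report status
  # diff exits 0 for identical trees, 1 for differences, >1 on errors
  report="$(diff -ru "$git_dir" "$live_dir")" && status=0 || status=$?
  case "$status" in
    0) echo "no drift" ;;
    1) echo "drift detected:"
       printf '%s\n' "$report"
       return 1 ;;
    *) echo "diff failed (exit $status)" >&2
       return 2 ;;
  esac
}
```

The unified diff printed on drift is the natural body for the issue or PR mentioned in step 4.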
### Phase 2.4: Unraid GUI Sync
**Challenge:** Changes made in Unraid GUI need to sync to Git
**Solution Options:**
| Option | Pros | Cons |
|--------|------|------|
| **A: Webhook on change** | Real-time sync | Complex, needs Unraid plugin |
| **B: Scheduled export** | Simple, reliable | Delay between change and commit |
| **C: Prohibit GUI changes** | Clean workflow | User friction |
**Recommended:** Option B with daily scheduled exports
```bash
# Cron job on Unraid
0 4 * * * /boot/config/scripts/export-docker-config.sh
```
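
What `export-docker-config.sh` does is not defined yet. One possible shape, offered as an assumption rather than the planned implementation, is to dump `docker inspect` output per container into a directory that is then committed. `export_container_configs` is a hypothetical name.

```bash
#!/usr/bin/env bash
# DOCKER can be overridden for dry runs; defaults to the real docker CLI.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: write one JSON file per container (running or
# stopped) into OUT_DIR. Usage: export_container_configs OUT_DIR
export_container_configs() {
  local out_dir="$1" name
  mkdir -p "$out_dir"
  "$DOCKER" ps -a --format '{{.Names}}' | while IFS= read -r name; do
    [ -n "$name" ] || continue
    # </dev/null keeps the inner command from consuming the name list
    "$DOCKER" inspect "$name" > "$out_dir/$name.json" </dev/null
  done
}
```

Note that `docker inspect` output includes each container's environment variables, so the export should be reviewed or filtered for secrets before anything is committed.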
---
## Secrets Management
**Options:**
| Tool | Integration | Complexity |
|------|-------------|------------|
| Woodpecker Secrets | Native | Low |
| Vaultwarden API | Via script | Medium |
| HashiCorp Vault | Enterprise | High |
**Recommended:** Woodpecker Secrets for CI, `.env.example` in Git
```yaml
# In docker-compose
services:
  app:
    env_file:
      - .env  # Not in Git, created from .env.example + secrets
```
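One way to build `.env` from `.env.example` plus secrets is sketched below. The `render_env` helper and the sample keys are hypothetical. In a CI step the secret values would most likely come from environment variables, which is an assumption about how the deploy job is wired.

```python
def render_env(template: str, secrets: dict[str, str]) -> str:
    """Fill a .env.example template with secret values.

    Keys present in `secrets` replace the placeholder value from the template;
    comments, blank lines, and keys without a secret are kept unchanged.
    """
    out = []
    for line in template.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in secrets:
                line = f"{key}={secrets[key]}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical template content and secret value, for illustration only.
example = "# Database settings\nDB_USER=app\nDB_PASSWORD=changeme\n"
rendered = render_env(example, {"DB_PASSWORD": "s3cret"})
print(rendered)
```

With this split, `.env.example` documents every variable a service needs while the real values never enter the repository.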
---
## Rollback Strategy
1. **Git revert** - Revert commit, CI redeploys previous version
2. **Tagged releases** - Deploy specific tag
3. **Manual override** - SSH to the host and run `docker compose down` / `docker compose up -d` by hand
---
## Related Documents
- `UPGRADE-2026-HARDWARE.md` - Hardware prerequisite

16
docs/wip/README.md Normal file

@@ -0,0 +1,16 @@
# Work In Progress
This folder contains planned changes, evaluations, and ideas that are not yet implemented.
## Document Status
| Status | Meaning |
|--------|---------|
| 📋 PLANNED | Approved, waiting for resources/time |
| 🔬 EVALUATING | Under investigation/research |
| 💡 IDEA | Concept, needs further definition |
## Current Items
- `UPGRADE-2026-HARDWARE.md` - Hardware upgrade plan (N5 Air + N100)
- `GITOPS-CONTAINERS.md` - Container GitOps implementation (Phase 2)


@@ -0,0 +1,229 @@
# Hardware Upgrade Plan 2026
**Status:** 📋 PLANNED
**Target:** Q1 2026
**Author:** Kaloyan
---
## Overview
Migration from a single-server architecture to a dual-server setup in which a small survival node keeps core network services running while the production server is offline.
## Device Naming Convention
| Name | Hardware | Role | Notes |
|------|----------|------|-------|
| **XTRM-U** | Current NAS chassis | Current production | Will be decommissioned |
| **XTRM-N5** | Minisforum N5 Air | New production NAS | Inherits XTRM-U storage, OS, services + new NVMe pool |
| **XTRM-N1** | N100 ITX (XTRM-U board) | Survival node | Same 4x2.5GbE NIC, 1U 10" case, no HDDs |
---
## Hardware Specifications
### XTRM-N5 (Minisforum N5 Air) - Production Server
| Component | Specification |
|-----------|---------------|
| **CPU** | Intel N95/N100 or similar |
| **RAM** | TBD |
| **Storage (Fast)** | 3x Samsung 990 EVO Plus 1TB (NVMe ZFS RAID-Z1) |
| **Storage (Capacity)** | Existing 3.5" HDD Array (from XTRM-U) |
| **Network** | 10GbE Primary + 5GbE Secondary |
| **OS** | Unraid 7 (internal boot from NVMe pool) |
### XTRM-N1 (N100 ITX) - Survival Node
| Component | Specification |
|-----------|---------------|
| **Board** | XTRM-U motherboard (4x 2.5GbE Intel NICs) |
| **Chassis** | 1U 10" Rackmount (modular/3D printed) |
| **Storage** | 2x Fikwot FX501Pro 512GB (NVMe mirror) |
| **Power** | PicoPSU + 12V External Brick |
| **OS** | Proxmox VE or Debian |
---
## Migration Flow
```
XTRM-U (Current State)
├── Hardware (board, 4x2.5GbE) ──────────────► XTRM-N1 (1U 10" case)
│                                              └── New role: Survival node
│                                              └── New storage: 2x Fikwot 512GB
├── Storage (3.5" HDDs) ─────────────────────► XTRM-N5
├── OS (Unraid + config) ────────────────────► XTRM-N5
├── Services (containers) ───────────────────► XTRM-N5
│                                              └── New: Samsung NVMe pool
└── Chassis ─────────────────────────────────► Retired
```
---
## Post-Migration Architecture
### XTRM-N1 (Survival Node)
**Purpose:** Keep network alive if production server is offline
| Service | Purpose | Priority |
|---------|---------|----------|
| OPNsense | Firewall/Router (replaces HAP1 routing; only if routing moves to XTRM-N1) | P0 |
| AdGuard Home | DNS | P0 |
| Vaultwarden | Password manager | P0 |
| Authentik (replica) | SSO backup | P1 |
### XTRM-N5 (Production Server)
**Purpose:** Primary services and storage
| Service | Purpose | Priority |
|---------|---------|----------|
| Nextcloud | File sync/storage | P1 |
| Plex | Media streaming | P2 |
| Authentik (primary) | SSO | P1 |
| Gitea | Git hosting | P2 |
| Woodpecker CI | CI/CD | P2 |
| NetBox | Network documentation | P2 |
| All other services | Various | P3 |
---
## Rack Layout (Post-Migration)
### 10" Rack (9U)
| U | Device | Notes |
|---|--------|-------|
| U9 | Shelf + ISP Gateway | Unchanged |
| U8 | PP2 (12-port) | Unchanged |
| U7 | Shelf + ZX1 (or CRS310) | Future: upgrade to CRS310 |
| U6 | **XTRM-N1** | NEW: Survival node |
| U5 | (empty) | |
| U4 | (XTRM-N5 continued) | |
| U3 | (XTRM-N5 continued) | |
| U2 | (XTRM-N5 continued) | |
| U1 | **XTRM-N5** | NEW: Production server |
---
## Software Configuration
### XTRM-N5: Unraid 7 Features
1. **Internal Boot Pool**
- Use "Add Bootable Pool" feature
- ZFS RAID-Z1 with 3x Samsung 990 EVO Plus
- Partition 1: 8GB (OS boot)
- Partition 2: ~1.8TB usable (data/containers)
- Remove USB boot stick
2. **Storage Pools**
- **nvme_pool**: 3x Samsung 990 EVO Plus (~1.8TB usable)
- **hdd_pool**: Existing 3.5" HDDs (capacity tier)
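The "~1.8TB usable" figure follows from RAID-Z1 arithmetic: one drive's worth of capacity goes to parity, and drives sold as 1TB are decimal terabytes. A quick check, with hypothetical helper names:

```python
def raidz1_usable_tb(drives: int, size_tb: float) -> float:
    """Raw usable capacity of a RAID-Z1 vdev: one drive's worth goes to parity."""
    if drives < 2:
        raise ValueError("RAID-Z1 needs at least 2 drives")
    return (drives - 1) * size_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


usable_tb = raidz1_usable_tb(3, 1.0)
usable_tib = tb_to_tib(usable_tb)
print(f"{usable_tb:.1f} TB raw usable = {usable_tib:.2f} TiB")
# prints "2.0 TB raw usable = 1.82 TiB"
```

This is an upper bound: the 8GB boot partition on each drive and ZFS metadata overhead reduce the space actually available for data and containers.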
### XTRM-N1: Proxmox Configuration
1. **Storage**
- ZFS mirror: 2x Fikwot 512GB
- Boot + VMs + Containers
2. **VMs/Containers**
- OPNsense VM (if routing moves here)
- AdGuard LXC
- Vaultwarden LXC
- Authentik LXC (replica)
---
## Network Changes
### Current (HAP1 as router)
```
Internet → HAP1 (Router) → ZX1 → XTRM-U
                         → CSS1 → Room outlets
```
### Option A: Keep HAP1 as Router
```
Internet → HAP1 (Router) → ZX1 → XTRM-N5
                         ↓     → XTRM-N1 (services only)
                         → CSS1 → Room outlets
```
### Option B: Move Routing to XTRM-N1
```
Internet → HAP1 (Bridge) → XTRM-N1 (OPNsense) → ZX1 → XTRM-N5
                                              → CSS1 → Rooms
```
**Decision:** TBD based on performance testing
---
## Thermal Management
| Strategy | Implementation |
|----------|----------------|
| Samsung NVMe in N5 | "Thermal Champions" - run cooler than Fikwot |
| Physical spacing | 1U gap between XTRM-N1 and XTRM-N5 |
| Active cooling | Exhaust fans tied to drive temp sensors |
| HDD hibernation | Aggressive spindown, NVMe handles active I/O |
---
## Future Milestone: CRS310 Upgrade
When budget allows, replace ZX1 with MikroTik CRS310-8G+2S+IN:
| Feature | Benefit |
|---------|---------|
| 10G DAC to CSS326 | Proper backbone |
| L3 offloading | Inter-VLAN routing off HAP1 |
| 8x 2.5G ports | All servers at full speed |
---
## Migration Checklist
### Pre-Migration
- [ ] Order XTRM-N5 hardware
- [ ] Order 3x Samsung 990 EVO Plus 1TB
- [ ] Order 2x Fikwot FX501Pro 512GB
- [ ] Design/print 1U 10" chassis for N1
- [ ] Order PicoPSU + 12V brick
### XTRM-N1 Setup (First)
- [ ] Extract XTRM-U motherboard
- [ ] Install in 1U chassis
- [ ] Install Fikwot NVMe drives
- [ ] Install Proxmox VE
- [ ] Deploy survival services (AdGuard, Vaultwarden)
- [ ] Test network survival (production server offline)
### XTRM-N5 Setup
- [ ] Unbox and install Samsung NVMe drives
- [ ] Create ZFS boot pool
- [ ] Migrate Unraid config from USB
- [ ] Transfer HDD array
- [ ] Verify all services running
- [ ] Update DNS/IPs as needed
### Post-Migration
- [ ] Update NetBox with new devices
- [ ] Update documentation
- [ ] Decommission old XTRM-U chassis
- [ ] Performance testing
- [ ] Thermal monitoring
---
## Related Documents
- `GITOPS-CONTAINERS.md` - Phase 2: Container configuration in Git