Kaloyan Danchev bf6a62a275 Add incident report: disk1 hardware failure (clicking/head crash)
HGST Ultrastar 10TB drive (serial 2TKK3K1D) failed on Feb 18.
Array running degraded on parity emulation. Recovery plan documented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 17:54:23 +02:00
2026-01-18 16:40:48 +02:00

XTRM Home Lab Infrastructure

Domain: xtrm-lab.org
Repository: https://git.xtrm-lab.org/jazzymc/infrastructure


Quick Reference

Resource        Address
Dashboard       https://xtrm-lab.org
NetBox          https://netbox.xtrm-lab.org
Git             https://git.xtrm-lab.org
CI/CD           https://ci.xtrm-lab.org
DNS Primary     dns.xtrm-lab.org
DNS Secondary   dns2.xtrm-lab.org
Failover VIP    192.168.10.250

Documentation Structure

docs/
├── 01-NETWORK-MAP.md             # Network topology, IPs, Docker networks
├── 02-SERVICES-CRITICAL.md       # DNS, Auth, Routing - must stay up
├── 03-SERVICES-OTHER.md          # All other services
├── 04-HARDWARE-INVENTORY.md      # Physical devices, specs, serials
├── 05-PORT-UTILIZATION.md        # Device port assignments
├── 06-VLAN-DEVICE-ASSIGNMENT.md  # VLAN device mapping
├── 07-WIFI-CAPSMAN-CONFIG.md     # WiFi and CAPsMAN settings
├── 08-DNS-ARCHITECTURE.md        # DNS failover architecture
├── 09-TAILSCALE-VPN.md           # Tailscale VPN setup
├── 10-FAILOVER-NOBARA.md         # VRRP failover to Nobara workstation
├── CHANGELOG.md                  # Change history
├── archive/                      # Completed/legacy docs
│   └── vlan-migration/           # VLAN migration project artifacts
├── incidents/                    # Incident reports
└── wip/                          # Work in progress

Key Devices

Device        IP               Role
HAP1          192.168.10.1     Router, DNS, WiFi Controller
XTRM-U        192.168.10.20    Production Server (Unraid)
XTRM-Nobara   192.168.10.103   Failover Node (Nobara Linux)
CSS1          192.168.10.3     Distribution Switch
ZX1           192.168.10.4     Core Switch (2.5G)
CAP           192.168.10.6     Wireless Access Point

SSH Access

# Unraid
ssh -i ~/.ssh/id_ed25519_unraid -p 422 root@192.168.10.20

# MikroTik Router
ssh -i ~/.ssh/mikrotik_key -p 2222 xtrm@192.168.10.1

# Nobara (failover node)
ssh nobara
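
The Nobara entry relies on a host alias, while the other two commands spell out key, port, and user each time. As a minimal sketch, the same details could be expressed as `~/.ssh/config` stanzas. The alias names `unraid` and `hap1` and both function names are hypothetical; addresses, users, ports, and key paths are the ones listed above. The functions only print the stanzas so they can be reviewed before anything is written.

```shell
#!/bin/sh
# Print one ssh_config stanza.
# Usage: ssh_stanza ALIAS HOSTNAME USER PORT IDENTITY_FILE
ssh_stanza() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    Port %s\n' "$4"
    printf '    IdentityFile %s\n' "$5"
}

# Print stanzas for the Unraid server and the MikroTik router.
# The aliases "unraid" and "hap1" are assumptions, not names from this repository.
print_lab_ssh_config() {
    ssh_stanza unraid 192.168.10.20 root 422 '~/.ssh/id_ed25519_unraid'
    printf '\n'
    ssh_stanza hap1 192.168.10.1 xtrm 2222 '~/.ssh/mikrotik_key'
}
```

After reviewing the output of `print_lab_ssh_config`, it could be appended with `print_lab_ssh_config >> ~/.ssh/config`, after which `ssh unraid` and `ssh hap1` would behave like the existing `ssh nobara`.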

Emergency Recovery

  1. DNS down? → Automatic failover to 192.168.10.10 (secondary), see 08-DNS-ARCHITECTURE.md
  2. Internet down? → Check HAP1 at 192.168.10.1
  3. Services down? → Check Unraid at 192.168.10.20
  4. Unraid maintenance? → VRRP failover to Nobara (192.168.10.250 VIP), see 10-FAILOVER-NOBARA.md
  5. Full outage? → See 02-SERVICES-CRITICAL.md startup order
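
The checklist above can be roughly sketched as a reachability sweep that points at the matching document for each host that does not answer. This is an illustration, not part of the repository: the names `probe`, `triage`, and `LAB_CHECKS` are hypothetical, and `ping -W 2` assumes Linux iputils, where `-W` is a timeout in seconds. A failed ping only shows that the host did not answer from the machine running the script.

```shell
#!/bin/sh
# One "address|label|hint" entry per line, taken from the checklist above.
LAB_CHECKS='192.168.10.1|HAP1|check the router
192.168.10.20|Unraid|check the server or fail over to Nobara, see 10-FAILOVER-NOBARA.md
192.168.10.10|DNS secondary|see 08-DNS-ARCHITECTURE.md
192.168.10.250|Failover VIP|see 10-FAILOVER-NOBARA.md'

# probe HOST: succeed if HOST answers a single ping within 2 seconds.
probe() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# triage: probe every entry, print OK/DOWN per host, return non-zero if any failed.
triage() {
    failed=0
    total=0
    while IFS='|' read -r addr label hint; do
        total=$((total + 1))
        # stdin is redirected so the probe cannot consume the list being read
        if probe "$addr" </dev/null; then
            printf 'OK    %-14s %s\n' "$label" "$addr"
        else
            failed=$((failed + 1))
            printf 'DOWN  %-14s %s -> %s\n' "$label" "$addr" "$hint"
        fi
    done <<EOF
$LAB_CHECKS
EOF
    if [ "$failed" -eq "$total" ]; then
        echo 'All probes failed: possible full outage, see 02-SERVICES-CRITICAL.md startup order'
    fi
    [ "$failed" -eq 0 ]
}
```

Running `triage` from a workstation on the 192.168.10.0 network prints one line per host; the exit status is zero only when every host answered, so it can also be used in a cron job or a shell conditional.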

Change Management

  • Major changes: Document in CHANGELOG.md
  • Minor changes: Git commit messages only
  • Planned work: Create doc in wip/ folder

CI/CD

Woodpecker CI at https://ci.xtrm-lab.org

Pipelines trigger on push to this repository.
