# Infrastructure Changelog

**Purpose:** Major infrastructure events only. Minor changes are in git commit messages.

---

## 2026-02-14

### CAP XL ac Recovery
- **[WIRELESS]** Factory reset CAP XL ac (lost credentials)
- **[WIRELESS]** Reconfigured CAPsMAN: regenerated certificate, CAP re-enrolled with `certificate=request`
- **[WIRELESS]** Both CAP radios now active: wifi1 (2.4GHz XTRM2) + wifi2 (5GHz XTRM)
- **[WIRELESS]** CAP now running RouterOS 7.21.1
- **[WIRELESS]** Enabled SSH on CAP port 2222 for user xtrm with mikrotik key
- **[WIRELESS]** Confirmed WiFi access list has no VLAN assignment (rolled back Jan 27)

### Roms Network Share
- **[SERVICE]** Shared /mnt/user/roms (2.3TB, 49 systems) via SMB from Unraid
- **[SERVICE]** Mounted on Nobara at /mnt/roms (fstab, CIFS guest, systemd.automount)
- **[SERVICE]** Mounted on Recalbox via custom.sh boot script (CIFS bind mounts)
- **[SERVICE]** Deleted local roms from Recalbox SD card (~12.5GB freed)

### Documentation Updates
- **[DOCS]** Updated 07-WIFI-CAPSMAN-CONFIG.md: CAP both radios working, access list status
- **[DOCS]** Updated 01-NETWORK-MAP.md: Fixed CAP IP (.6→.2), added Nobara and SMB shares
- **[DOCS]** Updated 04-HARDWARE-INVENTORY.md: CAP details, added Recalbox device
- **[DOCS]** Updated 06-VLAN-DEVICE-ASSIGNMENT.md: Added Nobara (VLAN 10) and Recalbox (VLAN 25)
- **[DOCS]** Updated 03-SERVICES-OTHER.md: Added Roms SMB share section

---

## 2026-02-13

### Failover Infrastructure Deployed
- **[SERVICE]** Deployed Docker failover stack on XTRM-Nobara (Traefik, Vaultwarden, Authentik, AdGuard Home)
- **[SERVICE]** Installed Docker CE 29.2.1 + Docker Compose 5.0.2 on Nobara
- **[SERVICE]** Deployed Keepalived VRRP for automatic failover (VIP: 192.168.10.250)
- **[SERVICE]** Unraid: Keepalived as Docker container (local/keepalived, MASTER priority 150)
- **[SERVICE]** Nobara: Keepalived as systemd service (BACKUP priority 100)
- **[SERVICE]** Replicated data: Vaultwarden DB, Authentik PostgreSQL dump (864MB), AdGuard config, Traefik certs
- **[NETWORK]** Added VRRP protocol to Nobara firewall (firewalld)
- **[NETWORK]** Configured SSH key auth to Nobara (id_ed25519_nobara, passwordless)
- **[NETWORK]** Added SSH config alias: `ssh nobara`
- **[DOCS]** Created 10-FAILOVER-NOBARA.md with full failover documentation
- **[DOCS]** Updated 02-SERVICES-CRITICAL.md with failover section
- **[DOCS]** Updated 04-HARDWARE-INVENTORY.md with XTRM-Nobara specs
- **[DOCS]** Updated README.md and CLAUDE.md with Nobara references

---

## 2026-02-06

### Unraid Flash Drive Failure
- **[INCIDENT]** Unraid flash drive crashing - migration procedure created
- **[DOCS]** Created incident report with full flash drive replacement procedure

### Documentation Restructure
- **[DOCS]** Restructured docs/ from 23 files to clean 9-doc structure
- **[DOCS]** Archived 12 completed VLAN migration project docs to archive/vlan-migration/
- **[DOCS]** Archived 5 done/superseded WIP docs (VLAN proposals, AI stack, Fossorial, DNS backup)
- **[DOCS]** Created standing reference docs: 08-DNS-ARCHITECTURE.md, 09-TAILSCALE-VPN.md
- **[DOCS]** Renamed docs to clean numbering (05-PORT-UTILIZATION, 06-VLAN-DEVICE-ASSIGNMENT, 07-WIFI-CAPSMAN-CONFIG)
- **[DOCS]** Merged 00-CHANGELOG.md + 06-CHANGELOG.md → CHANGELOG.md
- **[DOCS]** Updated all core docs with current VLAN IPs (192.168.31.x → 192.168.10.x)
- **[DOCS]** Fixed CSS1 IP: 192.168.10.9 → 192.168.10.3, ZX1 IP: 192.168.10.7 → 192.168.10.4
- **[DOCS]** Cleaned 06-VLAN-DEVICE-ASSIGNMENT.md: removed migration-era columns and sections, fixed VLAN 25 subnet
- **[DOCS]** Updated README.md, CLAUDE.md, archive/README.md, wip/README.md

---

## 2026-02-01

### WIP Documentation
- **[DOCS]** Added KVM-SWITCH-MAC-NOBARA.md - Software KVM for Mac/Nobara switching
  - DDC/CI monitor control (Dell U3821DW) + HID++ Logitech peripheral switching
  - Scripts created on Mac at ~/scripts/

---

## 2026-01-31

### Docker Cleanup
- **[DOCKER]** Removed 18 unused images (~4.9 GB reclaimed)
  - **[DOCKER]** Removed 12 dangling images (old builds, untagged)
  - **[DOCKER]** Removed Slurpit stack images (warehouse, portal, scanner, scraper)
  - **[DOCKER]** Removed unused MongoDB 8 and MariaDB 11 images
- **[DOCKER]** Removed 35 orphaned volumes (~1.15 GB reclaimed)
  - **[DOCKER]** Removed 28 anonymous dangling volumes
  - **[DOCKER]** Removed 6 nextcloud_aio_* volumes (from old AIO install)
  - **[DOCKER]** Removed orphaned redis-data volume
- **[DOCKER]** **Total reclaimed: ~6 GB**

### Kept (Stopped Containers)
- open-webui, ollama (AI stack - for future use)
- pgAdmin4 (database management)
- diode-hydra-migrate, diode-auth-bootstrap (one-time migration jobs)

---

## 2026-01-27

### VLAN Filtering Rolled Back
- **[VLAN]** Enabled VLAN filtering - caused connectivity issues
- **[VLAN]** ZX1 switch unreachable after activation (no management IP responding)
- **[VLAN]** CSS326 traffic routing through ZX1 (not direct eth3 link)
- **[VLAN]** **Rolled back** - VLAN filtering disabled
- **[CONFIG]** Added eth4 (ZX1) to all VLAN tagged lists for future use
- **[STATUS]** Network back to Legacy mode (192.168.31.0/24)
- **[TODO]** Need physical access to ZX1 to configure VLAN trunking

### Issues Identified
- ZX1 switch not responding on documented IP 192.168.31.22
- ZX1 may need VLAN trunk configuration before re-enabling filtering
- All CSS326 traffic goes via ZX1→HAP1, not direct CSS326→HAP1 link (STP?)


---

## 2026-01-26

### VLAN Filtering Activated
- **[VLAN]** VLAN filtering enabled on MikroTik bridge - SUCCESSFUL
- **[VLAN]** Internet connectivity verified (ping 1.1.1.1, google.com)
- **[VLAN]** DNS resolution working through AdGuard
- **[VLAN]** All previous fixes (DHCP DNS, firewall, NAT masquerade) working correctly
- **[STATUS]** Network segmentation now ACTIVE

### Local AI Stack Deployed
- **[AI]** Deployed Ollama container with Intel GPU passthrough
- **[AI]** Deployed Open WebUI at http://192.168.31.2:3080
- **[AI]** Installed qwen2.5-coder:7b base model
- **[AI]** Created custom `unraid-assistant` model with infrastructure knowledge
- **[AI]** Created `/usr/local/bin/ai` terminal helper command
- **[AI]** Stopped non-critical containers for RAM: karakeep, unimus, homarr, netdisco-*

### VLAN Activation Attempt & Fixes
- **[VLAN]** Configured CSS326 switch VLANs via SwOS web interface
- **[VLAN]** Enabled VLAN filtering on MikroTik - caused internet outage
- **[VLAN]** Rolled back VLAN filtering to restore connectivity
- **[VLAN]** **ROOT CAUSE IDENTIFIED:** Multiple configuration issues

### Issues Fixed
- **[FIX]** DHCP DNS now points to each VLAN gateway instead of legacy 192.168.31.1
- **[FIX]** Added DNS redirect rules for all VLANs (src-address-list=all-vlans)
- **[FIX]** Added all VLAN interfaces to LAN firewall interface list
- **[FIX]** Added NAT masquerade rules for VLAN traffic to AdGuard container
- **[BACKUP]** MikroTik config saved before activation attempt

---

## 2026-01-25

### VLAN Phase 1 Complete
- **[VLAN]** Added VLAN 25 (Kids) - interface, IP, DHCP server, pool, bridge entry
- **[VLAN]** Fixed VLAN 10 (Management) leases - correct IPs per device assignment doc
- **[VLAN]** Fixed VLAN 30 (IoT) leases - all 14 devices with correct IPs
- **[VLAN]** Added VLAN 25 (Kids) leases - 6 devices including XTRM-Ally
- **[VLAN]** Added VLAN 50 (Guest) leases - 7 unknown devices
- **[VLAN]** Added firewall rules for VLAN 25 (Kids → IoT, Legacy, DNS)
- **[VLAN]** Total devices configured: 44

### VLAN Implementation (Prepared)
- **[VLAN]** Created 6 VLANs on MikroTik bridge (10, 20, 30, 35, 40, 50)
- **[VLAN]** Configured IP addresses for all VLAN interfaces
- **[VLAN]** Created DHCP servers and pools for each VLAN
- **[VLAN]** Added static DHCP leases mapping MACs to VLAN IPs
- **[VLAN]** Configured bridge VLAN table with tagged/untagged ports
- **[VLAN]** Set WiFi ports PVID=20 (Trusted VLAN default)
- **[VLAN]** Added inter-VLAN firewall rules (active)
- **[VLAN]** VLAN filtering NOT YET ENABLED (pending CSS326 switch config)
- **[DOCS]** Added docs/11-VLAN-IMPLEMENTATION.md
- **[SCRIPTS]** Added scripts/mikrotik-vlan-setup.rsc and mikrotik-vlan-enable.rsc

### DNS Configuration
- **[DNS]** Updated both AdGuard instances to use Quad9 DoH
- **[DNS]** Bootstrap DNS: 9.9.9.9, 149.112.112.112

### MikroTik Containers
- **[CONTAINER]** AdGuard Home container running on MikroTik (172.17.0.2)
- **[CONTAINER]** Tailscale container configured (172.17.0.3)
- **[CONTAINER]** Fixed Tailscale container authentication
- **[CONTAINER]** Container bridge (containers-br) with NAT

### Network
- **[NETWORK]** Enabled CSS326 SFP1 port - 10G backbone link to ZX1 now active

### Documentation
- **[DOCS]** Created 02-PORT-UTILIZATION.md with ASCII port diagrams
- **[DOCS]** Fixed ZX1 switch IP: 192.168.31.22 (was incorrectly documented as .7)

### Incident
- **[INCIDENT]** DNS outage after MikroTik restart - multiple root causes fixed:
  - NAT rules blocking AdGuard outbound DNS (added exception rules)
  - DHCP pushing wrong DNS (8.8.8.8 → 192.168.31.1)
  - NAT redirect pointing to wrong IP/port (172.17.0.5:5355 → 192.168.31.4:53)
  - Asymmetric routing (added srcnat masquerade for DNS redirect)
- **[SERVICE]** Removed MikroTik AdGuard Home container (storage/overlay errors)
- **[SERVICE]** Removed MikroTik Tailscale container (root directory missing)
- **[SERVICE]** Removed Pi-hole/Unbound leftovers from MikroTik (veth, mounts, envs)
- **[NETWORK]** Consolidated DNS architecture: MikroTik → Unraid AdGuard (192.168.31.4) only
- **[DOCS]** Created incident reports in docs/incidents/
- **[DOCS]** Restructured documentation - consolidated into 5 core docs + archive
- **[NETBOX]** Added shelf devices for rack organization (U9, U7, U3)

---

## 2026-01-24

- **[NETBOX]** Standardized device names to NetBox convention (HAP1, CSS1, ZX1)
- **[DOCS]** Created NETWORK-PHYSICAL-MAP.md with complete port maps

---

## 2026-01-23

- **[SERVICE]** Deployed Diode network discovery stack
- **[SERVICE]** Removed Slurp'it (replaced by Diode + NetDisco)
- **[SERVICE]** Consolidated NetBox Redis to shared instance
- **[SERVICE]** Removed redundant DNS services (Unbound, DoH-Server, stunnel-dot)

---

## 2026-01-22

- **[SERVICE]** Migrated NetBox to shared PostgreSQL 17
- **[SERVICE]** Deployed AdGuard Home on MikroTik (primary DNS)
- **[SERVICE]** Deployed AdGuard Home on Unraid (secondary DNS)
- **[SERVICE]** Removed Pi-hole (replaced by AdGuard Home)
- **[DOCS]** Created INFRASTRUCTURE-DIAGRAM.md

---

## 2026-01-21

- **[BACKUP]** Configured Rclone sync to Google Drive

---

## 2026-01-19

- **[SERVICE]** Deployed NetBox IPAM/DCIM
- **[SERVICE]** Deployed NetDisco network discovery
- **[NETWORK]** Enabled SNMP on all MikroTik devices

---

## 2026-01-18

- **[SERVICE]** Deployed Gitea git server
- **[SERVICE]** Deployed Woodpecker CI
- **[NETWORK]** Configured CAPsMAN on HAP1
- **[WIRELESS]** CAP added to CAPsMAN management

---

## 2026-01-17

- **[SERVICE]** Deployed Portainer CE

---

## Previous History

For detailed history before 2026-01-17, see archived changelogs in `archive/`.


---

## Format Guide

```markdown
### YYYY-MM-DD
- **[CATEGORY]** Brief description

Categories:
- [DEVICE] - Hardware added/removed/changed
- [SERVICE] - Container/service deployed/removed
- [NETWORK] - Network topology/config changes
- [WIRELESS] - WiFi/CAPsMAN changes
- [BACKUP] - Backup configuration
- [DOCS] - Major documentation changes
- [INCIDENT] - Outages and fixes
- [VLAN] - VLAN configuration changes
- [DOCKER] - Docker maintenance
```
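
Some entries in this changelog use tags that the guide does not list (for example `[FIX]` and `[STATUS]`). The sketch below is one possible way to check entries against the guide. It is not part of the existing tooling: the function names and the regular expression are hypothetical, and it assumes every tagged entry follows the `- **[CATEGORY]** description` shape shown above.

```python
import re

# Categories listed in the Format Guide above.
GUIDE_CATEGORIES = {
    "DEVICE", "SERVICE", "NETWORK", "WIRELESS", "BACKUP",
    "DOCS", "INCIDENT", "VLAN", "DOCKER",
}

# Matches a tagged entry such as "- **[DOCS]** Created incident report".
# Hypothetical pattern: assumes upper-case tags and a non-empty description.
ENTRY_RE = re.compile(r"^\s*- \*\*\[([A-Z]+)\]\*\*\s+(\S.*)$")


def parse_entry(line):
    """Return (category, description) for a tagged entry, else None."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def unlisted_categories(lines):
    """Collect categories used in entries but absent from the guide."""
    found = set()
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry[0] not in GUIDE_CATEGORIES:
            found.add(entry[0])
    return sorted(found)


# Sample lines copied from this changelog.
sample = [
    "- **[SERVICE]** Deployed Portainer CE",
    "- **[FIX]** Added all VLAN interfaces to LAN firewall interface list",
    "- open-webui, ollama (AI stack - for future use)",
]
print(unlisted_categories(sample))
```

Untagged bullets, such as the "Kept (Stopped Containers)" list, are skipped rather than reported, so the check only flags tags that are actually present but undocumented.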