Major documentation restructure - consolidated docs

New Structure:
- 01-NETWORK-MAP.md - Network topology, IPs, Docker networks, services
- 02-SERVICES-CRITICAL.md - DNS, Auth, Routing (P0/P1 services)
- 03-SERVICES-OTHER.md - All non-critical services
- 04-HARDWARE-INVENTORY.md - Physical devices and specs
- 05-CHANGELOG.md - Major events only

New Folders:
- docs/archive/ - Legacy docs (read-only reference)
- docs/wip/ - Planned changes and ideas
  - UPGRADE-2026-HARDWARE.md - N5 Air + N100 migration plan
  - GITOPS-CONTAINERS.md - Phase 2 container GitOps

Changes:
- Moved all 22 legacy docs to archive/
- Consolidated container IPs, physical map, and services into single network map
- Extracted critical vs non-critical service classification
- Simplified changelog to major events only

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 11:17:33 +02:00
parent ed17dea2d6
commit b250493d5a
31 changed files with 1585 additions and 23 deletions


@@ -0,0 +1,180 @@
# GitOps for Container Management
**Status:** 💡 IDEA
**Depends On:** Hardware upgrade completion
**Author:** Kaloyan
---
## Overview
Version control all container configurations to:
1. Track changes over time
2. Maintain consistency between XTRM-N5 and XTRM-N1
3. Enable automated deployments via Woodpecker CI
4. Recover from disasters quickly
---
## Repository Structure
```
infrastructure/
├── configs/
│   ├── common/                  # Shared configs
│   │   ├── traefik/
│   │   │   └── dynamic.yml
│   │   └── authentik/
│   │       └── blueprints/
│   │
│   ├── xtrm-n5/                 # Production server
│   │   ├── docker/
│   │   │   ├── compose/         # docker-compose files
│   │   │   │   ├── netbox.yml
│   │   │   │   ├── gitea.yml
│   │   │   │   └── ...
│   │   │   ├── templates/       # Unraid XML templates
│   │   │   └── env/             # Environment files (.env.example)
│   │   ├── network/
│   │   │   └── docker-networks.json
│   │   └── unraid/
│   │       ├── shares.json
│   │       └── users.json
│   │
│   └── xtrm-n1/                 # Survival node
│       ├── docker/
│       │   └── compose/
│       │       ├── adguard.yml
│       │       ├── vaultwarden.yml
│       │       └── authentik-replica.yml
│       └── proxmox/
│           └── vm-configs/
└── .woodpecker.yml
```
---
## Workflow
### 1. Change Detection
```mermaid
flowchart LR
A[Edit config in Git] --> B[Push to main]
B --> C[Woodpecker CI triggers]
C --> D{Validate configs}
D -->|Pass| E[Deploy to target server]
D -->|Fail| F[Notify & block]
```
### 2. Drift Detection
```mermaid
flowchart LR
A[Scheduled job] --> B[Export current state]
B --> C{Compare to Git}
C -->|Match| D[All good]
C -->|Drift| E[Alert + PR with diff]
```
---
## Implementation Phases
### Phase 2.1: Export Current State
1. Export all docker-compose files
2. Export Unraid container templates (XML → YAML)
3. Export network configurations
4. Create initial commit
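As a rough illustration of step 2, the sketch below parses one Unraid-style template into a compose-like dictionary. The template layout (`Name`, `Repository`, `Network`, and `Config` elements with `Type`, `Target`, and `Mode` attributes) is an assumption about how the templates look, and `template_to_service` and `EXAMPLE_TEMPLATE` are hypothetical names with made-up values; check them against the real exported XML before relying on this.

```python
import json
import xml.etree.ElementTree as ET


def template_to_service(xml_text: str) -> dict:
    """Convert one Unraid-style container template into a compose-like dict.

    The element and attribute names used here are assumptions, not a
    confirmed description of the Unraid template format.
    """
    root = ET.fromstring(xml_text)
    service = {
        "container_name": root.findtext("Name", default="").strip(),
        "image": root.findtext("Repository", default="").strip(),
        "network_mode": root.findtext("Network", default="bridge").strip(),
        "ports": [],
        "volumes": [],
        "environment": {},
    }
    for cfg in root.findall("Config"):
        kind = cfg.get("Type", "")
        target = cfg.get("Target", "")
        value = (cfg.text or "").strip()
        if kind == "Port":
            # host_port:container_port/protocol
            service["ports"].append(f"{value}:{target}/{cfg.get('Mode', 'tcp')}")
        elif kind == "Path":
            # host_path:container_path:access_mode
            service["volumes"].append(f"{value}:{target}:{cfg.get('Mode', 'rw')}")
        elif kind == "Variable":
            service["environment"][target] = value
    return service


# Made-up example template, for illustration only.
EXAMPLE_TEMPLATE = """
<Container version="2">
  <Name>netbox</Name>
  <Repository>netboxcommunity/netbox:latest</Repository>
  <Network>bridge</Network>
  <Config Name="Web UI" Target="8080" Mode="tcp" Type="Port">8000</Config>
  <Config Name="Config" Target="/etc/netbox/config" Mode="rw" Type="Path">/mnt/user/appdata/netbox</Config>
  <Config Name="Timezone" Target="TZ" Type="Variable">Etc/UTC</Config>
</Container>
"""

if __name__ == "__main__":
    services = {"services": {"netbox": template_to_service(EXAMPLE_TEMPLATE)}}
    print(json.dumps(services, indent=2))
```

The sketch prints JSON to stay within the standard library; a YAML library could serialize the same dictionary. Variables that hold secrets should probably be moved to the `.env` file rather than committed.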
### Phase 2.2: CI Pipeline
```yaml
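# .woodpecker.yml
# Current Woodpecker releases name this key `steps:` (older ones used `pipeline:`).
steps:
  validate:
    image: docker:latest
    commands:
      # `-f` takes one file per flag, so validate each compose file in turn
      - |
        for f in configs/xtrm-n5/docker/compose/*.yml; do
          docker compose -f "$f" config --quiet || exit 1
        done
  deploy-n5:
    image: alpine/ssh
    when:
      path: configs/xtrm-n5/**
    commands:
      - ssh root@192.168.31.2 "cd /path && docker compose up -d"
    # Newer Woodpecker releases prefer `environment:` with `from_secret`
    secrets: [ssh_key]
  deploy-n1:
    image: alpine/ssh
    when:
      path: configs/xtrm-n1/**
    commands:
      - ssh root@xtrm-n1 "cd /path && docker compose up -d"
    secrets: [ssh_key]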
```
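To make the effect of the `when.path` filters easier to reason about, here is a small sketch of which deploy step would fire for a given set of changed files. It uses plain prefix matching as a stand-in for Woodpecker's glob matching, and `DEPLOY_TARGETS` and `steps_for_changes` are hypothetical names, not part of Woodpecker.

```python
# Hypothetical mapping that mirrors the `when.path` filters in the pipeline.
DEPLOY_TARGETS = {
    "configs/xtrm-n5/": "deploy-n5",
    "configs/xtrm-n1/": "deploy-n1",
}


def steps_for_changes(changed_files: list[str]) -> list[str]:
    """Return the deploy steps whose path prefix matches any changed file."""
    triggered = []
    for prefix, step in DEPLOY_TARGETS.items():
        if any(path.startswith(prefix) for path in changed_files):
            triggered.append(step)
    return triggered


if __name__ == "__main__":
    print(steps_for_changes(["configs/xtrm-n5/docker/compose/netbox.yml"]))
    print(steps_for_changes(["configs/common/traefik/dynamic.yml"]))
```

One thing this makes visible: with the filters as written, a change under `configs/common/` triggers neither deploy step, so shared configs may need their own rule.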
### Phase 2.3: Drift Detection
Scheduled Woodpecker job:
1. SSH to each server
2. Export current docker/network state
3. Compare to Git configs
4. Create issue/PR if drift detected
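Step 3 could be sketched as a file-by-file comparison between the Git checkout and a directory of freshly exported state. This assumes the export from step 2 produces files with the same relative paths as the repository; `detect_drift` is a hypothetical helper, and creating the issue or PR from its output is left out.

```python
import difflib
from pathlib import Path


def _read_lines(path: Path) -> list[str]:
    """Read a file as lines, treating a missing file as empty."""
    return path.read_text().splitlines(keepends=True) if path.is_file() else []


def detect_drift(git_dir: Path, exported_dir: Path) -> dict[str, str]:
    """Return {relative_path: unified_diff} for files that differ or exist on one side only."""
    names = {p.relative_to(git_dir) for p in git_dir.rglob("*") if p.is_file()}
    names |= {p.relative_to(exported_dir) for p in exported_dir.rglob("*") if p.is_file()}

    drift = {}
    for rel in sorted(names):
        want = _read_lines(git_dir / rel)       # state recorded in Git
        have = _read_lines(exported_dir / rel)  # state exported from the server
        if want != have:
            diff = difflib.unified_diff(
                want, have, fromfile=f"git/{rel.as_posix()}", tofile=f"live/{rel.as_posix()}"
            )
            drift[rel.as_posix()] = "".join(diff)
    return drift
```

Point `git_dir` at a config subdirectory such as `configs/xtrm-n5/` rather than the repository root, so that `.git` internals are not compared. An empty result corresponds to the "All good" branch of the flowchart.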
### Phase 2.4: Unraid GUI Sync
**Challenge:** Changes made in the Unraid GUI need to be synced back to Git
**Solution Options:**
| Option | Pros | Cons |
|--------|------|------|
| **A: Webhook on change** | Real-time sync | Complex, needs Unraid plugin |
| **B: Scheduled export** | Simple, reliable | Delay between change and commit |
| **C: Prohibit GUI changes** | Clean workflow | User friction |
**Recommended:** Option B with daily scheduled exports
```bash
# Cron job on Unraid
0 4 * * * /boot/config/scripts/export-docker-config.sh
```
---
## Secrets Management
**Options:**
| Tool | Integration | Complexity |
|------|-------------|------------|
| Woodpecker Secrets | Native | Low |
| Vaultwarden API | Via script | Medium |
| HashiCorp Vault | Enterprise | High |
**Recommended:** Woodpecker Secrets for CI, `.env.example` in Git
```yaml
# In docker-compose
services:
  app:
    env_file:
      - .env  # Not in Git, created from .env.example + secrets
```
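One way the `.env` file might be produced at deploy time is sketched below: keys listed in `.env.example` are filled from a secrets mapping, and everything else is copied through unchanged. `render_env` is a hypothetical helper, and the assumption that secrets arrive as a plain key/value mapping (for example from environment variables set by CI) is not confirmed by this document.

```python
import re

# Matches simple KEY=value lines; comments and blank lines do not match.
_ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def render_env(example_text: str, secrets: dict[str, str]) -> str:
    """Fill keys from .env.example with secret values; keep other lines as-is."""
    out = []
    for line in example_text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match and match.group(1) in secrets:
            key = match.group(1)
            out.append(f"{key}={secrets[key]}")
        else:
            out.append(line)  # comments, blanks, and non-secret defaults
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Made-up example content and secret value.
    example = "# NetBox\nDB_HOST=postgres\nDB_PASSWORD=changeme\n"
    print(render_env(example, {"DB_PASSWORD": "s3cret"}), end="")
```

Only keys already present in `.env.example` are written, so the example file stays the single list of expected variables.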
---
## Rollback Strategy
1. **Git revert** - Revert commit, CI redeploys previous version
2. **Tagged releases** - Deploy specific tag
3. **Manual override** - SSH and docker compose down/up
---
## Related Documents
- `UPGRADE-2026-HARDWARE.md` - Hardware prerequisite

docs/wip/README.md

@@ -0,0 +1,16 @@
# Work In Progress
This folder contains planned changes, evaluations, and ideas that are not yet implemented.
## Document Status
| Status | Meaning |
|--------|---------|
| 📋 PLANNED | Approved, waiting for resources/time |
| 🔬 EVALUATING | Under investigation/research |
| 💡 IDEA | Concept, needs further definition |
## Current Items
- `UPGRADE-2026-HARDWARE.md` - Hardware upgrade plan (N5 Air + N100)
- `GITOPS-CONTAINERS.md` - Container GitOps implementation (Phase 2)


@@ -0,0 +1,229 @@
# Hardware Upgrade Plan 2026
**Status:** 📋 PLANNED
**Target:** Q1 2026
**Author:** Kaloyan
---
## Overview
Migration from single-server architecture to dual-server with survival capabilities.
## Device Naming Convention
| Name | Hardware | Role | Notes |
|------|----------|------|-------|
| **XTRM-U** | Current NAS chassis | Current production | Will be decommissioned |
| **XTRM-N5** | Minisforum N5 Air | New production NAS | Inherits XTRM-U storage, OS, services + new NVMe pool |
| **XTRM-N1** | N100 ITX (XTRM-U board) | Survival node | Same 4x2.5GbE NIC, 1U 10" case, no HDDs |
---
## Hardware Specifications
### XTRM-N5 (Minisforum N5 Air) - Production Server
| Component | Specification |
|-----------|---------------|
| **CPU** | Intel N95/N100 or similar |
| **RAM** | TBD |
| **Storage (Fast)** | 3x Samsung 990 EVO Plus 1TB (NVMe ZFS RAID-Z1) |
| **Storage (Capacity)** | Existing 3.5" HDD Array (from XTRM-U) |
| **Network** | 10GbE Primary + 5GbE Secondary |
| **OS** | Unraid 7 (internal boot from NVMe pool) |
### XTRM-N1 (N100 ITX) - Survival Node
| Component | Specification |
|-----------|---------------|
| **Board** | XTRM-U motherboard (4x 2.5GbE Intel NICs) |
| **Chassis** | 1U 10" Rackmount (modular/3D printed) |
| **Storage** | 2x Fikwot FX501Pro 512GB (NVMe mirror) |
| **Power** | PicoPSU + 12V External Brick |
| **OS** | Proxmox VE or Debian |
---
## Migration Flow
```
XTRM-U (Current State)
├── Hardware (board, 4x2.5GbE) ──────────────► XTRM-N1 (1U 10" case)
│                                              └── New role: Survival node
│                                              └── New storage: 2x Fikwot 512GB
├── Storage (3.5" HDDs) ─────────────────────► XTRM-N5
├── OS (Unraid + config) ────────────────────► XTRM-N5
├── Services (containers) ───────────────────► XTRM-N5
│                                              └── New: Samsung NVMe pool
└── Chassis ─────────────────────────────────► Retired
```
---
## Post-Migration Architecture
### XTRM-N1 (Survival Node)
**Purpose:** Keep network alive if production server is offline
| Service | Purpose | Priority |
|---------|---------|----------|
| OPNsense | Firewall/Router (replaces HAP1 routing; only if Option B below is chosen) | P0 |
| AdGuard Home | DNS | P0 |
| Vaultwarden | Password manager | P0 |
| Authentik (replica) | SSO backup | P1 |
### XTRM-N5 (Production Server)
**Purpose:** Primary services and storage
| Service | Purpose | Priority |
|---------|---------|----------|
| Nextcloud | File sync/storage | P1 |
| Plex | Media streaming | P2 |
| Authentik (primary) | SSO | P1 |
| Gitea | Git hosting | P2 |
| Woodpecker CI | CI/CD | P2 |
| NetBox | Network documentation | P2 |
| All other services | Various | P3 |
---
## Rack Layout (Post-Migration)
### 10" Rack (9U)
| U | Device | Notes |
|---|--------|-------|
| U9 | Shelf + ISP Gateway | Unchanged |
| U8 | PP2 (12-port) | Unchanged |
| U7 | Shelf + ZX1 (or CRS310) | Future: upgrade to CRS310 |
| U6 | **XTRM-N1** | NEW: Survival node |
| U5 | (empty) | |
| U4 | (XTRM-N5 continued) | |
| U3 | (XTRM-N5 continued) | |
| U2 | (XTRM-N5 continued) | |
| U1 | **XTRM-N5** | NEW: Production server |
---
## Software Configuration
### XTRM-N5: Unraid 7 Features
1. **Internal Boot Pool**
- Use "Add Bootable Pool" feature
- ZFS RAID-Z1 with 3x Samsung 990 EVO Plus
- Partition 1: 8GB (OS boot)
- Partition 2: ~1.8TB usable (data/containers)
- Remove USB boot stick
2. **Storage Pools**
- **nvme_pool**: 3x Samsung 990 EVO Plus (~1.8TB usable)
- **hdd_pool**: Existing 3.5" HDDs (capacity tier)
### XTRM-N1: Proxmox Configuration
1. **Storage**
- ZFS mirror: 2x Fikwot 512GB
- Boot + VMs + Containers
2. **VMs/Containers**
- OPNsense VM (if routing moves here)
- AdGuard LXC
- Vaultwarden LXC
- Authentik LXC (replica)
---
## Network Changes
### Current (HAP1 as router)
```
Internet → HAP1 (Router) → ZX1 → XTRM-U
                         → CSS1 → Room outlets
```
### Option A: Keep HAP1 as Router
```
Internet → HAP1 (Router) → ZX1 → XTRM-N5
                ↓              → XTRM-N1 (services only)
                         → CSS1 → Room outlets
```
### Option B: Move Routing to XTRM-N1
```
Internet → HAP1 (Bridge) → XTRM-N1 (OPNsense) → ZX1 → XTRM-N5
                                              → CSS1 → Rooms
```
**Decision:** TBD based on performance testing
---
## Thermal Management
| Strategy | Implementation |
|----------|----------------|
| Samsung NVMe in N5 | "Thermal Champions" - run cooler than Fikwot |
| Physical spacing | 1U gap between XTRM-N1 and XTRM-N5 |
| Active cooling | Exhaust fans tied to drive temp sensors |
| HDD hibernation | Aggressive spindown, NVMe handles active I/O |
---
## Future Milestone: CRS310 Upgrade
When budget allows, replace ZX1 with MikroTik CRS310-8G+2S+IN:
| Feature | Benefit |
|---------|---------|
| 10G DAC to CSS326 | Proper backbone |
| L3 offloading | Inter-VLAN routing off HAP1 |
| 8x 2.5G ports | All servers at full speed |
---
## Migration Checklist
### Pre-Migration
- [ ] Order XTRM-N5 hardware
- [ ] Order 3x Samsung 990 EVO Plus 1TB
- [ ] Order 2x Fikwot FX501Pro 512GB
- [ ] Design/print 1U 10" chassis for N1
- [ ] Order PicoPSU + 12V brick
### XTRM-N1 Setup (First)
- [ ] Extract XTRM-U motherboard
- [ ] Install in 1U chassis
- [ ] Install Fikwot NVMe drives
- [ ] Install Proxmox VE
- [ ] Deploy survival services (AdGuard, Vaultwarden)
- [ ] Test network survival (unplug XTRM-U)
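The survival test in the last item could be partly automated with a reachability check run while XTRM-U is unplugged. The sketch below only tests whether TCP ports accept connections; the hostname and port numbers in `SURVIVAL_CHECKS` are placeholders, and `tcp_reachable` and `survival_report` are hypothetical helpers.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def survival_report(checks: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Map each service name to whether its endpoint accepted a connection."""
    return {name: tcp_reachable(host, port) for name, (host, port) in checks.items()}


# Placeholder endpoints: adjust the host and ports to the real deployment.
SURVIVAL_CHECKS = {
    "AdGuard Home (DNS over TCP)": ("xtrm-n1", 53),
    "Vaultwarden": ("xtrm-n1", 8080),
}

if __name__ == "__main__":
    for name, ok in survival_report(SURVIVAL_CHECKS).items():
        print(f"{'OK  ' if ok else 'FAIL'} {name}")
```

A port accepting connections does not prove the service works end to end, so a real DNS lookup and a Vaultwarden login are still worth doing by hand.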
### XTRM-N5 Setup
- [ ] Unbox and install Samsung NVMe drives
- [ ] Create ZFS boot pool
- [ ] Migrate Unraid config from USB
- [ ] Transfer HDD array
- [ ] Verify all services running
- [ ] Update DNS/IPs as needed
### Post-Migration
- [ ] Update NetBox with new devices
- [ ] Update documentation
- [ ] Decommission old XTRM-U chassis
- [ ] Performance testing
- [ ] Thermal monitoring
---
## Related Documents
- `GITOPS-CONTAINERS.md` - Phase 2: Container configuration in Git