Major documentation restructure - consolidated docs
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
New Structure:
- 01-NETWORK-MAP.md - Network topology, IPs, Docker networks, services
- 02-SERVICES-CRITICAL.md - DNS, Auth, Routing (P0/P1 services)
- 03-SERVICES-OTHER.md - All non-critical services
- 04-HARDWARE-INVENTORY.md - Physical devices and specs
- 05-CHANGELOG.md - Major events only

New Folders:
- docs/archive/ - Legacy docs (read-only reference)
- docs/wip/ - Planned changes and ideas
  - UPGRADE-2026-HARDWARE.md - N5 Air + N100 migration plan
  - GITOPS-CONTAINERS.md - Phase 2 container GitOps

Changes:
- Moved all 22 legacy docs to archive/
- Consolidated container IPs, physical map, and services into single network map
- Extracted critical vs non-critical service classification
- Simplified changelog to major events only

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
180
docs/wip/GITOPS-CONTAINERS.md
Normal file
@@ -0,0 +1,180 @@
# GitOps for Container Management

**Status:** 💡 IDEA
**Depends On:** Hardware upgrade completion
**Author:** Kaloyan

---

## Overview

Version control all container configurations to:
1. Track changes over time
2. Maintain consistency between XTRM-N5 and XTRM-N1
3. Enable automated deployments via Woodpecker CI
4. Recover from disasters quickly

---

## Repository Structure

```
infrastructure/
├── configs/
│   ├── common/                    # Shared configs
│   │   ├── traefik/
│   │   │   └── dynamic.yml
│   │   └── authentik/
│   │       └── blueprints/
│   │
│   ├── xtrm-n5/                   # Production server
│   │   ├── docker/
│   │   │   ├── compose/           # docker-compose files
│   │   │   │   ├── netbox.yml
│   │   │   │   ├── gitea.yml
│   │   │   │   └── ...
│   │   │   ├── templates/         # Unraid XML templates
│   │   │   └── env/               # Environment files (.env.example)
│   │   ├── network/
│   │   │   └── docker-networks.json
│   │   └── unraid/
│   │       ├── shares.json
│   │       └── users.json
│   │
│   └── xtrm-n1/                   # Survival node
│       ├── docker/
│       │   └── compose/
│       │       ├── adguard.yml
│       │       ├── vaultwarden.yml
│       │       └── authentik-replica.yml
│       └── proxmox/
│           └── vm-configs/
│
└── .woodpecker.yml
```

---

## Workflow

### 1. Change Detection

```mermaid
flowchart LR
    A[Edit config in Git] --> B[Push to main]
    B --> C[Woodpecker CI triggers]
    C --> D{Validate configs}
    D -->|Pass| E[Deploy to target server]
    D -->|Fail| F[Notify & block]
```

### 2. Drift Detection

```mermaid
flowchart LR
    A[Scheduled job] --> B[Export current state]
    B --> C{Compare to Git}
    C -->|Match| D[All good]
    C -->|Drift| E[Alert + PR with diff]
```

---

## Implementation Phases

### Phase 2.1: Export Current State

1. Export all docker-compose files
2. Export Unraid container templates (XML → YAML)
3. Export network configurations
4. Create initial commit
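
Step 2 (XML → YAML) could look roughly like the sketch below. It is a minimal illustration, not the project's export tool. The sample template and its element and attribute names (`Repository`, `Network`, `Config` with `Type`, `Target`, `Mode`) follow the layout Unraid templates commonly use, but are assumptions here and should be checked against real files. `template_to_service` is a hypothetical helper. The output is JSON, which any YAML parser also accepts, so the sketch needs only the standard library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample; real templates carry more fields and attributes.
SAMPLE_TEMPLATE = """\
<Container version="2">
  <Name>adguard</Name>
  <Repository>adguard/adguardhome:latest</Repository>
  <Network>bridge</Network>
  <Config Name="Web UI" Target="3000" Mode="tcp" Type="Port">3000</Config>
  <Config Name="Config" Target="/opt/adguardhome/conf" Mode="rw" Type="Path">/mnt/user/appdata/adguard</Config>
  <Config Name="Timezone" Target="TZ" Type="Variable">UTC</Config>
</Container>
"""


def template_to_service(xml_text):
    """Convert one template into a compose-style {"services": {...}} dict."""
    root = ET.fromstring(xml_text)
    service = {
        "image": root.findtext("Repository", default="").strip(),
        "network_mode": root.findtext("Network", default="bridge").strip(),
        "ports": [],
        "volumes": [],
        "environment": {},
    }
    for cfg in root.iter("Config"):
        kind = cfg.get("Type")
        target = cfg.get("Target", "")
        value = (cfg.text or "").strip()  # host-side value chosen in the GUI
        if kind == "Port":
            service["ports"].append(f"{value}:{target}/{cfg.get('Mode', 'tcp')}")
        elif kind == "Path":
            service["volumes"].append(f"{value}:{target}:{cfg.get('Mode', 'rw')}")
        elif kind == "Variable":
            service["environment"][target] = value
    name = root.findtext("Name", default="unnamed").strip()
    return {"services": {name: service}}


if __name__ == "__main__":
    print(json.dumps(template_to_service(SAMPLE_TEMPLATE), indent=2))
```

Variables that hold secrets would still need to be moved out of the exported file before the initial commit.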

### Phase 2.2: CI Pipeline

```yaml
# .woodpecker.yml
# Woodpecker 1.0+ uses `steps:`; the older `pipeline:` key is deprecated
steps:
  validate:
    image: docker:latest
    commands:
      # `-f` takes one file per flag, so validate each compose file separately
      - ls configs/xtrm-n5/docker/compose/*.yml | xargs -I{} docker compose -f {} config --quiet

  deploy-n5:
    image: alpine/ssh
    when:
      path: "configs/xtrm-n5/**"
    commands:
      # /path is a placeholder for the compose directory on the server
      - ssh root@192.168.31.2 "cd /path && docker compose up -d"
    secrets: [ssh_key]

  deploy-n1:
    image: alpine/ssh
    when:
      path: "configs/xtrm-n1/**"
    commands:
      - ssh root@xtrm-n1 "cd /path && docker compose up -d"
    secrets: [ssh_key]
```
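
The `path` filters decide which server a push redeploys, so it helps to check which changes trigger which step. The sketch below approximates that matching in Python with `fnmatchcase`. Woodpecker uses its own glob implementation, so treat this as an illustration of the intent rather than of its exact matching rules; `DEPLOY_RULES` and `steps_for` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Path filters copied from the two deploy steps in the pipeline.
DEPLOY_RULES = {
    "deploy-n5": "configs/xtrm-n5/**",
    "deploy-n1": "configs/xtrm-n1/**",
}


def steps_for(changed_files):
    """Return the deploy steps whose path filter matches any changed file."""
    return sorted(
        step
        for step, pattern in DEPLOY_RULES.items()
        if any(fnmatchcase(path, pattern) for path in changed_files)
    )


if __name__ == "__main__":
    print(steps_for(["configs/xtrm-n5/docker/compose/gitea.yml"]))
    print(steps_for(["configs/common/traefik/dynamic.yml"]))
```

Under these filters a change that only touches `configs/common/` matches neither deploy step, so shared Traefik or Authentik configs would need their own rule or a broader pattern.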

### Phase 2.3: Drift Detection

Scheduled Woodpecker job:
1. SSH to each server
2. Export current docker/network state
3. Compare to Git configs
4. Create issue/PR if drift detected
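
The comparison in step 3 can be sketched as follows, assuming both sides have already been loaded as `{relative path: file text}` dictionaries (how the live state gets exported over SSH is left out). `detect_drift` is a hypothetical helper; the unified diff it returns is the kind of text that could go into the issue or PR body in step 4.

```python
import difflib


def detect_drift(git_configs, live_configs):
    """Return {path: unified diff} for every config that differs or is missing."""
    drift = {}
    for name in sorted(set(git_configs) | set(live_configs)):
        want = git_configs.get(name, "")   # what Git says should be deployed
        have = live_configs.get(name, "")  # what the server actually runs
        if want != have:
            diff = difflib.unified_diff(
                want.splitlines(keepends=True),
                have.splitlines(keepends=True),
                fromfile=f"git/{name}",
                tofile=f"live/{name}",
            )
            drift[name] = "".join(diff)
    return drift


if __name__ == "__main__":
    git_state = {"compose/gitea.yml": "image: gitea/gitea:1.21\n"}
    live_state = {"compose/gitea.yml": "image: gitea/gitea:1.22\n"}
    for path, diff in detect_drift(git_state, live_state).items():
        print(f"Drift in {path}:\n{diff}")
```

A file that exists on only one side shows up as a diff against an empty file, which covers containers created or removed by hand.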

### Phase 2.4: Unraid GUI Sync

**Challenge:** Changes made in Unraid GUI need to sync to Git

**Solution Options:**

| Option | Pros | Cons |
|--------|------|------|
| **A: Webhook on change** | Real-time sync | Complex, needs Unraid plugin |
| **B: Scheduled export** | Simple, reliable | Delay between change and commit |
| **C: Prohibit GUI changes** | Clean workflow | User friction |

**Recommended:** Option B with daily scheduled exports

```bash
# Cron job on Unraid
0 4 * * * /boot/config/scripts/export-docker-config.sh
```

---

## Secrets Management

**Options:**

| Tool | Integration | Complexity |
|------|-------------|------------|
| Woodpecker Secrets | Native | Low |
| Vaultwarden API | Via script | Medium |
| HashiCorp Vault | Enterprise | High |

**Recommended:** Woodpecker Secrets for CI, `.env.example` in Git

```yaml
# In docker-compose
services:
  app:
    env_file:
      - .env  # Not in Git, created from .env.example + secrets
```
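
One way to create `.env` from `.env.example` plus secrets is sketched below. It assumes the example file uses plain `KEY=value` lines and that the CI step receives secrets as environment variables with the same names; `render_env` and the variable names in the demo are hypothetical.

```python
import os


def render_env(example_text, secrets):
    """Fill KEY=default lines from `secrets`; keep comments and blanks as-is."""
    lines = []
    for line in example_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            lines.append(line)
            continue
        key, _, default = stripped.partition("=")
        key = key.strip()
        # A secret with the same name wins; otherwise keep the example default.
        lines.append(f"{key}={secrets.get(key, default)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = "# Database\nDB_USER=netbox\nDB_PASSWORD=\n"
    # In CI the second argument would typically be os.environ.
    print(render_env(example, {"DB_PASSWORD": "example-only"}), end="")
```

Keys left empty in `.env.example` and missing from the secrets stay empty, so a deploy step may want to fail early when that happens.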

---

## Rollback Strategy

1. **Git revert** - Revert commit, CI redeploys previous version
2. **Tagged releases** - Deploy specific tag
3. **Manual override** - SSH and docker compose down/up

---

## Related Documents

- `UPGRADE-2026-HARDWARE.md` - Hardware prerequisite
16
docs/wip/README.md
Normal file
@@ -0,0 +1,16 @@
# Work In Progress

This folder contains planned changes, evaluations, and ideas that are not yet implemented.

## Document Status

| Status | Meaning |
|--------|---------|
| 📋 PLANNED | Approved, waiting for resources/time |
| 🔬 EVALUATING | Under investigation/research |
| 💡 IDEA | Concept, needs further definition |

## Current Items

- `UPGRADE-2026-HARDWARE.md` - Hardware upgrade plan (N5 Air + N100)
- `GITOPS-CONTAINERS.md` - Container GitOps implementation (Phase 2)
229
docs/wip/UPGRADE-2026-HARDWARE.md
Normal file
@@ -0,0 +1,229 @@
# Hardware Upgrade Plan 2026

**Status:** 📋 PLANNED
**Target:** Q1 2026
**Author:** Kaloyan

---

## Overview

Migration from single-server architecture to dual-server with survival capabilities.

## Device Naming Convention

| Name | Hardware | Role | Notes |
|------|----------|------|-------|
| **XTRM-U** | Current NAS chassis | Current production | Will be decommissioned |
| **XTRM-N5** | Minisforum N5 Air | New production NAS | Inherits XTRM-U storage, OS, services + new NVMe pool |
| **XTRM-N1** | N100 ITX (XTRM-U board) | Survival node | Same 4x2.5GbE NIC, 1U 10" case, no HDDs |

---

## Hardware Specifications

### XTRM-N5 (Minisforum N5 Air) - Production Server

| Component | Specification |
|-----------|---------------|
| **CPU** | Intel N95/N100 or similar |
| **RAM** | TBD |
| **Storage (Fast)** | 3x Samsung 990 EVO Plus 1TB (NVMe ZFS RAID-Z1) |
| **Storage (Capacity)** | Existing 3.5" HDD Array (from XTRM-U) |
| **Network** | 10GbE Primary + 5GbE Secondary |
| **OS** | Unraid 7 (internal boot from NVMe pool) |

### XTRM-N1 (N100 ITX) - Survival Node

| Component | Specification |
|-----------|---------------|
| **Board** | XTRM-U motherboard (4x 2.5GbE Intel NICs) |
| **Chassis** | 1U 10" Rackmount (modular/3D printed) |
| **Storage** | 2x Fikwot FX501Pro 512GB (NVMe mirror) |
| **Power** | PicoPSU + 12V External Brick |
| **OS** | Proxmox VE or Debian |

---

## Migration Flow

```
XTRM-U (Current State)
├── Hardware (board, 4x2.5GbE) ──────────────► XTRM-N1 (1U 10" case)
│   └── New role: Survival node
│   └── New storage: 2x Fikwot 512GB
│
├── Storage (3.5" HDDs) ─────────────────────► XTRM-N5
├── OS (Unraid + config) ────────────────────► XTRM-N5
├── Services (containers) ───────────────────► XTRM-N5
│   └── New: Samsung NVMe pool
│
└── Chassis ─────────────────────────────────► Retired
```

---

## Post-Migration Architecture

### XTRM-N1 (Survival Node)

**Purpose:** Keep network alive if production server is offline

| Service | Purpose | Priority |
|---------|---------|----------|
| OPNsense | Firewall/Router (replaces HAP1 routing) | P0 |
| AdGuard Home | DNS | P0 |
| Vaultwarden | Password manager | P0 |
| Authentik (replica) | SSO backup | P1 |

### XTRM-N5 (Production Server)

**Purpose:** Primary services and storage

| Service | Purpose | Priority |
|---------|---------|----------|
| Nextcloud | File sync/storage | P1 |
| Plex | Media streaming | P2 |
| Authentik (primary) | SSO | P1 |
| Gitea | Git hosting | P2 |
| Woodpecker CI | CI/CD | P2 |
| NetBox | Network documentation | P2 |
| All other services | Various | P3 |

---

## Rack Layout (Post-Migration)

### 10" Rack (9U)

| U | Device | Notes |
|---|--------|-------|
| U9 | Shelf + ISP Gateway | Unchanged |
| U8 | PP2 (12-port) | Unchanged |
| U7 | Shelf + ZX1 (or CRS310) | Future: upgrade to CRS310 |
| U6 | **XTRM-N1** | NEW: Survival node |
| U5 | (empty) | |
| U4 | (XTRM-N5 continued) | |
| U3 | (XTRM-N5 continued) | |
| U2 | (XTRM-N5 continued) | |
| U1 | **XTRM-N5** | NEW: Production server |

---

## Software Configuration

### XTRM-N5: Unraid 7 Features

1. **Internal Boot Pool**
   - Use "Add Bootable Pool" feature
   - ZFS RAID-Z1 with 3x Samsung 990 EVO Plus
   - Partition 1: 8GB (OS boot)
   - Partition 2: ~1.8TB usable (data/containers)
   - Remove USB boot stick

2. **Storage Pools**
   - **nvme_pool**: 3x Samsung 990 EVO Plus (~1.8TB usable)
   - **hdd_pool**: Existing 3.5" HDDs (capacity tier)
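
The "~1.8TB usable" figure can be sanity-checked with a quick estimate. RAID-Z1 spends roughly one drive's worth of space on parity, so three 1 TB drives give about two drives of capacity. The sketch below ignores ZFS metadata and padding overhead, and it assumes the 8GB boot partition is carved out of each drive, which the plan does not state explicitly; `raidz1_usable_bytes` is a hypothetical helper.

```python
TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, as most OS tools report capacity


def raidz1_usable_bytes(drives, drive_bytes, reserved_per_drive=0):
    """Approximate RAID-Z1 capacity: one drive's worth of space goes to parity."""
    if drives < 2:
        raise ValueError("RAID-Z1 needs at least two drives")
    return (drives - 1) * (drive_bytes - reserved_per_drive)


if __name__ == "__main__":
    raw = raidz1_usable_bytes(3, 1 * TB)
    # Assumption: 8 GB boot partition reserved on every drive.
    data = raidz1_usable_bytes(3, 1 * TB, reserved_per_drive=8 * 10**9)
    print(f"whole drives:   {raw / TB:.2f} TB = {raw / TIB:.2f} TiB")
    print(f"data partition: {data / TB:.2f} TB = {data / TIB:.2f} TiB")
```

Both results land near 1.8 when expressed in TiB, which suggests the plan's "~1.8TB" is the binary figure for a nominal 2 TB of usable space.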

### XTRM-N1: Proxmox Configuration

1. **Storage**
   - ZFS mirror: 2x Fikwot 512GB
   - Boot + VMs + Containers

2. **VMs/Containers**
   - OPNsense VM (if routing moves here)
   - AdGuard LXC
   - Vaultwarden LXC
   - Authentik LXC (replica)

---

## Network Changes

### Current (HAP1 as router)

```
Internet → HAP1 (Router) → ZX1 → XTRM-U
→ CSS1 → Room outlets
```

### Option A: Keep HAP1 as Router

```
Internet → HAP1 (Router) → ZX1 → XTRM-N5
↓ → XTRM-N1 (services only)
→ CSS1 → Room outlets
```

### Option B: Move Routing to XTRM-N1

```
Internet → HAP1 (Bridge) → XTRM-N1 (OPNsense) → ZX1 → XTRM-N5
→ CSS1 → Rooms
```

**Decision:** TBD based on performance testing

---

## Thermal Management

| Strategy | Implementation |
|----------|----------------|
| Samsung NVMe in N5 | "Thermal Champions" - run cooler than Fikwot |
| Physical spacing | 1U gap between XTRM-N1 and XTRM-N5 |
| Active cooling | Exhaust fans tied to drive temp sensors |
| HDD hibernation | Aggressive spindown, NVMe handles active I/O |

---

## Future Milestone: CRS310 Upgrade

When budget allows, replace ZX1 with MikroTik CRS310-8G+2S+IN:

| Feature | Benefit |
|---------|---------|
| 10G DAC to CSS326 | Proper backbone |
| L3 offloading | Inter-VLAN routing off HAP1 |
| 8x 2.5G ports | All servers at full speed |

---

## Migration Checklist

### Pre-Migration
- [ ] Order XTRM-N5 hardware
- [ ] Order 3x Samsung 990 EVO Plus 1TB
- [ ] Order 2x Fikwot FX501Pro 512GB
- [ ] Design/print 1U 10" chassis for N1
- [ ] Order PicoPSU + 12V brick

### XTRM-N1 Setup (First)
- [ ] Extract XTRM-U motherboard
- [ ] Install in 1U chassis
- [ ] Install Fikwot NVMe drives
- [ ] Install Proxmox VE
- [ ] Deploy survival services (AdGuard, Vaultwarden)
- [ ] Test network survival (unplug XTRM-U)
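
The survival test in the last item could be made repeatable with a small TCP probe run from a client while the production server is unplugged. This is only a sketch: the host name and ports in `SURVIVAL_CHECKS` are hypothetical placeholders, and an open port shows that a service is listening, not that it answers correctly.

```python
import socket

# Hypothetical host and ports: adjust to the real survival-node addresses.
SURVIVAL_CHECKS = [
    ("AdGuard Home DNS", "xtrm-n1", 53),
    ("Vaultwarden HTTPS", "xtrm-n1", 443),
]


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def run_checks(checks, probe=tcp_port_open):
    """Probe every (name, host, port) entry and return {name: reachable}."""
    return {name: probe(host, port) for name, host, port in checks}
```

Calling `run_checks(SURVIVAL_CHECKS)` before and after unplugging the production server gives a simple pass/fail record for the checklist.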

### XTRM-N5 Setup
- [ ] Unbox and install Samsung NVMe drives
- [ ] Create ZFS boot pool
- [ ] Migrate Unraid config from USB
- [ ] Transfer HDD array
- [ ] Verify all services running
- [ ] Update DNS/IPs as needed

### Post-Migration
- [ ] Update NetBox with new devices
- [ ] Update documentation
- [ ] Decommission old XTRM-U chassis
- [ ] Performance testing
- [ ] Thermal monitoring

---

## Related Documents

- `GITOPS-CONTAINERS.md` - Phase 2: Container configuration in Git