Major documentation restructure - consolidated docs
ci/woodpecker/push/woodpecker Pipeline was successful
New Structure:
- 01-NETWORK-MAP.md - Network topology, IPs, Docker networks, services
- 02-SERVICES-CRITICAL.md - DNS, Auth, Routing (P0/P1 services)
- 03-SERVICES-OTHER.md - All non-critical services
- 04-HARDWARE-INVENTORY.md - Physical devices and specs
- 05-CHANGELOG.md - Major events only

New Folders:
- docs/archive/ - Legacy docs (read-only reference)
- docs/wip/ - Planned changes and ideas
  - UPGRADE-2026-HARDWARE.md - N5 Air + N100 migration plan
  - GITOPS-CONTAINERS.md - Phase 2 container GitOps

Changes:
- Moved all 22 legacy docs to archive/
- Consolidated container IPs, physical map, and services into single network map
- Extracted critical vs non-critical service classification
- Simplified changelog to major events only

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -0,0 +1,212 @@
# Critical Services

**Last Updated:** 2026-01-25

Services that must remain operational for network functionality and security.

---

## Priority Levels

| Priority | Meaning | Recovery Target |
|----------|---------|-----------------|
| **P0** | Network offline without this | < 5 minutes |
| **P1** | Major functionality impacted | < 1 hour |

---

## P0 - Network Core

### DNS (AdGuard Home)

| Instance | Host | IP | Role |
|----------|------|-----|------|
| Primary | HAP1 | 172.17.0.5 | Main DNS, DoH/DoT/DoQ |
| Secondary | XTRM-U | 192.168.31.4 | Failover DNS |

**Endpoints:**
- DoH: `https://dns.xtrm-lab.org/dns-query`
- DoT: `tls://dns.xtrm-lab.org:853`
- DoQ: `quic://dns.xtrm-lab.org:8853`

**Config Sync:** adguardhome-sync (every 30 min)

**Upstream:** Quad9 DoH (`https://dns10.quad9.net/dns-query`)

**Recovery:**
1. If primary fails → clients use secondary (192.168.31.4)
2. Restart container on HAP1: `/container/start adguardhome`
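
The failover in step 1 can be spot-checked from a LAN client. The following is a minimal sketch, assuming `dig` is installed and that both resolver IPs are reachable from wherever it runs (the primary's 172.17.0.5 is a container address on HAP1 and may not be routable from every client). The function name and the test domain `example.com` are illustrative and do not come from this page.

```bash
# Sketch: print the first resolver that answers a test query.
# Resolver IPs are the ones in the DNS table; TEST_DOMAIN is an assumption.
TEST_DOMAIN="${TEST_DOMAIN:-example.com}"

first_healthy_resolver() {
    local resolver answer
    for resolver in "$@"; do
        # One attempt with a 2 s timeout; dig exits non-zero when no reply arrives.
        if answer="$(dig +short +time=2 +tries=1 "@${resolver}" "$TEST_DOMAIN" A 2>/dev/null)" \
            && [ -n "$answer" ]; then
            printf '%s\n' "$resolver"
            return 0
        fi
    done
    return 1   # no resolver answered
}

# Example call (primary first, then secondary):
#   first_healthy_resolver 172.17.0.5 192.168.31.4
```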

---

### Routing (HAP1)

| Function | Details |
|----------|---------|
| WAN | 62.73.120.142 via Vivacom fiber |
| LAN | 192.168.31.0/24 |
| NAT | Port forwarding to XTRM-U |
| Firewall | RouterOS firewall rules |

**Recovery:**
1. Physical access to HAP1
2. Reset: hold reset button 5s
3. Reconfigure via WinBox or SSH

---

### DHCP (HAP1)

| Setting | Value |
|---------|-------|
| Dynamic pool | 192.168.31.100-200 |
| Lease time | 24 hours |

**Static Leases:** Managed in RouterOS DHCP server

---

## P1 - Authentication & Secrets

### Authentik (SSO)

| Component | IP | Purpose |
|-----------|-----|---------|
| authentik | 172.18.0.11 | Web UI + OIDC |
| authentik-worker | 172.18.0.12 | Background tasks |

**URL:** https://auth.xtrm-lab.org

**Protects:**
- Traefik forward auth (all *.xtrm-lab.org)
- Gitea OAuth
- Woodpecker OAuth
- NetBox OAuth
- NetDisco SSO

**Recovery:**
```bash
cd /mnt/user/appdata/authentik
docker compose up -d
```
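
After `docker compose up -d` returns, the containers may still be initializing. The sketch below polls a readiness URL until it responds, assuming authentik's `/-/health/ready/` health endpoint is exposed at the URL listed above and is reachable without forward auth. The function name, attempt count, and delay are illustrative choices and are not part of this setup.

```bash
# Sketch: poll authentik's readiness endpoint after `docker compose up -d`.
# AUTHENTIK_URL is the URL on this page; attempts and delay are assumptions.
AUTHENTIK_URL="${AUTHENTIK_URL:-https://auth.xtrm-lab.org}"

wait_for_authentik() {
    local attempts="${1:-30}" delay="${2:-5}" i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl return non-zero on HTTP errors; the body is discarded.
        if curl -fsS -o /dev/null --max-time 5 "${AUTHENTIK_URL}/-/health/ready/" 2>/dev/null; then
            echo "authentik ready after ${i} attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "authentik not ready after ${attempts} attempts" >&2
    return 1
}
```

A possible use is `docker compose up -d && wait_for_authentik`, so that dependent services are only started once SSO responds.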

**Database:** postgresql17 (authentik_db)

---

### Vaultwarden (Passwords)

| Component | IP | Purpose |
|-----------|-----|---------|
| vaultwarden | 172.18.0.15 | Password manager |

**URL:** https://vault.xtrm-lab.org

**Data:** `/mnt/user/appdata/vaultwarden/`

**Recovery:**
```bash
docker start vaultwarden
```

**Backup:** Part of Unraid flash backup

---

## P1 - Reverse Proxy

### Traefik

| Component | IP | Ports |
|-----------|-----|-------|
| traefik | 172.18.0.3 | 8001→80, 44301→443 |

**Config:** `/mnt/user/appdata/traefik/`
- `traefik.yml` - Static config
- `dynamic.yml` - Routers & services

**TLS:** Let's Encrypt wildcard for *.xtrm-lab.org

**Recovery:**
```bash
docker start traefik
```

---

## Shared Infrastructure

### PostgreSQL 17

| IP | Databases |
|----|-----------|
| 172.18.0.13 | authentik_db, netbox, gitea, netdisco_db, diode, hydra |

**Data:** `/mnt/user/appdata/postgresql17/`

**Recovery:**
```bash
docker start postgresql17
# Wait for DB to be ready before starting dependents
```
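
The "wait for DB to be ready" step can be made explicit rather than relying on a fixed pause. This is a minimal sketch, assuming the `pg_isready` utility is available inside the `postgresql17` container (it ships with the official PostgreSQL image, but this page does not say which image is used). The function name and retry settings are illustrative.

```bash
# Sketch: block until PostgreSQL accepts connections.
# Assumes pg_isready exists inside the container; retry count is an assumption.
wait_for_postgres() {
    local container="${1:-postgresql17}" attempts="${2:-30}" i=1
    while [ "$i" -le "$attempts" ]; do
        # pg_isready exits 0 once the server accepts connections.
        if docker exec "$container" pg_isready -q >/dev/null 2>&1; then
            return 0
        fi
        sleep 2
        i=$((i + 1))
    done
    echo "${container} did not become ready" >&2
    return 1
}
```

For example, `docker start postgresql17 && wait_for_postgres` returns only once the database is accepting connections, or fails after roughly a minute with the defaults above.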

### Redis

| IP | Consumers |
|----|-----------|
| 172.18.0.14 | Authentik, NetBox, Diode |

**Recovery:**
```bash
docker start Redis
```

---

## Startup Order

When recovering from a full outage, start services in this order:

1. **postgresql17** - Database (wait 30s)
2. **Redis** - Cache/queue (wait 10s)
3. **traefik** - Reverse proxy
4. **authentik** + **authentik-worker** - SSO
5. **vaultwarden** - Passwords
6. All other services
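
The order above can be scripted so that a full recovery does not depend on remembering the waits. The sketch below is one possible approach, assuming every listed container already exists and can be started with `docker start` (authentik is brought up with `docker compose` elsewhere on this page, so adjust if its containers are not yet created). The `name:seconds` argument format and the function name are illustrative.

```bash
# Sketch: start containers in the documented order, pausing where noted.
# Each argument is "container:seconds_to_wait" (a hypothetical format).
start_in_order() {
    local entry name wait_s
    for entry in "$@"; do
        name="${entry%%:*}"     # text before the first colon
        wait_s="${entry##*:}"   # text after the last colon
        echo "starting ${name}"
        docker start "$name" >/dev/null || { echo "failed to start ${name}" >&2; return 1; }
        if [ "$wait_s" -gt 0 ]; then
            sleep "$wait_s"
        fi
    done
    return 0
}

# Example call mirroring the list above:
#   start_in_order postgresql17:30 Redis:10 traefik:0 authentik:0 authentik-worker:0 vaultwarden:0
```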

---

## Monitoring

### Uptime Kuma

| URL | Monitors |
|-----|----------|
| https://uptime.xtrm-lab.org | 27 services |

**Alerts:** Configured per service (email/webhook)

---

## Backup Strategy

| Data | Location | Frequency |
|------|----------|-----------|
| Unraid Flash | Google Drive | Daily |
| PostgreSQL | `/mnt/user/Backup/` | Daily |
| Vaultwarden | Unraid Flash | With flash backup |
| Authentik | PostgreSQL + `/mnt/user/appdata/authentik/` | Daily |
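
This page does not say how the daily PostgreSQL backup is produced. As an illustration only, the sketch below dumps named databases into `/mnt/user/Backup/` with `pg_dump`, assuming a superuser called `postgres` and a `<db>_<date>.sql` file-name pattern; both are assumptions, as are the function names.

```bash
# Sketch: dump databases from the postgresql17 container to the backup share.
# Assumptions: superuser "postgres" and the <db>_<date>.sql naming scheme.
BACKUP_DIR="${BACKUP_DIR:-/mnt/user/Backup}"

backup_path() {
    # $1 = database name, $2 = date stamp (YYYY-MM-DD)
    printf '%s/%s_%s.sql\n' "$BACKUP_DIR" "$1" "$2"
}

dump_databases() {
    local stamp db out
    stamp="$(date +%F)"
    for db in "$@"; do
        out="$(backup_path "$db" "$stamp")"
        # Write to a temp file first so a failed dump never replaces a good one.
        if docker exec postgresql17 pg_dump -U postgres "$db" > "${out}.tmp"; then
            mv "${out}.tmp" "$out"
        else
            rm -f "${out}.tmp"
            echo "dump of ${db} failed" >&2
            return 1
        fi
    done
}

# Example call with two of the databases listed on this page:
#   dump_databases authentik_db gitea
```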

---

## Future: XTRM-N1 Survival Node

When the hardware upgrade completes, these services will have replicas on XTRM-N1:

| Service | Primary | Replica |
|---------|---------|---------|
| DNS | HAP1 | XTRM-N1 |
| Vaultwarden | XTRM-N5 | XTRM-N1 |
| Authentik | XTRM-N5 | XTRM-N1 |

See: `wip/UPGRADE-2026-HARDWARE.md`