dockerproxy: redesign IPAM with static block + dynamic /25 pool
Recreated dockerproxy network with --ip-range 172.18.0.128/25 so Docker auto-allocations are isolated from the .2-.127 static reservation block. Eliminates IP-collision class that caused the 2026-05-17 Traefik outage. Adds 13-DOCKERPROXY-NETWORK.md as the canonical reference for the network spec, recreate command, and current IP assignments.
@@ -61,8 +61,37 @@ Final state after `docker compose up -d`:
---
## Structural Fix — IPAM Redesign (~05:15 UTC same day)
Compose-level pinning fixes one container at a time. To remove the whole class of failure rather than individual instances, the `dockerproxy` network was recreated with a dedicated dynamic IP range. Because a Docker network's IPAM settings cannot be changed after creation, the network had to be torn down and rebuilt with all 32 containers offline (~3 min outage).

New IPAM:
```
Subnet:   172.18.0.0/16
Gateway:  172.18.0.1
IPRange:  172.18.0.128/25   ← Docker only auto-assigns from .128–.255
```
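To make the split concrete, here is a minimal sketch that classifies an address against the layout above. The function name `ip_block` is hypothetical and not part of the original scripts; it assumes a well-formed IPv4 address and only encodes the ranges stated in this section.

```shell
# Hypothetical helper: report which block of the dockerproxy layout an
# address belongs to. Assumes a well-formed dotted-quad IPv4 address.
ip_block() {
    addr=$1
    prefix=${addr%.*}   # first three octets, e.g. 172.18.0
    last=${addr##*.}    # final octet, e.g. 128

    if [ "$prefix" != "172.18.0" ]; then
        echo other      # rest of the /16; not covered by this layout
    elif [ "$last" -eq 1 ]; then
        echo gateway
    elif [ "$last" -ge 2 ] && [ "$last" -le 127 ]; then
        echo static     # reserved for explicit ipv4_address / --ip
    elif [ "$last" -ge 128 ] && [ "$last" -le 255 ]; then
        echo dynamic    # 172.18.0.128/25, Docker's auto-allocation pool
    else
        echo other      # .0
    fi
}
```

For example, `ip_block 172.18.0.128` prints `dynamic` and `ip_block 172.18.0.70` prints `static`. The live settings can likely be checked with `docker network inspect dockerproxy --format '{{json .IPAM.Config}}'`.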
- `.2`–`.127` is now a **static-only reservation block**
- `.128`–`.255` is the dynamic pool

Procedure (scripted; full state snapshot at `/root/dockerproxy-recreate-2026-05-17/` on Unraid):
1. Capture the container→static-IP map from `docker inspect`
2. `docker stop` all 32 in parallel
3. `docker network disconnect dockerproxy <each>`
4. `docker network rm dockerproxy`
5. `docker network create --driver bridge --subnet 172.18.0.0/16 --gateway 172.18.0.1 --ip-range 172.18.0.128/25 dockerproxy`
6. `docker network connect --ip <preserved-static> dockerproxy <each>` (no `--ip` for the one dynamic container, `traefik-manager`)
7. Start in dependency order: dockersocket → postgresql17 + Redis → traefik + traefik-manager → authentik + authentik-worker → rest

Final state: all 32 containers up, all static IPs preserved (.3–.70), `traefik-manager` moved from `.2` to `.128` (first dynamic slot).

Reference: `docs/13-DOCKERPROXY-NETWORK.md` documents the network spec, recreate command, and current IP assignments.
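Step 6 of the procedure above can be sketched as a dry run. This is not the actual script from the snapshot directory. The helper name `reconnect_cmds` and the map format (one `name ip` pair per line, with the IP left empty for a container that should stay dynamic) are assumptions made for illustration.

```shell
# Hypothetical helper: turn a saved "container static-ip" map (step 1) into
# the `docker network connect` commands of step 6. It only prints the
# commands; nothing is executed.
reconnect_cmds() {
    network=$1
    while read -r name ip; do
        [ -n "$name" ] || continue      # skip blank lines
        if [ -n "$ip" ]; then
            # Preserve the previously captured static address.
            echo "docker network connect --ip $ip $network $name"
        else
            # No saved address: Docker picks one from the dynamic pool.
            echo "docker network connect $network $name"
        fi
    done
    return 0
}
```

Given a map file (here called `ip-map.txt`, also a hypothetical name), `reconnect_cmds dockerproxy < ip-map.txt` prints one command per container for review, and the output can be piped to `sh` once it looks right. Printing first keeps the destructive part of the procedure inspectable before anything is reattached.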
## Follow-up
- Audit remaining Dockge stacks on `dockerproxy` for missing `ipv4_address` reservations.
- Two Traefik-IP incidents in 12 days (the other being the 2026-05-05 vaultwarden misroute): consider a deliberate redesign of the dockerproxy IP map.
- The IPAM redesign eliminates the auto-allocation collision class. Future stacks that omit `ipv4_address` land safely in `.128+`.
- The network is created imperatively (it is not defined in any compose file); the recreate command is documented in `docs/13-DOCKERPROXY-NETWORK.md` for disaster recovery.
- Add an Uptime Kuma docker-state monitor for `traefik` (HTTP probes route *through* Traefik and fail uselessly when Traefik itself is the outage).
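
With the new IPAM in place, the audit in the first item can be approximated by listing containers whose address sits in the dynamic pool, since those are the ones without a static reservation. The sketch below is one possible approach, not an existing script. The function name `in_dynamic_pool` and its input format (`name address/prefix` per line) are assumptions.

```shell
# Hypothetical helper: read "name address/prefix" lines on stdin and print
# the names whose address falls in 172.18.0.128-172.18.0.255.
in_dynamic_pool() {
    while read -r name cidr; do
        [ -n "$cidr" ] || continue      # skip blank or incomplete lines
        addr=${cidr%/*}                 # drop the "/16" prefix length
        last=${addr##*.}                # final octet
        case $addr in
            172.18.0.*)
                if [ "$last" -ge 128 ]; then
                    echo "$name"
                fi
                ;;
        esac
    done
    return 0
}
```

Input in that shape can likely be produced with `docker network inspect dockerproxy --format '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{"\n"}}{{end}}'` and piped into `in_dynamic_pool`. Only running containers appear in that listing, and `traefik-manager` is expected to show up because it is dynamic by design.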