dockerproxy: redesign IPAM with static block + dynamic /25 pool
Recreated dockerproxy network with --ip-range 172.18.0.128/25 so Docker auto-allocations are isolated from the .2-.127 static reservation block. Eliminates IP-collision class that caused the 2026-05-17 Traefik outage. Adds 13-DOCKERPROXY-NETWORK.md as the canonical reference for the network spec, recreate command, and current IP assignments.
@@ -0,0 +1,91 @@
# dockerproxy Docker Network

User-defined Docker bridge on Unraid that hosts Traefik and all reverse-proxied services. It is defined imperatively, not in any compose file; stacks reference it as `external: true`.

## IPAM

| Property | Value |
|----------|-------|
| Driver | `bridge` |
| Subnet | `172.18.0.0/16` |
| Gateway | `172.18.0.1` |
| IP Range (dynamic pool) | `172.18.0.128/25` (.128–.255) |
| Static reservation block | `172.18.0.2 – 172.18.0.127` |

The `--ip-range` option constrains Docker's auto-allocation to `.128–.255`. Any address pinned via compose `ipv4_address` outside that range cannot collide with an auto-allocated one. The range was set up on 2026-05-17 after the collision incident documented in `incidents/2026-05-17-traefik-ip-collision.md`.
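
To make the block boundaries concrete, here is a minimal bash sketch that classifies an address against the values in the IPAM table. The function names `ip_to_int` and `classify_ip` are hypothetical helpers introduced for illustration; they are not part of Docker or of this repository.

```bash
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names); boundaries are copied from the IPAM table.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print which block of the dockerproxy subnet an address falls into.
classify_ip() {
  local n
  n=$(ip_to_int "$1")
  if (( n < $(ip_to_int 172.18.0.0) || n > $(ip_to_int 172.18.255.255) )); then
    echo "outside-subnet"
  elif (( n == $(ip_to_int 172.18.0.1) )); then
    echo "gateway"
  elif (( n >= $(ip_to_int 172.18.0.2) && n <= $(ip_to_int 172.18.0.127) )); then
    echo "static"
  elif (( n >= $(ip_to_int 172.18.0.128) && n <= $(ip_to_int 172.18.0.255) )); then
    echo "dynamic"
  else
    # Inside the /16 but in neither documented block (for example 172.18.1.5).
    echo "other"
  fi
}
```

For example, `classify_ip 172.18.0.3` prints `static` and `classify_ip 172.18.0.128` prints `dynamic`. The `other` case is a reminder that the `/16` is much larger than the two documented blocks.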

## Recreate Command

If the network is ever lost (Docker reset, accidental `docker network rm`):

```bash
docker network create \
  --driver bridge \
  --subnet 172.18.0.0/16 \
  --gateway 172.18.0.1 \
  --ip-range 172.18.0.128/25 \
  dockerproxy
```

After recreating, compose-managed containers reconnect via `docker compose up -d`. Standalone containers need `docker network connect [--ip <static>] dockerproxy <name>`.
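
After a recreate it may be worth confirming that the new network carries the intended IPAM settings. The following is a sketch under assumptions: `read_ipam` and `ipam_ok` are hypothetical helper names, and the template relies on the `Subnet`, `Gateway`, and `IPRange` fields that `docker network inspect` reports under `.IPAM.Config`.

```bash
#!/usr/bin/env bash
# Expected IPAM, copied from the create command: subnet, gateway, ip-range.
EXPECTED_IPAM="172.18.0.0/16 172.18.0.1 172.18.0.128/25"

# Print "subnet gateway ip-range" for a network (needs a running Docker daemon).
read_ipam() {
  docker network inspect "$1" \
    --format '{{range .IPAM.Config}}{{.Subnet}} {{.Gateway}} {{.IPRange}}{{end}}'
}

# Succeed only if the observed IPAM string matches the expected one.
ipam_ok() {
  [ "$1" = "$EXPECTED_IPAM" ]
}

# Usage on the Docker host:
#   ipam_ok "$(read_ipam dockerproxy)" && echo "IPAM matches" || echo "IPAM differs"
```

Keeping the comparison separate from the Docker call means the check can be exercised without a daemon, for example against a string taken from a saved `docker network inspect` output.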

## Static Assignments (2026-05-17)

| IP (`172.18.0.x`) | Container |
|----|-----------|
| .1 | (gateway) |
| .3 | traefik |
| .6 | dockersocket |
| .8 | authentik-worker |
| .9 | authentik |
| .10 | postgresql17 |
| .14 | Redis |
| .15 | vaultwarden |
| .16 | actual-budget |
| .18 | Uptime-Kuma-API |
| .19 | AutoKuma |
| .20 | UptimeKuma |
| .21 | speedtest-tracker |
| .22 | obsidian-livesync |
| .23 | SeekAndWatch |
| .25 | karakeep |
| .26 | transmission |
| .31 | gitea |
| .32 | woodpecker-server |
| .33 | woodpecker-agent |
| .43 | radarr |
| .44 | sonarr |
| .45 | prowlarr |
| .50 | dockhand |
| .53 | n8n |
| .60 | overseerr |
| .61 | plex_debrid |
| .62 | zurg |
| .63 | zurg-rclone |
| .65 | xtrm-agent |
| .66 | kasm |
| .70 | ewa-apps |
| .128+ | dynamic pool (traefik-manager landed here) |

## Adding a New Service

1. Pick a free IP in `.2–.127` (or omit it and accept a dynamic `.128+` address).
2. In compose:
   ```yaml
   services:
     myservice:
       networks:
         dockerproxy:
           ipv4_address: 172.18.0.X
   networks:
     dockerproxy:
       external: true
   ```
3. Append to the table above and commit.
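
Step 1 can be done by eye, but a small helper makes the choice repeatable. This is a minimal sketch, assuming you pass in the last octets already listed in the assignments table; `next_free_octet` is a hypothetical name, not an existing tool.

```bash
#!/usr/bin/env bash
# Print the lowest unused last octet in the static block (.2 to .127).
# Arguments: the last octets that are already taken, in any order.
next_free_octet() {
  local taken=() o
  for o in "$@"; do
    taken[$o]=1
  done
  for (( o = 2; o <= 127; o++ )); do
    if [ -z "${taken[$o]:-}" ]; then
      echo "$o"
      return 0
    fi
  done
  return 1  # the static block is full
}
```

For instance, `next_free_octet 2 3 4` prints `5`. The helper only reads the list it is given, so it is only as accurate as the table; it does not query Docker for live assignments.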

## Snapshot of Pre-Recreate State

On Unraid: `/root/dockerproxy-recreate-2026-05-17/`

- `network-before.json`: full `docker network inspect` output
- `state.tsv`: per-container name, static IP, runtime IP, status, and restart policy
- `containers.txt`: sorted container list (32 entries)
@@ -61,8 +61,37 @@ Final state after `docker compose up -d`:

---

## Structural Fix: IPAM Redesign (~05:15 UTC same day)

Compose-level pinning fixes one container at a time. To remove the whole class of root cause, the `dockerproxy` network was recreated with a dedicated dynamic IP range. Because the IPAM settings of an existing Docker network cannot be changed, the network had to be torn down and rebuilt with all 32 containers offline (~3 min outage).

New IPAM:

```
Subnet:  172.18.0.0/16
Gateway: 172.18.0.1
IPRange: 172.18.0.128/25   ← Docker only auto-assigns from .128–.255
```

- `.2`–`.127` is now a **static-only reservation block**
- `.128`–`.255` is the dynamic pool

Procedure (scripted; full state snapshot at `/root/dockerproxy-recreate-2026-05-17/` on Unraid):

1. Capture container→static-IP map from `docker inspect`
2. `docker stop` all 32 in parallel
3. `docker network disconnect dockerproxy <each>`
4. `docker network rm dockerproxy`
5. `docker network create --driver bridge --subnet 172.18.0.0/16 --gateway 172.18.0.1 --ip-range 172.18.0.128/25 dockerproxy`
6. `docker network connect --ip <preserved-static> dockerproxy <each>` (no `--ip` for the one dynamic container, `traefik-manager`)
7. Start in dependency order: dockersocket → postgresql17 + Redis → traefik + traefik-manager → authentik + authentik-worker → rest

Final state: all 32 containers up, all static IPs preserved (.3–.70), traefik-manager moved from `.2` to `.128` (first dynamic slot).
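
Steps 2 through 6 can be sketched as a dry-run planner that prints the commands rather than executing them. This is not the script that was actually run (that one is not shown here). `plan_recreate` and its stdin format of `name ip` pairs, with `-` marking a dynamic container, are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Dry-run planner: prints the docker commands for steps 2-6 instead of running them.
# Stdin: one "name ip" pair per line; use "-" as the ip for a dynamic container.
plan_recreate() {
  local names=() ips=() name ip i
  while read -r name ip; do
    [ -n "$name" ] || continue
    names+=("$name")
    ips+=("${ip:--}")
  done

  # Step 2 (printed as one command; the parallel stop is not modelled here).
  echo "docker stop ${names[*]}"
  # Step 3
  for name in "${names[@]}"; do
    echo "docker network disconnect dockerproxy $name"
  done
  # Steps 4 and 5
  echo "docker network rm dockerproxy"
  echo "docker network create --driver bridge --subnet 172.18.0.0/16 --gateway 172.18.0.1 --ip-range 172.18.0.128/25 dockerproxy"
  # Step 6: reattach, preserving static addresses where one was recorded.
  for i in "${!names[@]}"; do
    if [ "${ips[$i]}" = "-" ]; then
      echo "docker network connect dockerproxy ${names[$i]}"
    else
      echo "docker network connect --ip ${ips[$i]} dockerproxy ${names[$i]}"
    fi
  done
}

# Usage: printf 'traefik 172.18.0.3\ntraefik-manager -\n' | plan_recreate
```

Printing the plan first lets the command list be reviewed before anything is stopped. Step 1 (building the map) and step 7 (the dependency-ordered start) are left out of the sketch.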

Reference: `docs/13-DOCKERPROXY-NETWORK.md` documents the network spec, recreate command, and current IP assignments.

## Follow-up

- Audit remaining Dockge stacks on `dockerproxy` for missing `ipv4_address` reservations.
- IPAM redesign eliminates the auto-allocation collision class. Future stacks that omit `ipv4_address` land safely in `.128+`.
- Add a Uptime Kuma docker-state monitor for `traefik` (HTTP probes route *through* Traefik and fail uselessly when Traefik itself is the outage).
- Network is imperative (not in any compose); recreate command is documented in `docs/13-DOCKERPROXY-NETWORK.md` for disaster recovery.
- Two Traefik-IP incidents in 12 days (also 2026-05-05 vaultwarden misroute); consider a deliberate redesign of the dockerproxy IP map.
- Add an Uptime Kuma docker-state monitor for `traefik` (HTTP probes route through Traefik and fail uselessly when Traefik is the outage).