infrastructure/docs/incidents/2026-02-06-unraid-flash-drive-failure.md
Kaloyan Danchev c93f7da733 Add Unraid flash drive migration procedure
Flash drive on XTRM-U is failing. Created incident doc with complete
step-by-step procedure: backup retrieval (4 options), new USB prep via
Flash Creator, license transfer via Tools→Registration, post-migration
verification checklist, and prevention measures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:14:43 +02:00


# Incident: Unraid Flash Drive Failure
**Date:** 2026-02-06
**Severity:** P1 - Server at risk
**Status:** In Progress
**Affected:** XTRM-U (Unraid NAS)

---
## Symptoms
The Unraid flash drive on XTRM-U is causing crashes and instability. If it fails completely, the server will no longer boot and the boot configuration stored on the drive will be lost.

---
## Migration Procedure: Replace Flash Drive
### Step 1: Retrieve Flash Backup
Try these options in order of preference:
**Option A - Fresh backup from WebGUI (if server still boots):**
1. Open http://192.168.10.20 in browser
2. Go to **Main** tab → click on **Flash** device
3. Under Flash Device Settings, click **FLASH BACKUP**
4. Download the ZIP file to your Mac
**Option B - Google Drive (daily Rclone backup):**
```bash
# From Mac (if rclone is installed)
rclone copy drive:Backups/unraid-flash ~/Desktop/unraid-flash-backup/
# Or download manually from Google Drive web UI
# Folder: Backups/unraid-flash
```
**Option C - Local backup on Unraid (if server boots but WebGUI broken):**
```bash
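# Run everything below from the Mac.
# Confirm the backup exists on the server:
ssh -i ~/.ssh/id_ed25519_unraid -p 422 root@192.168.10.20 'ls -lh /mnt/user/Backup/unraid-flash/'
# Create the local target directory, then copy the backup off the server.
# The remote path is quoted so the local shell (zsh on macOS) does not try to expand the glob itself.
mkdir -p ~/Desktop/unraid-flash-backup
scp -P 422 -i ~/.ssh/id_ed25519_unraid 'root@192.168.10.20:/mnt/user/Backup/unraid-flash/*' ~/Desktop/unraid-flash-backup/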
```
**Option D - Direct copy from failing drive:**
1. Shut down server
2. Remove flash drive, insert into Mac
3. Copy entire contents to `~/Desktop/unraid-flash-backup/`
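
Whichever option produced the backup, it is worth confirming that it contains the files Unraid needs before the old drive is retired. The following is a minimal sketch, assuming the backup has been extracted into a directory. `check_flash_backup` is a hypothetical helper name, not an Unraid tool, and the file names checked (`config/*.key` for the license, `config/super.dat` for disk assignments) reflect the usual flash layout and should be confirmed against your own drive.

```bash
#!/usr/bin/env bash
# Sketch: sanity-check an extracted flash backup before relying on it.
# check_flash_backup is a hypothetical helper, not part of Unraid.
check_flash_backup() {
  local dir="$1" status=0
  # config/ holds the server-specific settings
  if [ ! -d "$dir/config" ]; then
    echo "MISSING: config/ directory"
    return 1
  fi
  # License key file (assumed location: config/*.key)
  if ! ls "$dir"/config/*.key >/dev/null 2>&1; then
    echo "MISSING: license key (config/*.key)"
    status=1
  fi
  # Disk assignments (assumed file name: super.dat), must be non-empty
  if [ ! -s "$dir/config/super.dat" ]; then
    echo "MISSING: config/super.dat"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "OK: $dir looks like a usable flash backup"
  fi
  return "$status"
}

# Usage (directory from Step 1):
#   check_flash_backup ~/Desktop/unraid-flash-backup
```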
---
### Step 2: Prepare New USB Drive
**Requirements:**
- USB 2.0 recommended (more reliable than USB 3.0 for this purpose)
- Capacity: 4 GB minimum, 32 GB maximum
- Reputable brand (SanDisk, Samsung, Kingston)
- Must have a unique hardware GUID
**Write the backup to new drive:**
1. Download [Unraid USB Flash Creator](https://unraid.net/download) for macOS
2. Insert new USB drive into Mac
3. Open Flash Creator
4. For **Operating System**, scroll down and select **"Use custom"**
5. Browse to your backup ZIP file from Step 1
6. Select the new USB drive as destination
7. Click **Write** and wait for completion
**If you don't have a backup ZIP** (only raw files from Option D):
1. In Flash Creator, select the Unraid OS version matching your current install
2. Write a fresh Unraid install to the new drive
3. After writing, mount the drive and copy your backed-up `config/` folder onto it, replacing the default one
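
For the raw-files path, the copy in the last step could look like the sketch below. It assumes the new drive is mounted at `/Volumes/UNRAID` on macOS (verify the actual mount point in Finder) and uses a hypothetical helper, `restore_config`. Rather than deleting the freshly written default `config/`, it is moved aside as `config.default` so it can be put back if the backup turns out to be damaged.

```bash
#!/usr/bin/env bash
# Sketch: replace the default config/ on the new flash drive with the backed-up one.
# restore_config is a hypothetical helper; both paths are supplied by the caller.
restore_config() {
  local backup="$1" flash="$2"
  [ -d "$backup/config" ] || { echo "no config/ in $backup" >&2; return 1; }
  [ -d "$flash" ] || { echo "flash drive not mounted at $flash" >&2; return 1; }
  # Keep the default config in case the backup is unusable
  if [ -d "$flash/config" ]; then
    rm -rf "$flash/config.default"
    mv "$flash/config" "$flash/config.default"
  fi
  # Destination does not exist at this point, so cp creates it as a copy of the backup
  cp -R "$backup/config" "$flash/config"
}

# Usage (mount point is an assumption):
#   restore_config ~/Desktop/unraid-flash-backup /Volumes/UNRAID
```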
---
### Step 3: Swap Drives and Boot
1. Shut down XTRM-U if still running
2. Remove the old (failing) flash drive
3. Insert the new USB drive
4. Power on the server
5. Wait for boot (1-2 minutes)
6. Try accessing WebGUI at http://192.168.10.20
**If WebGUI doesn't load:**
- Connect a monitor to the server to check boot messages
- Verify the USB drive is detected in BIOS
- Ensure boot order has USB first
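
Instead of reloading the browser by hand, the WebGUI can be polled from the Mac until it answers. This is a rough sketch: `wait_for_webgui` is a hypothetical helper, and the timeout and interval defaults are arbitrary choices rather than values taken from Unraid. Any HTTP response, including a redirect to the login page, counts as "up".

```bash
#!/usr/bin/env bash
# Sketch: poll a URL until it responds or a timeout expires.
# wait_for_webgui is a hypothetical helper; the defaults are arbitrary.
wait_for_webgui() {
  local url="$1" timeout="${2:-300}" interval="${3:-5}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # -s: silent, -o /dev/null: discard body, --max-time: cap each attempt
    if curl -s -o /dev/null --max-time 3 "$url"; then
      echo "WebGUI answered after about ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "No response from $url after ${timeout}s" >&2
  return 1
}

# Usage:
#   wait_for_webgui http://192.168.10.20 300 5
```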
---
### Step 4: Transfer License
You will see an "Invalid, missing or expired registration key" message. This is expected.
1. In WebGUI, go to **Tools → Registration**
2. Click **Replace Key**
3. Enter the email address associated with your Unraid account
4. Check your email for the confirmation/license key
5. Follow the link or paste the key file URL into the Registration page
6. Click **Done**
**Important warnings:**
- Replacing the key **permanently blacklists** the old USB drive - it can never be used with Unraid again
- First license transfer can be done at any time
- Subsequent transfers: once per 12 months via the automated system
- If you need another transfer within 12 months, contact [Unraid support](https://unraid.net/contact) with old GUID, new GUID, license key, and purchase email
**If you can't find your license:**
- Log into https://account.unraid.net to view your keys
- Check email for original purchase confirmation
---
### Step 5: Post-Migration Verification
Run through this checklist after the server is back up:
**Array & Storage:**
- [ ] WebGUI loads at http://192.168.10.20
- [ ] Array starts normally (Main tab → Start)
- [ ] All disks show healthy status
- [ ] Shares are accessible
**Docker & Services:**
```bash
ssh -i ~/.ssh/id_ed25519_unraid root@192.168.10.20 -p 422
# Check all containers
docker ps -a --format 'table {{.Names}}\t{{.Status}}'
# Start any stopped critical containers (in order):
docker start postgresql17 && sleep 30   # wait 30s for the database to come up
docker start Redis && sleep 10          # wait 10s
docker start traefik
docker start authentik authentik-worker
docker start vaultwarden
```
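
If several containers are already up after the reboot, the ordered start can be wrapped so that running containers are skipped. The sketch below assumes the same container names and waits as the block above. `start_in_order` is a hypothetical helper that takes `name:seconds` pairs, where the `:seconds` suffix is optional and means "wait this long after starting".

```bash
#!/usr/bin/env bash
# Sketch: start containers in a fixed order, skipping any that already run.
# start_in_order is a hypothetical helper; arguments are name[:wait_seconds].
start_in_order() {
  local entry name delay
  for entry in "$@"; do
    name="${entry%%:*}"
    delay="${entry##*:}"
    # No ":seconds" suffix: the two expansions are identical, so do not wait
    if [ "$delay" = "$entry" ]; then delay=0; fi
    # docker inspect prints "true" when the container is running
    if [ "$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)" = "true" ]; then
      echo "$name already running"
      continue
    fi
    echo "starting $name, then waiting ${delay}s"
    docker start "$name" >/dev/null || { echo "failed to start $name" >&2; return 1; }
    sleep "$delay"
  done
}

# Usage (run on the server):
#   start_in_order postgresql17:30 Redis:10 traefik authentik authentik-worker vaultwarden
```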
**Network:**
- [ ] SSH works: `ssh -i ~/.ssh/id_ed25519_unraid root@192.168.10.20 -p 422`
- [ ] DNS failover AdGuard reachable: http://192.168.10.10:3000
- [ ] AdGuard sync working (check `docker logs adguardhome-sync --tail 5`)
- [ ] External URLs working (https://xtrm-lab.org)
**Services checklist:**
- [ ] Traefik reverse proxy (https://xtrm-lab.org)
- [ ] Authentik SSO (https://auth.xtrm-lab.org)
- [ ] Gitea (https://git.xtrm-lab.org)
- [ ] Uptime Kuma (https://uptime.xtrm-lab.org)
- [ ] Vaultwarden (https://vault.xtrm-lab.org)
- [ ] Plex (https://plex.xtrm-lab.org)
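
The service URLs above can also be probed in one pass from the Mac. This is only a sketch: `check_urls` is a hypothetical helper, and treating any final 2xx or 3xx status as "reachable" is an assumption. A service that answers 401 or 403 until you log in would show as FAIL here and needs the pattern adjusted.

```bash
#!/usr/bin/env bash
# Sketch: report the final HTTP status for each URL given as an argument.
# check_urls is a hypothetical helper; the "OK" criterion is an assumption.
check_urls() {
  local url code failed=0
  for url in "$@"; do
    # -L follows redirects (e.g. to an SSO login); -w prints the final status code.
    # curl prints 000 when no HTTP response was received.
    code=$(curl -s -o /dev/null -L --max-time 10 -w '%{http_code}' "$url" || true)
    case "$code" in
      2*|3*) echo "OK $code $url" ;;
      *)     echo "FAIL $code $url"; failed=1 ;;
    esac
  done
  return "$failed"
}

# Usage:
#   check_urls https://xtrm-lab.org https://auth.xtrm-lab.org https://git.xtrm-lab.org \
#     https://uptime.xtrm-lab.org https://vault.xtrm-lab.org https://plex.xtrm-lab.org
```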
**Backup:**
- [ ] Verify Rclone config still present: `rclone listremotes` (should show `drive:`)
- [ ] Test flash backup: trigger manual backup from WebGUI or User Scripts
- [ ] Verify cron schedule for flash backup is active
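
One way to confirm that the scheduled backup is actually running is to check that the backup folder contains a recent file. The sketch below uses a hypothetical helper, `backup_is_fresh`, and assumes the local backup path from Step 1 (`/mnt/user/Backup/unraid-flash`). The age threshold is an arbitrary choice and should match the real schedule.

```bash
#!/usr/bin/env bash
# Sketch: succeed if the directory holds at least one file newer than N days.
# backup_is_fresh is a hypothetical helper; the default threshold is arbitrary.
backup_is_fresh() {
  local dir="$1" max_days="${2:-1}" newest
  # -mtime -N matches files modified less than N*24 hours ago
  newest=$(find "$dir" -type f -mtime "-$max_days" 2>/dev/null | head -n 1)
  if [ -n "$newest" ]; then
    echo "fresh: $newest"
  else
    echo "stale: nothing newer than ${max_days} day(s) in $dir" >&2
    return 1
  fi
}

# Usage (run on the server; allow two days of slack for a daily job):
#   backup_is_fresh /mnt/user/Backup/unraid-flash 2
```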
---
### Step 6: Prevention
After successful migration:
1. **Enable Unraid Connect** (if not already) for automated cloud flash backup:
   - Settings → Management Access → Unraid Connect
   - Sign in with your unraid.net account
   - Enable Flash Backup
2. **Verify Rclone cron** is scheduled:
   ```bash
   # Check user scripts plugin for flash backup schedule
   ls /boot/config/plugins/user.scripts/scripts/
   ```
3. **Keep a spare USB drive** prepared with a fresh Unraid install - makes future recovery faster
4. **Test backup restoration** periodically - don't wait for a failure to discover your backup is incomplete
---
## References
- [Unraid Docs: Changing the Flash Device](https://docs.unraid.net/unraid-os/system-administration/maintain-and-update/changing-the-flash-device/)
- [Unraid Docs: Licensing FAQ](https://docs.unraid.net/unraid-os/troubleshooting/licensing-faq/)
- Internal: `docs/02-SERVICES-CRITICAL.md` (startup order)
---
## Resolution
*Update this section when migration is complete:*
- **Date resolved:**
- **New USB drive:**
- **License transferred:** Yes/No
- **Services verified:** Yes/No
- **Backup reconfigured:** Yes/No