ops/policy-checks/mem-limits-baseline.md captures the deliberate
"not today" decision for memory limits plus the plan for when it
becomes relevant:
- Phase 1: 7 days of hourly docker stats snapshots
- Phase 2: derive Tier-1 peak per container
- Phase 3: set limits at peak * 1.5 with documented floors
(Postgres 1G, Mongo 1G, Redis 256M, etc.)
- Phase 4: roll out smallest-risk containers first, observe 24h
between stages
- Phase 5: Tier-2 only after a concrete trigger event
Next trigger: family invitation out + 4 weeks stable use, or
first real OOM event in docker-critical-events.sh, or a sudden
Immich/Nextcloud load spike where host swap becomes visible.
Today's policy check is clean (0 Critical, 1 documented Warning
on influxdb3-core user 0, 13 documented Info findings on host
ports / privileged exceptions / latest+digest tags).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Six files had outdated status notes that the F-09 first run on
2026-05-30 made wrong:
- ops/restore-tests/komodo-bootstrap-runbook.md: "Erster echter Lauf
steht noch aus" -> first run confirmed
- ops/restore-tests/komodo-bootstrap-plan.md: "Noch offen vor dem
ersten echten Lauf" section -> "Bestaetigte Laeufe" table with
the --what-if and --keep-data runs
- ops/restore-tests/immich-runbook.md: status note still said
"Erster echter Lauf steht noch aus" although the Immich first run
was 2026-05-27; correcting in the same sweep
- docs/AUDIT_2026-05-25_TODO.md: Sprint 2 entry on Komodo bootstrap
path no longer carries the "Trockenlauf-Skript bleibt als offene
Folgeaufgabe" tail
- docs/SERVICES_RECOVERY.md: replaced the "Trockenlauf-Idee (Doku-only,
nicht ausgefuehrt)" section with the confirmed repo-script flow and
marked the two "Naechste Aufgaben" rows about the dry-run as done
- docs/RESTORE_DRILL_ROUTINE.md: Q2 2026 DR-Sanity-Check entry now
splits Komodo-Bootstrap-Pfad (done) from the two still-open items
(Gitea bundles, secrets inventory)
No behavior change, only documentation consistency.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drei Issues beim Erstlauf gefunden und gefixt:
1. EAI_AGAIN: Renovate-Container konnte git.kaleschke.info nicht
aufloesen. Analog zu Komodos extra_hosts mappen wir den Hostname
per --add-host auf 192.168.178.58 (LAN-IP des Unraid-Hosts).
Zusaetzlich --dns 1.1.1.1/8.8.8.8 fuer externe Image-Registries.
2. Token-Leak in ps und docker inspect: -e RENOVATE_TOKEN=... macht
den Wert in Process-Listing sichtbar. Stattdessen --env-file mit
einem 0600 tempfile unter $RENOVATE_STATE_DIR/.env, das nach dem
Lauf via shred bzw. rm geloescht wird.
3. Doppelter rc=$? Block plus return innerhalb einer {}-Subshell
waren Tot-Code; aufgeraeumt.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
renovate.json: gitea platform, autodiscover Micha/*, group rules
(major separate, minor+patch+digest grouped, stateful tier-1
individual, komodo-major disabled), pin range strategy, no
automerge, dependency dashboard enabled.
ops/renovate/run-renovate.sh: one-shot docker run wrapper that
reads the Gitea PAT from /mnt/user/appdata/secrets/renovate_token.txt,
runs renovate/renovate:41, logs into /mnt/user/services/renovate/logs/.
docs/RENOVATE.md: 5-step operator setup (Gitea service account,
PAT, token file, first run, six-hourly user script). Explicit
no-automerge stance with notfall-stop checklist.
Cross-doc sweep: SECRETS_MAP entry for renovate_token.txt,
REPO_MAP entry for RENOVATE.md, AUDIT_2026-05-25_TODO new
Sprint 8 with F-15, F-07, F-09 rest, F-12 status, MIGRATION_LOG
captures the four-block sprint in one entry.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirror of the Immich restore-test pattern for the Komodo bootstrap
anchor. Brings up a throwaway komodo-mongo + komodo-core +
komodo-periphery under project restoretest-komodo, isolated from
production:
- same image digests as production (mongo:7.0.32, komodo-core:2,
komodo-periphery:2) to prove compose-level bootstrap compatibility
- restore-lab paths under /mnt/user/backups/restore-lab/komodo
- 127.0.0.1:19120 only, no LAN bind, no Traefik, no Authelia
- test periphery runs WITHOUT docker.sock mount and WITHOUT
/mnt/user/services mount; cannot manage productive containers
- KOMODO_* secrets are throwaway placeholders hardcoded in the test
compose; productive secrets never enter this path
Smoke test: compose config valid, mongo healthy, mongo auth-ping
with test creds, komodo-core HTTP 200/302/303/401, periphery
container running. Report under restore-reports/komodo-bootstrap-*.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- AUDIT_2026-05-25_TODO: Borg-Stale, Cert-Expiry, Container-Down
Alerts auf "erledigt" (Cron */5 textfile exporter live,
Prometheus reload mit 14 Regeln); Gitea-Bundle-Cron auf "erledigt"
(User-Script gitea-bundle-mirror-6h aktiv, Bundles 644);
H:/ Nearline-Pull auf "erledigt (Pull live, Scheduled Task offen)"
mit Zaehlerstaenden 19 Borg-Dumps + 10 Bundle-Files.
- MIGRATION_LOG: neuer Eintrag fasst die drei zusammenhaengenden
Live-Aktivierungen zusammen, inkl. Befund-Ursprung (Permission-
Drift), Reparaturen und expliziter Ausklammerung der nicht
angefassten Themen (Auth, Hermes, USV, FRITZ!Box, Plex).
- H_DRIVE_NEARLINE_PULL: Erstlauf-Befund mit Permission-Issues
und nachgezogenem Stand; Erwartungs-Liste auf real geliefertes
Set angepasst; Flash-Config explizit Out-of-Scope.
- pull-critical-backups.ps1: Live-Robocopy-Output an Out-Null,
damit der Markdown-Report nicht von Robocopy-Strings zerlegt
wird (PowerShell-Pipeline-Quirk im foreach).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pre-backup-dumps.sh: atomic_write nimmt jetzt einen optionalen
mode-Parameter (Default 0644). Damit sind alle DB-/SQLite-/BoltDB-
/Mongo-Dumps konsistent 0644 und vom Nearline-Pull lesbar. Die
sensible unraid-flash-config-Familie (.tar.gz, .sha256, .manifest)
ruft explizit mit mode 600 auf und bleibt damit Operator-only.
Loest das Permission-Problem fuer filebrowser.bolt.dump (Source
ist 0640) im naechsten regulaeren Dump-Lauf.
pull-critical-backups.ps1: Jobs koennen ExcludeFiles ueber /XF
mitliefern. borg-dumps-latest schliesst die unraid-flash-config-
Artefakte aus, weil sie bewusst 0600 bleiben sollen und sonst den
Lauf abbrechen lassen. Restore-Quelle fuer Flash-Config bleibt
das Hetzner-Borg-Repo.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
node-exporter runs as nobody:65534 inside its container and was
hitting node_textfile_scrape_error 1 on homelab.prom, because the
file was 0600 root:root (mktemp default). Set it to 0644 right
before the atomic mv. Bundle inhaltsidentisch zum Git-Repo, ohne
Secrets (.gitignore-abgedeckt) und nicht sensibler als die
uebrigen /mnt/user/backups/borg/dumps/latest/*.dump-Files, die
ebenfalls 0644 sind. So funktioniert auch der Nearline-Pull-Workflow
ueber SMB (docs/H_DRIVE_NEARLINE_PULL.md).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New docs/RESTORE_DRILL_ROUTINE.md introduces a three-stage model:
weekly freshness check, monthly/bimonthly mini-restores, quarterly
DR sanity check. Tracks confirmed mini-restores (Vaultwarden, Gitea,
Paperless 2026-05-07; Immich 2026-05-27) and rotates services by
quarter Q1-Q4. Includes ten-point DR sanity check and abort rules
that point at the drift runbook. No host schedule is created; the
existing ops/restore-tests/schedule.md now references this routine
as the source for quarterly assignment.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>