renovate.json: gitea platform, autodiscover Micha/*, group rules
(major separate, minor+patch+digest grouped, stateful tier-1
individual, komodo-major disabled), pin range strategy, no
automerge, dependency dashboard enabled.
ops/renovate/run-renovate.sh: one-shot docker run wrapper that
reads the Gitea PAT from /mnt/user/appdata/secrets/renovate_token.txt,
runs renovate/renovate:41, logs into /mnt/user/services/renovate/logs/.
docs/RENOVATE.md: 5-step operator setup (Gitea service account,
PAT, token file, first run, six-hourly user script). Explicit
no-automerge stance with notfall-stop checklist.
Cross-doc sweep: SECRETS_MAP entry for renovate_token.txt,
REPO_MAP entry for RENOVATE.md, AUDIT_2026-05-25_TODO new
Sprint 8 with F-15, F-07, F-09 rest, F-12 status, MIGRATION_LOG
captures the four-block sprint in one entry.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mirror of the Immich restore-test pattern for the Komodo bootstrap
anchor. Brings up a throwaway komodo-mongo + komodo-core +
komodo-periphery under project restoretest-komodo, isolated from
production:
- same image digests as production (mongo:7.0.32, komodo-core:2,
komodo-periphery:2) to prove compose-level bootstrap compatibility
- restore-lab paths under /mnt/user/backups/restore-lab/komodo
- 127.0.0.1:19120 only, no LAN bind, no Traefik, no Authelia
- test periphery runs WITHOUT docker.sock mount and WITHOUT
/mnt/user/services mount; cannot manage productive containers
- KOMODO_* secrets are throwaway placeholders hardcoded in the test
compose; productive secrets never enter this path
Smoke test: compose config valid, mongo healthy, mongo auth-ping
with test creds, komodo-core HTTP 200/302/303/401, periphery
container running. Report under restore-reports/komodo-bootstrap-*.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reads live RepoDigests of each running monitoring container and
freezes the compose to the exact image manifest. Brings the
monitoring stack to the same digest-pin discipline as the
stateful tier-1 services. influxdb3-core was already pinned.
Affected: prometheus, alertmanager, alertmanager-ntfy-bridge,
blackbox-exporter, loki, promtail, grafana, node-exporter,
cadvisor (plus a second python:3.13-alpine for the bootstrap
dashboard importer).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Vaultwarden image ships curl, not wget. Switched the CMD-SHELL
test from wget --spider to curl -fsS.
Authelia 4.39.x removed the "helper health-check" subcommand;
use the /api/health endpoint via wget instead (verified inside
the running container).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Authelia ships its own health-check binary subcommand since 4.37+.
Avoids needing wget/curl in the container.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Enable --ping=true and use traefik healthcheck --ping. Lightweight
binary call inside the container, no extra tooling needed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Gitea exposes /api/healthz unauthenticated. 60s start_period
because Gitea sqlite migration on cold start can take a while.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tier-1 health visibility for the shared Redis. Uses redis-cli with
the password from the mounted secret, fails on anything but PONG.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tier-1 health visibility for shared Postgres cluster. pg_isready
against the admin DB; 30s interval, 30s start_period.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Befund vom 2026-05-29: HomelabBorgLastJobCompletedWithWarnings
zuendete vier Tage in Folge mit Borg-Exit-Code 107. Ursache im
Logfile: /local/appdata/homepage wurde am 25.05. entfernt, aber
in der Borg-UI-Source-Liste blieb der Eintrag drin und Borg
warnte taeglich BackupFileNotFoundError. Backups selbst waren
nicht gefaehrdet (alle 23 anderen Quellen sauber archiviert).
Operator hat den Eintrag in der Borg-UI manuell entfernt;
Source-Liste jetzt 23 statt 24, naechster Lauf 2026-05-30 sollte
wieder completed ohne Warning sein.
Erkenntnis: bei Stack-Removal wurde die Borg-Source-Liste nicht
mit-aufgeraeumt. WORKFLOW.md um neuen Abschnitt "Service-Removal-
Checkliste" erweitert mit 9 Pflichtschritten inklusive
Borg-UI-Source-Bereinigung als Schritt 8.
Positiv: die am 2026-05-27 scharfgeschaltete Alert-Pipeline
(Cron Textfile -> node-exporter -> Prometheus -> Alertmanager
-> ntfy-Bridge) hat den Drift binnen 24 h sichtbar gemacht.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Windows Scheduled Task "KalliLab H Drive Nearline Pull" auf dem
Operator-Windows-PC registriert: taeglich 05:30 nach dem Borg-
Dump-Fenster. RunLevel Limited, StartWhenAvailable, Akku-OK,
Execution-Time-Limit 2h. Naechster Lauf 2026-05-29 05:30.
Repo-Snippet in H_DRIVE_NEARLINE_PULL.md korrigiert: PowerShell-
Enum-Wert ist Limited, nicht LeastPrivilege (alter Snippet haette
beim ersten Register-ScheduledTask einen Parameter-Binding-Fehler
geworfen). Status auf "produktiv" gesetzt.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Operator-Entscheidungen 2026-05-28:
- F-03 zweites Off-site: bewusst NICHT umgesetzt. 3-2-1 ist mit
Live + lokalem Borg + Hetzner + H:/-Nearline erfuellt; ein
zweites Off-site deckt nur den Fall "Hetzner-Account verloren"
ab, Aufwand unverhaeltnismaessig fuer Familien-Homelab.
Stattdessen drei Folge-TODOs zur Haertung der bestehenden
Topologie. Hetzner-2FA bewusst ohne (Operator-Praeferenz,
analog USV-Risiko-Akzeptanz), durch starkes Passwort +
Backup-Zahlungsweg + Login-Mails ersetzt. Borg-Append-Only-
Befund: Repo laeuft im Mode 'full', custom_flags leer; Setup
waere server-seitig in Hetzner-authorized_keys (Folge-Sprint).
Review-Trigger in OFFSITE_BACKUP_OPTIONS.md dokumentiert.
- paperless-gpt: behalten bis Paperless-NGX 3.0 (erwartete
native KI-Features). Aktuell 0 Traefik-Zugriffe in 7 Tagen,
Resource-Footprint 34 MB RAM.
- BentoPDF: behalten als situatives Tool. 0 Traefik-Zugriffe,
4 MB RAM. Begruendungs-Anker im SERVICE_CATALOG.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Operator-Befund beim F-17-Versuch: Plex-Server war seit 18.05.
unclaimed (Preferences.xml ohne PlexOnline*) und Library-Sections
leer. Filmdateien unter /mnt/user/media/* blieben unangetastet.
Reclaim als Xeridos via inline PLEX_CLAIM-Env beim docker compose
force-recreate. Token nirgendwo persistiert (kein .env, kein Repo,
keine Komodo-Stack-ENV); zweiter Recreate ohne Token, damit
docker inspect-Snapshot sauber bleibt.
Endstand: PlexOnlineUsername Xeridos, PlexOnlineHome 1,
PublishServerOnPlexOnlineKey 0 (Remote Access aus). Bibliotheken
operator-seitig wieder eingerichtet (/data/movies 1.4 TB,
/data/Heimatfilme 300 GB). Plex bleibt LAN/Tailscale-only,
konsistent zur FRITZBox-Bereinigung vom selben Tag.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- AUDIT_2026-05-25_TODO: Borg-Stale, Cert-Expiry, Container-Down
Alerts auf "erledigt" (Cron */5 textfile exporter live,
Prometheus reload mit 14 Regeln); Gitea-Bundle-Cron auf "erledigt"
(User-Script gitea-bundle-mirror-6h aktiv, Bundles 644);
H:/ Nearline-Pull auf "erledigt (Pull live, Scheduled Task offen)"
mit Zaehlerstaenden 19 Borg-Dumps + 10 Bundle-Files.
- MIGRATION_LOG: neuer Eintrag fasst die drei zusammenhaengenden
Live-Aktivierungen zusammen, inkl. Befund-Ursprung (Permission-
Drift), Reparaturen und expliziter Ausklammerung der nicht
angefassten Themen (Auth, Hermes, USV, FRITZ!Box, Plex).
- H_DRIVE_NEARLINE_PULL: Erstlauf-Befund mit Permission-Issues
und nachgezogenem Stand; Erwartungs-Liste auf real geliefertes
Set angepasst; Flash-Config explizit Out-of-Scope.
- pull-critical-backups.ps1: Live-Robocopy-Output an Out-Null,
damit der Markdown-Report nicht von Robocopy-Strings zerlegt
wird (PowerShell-Pipeline-Quirk im foreach).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pre-backup-dumps.sh: atomic_write nimmt jetzt einen optionalen
mode-Parameter (Default 0644). Damit sind alle DB-/SQLite-/BoltDB-
/Mongo-Dumps konsistent 0644 und vom Nearline-Pull lesbar. Die
sensible unraid-flash-config-Familie (.tar.gz, .sha256, .manifest)
ruft explizit mit mode 600 auf und bleibt damit Operator-only.
Loest das Permission-Problem fuer filebrowser.bolt.dump (Source
ist 0640) im naechsten regulaeren Dump-Lauf.
pull-critical-backups.ps1: Jobs koennen ExcludeFiles ueber /XF
mitliefern. borg-dumps-latest schliesst die unraid-flash-config-
Artefakte aus, weil sie bewusst 0600 bleiben sollen und sonst den
Lauf abbrechen lassen. Restore-Quelle fuer Flash-Config bleibt
das Hetzner-Borg-Repo.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
node-exporter runs as nobody:65534 inside its container and was
hitting node_textfile_scrape_error 1 on homelab.prom, because the
file was 0600 root:root (mktemp default). Set it to 0644 right
before the atomic mv. Bundle inhaltsidentisch zum Git-Repo, ohne
Secrets (.gitignore-abgedeckt) und nicht sensibler als die
uebrigen /mnt/user/backups/borg/dumps/latest/*.dump-Files, die
ebenfalls 0644 sind. So funktioniert auch der Nearline-Pull-Workflow
ueber SMB (docs/H_DRIVE_NEARLINE_PULL.md).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull repo map up to include FAMILY_VIEW_DASHBOARD,
RESTORE_DRILL_ROUTINE, IMMICH_RESTORE_TEST, FRITZBOX_PORT_
CORRECTION_PLAN and OFFSITE_BACKUP_OPTIONS.
Mark Sprint 2 'Komodo bootstrap', Sprint 3 'Family-View',
Sprint 4 'Family onboarding' and Sprint 7 'Quarterly restore drill'
as done with explicit completion notes. Carry the FRITZBox UPnP
finding forward; tag the second offsite item as decision-pending.
Add two doc-only migration log entries for the bootstrap/family/
onboarding/drill sprint and for the FRITZBox/offsite preparation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two operator decision documents, doc-only, no live action:
- docs/FRITZBOX_PORT_CORRECTION_PLAN.md prepares the three open
router items: remove 80/tcp (no HTTP-01 in use), do not add
222/tcp while Tailscale remains the operator path, deactivate
the UPnP self-exposure from PC-192-168-178-71. Every step waits
for operator go.
- docs/OFFSITE_BACKUP_OPTIONS.md compares rsync.net, BorgBase EU2
and rotating cold disk for a second offsite target. Recommends
rsync.net or cold disk; BorgBase EU2 is explicitly not
recommended because it does not separate the provider risk.
No provider booked, no costs triggered.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New docs/RESTORE_DRILL_ROUTINE.md introduces a three-stage model:
weekly freshness check, monthly/bimonthly mini-restores, quarterly
DR sanity check. Tracks confirmed mini-restores (Vaultwarden, Gitea,
Paperless 2026-05-07; Immich 2026-05-27) and rotates services by
quarter Q1-Q4. Includes ten-point DR sanity check and abort rules
that point at the drift runbook. No host schedule is created; the
existing ops/restore-tests/schedule.md now references this routine
as the source for quarterly assignment.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Set status to final pre-invitation, soften the 2FA section to
app-specific 2FA (no SSO promise while Authelia-OIDC stays parked),
add a 'bewusst nicht versprochen' block (no single sign-on, no
24/7 SLA, no hotline support, no data sharing), and refine the
2FA loss guidance.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New docs/FAMILY_VIEW_DASHBOARD.md specifies the homelab-family-view
Grafana dashboard: 8 panels covering endpoints up, Borg freshness,
cert days, critical containers, disk usage, endpoint table, cert
table and container status. Includes PromQL queries, thresholds,
layout grid, datasource references, build order and smoke test.
Dashboard JSON is intentionally not created yet because the
Borg-stale / cert-expiry / container-down metrics from Sprint 3
are still pending.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add explicit stages A-F to docs/SERVICES_RECOVERY.md: host/docker
baseline, repo source, secrets order, Komodo start, web/GitOps
validation, tier stack rollout. Recovery anchor is ops/komodo/
docker-compose.yml; the self-stack is explicitly not the anchor.
Link DISASTER_RECOVERY Phase 4 stage 3 to the new bootstrap section
and the stack-env-only secrets section in SECRETS_MAP.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>