Backup audit hardening: dump freshness monitoring and scope consistency

Implements the findings from the backup/restore audit of 2026-06-18:

- Dump freshness exposed as Prometheus metrics
  (homelab_borg_dump_present / homelab_borg_dump_age_seconds) in the
  host exporter; closes the blind spot where Borg keeps running and
  archives stale dumps without any job failure.
- New alerts HomelabBorgDumpMissing / HomelabBorgDumpStale (critical),
  plus ALERT_RULES.md.
- Freshness gate (.sh + .ps1) and H: nearline pull extended to cover
  n8n.sqlite.dump and postgresql17-globals.sql.
- Critical container watch extended with mail-archiver, n8n,
  homeassistant, and smarthome-mosquitto.
- BACKUP_SCOPE: /mnt/user/projekte and other user shares outside the
  app scope documented as a deliberate, still-open operator decision;
  Hermes data path clarified as parked.
- MASTER_TODO: added nearline pull monitoring, host pull follow-up, and
  the projekte scope decision.
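
For illustration only, a minimal sketch of how the two dump freshness
metrics could be emitted in Prometheus text format. The function name,
the directory argument, and the reliance on GNU stat are assumptions;
this is not the host exporter's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print dump freshness metrics in Prometheus text format.
# emit_dump_metrics and its arguments are illustrative, not taken from the repo.

emit_dump_metrics() {
  local dump_dir="$1"; shift            # directory holding the latest dump set
  local now name path mtime
  now="$(date +%s)"
  for name in "$@"; do                  # remaining arguments: expected dump file names
    path="${dump_dir}/${name}"
    if [ -f "$path" ]; then
      mtime="$(stat -c %Y "$path")"     # GNU stat: modification time in epoch seconds
      printf 'homelab_borg_dump_present{dump="%s"} 1\n' "$name"
      printf 'homelab_borg_dump_age_seconds{dump="%s"} %d\n' "$name" "$((now - mtime))"
    else
      # A missing dump reports present=0 and no age sample.
      printf 'homelab_borg_dump_present{dump="%s"} 0\n' "$name"
    fi
  done
}
```

Called as `emit_dump_metrics /path/to/dumps n8n.sqlite.dump
postgresql17-globals.sql`, this prints one present sample per expected
dump and an age sample only for dumps that exist, so a missing dump and
a stale dump stay distinguishable, as the two alerts require.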

Also includes the previously prepared scope extensions (nextcloud
html+data, n8n, filebrowser, influxdb3) and the scope drift, retention,
compact, and check alerts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 20:25:54 +02:00
parent 5171059dd1
commit bc9ace315a
11 changed files with 243 additions and 7 deletions
+72
@@ -131,6 +131,78 @@ groups:
          summary: "Latest Borg backup completed with warnings"
          description: "The latest Borg UI job completed with warnings for archive {{ $labels.archive }}."

      - alert: HomelabBorgScopeSourceListMissing
        expr: homelab_borg_scope_expected_file_present != 1
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg expected source list is not visible"
          description: "Borg UI cannot see the repo source list used for drift checks."

      - alert: HomelabBorgScopeMissingSources
        expr: homelab_borg_scope_missing_sources_total > 0
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg UI is missing expected backup sources"
          description: "Borg UI is missing {{ $value }} source path(s) from ops/borg-ui/all-important-sources.txt."

      - alert: HomelabBorgScopeExtraSources
        expr: homelab_borg_scope_extra_sources_total > 0
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg UI has sources not tracked in the repo"
          description: "Borg UI has {{ $value }} source path(s) that are not listed in ops/borg-ui/all-important-sources.txt."

      - alert: HomelabBorgDumpMissing
        expr: homelab_borg_dump_present == 0
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg pre-backup dump is missing: {{ $labels.dump }}"
          description: "Expected dump artifact {{ $labels.dump }} is not present in the latest dump set. The pre-backup dump job may have failed or stopped."

      - alert: HomelabBorgDumpStale
        expr: homelab_borg_dump_age_seconds > 30 * 60 * 60
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg pre-backup dump is stale: {{ $labels.dump }}"
          description: "Dump artifact {{ $labels.dump }} is older than 30 hours. pre-backup-dumps.sh may have stopped; Borg would keep archiving stale database content without a job failure."

      - alert: HomelabBorgRepositoryCheckStale
        expr: time() - homelab_borg_repository_last_check_timestamp_seconds > 14 * 24 * 60 * 60
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg repository check is stale"
          description: "Borg repository {{ $labels.repository }} has not had a recorded check for more than 14 days."

      - alert: HomelabBorgRetentionDisabled
        expr: homelab_borg_schedule_prune_after_enabled != 1
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg retention pruning is disabled"
          description: "Scheduled Borg job {{ $labels.schedule }} does not run prune after backup."

      - alert: HomelabBorgCompactDisabled
        expr: homelab_borg_schedule_compact_after_enabled != 1
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg compaction is disabled"
          description: "Scheduled Borg job {{ $labels.schedule }} does not run compact after backup."

      - alert: HomelabCriticalContainerDown
        expr: homelab_critical_container_running == 0
        for: 5m