Backup audit hardening: dump freshness monitoring and scope consistency

Implements findings from the backup/restore audit of 2026-06-18:

- Dump freshness as Prometheus metrics (homelab_borg_dump_present / homelab_borg_dump_age_seconds) in the host exporter; closes the blind spot where Borg keeps running and archives stale dumps without any job failure.
- New alerts HomelabBorgDumpMissing / HomelabBorgDumpStale (critical), plus ALERT_RULES.md.
- Freshness gate (.sh + .ps1) and H: nearline pull extended to cover n8n.sqlite.dump and postgresql17-globals.sql.
- Critical container watch extended with mail-archiver, n8n, homeassistant, smarthome-mosquitto.
- BACKUP_SCOPE: /mnt/user/projekte and other user shares outside the app scope documented as a deliberate, still-open operator decision; Hermes-data path clarified as parked.
- MASTER_TODO: added nearline pull monitoring, host pull follow-up, and the projekte scope decision.

Also includes the previously prepared scope extensions (nextcloud html+data, n8n, filebrowser, influxdb3) and the scope drift / retention / compact / check alerts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
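
The dump freshness metrics described in the commit message could be produced roughly as follows. This is a minimal sketch, not the host exporter from this repository: the function name `render_dump_metrics`, the dump directory argument, and the choice to omit the age series for a missing file are all assumptions. Only the metric names, the `dump` label, and the two file names come from the commit itself.

```python
import os
import tempfile
import time
from pathlib import Path

# Dump names taken from the commit message; the real list may be longer.
EXPECTED_DUMPS = ("n8n.sqlite.dump", "postgresql17-globals.sql")


def render_dump_metrics(dump_dir, expected=EXPECTED_DUMPS, now=None):
    """Return Prometheus text-format lines for dump presence and age.

    Hypothetical helper: samples are grouped per metric family, and the
    age series is skipped when the dump file does not exist.
    """
    now = time.time() if now is None else now
    present_lines = ["# TYPE homelab_borg_dump_present gauge"]
    age_lines = ["# TYPE homelab_borg_dump_age_seconds gauge"]
    for name in expected:
        path = Path(dump_dir) / name
        present = path.is_file()
        present_lines.append(
            f'homelab_borg_dump_present{{dump="{name}"}} {int(present)}'
        )
        if present:
            # Age is derived from the file's modification time.
            age = max(0.0, now - path.stat().st_mtime)
            age_lines.append(
                f'homelab_borg_dump_age_seconds{{dump="{name}"}} {age:.0f}'
            )
    return present_lines + age_lines


if __name__ == "__main__":
    # Demo in a throwaway directory: one dump that is an hour old, one missing.
    with tempfile.TemporaryDirectory() as tmp:
        dump = Path(tmp) / "n8n.sqlite.dump"
        dump.write_text("placeholder")
        an_hour_ago = time.time() - 3600
        os.utime(dump, (an_hour_ago, an_hour_ago))
        print("\n".join(render_dump_metrics(tmp)))
```

With metrics of this shape, a missing file trips `homelab_borg_dump_present == 0`, and a file that exists but is no longer being rewritten trips the age threshold, which is the case a Borg job status alone would not reveal.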
@@ -131,6 +131,78 @@ groups:
          summary: "Latest Borg backup completed with warnings"
          description: "The latest Borg UI job completed with warnings for archive {{ $labels.archive }}."

      - alert: HomelabBorgScopeSourceListMissing
        expr: homelab_borg_scope_expected_file_present != 1
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg expected source list is not visible"
          description: "Borg UI cannot see the repo source list used for drift checks."

      - alert: HomelabBorgScopeMissingSources
        expr: homelab_borg_scope_missing_sources_total > 0
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg UI is missing expected backup sources"
          description: "Borg UI is missing {{ $value }} source path(s) from ops/borg-ui/all-important-sources.txt."

      - alert: HomelabBorgScopeExtraSources
        expr: homelab_borg_scope_extra_sources_total > 0
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg UI has sources not tracked in the repo"
          description: "Borg UI has {{ $value }} source path(s) that are not listed in ops/borg-ui/all-important-sources.txt."

      - alert: HomelabBorgDumpMissing
        expr: homelab_borg_dump_present == 0
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg pre-backup dump is missing: {{ $labels.dump }}"
          description: "Expected dump artifact {{ $labels.dump }} is not present in the latest dump set. The pre-backup dump job may have failed or stopped."

      - alert: HomelabBorgDumpStale
        expr: homelab_borg_dump_age_seconds > 30 * 60 * 60
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Borg pre-backup dump is stale: {{ $labels.dump }}"
          description: "Dump artifact {{ $labels.dump }} is older than 30 hours. pre-backup-dumps.sh may have stopped; Borg would keep archiving stale database content without a job failure."

      - alert: HomelabBorgRepositoryCheckStale
        expr: time() - homelab_borg_repository_last_check_timestamp_seconds > 14 * 24 * 60 * 60
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg repository check is stale"
          description: "Borg repository {{ $labels.repository }} has not had a recorded check for more than 14 days."

      - alert: HomelabBorgRetentionDisabled
        expr: homelab_borg_schedule_prune_after_enabled != 1
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg retention pruning is disabled"
          description: "Scheduled Borg job {{ $labels.schedule }} does not run prune after backup."

      - alert: HomelabBorgCompactDisabled
        expr: homelab_borg_schedule_compact_after_enabled != 1
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Borg compaction is disabled"
          description: "Scheduled Borg job {{ $labels.schedule }} does not run compact after backup."

      - alert: HomelabCriticalContainerDown
        expr: homelab_critical_container_running == 0
        for: 5m
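
The two scope drift alerts above compare the sources configured in Borg UI against ops/borg-ui/all-important-sources.txt. The comparison behind `homelab_borg_scope_missing_sources_total` and `homelab_borg_scope_extra_sources_total` can be pictured as two set differences. The sketch below assumes a file format of one path per line with `#` comments and ignores trailing slashes; both helper names and the example paths are hypothetical, and the actual exporter may work differently.

```python
def load_source_list(text):
    """Parse a source list: one path per line, skipping blanks and '#' comments.

    Assumed format; trailing slashes are dropped so that '/a/b/' and '/a/b'
    compare equal.
    """
    paths = set()
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            paths.add(line.rstrip("/") or "/")
    return paths


def scope_drift(expected, configured):
    """Return (missing, extra) as sorted lists of paths.

    missing: listed in the repo file but absent from the backup job.
    extra:   present in the backup job but not listed in the repo file.
    """
    missing = sorted(expected - configured)
    extra = sorted(configured - expected)
    return missing, extra


if __name__ == "__main__":
    # Illustrative paths only; they are not taken from the repository.
    expected = load_source_list(
        "# tracked sources\n/mnt/user/appdata/n8n\n/mnt/user/appdata/nextcloud/\n"
    )
    configured = {"/mnt/user/appdata/n8n", "/mnt/user/tmp"}
    missing, extra = scope_drift(expected, configured)
    print("missing_sources_total", len(missing), missing)
    print("extra_sources_total", len(extra), extra)
```

Treating a missing path as critical and an extra path as a warning matches the severities in the rules: a missing source means data is silently not backed up, while an extra source only means the repo file lags behind the job.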