homelab-infra/ops/borg-ui/scripts/pre-borg.sh
Micha f296338530 monitoring + backup: stale-handle hardening and dead-man's switch
Completes the local code state for two open MASTER_TODO items.

monitoring: the remaining single-file bind mounts (alertmanager, blackbox,
loki, promtail, alertmanager-ntfy-bridge) are switched to directory mounts,
analogous to the Prometheus fix of 2026-06-19. This avoids "Stale NFS file
handle" on the /mnt/user FUSE share during git/Komodo updates.
grafana-provisioning was already a directory mount. `docker compose config`
passes. Deploying requires --force-recreate because the mount target paths
change.

backup: endpoint-agnostic dead-man's switch (Healthchecks-compatible,
cloud or self-hosted) in pull-critical-backups.ps1 and pre-borg.sh.
It pings /start, success, and /fail; without a configured URL it is a no-op,
so it never breaks a run. Ping URLs are capability URLs and stay outside the
repo as secrets.

Docs: SECRETS_MAP, Nearline README, and MASTER_TODO updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 17:54:53 +02:00
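
The commit message does not spell out why directory mounts help. One plausible mechanism, stated here as an assumption, is that an updater replaces a config file with a new one instead of rewriting it in place: the file then gets a new inode, while a single-file bind mount keeps referring to the old one. The following sketch only demonstrates the inode change on a scratch directory; the file name and the write-then-rename update are illustrative, and it does not create any real bind mount.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the inode number of a file or directory.
inode_of() { ls -di -- "$1" | awk '{print $1}'; }

# Scratch directory standing in for a service's config directory (hypothetical).
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
mkdir "$demo_dir/config"
printf 'route: v1\n' > "$demo_dir/config/alertmanager.yml"

file_before="$(inode_of "$demo_dir/config/alertmanager.yml")"
dir_before="$(inode_of "$demo_dir/config")"

# Assumed update pattern: write a new file, then rename it over the old one.
printf 'route: v2\n' > "$demo_dir/config/alertmanager.yml.new"
mv "$demo_dir/config/alertmanager.yml.new" "$demo_dir/config/alertmanager.yml"

file_after="$(inode_of "$demo_dir/config/alertmanager.yml")"
dir_after="$(inode_of "$demo_dir/config")"

echo "file inode: $file_before -> $file_after"
echo "dir inode:  $dir_before -> $dir_after"
```

Under that assumption the file inode differs after the update while the directory inode stays the same, which is consistent with mounting the directory rather than the individual file.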

85 lines
2.8 KiB
Bash
Executable File

#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="${REPO_ROOT:-$(cd "$SCRIPT_DIR/../../.." && pwd)}"
POSTURE_CHECK="${POSTURE_CHECK:-$REPO_ROOT/services/posture-check/posture-check.sh}"
FRESHNESS_CHECK="${FRESHNESS_CHECK:-$REPO_ROOT/ops/restore-tests/check-restore-freshness.sh}"
PRE_BACKUP_DUMPS="${PRE_BACKUP_DUMPS:-$SCRIPT_DIR/pre-backup-dumps.sh}"
NTFY_SCRIPT="${NTFY_SCRIPT:-$REPO_ROOT/ops/restore-tests/send-ntfy.sh}"
NTFY_TOPIC="${NTFY_TOPIC:-homelab-alerts}"
ALLOW_POSTURE_WARNING="${ALLOW_POSTURE_WARNING:-1}"
case "${DUMP_ROOT:-}" in
  */latest)
    FRESHNESS_DUMP_ROOT="${FRESHNESS_DUMP_ROOT:-$DUMP_ROOT}"
    ;;
  "")
    FRESHNESS_DUMP_ROOT="${FRESHNESS_DUMP_ROOT:-/mnt/user/backups/borg/dumps/latest}"
    ;;
  *)
    FRESHNESS_DUMP_ROOT="${FRESHNESS_DUMP_ROOT:-$DUMP_ROOT/latest}"
    ;;
esac
# External dead-man's switch (endpoint-agnostic: Healthchecks.io cloud or
# self-hosted). ntfy only reports failures of a run that actually started;
# the external switch additionally catches the case where the pre-hook never
# runs at all. The ping URL is a capability URL -> treat it like a secret and
# never commit it to the repo. If no URL is set, the switch is a no-op.
HEALTHCHECKS_URL="${HEALTHCHECKS_URL:-${HEALTHCHECKS_BORG_URL:-}}"
HEALTHCHECKS_URL_FILE="${HEALTHCHECKS_URL_FILE:-/mnt/user/appdata/secrets/healthchecks_borg_url}"
if [ -z "$HEALTHCHECKS_URL" ] && [ -r "$HEALTHCHECKS_URL_FILE" ]; then
  HEALTHCHECKS_URL="$(tr -d '[:space:]' < "$HEALTHCHECKS_URL_FILE")"
fi
hc_ping() {
  # $1: optional suffix ("/start" | "/fail"); empty = success
  [ -n "$HEALTHCHECKS_URL" ] || return 0
  command -v curl >/dev/null 2>&1 || return 0
  # Best effort: a failed ping must never abort the backup run.
  curl -fsS -m 10 --retry 3 "${HEALTHCHECKS_URL}${1:-}" >/dev/null 2>&1 || true
}
notify_failure() {
  local step="$1"
  local message="$2"
  if [ -x "$NTFY_SCRIPT" ]; then
    "$NTFY_SCRIPT" "$NTFY_TOPIC" "Borg pre-hook failed: $step" "$message" high || true
  fi
  hc_ping "/fail"
}
run_step() {
  local step="$1"
  local rc
  shift
  echo "[pre-borg] Running $step"
  if "$@"; then
    echo "[pre-borg] OK: $step"
  else
    # In the else branch, $? still holds the exit status of the failed command.
    rc=$?
    notify_failure "$step" "Command failed with exit code $rc: $*"
    exit "$rc"
  fi
}
hc_ping "/start"
echo "[pre-borg] Running posture-check"
if "$POSTURE_CHECK"; then
  echo "[pre-borg] OK: posture-check"
else
  rc=$?
  if [ "$rc" -eq 1 ] && [ "$ALLOW_POSTURE_WARNING" = "1" ]; then
    echo "[pre-borg] WARNING: posture-check returned warnings; continuing because ALLOW_POSTURE_WARNING=1"
  else
    notify_failure "posture-check" "Command failed with exit code $rc: $POSTURE_CHECK"
    exit "$rc"
  fi
fi
run_step "pre-backup-dumps" "$PRE_BACKUP_DUMPS"
run_step "restore-freshness" env DUMP_ROOT="$FRESHNESS_DUMP_ROOT" "$FRESHNESS_CHECK"
echo "[pre-borg] All pre-flight checks passed"
hc_ping
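
The ping behaviour of the script above (no-op without a URL, then /start, success, and /fail) can be checked offline. The sketch below is a hypothetical harness, not part of the repo: it puts a stub `curl` first on `PATH` that only logs the URL it is given, copies `hc_ping` from the script, and uses a placeholder URL under `example.invalid`. The names `workdir`, `CURL_LOG`, and `pings` are assumptions made for this illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch area holding the stub curl and its log (hypothetical harness).
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/bin"
export CURL_LOG="$workdir/curl.log"
: > "$CURL_LOG"

# Stub curl: record the last argument (the URL) instead of sending a request.
cat > "$workdir/bin/curl" <<'EOF'
#!/usr/bin/env bash
printf '%s\n' "${@: -1}" >> "$CURL_LOG"
EOF
chmod +x "$workdir/bin/curl"
PATH="$workdir/bin:$PATH"

# Copy of the helper from pre-borg.sh.
hc_ping() {
  [ -n "$HEALTHCHECKS_URL" ] || return 0
  command -v curl >/dev/null 2>&1 || return 0
  curl -fsS -m 10 --retry 3 "${HEALTHCHECKS_URL}${1:-}" >/dev/null 2>&1 || true
}

# Without a URL the helper returns early and nothing is logged.
HEALTHCHECKS_URL=""
hc_ping "/start"

# Placeholder capability URL; a real one is a secret and stays out of the repo.
HEALTHCHECKS_URL="https://hc.example.invalid/ping/00000000-0000-0000-0000-000000000000"
hc_ping "/start"   # run started
hc_ping            # run succeeded
hc_ping "/fail"    # run failed

pings="$(cat "$CURL_LOG")"
printf '%s\n' "$pings"
```

The log should contain exactly three URLs: the base URL with `/start`, the bare base URL, and the base URL with `/fail`. Because the suffix is appended verbatim, a base URL stored with a trailing slash would yield a double slash, so it seems safest to store it without one.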