Commit Graph

12 Commits

Author SHA1 Message Date
Micha c677ef0515 Add service removal checklist after stale Borg source finding
Befund vom 2026-05-29: HomelabBorgLastJobCompletedWithWarnings
zuendete vier Tage in Folge mit Borg-Exit-Code 107. Ursache im
Logfile: /local/appdata/homepage wurde am 25.05. entfernt, aber
in der Borg-UI-Source-Liste blieb der Eintrag drin und Borg
warnte taeglich BackupFileNotFoundError. Backups selbst waren
nicht gefaehrdet (alle 23 anderen Quellen sauber archiviert).

Operator hat den Eintrag in der Borg-UI manuell entfernt;
Source-Liste jetzt 23 statt 24, naechster Lauf 2026-05-30 sollte
wieder completed ohne Warning sein.

Erkenntnis: bei Stack-Removal wurde die Borg-Source-Liste nicht
mit-aufgeraeumt. WORKFLOW.md um neuen Abschnitt "Service-Removal-
Checkliste" erweitert mit 9 Pflichtschritten inklusive
Borg-UI-Source-Bereinigung als Schritt 8.

Positiv: die am 2026-05-27 scharfgeschaltete Alert-Pipeline
(Cron Textfile -> node-exporter -> Prometheus -> Alertmanager
-> ntfy-Bridge) hat den Drift binnen 24 h sichtbar gemacht.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 15:01:45 +02:00
Micha 5cb401797d Bind AdGuard admin to Tailscale 2026-05-26 14:55:49 +02:00
Micha cd650b19ac Close Gitea signup, dedup posture-check alerts, extend Borg scope
Operational hardening across several services after live incident
analysis between 2026-05-18 and 2026-05-20:

- Gitea: disable public registration and OpenID signup/signin to
  stop the external POST / 5xx bursts that triggered availability
  alerts. New repo-wide policy requires every productive
  Micha/homelab-infra Komodo stack to ship with an active
  Gitea->Komodo webhook on the current stack ID (documented in
  CLAUDE.md, AI_CONTEXT.md, WORKFLOW.md).
- posture-check: extract the Disk1 fstype check into its own
  function so the documented Disk1 NTFS exception no longer raises
  ntfy warnings, skip POSIX inode checks on NTFS, and dedup ntfy
  alerts via a fingerprint state file with ALERT_REPEAT_SECONDS
  (default 24h). Repeat-spam on the same cause now suppressed.
- docker-critical-events: parse the event JSON for container name,
  action, exit code and signal; drop `die exit=0` events (clean
  stops); ship a structured ntfy message instead of the raw event
  line.
- Borg UI: mount /mnt/user/services into the backup container as
  /local/services:ro and include homelab-infra, stacks and
  posture-check in all-important-sources.txt. RESTORE_MATRIX and
  DISASTER_RECOVERY updated accordingly.
- Unraid user scripts: document the new
  homelab-operations-report-daily cron job and the SMTP password
  file it expects on the host.
- MIGRATION_LOG: capture the four live events from this window -
  Gitea 5xx burst + signup closure, Komodo webhook reconciliation,
  posture-check host-version verification, Borg scope extension,
  and Traefik 5xx alert detuning.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 11:05:35 +02:00
Micha ef3b546d30 Align documentation consistency fixes 2026-05-16 20:04:46 +02:00
Micha 16416d964f Add restore test automation scaffolding 2026-05-07 11:07:46 +02:00
Micha 821fe99807 Document GitOps drift recovery and InfluxDB LAN access 2026-05-04 15:23:49 +02:00
Micha 718305cb98 Update Doku
Update Docu
2026-04-17 11:29:38 +02:00
Micha 96d9015867 Harden code-server and move Redis password to secret file
Harden code-server and move Redis password to secret file
2026-04-17 07:56:29 +02:00
Micha d362a9ab4c Aktualisierung
Aktualisierung meiner Doku
2026-04-15 12:21:02 +02:00
Micha 88c07e7d66 docs/WORKFLOW.md aktualisiert 2026-03-30 08:53:45 +00:00
Micha 430d2e3919 docs/WORKFLOW.md aktualisiert 2026-03-28 19:47:07 +00:00
Micha a9f5496c12 docs/WORKFLOW.md hinzugefügt
Add GitOps workflow and no-drift rules
2026-03-23 18:41:08 +00:00