Compare commits
1 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 410b20d74f |
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
mail-archiver:
|
||||
image: s1t5/mailarchiver@sha256:9860b170040dc096aeedc47bdbb09e2913aa8742eec205e29edd0ab79f9dda7e
|
||||
image: s1t5/mailarchiver@sha256:4ea7ecc47ad1dd2c523b85c3967574b61e39def1b6fd26edf874e21733c4018c
|
||||
container_name: mail-archiver
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
nextcloud:
|
||||
image: nextcloud:33.0.5-apache@sha256:56bdc45109067500fd0832fa64832b7c77a167d9394cbf5f0f4b59740b94194d
|
||||
image: nextcloud:33.0.5-apache@sha256:fe5166b04f217c9e3a263bfe76eef5e13de0dd3919966827de1b47bae2308b01
|
||||
container_name: nextcloud
|
||||
restart: unless-stopped
|
||||
depends_on:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
super-productivity:
|
||||
image: johannesjo/super-productivity:v18.11.0@sha256:bbf510b159262a00ab133818909c2234c7fc71bc59907ab58a810a6d3c15cc43
|
||||
image: johannesjo/super-productivity:v18.9.1@sha256:773760107344e739f4c29409f7842db66a1b167d50eb2c40248cb5b5b328652e
|
||||
container_name: super-productivity
|
||||
restart: unless-stopped
|
||||
|
||||
|
||||
+1
-9
@@ -1,6 +1,6 @@
|
||||
# Alert Rules
|
||||
|
||||
Stand: 2026-06-18
|
||||
Stand: 2026-06-05
|
||||
|
||||
Diese Datei beschreibt die produktiven Alarmwege und wichtigsten Regeln. Die
|
||||
Konfiguration selbst liegt in `monitoring/prometheus/alerts.yml` und in den
|
||||
@@ -36,14 +36,6 @@ Skripten unter `services/posture-check/`.
|
||||
| `HomelabBorgBackupStale` | letztes Borg-Backup >30h | warning | Backup-Lauf nachholen/pruefen |
|
||||
| `HomelabBorgLastJobFailed` | letzter Borg-Job fehlgeschlagen | critical | Borg-UI-Job-Log pruefen |
|
||||
| `HomelabBorgLastJobCompletedWithWarnings` | letzter Borg-Job mit Warnungen | warning | Warnung im Borg-UI-Job lesen |
|
||||
| `HomelabBorgDumpMissing` | erwartetes Dump-Artefakt fehlt im aktuellen Dump-Set | critical | `pre-backup-dumps.sh`/User-Script pruefen |
|
||||
| `HomelabBorgDumpStale` | Dump-Artefakt >30h alt (Borg laeuft, Dumps eingefroren) | critical | `pre-backup-dumps.sh`/User-Script pruefen, nicht nur den Borg-Job |
|
||||
| `HomelabBorgScopeSourceListMissing` | Repo-Quellliste fuer Borg-Drift-Check fehlt | critical | Borg-UI-Mount `/local/services/homelab-infra` und Repo-Pfad pruefen |
|
||||
| `HomelabBorgScopeMissingSources` | Borg UI enthaelt nicht alle Pfade aus `ops/borg-ui/all-important-sources.txt` | critical | Live-Borg-Scope an Repo-Quelle angleichen |
|
||||
| `HomelabBorgScopeExtraSources` | Borg UI enthaelt Pfade ausserhalb der Repo-Quellliste | warning | Doku oder Live-Scope bereinigen |
|
||||
| `HomelabBorgRepositoryCheckStale` | letzter Borg-Check >14 Tage alt | warning | Borg-Repository-Check ausfuehren oder Scheduler pruefen |
|
||||
| `HomelabBorgRetentionDisabled` | Scheduled Job fuehrt kein Prune aus | warning | Retention-Einstellung in Borg UI pruefen |
|
||||
| `HomelabBorgCompactDisabled` | Scheduled Job fuehrt kein Compact aus | warning | Compact-Einstellung in Borg UI pruefen |
|
||||
| `HomelabCriticalContainerDown` | kritischer Container fehlt | critical | Komodo/Docker-Status pruefen |
|
||||
| `HomelabPrometheusTargetDown` | Scrape-Ziel down | critical | node-exporter/cadvisor/blackbox/traefik pruefen |
|
||||
|
||||
|
||||
@@ -1,16 +1,14 @@
|
||||
# Authelia OIDC fuer Apps - Plan & Runbook
|
||||
|
||||
Stand: 2026-06-17. Authelia-Version: **v4.39.20**.
|
||||
Stand: 2026-06-06. Authelia-Version: **v4.39.20**.
|
||||
|
||||
Ziel: App-uebergreifendes Single-Sign-On ueber Authelia als OpenID-Connect-Provider
|
||||
(`https://auth.kaleschke.info`). Statt pro App eigener Logins meldet man sich einmal
|
||||
bei Authelia an (inkl. 2FA) und wird per OIDC an die App durchgereicht.
|
||||
|
||||
> **Status:** aktives Runbook. Grafana und Mealie sind seit 2026-06-06 live
|
||||
> und per Login-Smoke verifiziert. Paperless ist seit 2026-06-17 technisch
|
||||
> verdrahtet (Authelia-Client + Stack-ENV-Secret + Service-Smoke gruen);
|
||||
> finaler Browser-Login mit Operator-Account bleibt offen. Der Rollout bleibt
|
||||
> additiv: lokale App-Logins bleiben als Fallback aktiv.
|
||||
> und per Login-Smoke verifiziert. Der weitere Rollout bleibt additiv: lokale
|
||||
> App-Logins bleiben als Fallback aktiv.
|
||||
|
||||
---
|
||||
|
||||
@@ -87,7 +85,7 @@ docker exec authelia authelia crypto hash generate pbkdf2 \
|
||||
| 2 | Immich | `immich.kaleschke.info` | nativ (Admin-UI/Config-File) | s. u. (Familie) | mittel | **GEPARKT bis Onboarding (Entscheidung 2026-06-06):** nur `micha` hat Authelia-Account, Familien-SSO-Nutzen entsteht erst mit Familien-Accounts; Immich ist mobil-lastig (hoechste Stoeranfaelligkeit) und braucht UI/Config-File. Erst nach Onboarding gezielt. Runbook bereit. |
|
||||
| 3 | Nextcloud | `cloud.kaleschke.info` | App `user_oidc` (+occ) | s. u. | mittel | **GEPARKT bis Onboarding (Entscheidung 2026-06-06):** wie Immich; braucht `user_oidc`-App-Install + `occ`. Lokaler Login bleibt. Erst nach Onboarding. Runbook bereit. |
|
||||
| **4 ERLEDIGT 2026-06-06** | Mealie | `mealie.kaleschke.info` | nativ | `one_factor` | niedrig | **Live + Login verifiziert.** OIDC-Env additiv (lokaler Login bleibt), Secret als Stack-ENV `${MEALIE_OIDC_CLIENT_SECRET}`, `extra_hosts` noetig (s. Gotchas) |
|
||||
| **5 TEILWEISE ERLEDIGT 2026-06-17** | Paperless-ngx | `paperless.kaleschke.info` | `django-allauth` (Umgebungsvariablen) | `one_factor` (hostseitiger Ist-Stand; `two_factor` spaeter moeglich) | mittel | **Authelia-Client + `${PAPERLESS_OIDC_SECRET}` in Stack-ENV gesetzt, Authelia-Config validiert, Paperless HTTP-Smoke `200`.** Lokaler Login bleibt Fallback; finaler Browser-Login mit Operator-Account offen. |
|
||||
| 5 | Paperless-ngx | `paperless.kaleschke.info` | `django-allauth` (Umgebungsvariablen) | `two_factor` | mittel | dokumentenlastig, Operator-nah |
|
||||
|
||||
**Nicht OIDC:** Vaultwarden hat kein Standard-Endnutzer-OIDC (SSO ist Enterprise/Bitwarden-Feature) -> bleibt eigener Login. ntfy bleibt wie gehabt.
|
||||
|
||||
@@ -177,8 +175,7 @@ GF_AUTH_GENERIC_OAUTH_ALLOW_SIGN_UP=true
|
||||
E-Mail-Claim. Stimmt die Authelia-E-Mail mit dem App-Account, wird verknuepft;
|
||||
sonst legt die App (bei aktivem Signup) einen neuen User an.
|
||||
- **Secret-Mechanik je App verschieden:** Grafana `__FILE` (Docker-Secret),
|
||||
Mealie Stack-ENV `${MEALIE_OIDC_CLIENT_SECRET}`, Paperless Stack-ENV
|
||||
`${PAPERLESS_OIDC_SECRET}`. Hash immer in der Authelia-Host-Config, Klartext nie ins Repo.
|
||||
Mealie Stack-ENV `${...}`. Hash immer in der Authelia-Host-Config, Klartext nie ins Repo.
|
||||
|
||||
## Spaetere Feinschliffe vor breitem Rollout
|
||||
|
||||
|
||||
+11
-14
@@ -1,6 +1,6 @@
|
||||
# Master To-do - KalliLab CORE
|
||||
|
||||
Typ: Status/To-do · Stand: 2026-06-18 · Status: aktiv
|
||||
Typ: Status/To-do · Stand: 2026-06-12 · Status: aktiv
|
||||
|
||||
Diese Liste ist die **einzige** Arbeitsliste fuer offene operative Punkte im
|
||||
Homelab. Detailablaeufe stehen in den verlinkten Runbooks; Entscheidungen mit
|
||||
@@ -23,21 +23,18 @@ Host-Reports (`/mnt/user/backups/restore-reports/`) und in der Git-Historie.
|
||||
| Family-Onboarding erster Termin | Operator | Checkliste ist fertig (`docs/FAMILY_ONBOARDING.md` Abschnitt "Erster Onboarding-Termin"). Personen/Geraete festlegen, Reihenfolge Vaultwarden -> Immich -> Mealie pro Person abarbeiten | `docs/FAMILY_ONBOARDING.md` |
|
||||
| Restore-Test Unraid OS Flash (Stick-Boot) | Operator | Artefakt-Validierung 2026-06-05 erledigt (`ops/maintenance/check-unraid-flash-backup.sh`). **Verbleibt:** physischer Ersatzstick-Boot-Test, wenn ein Wegwerf-Stick bereitliegt | `ops/restore-tests/unraid-flash-runbook.md` |
|
||||
| Restore-Test Tailscale | Operator | State-Validierung + Reconnect nur auf Wegwerf-Host/VM, danach Geraet in Tailscale-Admin entfernen | `ops/restore-tests/tailscale-runbook.md` |
|
||||
| Authelia OIDC fuer Apps | Operator/Codex | Live: Grafana + Mealie login-verifiziert; Paperless Secret verdrahtet und Service-Smoke am 2026-06-17 gruen, finaler Browser-Login mit Operator-Account offen. Immich + Nextcloud bewusst geparkt bis Family-Onboarding (siehe `docs/DECISIONS.md` 2026-06-06) | `docs/AUTHELIA_OIDC_PLAN.md` |
|
||||
| Authelia OIDC fuer Apps | Operator/Claude | Live: Grafana + Mealie (verifiziert), Paperless deployed (Login-Test offen). Immich + Nextcloud bewusst geparkt bis Family-Onboarding (siehe `docs/DECISIONS.md` 2026-06-06) | `docs/AUTHELIA_OIDC_PLAN.md` |
|
||||
| Glance-v2-Widgets: Tokens setzen | Operator | In Komodo Stack-ENV fuer `ops-glance` setzen: `GLANCE_KOMODO_API_KEY`/`_SECRET` (Komodo read-only API-Key), `GLANCE_GITEA_TOKEN` (read-only, scope `read:repository`), `GLANCE_PAPERLESS_TOKEN`, `GLANCE_MEALIE_TOKEN`; bis dahin zeigen die neuen Widgets Fehler/leer. Speedtest-Widget: falls weiter 0.0, API-Response pruefen | `ops/glance/config/` |
|
||||
| Home Assistant Tibber | Operator/Codex | Tibber per HA-UI-Config-Flow verbinden. Danach Energy-Dashboard um echte Kosten/Preisquelle ergaenzen; SolarEdge-PV, Netz und Speicher sind bereits konfiguriert und validiert | `docs/runbooks/smart-home-bootstrap.md`, `docs/DECISIONS.md` |
|
||||
| Nearline-Pull Dead-Man's-Switch | Operator | H:-Pull war ~2026-06-04 bis 2026-06-18 still gestoppt (Task fehlte, kein Alarm). Lauf nachgeholt + Scheduled Task `KalliLab H Drive Nearline Pull` neu registriert (State Ready). **Verbleibt:** externer Dead-Man's-Switch (Healthchecks.io-Ping am Ende von `pull-critical-backups.ps1` und `ops/borg-ui/scripts/pre-borg.sh`), da Prometheus auf Unraid den baerchen-Pull nicht sieht | `ops/h-drive-nearline/README.md` |
|
||||
| Monitoring Single-File-Bind-Mount Hardening | Operator/Claude | Prometheus am 2026-06-19 auf stabilen Directory-Mount (`./prometheus:/etc/prometheus/config:ro`) umgestellt + recreated -> Stale-Handle-Footgun dort beseitigt, Reload reicht wieder. **Verbleibt:** gleiches Einzeldatei-Muster bei alertmanager/blackbox/loki/promtail/grafana-provisioning praeventiv auf Directory-Mount umstellen | `monitoring/docker-compose.yml` |
|
||||
| Audit-PDF aus `docs/` entfernen | Operator | `docs/KalliLab_CORE_Audit_2026-06-06.pdf` (untracked) extern ablegen (H:/ oder Documents-Share) und lokal loeschen; Binaerdateien gehoeren nicht ins GitOps-Repo | Doku-Regeln `docs/REPO_MAP.md` |
|
||||
|
||||
---
|
||||
|
||||
## Operator-Entscheidung
|
||||
|
||||
**Stand 2026-06-11: keine offenen Operator-Entscheidungen.**
|
||||
Getroffene Entscheidungen mit Begruendung und Review-Trigger: `docs/DECISIONS.md`.
|
||||
|
||||
| Thema | Entscheidung noetig | Quelle |
|
||||
|---|---|---|
|
||||
| `/mnt/user/projekte` Backup-Scope | Filebrowser serviert `projekte` (und ganze `documents`/`photos`), aber nur App-Unterordner sind im Borg-Scope. Entscheiden: `projekte` als read-only Borg-UI-Mount + Quelllisten-Eintrag aufnehmen, oder bewusst als "nur lokal, nicht DR-relevant" bestaetigen | `ops/borg-ui/BACKUP_SCOPE.md` Abschnitt "User-Daten-Shares ausserhalb des App-Scope" |
|
||||
|
||||
---
|
||||
|
||||
## Geparkt
|
||||
@@ -53,11 +50,11 @@ Bewusst nicht jetzt - Begruendungen in `docs/DECISIONS.md`, hier nur Thema und T
|
||||
| CrowdSec vor Traefik | breitere Attack Surface als nur `443/tcp` | `docs/DECISIONS.md` |
|
||||
| Nextcloud 2FA (Operator-TOTP) | OIDC-/SSO-Block erreicht die App-Login-Ebene | `docs/DECISIONS.md` |
|
||||
| Hermes-Agent | Review-Deadline 2026-07-25; NAS-Stack bleibt deaktiviert | `docs/SERVICE_CATALOG.md` |
|
||||
| Tailnet-Konsole aufraeumen (Rest) | trivial, bei Gelegenheit: tote Node-Eintraege (`kallilab-core`, alter `baerchen`) in der Tailscale-Admin-Konsole entfernen; optional State-Pfad `/mnt/user/appdata/tailscale` nach `_archive/` | `docs/NETWORK_INVENTORY.md` |
|
||||
| Dedizierter SMB-User `veeam-baerchen` | nur wenn Unraid-User-/Share-Rechte bewusst angefasst werden | `ops/windows-reinstall/docs/windows-image-backup-baseline.md` |
|
||||
| Filebrowser-Mount-Scope | naechster Hardening-Sprint | `docs/SERVICE_CATALOG.md` |
|
||||
| Scrutiny Privileged-Ausnahme | nur mit klarer Begruendung aendern | `docs/SERVICE_CATALOG.md` |
|
||||
| Immich Redis named volume | passende Wartung am Immich-Stack | `docs/SERVICE_CATALOG.md` |
|
||||
| Komodo keys named volume | gemeinsames Wartungsfenster mit Operator | Live-Volume `komodo_komodo_keys` nach `/mnt/user/appdata/komodo/keys` migrieren, Compose anpassen, Periphery-Reconnect pruefen, dann in Borg-Scope aufnehmen |
|
||||
| Storage-Wachstum (zweite NVMe, zweite Array-Disk, ZFS/BTRFS) | Trigger aus Capacity-Doku | `docs/STORAGE_LAYOUT.md`, `docs/CAPACITY_AND_LIFECYCLE.md` |
|
||||
| Wiederkehrende Restore-Drills | laufend nach Kadenz, inkl. quartalsweisem Frische-Negativtest (`run-restore-checks.sh freshness-negative`) | `docs/RESTORE_MATRIX.md`, `ops/restore-tests/schedule.md` |
|
||||
| Doku-Quartals-Gaertnern (~15 min) | quartalsweise, erster Lauf mit Q3-Review ab 2026-07-01: Datiertes archivieren, Done-/Review-Logs kuerzen, tote Links pruefen | `docs/REPO_MAP.md` Doku-Regeln |
|
||||
@@ -74,11 +71,11 @@ Bewusst nicht jetzt - Begruendungen in `docs/DECISIONS.md`, hier nur Thema und T
|
||||
|
||||
## Zuletzt erledigt (Kurzlog, max. 5 Eintraege)
|
||||
|
||||
- **2026-06-17** Offene TODOs gegen Live-Stand abgeglichen: Paperless-OIDC-Secret verdrahtet und Service-Smoke gruen; alter Tailscale-Docker-State nach `_archive/tailscale-removed-2026-06-06/` verschoben; Tailnet-Restpunkt geschlossen.
|
||||
- **2026-06-17** Repo-Hygiene abgeschlossen: Glance-Widget-Tokens sind in Runtime gesetzt, Audit-PDF liegt extern unter `H:\kallilab-recovery\audits`, Worktree clean.
|
||||
- **2026-06-17** Komodo/Gitea-Webhooks normalisiert: aktive Komodo-Hooks fuer `Micha/homelab-infra` nutzen Branch-Filter `master`; DB-Backup vor Host-Hotfix erstellt. Workflow-Regel nachgezogen.
|
||||
- **2026-06-19** Backup-Hardening live verifiziert: Borg-Scope-Drift 0 (alle 33 Quellen konfiguriert), Dumps frisch (11/11 present), neue Dump-Alerts aktiv (25 Regeln, 0 feuern). Prometheus-`alerts.yml`-Stale-Handle (FUSE-Einzeldatei-Mount) per `--force-recreate` behoben und anschliessend dauerhaft auf Directory-Mount umgestellt (recreated, 25 Regeln aktiv).
|
||||
- **2026-06-18** Backup-Audit-Hardening: Dump-Frische-Metriken + Alerts `HomelabBorgDumpMissing/Stale`, Freshness-Checks + Nearline-Pull um `n8n`/`globals` ergaenzt, 4 Tier-2-Container in Critical-Watch, Scope-Doku fuer `projekte`/Hermes praezisiert. H:-Nearline (still seit 2026-06-04) nachgeholt + Task neu registriert.
|
||||
- **2026-06-13** Home Assistant MQTT-Integration produktiv verbunden: Config-Entry `smarthome-mosquitto` ist `loaded`, Mosquitto sieht den HA-Client `homeassistant`; `check_config` gruen.
|
||||
- **2026-06-13** HA Energy Dashboard konfiguriert: Netz, PV und Speicher aus SolarEdge Local gesetzt, `energy/validate` ohne Issues; HA-Backup danach erzeugt.
|
||||
- **2026-06-13** SolarEdge lokal angebunden: `solaredge_modbus_multi` v3.2.5 ueber `192.168.178.111:1502`, Device-ID `1`; 68 Entitaeten inkl. Inverter, Smart Meter und Batterie; HA-Backup danach erzeugt.
|
||||
- **2026-06-13** Home Assistant Restore-Probe erfolgreich: isolierter Test aus HA-native Backup + Mosquitto-Appdata + Fachrepo-Clone, HA HTTP/API/check_config gruen, MQTT Publish/Subscribe und retained Topic nach Broker-Restart gruen. Report: `/mnt/user/backups/restore-reports/homeassistant-2026-06-13.md`.
|
||||
- **2026-06-13** Home Assistant Foundation live: `smart-home` in Komodo angelegt, Gitea-Webhook aktiv, Authelia-Onboarding-Guard entfernt, HA-native Auth + Login-Ban aktiv, HA-Backup erzeugt/geprueft und MQTT-Broker-Smoke erfolgreich.
|
||||
|
||||
---
|
||||
|
||||
|
||||
+18
-17
@@ -1,7 +1,7 @@
|
||||
# Network Inventory - KalliLab CORE
|
||||
|
||||
Status: Host-Audit erfasst; Router-Baseline und Portfreigaben-UI bereinigt; FRITZ!Box-Remote-Dienste aus; IPv6-Exposure technisch und per UI entschaerft; Tailscale-Inventar am 2026-06-17 real gemessen.
|
||||
Letzte Pruefung: 2026-06-17 (Tailscale-Inventar), 2026-06-01 (Router/Ports)
|
||||
Status: Host-Audit erfasst; Router-Baseline und Portfreigaben-UI bereinigt; FRITZ!Box-Remote-Dienste aus; IPv6-Exposure technisch und per UI entschaerft; Tailscale-Inventar am 2026-06-05 real gemessen.
|
||||
Letzte Pruefung: 2026-06-05 (Tailscale-Inventar), 2026-06-01 (Router/Ports)
|
||||
|
||||
## Zweck
|
||||
|
||||
@@ -38,7 +38,7 @@ Dieses Dokument beschreibt Router, DNS, Tailscale, Portfreigaben und Netztrennun
|
||||
| Komponente | Rolle | Adresse | Bemerkung |
|
||||
|---|---|---|---|
|
||||
| AdGuard Home | LAN DNS / Filter | Host `192.168.178.58`, Docker `172.23.0.3` | DNS auf Port 53; Admin soll nur via Tailscale-IP `100.80.98.33:8082` erreichbar sein |
|
||||
| Unbound | DNSSEC-validierender Forwarding-Resolver | Docker `dns_net` | Upstream fuer AdGuard; forwardet per DoT zu Cloudflare, keine Root-Rekursion |
|
||||
| Unbound | Rekursiver Resolver | Docker `dns_net` | Upstream fuer AdGuard |
|
||||
| Cloudflare | Authoritative DNS | extern | DNS-Challenge fuer TLS |
|
||||
| Router | DHCP DNS-Verteilung | TBD | Muss auf AdGuard zeigen, falls so betrieben |
|
||||
|
||||
@@ -57,16 +57,18 @@ Gemessen am 2026-06-05 per read-only SSH auf den Host (`tailscale status`,
|
||||
| Subnet Router | **Ja, aktiv.** Host advertised und ist Primary fuer `192.168.178.0/24` (`Self.PrimaryRoutes: ["192.168.178.0/24"]`, ebenfalls in `AllowedIPs`). Das LAN ist also fuer das gesamte Tailnet ueber diesen Subnet-Router erreichbar — bewusst gemessener Ist-Zustand, **kein** "keine Route" wie zuvor vermutet. |
|
||||
| ACL-Policy extern dokumentiert | **Angewendet 2026-06-06** — restriktive Tag-basierte `grants`-Policy live (`tag:server`/`tag:operator`, `tag:family` schlafend). Default-Allow entfernt, verifiziert. Details im Block unten. |
|
||||
|
||||
### Tailnet-Geraete (Snapshot 2026-06-17)
|
||||
### Tailnet-Geraete (Snapshot 2026-06-05)
|
||||
|
||||
| Tailscale-IP | Node | OS | Status |
|
||||
|---|---|---|---|
|
||||
| `100.80.98.33` | kallilabcore | linux | aktiv (Host, Subnet-Router) |
|
||||
| `100.78.133.37` | baerchen-1 | windows | aktiv (aktuelle Operator-Workstation, direct) |
|
||||
| `100.73.83.55` | iphone-14 | iOS | bekannt, aktuell offline |
|
||||
| `100.105.203.21` | baerchen | windows | offline, zuletzt vor ~1 Tag gesehen (Alt-Node) |
|
||||
| `100.73.83.55` | iphone-14 | iOS | bekannt |
|
||||
| `100.112.0.90` | kallilab-core | linux | **am 2026-06-06 entfernt.** War der redundante userspace-only `Tailscale-Docker`-Stack (`host-services/tailscale/`). Komodo-Stack gestoppt+destroyed, Repo-Pfad per `git rm` entfernt, Container weg (read-only verifiziert). Node-Eintrag in der Admin-Konsole noch zu entfernen. |
|
||||
|
||||
> **Historischer Befund 2026-06-06 (read-only auf dem Host ermittelt):** Der Host
|
||||
> hatte damals **zwei** `tailscaled`-Prozesse:
|
||||
> **Befund 2026-06-06 (read-only auf dem Host ermittelt):** Der Host hat **zwei**
|
||||
> `tailscaled`-Prozesse:
|
||||
>
|
||||
> 1. **Native Unraid-Plugin** = `kallilabcore` (100.80.98.33). Prozess
|
||||
> `/usr/local/sbin/tailscaled -statedir /boot/config/plugins/tailscale/state
|
||||
@@ -87,10 +89,9 @@ Gemessen am 2026-06-05 per read-only SSH auf den Host (`tailscale status`,
|
||||
> (Operator), `git rm host-services/tailscale/`, Glance-Widget entfernt, und
|
||||
> Architektur-/Service-Catalog-/DR-/CLAUDE-Doku auf "natives Plugin" nachgezogen.
|
||||
> Read-only verifiziert: Container weg, nur noch der native `tailscaled` mit
|
||||
> `tailscale1`, Subnet-Route + Operator-Zugriff intakt. Nachpruefung 2026-06-17:
|
||||
> `tailscale status --self=false` zeigt nur noch `baerchen-1` und `iphone-14`;
|
||||
> der alte State-Pfad `/mnt/user/appdata/tailscale` ist weg und liegt archiviert
|
||||
> unter `/mnt/user/appdata/_archive/tailscale-removed-2026-06-06/`.
|
||||
> `tailscale1`, Subnet-Route + Operator-Zugriff intakt. Offen: Node-Eintraege
|
||||
> `kallilab-core` und alter `baerchen` in der Admin-Konsole entfernen; State-Pfad
|
||||
> `/mnt/user/appdata/tailscale` bei Gelegenheit nach `_archive/` (kein Sofort-Loeschen).
|
||||
>
|
||||
> **Doku-Korrektur erledigt:** `docs/RESTORE_MATRIX.md` zeigt jetzt auf den
|
||||
> funktionalen State `/boot/config/plugins/tailscale/state` (im Flash-Backup)
|
||||
@@ -154,8 +155,8 @@ erhalten.
|
||||
```
|
||||
|
||||
**Geraete-Tags (live):** `kallilabcore` = `tag:server`; `baerchen-1` + `iphone-14`
|
||||
= `tag:operator`. Alte Nodes `kallilab-core` und `baerchen` sind nicht mehr im
|
||||
aktuellen Tailnet-Status sichtbar.
|
||||
= `tag:operator`; `kallilab-core` (Docker) + alter `baerchen` bewusst untagged ->
|
||||
isoliert.
|
||||
|
||||
**Rollout-Protokoll 2026-06-06 (lockout-sicher, je Schritt read-only verifiziert):**
|
||||
|
||||
@@ -192,10 +193,10 @@ ist die vollstaendige Wahrheit.
|
||||
- Familien-Dienste/Ports konkretisieren — erst wenn ein reales Familiengeraet dazukommt.
|
||||
- **Zwei-Tailscale-Konsolidierung: ERLEDIGT 2026-06-06** — redundanter Docker-Stack
|
||||
abgebaut, nur noch die native Plugin-Instanz `kallilabcore` (Subnet-Router) aktiv.
|
||||
- **Tailnet-Konsole/Altstate aufraeumen: ERLEDIGT 2026-06-17** — Node-Eintraege
|
||||
`kallilab-core` und alter Offline-`baerchen` sind im aktuellen Tailnet-Status
|
||||
nicht mehr sichtbar; State-Pfad `/mnt/user/appdata/tailscale` vom entfernten
|
||||
Docker-Stack liegt unter `_archive/tailscale-removed-2026-06-06/`.
|
||||
- **Tailnet-Konsole aufraeumen: ERLEDIGT 2026-06-06** — Node-Eintraege `kallilab-core`
|
||||
und alter Offline-`baerchen` aus der Admin-Konsole entfernt.
|
||||
- State-Pfad `/mnt/user/appdata/tailscale` (vom entfernten Docker-Stack) bei
|
||||
Gelegenheit nach `_archive/tailscale-removed-2026-06-06/` (kein Sofort-Loeschen).
|
||||
- Optionaler Off-LAN-Routentest: von einem Operator-Geraet im Mobilfunk
|
||||
(nicht im Heim-LAN) ein LAN-Ziel ueber `192.168.178.0/24` erreichen, um die
|
||||
Subnet-Route end-to-end zu bestaetigen (im Heim-LAN nicht sauber isolierbar).
|
||||
|
||||
@@ -29,7 +29,7 @@ Sie ist die fachliche Ergaenzung zu `docs/DISASTER_RECOVERY.md`.
|
||||
| Unraid OS Flash | Borg-Artefakt + optional Unraid Connect | `/boot/config` aus `unraid-flash-config.tar.gz` | `unraid-flash-config.tar.gz`, `.sha256`, Manifest | enthaelt sensible Host-Konfiguration, wie Secret-Material behandeln | Unraid USB Flash Creator / neuer Boot-Stick | Unraid bootet, Array-Zuordnung und Shares sind sichtbar |
|
||||
| Traefik | Share / Borg | `/mnt/user/appdata/traefik`, besonders `dynamic/`, `letsencrypt`, `secrets` | keine eigene DB | `cloudflare_dns_api_token` | `frontend_net`, `backend_net` | `https://traefik.kaleschke.info` erreichbar, Dashboard ueber Authelia |
|
||||
| AdGuard Home | Share / Borg | `/mnt/user/appdata/adguard/conf` | keine | keine zusaetzlichen Repo-Secrets dokumentiert | `dns_net`, `frontend_net` | DNS-Aufloesung funktioniert; Restore-Smoke am 2026-06-06 erfolgreich |
|
||||
| Tailscale | Flash-Backup (funktional) | **Funktional: `/boot/config/plugins/tailscale/state`** (native Unraid-Plugin-Instanz `kallilabcore`, Subnet-Router, im Flash-Backup gesichert). Der frueher genannte Pfad `/mnt/user/appdata/tailscale` gehoerte zum entfernten userspace-only Docker-Stack `kallilab-core` und ist seit 2026-06-17 nach `/mnt/user/appdata/_archive/tailscale-removed-2026-06-06/` verschoben; nicht mehr als aktive Restore-Quelle behandeln | keine | Tailscale-State im Flash-Backup; Archivpfad nur fuer Altanalyse | Host-Netz | Tailscale verbunden, Subnet-Route `192.168.178.0/24` aktiv |
|
||||
| Tailscale | Flash-Backup (funktional) / Share | **Funktional: `/boot/config/plugins/tailscale/state`** (native Unraid-Plugin-Instanz `kallilabcore`, Subnet-Router, im Flash-Backup gesichert). Der frueher hier genannte Pfad `/mnt/user/appdata/tailscale` gehoert zum **userspace-only Docker-Stack** `kallilab-core` (redundant, Abbau geplant — siehe `docs/NETWORK_INVENTORY.md`) | keine | Tailscale-State im jeweiligen State-Pfad | Host-Netz | Tailscale verbunden, Subnet-Route `192.168.178.0/24` aktiv |
|
||||
| PostgreSQL 18 | Share + Dumps | `/mnt/user/appdata/postgresql18` (archivierter Rollback-Altstand: `/mnt/user/appdata/_archive/pg18-immich-rollback-volumes-20260602/postgresql17`) | `postgresql17-globals.sql`, `postgresql17-mailarchiver.dump`, `postgresql17-paperless.dump`, optional `postgresql17-authelia.dump` | `postgres_password.txt`, App-Rollen-Passwoerter aus den jeweiligen Stack-ENV/Secret-Dateien | `backend_net` | DB startet, Ziel-Datenbanken vorhanden; `SHOW data_checksums` ist `on` |
|
||||
| Redis 8 | Share / Host | `/mnt/user/appdata/redis`; Rollback-Backup unter `/mnt/user/backups/borg/dumps/latest/shared-redis-pre-redis8-<ts>` | RDB/AOF-Dateien im Datenpfad | `redis_password.txt` | `backend_net` | Redis startet, `redis_version` ist 8.x, Apps verbinden sich; Restore-Smoke am 2026-06-06 erfolgreich |
|
||||
| Authelia | Borg | `/mnt/user/appdata/authelia/config`, `/mnt/user/appdata/secrets/*authelia*` | Shared PostgreSQL 18, optional Dump `postgresql17-authelia.dump` | JWT/Session/Storage/Postgres-/SMTP-Secret-Dateien | PostgreSQL 18, Traefik, GMX SMTP | Login-Seite und ForwardAuth funktionieren; SMTP-Notifier startet; aktive Sessions werden nach Restart neu aufgebaut; Restore-Smoke am 2026-06-03 erfolgreich: Config aus Borg, minimale Test-Config, frisches Test-Postgres, HTTP `/api/health` 200, Report `/mnt/user/backups/restore-reports/authelia-2026-06-03.md` |
|
||||
@@ -52,7 +52,7 @@ Sie ist die fachliche Ergaenzung zu `docs/DISASTER_RECOVERY.md`.
|
||||
|
||||
| Dienst | Fuehrende Quelle | Datei-Restore | Dump / DB | Secrets / ENV | Abhaengigkeiten | Smoke-Test |
|
||||
|---|---|---|---|---|---|---|
|
||||
| Paperless-ngx | Borg + Dumps | `/mnt/user/appdata/paperless-ngx/data`, `/mnt/user/documents/paperless`, `/mnt/user/documents/paperless/export`, `/mnt/user/documents/scans_inbox` | `postgresql17-paperless.dump` | `PAPERLESS_DBPASS`, `PAPERLESS_REDIS`, `PAPERLESS_OIDC_SECRET`, `borg_repo_passphrase.txt` fuer Restore-Tests | PostgreSQL 18, Redis, Traefik, Authelia OIDC | Web-UI startet, Dokumente vorhanden; Restore-Test am 2026-05-31 erfolgreich: Borg-Archiv `Tägliche-Sicherung-2026-05-31T04:30:13.181`, isolierter PostgreSQL-18-/Redis-8-Testpfad, HTTP `200`, `32` Dokumente im Test-DB-Check, Report `/mnt/user/backups/restore-reports/paperless-2026-05-31.md`; OIDC-Secret am 2026-06-17 verdrahtet, lokaler Login bleibt Fallback |
|
||||
| Paperless-ngx | Borg + Dumps | `/mnt/user/appdata/paperless-ngx/data`, `/mnt/user/documents/paperless`, `/mnt/user/documents/paperless/export`, `/mnt/user/documents/scans_inbox` | `postgresql17-paperless.dump` | `PAPERLESS_DBPASS`, `PAPERLESS_REDIS`, `borg_repo_passphrase.txt` fuer Restore-Tests | PostgreSQL 18, Redis, Traefik | Web-UI startet, Dokumente vorhanden; Restore-Test am 2026-05-31 erfolgreich: Borg-Archiv `Tägliche-Sicherung-2026-05-31T04:30:13.181`, isolierter PostgreSQL-18-/Redis-8-Testpfad, HTTP `200`, `32` Dokumente im Test-DB-Check, Report `/mnt/user/backups/restore-reports/paperless-2026-05-31.md` |
|
||||
| Mealie | Borg + Dump | `/mnt/user/appdata/mealie/data`, `/mnt/user/appdata/mealie/postgres18` (archivierter Rollback-Altstand: `/mnt/user/appdata/_archive/pg18-immich-rollback-volumes-20260602/mealie-postgres17`) | `mealie.dump` | `mealie_postgres_password.txt` | `mealie-postgres`, Traefik | UI startet, Rezepte vorhanden |
|
||||
| Immich | Borg + Dump | `/mnt/user/photos/immich`, `/mnt/user/photos/family_archive`, `/mnt/user/appdata/immich_postgres_vectorchord`; archivierter Rollback-Altstand: `/mnt/user/appdata/_archive/pg18-immich-rollback-volumes-20260602/immich-postgres-pgvecto-rs` | `immich.dump`; nach VectorChord braucht ein Restore ein Postgres-Image mit VectorChord | `IMMICH_DB_PASSWORD`, `immich_postgres_password.txt`, `borg_repo_passphrase.txt` fuer Restore-Tests | `immich_postgres`, `immich_redis`, Traefik | DB- und UI-Smoke gegen produktives Borg-Archiv am 2026-05-27 erfolgreich validiert; VectorChord-Migration am 2026-05-31: `11977` Assets, `11107` Smart-Search-Zeilen, `7092` Face-Search-Zeilen, `vchord 0.4.3`, `vector 0.8.1`, HTTP/API-Smoke 200. Voll-Restore der Foto-Dateien bleibt separater DR-Drill |
|
||||
| Mail-Archiver | Borg + Shared Dump | `/mnt/user/appdata/mailarchiver/data-protection-keys` | `postgresql17-mailarchiver.dump` | `MAILARCHIVER_DB_CONNECTION`, `MAILARCHIVER_AUTH_PASSWORD` | PostgreSQL 18, Traefik, Authelia | Authelia-Weiterleitung greift; nach Login startet die Web-UI und das Archiv laesst sich oeffnen |
|
||||
@@ -60,7 +60,6 @@ Sie ist die fachliche Ergaenzung zu `docs/DISASTER_RECOVERY.md`.
|
||||
| Glance | Git / Borg-Repo | Repo-Konfiguration unter `ops/glance/config/glance.yml`; keine kritische Datenpersistenz | keine | `GLANCE_IMMICH_API_KEY`, `GLANCE_ADGUARD_USERNAME`, `GLANCE_ADGUARD_PASSWORD`, `GLANCE_SPEEDTEST_API_KEY` | Traefik, Authelia, optional interne API-Ziele | Dashboard startet, Widgets laden, Docker-Status laeuft nur ueber `glance-docker-socket-proxy` |
|
||||
| ntfy | Borg / Share | `/mnt/user/appdata/ntfy` | keine | keine besonderen Secret-Dateien dokumentiert | Traefik | UI und Push-Endpunkt erreichbar |
|
||||
| Paperless-GPT | Borg / Share | `/mnt/user/appdata/paperless-gpt` | keine eigene DB | `PAPERLESS_API_TOKEN`, `OPENAI_API_KEY` | Traefik, Paperless, OpenAI API | UI startet, Konfiguration vorhanden; LLM-Provider zeigt `openai` / `gpt-5.4-mini` |
|
||||
| n8n | Borg + Dump | `/mnt/user/appdata/n8n/data` | `n8n.sqlite.dump`; Credentials sind nur mit dem passenden `N8N_ENCRYPTION_KEY` entschluesselbar | `N8N_ENCRYPTION_KEY`, GMX/OpenAI/Gitea-Credentials in n8n | Traefik, GMX IMAP, OpenAI API, Gitea API | UI startet, Owner-Login funktioniert, kritischer Mail->LLM->Gitea-Workflow ist vorhanden und deaktiviert/aktiv wie vor Restore |
|
||||
| Home Assistant | Borg + HA-native Backups + Fachrepo | `/mnt/user/appdata/homeassistant` inkl. `.storage`, `secrets.yaml`, `trusted_proxies.yaml`, `custom_components` (HACS, `solaredge_modbus_multi`); Fach-YAML aus `/mnt/user/services/smart-home-kalli/home-assistant` | HA-native Backup-Artefakte unter `/mnt/user/appdata/homeassistant/backups`; erstes Artefakt 2026-06-13 erzeugt und tar-lesbar (`backup.json`, `homeassistant.tar.gz`); Backup nach SolarEdge-Integration: `Custom_backup_2026.6.1_2026-06-13_14.59_48645373.tar`; Backup nach Energy-Dashboard-Konfiguration: `Custom_backup_2026.6.1_2026-06-13_15.59_25670583.tar`; keine externe DB in Phase 1 | HA-Secrets in `secrets.yaml`, Integrations-Tokens in `.storage`, MQTT-Credentials, Agent-API-Tokens als Host-Secrets `ha_token_codex`/`ha_token_claude` (nur mit erhaltenem `.storage`-Auth-State nutzbar), spaeter Tibber/InfluxDB-Tokens | Traefik, `frontend_net`, `smarthome_net`, Mosquitto, Fachrepo-Clone, SolarEdge-Wechselrichter `192.168.178.111:1502` | Restore-Test am 2026-06-13 erfolgreich: HA-native Backup + Mosquitto-Appdata + Fachrepo-Clone isoliert gestartet, HA HTTP/API/check_config gruen; produktiv danach HA-MQTT-Config-Entry `smarthome-mosquitto` geladen, SolarEdge Local `solaredge_modbus_multi` loaded mit 68 Entitaeten und Energy Dashboard fuer Netz/PV/Speicher per `energy/validate` ohne Issues; Report `/mnt/user/backups/restore-reports/homeassistant-2026-06-13.md` |
|
||||
| Smart-Home MQTT / Mosquitto | Borg / Share | `/mnt/user/appdata/mosquitto/config`, `/mnt/user/appdata/mosquitto/data`, `/mnt/user/appdata/mosquitto/log` | Mosquitto persistiert retained messages/subscriptions dateibasiert | `passwordfile`, `aclfile`, spaeter per-Device-User | `smarthome_net`, Home Assistant, spaeter ESPHome/Zigbee2MQTT | Restore-Test am 2026-06-13 erfolgreich: authentifizierter Publish/Subscribe-Smoke mit `homeassistant`-User und retained Topic nach Broker-Restart gruen; produktiv verbindet sich HA als User `homeassistant` |
|
||||
| Smart-Home Fachrepo | Gitea + Borg-Repo-Clone | `/mnt/user/services/smart-home-kalli` | keine | keine echten Secrets im Repo; `secrets-template/` nur Beispiele | Gitea, Home Assistant Mounts | `git status` sauber, HA liest `configuration.yaml` und `packages/` aus dem Clone |
|
||||
@@ -105,7 +104,6 @@ Aktuell relevante Dump-Artefakte unter `/mnt/user/backups/borg/dumps/latest`:
|
||||
- `filebrowser.bolt.dump`
|
||||
- `borg-ui.sqlite`
|
||||
- `grafana.sqlite`
|
||||
- `n8n.sqlite.dump`
|
||||
- `unraid-flash-config.tar.gz` plus `unraid-flash-config.tar.gz.sha256` und Manifest
|
||||
- Monitoring-Stack: keine verpflichtenden Dump-Artefakte; Prometheus/Loki/Grafana named volumes sind Diagnose-/Dashboard-Zustand, keine primaere Restore-Quelle.
|
||||
- `komodo-mongo.archive.gz` (noch gesondert verifizieren)
|
||||
|
||||
+2
-3
@@ -25,7 +25,6 @@ Dieses Dokument listet sensible Daten, deren Ablageorte und die vorgesehene Einb
|
||||
| mealie-postgres | DB Password | `/mnt/user/appdata/secrets/mealie_postgres_password.txt` -> `POSTGRES_PASSWORD_FILE` | aktiv |
|
||||
| Paperless-ngx | DB Password | Stack ENV `${PAPERLESS_DBPASS}` | aktiv |
|
||||
| Paperless-ngx | Redis URL | Stack ENV `${PAPERLESS_REDIS}` | aktiv |
|
||||
| Paperless OIDC (Authelia) | Client Secret | Stack ENV `${PAPERLESS_OIDC_SECRET}` in `/mnt/user/services/stacks/paperless/apps/paperless/.env` (Komodo-Stack-ENV); pbkdf2-Hash im Authelia-Host-Config-Client `paperless` (kein Wert im Repo) | aktiv (2026-06-17) |
|
||||
| Paperless-GPT | OpenAI API Key | Stack ENV `${OPENAI_API_KEY}`; nicht im Repo, nicht in Logs | aktiv |
|
||||
| code-server | Passwort | `/mnt/user/appdata/code-server/secrets/password` -> `FILE__PASSWORD` | aktiv |
|
||||
| Filebrowser | Admin Password | `/mnt/user/appdata/secrets/filebrowser_admin_password.txt` -> initialisierte SQLite-DB | aktiv |
|
||||
@@ -117,7 +116,7 @@ Weitere dokumentierte Secret-Pfade:
|
||||
- Borg UI verwaltet Session-Secret, Admin-Login, SSH-Keys und Repo-Credentials in seiner persistenten `/data`-Struktur. Diese Daten liegen nicht im Git, muessen aber gesichert werden.
|
||||
- Die Borg-Repo-Passphrase liegt zusaetzlich als Host-Secret-Datei fuer Restore-Tests und Notfallzugriff vor. Der Wert ist laut Operator-Bestaetigung vom 2026-05-26 offline gesichert; Ablageort und Wert werden nicht im Repo dokumentiert.
|
||||
- Gitea verwaltet den GitHub-Push-Mirror-PAT in den Repository-Mirror-Settings. Der Wert wird nicht dokumentiert und nicht in Dateien unter `docs/` oder `core/gitea/` geschrieben.
|
||||
- `paperless-ngx` ist eine bewusste Ausnahme: DB-Passwort, Redis-URL und OIDC-Client-Secret bleiben aktuell als Komodo Stack Environment Variables hinterlegt, um den stabil laufenden Produktionsstand nicht fuer eine reine Secret-Mechanik-Migration zu riskieren.
|
||||
- `paperless-ngx` ist eine bewusste Ausnahme: DB-Passwort und Redis-URL bleiben aktuell als Komodo Stack Environment Variables hinterlegt, um den stabil laufenden Produktionsstand nicht fuer eine reine Secret-Mechanik-Migration zu riskieren.
|
||||
- `baerchen` nutzt fuer das Veeam-Backup aktuell den bestehenden SMB-User
|
||||
`micha`. Ein dedizierter SMB-User `veeam-baerchen` ist nur ein spaeteres
|
||||
Hardening-Ziel, solange keine Unraid-User-/Share-Aenderungen gewuenscht sind.
|
||||
@@ -140,7 +139,7 @@ Einige Secrets liegen bewusst nur als Komodo Stack Environment Variables vor, we
|
||||
|
||||
| Stack | Stack-ENV-Variablen | Restore-Quelle (Reihenfolge) | Folgen bei Verlust aller Quellen |
|
||||
|---|---|---|---|
|
||||
| `paperless-ngx` | `PAPERLESS_DBPASS`, `PAPERLESS_REDIS`, `PAPERLESS_OIDC_SECRET` | Komodo-Mongo-Dump -> Vaultwarden -> externe Notiz | App-DB ist im Postgres-Cluster, Passwort muss in Postgres und Stack-ENV synchron neu gesetzt werden; Redis-URL ist deterministisch rekonstruierbar (Host, Port, Passwort), wenn Redis-Passwort-Datei vorliegt; OIDC-Client-Secret kann mit passendem Authelia-Client neu rotiert werden |
|
||||
| `paperless-ngx` | `PAPERLESS_DBPASS`, `PAPERLESS_REDIS` | Komodo-Mongo-Dump -> Vaultwarden -> externe Notiz | App-DB ist im Postgres-Cluster, Passwort muss in Postgres und Stack-ENV synchron neu gesetzt werden; Redis-URL ist deterministisch rekonstruierbar (Host, Port, Passwort), wenn Redis-Passwort-Datei vorliegt |
|
||||
| `paperless-gpt` | `PAPERLESS_API_TOKEN`, `OPENAI_API_KEY` | Komodo-Mongo-Dump -> Vaultwarden -> externe Notiz | Paperless-Token kann in Paperless neu erzeugt werden; OpenAI-Key muss im OpenAI-Projekt rotiert/neu erstellt werden |
|
||||
| `immich-server` | `IMMICH_DB_PASSWORD` | Komodo-Mongo-Dump -> Vaultwarden -> externe Notiz | analog Paperless: Postgres-User-Passwort in `immich_postgres` und Stack-ENV gemeinsam zuruecksetzen |
|
||||
| `mail-archiver` | `MAILARCHIVER_DB_CONNECTION`, `MAILARCHIVER_AUTH_PASSWORD` | Komodo-Mongo-Dump -> Vaultwarden -> externe Notiz | DB-Connection-String enthaelt Postgres-Pass; App-Auth-Password fuer Web-UI |
|
||||
|
||||
@@ -35,7 +35,7 @@ Secret-Werte sind nicht enthalten. Es werden nur Secret-Namen, Env-Key-Namen und
|
||||
|
||||
| Service | Zweck | Autoritativer Pfad | URL / Zugang | Abhaengigkeiten | Datenpfade | Backup / Restore | Traefik | Besonderheiten / TODOs |
|
||||
|---|---|---|---|---|---|---|---|---|
|
||||
| `paperless-ngx` | Dokumentenmanagement | `apps/paperless/docker-compose.yml` | `https://paperless.kaleschke.info` | PostgreSQL 18, Redis 8, Traefik, Authelia OIDC | `/mnt/user/appdata/paperless-ngx/data`, `/mnt/user/documents/paperless`, `/mnt/user/documents/scans_inbox` | Tier 2, Borg + `postgresql17-paperless.dump` | ja + Authelia | DB/Redis/OIDC Secrets bleiben bewusst Stack ENV; OIDC ist additiv via Authelia konfiguriert, lokaler Login bleibt Fallback; Dump-Dateiname behaelt den historischen Cluster-Namen |
|
||||
| `paperless-ngx` | Dokumentenmanagement | `apps/paperless/docker-compose.yml` | `https://paperless.kaleschke.info` | PostgreSQL 18, Redis 8, Traefik | `/mnt/user/appdata/paperless-ngx/data`, `/mnt/user/documents/paperless`, `/mnt/user/documents/scans_inbox` | Tier 2, Borg + `postgresql17-paperless.dump` | ja | DB/Redis Secrets bleiben bewusst Stack ENV; Dump-Dateiname behaelt den historischen Cluster-Namen |
|
||||
| `paperless-gpt` | KI-Ergaenzung fuer Paperless | `apps/paperless-gpt/docker-compose.yml` | `https://paperless-gpt.kaleschke.info` | Paperless API, OpenAI API, Traefik | `/mnt/user/appdata/paperless-gpt/data`, `/mnt/user/appdata/paperless-gpt/prompts` | Tier 2 | ja + Authelia | `PAPERLESS_API_TOKEN` und `OPENAI_API_KEY` als Stack ENV; LLM und Vision-OCR laufen ueber `gpt-5.4-mini`, kein Zugriff mehr auf lokale Ollama-VM. **Behalten-Entscheidung 2026-05-28:** Container bleibt aktiv, auch wenn aktuell keine Traefik-Zugriffe in der Woche; Ablouseplanung erst mit Paperless-NGX 3.0 (eigene KI-Features erwartet) - dann neu bewerten. |
|
||||
| `immich_server` | Foto-/Video-App | `apps/immich/docker-compose.yml` | `https://immich.kaleschke.info` | Immich Postgres, Immich Redis, ML, Traefik | `/mnt/user/photos/immich`, `/mnt/user/photos/family_archive` | Tier 2, Borg + `immich.dump` | ja | native App-Auth; externes Fotoarchiv gemountet |
|
||||
| `immich_postgres` | Immich-Datenbank | `apps/immich/docker-compose.yml` | intern | `immich_default` | `/mnt/user/appdata/immich_postgres_vectorchord`, archivierter Rollback-Altstand `/mnt/user/appdata/_archive/pg18-immich-rollback-volumes-20260602/immich-postgres-pgvecto-rs`, `immich_postgres_password.txt` | Dump `immich.dump`; Restore braucht ein Image mit VectorChord/pgvector | nein | PG14 bleibt bewusst; Immich-DB-Image `ghcr.io/immich-app/postgres:14-vectorchord0.4.3-pgvectors0.2.0`; nie ins `frontend_net` |
|
||||
|
||||
+3
-9
@@ -124,20 +124,14 @@ Pflichtschritte beim Anlegen:
|
||||
1. Stack in Komodo aus Gitea anlegen
|
||||
2. `webhook_enabled` in Komodo aktivieren
|
||||
3. passenden Gitea-Webhook fuer die aktuelle Stack-ID anlegen
|
||||
4. Branch-Filter im Gitea-Hook auf den produktiven Branch setzen, aktuell `master`
|
||||
5. Gitea-Hook gegen `http://komodo-core:9120/listener/github/stack/<stack-id>/deploy` pruefen
|
||||
6. einen Push oder Test-Delivery ausloesen und `last_status`/Komodo-Deploy pruefen
|
||||
7. Ausnahmen explizit dokumentieren
|
||||
4. Gitea-Hook gegen `http://komodo-core:9120/listener/github/stack/<stack-id>/deploy` pruefen
|
||||
5. einen Push oder Test-Delivery ausloesen und `last_status`/Komodo-Deploy pruefen
|
||||
6. Ausnahmen explizit dokumentieren
|
||||
|
||||
**Regel:** Kein neuer produktiver GitOps-Stack ohne funktionierenden Gitea->Komodo-Webhook. Bewusste Ausnahmen muessen im selben Aenderungsblock dokumentiert werden, inklusive Grund und Alternativ-Deploy-Weg.
|
||||
|
||||
Der Standardfall nutzt den globalen `KOMODO_WEBHOOK_SECRET` aus der Komodo-Host-`.env`, ausser Komodo zeigt fuer den Stack explizit ein eigenes per-Stack-Secret.
|
||||
|
||||
Der Gitea-Branch-Filter darf nicht leer oder `*` bleiben, solange der Komodo-Stack
|
||||
einen konkreten Repo-Branch erwartet. Sonst triggern Feature-/Arbeitsbranches alle
|
||||
Stack-Listener, Komodo verwirft sie mit `request branch does not match expected`
|
||||
und der Operations-Report bekommt unnuetzes Komodo-/Traefik-Rauschen.
|
||||
|
||||
### Ausnahme: Komodo-Zugangsmodell
|
||||
|
||||
Komodo bleibt **bewusst** ohne zentrale Traefik-ForwardAuth-Middleware.
|
||||
|
||||
@@ -4,16 +4,13 @@ services:
|
||||
container_name: monitoring-prometheus
|
||||
restart: unless-stopped
|
||||
command:
|
||||
- --config.file=/etc/prometheus/config/prometheus.yml
|
||||
- --config.file=/etc/prometheus/prometheus.yml
|
||||
- --storage.tsdb.path=/prometheus
|
||||
- --storage.tsdb.retention.time=30d
|
||||
- --web.enable-lifecycle
|
||||
volumes:
|
||||
# Verzeichnis-Mount statt Einzeldatei: auf dem Unraid-FUSE-Share (/mnt/user)
|
||||
# bricht ein Einzeldatei-Bind-Mount bei git/Komodo-Updates zu
|
||||
# "Stale NFS file handle" (Inode-Wechsel) -> Reload laedt 0 Regeln, nur
|
||||
# --force-recreate heilt. Directory-Inode ist stabil, Reload reicht wieder.
|
||||
- ./prometheus:/etc/prometheus/config:ro
|
||||
- ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
|
||||
- ./prometheus/alerts.yml:/etc/prometheus/alerts.yml:ro
|
||||
- prometheus_data:/prometheus
|
||||
networks:
|
||||
- monitoring_net
|
||||
@@ -28,7 +25,7 @@ services:
|
||||
- cadvisor
|
||||
|
||||
alertmanager:
|
||||
image: prom/alertmanager:v0.33.0@sha256:af26fbe4dd1886ac0efd7bd55cd9027da262e105b137a376522b7c14c3626e4a
|
||||
image: prom/alertmanager:v0.32.2@sha256:b85533a2eb45865835315810315f6951331b2dbc8c93a6cf9a51e156a006a706
|
||||
container_name: monitoring-alertmanager
|
||||
restart: unless-stopped
|
||||
command:
|
||||
@@ -45,7 +42,7 @@ services:
|
||||
- no-new-privileges:true
|
||||
|
||||
alertmanager-ntfy-bridge:
|
||||
image: python:3.14-alpine@sha256:26730869004e2b9c4b9ad09cab8625e81d256d1ce97e72df5520e806b1709f92
|
||||
image: python:3.14-alpine@sha256:5a824eb82cc75361f98611f3cfc5091ea33f10a6ccea4d4ebdabbc523b9a1614
|
||||
container_name: monitoring-alertmanager-ntfy-bridge
|
||||
restart: unless-stopped
|
||||
dns:
|
||||
@@ -340,7 +337,7 @@ services:
|
||||
- no-new-privileges:true
|
||||
|
||||
influxdb3-core:
|
||||
image: influxdb:3.10.0-core@sha256:b3e577f38c19963597170d8850a3a7f77af8f0cfa866c64cd13e5de0f238e114
|
||||
image: influxdb:3.9.3-core@sha256:c27c9b2ca2625b5b6966f0b09baa448102310e63a471fd60dff22646a2522e29
|
||||
container_name: monitoring-influxdb3-core
|
||||
user: "0"
|
||||
restart: unless-stopped
|
||||
@@ -354,12 +351,6 @@ services:
|
||||
- --data-dir=/var/lib/influxdb3/data
|
||||
- --plugin-dir=/var/lib/influxdb3/plugins
|
||||
- --admin-token-file=/run/secrets/influxdb3_admin_token
|
||||
# InfluxDB 3 Core kompaktiert Parquet-Dateien nicht (nur Enterprise).
|
||||
# HA schreibt viele Sensoren haeufig -> Tabellen wie "°C"/"%"/"hPa" liefen
|
||||
# ins Default-Limit von 432 Dateien/Query ("No data" in Grafana).
|
||||
# Stopgap: Limit anheben. Langfristig: Enterprise (Auto-Compaction, frei
|
||||
# fuer Home) oder weniger/seltener nach InfluxDB schreiben.
|
||||
- --query-file-limit=20000
|
||||
volumes:
|
||||
- /mnt/user/appdata/influxdb3/data:/var/lib/influxdb3/data
|
||||
- /mnt/user/appdata/influxdb3/plugins:/var/lib/influxdb3/plugins
|
||||
|
||||
@@ -131,78 +131,6 @@ groups:
|
||||
summary: "Latest Borg backup completed with warnings"
|
||||
description: "The latest Borg UI job completed with warnings for archive {{ $labels.archive }}."
|
||||
|
||||
- alert: HomelabBorgScopeSourceListMissing
|
||||
expr: homelab_borg_scope_expected_file_present != 1
|
||||
for: 15m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "Borg expected source list is not visible"
|
||||
description: "Borg UI cannot see the repo source list used for drift checks."
|
||||
|
||||
- alert: HomelabBorgScopeMissingSources
|
||||
expr: homelab_borg_scope_missing_sources_total > 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "Borg UI is missing expected backup sources"
|
||||
description: "Borg UI is missing {{ $value }} source path(s) from ops/borg-ui/all-important-sources.txt."
|
||||
|
||||
- alert: HomelabBorgScopeExtraSources
|
||||
expr: homelab_borg_scope_extra_sources_total > 0
|
||||
for: 30m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "Borg UI has sources not tracked in the repo"
|
||||
description: "Borg UI has {{ $value }} source path(s) that are not listed in ops/borg-ui/all-important-sources.txt."
|
||||
|
||||
- alert: HomelabBorgDumpMissing
|
||||
expr: homelab_borg_dump_present == 0
|
||||
for: 15m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "Borg pre-backup dump is missing: {{ $labels.dump }}"
|
||||
description: "Expected dump artifact {{ $labels.dump }} is not present in the latest dump set. The pre-backup dump job may have failed or stopped."
|
||||
|
||||
- alert: HomelabBorgDumpStale
|
||||
expr: homelab_borg_dump_age_seconds > 30 * 60 * 60
|
||||
for: 15m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "Borg pre-backup dump is stale: {{ $labels.dump }}"
|
||||
description: "Dump artifact {{ $labels.dump }} is older than 30 hours. pre-backup-dumps.sh may have stopped; Borg would keep archiving stale database content without a job failure."
|
||||
|
||||
- alert: HomelabBorgRepositoryCheckStale
|
||||
expr: time() - homelab_borg_repository_last_check_timestamp_seconds > 14 * 24 * 60 * 60
|
||||
for: 30m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "Borg repository check is stale"
|
||||
description: "Borg repository {{ $labels.repository }} has not had a recorded check for more than 14 days."
|
||||
|
||||
- alert: HomelabBorgRetentionDisabled
|
||||
expr: homelab_borg_schedule_prune_after_enabled != 1
|
||||
for: 30m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "Borg retention pruning is disabled"
|
||||
description: "Scheduled Borg job {{ $labels.schedule }} does not run prune after backup."
|
||||
|
||||
- alert: HomelabBorgCompactDisabled
|
||||
expr: homelab_borg_schedule_compact_after_enabled != 1
|
||||
for: 30m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "Borg compaction is disabled"
|
||||
description: "Scheduled Borg job {{ $labels.schedule }} does not run compact after backup."
|
||||
|
||||
- alert: HomelabCriticalContainerDown
|
||||
expr: homelab_critical_container_running == 0
|
||||
for: 5m
|
||||
|
||||
@@ -5,7 +5,7 @@ global:
|
||||
site: kallilabcore
|
||||
|
||||
rule_files:
|
||||
- /etc/prometheus/config/alerts.yml
|
||||
- /etc/prometheus/alerts.yml
|
||||
|
||||
alerting:
|
||||
alertmanagers:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# Borg Backup Scope for KalliLabcore
|
||||
|
||||
Stand: 2026-06-17
|
||||
Stand: 2026-05-31
|
||||
|
||||
This file defines the target state for replacing Backrest with Borg in this homelab.
|
||||
|
||||
@@ -38,7 +38,7 @@ The Unraid flash configuration archive is intentional as well and must be treate
|
||||
| Traefik | file data | `/local/appdata/traefik` |
|
||||
| ntfy | file data | `/local/appdata/ntfy` |
|
||||
| Paperless-GPT | file data | `/local/appdata/paperless-gpt` |
|
||||
| Tailscale | Flash config artifact | covered by `/local/borg-dumps/unraid-flash-config.tar.gz`; no active `/local/appdata/tailscale` path |
|
||||
| Tailscale | file data | `/local/appdata/tailscale` |
|
||||
| AdGuard | config only | `/local/appdata/adguard/conf` |
|
||||
| Borg UI | SQLite dump + self-backup | `/local/borg-dumps`, `/local/appdata/borg-ui/data` |
|
||||
| Komodo | config + Mongo dump | `/local/borg-dumps`, `/local/appdata/komodo/periphery`, `/local/appdata/komodo/core` |
|
||||
@@ -48,12 +48,11 @@ The Unraid flash configuration archive is intentional as well and must be treate
|
||||
| Grafana | SQLite dump from `monitoring_grafana_data` + provisioned config in Git | `/local/borg-dumps`, `monitoring/grafana/provisioning`, `monitoring/grafana/dashboards` |
|
||||
| Filebrowser | file-backed state dump + file data | `/local/borg-dumps`, `/local/appdata/filebrowser` |
|
||||
| InfluxDB 3 Core | file data | `/local/appdata/influxdb3/data`, `/local/appdata/influxdb3/plugins` |
|
||||
| n8n | SQLite dump + encrypted workflow/credential state | `/local/borg-dumps`, `/local/appdata/n8n/data` |
|
||||
| Home Assistant | HA-native backup + file state | `/local/appdata/homeassistant`, `/local/services/smart-home-kalli` |
|
||||
| Smart-Home MQTT / Mosquitto | file data | `/local/appdata/mosquitto/config`, `/local/appdata/mosquitto/data` |
|
||||
| Zigbee2MQTT (planned) | file data + coordinator state | `/local/appdata/zigbee2mqtt`, `/local/services/smart-home-kalli` |
|
||||
| ESPHome (planned) | Fachrepo + optional build/runtime cache | `/local/services/smart-home-kalli/esphome`, optional `/local/appdata/esphome` |
|
||||
| Hermes Agent | file data + SSH key | SSH-Key via `/local/secrets`; `/local/appdata/hermes-agent/data` ist bewusst NICHT in `all-important-sources.txt`, weil der Stack geparkt ist (Review 2026-07-25). Beim Aktivieren des Stacks in die Quellliste aufnehmen. |
|
||||
| Hermes Agent | file data + SSH key | `/local/appdata/hermes-agent/data`, `/local/secrets/hermes_runner_id_ed25519` |
|
||||
| BentoPDF | rebuildable | no critical persistence in compose |
|
||||
|
||||
## Open Decisions and Coverage Gaps
|
||||
@@ -72,17 +71,6 @@ Option A umgesetzt: `pre-backup-dumps.sh` writes `nextcloud.dump` from `nextclou
|
||||
|
||||
The live Unraid User Scripts execute repo scripts from `/mnt/user/services/homelab-infra`, while Komodo keeps stack workspaces below `/mnt/user/services/stacks`. These paths are now mounted into Borg UI as `/local/services/...` and included explicitly so host-side script hotfixes, stack workspace state, and posture-check state are recoverable.
|
||||
|
||||
### User-Daten-Shares ausserhalb des App-Scope
|
||||
|
||||
Filebrowser serviert `/mnt/user/projekte`, `/mnt/user/documents` und `/mnt/user/photos` komplett (`ops/filebrowser/docker-compose.yml`). Der Borg-Scope deckt aber bewusst nur die App-Unterordner ab (`documents/paperless*`, `documents/nextcloud-data`, `documents/scans_inbox`, `photos/immich`, `photos/family_archive`).
|
||||
|
||||
- **`/mnt/user/projekte`** ist aktuell in **keinem** Borg-Scope. Ad-hoc-Dateien, die direkt unter `documents/` oder `photos/` (ausserhalb der genannten App-Ordner) abgelegt werden, ebenfalls nicht.
|
||||
- Entscheidung Operator offen (Eintrag in `docs/MASTER_TODO.md`): Entweder `projekte` als eigenen read-only Borg-UI-Mount + Quelllisten-Eintrag aufnehmen, oder bewusst als "nur lokal, nicht DR-relevant" bestaetigen. Bis zur Entscheidung gilt: dort liegende Originaldaten sind **nicht** wiederherstellbar.
|
||||
|
||||
### Komodo keys
|
||||
|
||||
Production still stores Komodo Core/Periphery keys in the Docker named volume `komodo_komodo_keys`. This is a known open migration item and is not fixed by the Borg source list alone. Target state: move the keys to a host path such as `/mnt/user/appdata/komodo/keys` and mount that path into both Komodo containers, then include it in Borg. Do not treat this as solved until the live Compose stack has been migrated and Periphery reconnect has been verified.
|
||||
|
||||
## Database Dumps Required
|
||||
|
||||
### Shared PostgreSQL (`postgresql17`, runtime PostgreSQL 18)
|
||||
@@ -101,7 +89,6 @@ Production still stores Komodo Core/Periphery keys in the Docker named volume `k
|
||||
|
||||
- Komodo MongoDB
|
||||
- SQLite: `gitea`, `vaultwarden`, `speedtest-tracker`, `borg-ui`, `grafana`
|
||||
- SQLite: `n8n` (`n8n.sqlite.dump`, credentials require the matching `N8N_ENCRYPTION_KEY`)
|
||||
- File-backed state: `filebrowser.bolt.dump`
|
||||
- Unraid flash config: `unraid-flash-config.tar.gz` plus `unraid-flash-config.tar.gz.sha256`
|
||||
- Home Assistant native backups: created by HA under `/mnt/user/appdata/homeassistant/backups` and captured as file state
|
||||
|
||||
@@ -14,16 +14,11 @@
|
||||
/local/appdata/traefik
|
||||
/local/appdata/ntfy
|
||||
/local/appdata/paperless-gpt
|
||||
/local/appdata/tailscale
|
||||
/local/appdata/adguard/conf
|
||||
/local/appdata/borg-ui/data
|
||||
/local/appdata/komodo/periphery
|
||||
/local/appdata/komodo/core
|
||||
/local/appdata/nextcloud/html
|
||||
/local/nextcloud/data
|
||||
/local/appdata/n8n/data
|
||||
/local/appdata/filebrowser
|
||||
/local/appdata/influxdb3/data
|
||||
/local/appdata/influxdb3/plugins
|
||||
/local/services/homelab-infra
|
||||
/local/services/smart-home-kalli
|
||||
/local/services/stacks
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
borg-ui:
|
||||
image: ainullcode/borg-ui@sha256:e51b3d2e6cb38d1ba127ef60ba442c1e157965327196e6f7afb69f30c0ba99d1
|
||||
image: ainullcode/borg-ui@sha256:0922157e8f77a1b2bd23cd09366a458ea6de07fd9306aa1485f9cfe623eca17f
|
||||
container_name: borg-ui
|
||||
restart: unless-stopped
|
||||
security_opt:
|
||||
|
||||
@@ -325,7 +325,6 @@ main() {
|
||||
# Additional host-side SQLite dumps for admin tooling with appdata files.
|
||||
dump_sqlite_file "/mnt/user/appdata/borg-ui/data/borg.db" "$LATEST_DIR/borg-ui.sqlite" "borg-ui"
|
||||
dump_sqlite_file "/var/lib/docker/volumes/monitoring_grafana_data/_data/grafana.db" "$LATEST_DIR/grafana.sqlite" "grafana"
|
||||
dump_sqlite_file "/mnt/user/appdata/n8n/data/database.sqlite" "$LATEST_DIR/n8n.sqlite.dump" "n8n"
|
||||
|
||||
# MongoDB
|
||||
dump_mongo_container "komodo-mongo" "$LATEST_DIR/komodo-mongo.archive.gz"
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
code-server:
|
||||
image: lscr.io/linuxserver/code-server:4.125.0@sha256:7e9523734c003b6336781942df7b48aa6936a9df6931c12a19a1f7ad7858eeba
|
||||
image: lscr.io/linuxserver/code-server:4.123.0@sha256:cb261a7f87674b445e0fd66d87d55900c1b823d276c727ab0d168a75e69e9992
|
||||
container_name: code-server
|
||||
restart: unless-stopped
|
||||
security_opt:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
filebrowser:
|
||||
image: filebrowser/filebrowser:v2.63.15@sha256:9805b21cf910f3ef6f4a1c8f441f1dd6cc4197136f9541fe2a1ab6d050706e4b
|
||||
image: filebrowser/filebrowser:v2.63.14@sha256:1ec9b0c68297550c92f4a93feed432850c2993b261706cc3cc2e808f94a95e76
|
||||
container_name: filebrowser
|
||||
restart: unless-stopped
|
||||
security_opt:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
glances:
|
||||
image: nicolargo/glances:latest-full@sha256:58651aabedf62db8bfc1d252f8d3889675dfcdb5d0ad1c177ae5879c21626f3a
|
||||
image: nicolargo/glances:latest-full@sha256:60872a1af0e40a3150975617c7e811ad7ad48f95bc45d033fb0c1737a037e4d2
|
||||
container_name: glances
|
||||
restart: unless-stopped
|
||||
pid: host
|
||||
|
||||
@@ -25,7 +25,6 @@ $Jobs = @(
|
||||
"immich.dump",
|
||||
"komodo-mongo.archive.gz",
|
||||
"mealie.dump",
|
||||
"n8n.sqlite.dump",
|
||||
"nextcloud.dump",
|
||||
"postgresql17-authelia.dump",
|
||||
"postgresql17-globals.sql",
|
||||
|
||||
@@ -45,13 +45,13 @@
|
||||
"description": "VPN / Remote-Zugang",
|
||||
"tier": 1,
|
||||
"category": "core",
|
||||
"container_name": null,
|
||||
"container_name": "tailscale",
|
||||
"dependencies": [],
|
||||
"url": null,
|
||||
"dump_file": null,
|
||||
"data_paths": ["/boot/config/plugins/tailscale/state"],
|
||||
"first_check": "Tailscale Status auf Host pruefen; native Unraid-Plugin-Instanz und Subnet-Route aktiv?",
|
||||
"notes": "Natives Unraid-Plugin, nicht Docker/Komodo-verwaltet; State liegt im Flash-Backup. Alter Docker-State ist archiviert unter /mnt/user/appdata/_archive/tailscale-removed-2026-06-06/"
|
||||
"data_paths": ["/mnt/user/appdata/tailscale"],
|
||||
"first_check": "Tailscale Status auf Host pruefen; State-Datei fuer Key-Renewal vorhanden?",
|
||||
"notes": "network_mode: host; NET_ADMIN, NET_RAW, /dev/net/tun — dokumentierte VPN-Ausnahmen"
|
||||
},
|
||||
"gitea": {
|
||||
"description": "Git-Server — operative Quelle der Wahrheit fuer GitOps",
|
||||
|
||||
@@ -75,14 +75,14 @@ services:
|
||||
description: VPN / Remote-Zugang
|
||||
tier: 1
|
||||
category: core
|
||||
container_name: null
|
||||
container_name: tailscale
|
||||
dependencies: []
|
||||
url: null
|
||||
dump_file: null
|
||||
data_paths:
|
||||
- /boot/config/plugins/tailscale/state
|
||||
first_check: "Tailscale Status auf Host pruefen; native Unraid-Plugin-Instanz und Subnet-Route aktiv?"
|
||||
notes: "Natives Unraid-Plugin, nicht Docker/Komodo-verwaltet; State liegt im Flash-Backup. Alter Docker-State ist archiviert unter /mnt/user/appdata/_archive/tailscale-removed-2026-06-06/"
|
||||
- /mnt/user/appdata/tailscale
|
||||
first_check: "Tailscale Status auf Host pruefen; State-Datei fuer Key-Renewal vorhanden?"
|
||||
notes: "network_mode: host; NET_ADMIN, NET_RAW, /dev/net/tun — dokumentierte VPN-Ausnahmen"
|
||||
|
||||
gitea:
|
||||
description: Git-Server — operative Quelle der Wahrheit fuer GitOps
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
restoretest-adguard:
|
||||
image: adguard/adguardhome:v0.107.77@sha256:e6f2b8bcda06064ab055b44933a4f0e983c35558b9cdb8d2e7ab1efcee36d890
|
||||
image: adguard/adguardhome:v0.107.76@sha256:7157eb1dc3b26c7af1d6898759a7b3f7d0fa09891fbd2d3caa6abc1057a9179b
|
||||
container_name: restoretest-adguard
|
||||
restart: "no"
|
||||
ports:
|
||||
|
||||
@@ -6,7 +6,6 @@ param(
|
||||
)
|
||||
|
||||
$checks = @(
|
||||
@{ Name = "postgresql17-globals.sql"; Path = Join-Path $DumpRoot "postgresql17-globals.sql" },
|
||||
@{ Name = "postgresql17-paperless.dump"; Path = Join-Path $DumpRoot "postgresql17-paperless.dump" },
|
||||
@{ Name = "postgresql17-mailarchiver.dump"; Path = Join-Path $DumpRoot "postgresql17-mailarchiver.dump" },
|
||||
@{ Name = "mealie.dump"; Path = Join-Path $DumpRoot "mealie.dump" },
|
||||
@@ -14,7 +13,6 @@ $checks = @(
|
||||
@{ Name = "nextcloud.dump"; Path = Join-Path $DumpRoot "nextcloud.dump" },
|
||||
@{ Name = "gitea.sqlite.dump"; Path = Join-Path $DumpRoot "gitea.sqlite.dump" },
|
||||
@{ Name = "vaultwarden.sqlite.dump"; Path = Join-Path $DumpRoot "vaultwarden.sqlite.dump" },
|
||||
@{ Name = "n8n.sqlite.dump"; Path = Join-Path $DumpRoot "n8n.sqlite.dump" },
|
||||
@{ Name = "speedtest-tracker.sqlite.dump"; Path = Join-Path $DumpRoot "speedtest-tracker.sqlite.dump" },
|
||||
@{ Name = "filebrowser.bolt.dump"; Path = Join-Path $DumpRoot "filebrowser.bolt.dump" },
|
||||
@{ Name = "unraid-flash-config.tar.gz"; Path = Join-Path $DumpRoot "unraid-flash-config.tar.gz" }
|
||||
|
||||
@@ -89,7 +89,6 @@ check_pg_header() {
|
||||
}
|
||||
|
||||
for dump in \
|
||||
postgresql17-globals.sql \
|
||||
postgresql17-paperless.dump \
|
||||
postgresql17-mailarchiver.dump \
|
||||
mealie.dump \
|
||||
@@ -97,7 +96,6 @@ for dump in \
|
||||
nextcloud.dump \
|
||||
gitea.sqlite.dump \
|
||||
vaultwarden.sqlite.dump \
|
||||
n8n.sqlite.dump \
|
||||
speedtest-tracker.sqlite.dump \
|
||||
filebrowser.bolt.dump \
|
||||
unraid-flash-config.tar.gz; do
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
scrutiny:
|
||||
image: ghcr.io/starosdev/scrutiny:latest-omnibus@sha256:d79e6f1bc299ab28fbd95c9e05fa5a8c565332d2cb9091a91e42d84d4d939989
|
||||
image: ghcr.io/starosdev/scrutiny:latest-omnibus@sha256:228483f16a6236d2fa9b2fbfca2e76dc861e648fbc6ae6e680d23e5d00211a5d
|
||||
container_name: scrutiny
|
||||
restart: unless-stopped
|
||||
privileged: true
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
speedtest-tracker:
|
||||
image: lscr.io/linuxserver/speedtest-tracker:1.14.4@sha256:f99dfd097709016dfb4387d65bfdc0419bde99cf1dce7e26e70ca616c86f1281
|
||||
image: lscr.io/linuxserver/speedtest-tracker:1.14.3@sha256:c3750c40948a9360000ce62d694da92e85584b4ab6d3d9a9d1432d76fa5e0726
|
||||
container_name: speedtest-tracker
|
||||
restart: unless-stopped
|
||||
security_opt:
|
||||
|
||||
+1
-2
@@ -39,11 +39,10 @@
|
||||
"labels": ["dependencies", "minor-patch"]
|
||||
},
|
||||
{
|
||||
"description": "Kritische Kerninfra (Traefik=Public-Entrypoint, AdGuard/Unbound=DNS, n8n, Nextcloud): nicht im Sammel-PR, eigene einzeln reviewbare PRs, kein Auto-Merge",
|
||||
"description": "Kritische Kerninfra (Traefik=Public-Entrypoint, Unbound=DNS, n8n, Nextcloud): nicht im Sammel-PR, eigene einzeln reviewbare PRs, kein Auto-Merge",
|
||||
"matchManagers": ["docker-compose", "dockerfile"],
|
||||
"matchPackageNames": [
|
||||
"traefik",
|
||||
"adguard/adguardhome",
|
||||
"shaanmajid/unbound",
|
||||
"docker.n8n.io/n8nio/n8n",
|
||||
"nextcloud"
|
||||
|
||||
@@ -30,7 +30,7 @@ parse_compose() {
|
||||
return value
|
||||
}
|
||||
function emit() {
|
||||
if (service && image && !has_profile) {
|
||||
if (service && image) {
|
||||
print clean(container) "\t" clean(image)
|
||||
}
|
||||
}
|
||||
@@ -40,7 +40,6 @@ parse_compose() {
|
||||
sub(/:$/, "", service)
|
||||
image=""
|
||||
container=service
|
||||
has_profile=0
|
||||
next
|
||||
}
|
||||
service && /^ image:/ {
|
||||
@@ -53,10 +52,6 @@ parse_compose() {
|
||||
sub(/^[[:space:]]*container_name:[[:space:]]*/, "", container)
|
||||
next
|
||||
}
|
||||
service && /^ profiles:/ {
|
||||
has_profile=1
|
||||
next
|
||||
}
|
||||
END { emit() }
|
||||
' "$compose"
|
||||
}
|
||||
|
||||
@@ -13,7 +13,6 @@ CERT_MAX_ROWS="${CERT_MAX_ROWS:-12}"
|
||||
IMAGE_AGE_WARN_DAYS="${IMAGE_AGE_WARN_DAYS:-180}"
|
||||
IMAGE_AGE_ALLOW_FILE="${IMAGE_AGE_ALLOW_FILE:-/mnt/user/services/homelab-infra/services/posture-check/image-age-allow.patterns}"
|
||||
LOG_VOLUME_TOP_N="${LOG_VOLUME_TOP_N:-10}"
|
||||
LOG_VOLUME_OBSERVE_THRESHOLD="${LOG_VOLUME_OBSERVE_THRESHOLD:-100000}"
|
||||
DISK_USAGE_WARN_PCT="${DISK_USAGE_WARN_PCT:-85}"
|
||||
CERT_WARN_DAYS="${CERT_WARN_DAYS:-21}"
|
||||
BACKUP_DRIFT_FACTOR="${BACKUP_DRIFT_FACTOR:-2.0}"
|
||||
@@ -218,73 +217,6 @@ derive_report_status() {
|
||||
set_summary "report_status" "$REPORT_STATUS"
|
||||
}
|
||||
|
||||
print_status_reasons() {
|
||||
local count=0
|
||||
|
||||
add_reason() {
|
||||
printf '%s\n' "- $1"
|
||||
count=$((count + 1))
|
||||
}
|
||||
|
||||
[ "${borg_status:-unknown}" != "completed" ] && add_reason "Borg Backup ist \`${borg_status:-unknown}\` statt \`completed\`."
|
||||
[ "${prometheus_alerts:-0}" = "unknown" ] && add_reason "Prometheus Alerts konnten nicht sicher gelesen werden."
|
||||
[ "${cert_warnings:-0}" != "0" ] && add_reason "Zertifikatswarnungen: \`${cert_warnings:-0}\`."
|
||||
[ "${disk_warnings:-0}" != "0" ] && add_reason "Storage-Warnungen: \`${disk_warnings:-0}\`."
|
||||
if [ "${image_warnings:-0}" != "0" ]; then
|
||||
if [ -n "${image_warning_names:-}" ]; then
|
||||
add_reason "Image-Warnungen: \`${image_warnings:-0}\` (${image_warning_names})."
|
||||
else
|
||||
add_reason "Image-Warnungen: \`${image_warnings:-0}\`."
|
||||
fi
|
||||
fi
|
||||
[ "${containers_exited_nonzero:-0}" != "0" ] && add_reason "Container exited non-zero: \`${containers_exited_nonzero:-0}\`."
|
||||
[ "${host_recent_boot:-0}" = "1" ] && add_reason "Host-Reboot innerhalb der letzten 24 Stunden."
|
||||
[ "${backup_duration_drift:-0}" = "1" ] && add_reason "Backup-Dauer-Drift erkannt."
|
||||
[ "${noise_threshold_exceeded:-0}" != "0" ] && add_reason "Noise-Pattern ueber Eskalations-Schwelle: \`${noise_threshold_exceeded:-0}\`."
|
||||
|
||||
if [ "${prometheus_alerts_pending:-0}" != "0" ] && [ "${prometheus_alerts_pending:-0}" != "unknown" ]; then
|
||||
add_reason "Prometheus pending Alerts: \`${prometheus_alerts_pending:-0}\`."
|
||||
fi
|
||||
if [ "${prometheus_alerts_firing:-0}" != "0" ] && [ "${prometheus_alerts_firing:-0}" != "unknown" ]; then
|
||||
add_reason "Prometheus firing Alerts: \`${prometheus_alerts_firing:-0}\`."
|
||||
fi
|
||||
[ "${containers_unhealthy:-0}" != "0" ] && add_reason "Unhealthy Container: \`${containers_unhealthy:-0}\`."
|
||||
|
||||
if [ "$count" -eq 0 ]; then
|
||||
printf '%s\n' "- Keine direkten Ampel-Ausloeser im Summary-Set gefunden."
|
||||
fi
|
||||
}
|
||||
|
||||
print_notable_observations() {
|
||||
local count=0
|
||||
|
||||
add_observation() {
|
||||
printf '%s\n' "- $1"
|
||||
count=$((count + 1))
|
||||
}
|
||||
|
||||
if [ "${traefik_5xx:-0}" != "0" ] && [ "${traefik_5xx:-0}" != "unknown" ]; then
|
||||
if [ -n "${traefik_5xx_top:-}" ] && [ "${traefik_5xx_top:-none}" != "none" ]; then
|
||||
add_observation "Traefik 5xx: \`${traefik_5xx:-0}\` (Top-Gruppe: \`${traefik_5xx_top}\`)."
|
||||
else
|
||||
add_observation "Traefik 5xx: \`${traefik_5xx:-0}\`."
|
||||
fi
|
||||
fi
|
||||
if [ "${log_highlights:-0}" != "0" ] && [ "${log_highlights:-0}" != "unknown" ]; then
|
||||
add_observation "Log-Highlights: \`${log_highlights:-0}\` handlungsrelevante Treffer; Beispiele stehen in der Log-Auswertung."
|
||||
fi
|
||||
if printf '%s' "${log_volume_total:-0}" | grep -Eq '^[0-9]+$' && [ "${log_volume_total:-0}" -ge "$LOG_VOLUME_OBSERVE_THRESHOLD" ]; then
|
||||
add_observation "Log-Volumen: \`${log_volume_total:-0}\` Zeilen im Zeitraum; Top-Verursacher stehen im Log-Volumen-Abschnitt."
|
||||
fi
|
||||
if [ "${docker_events:-0}" != "0" ] && [ "${docker_events:-0}" != "unknown" ]; then
|
||||
add_observation "Docker Critical Events: \`${docker_events:-0}\`."
|
||||
fi
|
||||
|
||||
if [ "$count" -eq 0 ]; then
|
||||
printf '%s\n' "- Keine zusaetzlichen auffaelligen Beobachtungen im Management-Summary."
|
||||
fi
|
||||
}
|
||||
|
||||
collect_borg() {
|
||||
append "## Borg Backup"
|
||||
append ""
|
||||
@@ -652,7 +584,6 @@ collect_image_freshness() {
|
||||
local image_file="$TMP_DIR/images.tsv"
|
||||
local image_warnings=0
|
||||
local image_allowed=0
|
||||
local image_warning_names=""
|
||||
local now_epoch
|
||||
: > "$image_file"
|
||||
now_epoch="$(date +%s)"
|
||||
@@ -699,7 +630,6 @@ collect_image_freshness() {
|
||||
else
|
||||
note="ueberaltert"
|
||||
image_warnings=$((image_warnings + 1))
|
||||
image_warning_names="${image_warning_names:+$image_warning_names,}$name:${age_days}d"
|
||||
fi
|
||||
fi
|
||||
printf '%d\t%s\t%s\t%s\n' "$age_days" "$name" "$image_tag" "$note" >> "$image_file"
|
||||
@@ -707,7 +637,6 @@ collect_image_freshness() {
|
||||
|
||||
set_summary "image_warnings" "$image_warnings"
|
||||
set_summary "image_allowed" "$image_allowed"
|
||||
set_summary "image_warning_names" "$image_warning_names"
|
||||
|
||||
if [ ! -s "$image_file" ]; then
|
||||
append "- Keine Image-Daten verfuegbar."
|
||||
@@ -852,16 +781,8 @@ collect_traefik_5xx() {
|
||||
set_summary "traefik_5xx" "$count"
|
||||
|
||||
if [ "$count" -eq 0 ]; then
|
||||
set_summary "traefik_5xx_top" "none"
|
||||
append "- Keine 5xx-Antworten."
|
||||
else
|
||||
local top_group
|
||||
top_group="$(awk '{ code=$9; service=$12; gsub(/"/, "", service); counts[service " " code]++ } END { for (k in counts) print counts[k], k }' "$file" \
|
||||
| sort -nr \
|
||||
| head -n 1 \
|
||||
| awk '{ print $2 ":" $3 ":" $1 }' \
|
||||
| sed -E 's#[^A-Za-z0-9_.:@/-]+#_#g')"
|
||||
set_summary "traefik_5xx_top" "${top_group:-none}"
|
||||
append "- 5xx-Antworten: $count"
|
||||
append ""
|
||||
append "### Gruppiert nach Service/Code"
|
||||
@@ -1260,20 +1181,10 @@ write_report() {
|
||||
if [ "$REPORT_STATUS" = "OK" ]; then
|
||||
printf 'Im betrachteten Zeitraum zeigt das Homelab eine stabile Betriebslage. Das letzte Borg-Backup ist erfolgreich abgeschlossen, Prometheus meldet keine firing Alerts, keine unhealthy Container, Zertifikate und Storage im erwarteten Bereich.\n\n'
|
||||
elif [ "$REPORT_STATUS" = "WARNUNG" ]; then
|
||||
printf 'Im betrachteten Zeitraum gibt es Punkte, die Aufmerksamkeit verdienen. Der Betrieb ist nicht automatisch als kompromittiert zu bewerten; die konkreten Ampel-Ausloeser stehen direkt darunter.\n\n'
|
||||
printf 'Im betrachteten Zeitraum gibt es Punkte, die Aufmerksamkeit verdienen. Der Betrieb ist nicht automatisch als kompromittiert zu bewerten, aber mindestens ein Signal (Backup, Pending Alert, Zertifikat, Storage, Image-Alter, Drift oder Reboot) weicht vom Normalzustand ab.\n\n'
|
||||
else
|
||||
printf 'Im betrachteten Zeitraum liegt ein kritisches Betriebssignal vor. Der Bericht sollte zeitnah gelesen und die betroffenen Komponenten priorisiert geprueft werden.\n\n'
|
||||
fi
|
||||
printf '### Warum dieser Status?\n\n'
|
||||
if [ "$REPORT_STATUS" = "OK" ]; then
|
||||
printf '%s\n\n' "- Keine Ampel-Ausloeser im Summary-Set."
|
||||
else
|
||||
print_status_reasons
|
||||
printf '\n'
|
||||
fi
|
||||
printf '### Weitere auffaellige Beobachtungen\n\n'
|
||||
print_notable_observations
|
||||
printf '\n'
|
||||
printf '### Management-Bewertung\n\n'
|
||||
printf '%s\n' "- Status: \`$REPORT_STATUS\`"
|
||||
printf '%s\n' "- Borg Backup: \`${borg_status:-unknown}\`"
|
||||
|
||||
@@ -4,11 +4,7 @@ set -euo pipefail
|
||||
TEXTFILE_DIR="${TEXTFILE_DIR:-/mnt/user/services/posture-check/textfile}"
|
||||
OUTPUT_FILE="${OUTPUT_FILE:-$TEXTFILE_DIR/homelab.prom}"
|
||||
BORG_CONTAINER="${BORG_CONTAINER:-borg-ui}"
|
||||
BORG_EXPECTED_SOURCES_FILE="${BORG_EXPECTED_SOURCES_FILE:-/local/services/homelab-infra/ops/borg-ui/all-important-sources.txt}"
|
||||
# Host-Pfad der aktuellen Dump-Artefakte (pre-backup-dumps.sh schreibt hierhin).
|
||||
# Wird host-seitig gestattet; der Exporter laeuft als Unraid User Script.
|
||||
BORG_DUMP_DIR="${BORG_DUMP_DIR:-/mnt/user/backups/borg/dumps/latest}"
|
||||
CRITICAL_CONTAINERS="${CRITICAL_CONTAINERS:-traefik authelia postgresql17 gitea komodo-core komodo-mongo komodo-periphery vaultwarden borg-ui ntfy adguard unbound monitoring-alertmanager monitoring-alertmanager-ntfy-bridge monitoring-blackbox-exporter monitoring-cadvisor monitoring-grafana monitoring-loki monitoring-node-exporter monitoring-promtail immich_server immich_postgres immich_redis paperless-ngx nextcloud nextcloud-postgres nextcloud-redis mealie mealie-postgres mail-archiver n8n homeassistant smarthome-mosquitto}"
|
||||
CRITICAL_CONTAINERS="${CRITICAL_CONTAINERS:-traefik authelia postgresql17 gitea komodo-core komodo-mongo komodo-periphery vaultwarden borg-ui ntfy adguard unbound monitoring-alertmanager monitoring-alertmanager-ntfy-bridge monitoring-blackbox-exporter monitoring-cadvisor monitoring-grafana monitoring-loki monitoring-node-exporter monitoring-promtail immich_server immich_postgres immich_redis paperless-ngx nextcloud nextcloud-postgres nextcloud-redis mealie mealie-postgres}"
|
||||
# Hinweis: Tailscale laeuft als natives Unraid-Plugin (kein Docker-Container) und
|
||||
# wird daher hier bewusst NICHT als kritischer Container gefuehrt (Stand 2026-06-06).
|
||||
|
||||
@@ -94,32 +90,11 @@ EOF
|
||||
# TYPE homelab_borg_last_success gauge
|
||||
# HELP homelab_borg_last_job_warning Whether the most recent Borg backup job completed with warnings.
|
||||
# TYPE homelab_borg_last_job_warning gauge
|
||||
# HELP homelab_borg_repository_last_check_timestamp_seconds Unix timestamp of the latest Borg repository check known to Borg UI.
|
||||
# TYPE homelab_borg_repository_last_check_timestamp_seconds gauge
|
||||
# HELP homelab_borg_scope_expected_file_present Whether the expected Borg source list file is visible inside Borg UI.
|
||||
# TYPE homelab_borg_scope_expected_file_present gauge
|
||||
# HELP homelab_borg_scope_expected_sources_total Number of expected Borg source paths from the repo source list.
|
||||
# TYPE homelab_borg_scope_expected_sources_total gauge
|
||||
# HELP homelab_borg_scope_configured_sources_total Number of Borg source paths configured in Borg UI.
|
||||
# TYPE homelab_borg_scope_configured_sources_total gauge
|
||||
# HELP homelab_borg_scope_missing_sources_total Number of expected Borg source paths missing from Borg UI.
|
||||
# TYPE homelab_borg_scope_missing_sources_total gauge
|
||||
# HELP homelab_borg_scope_extra_sources_total Number of Borg UI source paths not present in the repo source list.
|
||||
# TYPE homelab_borg_scope_extra_sources_total gauge
|
||||
# HELP homelab_borg_scope_source_configured Whether an expected Borg source path is configured in Borg UI.
|
||||
# TYPE homelab_borg_scope_source_configured gauge
|
||||
# HELP homelab_borg_schedule_prune_after_enabled Whether a Borg scheduled job runs prune after backup.
|
||||
# TYPE homelab_borg_schedule_prune_after_enabled gauge
|
||||
# HELP homelab_borg_schedule_compact_after_enabled Whether a Borg scheduled job runs compact after backup.
|
||||
# TYPE homelab_borg_schedule_compact_after_enabled gauge
|
||||
EOF
|
||||
|
||||
if docker inspect "$BORG_CONTAINER" >/dev/null 2>&1; then
|
||||
docker exec -i -e BORG_EXPECTED_SOURCES_FILE="$BORG_EXPECTED_SOURCES_FILE" "$BORG_CONTAINER" python3 - <<'PY'
|
||||
docker exec -i "$BORG_CONTAINER" python3 - <<'PY'
|
||||
import datetime as dt
|
||||
import json
|
||||
import os
|
||||
from pathlib import Path
|
||||
import sqlite3
|
||||
|
||||
conn = sqlite3.connect("/data/borg.db")
|
||||
@@ -160,9 +135,6 @@ def parse_ts(value):
|
||||
def escape_label(value):
|
||||
return (value or "").replace("\\", "\\\\").replace('"', '\\"')
|
||||
|
||||
def bool_metric(value):
|
||||
return 1 if value else 0
|
||||
|
||||
latest_status = latest["status"] if latest else "missing"
|
||||
latest_success = 1 if latest_status in ("completed", "completed_with_warnings") else 0
|
||||
latest_warning = 1 if latest_status == "completed_with_warnings" else 0
|
||||
@@ -173,107 +145,12 @@ completed_archive = escape_label(completed["archive_name"] if completed else "")
|
||||
print(f'homelab_borg_last_success{{status="{latest_status}",archive="{latest_archive}"}} {latest_success}')
|
||||
print(f'homelab_borg_last_job_warning{{status="{latest_status}",archive="{latest_archive}"}} {latest_warning}')
|
||||
print(f'homelab_borg_last_completed_timestamp_seconds{{archive="{completed_archive}"}} {completed_ts}')
|
||||
|
||||
repo = cur.execute("""
|
||||
select id, name, source_directories, last_check
|
||||
from repositories
|
||||
order by id
|
||||
limit 1
|
||||
""").fetchone()
|
||||
|
||||
if repo:
|
||||
repo_name = escape_label(repo["name"] or str(repo["id"]))
|
||||
print(f'homelab_borg_repository_last_check_timestamp_seconds{{repository="{repo_name}"}} {parse_ts(repo["last_check"])}')
|
||||
|
||||
try:
|
||||
configured_sources = json.loads(repo["source_directories"] or "[]")
|
||||
except json.JSONDecodeError:
|
||||
configured_sources = []
|
||||
else:
|
||||
configured_sources = []
|
||||
|
||||
expected_path = Path(os.environ.get("BORG_EXPECTED_SOURCES_FILE", ""))
|
||||
expected_file_present = expected_path.is_file()
|
||||
if expected_file_present:
|
||||
expected_sources = [
|
||||
line.strip()
|
||||
for line in expected_path.read_text(encoding="utf-8").splitlines()
|
||||
if line.strip() and not line.lstrip().startswith("#")
|
||||
]
|
||||
else:
|
||||
expected_sources = []
|
||||
|
||||
configured_set = set(configured_sources)
|
||||
expected_set = set(expected_sources)
|
||||
missing_sources = [source for source in expected_sources if source not in configured_set]
|
||||
extra_sources = [source for source in configured_sources if source not in expected_set]
|
||||
|
||||
print(f"homelab_borg_scope_expected_file_present {bool_metric(expected_file_present)}")
|
||||
print(f"homelab_borg_scope_expected_sources_total {len(expected_sources)}")
|
||||
print(f"homelab_borg_scope_configured_sources_total {len(configured_sources)}")
|
||||
print(f"homelab_borg_scope_missing_sources_total {len(missing_sources)}")
|
||||
print(f"homelab_borg_scope_extra_sources_total {len(extra_sources)}")
|
||||
|
||||
for source in expected_sources:
|
||||
value = 1 if source in configured_set else 0
|
||||
print(f'homelab_borg_scope_source_configured{{source="{escape_label(source)}"}} {value}')
|
||||
|
||||
for source in extra_sources:
|
||||
print(f'homelab_borg_scope_source_configured{{source="{escape_label(source)}",state="extra"}} 0')
|
||||
|
||||
for schedule in cur.execute("""
|
||||
select id, name, run_prune_after, run_compact_after
|
||||
from scheduled_jobs
|
||||
where enabled = 1
|
||||
order by id
|
||||
"""):
|
||||
schedule_name = escape_label(schedule["name"] or str(schedule["id"]))
|
||||
print(f'homelab_borg_schedule_prune_after_enabled{{schedule="{schedule_name}"}} {bool_metric(schedule["run_prune_after"])}')
|
||||
print(f'homelab_borg_schedule_compact_after_enabled{{schedule="{schedule_name}"}} {bool_metric(schedule["run_compact_after"])}')
|
||||
PY
|
||||
else
|
||||
printf 'homelab_borg_last_success{status="container_missing",archive=""} 0\n'
|
||||
printf 'homelab_borg_last_job_warning{status="container_missing",archive=""} 0\n'
|
||||
printf 'homelab_borg_last_completed_timestamp_seconds{archive=""} 0\n'
|
||||
printf 'homelab_borg_repository_last_check_timestamp_seconds{repository=""} 0\n'
|
||||
printf 'homelab_borg_scope_expected_file_present 0\n'
|
||||
printf 'homelab_borg_scope_expected_sources_total 0\n'
|
||||
printf 'homelab_borg_scope_configured_sources_total 0\n'
|
||||
printf 'homelab_borg_scope_missing_sources_total 0\n'
|
||||
printf 'homelab_borg_scope_extra_sources_total 0\n'
|
||||
fi
|
||||
|
||||
# Dump-Frische host-seitig messen. Schliesst den Blindfleck, dass Borg
|
||||
# weiterlaeuft und stale Dumps archiviert, ohne dass ein Job-Fehler entsteht
|
||||
# (pre-backup-dumps.sh gestoppt). Laeuft ausserhalb des borg-ui-Containers,
|
||||
# weil die Dumps host-seitig unter $BORG_DUMP_DIR liegen.
|
||||
cat <<'EOF'
|
||||
# HELP homelab_borg_dump_present Whether an expected Borg pre-backup dump artifact exists in the latest dump set.
|
||||
# TYPE homelab_borg_dump_present gauge
|
||||
# HELP homelab_borg_dump_age_seconds Age in seconds of an expected Borg pre-backup dump artifact.
|
||||
# TYPE homelab_borg_dump_age_seconds gauge
|
||||
EOF
|
||||
for dump in \
|
||||
postgresql17-globals.sql \
|
||||
postgresql17-mailarchiver.dump \
|
||||
postgresql17-paperless.dump \
|
||||
mealie.dump \
|
||||
immich.dump \
|
||||
nextcloud.dump \
|
||||
gitea.sqlite.dump \
|
||||
vaultwarden.sqlite.dump \
|
||||
n8n.sqlite.dump \
|
||||
unraid-flash-config.tar.gz \
|
||||
komodo-mongo.archive.gz; do
|
||||
dump_path="$BORG_DUMP_DIR/$dump"
|
||||
if [ -f "$dump_path" ]; then
|
||||
dump_mtime="$(stat -c %Y "$dump_path" 2>/dev/null || echo 0)"
|
||||
printf 'homelab_borg_dump_present{dump="%s"} 1\n' "$dump"
|
||||
printf 'homelab_borg_dump_age_seconds{dump="%s"} %s\n' "$dump" "$(( now - dump_mtime ))"
|
||||
else
|
||||
printf 'homelab_borg_dump_present{dump="%s"} 0\n' "$dump"
|
||||
fi
|
||||
done
|
||||
} > "$tmp"
|
||||
|
||||
# 0644 statt mktemp-default 0600, damit der node-exporter-Textfile-Collector
|
||||
|
||||
@@ -28,9 +28,3 @@ immich_postgres 2026-09-10
|
||||
# (Dez 2025). Das Image-Alter ist nur Build-Alter, keine veraltete Version.
|
||||
# Re-check: ob eine blackbox_exporter-Version > v0.28.0 erschienen ist.
|
||||
monitoring-blackbox-exporter 2026-09-10
|
||||
|
||||
# glance-docker-socket-proxy: v0.4.2 ist am 2026-06-17 weiterhin der neueste
|
||||
# stabile Tag / latest. Neuere Tags sind nur master/nightly und werden fuer den
|
||||
# lesenden Glance-Socket-Proxy bewusst nicht produktiv eingesetzt.
|
||||
# Re-check: ob ein stabiler Tag > v0.4.2 erschienen ist.
|
||||
glance-docker-socket-proxy 2026-09-17
|
||||
|
||||
@@ -87,19 +87,3 @@ adguard.*bad question section.*only 1 question allowed
|
||||
# this lookup is harmless and does not affect any dashboard.
|
||||
# Re-check: only if Amazon Prometheus is added as a datasource.
|
||||
monitoring-grafana.*grafana-amazonprometheus-datasource not found
|
||||
|
||||
# cAdvisor stale container filesystem stats on Unraid.
|
||||
# Why: cAdvisor can keep reporting an already removed Docker container path in
|
||||
# fsHandler even though the container and path no longer exist. This is a
|
||||
# collector bookkeeping issue, not a failed workload or missing data path.
|
||||
# Re-check: if the message references an existing/running container, if
|
||||
# Prometheus target health fails, or if broader cAdvisor errors appear.
|
||||
monitoring-cadvisor.*failed to collect filesystem stats.*var/lib/docker/containers/[0-9a-f]{64}
|
||||
|
||||
# cAdvisor startup lines that match the generic "oom" / "failed" grep.
|
||||
# Why: "oom_event" is a metric name printed during startup, and Unraid loop
|
||||
# devices can disappear while cAdvisor enumerates block devices.
|
||||
# Re-check: if cAdvisor target health fails or these messages appear outside
|
||||
# container startup together with missing container metrics.
|
||||
monitoring-cadvisor.*enabled metrics:.*oom_event
|
||||
monitoring-cadvisor.*stat failed on /dev/loop[0-9]+ with error: no such file or directory
|
||||
|
||||
@@ -431,24 +431,24 @@ def render_summary_grid(entries):
|
||||
status = classify(label, value)
|
||||
theme = STATUS_THEMES.get(status, STATUS_THEMES["UNKNOWN"])
|
||||
cards.append(
|
||||
'<td style="padding:6px;width:50%;vertical-align:top">'
|
||||
'<td style="padding:6px;width:33.33%;vertical-align:top">'
|
||||
f'<div style="background:{theme["card_bg"]};'
|
||||
f'border:1px solid {theme["card_border"]};'
|
||||
'border-radius:8px;padding:11px 12px;min-height:74px">'
|
||||
'border-radius:8px;padding:12px 14px">'
|
||||
f'<div style="font-size:11px;color:#1e293b;'
|
||||
'text-transform:uppercase;letter-spacing:0.04em;font-weight:700;'
|
||||
f'line-height:1.35;opacity:0.78;overflow-wrap:anywhere">{html.escape(label)}</div>'
|
||||
f'<div style="font-size:16px;font-weight:700;'
|
||||
'text-transform:uppercase;letter-spacing:0.08em;font-weight:700;'
|
||||
f'line-height:1.3;opacity:0.78">{html.escape(label)}</div>'
|
||||
f'<div style="font-size:17px;font-weight:700;'
|
||||
f'color:{theme["card_text"]};margin-top:5px;line-height:1.25;'
|
||||
f'word-break:normal;overflow-wrap:anywhere;font-variant-numeric:tabular-nums">'
|
||||
f'word-break:break-word;font-variant-numeric:tabular-nums">'
|
||||
f'{html.escape(value)}</div>'
|
||||
'</div></td>'
|
||||
)
|
||||
rows_html = []
|
||||
for chunk_start in range(0, len(cards), 2):
|
||||
chunk = cards[chunk_start:chunk_start + 2]
|
||||
while len(chunk) < 2:
|
||||
chunk.append('<td style="padding:6px;width:50%"></td>')
|
||||
for chunk_start in range(0, len(cards), 3):
|
||||
chunk = cards[chunk_start:chunk_start + 3]
|
||||
while len(chunk) < 3:
|
||||
chunk.append('<td style="padding:6px;width:33.33%"></td>')
|
||||
rows_html.append("<tr>" + "".join(chunk) + "</tr>")
|
||||
return (
|
||||
'<table role="presentation" cellpadding="0" cellspacing="0" border="0" width="100%" '
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
services:
|
||||
homeassistant:
|
||||
image: ghcr.io/home-assistant/home-assistant:2026.6.3@sha256:aed891b8f801072302815b4b0fab5adb714182967e9d2e2d4a2be558241c73ad
|
||||
image: ghcr.io/home-assistant/home-assistant:2026.6.1@sha256:59aa8824955c9db491b75d2eebe42bd68494f80c2ec69ec0d66d9dae37d37514
|
||||
container_name: homeassistant
|
||||
restart: unless-stopped
|
||||
environment:
|
||||
|
||||
Reference in New Issue
Block a user