node-exporter runs as nobody:65534 inside its container and was
hitting node_textfile_scrape_error 1 on homelab.prom, because the
file was 0600 root:root (mktemp default). Set it to 0644 right
before the atomic mv. Bundle inhaltsidentisch zum Git-Repo, ohne
Secrets (.gitignore-abgedeckt) und nicht sensibler als die
uebrigen /mnt/user/backups/borg/dumps/latest/*.dump-Files, die
ebenfalls 0644 sind. So funktioniert auch der Nearline-Pull-Workflow
ueber SMB (docs/H_DRIVE_NEARLINE_PULL.md).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Operational hardening across several services after live incident
analysis between 2026-05-18 and 2026-05-20:
- Gitea: disable public registration and OpenID signup/signin to
stop the external POST / 5xx bursts that triggered availability
alerts. New repo-wide policy requires every productive
Micha/homelab-infra Komodo stack to ship with an active
Gitea->Komodo webhook on the current stack ID (documented in
CLAUDE.md, AI_CONTEXT.md, WORKFLOW.md).
- posture-check: extract the Disk1 fstype check into its own
function so the documented Disk1 NTFS exception no longer raises
ntfy warnings, skip POSIX inode checks on NTFS, and dedup ntfy
alerts via a fingerprint state file with ALERT_REPEAT_SECONDS
(default 24h). Repeat-spam on the same cause now suppressed.
- docker-critical-events: parse the event JSON for container name,
action, exit code and signal; drop `die exit=0` events (clean
stops); ship a structured ntfy message instead of the raw event
line.
- Borg UI: mount /mnt/user/services into the backup container as
/local/services:ro and include homelab-infra, stacks and
posture-check in all-important-sources.txt. RESTORE_MATRIX and
DISASTER_RECOVERY updated accordingly.
- Unraid user scripts: document the new
homelab-operations-report-daily cron job and the SMTP password
file it expects on the host.
- MIGRATION_LOG: capture the four live events from this window -
Gitea 5xx burst + signup closure, Komodo webhook reconciliation,
posture-check host-version verification, Borg scope extension,
and Traefik 5xx alert detuning.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Brings the previously untracked daily-status-report.sh and
send-operations-report-mail.sh into the repo, plus a refactor of the
log-noise pipeline:
- New helper services/posture-check/lib/normalize-noise-patterns.sh
strips comments, empty lines and trailing whitespace from
log-noise.patterns before grep -f sees it. A stray empty line in
the pattern file would otherwise have made grep -Eaif match every
hit and silently wipe the log highlights.
- log-noise.patterns is now documented per-pattern (Why / Re-check).
The Vaultwarden pattern is split: token/session noise stays as
noise; DNS/Connect/Resolve/reqwest/hyper errors are removed from
the noise set so real network signals stay visible.
- collect_log_highlights now reports a per-container and per-pattern
noise breakdown (Top N) and an escalation flag when any pattern
exceeds NOISE_ESCALATION_THRESHOLD (default 500). The flag is fed
into derive_report_status and the management summary.
- New shell tests under services/posture-check/tests/ verify the
normalize helper handles comments, empty lines, whitespace-only
lines, and that unknown error lines remain in the attention set.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>