Add daily operations report with hardened log-noise filtering

Brings the previously untracked daily-status-report.sh and
send-operations-report-mail.sh into the repo, plus a refactor of the
log-noise pipeline:

- New helper services/posture-check/lib/normalize-noise-patterns.sh
  strips comments, empty lines and trailing whitespace from
  log-noise.patterns before grep -f sees it. A stray empty line in
  the pattern file would otherwise have made grep -Eaif match every
  hit and silently wipe the log highlights.
- log-noise.patterns is now documented per-pattern (Why / Re-check).
  The Vaultwarden pattern is split: token/session noise stays as
  noise; DNS/Connect/Resolve/reqwest/hyper errors are removed from
  the noise set so real network signals stay visible.
- collect_log_highlights now reports a per-container and per-pattern
  noise breakdown (Top N) and an escalation flag when any pattern
  exceeds NOISE_ESCALATION_THRESHOLD (default 500). The flag is fed
  into derive_report_status and the management summary.
- New shell tests under services/posture-check/tests/ verify the
  normalize helper handles comments, empty lines, whitespace-only
  lines, and that unknown error lines remain in the attention set.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 10:41:33 +02:00
parent b7cbbe51de
commit 9e7bebbd3c
5 changed files with 2026 additions and 0 deletions
File diff suppressed because it is too large
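The suppressed diff presumably contains daily-status-report.sh, so the per-pattern noise breakdown and escalation flag described in the commit message are not visible here. As a rough illustration only, that logic could look like the Python sketch below. The function name `noise_breakdown`, its parameters and the sample data are hypothetical; only the threshold default of 500 and the "Top N" idea come from the commit message.

```python
import re
from collections import Counter

NOISE_ESCALATION_THRESHOLD = 500  # default named in the commit message


def noise_breakdown(lines, patterns, top_n=3, threshold=NOISE_ESCALATION_THRESHOLD):
    """Count noise hits per pattern and flag any pattern above the threshold.

    Hypothetical helper: the real logic lives in the shell function
    collect_log_highlights, which is not shown in this diff.
    """
    compiled = [(p, re.compile(p, re.IGNORECASE)) for p in patterns]
    counts = Counter()
    for line in lines:
        for raw, rx in compiled:
            if rx.search(line):
                counts[raw] += 1
                break  # attribute each line to the first matching pattern
    # "exceeds" is read as strictly greater than the threshold (assumption).
    escalate = any(c > threshold for c in counts.values())
    return counts.most_common(top_n), escalate


sample = ["[monitoring-loki] context canceled"] * 501 + [
    "[gitea] /user/login/openid 403 Forbidden"
]
top, escalate = noise_breakdown(
    sample,
    [r"monitoring-loki.*context canceled", r"gitea.*user/login/openid.*403 Forbidden"],
)
print(top, escalate)
```

Attributing each line to the first matching pattern keeps the per-pattern counts additive; whether the shell implementation counts overlapping matches the same way is not known from this page.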
@@ -0,0 +1,27 @@
#!/usr/bin/env bash
# normalize-noise-patterns.sh
#
# Read a log-noise.patterns file and emit a normalized stream of patterns
# that is safe to feed into `grep -Eaif`.
#
# Behaviour:
# - Lines starting with `#` (after optional leading whitespace) are dropped.
# - Empty / whitespace-only lines are dropped.
# - Leading and trailing whitespace is trimmed from each pattern.
# - Patterns that become empty after trimming are dropped.
#
# Why this exists:
# An empty line in the input file is an empty pattern, which makes
# `grep -Eaif` match every input line; a whitespace-only line matches every
# line that contains that whitespace. Either one silently wipes the log
# highlights signal. Always pipe patterns through this normalizer first.
#
# Usage:
# normalize-noise-patterns.sh <file>
# cat patterns | normalize-noise-patterns.sh
set -euo pipefail
src="${1:-/dev/stdin}"
grep -Ev '^[[:space:]]*(#|$)' "$src" \
| sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' \
| grep -v '^$' || true
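To make the failure mode concrete, here is a small Python sketch of the same normalization and of what an empty pattern does to a match-any filter. It is an illustration under assumptions, not the helper itself: `normalize_patterns` and `is_noise` are hypothetical names, and Python's `re` syntax is only approximately equivalent to POSIX ERE.

```python
import re


def normalize_patterns(raw_lines):
    """Mirror the shell helper: drop comments and blank lines, trim the rest."""
    out = []
    for line in raw_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        out.append(stripped)
    return out


def is_noise(line, patterns):
    """True if any pattern matches, case-insensitively (like grep -Ei -f)."""
    return any(re.search(p, line, re.IGNORECASE) for p in patterns)


raw = ["# comment", "", "   ", "  gitea.*403 Forbidden  "]
hit = "[postgres] FATAL: connection refused"

# Unnormalized: the empty string is a valid regex that matches any line,
# so a real error would be classified as noise.
assert is_noise(hit, raw)

# Normalized: only the real pattern is left, so the postgres line survives.
assert normalize_patterns(raw) == ["gitea.*403 Forbidden"]
assert not is_noise(hit, normalize_patterns(raw))
```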
@@ -0,0 +1,81 @@
# log-noise.patterns - Daily Operations Report
#
# Format:
# - One Extended Regex (ERE) per non-comment line.
# - Lines starting with '#' (after optional whitespace) are comments.
# - Empty / whitespace-only lines are ignored.
# - Patterns are applied case-insensitively (grep -Eaif).
# - The file is normalized via lib/normalize-noise-patterns.sh before use.
#
# Per pattern, document:
# - Why this is noise (root cause, not just "expected").
# - When to re-check / what would invalidate the assumption.
#
# Adding a new pattern: prefer the narrowest container.* prefix and the
# narrowest message anchor. A pattern that matches across containers or
# matches generic error strings will hide real signal.
#
# Removing a pattern: confirm that the previously suppressed lines show up
# as attention entries in the next daily report, and consult before
# reintroducing the pattern.
#
# Last reviewed: 2026-05-21
# Loki internal query cancellations / scheduler chatter.
# Why: Loki cancels internal queries continuously when downstream Promtails
# or Grafana panels drop connections; no user-visible outage by itself.
# Re-check: if Grafana dashboards show real Loki query failures or if
# Prometheus alerts fire on Loki ingestion / availability.
monitoring-loki.*(context canceled|error notifying scheduler|closing iterator)
# node-exporter parsing /host/proc/mdstat on Unraid.
# Why: Unraid uses its own array driver, not Linux mdadm, so /proc/mdstat
# layout is unparsable for node-exporter. Pure collector noise.
# Re-check: only if migrating to mdadm-based RAID. Then remove this entry
# and act on real mdadm errors.
monitoring-node-exporter.*mdadm.*Cannot parse /host/proc/mdstat
# Gitea OpenID login attempts return 403.
# Why: OpenID provider is intentionally disabled in Gitea config; 403 is
# the expected response for stale OAuth callback URLs.
# Re-check: when OpenID/OIDC gets enabled again. Remove this and treat
# the 403 as a real auth failure signal.
gitea.*user/login/openid.*403 Forbidden
# Uptime Kuma monitor for legacy domain grafana.kaleschke.info returning 404.
# Why: Monitor was not removed when the public Grafana endpoint was
# decommissioned.
# Re-check: at next Uptime-Kuma cleanup. Action: delete the obsolete monitor
# and remove this pattern.
uptime-kuma.*grafana\.kaleschke\.info.*status code 404
# Tailscale PCP port mapping failure (router does not support PCP/NAT-PMP).
# Why: Tailscale falls back to STUN/DERP transparently; no functional impact.
# Re-check: if Tailscale reports persistent connectivity problems in real
# usage, or if a router change adds PCP/NAT-PMP support.
Tailscale-Docker.*failed to get PCP mapping
# Immich version check failed to reach GitHub releases API.
# Why: External GitHub release check; transient failures do not affect
# Immich core functionality.
# Re-check: if Immich UI persistently warns about being outdated or if
# security updates are missed because of this.
immich_server.*Failed to fetch latest release
# Authelia 408 client-side request timeouts.
# Why: Clients (browsers, Vaultwarden-CLI etc.) drop slow connections;
# without correlated login failures or 5xx, individual 408s are normal.
# Re-check: if 408-rate spikes (>5/min sustained) or if login flows complain.
# Then narrow this pattern instead of removing.
authelia.*Request timeout occurred.*status_code=408
# Vaultwarden expired sessions and invalid refresh tokens (auth/session class).
# Why: Normal session expiry; clients retry and re-login transparently.
# Re-check: if many distinct external IPs trigger 401s in a short window
# (possible brute-force or credential-stuffing pattern).
#
# NOTE: DNS / Connect / Resolve / reqwest / hyper-client errors are
# intentionally NOT suppressed here. They are real network signals
# and should be visible in the attention list. If push-notification
# noise becomes overwhelming, add a *narrow* pattern restricted to
# push contexts only (e.g. `vaultwarden.*push.*(ResolveError|...)`).
vaultwarden.*(Token has expired|Invalid refresh token|Failed to decode.*refresh_token|POST /identity/connect/token => 401 Unauthorized)
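The effect of keeping the Vaultwarden pattern narrow can be checked with a few sample lines. The sketch below uses Python's `re` as a stand-in for `grep -Ei`; the sample log lines and the helper name `suppressed` are hypothetical, and the real log format may differ.

```python
import re

# The Vaultwarden noise pattern from the file, split across lines for readability.
VAULTWARDEN_NOISE = re.compile(
    r"vaultwarden.*(Token has expired|Invalid refresh token|"
    r"Failed to decode.*refresh_token|"
    r"POST /identity/connect/token => 401 Unauthorized)",
    re.IGNORECASE,
)


def suppressed(line):
    """True if the line would be treated as known noise."""
    return bool(VAULTWARDEN_NOISE.search(line))


# Hypothetical sample lines mapped to the expected classification.
samples = {
    "[vaultwarden] Token has expired": True,
    "[vaultwarden] POST /identity/connect/token => 401 Unauthorized": True,
    "[vaultwarden] reqwest error: ResolveError for push host": False,
    "[gitea] Token has expired": False,
}

for line, expected in samples.items():
    assert suppressed(line) is expected
```

The container prefix keeps the same message from another service visible, and the absence of network error names in the alternation keeps DNS and connect failures in the attention list.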
@@ -0,0 +1,612 @@
#!/usr/bin/env bash
set -euo pipefail
REPORT_PATH="${1:-}"
REPORT_STATUS="${2:-UNKNOWN}"
MAIL_FROM="${MAIL_FROM:-michideheld@gmx.de}"
MAIL_TO="${MAIL_TO:-Mi.Kaleschke@gmx.de}"
SMTP_HOST="${SMTP_HOST:-smtp.gmx.net}"
SMTP_PORT="${SMTP_PORT:-587}"
SMTP_USER="${SMTP_USER:-$MAIL_FROM}"
SMTP_PASS_FILE="${SMTP_PASS_FILE:-/mnt/user/appdata/secrets/homelab_smtp_password.txt}"
MAIL_IMAGE="${MAIL_IMAGE:-python:3.13-alpine}"
MAIL_DNS_1="${MAIL_DNS_1:-1.1.1.1}"
MAIL_DNS_2="${MAIL_DNS_2:-8.8.8.8}"
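if [ -z "$REPORT_PATH" ] || [ ! -f "$REPORT_PATH" ]; then
  echo "Usage: $0 <report-path> [status]" >&2
  exit 1
fi
# docker -v needs an absolute host path; a relative path would be treated
# as a volume name or rejected.
REPORT_PATH="$(cd "$(dirname "$REPORT_PATH")" && pwd)/$(basename "$REPORT_PATH")"
if [ ! -f "$SMTP_PASS_FILE" ]; then
  echo "Missing SMTP password file: $SMTP_PASS_FILE" >&2
  exit 1
fi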
REPORT_BASENAME="$(basename "$REPORT_PATH")"
REPORT_DATE="${REPORT_BASENAME#homelab-day-}"
REPORT_DATE="${REPORT_DATE%.md}"
SUBJECT="${MAIL_SUBJECT:-Homelab Operations Report - $REPORT_DATE - $REPORT_STATUS}"
docker run -i --rm \
  --dns "$MAIL_DNS_1" \
  --dns "$MAIL_DNS_2" \
  -e MAIL_FROM="$MAIL_FROM" \
  -e MAIL_TO="$MAIL_TO" \
  -e SMTP_HOST="$SMTP_HOST" \
  -e SMTP_PORT="$SMTP_PORT" \
  -e SMTP_USER="$SMTP_USER" \
  -e MAIL_SUBJECT="$SUBJECT" \
  -e REPORT_STATUS="$REPORT_STATUS" \
  -e REPORT_HOSTNAME="$(hostname)" \
  -v "$REPORT_PATH:/report.md:ro" \
  -v "$SMTP_PASS_FILE:/smtp-password:ro" \
  "$MAIL_IMAGE" python - <<'PY'
import os
import html
import re
import smtplib
import ssl
from datetime import datetime, timezone
from email.message import EmailMessage
from pathlib import Path
mail_from = os.environ["MAIL_FROM"]
mail_to = os.environ["MAIL_TO"]
smtp_host = os.environ["SMTP_HOST"]
smtp_port = int(os.environ.get("SMTP_PORT", "587"))
smtp_user = os.environ.get("SMTP_USER") or mail_from
subject = os.environ["MAIL_SUBJECT"]
report_status = os.environ.get("REPORT_STATUS", "UNKNOWN")
report_hostname = os.environ.get("REPORT_HOSTNAME", "")
password = Path("/smtp-password").read_text(encoding="utf-8").strip()
report = Path("/report.md").read_text(encoding="utf-8")
# ---------- Style constants ----------
COLORS = {
    "bg": "#f7f8fa",
    "card_bg": "#ffffff",
    "text": "#0f172a",
    "text_muted": "#475569",
    "border": "#e2e8f0",
    "border_strong": "#cbd5e1",
    "zebra": "#f1f5f9",
    "code_bg": "#eef2ff",
    "code_text": "#3730a3",
    "pre_bg": "#f8fafc",
    "accent": "#3b82f6",
}
STATUS_THEMES = {
    "OK": {"banner_a": "#16a34a", "banner_b": "#22c55e", "card_bg": "#dcfce7", "card_border": "#86efac", "card_text": "#166534"},
    "WARNUNG": {"banner_a": "#d97706", "banner_b": "#f59e0b", "card_bg": "#fef3c7", "card_border": "#fcd34d", "card_text": "#92400e"},
    "KRITISCH": {"banner_a": "#dc2626", "banner_b": "#ef4444", "card_bg": "#fee2e2", "card_border": "#fca5a5", "card_text": "#991b1b"},
    "UNKNOWN": {"banner_a": "#475569", "banner_b": "#64748b", "card_bg": "#f1f5f9", "card_border": "#cbd5e1", "card_text": "#334155"},
}
OK_VALUES = {"ok", "completed", "0", "aktiv", "ja"}
WARN_VALUES = {"warnung", "pending", "insufficient"}
CRIT_VALUES = {"kritisch", "failed", "error"}
UNKNOWN_VALUES = {"unknown", "missing"}
CRIT_LABEL_HINTS = ("unhealthy", "firing")
META_LABELS = {"Erstellt", "Zeitraum", "Host", "Gesamtbewertung"}
SUMMARY_LIST_RE = re.compile(r"^-\s+(.+?):\s*`(.+?)`\s*$")
H1_DATE_RE = re.compile(r"^#\s+Homelab Operations Report\s*-\s*(\d{4}-\d{2}-\d{2})")
INLINE_CODE_RE = re.compile(r"`([^`]+)`")
def classify(label, value):
    v = value.strip().lower()
    lbl = label.strip().lower()
    if v in OK_VALUES:
        return "OK"
    if v in UNKNOWN_VALUES:
        return "UNKNOWN"
    if v in WARN_VALUES:
        return "WARNUNG"
    if v in CRIT_VALUES:
        return "KRITISCH"
    try:
        n = float(v)
    except ValueError:
        return "UNKNOWN"
    if n == 0:
        return "OK"
    if any(hint in lbl for hint in CRIT_LABEL_HINTS):
        return "KRITISCH"
    return "WARNUNG"

def classify_callout(text):
    lower = text.lower()
    if "kritisch" in lower or "sofort" in lower:
        return "crit"
    warn_hints = ("drift", "warnung", "ablauf", "brauchen aufmerksamkeit", "pruefen", "prüfen", "ueberalter", "nachverfolgt")
    if any(h in lower for h in warn_hints):
        return "warn"
    return "ok"
# ---------- Pass 1: parse_blocks ----------
def parse_blocks(text):
    lines = text.splitlines()
    blocks = []
    meta = {}
    report_date = None
    in_management_section = False
    seen_h2 = False
    i = 0
    n = len(lines)

    def flush_paragraph(buf):
        if not buf:
            return
        joined = " ".join(buf).strip()
        if not joined:
            return
        if joined.startswith("Bewertung:"):
            blocks.append(("callout", classify_callout(joined), joined))
        else:
            blocks.append(("paragraph", joined))

    while i < n:
        line = lines[i]
        stripped = line.rstrip()
        m1 = H1_DATE_RE.match(stripped)
        if m1:
            report_date = m1.group(1)
            i += 1
            continue
        if stripped.startswith("# "):
            i += 1
            continue
        if stripped.startswith("```"):
            i += 1
            pre_buf = []
            while i < n and not lines[i].startswith("```"):
                pre_buf.append(lines[i])
                i += 1
            if i < n:
                i += 1  # closing fence
            blocks.append(("pre", "\n".join(pre_buf)))
            continue
        if stripped.startswith("### "):
            title = stripped[4:].strip()
            in_management_section = (title == "Management-Bewertung")
            blocks.append(("heading", 3, title))
            i += 1
            continue
        if stripped.startswith("## "):
            blocks.append(("heading", 2, stripped[3:].strip()))
            seen_h2 = True
            in_management_section = False
            i += 1
            continue
        if stripped.startswith("- "):
            if in_management_section:
                entries = []
                while i < n and lines[i].rstrip().startswith("- "):
                    body = lines[i].rstrip()[2:].strip()
                    if ":" in body:
                        lbl, val = body.split(":", 1)
                        val = val.strip()
                        val = re.sub(r"`([^`]+)`", r"\1", val)
                        entries.append((lbl.strip(), val))
                    else:
                        entries.append((body, ""))
                    i += 1
                blocks.append(("summary_grid", entries))
                in_management_section = False
                continue
            if not seen_h2:
                saved_i = i
                tmp_items = []
                tmp_is_meta = True
                while i < n and lines[i].rstrip().startswith("- "):
                    body = lines[i].rstrip()[2:].strip()
                    if ":" not in body:
                        tmp_is_meta = False
                        break
                    lbl, val = body.split(":", 1)
                    val = val.strip()
                    m = re.search(r"`([^`]+)`", val)
                    if m:
                        val = m.group(1)
                    else:
                        val = val.strip("`")
                    tmp_items.append((lbl.strip(), val))
                    i += 1
                if tmp_is_meta and tmp_items and any(lbl in META_LABELS for lbl, _ in tmp_items):
                    for lbl, val in tmp_items:
                        meta[lbl] = val
                    continue
                i = saved_i
            items = []
            while i < n and lines[i].rstrip().startswith("- "):
                items.append(lines[i].rstrip()[2:].strip())
                i += 1
            blocks.append(("ul", items))
            continue
        if stripped.startswith("|") and stripped.endswith("|"):
            header_cells = [c.strip() for c in stripped.strip("|").split("|")]
            i += 1
            alignments = ["left"] * len(header_cells)
            if i < n:
                sep = lines[i].rstrip()
                if sep.startswith("|") and "-" in sep and sep.endswith("|"):
                    sep_cells = [c.strip() for c in sep.strip("|").split("|")]
                    for idx, cell in enumerate(sep_cells):
                        if idx >= len(alignments):
                            continue
                        if cell.startswith(":") and cell.endswith(":"):
                            alignments[idx] = "center"
                        elif cell.endswith(":"):
                            alignments[idx] = "right"
                    i += 1
            rows = []
            while i < n:
                row = lines[i].rstrip()
                if not (row.startswith("|") and row.endswith("|")):
                    break
                rows.append([c.strip() for c in row.strip("|").split("|")])
                i += 1
            blocks.append(("table", header_cells, alignments, rows))
            continue
        if not stripped:
            i += 1
            continue
        para_buf = [stripped]
        i += 1
        while i < n:
            nxt = lines[i].rstrip()
            if (not nxt
                    or nxt.startswith("#")
                    or nxt.startswith("- ")
                    or nxt.startswith("```")
                    or (nxt.startswith("|") and nxt.endswith("|"))):
                break
            para_buf.append(nxt)
            i += 1
        flush_paragraph(para_buf)
    return blocks, report_date, meta
# ---------- Pass 2: section wrappers ----------
def inject_section_wrappers(blocks):
    out = []
    inside = False
    for blk in blocks:
        if blk[0] == "heading" and blk[1] == 2:
            if inside:
                out.append(("section_close",))
            out.append(("section_open", blk[2]))
            inside = True
            continue
        out.append(blk)
    if inside:
        out.append(("section_close",))
    return out
# ---------- Pass 3: render ----------
def inline(text):
    escaped = html.escape(text)
    return INLINE_CODE_RE.sub(
        lambda m: (
            f'<code style="background:{COLORS["code_bg"]};color:{COLORS["code_text"]};'
            f'padding:1px 6px;border-radius:4px;'
            f'font-family:ui-monospace,SFMono-Regular,Consolas,monospace;font-size:12.5px">'
            f'{m.group(1)}</code>'
        ),
        escaped,
    )
def render_hero(status, report_date, hostname, meta):
    theme = STATUS_THEMES.get(status, STATUS_THEMES["UNKNOWN"])
    a, b = theme["banner_a"], theme["banner_b"]
    date_label = report_date or meta.get("Erstellt", "") or ""
    chips = []
    erstellt = meta.get("Erstellt", "")
    zeitraum = meta.get("Zeitraum", "")
    if erstellt:
        chips.append(f"Erstellt {html.escape(erstellt)}")
    if zeitraum:
        chips.append(f"Zeitraum {html.escape(zeitraum)}")
    if hostname:
        chips.append(f"Host {html.escape(hostname)}")
    chips_html = ""
    if chips:
        chips_html = (
            '<div style="margin-top:14px;font-size:12px;color:rgba(255,255,255,0.92);'
            'line-height:1.5">'
            + " &nbsp;·&nbsp; ".join(chips)
            + "</div>"
        )
    return (
        '<table role="presentation" cellpadding="0" cellspacing="0" border="0" width="100%" '
        'style="margin-bottom:20px"><tr><td '
        f'bgcolor="{a}" '
        f'style="background-color:{a};'
        f'background-image:linear-gradient(135deg,{a} 0%,{b} 100%);'
        'padding:28px 32px;border-radius:12px;color:#ffffff">'
        '<div style="font-size:12px;text-transform:uppercase;letter-spacing:0.14em;'
        'opacity:0.85;font-weight:600">Homelab Operations Report</div>'
        f'<div style="font-size:30px;font-weight:700;margin-top:8px;line-height:1.1">'
        f'{html.escape(date_label)}</div>'
        '<div style="margin-top:14px">'
        '<span style="display:inline-block;background:rgba(255,255,255,0.22);'
        'padding:6px 16px;border-radius:999px;font-weight:700;'
        f'letter-spacing:0.08em;font-size:13px">{html.escape(status)}</span>'
        '</div>'
        f'{chips_html}'
        '</td></tr></table>'
    )
def render_section_open(title):
    return (
        '<table role="presentation" cellpadding="0" cellspacing="0" border="0" width="100%" '
        f'style="margin:16px 0;background:{COLORS["card_bg"]};'
        f'border:1px solid {COLORS["border"]};border-radius:10px;'
        'box-shadow:0 1px 2px rgba(15,23,42,0.04);overflow:hidden">'
        '<tr>'
        f'<td width="4" bgcolor="{COLORS["accent"]}" '
        f'style="background-color:{COLORS["accent"]};width:4px;min-width:4px"></td>'
        '<td style="padding:18px 24px">'
        f'<h2 style="margin:0 0 14px;font-size:19px;color:{COLORS["text"]};'
        f'font-weight:700;letter-spacing:-0.01em">{html.escape(title)}</h2>'
    )

def render_section_close():
    return "</td></tr></table>"
def render_heading(level, text):
    if level == 3:
        return (
            f'<h3 style="font-size:14px;margin:18px 0 8px;color:{COLORS["text_muted"]};'
            'font-weight:600;text-transform:uppercase;letter-spacing:0.06em">'
            f'{inline(text)}</h3>'
        )
    return (
        f'<h4 style="font-size:13px;margin:14px 0 6px;color:{COLORS["text_muted"]};'
        f'font-weight:600">{inline(text)}</h4>'
    )

def render_paragraph(text):
    return (
        f'<p style="margin:8px 0;color:{COLORS["text"]};line-height:1.6;'
        f'font-size:14px">{inline(text)}</p>'
    )
def render_callout(flavor, text):
    themes = {
        "ok": {"bg": "#ecfdf5", "border": "#16a34a", "text": "#065f46"},
        "warn": {"bg": "#fffbeb", "border": "#d97706", "text": "#78350f"},
        "crit": {"bg": "#fef2f2", "border": "#dc2626", "text": "#7f1d1d"},
    }
    t = themes.get(flavor, themes["ok"])
    return (
        f'<div style="background:{t["bg"]};border-left:4px solid {t["border"]};'
        f'color:{t["text"]};padding:12px 16px;margin:14px 0;border-radius:4px;'
        f'line-height:1.55;font-size:14px">{inline(text)}</div>'
    )

def render_ul(items):
    lis = "".join(
        f'<li style="margin:5px 0;color:{COLORS["text"]};'
        f'line-height:1.55;font-size:14px">{inline(it)}</li>'
        for it in items
    )
    return f'<ul style="margin:8px 0 12px 22px;padding:0">{lis}</ul>'
def render_summary_grid(entries):
    if not entries:
        return ""
    cards = []
    for label, value in entries:
        status = classify(label, value)
        theme = STATUS_THEMES.get(status, STATUS_THEMES["UNKNOWN"])
        cards.append(
            '<td style="padding:6px;width:33.33%;vertical-align:top">'
            f'<div style="background:{theme["card_bg"]};'
            f'border:1px solid {theme["card_border"]};'
            'border-radius:8px;padding:12px 14px">'
            f'<div style="font-size:11px;color:#1e293b;'
            'text-transform:uppercase;letter-spacing:0.08em;font-weight:700;'
            f'line-height:1.3;opacity:0.78">{html.escape(label)}</div>'
            f'<div style="font-size:17px;font-weight:700;'
            f'color:{theme["card_text"]};margin-top:5px;line-height:1.25;'
            f'word-break:break-word;font-variant-numeric:tabular-nums">'
            f'{html.escape(value)}</div>'
            '</div></td>'
        )
    rows_html = []
    for chunk_start in range(0, len(cards), 3):
        chunk = cards[chunk_start:chunk_start + 3]
        while len(chunk) < 3:
            chunk.append('<td style="padding:6px;width:33.33%"></td>')
        rows_html.append("<tr>" + "".join(chunk) + "</tr>")
    return (
        '<table role="presentation" cellpadding="0" cellspacing="0" border="0" width="100%" '
        'style="margin:12px 0;border-collapse:separate;border-spacing:0">'
        + "".join(rows_html)
        + "</table>"
    )
def render_table(header_cells, alignments, rows):
    def is_numeric_header(h):
        h_strip = h.strip().rstrip(":")
        if re.search(r"(anzahl|zeilen|tage|sekunden|gestern|heute|%|nutzung|frei|resttage)$",
                     h_strip, re.IGNORECASE):
            return True
        return False

    final_aligns = []
    for idx, h in enumerate(header_cells):
        if idx < len(alignments) and alignments[idx] != "left":
            final_aligns.append(alignments[idx])
        elif is_numeric_header(h):
            final_aligns.append("right")
        else:
            final_aligns.append("left")
    th_html = "".join(
        f'<th align="{a}" style="text-align:{a};padding:9px 12px;'
        f'background:{COLORS["zebra"]};'
        f'border-bottom:2px solid {COLORS["border_strong"]};'
        f'color:{COLORS["text"]};font-size:12px;font-weight:600;'
        'text-transform:uppercase;letter-spacing:0.05em">'
        f'{inline(h)}</th>'
        for h, a in zip(header_cells, final_aligns)
    )
    tr_html = []
    for idx, row in enumerate(rows):
        bg = COLORS["zebra"] if idx % 2 == 1 else COLORS["card_bg"]
        cells = []
        for cidx, cell in enumerate(row):
            a = final_aligns[cidx] if cidx < len(final_aligns) else "left"
            numeric_style = "font-variant-numeric:tabular-nums;" if a == "right" else ""
            cells.append(
                f'<td align="{a}" style="text-align:{a};padding:8px 12px;'
                f'border-bottom:1px solid {COLORS["border"]};'
                f'color:{COLORS["text"]};font-size:13px;{numeric_style}">'
                f'{inline(cell)}</td>'
            )
        tr_html.append(
            f'<tr style="background:{bg}">' + "".join(cells) + "</tr>"
        )
    return (
        '<table role="presentation" cellpadding="0" cellspacing="0" border="0" width="100%" '
        f'style="border-collapse:collapse;margin:12px 0;'
        f'border:1px solid {COLORS["border"]};border-radius:6px;overflow:hidden">'
        f'<thead><tr>{th_html}</tr></thead>'
        f'<tbody>{"".join(tr_html)}</tbody>'
        '</table>'
    )
def render_pre(text):
    return (
        f'<div style="background:{COLORS["pre_bg"]};'
        f'border:1px solid {COLORS["border"]};'
        f'border-left:4px solid {COLORS["accent"]};'
        'border-radius:6px;padding:12px 14px;margin:12px 0;overflow:auto">'
        '<pre style="margin:0;font-family:ui-monospace,SFMono-Regular,Consolas,monospace;'
        f'font-size:12px;color:{COLORS["text"]};line-height:1.5;'
        'white-space:pre-wrap;word-break:break-word">'
        + html.escape(text)
        + '</pre></div>'
    )

def render_footer(hostname):
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    parts = []
    if hostname:
        parts.append(f"Host {html.escape(hostname)}")
    parts.append("Generator send-operations-report-mail.sh")
    parts.append(f"Rendered {ts}")
    return (
        f'<div style="margin:24px 0 4px;padding:14px;color:{COLORS["text_muted"]};'
        'font-size:11px;line-height:1.5;text-align:center;'
        f'border-top:1px solid {COLORS["border"]}">'
        + " &nbsp;·&nbsp; ".join(parts)
        + '</div>'
    )
def render_blocks(blocks, status, hostname, report_date, meta):
    out = [render_hero(status, report_date, hostname, meta)]
    for blk in blocks:
        kind = blk[0]
        if kind == "section_open":
            out.append(render_section_open(blk[1]))
        elif kind == "section_close":
            out.append(render_section_close())
        elif kind == "heading":
            out.append(render_heading(blk[1], blk[2]))
        elif kind == "paragraph":
            out.append(render_paragraph(blk[1]))
        elif kind == "callout":
            out.append(render_callout(blk[1], blk[2]))
        elif kind == "ul":
            out.append(render_ul(blk[1]))
        elif kind == "summary_grid":
            out.append(render_summary_grid(blk[1]))
        elif kind == "table":
            out.append(render_table(blk[1], blk[2], blk[3]))
        elif kind == "pre":
            out.append(render_pre(blk[1]))
    out.append(render_footer(hostname))
    return "\n".join(out)
def markdown_to_html(text, status="UNKNOWN", hostname=""):
    blocks, report_date, meta = parse_blocks(text)
    blocks = inject_section_wrappers(blocks)
    body_html = render_blocks(blocks, status, hostname, report_date, meta)
    css = (
        "body{font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Helvetica,Arial,sans-serif;"
        f"line-height:1.55;color:{COLORS['text']};background:{COLORS['bg']};"
        "max-width:940px;margin:24px auto;padding:0 18px}"
        "*{box-sizing:border-box}"
        f"a{{color:{COLORS['accent']};text-decoration:none}}"
        "a:hover{text-decoration:underline}"
    )
    return (
        "<!doctype html>"
        "<html><head><meta charset='utf-8'>"
        "<meta name='viewport' content='width=device-width,initial-scale=1'>"
        f"<style>{css}</style>"
        "</head><body>"
        f"{body_html}"
        "</body></html>"
    )
message = EmailMessage()
message["From"] = mail_from
message["To"] = mail_to
message["Subject"] = subject
message.set_content(report, subtype="plain", charset="utf-8")
message.add_alternative(
    markdown_to_html(report, status=report_status, hostname=report_hostname),
    subtype="html",
    charset="utf-8",
)
context = ssl.create_default_context()
with smtplib.SMTP(smtp_host, smtp_port, timeout=30) as smtp:
    smtp.ehlo()
    smtp.starttls(context=context)
    smtp.ehlo()
    smtp.login(smtp_user, password)
    smtp.send_message(message)
PY
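The mail is sent as multipart/alternative: the raw Markdown report as the plain-text part and the rendered HTML as the alternative. The sketch below shows only that message structure, without any SMTP connection, so it can be run as a dry check. The helper `build_message` and the example addresses are hypothetical and not part of the script.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, plain_text, html_body):
    """Build a multipart/alternative mail: plain text first, HTML second."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The first call sets a single text/plain body ...
    msg.set_content(plain_text, subtype="plain", charset="utf-8")
    # ... and add_alternative turns the message into multipart/alternative.
    msg.add_alternative(html_body, subtype="html", charset="utf-8")
    return msg


msg = build_message(
    "from@example.org",
    "to@example.org",
    "Dry run",
    "# Homelab Operations Report\n",
    "<h1>Homelab Operations Report</h1>",
)
part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type(), part_types)
```

Mail clients that cannot or will not render HTML fall back to the first part, which is why the unmodified Markdown is kept as the plain-text body.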
@@ -0,0 +1,105 @@
#!/usr/bin/env bash
# test-log-noise-filter.sh
#
# Verifies that the log-noise filtering pipeline used by collect_log_highlights
# behaves correctly when the pattern file contains comments, empty lines and
# trailing whitespace, and that unknown error lines remain visible in attention.
#
# Run from anywhere:
# bash services/posture-check/tests/test-log-noise-filter.sh
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
NORMALIZE="$SCRIPT_DIR/../lib/normalize-noise-patterns.sh"
if [ ! -x "$NORMALIZE" ]; then
  echo "FAIL: normalize helper not executable at $NORMALIZE" >&2
  exit 1
fi
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
# Pattern file with: comment, empty line, whitespace-only line, real patterns
# with leading and trailing whitespace, and an indented comment.
# Written with printf so the empty and whitespace-only lines stay explicit.
printf '%s\n' \
  '# this is a comment that must be dropped' \
  '' \
  '   ' \
  '  monitoring-loki.*context canceled  ' \
  'gitea.*user/login/openid.*403 Forbidden' \
  '  # another comment line that must be dropped' \
  'authelia.*Request timeout occurred.*status_code=408   ' \
  > "$tmp/patterns"
# Hits file: 3 known-noise lines (one per real pattern), 2 unknown-error
# lines that must remain in attention.
cat > "$tmp/hits" <<'EOF'
[monitoring-loki] caller=scheduler context canceled
[gitea] router: /user/login/openid 403 Forbidden
[authelia] Request timeout occurred status_code=408
[postgres] FATAL: connection refused for host backup-db
[traefik] error while serving request: tls handshake timeout
EOF
# Run the same pipeline collect_log_highlights uses.
"$NORMALIZE" "$tmp/patterns" > "$tmp/patterns.norm"
grep -Eaif "$tmp/patterns.norm" "$tmp/hits" > "$tmp/known" || true
sed -E 's/[[:space:]]+$//' "$tmp/known" > "$tmp/known.n"
sed -E 's/[[:space:]]+$//' "$tmp/hits" > "$tmp/hits.n"
grep -Fvxf "$tmp/known.n" "$tmp/hits.n" > "$tmp/attention" || true
fail() {
  echo "FAIL: $*" >&2
  echo "--- patterns.norm ---" >&2
  cat "$tmp/patterns.norm" >&2
  echo "--- known ---" >&2
  cat "$tmp/known" >&2
  echo "--- attention ---" >&2
  cat "$tmp/attention" >&2
  exit 1
}
# Test 1: normalize produced exactly 3 patterns (comments + empties dropped,
# whitespace trimmed).
norm_lines="$(wc -l < "$tmp/patterns.norm" | tr -d ' ')"
[ "$norm_lines" = "3" ] || fail "T1 expected 3 normalized patterns, got $norm_lines"
# Test 2: normalize output contains no comment lines.
if grep -q '^#' "$tmp/patterns.norm"; then
  fail "T2 normalized output still contains a comment line"
fi
# Test 3: empty / whitespace-only pattern lines must NOT match all hits.
# With 3 real patterns there must be exactly 3 known-noise lines (out of 5).
known_count="$(wc -l < "$tmp/known" | tr -d ' ')"
[ "$known_count" = "3" ] || fail "T3 expected 3 known-noise hits, got $known_count"
# Test 4: unknown error lines remain in attention (postgres + traefik).
att_count="$(wc -l < "$tmp/attention" | tr -d ' ')"
[ "$att_count" = "2" ] || fail "T4 expected 2 attention hits, got $att_count"
grep -q 'postgres.*FATAL' "$tmp/attention" || fail "T4 postgres line missing in attention"
grep -q 'traefik.*tls handshake' "$tmp/attention" || fail "T4 traefik line missing in attention"
# Test 5: regression guard for the worst case - a pattern file containing
# ONLY empty / comment / whitespace lines must produce an empty normalized
# output AND must not knock out all hits when used as input to grep -f.
printf '%s\n' \
  '# only comments and whitespace below' \
  '' \
  '   ' \
  '# nothing real' \
  > "$tmp/patterns.only_empty"
"$NORMALIZE" "$tmp/patterns.only_empty" > "$tmp/patterns.only_empty.norm"
empty_norm_lines="$(wc -l < "$tmp/patterns.only_empty.norm" | tr -d ' ')"
[ "$empty_norm_lines" = "0" ] || fail "T5 expected 0 normalized patterns from empty-only input, got $empty_norm_lines"
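# When the normalized file is empty, collect_log_highlights skips grep -f
# entirely. Confirm the file is empty, then confirm that grep -f on it would
# classify no hit as noise (an empty pattern file contains zero patterns).
if [ -s "$tmp/patterns.only_empty.norm" ]; then
  fail "T5 expected normalized file to be empty"
fi
empty_known="$(grep -Eai -c -f "$tmp/patterns.only_empty.norm" "$tmp/hits" || true)"
[ "$empty_known" = "0" ] || fail "T5 expected 0 known-noise hits from empty pattern file, got $empty_known"
echo "OK - all log-noise filter tests passed (5 tests)"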
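The two-step split that the test exercises (match known noise first, then subtract those exact lines from all hits) can be restated in a few lines of Python. This is a sketch for reading along, not the pipeline itself: `split_hits` is a hypothetical name, and Python's `re` only approximates `grep -E`.

```python
import re


def split_hits(hits, patterns):
    """Split hit lines into known noise and the attention set."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    # Step 1 (grep -Eaif): lines matched by at least one noise pattern.
    known = [h for h in hits if any(rx.search(h) for rx in compiled)]
    # Step 2 (grep -Fvxf): drop every hit that equals a known line exactly,
    # after trimming trailing whitespace as the sed calls do.
    known_set = {k.rstrip() for k in known}
    attention = [h.rstrip() for h in hits if h.rstrip() not in known_set]
    return known, attention


patterns = [
    r"monitoring-loki.*context canceled",
    r"gitea.*user/login/openid.*403 Forbidden",
    r"authelia.*Request timeout occurred.*status_code=408",
]
hits = [
    "[monitoring-loki] caller=scheduler context canceled",
    "[gitea] router: /user/login/openid 403 Forbidden",
    "[authelia] Request timeout occurred status_code=408",
    "[postgres] FATAL: connection refused for host backup-db",
    "[traefik] error while serving request: tls handshake timeout",
]
known, attention = split_hits(hits, patterns)
assert len(known) == 3
assert len(attention) == 2
```

With an empty pattern list the first step matches nothing, so every hit stays in the attention set, which is the behaviour Test 5 guards.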