Post 019
Why Uptime Kuma
The TIG stack (Post 017) tells you how healthy your infrastructure is. Uptime Kuma tells you whether your services are actually reachable. Different questions, different tools.
Uptime Kuma runs lightweight HTTP/TCP/ping checks against every service endpoint and fires a Discord webhook the moment something goes down. It's the canary - fast, dumb, and reliable.
Deployment
Uptime Kuma runs on Node-C (Gozanti Cruiser) as a Docker container:
Host: Node-C (OptiPlex 7050)
IP: 192.168.20.61
Port: 3001
Access: https://uptime.tima.dev (via NPM)
docker run -d \
--name uptime-kuma \
-p 3001:3001 \
-v uptime-kuma:/app/data \
--restart unless-stopped \
louislam/uptime-kuma:1
Monitor Configuration
Every service in the Alliance Fleet gets a monitor. The configuration follows a pattern:
HTTP Monitors (Web Services)
| Service | URL | Interval | Expected |
|---|---|---|---|
| Grafana | http://192.168.20.40:3000 | 60s | 200 |
| Authentik | http://192.168.20.10:9000 | 60s | 200/302 |
| Portainer | https://192.168.20.10:9443 | 60s | 200 |
| n8n | http://192.168.20.50:5678 | 60s | 200 |
| Vaultwarden | http://192.168.20.51:80 | 60s | 200 |
| OpenWebUI | http://192.168.20.20:3000 | 60s | 200 |
| Homepage | http://192.168.20.60:3000 | 60s | 200 |
| Ghost Blog | https://holocron-labs.tima.dev | 300s | 200 |
| Portfolio | https://tima.dev | 300s | 200 |
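Each HTTP monitor boils down to "GET the URL, check the status code against the expected list." A rough stand-in for that check using only the Python standard library (the function name and structure are illustrative, not Uptime Kuma's internals); the demo probes a throwaway local server instead of the fleet IPs:

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def http_check(url: str, expected=(200,), timeout: float = 10.0) -> bool:
    """Return True if a GET on url yields an expected status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status in expected
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise; codes like 302 can still count as "up".
        return err.code in expected
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure

class _OKHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *_):
        pass  # keep the demo quiet

# Demo against a local throwaway server, not the real endpoints.
srv = HTTPServer(("127.0.0.1", 0), _OKHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
result = http_check(f"http://127.0.0.1:{srv.server_port}/")
srv.shutdown()
```

In Uptime Kuma itself this is just the HTTP(s) monitor type with "Accepted Status Codes" set per service; the sketch only shows what the check amounts to.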
TCP Monitors (Infrastructure)
| Service | Host | Port | Interval |
|---|---|---|---|
| InfluxDB | 192.168.20.41 | 8086 | 60s |
| PostgreSQL | 192.168.20.10 | 5432 | 60s |
| Redis | 192.168.20.10 | 6379 | 60s |
| Wazuh Manager | 192.168.20.30 | 1514 | 60s |
| Wazuh API | 192.168.20.30 | 55000 | 60s |
| Ollama | 192.168.20.20 | 11434 | 60s |
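A TCP monitor is even simpler: it only verifies that something accepts a connection on the port, saying nothing about whether the service behind it is healthy. A minimal sketch in stdlib Python (names are illustrative), demoed against a local listener rather than the fleet IPs:

```python
import socket

def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demo: open a local listener, probe it, then close it and probe again.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
while_up = tcp_check("127.0.0.1", port)     # listener accepting -> True
listener.close()
after_close = tcp_check("127.0.0.1", port)  # connection refused -> False
```

This is why the TCP monitors cover databases and APIs that don't speak HTTP: a refused connection on 5432 means PostgreSQL itself is gone, not just a web layer.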
Ping Monitors (Hosts)
| Host | IP | Interval |
|---|---|---|
| Node-A (Falcon) | 192.168.1.10 | 60s |
| Node-B (Corvette) | 192.168.1.11 | 60s |
| Node-C (Gozanti) | 192.168.1.12 | 60s |
| UDM Pro | 192.168.1.1 | 60s |
| AdGuard | 192.168.1.4 | 60s |
Discord Webhook Integration - Admiral Ackbar
Uptime Kuma sends alerts to the #uptime-beacon channel in the Alliance Fleet Discord server via webhook. The bot identity is Admiral Ackbar - consistent with the alert bot used by n8n and Wazuh.
Webhook Setup
In Discord: Server Settings → Integrations → Webhooks → New Webhook
Name: Admiral Ackbar
Channel: #uptime-beacon
Copy the webhook URL, then in Uptime Kuma → Settings → Notifications → Setup Notification:
Type: Discord
Webhook URL: (paste Discord webhook URL)
Bot Display Name: Admiral Ackbar
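Under the hood the webhook is just an HTTPS POST of JSON. Discord's webhook API accepts a `content` field plus an optional `username` that overrides the display name per message, which is how the Admiral Ackbar identity works. A sketch of the manual equivalent of a DOWN alert (the exact fields Uptime Kuma sends differ; this only mimics the format shown below):

```python
import json

# Placeholder - use the URL copied from Discord's webhook settings.
WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"

def down_alert(service: str, url: str) -> bytes:
    """Build a JSON body approximating the DOWN alert format."""
    payload = {
        "username": "Admiral Ackbar",  # per-message display name override
        "content": f"🔴 [DOWN] {service} - {url}",
    }
    return json.dumps(payload).encode("utf-8")

body = down_alert("Grafana", "http://192.168.20.40:3000")
# To actually send it:
#   req = urllib.request.Request(WEBHOOK_URL, data=body,
#                                headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
```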
Alert Format
When a service goes down, Admiral Ackbar posts:
🔴 [DOWN] Grafana - http://192.168.20.40:3000
Time: 2026-02-15 14:32:00 UTC
Duration: 0s
When it recovers:
🟢 [UP] Grafana - http://192.168.20.40:3000
Time: 2026-02-15 14:35:12 UTC
Duration: 3m 12s
IP Source-of-Truth Document
With 25+ monitors configured, keeping track of which IP belongs to which service becomes its own problem. I maintain an IP source-of-truth document that maps every service to its:
- Internal IP and port
- VLAN membership
- Host node
- NPM subdomain
- Monitor type in Uptime Kuma
This document serves double duty: it's the reference for configuring new monitors and the first thing I check during an incident to confirm I'm looking at the right endpoint.
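The post doesn't pin down a format for that document; as an illustration only, the entry for Uptime Kuma itself (values taken from the deployment section above, field names an assumed schema) might look like:

```python
# Illustrative record shape for the IP source-of-truth document.
# Values come from the deployment details above; the field names
# are an assumption, not a fixed schema.
UPTIME_KUMA_ENTRY = {
    "service": "Uptime Kuma",
    "ip": "192.168.20.61",
    "port": 3001,
    "vlan": 20,                      # inferred from 192.168.20.0/24
    "host_node": "Node-C (Gozanti)",
    "npm_subdomain": "uptime.tima.dev",
    "monitor_type": "HTTP",
}
```

One record per service, and both the monitor config and incident response read from the same place.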
Operational Value
Uptime Kuma catches three categories of issues:
- Service crashes - Container exits, application errors. Uptime Kuma detects it within 60 seconds and alerts via Discord.
- Network path failures - Inter-VLAN firewall rule changes, NPM misconfigurations, DNS issues. If the HTTP check fails but the TCP/ping check passes, the problem is at the application or proxy layer, not the network.
- Silent degradation - Services that respond with 200 but are actually broken (empty dashboards, login pages stuck in redirect loops). For these, Uptime Kuma's keyword matching feature checks that the response body contains expected content.
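The keyword idea is easy to see in miniature: the check passes only if the page loads *and* the body contains an expected string. A hedged stdlib sketch (not Uptime Kuma's implementation), demoed against a local server that plays the part of a "200 but broken" dashboard:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def keyword_check(url: str, keyword: str, timeout: float = 10.0) -> bool:
    """True only if the page returns 200 AND contains the keyword -
    catches '200 but broken' pages a bare status check misses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status == 200 and keyword in body
    except OSError:
        return False

class _EmptyDashboard(BaseHTTPRequestHandler):
    """Simulates silent degradation: HTTP 200 with an empty page."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"<html><body></body></html>")
    def log_message(self, *_):
        pass

srv = HTTPServer(("127.0.0.1", 0), _EmptyDashboard)
threading.Thread(target=srv.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{srv.server_port}/"
degraded = keyword_check(url, "Dashboard")  # 200, keyword absent -> False
sanity = keyword_check(url, "<html>")       # keyword present -> True
srv.shutdown()
```

A plain status-code monitor would call this page healthy; the keyword variant flags it.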
What Uptime Kuma Doesn't Do
Uptime Kuma is a binary up/down checker. It doesn't tell you:
- Why a service is slow (that's Grafana + TIG stack)
- What security events are happening (that's Wazuh)
- Whether the service is functionally correct (that's application-level testing)
It's intentionally simple. The value is in the speed and reliability of the alert - not the depth of the diagnosis.
Related: Post 017 - TIG Stack Observability | Post 024 - Discord as an Ops Console