ExecSearch monitoring hub

Jump to dashboards and telemetry UIs for this environment.

Host suffix: search.dev.alexmeakin.uk

Quick links

Grafana

Dashboards, alerts, and Explore (metrics, logs, traces).

Open Grafana

Prometheus

Targets, ad-hoc queries, and scrape health.

Open Prometheus

Loki

Log store HTTP API (often used via Grafana Explore).

Open Loki

Tempo

Distributed tracing backend (query HTTP; traces in Grafana).

Open Tempo

OpenTelemetry Collector

Collector self-metrics in Prometheus format (pipeline, receiver, and exporter health).

View metrics

cAdvisor

Per-node container resource usage and health.

Open cAdvisor

MLflow

Experiment tracking UI for LLM and ML runs from the application.

Open MLflow

What each component does

Grafana
Primary operator UI: ExecSearch dashboards, datasource wiring to Prometheus, Loki, and Tempo, and ad-hoc analysis.
Prometheus
Time-series metrics database. Scrapes the OpenTelemetry Collector, Postgres exporter, cAdvisor, and Kubernetes targets defined in repo config.
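Scrape health can be checked from the Prometheus HTTP API directly. A minimal sketch, assuming Prometheus is served at a prometheus subdomain of the search.dev.alexmeakin.uk suffix noted above (the actual hostname may differ):

```shell
# Assumed hostname, derived from the host suffix above.
PROM="https://prometheus.search.dev.alexmeakin.uk"

# Instant query: scrape health per target (1 = up, 0 = down).
curl -sG "$PROM/api/v1/query" --data-urlencode 'query=up'

# List active scrape targets with their last-scrape status and error, if any.
curl -s "$PROM/api/v1/targets?state=active"
```

The same endpoints back Grafana's Prometheus datasource, so results here should match what Explore shows.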
Loki
Log aggregation. Promtail on each node ships container logs here for correlation with metrics and traces in Grafana.
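Loki can also be queried outside Grafana with LogQL over its HTTP API. A sketch, assuming a loki subdomain of the suffix above; the container label value is a placeholder, not a real workload name:

```shell
# Assumed hostname, derived from the host suffix above.
LOKI="https://loki.search.dev.alexmeakin.uk"

# Readiness probe.
curl -s "$LOKI/ready"

# Recent logs for one container (label value is hypothetical) via query_range;
# without explicit start/end, Loki defaults to roughly the last hour.
curl -sG "$LOKI/loki/api/v1/query_range" \
  --data-urlencode 'query={container="execsearch-api"}'
```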
Tempo
Trace backend. The .NET API and worker send OTLP spans via the collector; Tempo stores them for Grafana trace views.
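Individual traces can be fetched from Tempo by ID when you already have one (for example, from an application log line). A sketch, assuming a tempo subdomain of the suffix above; the trace ID shown is a placeholder:

```shell
# Assumed hostname, derived from the host suffix above.
TEMPO="https://tempo.search.dev.alexmeakin.uk"

# Readiness probe.
curl -s "$TEMPO/ready"

# Fetch one trace by its hex trace ID (placeholder value shown).
curl -s "$TEMPO/api/traces/0123456789abcdef0123456789abcdef"
```

For search and visualization, Grafana's Tempo datasource is the usual front end; the raw API is mostly for scripting and debugging.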
OpenTelemetry Collector
Ingests OTLP from applications (gRPC/HTTP), exports to Tempo and Prometheus, and exposes its own Prometheus metrics for pipeline health.
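The pipeline described above can be sketched as a minimal collector config. This is an illustration only, not the repo's actual config: endpoints, exporter names, and the Tempo service address are assumptions, and the prometheus exporter requires a collector distribution that includes it (e.g. contrib):

```yaml
# Sketch: OTLP in (gRPC + HTTP), traces out to Tempo, metrics out in
# Prometheus format, plus self-metrics for Prometheus to scrape.
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  otlp/tempo:
    endpoint: tempo:4317      # assumed in-cluster Tempo address
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889    # app metrics re-exposed for scraping
service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888   # collector self-metrics endpoint
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```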
cAdvisor
Container metrics per node (CPU, memory, filesystem). Prometheus scrapes these for infra dashboards.
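cAdvisor's metrics are most useful once Prometheus has scraped them, where they can be aggregated with PromQL. A sketch, reusing the assumed Prometheus hostname from above:

```shell
# Assumed hostname, derived from the host suffix above.
PROM="https://prometheus.search.dev.alexmeakin.uk"

# Per-container CPU usage rate over the last 5 minutes (a standard
# cAdvisor metric name).
curl -sG "$PROM/api/v1/query" \
  --data-urlencode 'query=rate(container_cpu_usage_seconds_total[5m])'
```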
MLflow
Tracking server for experiments, parameters, and metrics from optional MLflow client usage in the platform.
Postgres exporter (in-cluster only)
Exposes PostgreSQL health and stats to Prometheus at postgres-exporter.observability.svc.cluster.local:9187. No public ingress.
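With no public ingress, the exporter is reached through a port-forward. A sketch using the service address given above; cluster access and namespace permissions are assumed:

```shell
# Forward the in-cluster service to localhost (runs in the background).
kubectl -n observability port-forward svc/postgres-exporter 9187:9187 &
sleep 2

# pg_up is 1 when the exporter can reach PostgreSQL.
curl -s localhost:9187/metrics | grep '^pg_up'
```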
Promtail (in-cluster only)
DaemonSet that tails node container logs and pushes to Loki. Health: promtail.observability.svc.cluster.local:9080.
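Promtail's health endpoint can be checked the same way. A sketch: port-forwarding the DaemonSet resolves to one of its pods on recent kubectl versions; if that fails on an older client, target a pod directly:

```shell
# Forward one Promtail pod's health port to localhost.
kubectl -n observability port-forward ds/promtail 9080:9080 &
sleep 2

# Returns HTTP 200 once Promtail is tailing and shipping logs.
curl -s -o /dev/null -w '%{http_code}\n' localhost:9080/ready

# Delivery counters: entries successfully sent to Loki.
curl -s localhost:9080/metrics | grep promtail_sent_entries_total
```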