Monitoring & Observability

Observability is a core investment area. Distributed tracing, health checks, and infrastructure monitoring are in place today; real-time metrics dashboards, centralized logging, proactive alerting, and external uptime monitoring are on the near-term roadmap.

What's in Place Today

  • Distributed tracing — OpenTelemetry instrumentation with SigNoz backend, providing end-to-end request tracing across services
  • Health checks — Kubernetes liveness and startup probes on all services via /health/ready endpoints
  • Infrastructure monitoring — Azure-managed monitoring for Kubernetes, PostgreSQL, and Redis
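
As a rough illustration of the health-check item above, here is a minimal sketch of a /health/ready endpoint that a Kubernetes probe could call. It uses only Python's standard library. The dependency registry, the handler name, the port, and the JSON response shape are assumptions made for this example, not a description of the platform's actual implementation.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical registry of dependency checks; each callable returns True when healthy.
DEPENDENCY_CHECKS = {
    "postgres": lambda: True,  # placeholder: swap in a real connectivity check
    "redis": lambda: True,     # placeholder: swap in a real connectivity check
}


def readiness_report(checks):
    """Run every check and return (http_status, body_dict).

    A check that raises is treated as unhealthy rather than crashing the probe.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    ok = all(results.values())
    body = {"status": "ok" if ok else "unavailable", "checks": results}
    return (200 if ok else 503), body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health/ready":
            self.send_error(404)
            return
        status, body = readiness_report(DEPENDENCY_CHECKS)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep frequent probe traffic out of stderr in this sketch


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for the example.
    ThreadingHTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

Returning 503 when any check fails lets the probe distinguish "process is up" from "process can serve traffic"; whether dependency checks belong in a given probe is a design decision this sketch does not settle.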

Near-Term Roadmap

The following capabilities are under active development:

  • Real-time metrics dashboards — Application and infrastructure health monitored continuously across all environments
  • Centralized logging — Searchable logs across all services with request-level correlation
  • Proactive alerting — Automated alerts for anomalies, threshold breaches, and error spikes
  • External uptime monitoring — Synthetic checks from multiple geographic locations, independent of the platform itself
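
To make "threshold breaches and error spikes" concrete, the following is one possible shape for such a rule: a sliding-window error-rate check. This is a sketch only. The class name, window size, threshold, and minimum sample count are hypothetical values chosen for illustration, since the roadmap does not specify how alerts will be evaluated.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ErrorRateAlert:
    """Fires when the error rate over the last `window` requests exceeds `threshold`."""

    window: int = 100        # number of most recent requests considered (assumed)
    threshold: float = 0.05  # alert above a 5% error rate (assumed)
    min_samples: int = 20    # stay quiet until enough data has been seen (assumed)

    def __post_init__(self):
        # Bounded deque: the oldest outcome is dropped once the window is full.
        self._outcomes = deque(maxlen=self.window)

    def record(self, is_error: bool) -> bool:
        """Record one request outcome; return True if the alert should fire."""
        self._outcomes.append(bool(is_error))
        if len(self._outcomes) < self.min_samples:
            return False
        error_rate = sum(self._outcomes) / len(self._outcomes)
        return error_rate > self.threshold
```

A caller would invoke `record()` once per completed request and route a `True` result to whatever notification channel is in use. The minimum sample count keeps a single early failure from triggering an alert on a nearly empty window.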

Integration Health

EHR integrations and external data flows are monitored alongside internal services:

  • Sync status — Bidirectional EHR sync tracked per integration partner
  • Message throughput — Inbound and outbound message volume monitored
  • Error classification — Failed messages classified by type for targeted resolution
  • Automatic retry — Temporary failures retry automatically; permanently failed messages are queued for review with full context preserved
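
The classification and retry behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the exception classes, the `classify` rules, the `Dispatcher` and `ReviewItem` names, the retry limit, and the backoff delays are all hypothetical, and treating a transient failure that exhausts its retries as needing review is a choice made for this example rather than something the source specifies.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a partner outage)."""


class PermanentError(Exception):
    """Hypothetical marker for failures that retrying cannot fix (e.g. a bad payload)."""


def classify(exc: Exception) -> str:
    """Map an exception to a failure type; the rules here are illustrative."""
    if isinstance(exc, (TransientError, TimeoutError, ConnectionError)):
        return "transient"
    return "permanent"


@dataclass
class ReviewItem:
    """A failed message kept with its context for manual review."""
    message: Any
    error_type: str
    error: str
    attempts: int


@dataclass
class Dispatcher:
    send: Callable[[Any], None]
    max_attempts: int = 3
    base_delay: float = 0.5  # seconds; doubled after each failed attempt
    sleep: Callable[[float], None] = time.sleep
    review_queue: list = field(default_factory=list)

    def deliver(self, message: Any) -> bool:
        """Try to send; retry transient failures, queue everything else for review."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.send(message)
                return True
            except Exception as exc:
                kind = classify(exc)
                if kind == "transient" and attempt < self.max_attempts:
                    self.sleep(self.base_delay * 2 ** (attempt - 1))
                    continue
                # Permanent failure, or retries exhausted: preserve full context.
                self.review_queue.append(ReviewItem(message, kind, repr(exc), attempt))
                return False
        return False
```

Injecting `sleep` keeps the backoff testable without real delays, and storing the original message alongside the error type and attempt count is what allows a reviewer to replay or correct it later.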