Monitoring & Observability
Observability is a core investment area. Distributed tracing is in place today, with real-time metrics, centralized logging, and proactive alerting on the near-term roadmap.
What's in Place Today
- Distributed tracing — OpenTelemetry instrumentation with SigNoz backend, providing end-to-end request tracing across services
- Health checks — Kubernetes liveness and startup probes on all services via /health/ready endpoints
- Infrastructure monitoring — Azure-managed monitoring for Kubernetes, PostgreSQL, and Redis
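To illustrate how a single request can be followed across services, here is a minimal sketch of trace context propagation using the W3C `traceparent` header format (`version-traceid-parentid-flags`). In practice the OpenTelemetry SDK typically handles this automatically; the function names `new_traceparent` and `child_traceparent` are hypothetical and exist only for this illustration.

```python
import re
import secrets

# W3C Trace Context header layout: version-traceid-parentid-flags.
# This sketch only accepts version "00".
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace, e.g. at the first service that sees a request."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def child_traceparent(incoming: str) -> str:
    """Build the header a service sends downstream.

    The trace ID is kept so every hop belongs to the same trace;
    the span ID is replaced so this hop is identifiable on its own.
    """
    match = _TRACEPARENT.match(incoming.strip())
    if match is None or match.group(1) == "0" * 32:
        # Malformed or all-zero trace ID: fall back to a fresh trace.
        return new_traceparent()
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

A downstream service would call `child_traceparent(request_headers["traceparent"])` before making its own outbound calls, so the tracing backend can stitch all spans into one end-to-end trace.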
Near-Term Roadmap
The following capabilities are under active development:
- Real-time metrics dashboards — Application and infrastructure health monitored continuously across all environments
- Centralized logging — Searchable logs across all services with request-level correlation
- Proactive alerting — Automated alerts for anomalies, threshold breaches, and error spikes
- External uptime monitoring — Synthetic checks from multiple geographic locations, independent of the platform itself
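As a rough illustration of request-level correlation in logs, the sketch below attaches a request ID to every log line emitted while a request is being handled, using only the Python standard library. It is not a description of the planned implementation; the names `request_id_var`, `RequestIdFilter`, `build_logger`, `handle_request`, and `demo`, as well as the log format, are assumptions made for this example.

```python
import contextvars
import io
import logging
import uuid

# Hypothetical per-request ID; "-" is used when no request is active.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto each log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True  # never drop the record, only annotate it


def build_logger(stream) -> logging.Logger:
    """Create a logger whose output includes the request ID on every line."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s request_id=%(request_id)s %(message)s")
    )
    handler.addFilter(RequestIdFilter())
    logger = logging.getLogger("correlation_demo")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, request_id=None) -> None:
    """Simulate one request; all log lines inside share the same ID."""
    token = request_id_var.set(request_id or uuid.uuid4().hex)
    try:
        logger.info("request started")
        logger.info("request finished")
    finally:
        request_id_var.reset(token)  # restore the previous value


def demo() -> list[str]:
    """Run one simulated request and return the captured log lines."""
    buffer = io.StringIO()
    handle_request(build_logger(buffer), "req-123")
    return buffer.getvalue().splitlines()
```

Searching a centralized log store for `request_id=req-123` would then return every line produced for that request, regardless of which code path emitted it.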
Integration Health
EHR integrations and external data flows are monitored alongside internal services:
- Sync status — Bidirectional EHR sync tracked per integration partner
- Message throughput — Inbound and outbound message volume monitored
- Error classification — Failed messages classified by type for targeted resolution
- Automatic retry — Temporary failures retry automatically; permanently failed messages are queued for review with full context preserved
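The classification and retry behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's actual implementation: the names `FailureKind`, `classify`, `ReviewItem`, and `Dispatcher`, the choice of which exceptions count as temporary, and the limit of three attempts are all hypothetical. The sketch also assumes that temporary failures which exhaust their retries are sent to the review queue.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class FailureKind(Enum):
    TRANSIENT = "transient"  # e.g. timeout or dropped connection; worth retrying
    PERMANENT = "permanent"  # e.g. invalid payload; retrying will not help


def classify(exc: Exception) -> FailureKind:
    """Assumed rule: network-style errors are temporary, all others permanent."""
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return FailureKind.TRANSIENT
    return FailureKind.PERMANENT


@dataclass
class ReviewItem:
    """Context kept for a person reviewing a failed message."""
    message: dict
    error: str
    kind: FailureKind
    attempts: int


@dataclass
class Dispatcher:
    send: Callable[[dict], None]  # hypothetical delivery function
    max_attempts: int = 3         # assumed retry limit
    review_queue: list = field(default_factory=list)

    def deliver(self, message: dict) -> bool:
        """Return True on success; otherwise queue the message for review."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.send(message)
                return True
            except Exception as exc:
                kind = classify(exc)
                if kind is FailureKind.TRANSIENT and attempt < self.max_attempts:
                    continue  # a real system would back off before retrying
                self.review_queue.append(
                    ReviewItem(message, repr(exc), kind, attempt)
                )
                return False
        return False
```

Keeping the original message, the error text, the failure kind, and the attempt count together in `ReviewItem` is one way to preserve the context a reviewer needs to resolve the failure.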