Runtime metrics
Metrics for running services, grouped by environment and instance.
Last updated: 2026-03-17
Runtime metrics show the live health of each environment and instance.
What you can filter by
- Environment (prod / staging / preview)
- Instance (latest or older)
- Status (running vs stopped)
What you can see
- Requests, latency, errors
- CPU and memory
- Egress and bandwidth
Reading runtime metrics effectively
- Start with error rate and latency to detect user impact first.
- Correlate with CPU/memory to decide scale vs code fix.
- Compare across environments (staging vs prod) after each deployment.
- Check instance-level divergence to detect one unhealthy replica.
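One way to spot instance-level divergence is to compare each instance's error rate against the median across all instances. The sketch below is illustrative only: the function name, the input shape (a dict of instance ID to error rate), and the `ratio` and `floor` thresholds are assumptions, not part of the product.

```python
from statistics import median


def find_divergent_instances(error_rates, ratio=3.0, floor=0.001):
    """Return instance IDs whose error rate is well above the fleet median.

    error_rates: dict of instance ID -> error rate as a fraction (0.02 means 2%).
    ratio: hypothetical multiplier; an instance is flagged above ratio * median.
    floor: minimum threshold, so a median of 0 does not flag tiny noise.
    """
    if not error_rates:
        return []
    baseline = median(error_rates.values())
    threshold = max(baseline * ratio, floor)
    return sorted(i for i, rate in error_rates.items() if rate > threshold)


# Hypothetical readings for three replicas of the same service.
rates = {"web-1": 0.002, "web-2": 0.003, "web-3": 0.041}
print(find_divergent_instances(rates))  # ['web-3']
```

Using the median rather than the mean keeps a single bad replica from dragging the baseline up and hiding itself.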
Suggested SLO starter targets
| Metric | Starter target |
| --- | --- |
| Availability | >= 99.9% monthly |
| p95 latency | <= 500 ms for critical APIs |
| 5xx error rate | < 1% sustained |
| Restart loops | 0 recurring loops |
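The starter targets above can be turned into a simple check. This is a minimal sketch: the `WindowMetrics` type and its field names are hypothetical, and the downtime calculation assumes a 30-day month, under which 99.9% availability leaves roughly 43.2 minutes of allowed downtime.

```python
from dataclasses import dataclass


@dataclass
class WindowMetrics:
    """Hypothetical summary of one measurement window."""
    availability_pct: float       # e.g. 99.95
    p95_latency_ms: float         # p95 latency for critical APIs
    error_rate_5xx_pct: float     # sustained 5xx rate, in percent
    recurring_restart_loops: int  # count of recurring restart loops


def check_starter_slos(m):
    """Return the names of the starter targets that the window violates."""
    violations = []
    if m.availability_pct < 99.9:
        violations.append("availability")
    if m.p95_latency_ms > 500:
        violations.append("p95 latency")
    if m.error_rate_5xx_pct >= 1.0:
        violations.append("5xx error rate")
    if m.recurring_restart_loops > 0:
        violations.append("restart loops")
    return violations


def downtime_budget_minutes(availability_pct, days=30):
    """Allowed downtime in minutes for a target, assuming a month of `days` days."""
    return days * 24 * 60 * (100 - availability_pct) / 100


window = WindowMetrics(99.95, 620.0, 0.4, 0)
print(check_starter_slos(window))                # ['p95 latency']
print(round(downtime_budget_minutes(99.9), 1))   # 43.2
```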
Troubleshooting
No data
- Confirm the instance is receiving traffic.
- Wait for at least one metrics sampling interval to pass.
Latency up but CPU normal
- Inspect downstream dependencies and network path (DB/API/DNS).
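The scale-versus-dependency decision can be sketched as a small triage function. Everything here is an assumption for illustration: the function name, the 1.5x latency factor, and the 80% saturation threshold are placeholders to tune for your own services.

```python
def triage_latency(p95_ms, baseline_p95_ms, cpu_pct, memory_pct,
                   latency_factor=1.5, saturation_pct=80.0):
    """Suggest where to look first when latency rises.

    latency_factor and saturation_pct are hypothetical thresholds.
    """
    # Latency is within the expected range of the baseline: nothing to do.
    if p95_ms <= baseline_p95_ms * latency_factor:
        return "ok"
    # Latency is up and the instance is saturated: scaling is a likely fix.
    if cpu_pct >= saturation_pct or memory_pct >= saturation_pct:
        return "resource-bound: consider scaling"
    # Latency is up but CPU and memory look normal: look outside the instance.
    return "check downstream dependencies and network path"


print(triage_latency(p95_ms=900, baseline_p95_ms=300, cpu_pct=35, memory_pct=50))
# check downstream dependencies and network path
```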
Errors up right after deploy
- Compare with the previous deployment and roll back if needed.
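A comparison between the previous and current deployment might look like the sketch below. The input shape (request and 5xx counts over equal-length windows) and the `min_requests` and `max_increase_pct` thresholds are hypothetical; the result is a prompt for a human decision, not an automatic rollback trigger.

```python
def error_rate_pct(window):
    """5xx error rate in percent for a window with 'requests' and 'errors_5xx' counts."""
    if window["requests"] == 0:
        return 0.0
    return 100 * window["errors_5xx"] / window["requests"]


def should_consider_rollback(previous, current, min_requests=500, max_increase_pct=0.5):
    """Return True when the new deployment's error rate rose noticeably.

    previous, current: counts from equal-length windows before and after the deploy.
    min_requests: hypothetical minimum traffic before the comparison is trusted.
    max_increase_pct: hypothetical allowed rise, in percentage points.
    """
    if current["requests"] < min_requests:
        return False  # not enough traffic yet to judge
    return error_rate_pct(current) - error_rate_pct(previous) > max_increase_pct


before = {"requests": 12000, "errors_5xx": 24}   # 0.2%
after = {"requests": 11500, "errors_5xx": 230}   # 2.0%
print(should_consider_rollback(before, after))   # True
```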