Runtime metrics

Metrics for running services, grouped by environment and instance.

Last updated: 2026-03-17

Runtime metrics show the live health of each environment and of every instance running in it.

What you can filter by

Environment (prod / staging / preview)
Instance (latest or older)
Status (running vs stopped)

What you can see

Requests, latency, errors
CPU and memory
Egress and bandwidth

Reading runtime metrics effectively

Start with error rate and latency to detect user impact first.
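Correlate with CPU and memory to decide whether to scale or to fix code: saturated resources point to scaling, while normal usage points to a code or dependency problem.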
Compare across environments (staging vs prod) after each deployment.
Check instance-level divergence to detect one unhealthy replica.
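
The instance-level check above can be sketched in a few lines. This is a minimal illustration, not a platform feature: it assumes you have already exported per-instance error rates into a dictionary, and the instance IDs, the `factor` multiplier, and the `floor` value are all hypothetical.

```python
from statistics import median

def divergent_instances(error_rates, factor=3.0, floor=0.01):
    """Return instance IDs whose error rate is well above the fleet median.

    error_rates: dict mapping a (hypothetical) instance ID to its error rate, 0..1.
    factor, floor: illustrative thresholds, not platform defaults.
    """
    if not error_rates:
        return []
    baseline = median(error_rates.values())
    # Flag only instances above both an absolute floor and a multiple of the median,
    # so that tiny differences between healthy replicas are ignored.
    threshold = max(floor, baseline * factor)
    return sorted(i for i, rate in error_rates.items() if rate > threshold)

rates = {"web-1": 0.002, "web-2": 0.003, "web-3": 0.080}
print(divergent_instances(rates))  # ['web-3']
```

Comparing against the median rather than the mean keeps a single bad replica from pulling the baseline up and hiding itself.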

Suggested SLO starter targets

| Metric | Starter target |
| --- | --- |
| Availability | >= 99.9% monthly |
| p95 latency | <= 500 ms for critical APIs |
| 5xx error rate | < 1% sustained |
| Restart loops | 0 recurring loops |
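
To make the targets concrete, the sketch below checks one set of observed values against the starter targets and converts an availability target into a downtime budget. The dictionary keys and function names are hypothetical, the 30-day month is an assumption, and the "sustained" qualifier on the error rate is not modeled: the check looks at a single reading.

```python
# Starter targets copied from the table above; key names are illustrative.
TARGETS = {
    "availability_pct": 99.9,
    "p95_latency_ms": 500,
    "error_5xx_pct": 1.0,
    "restart_loops": 0,
}

def slo_breaches(observed):
    """Return the names of starter targets that the observed values miss."""
    breaches = []
    if observed["availability_pct"] < TARGETS["availability_pct"]:
        breaches.append("availability")
    if observed["p95_latency_ms"] > TARGETS["p95_latency_ms"]:
        breaches.append("p95 latency")
    # The target is strictly below 1%, so exactly 1% counts as a breach.
    if observed["error_5xx_pct"] >= TARGETS["error_5xx_pct"]:
        breaches.append("5xx error rate")
    if observed["restart_loops"] > TARGETS["restart_loops"]:
        breaches.append("restart loops")
    return breaches

def downtime_budget_minutes(availability_pct, days=30):
    """Minutes of allowed downtime in a period of `days` days."""
    return days * 24 * 60 * (1 - availability_pct / 100)

observed = {
    "availability_pct": 99.95,
    "p95_latency_ms": 620,
    "error_5xx_pct": 0.4,
    "restart_loops": 0,
}
print(slo_breaches(observed))                   # ['p95 latency']
print(round(downtime_budget_minutes(99.9), 1))  # 43.2
```

Under these assumptions, 99.9% monthly availability leaves roughly 43 minutes of downtime in a 30-day month.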

Troubleshooting

No data

Confirm the instance is receiving traffic.
Wait for the next metrics sampling interval to elapse.

Latency up but CPU normal

Inspect downstream dependencies and network path (DB/API/DNS).

Errors up right after deploy

Compare with the previous deployment and roll back if needed.
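
One way to make that comparison repeatable is a small rule over the error rates of the two deployments. The function below is a sketch under stated assumptions: the name `should_roll_back`, the 0.5 percentage-point minimum increase, and the 2x ratio are all hypothetical starting values rather than platform behavior, and both rates are assumed to be measured over comparable time windows.

```python
def should_roll_back(prev_error_pct, curr_error_pct,
                     min_increase_pct=0.5, ratio=2.0):
    """Suggest a rollback when errors rose sharply after a deploy.

    prev_error_pct, curr_error_pct: error rates in percent for the previous
    and current deployment, measured over comparable windows.
    min_increase_pct, ratio: illustrative thresholds to tune per service.
    """
    increase = curr_error_pct - prev_error_pct
    # Ignore small absolute changes, which are often just noise.
    if increase < min_increase_pct:
        return False
    # No errors before the deploy: any meaningful increase is a regression.
    if prev_error_pct == 0:
        return True
    return curr_error_pct / prev_error_pct >= ratio

print(should_roll_back(0.2, 1.4))  # True
print(should_roll_back(0.2, 0.3))  # False
```

Requiring both an absolute and a relative increase avoids rolling back on a jump from 0.01% to 0.03%, which triples the rate but is unlikely to affect users.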

See also