RED (Rate/Errors/Duration)
Focus is application performance.
Rate
- the rate at which your system receives requests, typically measured in requests per second
- can provide important context when monitoring performance or troubleshooting
Errors
- how many requests are ending in or encountering errors?
- is a specific call failing 100% of the time?
- do errors increase as the rate of traffic increases?
Duration
- length of time each request to your system takes
- request duration is critical to determining end-user experience and monitoring overall performance
- “slow is the new down”
- as page load time increases from 1 to 3 seconds, the likelihood of a user leaving increases by 32%
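As an illustration of the three RED signals, here is a minimal Python sketch that derives them from a batch of request records. The Request type, the red_summary function, and the choice of a nearest-rank 95th percentile for duration are assumptions made for this example; they are not part of the RED method itself, and a real setup would normally rely on a metrics library.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration: float  # seconds the request took
    ok: bool         # False if the request ended in an error


def red_summary(requests, window_seconds):
    """Summarise Rate, Errors and Duration for requests seen in one time window."""
    total = len(requests)
    if total == 0:
        return {"rate": 0.0, "error_ratio": 0.0, "p95_duration": 0.0}

    errors = sum(1 for r in requests if not r.ok)
    durations = sorted(r.duration for r in requests)

    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    rank = ceil(0.95 * total)
    p95 = durations[rank - 1]

    return {
        "rate": total / window_seconds,   # requests per second
        "error_ratio": errors / total,    # fraction of requests that failed
        "p95_duration": p95,              # seconds
    }
```

Reporting a high percentile rather than the mean is a deliberate choice in this sketch: a handful of very slow requests can hurt users while barely moving the average.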
USE (Utilization/Saturation/Errors)
Focus is system resources/infrastructure.
Utilization
- how much of a resource's capacity the system is using to process work, usually expressed as a percentage
- CPU, memory, network bandwidth, or even software metrics like process capacity and thread pools
Saturation
- amount of work that cannot be processed by the system due to a lack of available resources
- can be observed as, for example, queueing or increased latency
Errors
- just as errors can signal issues with your application, they can signal issues with your resources
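To make the three USE signals concrete, here is a minimal Python sketch for a single resource, a worker pool sampled at regular intervals. The PoolSample type, the use_summary function, and the idea of using average queue length as the saturation measure are assumptions made for this example; other resources (CPU, disk, network) would need their own measurements.

```python
from dataclasses import dataclass


@dataclass
class PoolSample:
    """Hypothetical point-in-time sample of a worker pool."""
    busy_workers: int  # workers processing a task at sample time
    queued_tasks: int  # tasks waiting because no worker was free
    errors: int        # resource-level errors since the previous sample


def use_summary(samples, capacity):
    """Summarise Utilization, Saturation and Errors for a pool with `capacity` workers."""
    if not samples:
        raise ValueError("at least one sample is required")
    if capacity <= 0:
        raise ValueError("capacity must be positive")

    n = len(samples)
    return {
        # Average fraction of the pool that was busy (0.0 to 1.0).
        "utilization": sum(s.busy_workers for s in samples) / (n * capacity),
        # Average number of tasks waiting for a free worker.
        "saturation": sum(s.queued_tasks for s in samples) / n,
        # Total errors observed across all samples.
        "errors": sum(s.errors for s in samples),
    }
```

In this sketch utilization can reach 100% while saturation is still zero; the queue only grows once more work arrives than the pool can take on, which is why the two are tracked separately.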