Metrics
Rajesh, a teenager, became health conscious. He started a habit of walking and aims to cover 10,000 steps per day.
A metric is a numerical measurement of something.
Types of Metrics
Gauge
Rajesh's smartwatch shows his heart rate rising when he starts walking and falling when he slows down. It fluctuates most of the time.
A Gauge is a value that can go up as well as down. It simply reflects the value at the particular moment you record it.
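The heart-rate example can be sketched in a few lines of Python. This is only an illustration: the `Gauge` class and the readings below are made up for this article and do not come from any particular monitoring library.

```python
class Gauge:
    """Hypothetical minimal gauge: stores the latest value, which may rise or fall."""

    def __init__(self, name):
        self.name = name
        self.value = 0.0

    def set(self, value):
        # A gauge is simply overwritten with the current reading.
        self.value = float(value)


heart_rate = Gauge("heart_rate_bpm")
readings = []
for bpm in [72, 95, 118, 101, 80]:  # made-up smartwatch readings
    heart_rate.set(bpm)
    readings.append(heart_rate.value)

print(readings)          # the value goes up and then down again
print(heart_rate.value)  # only the latest reading is kept
```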
Counter
Rajesh's step count for a particular day can only be incremented. What he has covered stays covered. It can't go down at noon or in the evening.
On the next day, the step count is automatically reset to zero and starts again.
The Counter type is similar to that. It is a monotonically increasing value that never gets decremented. It can only be reset to zero.
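A minimal sketch of the step counter, assuming a hypothetical `Counter` class written for this article (real libraries expose their own counter types with their own APIs):

```python
class Counter:
    """Hypothetical minimal counter: can only increase, or be reset to zero."""

    def __init__(self, name):
        self.name = name
        self.value = 0

    def inc(self, amount=1):
        # Reject negative increments so the value stays monotonic.
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount

    def reset(self):
        # Happens, for example, at the start of a new day or on restart.
        self.value = 0


steps = Counter("steps_total")
steps.inc(1200)   # morning walk
steps.inc(800)    # evening walk
day_one_total = steps.value

steps.reset()     # next day starts from zero
steps.inc(500)
```

Here `day_one_total` is 2000, and after the reset the counter reads 500.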
rate()
Days passed. He joined the athletics team at his college and started running. Now he needs to monitor himself more closely, so he wants to check his step count for every five minutes of the day.
That is the rate of change of the step count, which is effectively his average speed over each five-minute interval.
rate() is closely tied to the counter type. If you want to derive how fast a counter's value is changing over a time window, you have to use a rate function.
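The idea behind a rate function can be sketched as follows. This is a simplified illustration, not the algorithm of any specific tool: the function names `increase` and `rate`, the sample data, and the rule "a drop in value means the counter was reset" are assumptions made for this sketch, and real monitoring systems may handle window boundaries and resets differently.

```python
def increase(samples):
    """Total increase across (timestamp_seconds, counter_value) samples.

    A drop in value is treated as a reset to zero, so the new value
    is counted as growth since the reset.
    """
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            total += curr  # counter restarted from zero
    return total


def rate(samples):
    """Average per-second rate of increase over the sampled window."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0
    return increase(samples) / elapsed


# Made-up step counter readings, one per minute over five minutes.
samples = [(0, 1000), (60, 1150), (120, 1300), (180, 1450), (240, 1600), (300, 1750)]
steps_per_second = rate(samples)  # 750 steps over 300 seconds

# The same idea when the counter resets in the middle of the window.
reset_samples = [(0, 9900), (60, 10000), (120, 50), (180, 200)]
steps_across_reset = increase(reset_samples)  # 100 + 50 + 150
```

Because the counter only ever grows, the difference between two readings is always meaningful, which is why rate functions are defined on counters rather than on gauges.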
Distribution
Rajesh pushed his boundaries and became part of the rally race team. The coach wanted to test everyone's performance for a few days before training for the rally. He noted down the time taken by each athlete for 20 runs daily over one week.
At first, the coach focused on the top 3 runs alone to judge performance. Later, he realized that the athletes would perform multiple runs in the competition, in both team and individual events, so he benchmarked the average time taken as well.
A Distribution covers the overall data with deeper information, from which we can derive totals, averages, percentiles and so on.
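As a rough illustration, here is what the coach can derive once all run times are kept. The run times are made up, and the `percentile` helper uses the simple nearest-rank method, which is only one of several ways tools compute percentiles.

```python
import statistics

# Made-up run times in seconds for one athlete (lower is better).
run_times = [62.1, 58.4, 60.3, 59.0, 61.7, 57.9, 63.2, 58.8, 60.0, 59.5]

total = sum(run_times)
average = statistics.mean(run_times)
median = statistics.median(run_times)
fastest, slowest = min(run_times), max(run_times)
top_3 = sorted(run_times)[:3]  # [57.9, 58.4, 58.8]


def percentile(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p% of n
    return ordered[max(rank, 1) - 1]


p90 = percentile(run_times, 90)  # 62.1: 90% of the runs took this long or less
```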
Before diving into the details of distributions, let's quickly catch up on how monitoring tools collect application metrics.
Application binaries should include the libraries or agents of the monitoring tool.
Once the application is started, agents will instrument the application.
Instrumented metrics will be either pushed by the agents to the monitoring system or pulled by the monitoring system from the agents in the application.
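The difference between push and pull can be sketched with two toy classes. Everything here (`Agent`, `MonitoringSystem`, and their method names) is hypothetical and runs in a single process. In real tools the two sides talk over the network.

```python
class Agent:
    """Hypothetical in-application agent that holds instrumented values."""

    def __init__(self):
        self.metrics = {}

    def record(self, name, value):
        self.metrics[name] = value

    def scrape(self):
        # Pull model: the monitoring system asks the agent for a snapshot.
        return dict(self.metrics)

    def push(self, send):
        # Push model: the agent itself sends a snapshot to the monitoring system.
        send(dict(self.metrics))


class MonitoringSystem:
    """Hypothetical monitoring backend that stores received snapshots."""

    def __init__(self):
        self.received = []

    def ingest(self, snapshot):
        self.received.append(snapshot)

    def pull_from(self, agent):
        self.ingest(agent.scrape())


agent = Agent()
backend = MonitoringSystem()

agent.record("steps_total", 2000)
agent.push(backend.ingest)   # push: initiated by the agent

agent.record("steps_total", 2500)
backend.pull_from(agent)     # pull: initiated by the monitoring system
```

Either way the monitoring system ends up with the same kind of snapshot. What differs is which side initiates the transfer.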
Coming back to Distribution, it involves aggregated values such as sum, average, median, minimum, maximum, percentiles and so on.
Once you have instrumented the application, the metric values can be aggregated in one of two places.
Distribution-Type-1 : Agent in the application (client side).
Distribution-Type-2 : Monitoring system (server side).
Each place where the aggregated data is derived has its own pros and cons.
Rajesh's coach tracked each individual's top 3 runs daily. He can easily tabulate one week of data for a 4-player team on one page. But he has to temporarily note down all the runs of each player so that he can arrive at the top 3.
The coach wants to evaluate the average run of each day as well. He calculated the average along with the top 3 runs.
- Calculating the values on the client side is an overhead for the application.
He would like to have the overall top 10 runs across players as well. Later, he realized that noting down all the information can provide different insights.
- Deriving the same values with a query in the monitoring system gives deeper information, but it adds overhead to the monitoring system.
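The trade-off can be illustrated with a small sketch. The player names, run times, and function names below are made up for illustration. The client side keeps only a per-player summary, while the server side keeps every run and answers questions later.

```python
# Made-up run times in seconds per player (lower is better).
runs = {
    "rajesh": [58.0, 58.2, 58.4, 58.6, 61.0],
    "arun": [59.0, 59.5, 60.0, 62.0, 63.0],
}


def client_side_summary(times):
    """Distribution-Type-1: aggregate in the application, ship only the summary."""
    ordered = sorted(times)
    return {"top_3": ordered[:3], "average": sum(times) / len(times)}


def overall_top(raw_runs, n):
    """Distribution-Type-2: keep every run, answer the question at query time."""
    all_runs = [(t, player) for player, times in raw_runs.items() for t in times]
    return sorted(all_runs)[:n]


summaries = {player: client_side_summary(times) for player, times in runs.items()}

# A new question arrives later: the overall top 4 runs across players.
top_4_from_raw = [t for t, _ in overall_top(runs, 4)]
top_4_from_summaries = sorted(
    t for summary in summaries.values() for t in summary["top_3"]
)[:4]

print(top_4_from_raw)        # [58.0, 58.2, 58.4, 58.6]
print(top_4_from_summaries)  # [58.0, 58.2, 58.4, 59.0], misses Rajesh's 4th run
```

The summaries are small and cheap to store, but they cannot answer a question that was not planned for. The raw data can, at the cost of storing and scanning every run.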
Many Keywords and Different Meanings
The types above are the basics for all monitoring tools. Some differences across tools are mentioned below.
- Counter has variants in the OpenTelemetry API spec. It is a direct type in Prometheus.
- Distribution covers Histogram (Distribution-Type-2) and Summary (Distribution-Type-1) in general. Those are individual types in Prometheus.
- Datadog accepts Histogram (Distribution-Type-1) and Distribution (Distribution-Type-2) as different types. Yes, Histogram in Datadog is the Summary of Prometheus.
- Likewise, New Relic, Dynatrace and all other monitoring tools follow their own metric types in their agents and monitoring systems.
This is much like data types in programming languages: they look similar, but the implementations differ from one language to another.
Conclusion
Stick to the base concepts.
Understand the definition of the metric type of one tool that you use.
Don't assume that the same metric type exists, or behaves the same way, in another tool.
The ultimate goal is to improve application maintenance using monitoring tools. To achieve that effectively, understand your monitoring tool better.
Observe your monitoring tool for some time before you observe your software.
Thanks for reading. If you liked this article, follow me on LinkedIn.