Martins Irbe

improving api errors on our dashboard

I recently updated the API errors widget on our Dashing dashboard, which displays error rates for our APIs. The widget now uses both Amazon CloudWatch and ElasticSearch, as together these tools provide all the important metrics about our services, which we can convert into a more user-friendly format.

before

Before these changes, we used Amazon CloudWatch to extract and display the total number of API errors for the past 15 minutes. We grouped errors into two categories: HTTP 4xx and HTTP 5xx. Over time, however, we noticed that this information isn’t very helpful, because it’s hard to judge the ELB error rate from an error count alone.

The API errors widget before changes.

after

In addition to Amazon CloudWatch, we now use ElasticSearch, which makes it easy to query the stored Nginx logs that hold the necessary information about every request our APIs serve. We can get that information from ElasticSearch by making an HTTP request with the ElasticSearch query passed as an HTTP query parameter. The best part is that we can make a specific request that matches all requests that returned HTTP 5xx for one of our APIs, and by adding the `search_type=count` query parameter we get back a simple response, because ElasticSearch performs all the counting behind the scenes.
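
As a rough illustration of such a count request, here is a minimal Python sketch. The host, the index name `nginx-logs`, the log field names `api` and `status`, and the API name `orders` are all assumptions made for the example and are not taken from our actual setup. The sketch only builds the URL and reads the count from a sample response body, so it runs without a cluster.

```python
import json
from urllib.parse import urlencode


def build_count_url(base_url, index, api_name, status_from=500, status_to=599):
    """Build a URI search that asks ElasticSearch only for a count of matches."""
    # Lucene query-string syntax. The field names `api` and `status` are
    # hypothetical; use whatever fields your Nginx log entries are indexed under.
    query = f"api:{api_name} AND status:[{status_from} TO {status_to}]"
    params = urlencode({"q": query, "search_type": "count"})
    return f"{base_url}/{index}/_search?{params}"


def extract_count(response_body):
    """Read the total number of matching documents from a search response."""
    return json.loads(response_body)["hits"]["total"]


# Hypothetical host, index and API name, for illustration only.
url = build_count_url("http://localhost:9200", "nginx-logs", "orders")

# A hand-written sample of the kind of body a count search returns:
# the total is filled in, while the list of hits stays empty.
sample_response = '{"took": 3, "hits": {"total": 42, "max_score": 0.0, "hits": []}}'
count = extract_count(sample_response)
```

Note that `search_type=count` belongs to the older ElasticSearch releases this post was written against. Later versions dropped it in favour of passing `size=0`, so the parameter may need changing on a newer cluster.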

We have configured the widget so that if any API error rate is above 2% the background changes to amber, and if any is above 8% it changes to red; otherwise it stays green. We display all alerts that are over 0.1%, so that we can see when an error rate is raised but still within limits.
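
The threshold rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not the widget's actual code: the function names are made up, and it assumes the error rate is the error count divided by a total request count that is fetched separately.

```python
AMBER_THRESHOLD = 2.0  # percent
RED_THRESHOLD = 8.0    # percent
DISPLAY_MINIMUM = 0.1  # percent


def error_rate(error_count, total_count):
    """Return the error rate as a percentage, or 0.0 when there is no traffic."""
    if total_count == 0:
        return 0.0
    return 100.0 * error_count / total_count


def widget_status(rates):
    """Pick the background colour from the worst error rate across all APIs."""
    worst = max(rates.values(), default=0.0)
    if worst > RED_THRESHOLD:
        return "red"
    if worst > AMBER_THRESHOLD:
        return "amber"
    return "green"


def visible_alerts(rates, minimum=DISPLAY_MINIMUM):
    """Keep only the APIs whose error rate is high enough to be listed."""
    return {api: rate for api, rate in rates.items() if rate > minimum}


# Hypothetical API names and counts, for illustration only.
rates = {
    "orders": error_rate(3, 1000),
    "search": error_rate(50, 1000),
    "users": error_rate(0, 500),
}
status = widget_status(rates)
alerts = visible_alerts(rates)
```

With these sample numbers the widget would turn amber because of `search`, and it would list `orders` as well, since that rate is above the display minimum while still below the amber threshold.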

The API errors widget after changes.

dashing on

There’s always something we can improve on our dashboard, especially to make it easier to spot something important without having to think much about it. We have changed most of our widgets to display a nice big number, whether the widget holds a count of errors, pull requests or another important metric. In this case, we have made the API errors widget much friendlier and easier to read.
