denormalizing deer*
Back in July I introduced the new version of Atlas that we're currently working on. This post covers how we're going to keep giving users what they need through our API while maintaining performance, updating in near real time, and adding features.
keep it simple
We strive to provide the most useful data we can in simple ways to users of the Atlas API, as Jonathan mentioned in his recent post on our API design principles.
Let's take an example. You're building an EPG and want to display the title of a programme in the grid. You'll just take the episode title in the output of our schedule endpoint and display it in the grid, right? That's what should happen if we adhere to our design principles. However, that's not how the underlying data is modelled (not that you should need to care about that!). What needs to be displayed to a user depends on what's being broadcast: if it's an episode of a brand, then you'll likely want the brand's title rather than the episode's.
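As a rough sketch of that choice from the client's side, the function below prefers the brand's title when an episode carries a brand summary and falls back to the episode's own title otherwise. The dictionary shape and the "title" field names are assumptions made for illustration, not the exact structure of an Atlas response.

```python
def display_title(episode: dict) -> str:
    """Pick the title to show in an EPG grid cell.

    `episode` is a hypothetical, simplified item: a dict with a "title"
    and, optionally, a "brand_summary" dict that has its own "title".
    """
    brand = episode.get("brand_summary")
    # An episode of a brand is usually best labelled with the brand's title.
    if brand and brand.get("title"):
        return brand["title"]
    # One-off programmes (a film, say) just use their own title.
    return episode["title"]
```

With this in place, an episode that belongs to a brand shows the brand's name in the grid, while a standalone programme shows its own.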
We've already made life easier with on-the-fly de-normalizations using the brand_summary and series_summary annotations in our current version of Atlas, but this improvement in usability has come at a cost: we're doing a lot more legwork behind the scenes to service such requests.
the new (de)norm
Currently, when we handle a request for an episode that includes one of these annotations, we need to query for the series or brand too, then pull the summary data out of them to send back in the response. We wanted to right this with deer, so we're going to be changing the way we deal with these types of requests.
Instead of having to query the brand or series, we'll be persisting this de-normalized summary information in-line with the episodes themselves. That means we'll no longer need to go back to the database to get the brand and series if you ask for an episode of EastEnders.
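To make the difference between the two read paths concrete, here is a minimal sketch using in-memory stores that count their reads. Everything in it (the CountingStore class, the function names, the "brand_id" and "brand_summary" fields) is hypothetical and chosen for illustration; it is not how Atlas is actually implemented.

```python
class CountingStore:
    """A dict-backed stand-in for a database table that counts reads."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows[key]

    def put(self, key, value):
        self.rows[key] = value


def summarise(brand):
    """Reduce a brand to the small summary we want alongside each episode."""
    return {"id": brand["id"], "title": brand["title"]}


def read_on_the_fly(episodes, brands, episode_id):
    """Current approach: fetch the episode, then fetch its brand as well."""
    episode = dict(episodes.get(episode_id))
    episode["brand_summary"] = summarise(brands.get(episode["brand_id"]))
    return episode


def write_with_summary(episodes, brands, episode):
    """New approach: pay for the brand lookup once, when the episode is written."""
    stored = dict(episode)
    stored["brand_summary"] = summarise(brands.get(episode["brand_id"]))
    episodes.put(stored["id"], stored)


def read_inline(episodes, episode_id):
    """New approach: the summary is already stored with the episode."""
    return dict(episodes.get(episode_id))
```

In this sketch read_on_the_fly costs two reads per request (episode plus brand), whereas read_inline costs one, because the brand lookup has moved to write time.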
new challenges
Of course, this presents different problems. Because the summary data now lives in more than one place, we need to ensure that every copy is kept up to date whenever the brand or series it came from changes. We'll be doing this by being notified of changes through our queue, and updating de-normalized data when necessary.
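The update side might look something like the sketch below: a consumer takes brand-change messages off a queue and refreshes the in-line summary on each affected episode. The message shape, the deque standing in for a real message queue, and the linear scan over episodes are all simplifying assumptions; a real system would more likely look up a brand's episodes through an index.

```python
from collections import deque


def apply_brand_change(episodes, message):
    """Refresh the in-line brand summary on every episode of the changed brand.

    `message` is a hypothetical change notification: {"brand": {...}}.
    Returns the number of episodes that were updated.
    """
    brand = message["brand"]
    updated = 0
    for episode in episodes.values():
        if episode.get("brand_id") == brand["id"]:
            episode["brand_summary"] = {"id": brand["id"], "title": brand["title"]}
            updated += 1
    return updated


def drain(queue, episodes):
    """Process every pending change message, oldest first."""
    total = 0
    while queue:
        total += apply_brand_change(episodes, queue.popleft())
    return total
```

Episodes belonging to other brands are left untouched, so the cost of a change is proportional to the number of episodes that actually embed the changed summary.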
We've held off adding more annotations like brand_summary and series_summary until now, because we didn't want to keep increasing the cost of requests. However, this new approach paves the way for more de-normalizations to appear in the API wherever they'd be of general use to people.
As always, if you've any comments or questions please do get in touch via the Atlas mailing list.
*No deer were harmed in the production of this blog post
