Fred van den Driessche

h is for historical data

Continuing the Atlas A-Z series, this article is brought to you by the letter H. H is for Historical Data.

Atlas aims to be the go-to place for all things media-metadata-related. It goes to great lengths to make sure that the information it dispenses is as up-to-date as possible. But sometimes it’s not only current data that’s important: things that were and things that have not yet come to pass can be just as relevant. To support these requirements Atlas stores a variety of historical data about its resources.

channeling the past

One such resource type where historical data is particularly important is the channel. Channels change surprisingly often. They start, they stop, they change number in a region, they change title and they change their images. Managing those changes to keep a schedule accurate requires a fair amount of both hindsight and foresight.

For channels, you can get access to historical data using the history annotation. As an example, let’s have a look at the channel which, at the time of writing, is titled Sky Movies Greats HD:

We get something like the following:

{
  "id": "cbkJ",
  "type": "channel",
  "uri": "",
  "start_date": "2008-09-16T00:00:00Z",
  "title": "Sky Movies Greats HD",
  "image": "",
  "history": [{
    "start_date": "2014-03-10T00:00:00Z",
    "title": "Sky Movies Greats HD",
    "image": ""
  }, {
    "start_date": "2014-04-14T00:00:00Z",
    "title": "Sky Movies Superheroes HD",
    "image": ""
  }, {
    "start_date": "2014-04-28T00:00:00Z",
    "title": "Sky Movies Greats HD",
    "image": ""
  }]
}

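From the above it’s straightforward to see that the channel’s name changes to Sky Movies Superheroes HD temporarily: the new title takes effect on 14 April 2014 and the channel reverts to Sky Movies Greats HD two weeks later, on 28 April.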
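As a rough illustration of how a client might read the history array, the Python sketch below looks up the title in effect on a given date. It assumes each history entry applies from its start_date until the next entry begins, which the example response suggests but does not state. The helper names parse_utc and title_at are made up for this sketch and are not part of Atlas.

```python
from datetime import datetime, timezone

# History entries copied from the example response above (images omitted).
HISTORY = [
    {"start_date": "2014-03-10T00:00:00Z", "title": "Sky Movies Greats HD"},
    {"start_date": "2014-04-14T00:00:00Z", "title": "Sky Movies Superheroes HD"},
    {"start_date": "2014-04-28T00:00:00Z", "title": "Sky Movies Greats HD"},
]


def parse_utc(timestamp):
    """Parse a timestamp such as 2014-04-14T00:00:00Z into an aware datetime."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def title_at(history, when):
    """Return the title in effect at `when`, or None if `when` precedes every entry.

    Assumption: an entry applies from its start_date until the next entry starts.
    """
    current = None
    for entry in sorted(history, key=lambda e: parse_utc(e["start_date"])):
        if parse_utc(entry["start_date"]) <= when:
            current = entry["title"]
        else:
            break
    return current


if __name__ == "__main__":
    print(title_at(HISTORY, datetime(2014, 4, 20, tzinfo=timezone.utc)))
```

Sorting the entries before scanning means the lookup does not depend on the order in which the API happens to return them.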

broadcasts and locations

Atlas also retains details about all the broadcasts and on-demand locations for a content item, even after broadcast is over or the location is no longer available. This means that historical schedules can be created and items marked as currently unavailable to view.
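To make that concrete, here is a minimal Python sketch of both uses: rebuilding a schedule for a past day from retained broadcasts, and checking whether an on-demand location can currently be viewed. The field names (transmission_time, transmission_end_time, availability_start, availability_end) are assumptions made for illustration, and the sample data is invented; check them against the responses you actually receive.

```python
from datetime import datetime, timezone


def parse_utc(timestamp):
    """Parse a timestamp such as 2014-04-14T20:00:00Z into an aware datetime."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def schedule_between(broadcasts, start, end):
    """Return broadcasts overlapping [start, end), ordered by transmission time."""
    selected = [
        b for b in broadcasts
        if parse_utc(b["transmission_time"]) < end
        and parse_utc(b["transmission_end_time"]) > start
    ]
    return sorted(selected, key=lambda b: parse_utc(b["transmission_time"]))


def is_available(location, now):
    """True if `now` falls inside the location's availability window."""
    starts = parse_utc(location["availability_start"])
    ends = parse_utc(location["availability_end"])
    return starts <= now < ends


# Invented sample data using the assumed field names.
BROADCASTS = [
    {"title": "Evening film", "transmission_time": "2014-04-14T20:00:00Z",
     "transmission_end_time": "2014-04-14T22:00:00Z"},
    {"title": "Morning film", "transmission_time": "2014-04-14T09:00:00Z",
     "transmission_end_time": "2014-04-14T11:00:00Z"},
    {"title": "Next day film", "transmission_time": "2014-04-15T09:00:00Z",
     "transmission_end_time": "2014-04-15T11:00:00Z"},
]
LOCATION = {"availability_start": "2014-04-14T22:00:00Z",
            "availability_end": "2014-04-21T22:00:00Z"}

if __name__ == "__main__":
    day_start = datetime(2014, 4, 14, tzinfo=timezone.utc)
    day_end = datetime(2014, 4, 15, tzinfo=timezone.utc)
    for broadcast in schedule_between(BROADCASTS, day_start, day_end):
        print(broadcast["transmission_time"], broadcast["title"])
    print(is_available(LOCATION, datetime.now(timezone.utc)))
```

Because expired locations are still returned, a client can list an item and label it as currently unavailable instead of dropping it from view.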

In fact, well-identified data is almost never removed from Atlas, but it may be updated in such a way that it’s automatically filtered from view. For example, if a broadcast is cancelled for some reason it won’t be removed. Instead it will be marked in such a way that it won’t appear on the API.
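This mark-instead-of-delete pattern is often called a soft delete. The Python sketch below shows the general idea only: the BroadcastStore class and its published flag are hypothetical and say nothing about how Atlas actually stores or marks its data.

```python
class BroadcastStore:
    """Toy in-memory store that hides cancelled broadcasts instead of deleting them."""

    def __init__(self):
        self._records = {}

    def save(self, broadcast_id, details):
        # New records are visible by default.
        self._records[broadcast_id] = {"details": details, "published": True}

    def cancel(self, broadcast_id):
        # The record stays in the store; only its visibility changes.
        self._records[broadcast_id]["published"] = False

    def visible(self):
        """What an API built on this store would return."""
        return [r["details"] for r in self._records.values() if r["published"]]

    def stored_count(self):
        """Number of records held, visible or not."""
        return len(self._records)


if __name__ == "__main__":
    store = BroadcastStore()
    store.save("b1", {"title": "Evening film"})
    store.save("b2", {"title": "Morning film"})
    store.cancel("b2")
    print(store.visible())
    print(store.stored_count())
```

Keeping the record means a cancellation can be reversed, or audited later, without re-ingesting the data.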

It’s worth noting that all this historical data is kept in-line, directly on the items and channels, meaning there’s no real cost to retrieving it beyond the data transfer. There are no additional database queries required to gather it. Obviously this can be a fair amount of data, and it’ll grow as time passes, so one of our aims for the v4 API is to provide as much control as possible over the data returned using filters and annotations.

If you have any comments or questions about historical data in Atlas please don’t hesitate to get in touch either below or via the Atlas discussion group.
