Tom McAdam

a sense of identity in atlas

What is today known as Atlas was born with a different name back in 2009: URIPlay. As that name suggests, it had, and continues to have, URIs at its heart. As we move to Atlas 4.0, though, we’re going to be creating IDs for content in Atlas, which will be used as the primary means of identification. We do this already in Voila, our recommendations and personalisation system, but moving them to Atlas means there will be consistent identifiers throughout our systems.

the state of play

If we look at a schedule query in Atlas 3.0 as an example, you’ll see that both channels and items being broadcast have URIs:

The URI for the item above can be used to query the piece of content from the content endpoint in Atlas: http://atlas.metabroadcast.com/3.0/content.json?uri=http://www.bbc.co.uk/programmes/b01sftv1. As you will have spotted, this is the BBC’s own URI for a broadcast of BBC Breakfast. This is great, since you can ask Atlas about a piece of content from another source using that source’s own URI to perform the lookup, and it will tell you all it knows about it. Through the power of equivalence, the one lookup will even tell you what other people say about that piece of content, too: from on-demand locations through to topics.
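As a rough illustration, here is how a client might build that 3.0 lookup URL in Python. This is only a sketch: the helper name content_lookup_url is made up for this post, and it percent-encodes the uri parameter, which assumes the endpoint accepts the encoded form as well as the raw one shown above. Any API key your account needs is left out.

```python
from urllib.parse import urlencode

# Content endpoint taken from the example URL in the post.
ATLAS_CONTENT_3 = "http://atlas.metabroadcast.com/3.0/content.json"


def content_lookup_url(source_uri: str) -> str:
    """Build a 3.0 content lookup URL for a source's own URI.

    Hypothetical helper: it only builds the URL and makes no request.
    """
    # urlencode percent-encodes ':' and '/' so the nested URI
    # cannot be confused with the outer query string.
    return f"{ATLAS_CONTENT_3}?{urlencode({'uri': source_uri})}"


if __name__ == "__main__":
    print(content_lookup_url("http://www.bbc.co.uk/programmes/b01sftv1"))
```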

Using URIs as keys for content is powerful for that reason. However, it does present some challenges. Firstly, many sources don’t have URIs for content, cool or otherwise. Secondly, URIs are difficult to use as keys in RESTful services: something like http://atlas.metabroadcast.com/4.0/content/http://www.bbc.co.uk/programmes/b01s9l5s is ambiguous when not encoded, and ugly if it is.
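To make the second problem concrete, the sketch below uses Python's standard urllib.parse to compare the path segments a server would see for the raw and the encoded form of that URL. The variable names are illustrative only; how Atlas itself routes such paths is not shown here.

```python
from urllib.parse import quote, unquote, urlsplit

BASE = "http://atlas.metabroadcast.com/4.0/content/"
source_uri = "http://www.bbc.co.uk/programmes/b01s9l5s"

# Raw: the nested URI's slashes become extra path segments.
raw = BASE + source_uri
# Encoded: safe="" forces '/' and ':' to be percent-encoded too.
encoded = BASE + quote(source_uri, safe="")

raw_segments = urlsplit(raw).path.split("/")
encoded_segments = urlsplit(encoded).path.split("/")

if __name__ == "__main__":
    print(raw_segments)      # the key is spread over several segments
    print(encoded_segments)  # the key is one (ugly) segment
    # Decoding the last segment recovers the original URI.
    print(unquote(encoded_segments[-1]) == source_uri)
```

In the raw form a router cannot tell where the resource path ends and the key begins; in the encoded form it can, at the cost of readability.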

adding ids

You may have noticed that as we’ve added new features such as topics and channels, we’ve been moving to more RESTful URIs, and to allow this we’ve been creating our own IDs for them. Our 4.0 endpoints are all much more RESTful from the get-go: take, for example, the schedule endpoint, where a single channel schedule will be available at /4.0/schedules/:id. Similarly, content will be available at /4.0/content/:id. Both of these IDs are Atlas-generated: for existing content we’ll be migrating identifiers from Voila, so they’ll remain the same.

playing with uris

We still think it’s incredibly important to be able to query using URIs, so we will continue to support this through the use of aliases. We’ve recently done some work to better support aliases, which we’ll be talking about in the near future here, but I’ll give you a taster:

We’ll be storing URIs as aliases with a namespace of uri, so a filter can be applied to the content resource using aliases.namespace and aliases.value to perform a lookup. For example, http://atlas.metabroadcast.com/4.0/content.json?aliases.namespace=uri&aliases.value=http://www.bbc.co.uk/programmes/b01s9l5s&annotations=extended_id will query for content with the URI http://www.bbc.co.uk/programmes/b01s9l5s.
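A minimal sketch of building that alias query from Python follows. The helper alias_lookup_url is hypothetical, the parameter names are the ones from the example above, and the value is percent-encoded on the assumption that the endpoint decodes it in the usual way. Since the 4.0 API is still changing, treat this as illustrative rather than definitive.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Content resource taken from the example URL in the post.
ATLAS_CONTENT_4 = "http://atlas.metabroadcast.com/4.0/content.json"


def alias_lookup_url(uri: str, annotations: str = "extended_id") -> str:
    """Build a 4.0 content query that filters on a 'uri' alias.

    Hypothetical helper: it only builds the URL and makes no request.
    """
    params = {
        "aliases.namespace": "uri",  # URIs are stored under this namespace
        "aliases.value": uri,        # the source's own URI
        "annotations": annotations,
    }
    return f"{ATLAS_CONTENT_4}?{urlencode(params)}"


if __name__ == "__main__":
    url = alias_lookup_url("http://www.bbc.co.uk/programmes/b01s9l5s")
    print(url)
    # Round-trip check: decoding the query gives back the parameters.
    print(parse_qs(urlsplit(url).query))
```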

what does this mean for you?

Over the coming weeks, we’re going to be making a few changes to support IDs in Atlas and to migrate existing IDs from Voila. The main one to be aware of is that we’ll be doing an initial import of IDs from Voila into Atlas (both 3.0 and 4.0 endpoints). This will result in an id field being output from Atlas 3.0 endpoints where there was none before, and the IDs changing in the 4.0 endpoints. Note that this is mainly for testing purposes and, as such, these IDs will not be stable and will change before the final go-live. Therefore, please don’t build dependencies around them. When we’re nearing the go-live of the IDs migration, we will remove all of the IDs from Atlas and, once again, there will be no id field output from the 3.0 API until we go live.

On the Voila front, there will be no impact: the IDs you use today will stay the same. You’ll benefit from being able to use the ID you get from a Voila API call in Atlas when the work is complete.

We’ll be keeping the Atlas mailing list updated with progress, and if you’ve any questions or comments please post there.
