Fred van den Driessche

a is for atlas

Over the following weeks I’m going to be writing an A-Z of Atlas. In each post I’ll choose a subject related to Atlas and give a brief overview of it.

To start with, I’m going to talk about Atlas itself. It begins with A, so it seems like a good place to start, and it should give a solid background to the rest of the series.

what is atlas?

Atlas describes itself as “the global video and audio index”. It contains metadata about various aspects of video, including TV and film, and audio, including radio and music. For example, it holds data on broadcasts of TV shows and on where you can catch up with a radio show online. It doesn’t contain the video or audio itself, just data about it.

Atlas fetches its data from many different sources via adapters. Each adapter transforms and normalises its source’s data into the Atlas model. The normalised data is then stored and indexed in the persistent datastore, ready for access via Atlas’ HTTP resources.
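To make the adapter idea concrete, here’s a minimal sketch in Python. It isn’t Atlas code: the `Item` class, the two feed formats and the function names are all hypothetical, invented only to show two differently shaped sources being normalised into one common model.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical common model; not Atlas's actual schema."""
    source: str
    title: str
    media_type: str


def adapt_tv_feed(record: dict) -> Item:
    # Assumed source format: the TV feed calls the title "name".
    return Item(source="tv-feed", title=record["name"].strip(), media_type="video")


def adapt_radio_feed(record: dict) -> Item:
    # Assumed source format: the radio feed calls it "programme_title".
    return Item(
        source="radio-feed",
        title=record["programme_title"].strip(),
        media_type="audio",
    )


# Two differently shaped source records end up in the same model.
items = [
    adapt_tv_feed({"name": " Example Show "}),
    adapt_radio_feed({"programme_title": "Example Programme"}),
]
```

Once everything is an `Item`, the storage and indexing steps only need to understand one shape, whatever the source looked like.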

Data in Atlas is matched together where appropriate through a process called Equivalence. Through the Equivalence process Atlas can ingest sparse data about the same TV show from many sources, link the records together, and provide complete, detailed information about that show through its API, as the data flow diagram below shows.
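The effect of linking sparse records can be sketched in a few lines of Python. This is only an illustration under a big assumption: records are treated as equivalent when their titles match case-insensitively, which is a stand-in and not how Atlas decides equivalence. The function and field names are hypothetical too.

```python
def merge_equivalents(records):
    """Combine records that share a title, keeping the first non-empty value per field."""
    merged = {}
    for record in records:
        # Stand-in matching rule (an assumption): same title, ignoring case.
        key = record["title"].casefold()
        combined = merged.setdefault(key, {"sources": []})
        combined["sources"].append(record["source"])
        for field, value in record.items():
            if field != "source" and value is not None:
                # Only fill a field that no earlier source has supplied.
                combined.setdefault(field, value)
    return list(merged.values())


# Each source knows only part of the picture.
records = [
    {"source": "a", "title": "Example Show", "description": "A drama.", "year": None},
    {"source": "b", "title": "example show", "description": None, "year": 2012},
]
shows = merge_equivalents(records)
```

Here `shows` holds a single entry that carries the description from source `a` and the year from source `b`, along with a note of which sources contributed.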

[Figure: Atlas data flow]

As the data flow diagram above shows, Atlas provides a number of HTTP resources for accessing its data. During ingest it denormalises the fetched data so that the same information can be reached quickly in alternative ways. For example, broadcasts of TV episodes are transformed into a schedule, whilst the cast and crew for a show are denormalised into resources about each person, so you can quickly find all the shows in which your favourite actor has appeared.
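Here’s a rough Python sketch of that kind of denormalisation. The episode structure, field names and helper functions are assumptions made up for the example, not Atlas’s model: the point is only that one set of episode records can be reshaped at ingest time into a per-channel schedule and a per-person index.

```python
from collections import defaultdict

# Hypothetical ingested episodes, each carrying its own cast and broadcasts.
episodes = [
    {
        "title": "Episode 1",
        "cast": ["Alice Example", "Bob Example"],
        "broadcasts": [{"channel": "one", "start": "2012-01-02T20:00"}],
    },
    {
        "title": "Episode 2",
        "cast": ["Alice Example"],
        "broadcasts": [{"channel": "one", "start": "2012-01-01T20:00"}],
    },
]


def build_schedule(episodes):
    """Regroup broadcasts by channel, ordered by start time."""
    schedule = defaultdict(list)
    for episode in episodes:
        for broadcast in episode["broadcasts"]:
            schedule[broadcast["channel"]].append(
                (broadcast["start"], episode["title"])
            )
    for slots in schedule.values():
        # ISO 8601 strings in the same format sort chronologically.
        slots.sort()
    return dict(schedule)


def build_people_index(episodes):
    """Regroup cast lists by person, so each person maps to their episodes."""
    people = defaultdict(list)
    for episode in episodes:
        for name in episode["cast"]:
            people[name].append(episode["title"])
    return dict(people)


schedule = build_schedule(episodes)
people = build_people_index(episodes)
```

Looking up `schedule["one"]` or `people["Alice Example"]` is then a single dictionary access, rather than a scan over every episode. The work is done once at ingest so that reads stay cheap.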

what’s next

Over the rest of the series this overview of Atlas should be fleshed out with more detailed explorations of different parts of the system. Next week: B is for Broadcasts.
