Dragos Pitica

are you stressed? start to ingest

At MetaBroadcast we have applications called ingesters, which are designed to absorb and digest data that is useful to our clients and makes their lives easier. The data can come from a variety of sources, from a TV channel to an online video platform. The ingested data can then be used for several purposes, such as equivalence, comparison or simply finding things quickly.

as simple as it looks

The main objective of using ingesters is to make the data available through a simple RESTful API such as Atlas. To achieve that, the only ingredient we need to provide is the data. The data can be ingested from many sources, but the complexity of the process usually depends on the data provider.
We use Item as the standard model for the API; it represents any type of ingested data. Therefore, once we have the required data, we need to parse it and convert it to an Item. Using the right Atlas Maven dependency also gives us access to a GsonAtlasClient that will do the rest of the job for us.

we want to know everything

An effective tool produced by MetaBroadcast is Columbus Telescope. The main purpose of this tool is to gather monitoring information about the data ingested by our services and provide it to our clients. Sending data to Telescope involves three main steps.

  1. Start ingest
     The first step tells the telescope client that data is about to arrive. At this point, the telescope client creates a Task, which requires an Ingester object. This object includes information such as the name and key of the source, along with an Environment, which indicates where the data comes from.

  2. Create events
     Now that the telescope client knows data is about to arrive, we need to create events. An event is a representation of an ingested item. Each event includes, among other information, the Atlas ID, status, timestamp and raw data. When an item fails to be ingested, the event also includes an error message.

  3. End ingest
     The last step of the process simply signals that we have no more data to send. It also updates the task created in the first step with an end time.
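The three steps above can be pictured with a small in-memory sketch. This is not the real Columbus Telescope client: the class `TelescopeSketch`, its methods and its field names are hypothetical stand-ins chosen to mirror the description, and a real client would send this information to the Telescope service rather than keep it in a list.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a telescope client; not the real Columbus Telescope API.
final class TelescopeSketch {

    enum Status { SUCCESS, FAILURE }

    // One event per ingested item; errorMessage is only expected for failures.
    static final class Event {
        final String atlasId;
        final Status status;
        final Instant timestamp = Instant.now();
        final String rawData;
        final String errorMessage;

        Event(String atlasId, Status status, String rawData, String errorMessage) {
            this.atlasId = atlasId;
            this.status = status;
            this.rawData = rawData;
            this.errorMessage = errorMessage;
        }
    }

    // A task describes one ingest run for a given ingester and environment.
    static final class Task {
        final String ingesterName;
        final String ingesterKey;
        final String environment;
        final Instant startTime = Instant.now();
        final List<Event> events = new ArrayList<>();
        Instant endTime; // stays null until endIngest is called

        Task(String ingesterName, String ingesterKey, String environment) {
            this.ingesterName = ingesterName;
            this.ingesterKey = ingesterKey;
            this.environment = environment;
        }
    }

    // Step 1: announce that data is about to arrive.
    static Task startIngest(String name, String key, String environment) {
        return new Task(name, key, environment);
    }

    // Step 2: record one event for an ingested (or failed) item.
    static void createEvent(Task task, Event event) {
        if (task.endTime != null) {
            throw new IllegalStateException("ingest already ended");
        }
        task.events.add(event);
    }

    // Step 3: no more data to send; stamp the task with its end time.
    static void endIngest(Task task) {
        task.endTime = Instant.now();
    }
}
```

A run would call `startIngest` once, `createEvent` for every item (passing an error message when the item failed), and `endIngest` at the end.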


Now that we have the theory, let’s see some real results. This example will outline how to ingest a playlist from Dailymotion.

As mentioned above, the first step is to ingest the data. To do that we need an HTTP client to connect to the Dailymotion API and a URL to access it. This will result in a playlist.
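As a rough illustration of this step, the sketch below uses the JDK's built-in `java.net.http.HttpClient` to fetch a playlist's videos as raw JSON. The class name `DailymotionPlaylistFetcher`, the endpoint shape and the `fields` list are assumptions made for the example, not necessarily what the ingester itself uses.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper; the URL layout below is an assumption for illustration.
final class DailymotionPlaylistFetcher {

    // Builds the URL used to list the videos of a playlist.
    static URI playlistVideosUri(String playlistId) {
        return URI.create("https://api.dailymotion.com/playlist/"
                + playlistId + "/videos?fields=id,title,url");
    }

    // Performs the GET and returns the raw JSON body; parsing is left to the ingester's parser.
    static String fetchPlaylistJson(HttpClient client, String playlistId)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(playlistVideosUri(playlistId))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling `fetchPlaylistJson(HttpClient.newHttpClient(), playlistId)` would return the response body, which a parser can then turn into a playlist object.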

The code below will take a playlist ingested by the parser and convert each video to an Atlas item.
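A minimal sketch of such a converter might look like the following. The `Video` and `Item` classes here are simplified, hypothetical stand-ins: the real Atlas `Item` model carries many more fields, and the choice of which video fields map to which item fields is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical converter; Video and Item are simplified stand-ins, not the Atlas model.
final class PlaylistConverterSketch {

    // A video as it might come out of the playlist parser.
    static final class Video {
        final String id;
        final String title;
        final String url;

        Video(String id, String title, String url) {
            this.id = id;
            this.title = title;
            this.url = url;
        }
    }

    // A pared-down item: just enough fields to show the mapping.
    static final class Item {
        final String uri;
        final String title;
        final String publisher;

        Item(String uri, String title, String publisher) {
            this.uri = uri;
            this.title = title;
            this.publisher = publisher;
        }
    }

    // Maps one video to one item, using the video URL as the item's URI.
    static Item toItem(Video video) {
        return new Item(video.url, video.title, "dailymotion.com");
    }

    // Converts every video in the playlist, preserving order.
    static List<Item> toItems(List<Video> playlist) {
        List<Item> items = new ArrayList<>();
        for (Video video : playlist) {
            items.add(toItem(video));
        }
        return items;
    }
}
```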

To POST to Atlas, we take the list of Items returned by the converter and pass it to the method underneath.
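In the ingester this job is handled by the GsonAtlasClient. Its exact methods are not shown here, so the sketch below uses a plain JDK `HttpRequest` as a stand-in to show the general shape of a JSON POST for one item. The class `AtlasPostSketch`, the JSON field names `uri` and `title`, and the endpoint passed in by the caller are all assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical stand-in for the write call; not the GsonAtlasClient API.
final class AtlasPostSketch {

    // Wraps a value in quotes, escaping only backslashes and double quotes.
    // Control characters are not handled; a real client would use a JSON library.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serialises the two assumed fields of an item as a JSON object.
    static String toJson(String uri, String title) {
        return "{\"uri\":" + quote(uri) + ",\"title\":" + quote(title) + "}";
    }

    // Builds (but does not send) a POST request carrying the item as JSON.
    static HttpRequest buildPost(URI endpoint, String uri, String title) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(uri, title)))
                .build();
    }
}
```

The request can then be sent with `HttpClient.send`, once per item in the list.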

The Atlas API result will look like this.

POSTing to Telescope would follow the steps mentioned in the section above.

Also, there is a view of the Columbus Telescope front end.

Stay tuned for more posts about other popular ingesters!

