Back in November we wrote about Tellytopic, the first research prototype to come out of ABC-IP, a project we’re working on with BBC R&D. Over the last three months MetaBroadcast have made several major additions and improvements to Tellytopic, including topics based on what people are saying about programmes on Twitter. Tellytopic uses a wide range of content that we don’t have the rights to make public, but here’s an overview of what we’ve been up to.
new dynamic homepage
We have completely redesigned the homepage to be dynamic, constantly updating based on the recent schedule for the main BBC channels. Recently broadcast programmes are displayed based on the number of topics they have, and the top five most common subjects, people and places are displayed underneath.
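To make the homepage logic concrete, here is a minimal sketch in Python. It assumes that "displayed based on the number of topics" means ordering programmes by topic count, and the data shape (a title plus a list of topic name and type pairs) is hypothetical rather than Tellytopic's real model.

```python
from collections import Counter


def rank_programmes(programmes):
    """Order programmes by how many topics they have, most first (assumed ordering)."""
    return sorted(programmes, key=lambda p: len(p["topics"]), reverse=True)


def top_topics(programmes, topic_type, n=5):
    """Return the n most common topic names of one type across all programmes."""
    counts = Counter(
        name
        for programme in programmes
        for name, kind in programme["topics"]
        if kind == topic_type
    )
    return [name for name, _ in counts.most_common(n)]


# Hypothetical sample of recently broadcast programmes.
recent = [
    {"title": "Programme A", "topics": [("London", "place"), ("Economics", "subject")]},
    {"title": "Programme B", "topics": [("London", "place")]},
    {"title": "Programme C", "topics": [("London", "place"), ("Paris", "place"),
                                        ("Economics", "subject")]},
]

print([p["title"] for p in rank_programmes(recent)])
print(top_topics(recent, "place"))
```

Because the counts are recomputed from whatever was recently broadcast, the featured topics change as the schedule moves on.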
This new homepage means that Tellytopic is constantly changing and surfacing new programmes to users, with the topics featured changing as programmes change throughout the day.
topics extracted from twitter
We have also added the first iteration of a major new feature to Tellytopic: extracting and identifying topics from tweets (posts on the social networking service Twitter) about a programme. This required significant development effort to build a system that can process large amounts of data from Twitter in a timely manner, using Hadoop to run MapReduce jobs over the entire dataset of ingested tweets.
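For readers unfamiliar with the MapReduce model, the sketch below imitates its three phases (map, shuffle, reduce) in plain Python. It does not use the Hadoop API, and the record shape (a programme id paired with tweet text) and the function names are assumptions made for illustration only.

```python
from collections import defaultdict


def map_tweet(record):
    """Map phase: emit ((programme_id, word), 1) for every word in a tweet."""
    programme_id, text = record
    for word in text.lower().split():
        yield (programme_id, word), 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the counts for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over an in-memory dataset."""
    pairs = (pair for record in records for pair in map_tweet(record))
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


# Hypothetical ingested tweets, already attributed to a programme.
tweets = [("p1", "Polar bear cubs"), ("p1", "that polar bear"), ("p2", "polar opposite")]
print(run_job(tweets))
```

On a real cluster the map and reduce steps run in parallel across many machines, which is what makes processing the whole tweet dataset feasible.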
We ingest tweets we know are about a programme and then extract phrases from those tweets. These phrases are then matched to DBpedia. As topics added by the BBC are also matched to DBpedia, this means that we can then match between the topics a programme is about and the topics that people are talking about whilst watching a programme.
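The matching step can be sketched roughly as follows. This is a simplified illustration, not the actual pipeline: the phrase extraction is a plain word n-gram scan, and `DBPEDIA_LABELS` is a small hypothetical lookup table standing in for whatever DBpedia matching is really used.

```python
import re

# Hypothetical lookup from lower-cased label to DBpedia resource URI.
DBPEDIA_LABELS = {
    "david attenborough": "http://dbpedia.org/resource/David_Attenborough",
    "polar bear": "http://dbpedia.org/resource/Polar_bear",
    "arctic": "http://dbpedia.org/resource/Arctic",
}


def candidate_phrases(tweet, max_words=3):
    """Yield every run of 1 to max_words consecutive words in a tweet."""
    words = re.findall(r"[a-z0-9']+", tweet.lower())
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            yield " ".join(words[start:start + size])


def twitter_topics(tweets):
    """Return the set of DBpedia URIs matched by phrases in the tweets."""
    uris = set()
    for tweet in tweets:
        for phrase in candidate_phrases(tweet):
            uri = DBPEDIA_LABELS.get(phrase)
            if uri:
                uris.add(uri)
    return uris


def compare_topics(bbc_uris, tweet_uris):
    """Split topics into those shared by both sources and those unique to each."""
    return {
        "both": bbc_uris & tweet_uris,
        "bbc_only": bbc_uris - tweet_uris,
        "twitter_only": tweet_uris - bbc_uris,
    }


tweets = ["Loving David Attenborough and the polar bear cubs #frozenplanet"]
bbc = {DBPEDIA_LABELS["david attenborough"], DBPEDIA_LABELS["arctic"]}
print(compare_topics(bbc, twitter_topics(tweets)))
```

Because both sides resolve to the same DBpedia URIs, the comparison reduces to simple set operations.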
In this first version, topics from Twitter are displayed alongside the topics provided by the BBC. This provides an interesting juxtaposition between what the BBC says a programme is about and the things people are actually talking about when watching the programme. See if you can guess which programme this is:
programmes from co-occurring topics
Another major addition to Tellytopic is that we now calculate co-occurring topics. This allows us to identify which topics frequently appear together on programmes and display these combinations to the user on topic pages.
This can help the user to better understand the context of a topic, by introducing them to other topics that are significant to the topic they are currently looking at. By displaying a number of programmes that have both of the co-occurring topics, these combinations also provide another path for the user to discover programmes about a topic that they may not otherwise have found.
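One simple way to compute co-occurrence is to count, for every pair of topics, how many programmes carry both. The sketch below does that. The data shape, the function names and the `min_count` cut-off are assumptions for illustration, not a description of how Tellytopic does it.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(programme_topics):
    """Count how many programmes each unordered pair of topics shares."""
    counts = Counter()
    for topics in programme_topics.values():
        for pair in combinations(sorted(set(topics)), 2):
            counts[pair] += 1
    return counts


def cooccurring_with(topic, counts, min_count=2):
    """Return (partner, count) pairs that co-occur with topic at least min_count times."""
    partners = Counter()
    for (first, second), n in counts.items():
        if n < min_count:
            continue
        if first == topic:
            partners[second] = n
        elif second == topic:
            partners[first] = n
    return partners.most_common()


def programmes_with_both(programme_topics, first, second):
    """List the programme ids that carry both topics."""
    return sorted(
        pid for pid, topics in programme_topics.items()
        if first in topics and second in topics
    )


# Hypothetical programme-to-topics mapping.
data = {
    "p1": {"Arctic", "Polar bear", "Climate"},
    "p2": {"Arctic", "Polar bear"},
    "p3": {"Arctic", "Climate"},
    "p4": {"Cooking"},
}
counts = cooccurrence_counts(data)
print(cooccurring_with("Arctic", counts))
print(programmes_with_both(data, "Arctic", "Polar bear"))
```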
Weightings have been added to topics in Atlas, and topics in Tellytopic are now displayed in order of weighting. We have also included thresholds to ensure that topics below a certain weighting are not displayed. Together these mean that the topics shown in Tellytopic are more representative of programme content.
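As a small illustration of ordering and thresholding by weighting, consider the sketch below. The 0 to 1 weighting scale and the threshold value of 0.3 are assumptions. How weightings are actually represented in Atlas is not covered here.

```python
def visible_topics(weighted_topics, threshold=0.3):
    """Drop topics below the threshold and order the rest by weighting, highest first."""
    kept = [(name, weight) for name, weight in weighted_topics.items() if weight >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical weightings for one programme.
print(visible_topics({"Snow": 0.1, "Polar bear": 0.6, "Arctic": 0.9}))
```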
improvements to related topics
The algorithm that generates the related topics on topic pages has been rewritten to take into account both topic weightings and topic frequency, which has significantly improved the related topics displayed for a given topic. We now also display fewer related topics, showing only the most relevant as calculated by the new algorithm.
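The exact formula is not described here, so the following is only one plausible way to combine weightings and frequency. It is an assumed scoring, not the real algorithm. Each candidate's score is the sum of weighting products on shared programmes, divided by how many programmes the candidate appears on overall, so that ubiquitous topics are discounted.

```python
from collections import defaultdict


def related_topics(topic, programmes, limit=3):
    """Rank topics related to `topic` using an assumed weight-and-frequency score."""
    frequency = defaultdict(int)        # programmes each topic appears on
    shared_weight = defaultdict(float)  # summed weighting products on shared programmes
    for weights in programmes:
        for name in weights:
            frequency[name] += 1
        if topic in weights:
            for name, weight in weights.items():
                if name != topic:
                    shared_weight[name] += weights[topic] * weight
    scores = {name: total / frequency[name] for name, total in shared_weight.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:limit]]


# Hypothetical programmes, each mapping topic name to weighting.
programmes = [
    {"Arctic": 0.9, "Polar bear": 0.8, "News": 0.2},
    {"Arctic": 0.7, "Polar bear": 0.6},
    {"News": 0.9, "Politics": 0.8},
    {"News": 0.8, "Arctic": 0.3},
]
print(related_topics("Arctic", programmes))
```

The `limit` parameter mirrors the decision to show only the most relevant related topics.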
more data improvements
We’ve also been working to ensure that all data in Tellytopic is fully up to date and re-ingested on a regular basis. Stability of Atlas equivalence matching has also been improved for large datasets. Behind the scenes, prototypes like Tellytopic push the boundaries of our existing services, ensuring that we continually work to scale them to larger and larger data volumes and add the increasingly sophisticated functionality that allows us to support future products.
Do let us know what you think. As always, we welcome all comments and suggestions :)