As is often the case at MetaBroadcast, we’ve been improving and reworking some of our behind-the-scenes tools. In particular, we’ve been building an application that fetches updates to a Facebook page, starting with posts and status updates. This of course entailed working with the delights of the Facebook API.
where do we start?
Facebook is a gigantic mesh of interconnected pieces of information. Even looking at just one aspect, such as pages, there is still a vast amount of information out there. Fortunately, Facebook offers several ways to access it. There is a legacy REST API, along with a Graph API that presents everything as either objects (e.g. pages, people) or connections between them (e.g. comments and likes). There is also a query language called FQL (Facebook Query Language) that lets you query the same data with an SQL-like syntax.
The Graph API seemed a natural place to start, as it has a similar interface to many of our own APIs. Also, thanks to the /feed URL, you can easily access all the posts and status updates of a given page, which is exactly what we wanted. Simply requesting https://graph.facebook.com/[some page id]/feed returns the page’s posts going back in time, along with who made them and the number of likes and comments on each. Pagination is supported, and there is a handy ‘since’ parameter that lets you ask for updates after a certain time.
This all seems to fit nicely into the idea of polling a page regularly for updates. Simply query the page’s feed with an appropriate ‘since’ value, paginate through the result, and Bob’s your close family relation.
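The polling loop described above can be sketched roughly as follows. This is only an illustration, not our actual implementation: feed_url and poll_feed are made-up names, fetch_json is a hypothetical callable that performs the HTTP GET and returns the decoded JSON, and the "data" / "paging" / "next" keys are an assumption about the shape of the feed response.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def feed_url(page_id, since=None, access_token=None):
    """Build the /feed URL for a page, optionally restricted by 'since'."""
    params = {}
    if since is not None:
        params["since"] = int(since)  # assumed to be a Unix timestamp
    if access_token:
        params["access_token"] = access_token
    url = f"{GRAPH_BASE}/{page_id}/feed"
    return f"{url}?{urlencode(params)}" if params else url


def poll_feed(page_id, since, fetch_json):
    """Yield posts created after 'since', following pagination links.

    fetch_json is a hypothetical callable: URL in, decoded JSON dict out.
    The response keys used here are assumptions, not confirmed by the post.
    """
    url = feed_url(page_id, since=since)
    while url:
        body = fetch_json(url)
        posts = body.get("data", [])
        if not posts:
            break  # an empty page means there is nothing more to read
        yield from posts
        url = body.get("paging", {}).get("next")
```

Passing the fetcher in as an argument keeps the sketch independent of any particular HTTP library, and makes it easy to exercise with canned responses.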
One of the gifts/curses of Facebook, and hence their APIs, is the sheer number of fields. Fortunately, they have provided an API Explorer, which lets you specify exactly which fields you want to get back. The flip side of this flexibility is that you can end up with some pretty ridiculous URLs!
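To give a feel for how quickly those URLs grow, here is a small sketch that asks for a handful of fields on a page’s feed. It assumes the fields are passed as a comma-separated ‘fields’ query parameter; the function name and the particular field names are just illustrative choices.

```python
from urllib.parse import urlencode


def feed_url_with_fields(page_id, fields):
    """Build a /feed URL that requests only the named fields.

    Assumes a comma-separated 'fields' query parameter; commas are left
    unescaped so the resulting URL stays readable.
    """
    query = urlencode({"fields": ",".join(fields)}, safe=",")
    return f"https://graph.facebook.com/{page_id}/feed?{query}"


# Illustrative field names only; the real list can get much longer.
example_url = feed_url_with_fields(
    "123", ["id", "message", "created_time", "updated_time"]
)
```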
all good then?
One of the few downsides I’ve noticed when working with the Graph API is its inability to return updated posts: it only fetches posts created since the time specified in the ‘since’ parameter. If a post was created before you last polled the API, but has had a comment added since then, that post will not be returned. This is because posts have two times associated with them, a created_time and an updated_time. When a post is created, both are set to the time of creation. When someone comments on or likes that post, the updated_time is adjusted accordingly.
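A toy example makes the gap concrete. The posts and dates below are invented sample data, and the timestamp format is an assumption; the point is only to contrast filtering on created_time (which is what ‘since’ effectively gives you) with filtering on updated_time (which is what we actually wanted).

```python
from datetime import datetime, timezone


def parse_time(value):
    """Parse a timestamp such as '2011-05-01T09:00:00+0000' (assumed format)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")


def created_since(posts, since):
    """Posts created after 'since': what the 'since' parameter returns."""
    return [p for p in posts if parse_time(p["created_time"]) > since]


def updated_since(posts, since):
    """Posts created or touched after 'since': what we would like to get."""
    return [p for p in posts if parse_time(p["updated_time"]) > since]


# Invented sample data: "old" was commented on after our last poll.
sample_posts = [
    {"id": "old", "created_time": "2011-05-01T09:00:00+0000",
     "updated_time": "2011-05-03T12:00:00+0000"},
    {"id": "new", "created_time": "2011-05-03T10:00:00+0000",
     "updated_time": "2011-05-03T10:00:00+0000"},
]
last_poll = datetime(2011, 5, 2, tzinfo=timezone.utc)

seen = [p["id"] for p in created_since(sample_posts, last_poll)]
wanted = [p["id"] for p in updated_since(sample_posts, last_poll)]
```

Here seen contains only the new post, while wanted also includes the old post that picked up a comment after the last poll.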
Unfortunately, filtering by updated_time is not currently supported in the Graph API. It is, however, supported in the FQL API, which, naturally, uses entirely different syntax and requires somewhat different parameters. There is also a realtime update API, introduced last year by Facebook, that allows subscription to updates on various fields. You subscribe to a set of fields on a given page or person, and the API makes a POST request to a URL that you specify, telling you the ids of items that have changed. You can then look up the items in question and obtain the updated information. It remains to be seen whether this API will stand the test of time and continue to be supported and extended. At this stage it is also unclear which aspects of pages can be subscribed to, as the documentation is not very forthcoming on such matters.
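For comparison, an FQL query filtering on updated_time might look something like the sketch below. The table and column names (stream, source_id, post_id and so on) are my best understanding of the FQL schema rather than something verified here, and the helper name is made up; the request that actually submits the query is deliberately left out.

```python
def updated_posts_query(page_id, since):
    """Build an FQL query for posts on a page updated after 'since'.

    Table and column names are assumptions about the FQL schema. Both
    arguments are coerced to int so that nothing unexpected ends up in
    the query string.
    """
    return (
        "SELECT post_id, message, created_time, updated_time "
        "FROM stream "
        f"WHERE source_id = {int(page_id)} AND updated_time > {int(since)}"
    )


# Example with a made-up page id and a Unix timestamp for the last poll.
example_query = updated_posts_query("123", 1304294400)
```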
As always, if you have any thoughts, or a better idea for how to get posts and status updates from a page, leave us a comment and let us know!