Oliver Hall

queueing in a storm

We’ve been moving a lot of our internal data transfer across to queues for a while, and when we were sketching out how our new Storm- and Hadoop-based backend systems would work, one thing we agreed on was that data should ideally come in off queues. We’ve been testing this, and it seems to be working out pretty well, after some initial hitches. At some point, we then had the idea that we could use one of our queue sources to feed both our new stats platform and our existing systems.

how do you do that, then?

ActiveMQ is currently our queuing system of choice, with JMS our means of enqueueing/dequeueing. Given this, we pretty quickly worked out that what we needed was a Topic. Now, one thing we wanted to make sure of was that if a consumer went down (for a deployment, for example), the messages would hang around until the system came back up and could continue consuming. A quick search suggested that a Durable Topic would do the trick. Essentially, you create a subscriber with a specific name, and once created, it’ll queue up all of the messages sent to that topic, whether or not something is consuming.

so what went wrong?

Well, it turns out that Durable Topics have a rather troublesome caveat: only one consumer can be active on a given durable subscription at a time. That’s great until you decide that your single-threaded consumer application would be better if it had a pool of consumer threads, or you use something like Storm, which is multi-threaded by design. Then stuff don’ work. Which is rather annoying, given that pretty much any sensible application is likely to use multiple consumers, as otherwise you find your single consumer thread has died and your messages are backed up worse than the M25 at rush hour.
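To make the caveat concrete, here is a toy in-memory model of a durable subscription. It is only a sketch: the class and method names are made up for illustration and this is not the JMS or ActiveMQ API. It shows the two behaviours described above, namely that messages pile up while nobody is consuming, and that a second consumer trying to attach to the same subscription gets turned away. A real broker would signal the rejection in its own way.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of one durable subscription (hypothetical names, not a real broker API).
final class DurableSubscriptionModel {
    private final Queue<String> backlog = new ArrayDeque<>();
    private boolean consumerAttached = false;

    // Every published message is kept, whether or not a consumer is attached.
    void publish(String message) {
        backlog.add(message);
    }

    // Only one consumer may be attached at a time; a second attempt is rejected.
    void attach() {
        if (consumerAttached) {
            throw new IllegalStateException("subscription already has an active consumer");
        }
        consumerAttached = true;
    }

    // Models the consumer going away, for example during a deployment.
    void detach() {
        consumerAttached = false;
    }

    // Hands the attached consumer everything that queued up, in publish order.
    List<String> drain() {
        if (!consumerAttached) {
            throw new IllegalStateException("no consumer attached");
        }
        List<String> delivered = new ArrayList<>(backlog);
        backlog.clear();
        return delivered;
    }
}
```

In this model a thread pool would need every worker to call `attach()` on the same subscription, and all but the first would fail, which is roughly the wall we ran into.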

when in doubt, virtualise

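As it turns out, we’re not the first to come across this. ActiveMQ just so happens to have a nifty creation called Virtual Topics, designed for exactly this use case. Producers still publish to a topic, but each named consumer reads from its own ordinary queue fed by that topic, so messages still wait while a consumer is down and any number of threads can share the load. Indeed, if you’re using ActiveMQ, there seems little if any reason why you would choose a standard Durable Topic over a Virtual Topic. The latter offers all of the features of the former, with more flexibility. What’s not to like?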
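With ActiveMQ's default configuration, Virtual Topics work off a naming convention: producers send to a topic whose name starts with `VirtualTopic.`, and each logical consumer reads from a queue named `Consumer.<consumer>.VirtualTopic.<name>`. The small helper below just builds those names. It is a sketch assuming the default convention has not been changed on the broker, and the class name and the example names `stats` and `events` are hypothetical.

```java
// Builds destination names following ActiveMQ's default Virtual Topic convention.
// Assumes the broker's default naming has not been reconfigured.
final class VirtualTopicNames {
    private VirtualTopicNames() {}

    // The topic producers publish to, e.g. "VirtualTopic.events".
    static String topic(String name) {
        return "VirtualTopic." + name;
    }

    // The queue one logical consumer reads from,
    // e.g. "Consumer.stats.VirtualTopic.events".
    static String consumerQueue(String consumer, String name) {
        return "Consumer." + consumer + "." + topic(name);
    }
}
```

So a stats platform and an existing system would each get their own queue, say `VirtualTopicNames.consumerQueue("stats", "events")` and `VirtualTopicNames.consumerQueue("legacy", "events")`, and each could run as many consumer threads against its queue as it likes.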

So, to cut a long story short, that’s what we’ll be using from here on out. If you’re using ActiveMQ, and think you might have a similar use case, I would suggest you take a look at Virtual Topics.
