Is node.js best for Comet?

Plurk Comet

At Plurk we process many millions of comet notifications per day, and so far we have processed billions of them. It's comet at a very large scale, and I think we are one of the biggest users of comet technology on the Internet. We use comet to deliver realtime updates to users so they can plurk in realtime with their friends.

Scaling this has been a challenge, and we have tried many different solutions.

Our node.js solution has served us well for the past 8 months and it has processed many millions of comet notifications each day. But sadly we began to have issues processing our notifications, and at peak load our comet queues grew really huge.

We tried a lot of optimizations to make node.js work - some of these were:

  • Upgrading node.js to 0.2.2 from 0.1.33. Version 0.2.2 had worse performance and memory leaks - or at least the memory consumption had changed a lot for the worse
  • Different code optimizations such as using optimized data structures
  • Rewriting it to use Redis as a data store
  • Using a proxy for channel writing

None of these worked, and we eventually got some new hardware that fixed the problem somewhat. But given that we are a startup with limited resources, throwing hardware at the problem isn't a solution we generally rely on.

So we revisited our old Netty solution, did some optimizations to it and rolled it out in production. This was a win: it used more memory, but the general performance was much better. One Netty server can currently handle around 6,000 comet notifications per second to around 10,000 clients. The node.js server could handle around 500 per second (so it was at least a 10x improvement).

Is node.js the best Comet solution?

I would argue that it is if you just want a working solution ASAP. If you want something that is "web-scale" then Java+java.nio is the way to go (or C/C++). Remember that node.js served us well for the past 8 months serving millions of comet notifications each day. Do also note that I implemented our node.js solution in two days - so it's pretty simple to get something going with node.

But V8 (the JavaScript VM that node.js uses) has a serious problem, and it is the following:

  • V8 does not support threads or processes - everything is handled by the main process, even garbage collection

This limitation is a smart choice given that V8 is mainly intended to be used in browsers, but it's not that good in server architectures. The problem with our node.js solution is that channel reads and writes are handled by the same process, and this generally means that channel writing gets slowed down by channel reading and garbage collection. Separating them is easy in our Java solution since we can just run the channel writer in its own thread, but with node.js we can't do this.
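
To make the idea of a separate channel writer concrete, here is a minimal sketch using only the Java standard library. It is not the actual Plurk or Netty code: the class name ChannelWriterSketch, the publish and shutdown methods, and the in-memory list that stands in for a real socket write are all assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: all names are illustrative, not production code.
class ChannelWriterSketch {
    // Unique sentinel object; compared by identity so no real message can match it.
    private static final String POISON = new String("stop");

    // Notifications handed over from the reading side to the writer thread.
    private final BlockingQueue<String> outbound = new LinkedBlockingQueue<>();
    // Stand-in for the real channel write (assumption: a socket in practice).
    private final List<String> delivered = new CopyOnWriteArrayList<>();
    private final Thread writer;

    ChannelWriterSketch() {
        writer = new Thread(this::drain, "channel-writer");
        writer.start();
    }

    // Called by the reading side; only enqueues, so reads are not blocked by writes.
    void publish(String notification) {
        outbound.add(notification);
    }

    // Runs on the dedicated writer thread until the sentinel arrives.
    private void drain() {
        try {
            while (true) {
                String next = outbound.take();
                if (next == POISON) {
                    return;
                }
                delivered.add(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops the writer after everything queued before this call has been written.
    void shutdown() {
        outbound.add(POISON);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        ChannelWriterSketch sketch = new ChannelWriterSketch();
        for (int i = 0; i < 3; i++) {
            sketch.publish("plurk-" + i);
        }
        sketch.shutdown();
        System.out.println(sketch.delivered());
    }
}
```

The reading side only ever touches the queue, so a slow write no longer delays the next read. A single-threaded event loop has no equivalent place to park the writer.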

As an end note, I am still impressed by node.js and its performance given how young the project is. I wanted to share our experience with it and with comet in general. I hope you find this useful in your comet quest :-)

Happy hacking.

© Amir Salihefendic