
Plurk Comet: Handling of 100,000+ open connections

Posted on 7. Jun 2009 · Comments [18]

Comet is a buzzword, just like Ajax, but a bit cooler. The core idea of comet is that the server can push data to clients as soon as new data arrives.

I think that comet is the next big thing, and this trend can be seen in an upcoming product like Google Wave, which uses comet heavily for real-time updates.

Plurk is growing fast and we are becoming one of the largest Python sites on the Internet. We serve many thousands of concurrent users per day (100,000+), so implementing comet for Plurk is a big challenge, and I have spent around a week fiddling with different solutions.

Here are some of the technologies I have tried:

  • Python Twisted: Non-blocking server in Python. Unfortunately it ate a lot of CPU and could not scale
  • Jetty: They claim to have good support for comet (and they do, if you don't serve 100,000 clients at once). The Jetty installation we ran ate around 2 GB of RAM with 10,000 active users, which is unacceptable for our needs
  • Apache Tomcat: Same as with Jetty: it eats tons of memory, even though it does support comet connections
  • Apache Mina: An NIO (non-blocking IO) framework which I used to build an HTTP server. Unfortunately Mina is very badly documented and it did not scale up in production

After trying these out and finding that they could not handle a massive load, I was about to give up. But then I stumbled upon the savior:

  • JBoss Netty: An NIO framework, written by one of Apache Mina's founders

Netty is not that well documented, but it's really well designed, and after some hacking around the performance is pretty amazing.

The bottom line

Using Netty we have comet running on 100,000+ open connections. This uses a few GB of memory and 20% of the CPU on a quad-core server. In other words, we have solved the C10k * 10 problem using non-blocking technology and some pretty impressive libraries (namely Java NIO and Netty).

A big kudos goes to Trustin Lee for his amazing work on Netty!

Labels: Code
18 comments so far

douban.com is still the largest (Pure) Python site.

Interesting. Could you elaborate more about your approach with Twisted?

cezio:
I used Twisted Web. When a request was received, it was either answered immediately if there was data to return, or stored with a timeout (the timeout was added via reactor.callLater).
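
The "answer now or park with a timeout" pattern can be sketched roughly as follows. This is a minimal illustration and not the Twisted code that was actually tried: it uses the standard library's asyncio in place of Twisted's reactor, and the names LongPollChannel, poll, publish and demo are hypothetical.

```python
import asyncio


class LongPollChannel:
    """Holds pending data for one user plus the requests parked on it."""

    def __init__(self):
        self.queued = []   # data that arrived while nobody was polling
        self.waiters = []  # futures for parked (long-polling) requests

    async def poll(self, timeout):
        # Data is already available: answer the request right away.
        if self.queued:
            messages, self.queued = self.queued, []
            return messages
        # Otherwise park the request until data arrives or the timeout fires.
        waiter = asyncio.get_running_loop().create_future()
        self.waiters.append(waiter)
        try:
            return await asyncio.wait_for(waiter, timeout)
        except asyncio.TimeoutError:
            return []  # empty response; the client is expected to poll again
        finally:
            if waiter in self.waiters:
                self.waiters.remove(waiter)

    def publish(self, message):
        # Wake every parked request, or queue the message if none is waiting.
        live = [w for w in self.waiters if not w.done()]
        self.waiters = []
        if not live:
            self.queued.append(message)
        for waiter in live:
            waiter.set_result([message])


async def demo():
    channel = LongPollChannel()
    channel.publish("hello")
    immediate = await channel.poll(timeout=1.0)    # data was waiting
    timed_out = await channel.poll(timeout=0.05)   # nothing arrives in time
    parked = asyncio.ensure_future(channel.poll(timeout=1.0))
    await asyncio.sleep(0.01)                      # let the request park
    channel.publish("world")                       # wakes the parked request
    return immediate, timed_out, await parked
```

Running asyncio.run(demo()) exercises all three paths: an immediate answer, a timeout with an empty response, and a parked request that is woken by new data.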

How does the Java code integrate with the rest of the application?
Also, what about Erlang, a server built around libevent, or even Stackless Python?

How many GBs of memory does 100k connections consume? We have a scenario where we'll have almost that number of clients (we're using XMPP instead).

Uriel:
Erlang would be a really good choice as well, since the language is optimized for these kinds of tasks. I would not recommend using Python for this (simply because Java with Java NIO or Erlang are much better choices).

nonane:
Currently we have 5 Java servers running on 5 different ports. Each server has 200 MB allocated, so currently it's 1 GB for around 30,000-40,000 connections.
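
As a back-of-the-envelope check, those figures can be turned into a per-connection cost and a rough projection for 100k connections. The sketch below assumes memory grows linearly with the connection count and that the allocated heap is fully used, neither of which is stated above; the function names are made up for illustration.

```python
def per_connection_kb(total_mb, connections):
    """Average memory per connection in KB."""
    return total_mb * 1024 / connections


def projected_mb(total_mb, connections, target_connections):
    """Linear projection of memory use at a different connection count."""
    return total_mb * target_connections / connections


total_mb = 5 * 200  # 5 servers with 200 MB each

# Best case: the 1000 MB is spread over 40,000 connections.
best_kb = per_connection_kb(total_mb, 40_000)
# Worst case: the same memory serves only 30,000 connections.
worst_kb = per_connection_kb(total_mb, 30_000)

best_100k = projected_mb(total_mb, 40_000, 100_000)
worst_100k = projected_mb(total_mb, 30_000, 100_000)
```

Under these assumptions the cost is roughly 26 to 34 KB per connection, which projects to about 2.5 to 3.3 GB for 100k connections.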

Obviously an important thing to tune, if you need to maintain 100k simultaneous TCP connections, is (for BSD sockets API) the SO_SNDBUF and SO_RCVBUF values.

(and that's on top of any application-level buffers and data structures.)
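
For illustration, here is how those two options are set per socket from Python's standard socket module. The 4096-byte values and the helper name set_socket_buffers are assumptions, not settings anyone in this thread reported using. Note that Linux doubles the requested value to leave room for bookkeeping and enforces a minimum, so the value read back is usually not the value that was set.

```python
import socket


def set_socket_buffers(sock, send_bytes=4096, recv_bytes=4096):
    """Request kernel send/receive buffer sizes and return what was granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_bytes)
    # Read the values back: the kernel may round, double or clamp them.
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_send, granted_recv


def demo_buffers():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        return set_socket_buffers(sock)
    finally:
        sock.close()
```

Multiplying the granted sizes by the number of open connections gives a rough upper bound on the kernel memory the buffers alone can claim.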

James:
Our settings in /etc/sysctl.conf (Linux):

# General gigabit tuning:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_syncookies = 1
# this gives the kernel more memory for tcp
# which you need with many (100k+) open socket connections
net.ipv4.tcp_mem = 50576   64768   98152
net.core.netdev_max_backlog = 2500

One also needs to increase the limit on open files (the system-wide limit is usually found in /proc/sys/fs/file-max). The settings are taken from A Million-user Comet Application with Mochiweb, Part 1, which is worth reading if you are interested in doing a comet implementation.
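
Besides the system-wide file-max, each process also has its own open-file limit (what ulimit -n reports), and every open connection counts against it. A minimal sketch for checking that limit from inside a Python process on a Unix system follows; the helper names and the headroom of 1024 descriptors are assumptions chosen for illustration.

```python
import resource


def fd_limits():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_hold(connections, headroom=1024):
    """True if the soft limit leaves room for this many sockets plus headroom
    for log files, listening sockets and other descriptors."""
    soft, _hard = fd_limits()
    return soft == resource.RLIM_INFINITY or soft >= connections + headroom
```

If can_hold(100_000) returns False, the per-process limit needs raising in addition to file-max before the server can accept that many connections.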

Don't you run out of sockets after 65k clients?

Did you try any Erlang server? like Mochiweb or YAWS? They are advertised to be quite solid.

@RiX0R yeah I always wondered about that too.

RiX0R:
Not if your file-max is set to the following (or a higher value):

$ cat /proc/sys/fs/file-max
262140

I think Glassfish also has a comet implementation based on Grizzly.

RiX0R, Matt, amix:

Yes, you do. A connection is a unique (ip, localport) tuple, and thus it's hard to get more than 2^16 connections, afaik.

It is mentioned in http://www.metabrew.com/articl... under the subtitle "Turning it up to 1 Million" (as amix linked to in a previous comment). Here the author is using 17 ips to get to a million.

Spand:
Creating one million connections from one host and managing one million connections from one host are two different problems, AFAIK.
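
The difference can be illustrated with a small sketch. TCP identifies a connection by the full (source IP, source port, destination IP, destination port) 4-tuple, so a load generator talking to one server address is bounded by its own source ports per IP, while a server facing many distinct clients is not. The figure of 64511 usable ports per IP (65535 minus the first 1024) and the example addresses are assumptions; the real ephemeral port range depends on the client's configuration.

```python
import ipaddress


def max_outbound_connections(source_ips, ports_per_ip=64511):
    """Upper bound on connections one host can open to a single server
    ip:port, given how many source IPs it owns."""
    return source_ips * ports_per_ip


def distinct_server_connections(client_count, server=("10.0.0.1", 80)):
    """Count unique 4-tuples when every client has its own IP but all use
    the same source port and connect to the same server ip:port."""
    first_client = ipaddress.IPv4Address("10.1.0.0")
    connections = {
        (str(first_client + i), 50000, server[0], server[1])
        for i in range(client_count)
    }
    return len(connections)
```

With one source IP the generator tops out at about 64k connections, and with 17 IPs the bound passes one million, which matches the approach in the linked article. On the server side, 100,000 clients give 100,000 distinct tuples even though they all hit the same port.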

Grizzly (the NIO framework Glassfish is built on) supports Comet.

Terminating 100k comet connections on a single server is possible, but you are going to have message rate scaling problems.

If every client sends a single request/response every 30s (which generally is the longest time you can go without sending an empty message to keep the connection open), then you are looking at 3333 responses per second - which is going to keep most servers pretty well busy - just for idle load before you do anything else.
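
The arithmetic behind that figure, as a small sketch (the function name is made up for illustration):

```python
def idle_responses_per_second(connections, poll_interval_s):
    """Responses per second needed just to keep idle long-polls alive,
    assuming each connection completes one request/response per interval."""
    return connections / poll_interval_s


rate_100k = idle_responses_per_second(100_000, 30)
rate_10k = idle_responses_per_second(10_000, 30)
```

Here rate_100k comes to roughly 3333 responses per second, while the same interval at 10k connections needs only about 333.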

For Jetty, we've not tried to optimize for beyond 10k simultaneous connections (although we have run 25k), because the number of applications with sufficiently low message rates per user is pretty low.

If you want 100k simultaneous users, then I really recommend multiple servers (for numerous reasons).

Note also that Jetty's default setup is to terminate a comet connection in the rich/safe/known servlet environment. Netty and other NIO wrappers can indeed terminate more connections, but they provide a more challenging application environment. So it is a bit of an apples vs oranges comparison.

It's a really informative post. I already knew about these technologies and have used them: Python Twisted, Jetty, Apache Tomcat, Apache Mina and JBoss Netty, and their performance is good too. Thanks for sharing such a nice post!

© 2000-2009 amix. Powered by Skeletonz.