Configuring Apache and CherryPy to handle a Digg effect
This little guide gives some pointers on how to optimize Apache and CherryPy so they can handle a Digg effect, the traffic spike you get when a much bigger site links to you.
Yesterday I got around 16000 page views and 13000 unique visits on Orangoo Spell Check. Here is a graph of traffic per hour (only the first 17 hours are shown).
16000 page views isn't that much, but if your setup isn't configured properly, it is! Apache died twice before I figured out that my configuration was bad.

The system setup
I have the following shared hosting setup:
Memory: 160MB
Disk size: 6000MB
CPU: 160 Units
OS: Debian Woody
System: Xen 3
Not impressive, but it's actually OK. And it could handle a Digg effect, so that's cool.

Running CherryPy behind Apache
It is wise to run CherryPy behind Apache. Apache is lightning fast and well tested under huge loads. The CherryPy docs have a guide on how to set this up.

Apache 1 or Apache 2?
I first went with Apache 2, but I regretted that step. Apache 2 is more complex and harder to configure. If you don't use any specific Apache 2 features, I would advise going with Apache 1. Don't use pre-built versions of Apache; compile Apache yourself for best performance, enabling only the modules that you actually use.

Compiling Apache 1
Here is how I configured and built it:
./configure --enable-module=rewrite --enable-module=proxy \
  --disable-module=userdir --disable-module=auth \
  --disable-module=include --disable-module=cgi --disable-module=env
make
make install
If you use the CherryPy and proxy trick, you'll have to turn rewrite and proxy on. Notice that CGI is turned off.

Apache configuration
Basically you only need to adjust a few things to get Apache configured properly. Here are my main optimizations to httpd.conf:
MaxKeepAliveRequests 0
KeepAliveTimeout 15
MinSpareServers 15
MaxSpareServers 50
StartServers 15
MaxClients 256
MaxRequestsPerChild 0
Two excellent articles on Apache configuration explain these parameters in more detail.
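As a rough sanity check on MaxClients: it caps how many Apache child processes can serve requests at once, so in the worst case memory use is about MaxClients times the size of one child. The sketch below shows that arithmetic. The 2 MB per process is an assumed figure purely for illustration, not something measured on this setup, and the function names are made up for this example.

```python
def worst_case_memory_mb(max_clients, mb_per_process):
    """Memory needed if every allowed Apache child is alive at once."""
    return max_clients * mb_per_process


def max_clients_for_budget(budget_mb, mb_per_process):
    """Largest MaxClients whose worst case still fits in budget_mb."""
    return int(budget_mb // mb_per_process)


# Assumed figure for illustration only: measure your own Apache children
# (for example with ps or top) before trusting any number here.
ASSUMED_MB_PER_PROCESS = 2.0

print(worst_case_memory_mb(256, ASSUMED_MB_PER_PROCESS))    # 512.0
print(max_clients_for_budget(160, ASSUMED_MB_PER_PROCESS))  # 80
```

With the assumed 2 MB per child, 256 clients would need about 512 MB in the worst case, far more than the 160MB on this box. The real budget for Apache is smaller still, because CherryPy and the OS need their share too.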

CherryPy configuration
This is really easy. In your configuration set the following:
server.thread_pool: 100
server.socket_queue_size: 30
Basically the server is going to start 100 threads. The queue size means that if all threads are busy, CherryPy will queue up to 30 requests.

Monitoring your CherryPy server
How do you monitor your CherryPy server? Don't sleep, that's it! Nah, you use supervisor to monitor your CherryPy server. It's super easy to use and built in Python, but it has very little documentation. Luckily Titus Brown has written an excellent guide to get you started.
Notice: Supervisor can also be used to monitor any other process (Apache, MySQL etc.)

Unite your static files
I forgot an important part: unite your JavaScript and CSS files into two large files. My Apache server got 300,000 requests that day (even though I had united all my JS+CSS). I had 8 static files which I concatenated to 2. On 15000 visits that's around 100000 requests saved. Here is a little Python script that can do the dirty work for you:
import os

def minifyAndUnite(files, output_file):
    """Strip leading whitespace from every line and join all files into one."""
    full_text = []
    for path in files:
        with open(path, "r") as f:
            lines = [l.lstrip() for l in f]
        # Make sure one file's last line doesn't run into the next file's first
        if lines and not lines[-1].endswith("\n"):
            lines[-1] += "\n"
        full_text.extend(lines)
    with open(output_file, "w") as out:
        out.write("".join(full_text))

# What files?
css = ['static/main.css', 'googiespell/googiespell.css',
       'greybox/greybox.css']
js = ['googiespell/AmiJS.js', 'googiespell/cookiesupport.js',
      'googiespell/googiespell.js', 'greybox/greybox.js']

# Get full path
cwd = os.getcwd()
full_path_css = [os.path.join(cwd, fp) for fp in css]
full_path_js = [os.path.join(cwd, fp) for fp in js]

# Minify and store
minifyAndUnite(full_path_css, os.path.join(cwd, "static", "css_generated.css"))
minifyAndUnite(full_path_js, os.path.join(cwd, "static", "js_generated.js"))
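
The saving quoted for uniting static files can be checked with quick arithmetic. This is only a back-of-the-envelope sketch: the function name is made up, and it assumes every visit fetches every static file once, which ignores browser caching and so overstates the saving for repeat visitors.

```python
def requests_saved(files_before, files_after, visits):
    """Static-file requests avoided, assuming each visit fetches each file once."""
    return (files_before - files_after) * visits


# 8 static files united into 2, over roughly 15000 visits
print(requests_saved(8, 2, 15000))  # 90000
```

That gives 90000 requests, which is in line with the "around 100000" figure.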

Know your resources (update)
I found out that my configuration of both Apache and CherryPy was too greedy. I expected too much of so little CPU and memory. I created 100 CherryPy threads, but this was way too optimistic. The problem comes when they all become active at the same time: can the server handle the load? Mine couldn't. It's better to save resources and have fewer threads. The best way to solve this is testing and calculation. Find out how expensive one request is and how many you can handle concurrently. You need to do this for Apache, CherryPy and anything else they might use (like Aspell...).

My current configuration
Apache:
MaxKeepAliveRequests 0
KeepAliveTimeout 15
MinSpareServers 20
MaxSpareServers 40
StartServers 25
MaxClients 256
MaxRequestsPerChild 0
CherryPy:
server.thread_pool: 40
server.socket_queue_size: 15
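
One way to approach the testing mentioned under "Know your resources" is to time requests at a fixed level of concurrency and watch how the numbers change as you raise it. The sketch below uses only the Python standard library. All function names are made up for this example, and the URL is whatever your own server answers on; it is an illustration, not the method used for the numbers in this post.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def time_one_request(url, timeout=10):
    """Return the seconds taken to fetch url once, including reading the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def time_concurrent_requests(url, total, concurrency, fetch=time_one_request):
    """Call fetch(url) `total` times using `concurrency` worker threads.

    Returns a list with one duration (in seconds) per request.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * total))


def summarize(durations):
    """Return (average, slowest) request time in seconds."""
    return sum(durations) / len(durations), max(durations)
```

For example, `summarize(time_concurrent_requests("http://localhost/", 200, 20))` would fetch the page 200 times with 20 requests in flight. Run it from another machine if you can, so the test itself doesn't eat the CPU and memory you are trying to measure, and keep an eye on memory use on the server while it runs.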