memcached: Benchmark of 4 Python libraries

Optimizations, don't we just love them! Unfortunately, most micro-optimizations aren't worth doing. The optimizations that are worth doing are those that affect everything... And if you use memcached, then memcached affects everything ;-)

In this blog post I present a benchmark of the 4 most popular Python memcached libraries (one of them pure Python, the other 3 C wrappers). As my benchmark shows, there is a lot to gain: you can speed up your memcached operations by roughly 2x, which is REALLY hard to do with any other optimization.

There are currently 4 Python memcached libraries and no good benchmarks of them, so I set out to benchmark them myself. The candidates are python-memcached (imported as memcache; the pure Python one), cmemcache, python-libmemcached and pylibmc.
Observations and changes:
Like every benchmark, this one should be taken with a grain of salt.

Benchmark program

This is a modified benchmark.py from python-libmemcached. It runs 10000 iterations of each command (100 objects for the big object and big string tests):

Benchmarking pylibmc_optimized...
test_set: 0.743668 seconds
test_set_get: 1.289444 seconds
test_random_get: 2.336701 seconds
test_set_same: 0.785587 seconds
test_set_big_object (100 objects): 0.014704 seconds
test_set_get_big_object (100 objects): 0.032860 seconds
test_set_big_string (100 objects): 0.009394 seconds
test_set_get_big_string (100 objects): 0.021033 seconds
test_get: 0.606791 seconds
test_get_big_object (100 objects): 0.012321 seconds
test_get_multi: 0.019260 seconds
Total_time is 5.871763

Benchmarking pylibmc...
test_set: 0.744818 seconds
test_set_get: 1.386534 seconds
test_random_get: 2.475867 seconds
test_set_same: 0.775607 seconds
test_set_big_object (100 objects): 0.013254 seconds
test_set_get_big_object (100 objects): 0.031905 seconds
test_set_big_string (100 objects): 0.009887 seconds
test_set_get_big_string (100 objects): 0.021890 seconds
test_get: 0.644991 seconds
test_get_big_object (100 objects): 0.011983 seconds
test_get_multi: 0.018810 seconds
Total_time is 6.135546

Benchmarking cmemcache...
test_set: 0.898636 seconds
test_set_get: 1.814076 seconds
test_random_get: 3.197659 seconds
test_set_same: 0.928649 seconds
test_set_big_object (100 objects): 0.014427 seconds
test_set_get_big_object (100 objects): 0.031279 seconds
test_set_big_string (100 objects): 0.010986 seconds
test_set_get_big_string (100 objects): 0.025449 seconds
test_get: 0.854429 seconds
test_get_big_object (100 objects): 0.013078 seconds
test_get_multi: 0.463271 seconds
Total_time is 8.251940

Benchmarking python-libmemcached...
test_set: 0.740007 seconds
test_set_get: 1.336759 seconds
test_random_get: 2.363844 seconds
test_set_same: 0.736221 seconds
test_set_big_object (100 objects): 0.013195 seconds
test_set_get_big_object (100 objects): 0.031755 seconds
test_set_big_string (100 objects): 0.010874 seconds
test_set_get_big_string (100 objects): 0.020221 seconds
test_get: 0.622201 seconds
test_get_big_object (100 objects): 0.011825 seconds
test_get_multi: 0.015463 seconds
Total_time is 5.902364

Benchmarking memcache...
test_set: 1.276277 seconds
test_set_get: 2.596438 seconds
test_random_get: 4.869392 seconds
test_set_same: 1.351409 seconds
test_set_big_object (100 objects): 0.057328 seconds
test_set_get_big_object (100 objects): 0.091957 seconds
test_set_big_string (100 objects): 0.018521 seconds
test_set_get_big_string (100 objects): 0.038375 seconds
test_get: 1.303581 seconds
test_get_big_object (100 objects): 0.028765 seconds
test_get_multi: 0.380600 seconds
Total_time is 12.012643

pylibmc seems to be the fastest when used with the tcp_nodelay=1 behavior (pylibmc_optimized above), with python-libmemcached close behind. Generally, the C libraries are up to around 2 times faster than the pure Python implementation (totals of about 5.9 to 8.3 seconds versus 12.0 seconds).

Test in a threaded environment

This is a test of these libraries in a threaded environment (basically a WSGI application that does 4 GET operations). For this to work, each library's client needs to be encapsulated in a threading.local.
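The threading.local encapsulation can be sketched roughly as follows. This is a minimal illustration and not the code used for the test: `FakeClient`, `ThreadLocalClient`, `handle_request` and `run_demo` are hypothetical names, and the dict-backed `FakeClient` merely stands in for a real memcached client so the sketch runs without a server.

```python
import threading


class FakeClient:
    """Hypothetical stand-in for a memcached client (dict-backed, no network)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


class ThreadLocalClient:
    """Gives every thread its own client instance via threading.local."""

    def __init__(self, factory):
        self._factory = factory          # callable that builds a new client
        self._local = threading.local()  # per-thread attribute storage

    @property
    def client(self):
        client = getattr(self._local, "client", None)
        if client is None:
            # First access from this thread: create and remember a client.
            client = self._local.client = self._factory()
        return client


def handle_request(pool):
    """Mimics the test application: four GET operations per request."""
    client = pool.client
    return [client.get("key_%d" % i) for i in range(4)]


def run_demo(num_threads=4):
    """Runs one request per thread and counts the distinct clients used."""
    pool = ThreadLocalClient(FakeClient)
    seen = []
    lock = threading.Lock()

    def worker():
        handle_request(pool)
        with lock:
            seen.append(pool.client)  # keep a reference so ids stay unique

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len({id(client) for client in seen})
```

With a real library, the factory would be whatever builds that library's client, for instance something like `lambda: pylibmc.Client(["127.0.0.1"])` for pylibmc. The point of the wrapper is only that a client object is never shared between threads.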
The test does 1 warmup request and afterwards 1000 requests:

#python-memcache
Requests per second: 97.88 [#/sec] (mean)
Time per request: 10.217 [ms] (mean)
PID    COMMAND  %CPU  TIME      #TH  #WQ  #PORTS  #MREG  RPRVT  RSHRD  RSIZE
50077  Python   0.0   00:09.15  9    0    63      237    31M    244K   34M

#python-libmemcached
Requests per second: 82.05 [#/sec] (mean)
Time per request: 12.188 [ms] (mean)
PID    COMMAND  %CPU  TIME      #TH  #WQ  #PORTS  #MREG  RPRVT  RSHRD  RSIZE
50101  Python   0.0   00:10.62  10   1    72      270    36M    244K   39M

#cmemcache
Requests per second: 106.11 [#/sec] (mean)
Time per request: 9.425 [ms] (mean)
PID    COMMAND  %CPU  TIME      #TH  #WQ  #PORTS  #MREG  RPRVT  RSHRD  RSIZE
50121  Python   0.0   00:08.58  9    0    62      237    31M    244K   34M

#pylibmc_optimized
Requests per second: 108.09 [#/sec] (mean)
Time per request: 9.251 [ms] (mean)
PID    COMMAND  %CPU  TIME      #TH  #WQ  #PORTS  #MREG  RPRVT  RSHRD  RSIZE
50043  Python   0.0   00:08.48  9    0    62      243    32M    244K   34M

I have no clue why python-libmemcached performs so poorly in this test. It also shows that benchmarks should be used as indicators and not as the truth ;-)

Conclusion

pylibmc seems to be the most promising library, probably because it's hand-coded and because it's based upon libmemcached. python-libmemcached looks promising in the simple benchmark, but lacks performance (and uses more memory!) in a threaded environment (this could be related to Pyrex, but I am unsure). Looking at CPU and memory usage, python-libmemcached takes the most of both, while pylibmc uses the least CPU and cmemcache the least memory.

So in general I would recommend pylibmc or cmemcache. That said, it's best to do your own benchmarks based on your architecture(s) and your usage patterns.

[Update] Patch for pylibmc

It patches the following:
[Update] Patch for python-libmemcached

The patch adds the following:
Benchmarks · Code · Python • 3. Sep 2009