The Unladen Swallow strikes back

This is a follow-up to the benchmark "Python unladen-swallow at least 33% slower than Python 2.5.1". My last benchmark was a quick one, and I wanted to create a more detailed one that benchmarks Q2 and Python 2.6 under some real load. A little teaser: Unladen Swallow Q2 seems to be making great progress :-)

About the hardware and the Python versions tested

These tests are run on a MacBook with a 2GHz Intel Core 2 Duo and 4GB of RAM. The following Python versions are tested:
They are all compiled with GCC 4.0.1 (Apple Inc. build 5465).

About the tests

The following tests are run on the above Python versions:
A warm-up phase of 1000 requests is used; this should make the comparison fair for the Q2 version of unladen-swallow, which supports JIT.

pystone benchmarks

python 2.5.2
Pystone(1.1) time for 50000 passes = 1.15
This machine benchmarks at 43478.3 pystones/second

python 2.6.2
Pystone(1.1) time for 50000 passes = 1.07391
This machine benchmarks at 46559 pystones/second

Unladen Swallow Q1
Pystone(1.1) time for 50000 passes = 1.02964
This machine benchmarks at 48560.9 pystones/second

Unladen Swallow Q2
Pystone(1.1) time for 50000 passes = 1.45034
This machine benchmarks at 34474.7 pystones/second

And the winner is?

The winner seems to be Unladen Swallow Q1. It should be noted that Python 2.6 seems to be faster than Python 2.5. Unladen Swallow Q2 is probably slower because the JIT has not yet kicked in, so this benchmark is useless for testing the real performance of Unladen Swallow Q2.

Profile rendering, cached

This request makes a lot of calls to memcached. It should be noted that Python 2.6.2 crashes with a bus error when performing this benchmark... I was not able to debug why this bus error occurs.

Python 2.5.2
Concurrency Level: 10
Time taken for tests: 38.729 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 25109000 bytes
HTML transferred: 24972000 bytes
Requests per second: 25.82 [#/sec] (mean)
Time per request: 387.287 [ms] (mean)
Time per request: 38.729 [ms] (mean, across all concurrent requests)
Transfer rate: 633.13 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 137 1390.8 0 19026
Processing: 11 179 781.8 111 19024
Waiting: 11 178 781.8 110 19024
Total: 11 316 1608.2 113 19482
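The headline figures in the ab output can be recomputed from the raw counters, which is a handy sanity check when comparing runs. The sketch below uses the Python 2.5.2 numbers above; the function name `ab_summary` and its parameter names are made up for illustration, and the formulas are the ones ab is understood to use, not something taken from this post.

```python
def ab_summary(requests, seconds, concurrency, total_bytes):
    """Recompute ab's headline figures from its raw counters."""
    return {
        # Complete requests divided by wall-clock time.
        "requests_per_second": requests / seconds,
        # Mean time one request takes as seen by a single client.
        "ms_per_request": concurrency * seconds * 1000.0 / requests,
        # Mean time per request across all concurrent clients.
        "ms_per_request_across_all": seconds * 1000.0 / requests,
        # Total bytes received, in Kbytes per second.
        "kbytes_per_second": total_bytes / 1024.0 / seconds,
    }


# Counters copied from the Python 2.5.2 run above.
summary = ab_summary(requests=1000, seconds=38.729,
                     concurrency=10, total_bytes=25109000)
for name in sorted(summary):
    print("%s: %.2f" % (name, summary[name]))
```

The recomputed values should agree with the reported 25.82 requests per second and 633.13 Kbytes/sec up to rounding, since ab prints the elapsed time with only three decimals.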
Unladen Swallow Q1
Concurrency Level: 10
Time taken for tests: 31.635 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 24763000 bytes
HTML transferred: 24626000 bytes
Requests per second: 31.61 [#/sec] (mean)
Time per request: 316.347 [ms] (mean)
Time per request: 31.635 [ms] (mean, across all concurrent requests)
Transfer rate: 764.43 [Kbytes/sec] received
Unladen Swallow Q2
Concurrency Level: 10
Time taken for tests: 14.973 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 24692888 bytes
HTML transferred: 24555000 bytes
Requests per second: 66.79 [#/sec] (mean)
Time per request: 149.731 [ms] (mean)
Time per request: 14.973 [ms] (mean, across all concurrent requests)
Transfer rate: 1610.49 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.5 0 10
Processing: 46 149 100.4 131 1239
Waiting: 46 148 100.4 130 1239
Total: 46 149 100.4 131 1240
The above benchmark only performed well for the first 1000 requests; afterwards, performance degraded to about 20 requests per second. I guess this is some kind of bug lurking in the Q2 release. I chose to use this benchmark because it clearly shows the potential of JIT optimizations.

/faq rendering

This request does not make any requests to memcached or MySQL, but simply renders a Mako template.

Python 2.5.2
Concurrency Level: 10
Time taken for tests: 18.940 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 11526000 bytes
HTML transferred: 11389000 bytes
Requests per second: 52.80 [#/sec] (mean)
Time per request: 189.400 [ms] (mean)
Time per request: 18.940 [ms] (mean, across all concurrent requests)
Transfer rate: 594.29 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.5 0 11
Processing: 51 189 70.0 188 1187
Waiting: 50 188 70.1 188 1187
Total: 51 189 70.0 188 1187
Python 2.6.2
Concurrency Level: 10
Time taken for tests: 17.327 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 11526000 bytes
HTML transferred: 11389000 bytes
Requests per second: 57.71 [#/sec] (mean)
Time per request: 173.272 [ms] (mean)
Time per request: 17.327 [ms] (mean, across all concurrent requests)
Transfer rate: 649.61 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.6 0 10
Processing: 20 173 86.6 174 1207
Waiting: 20 172 86.7 174 1207
Total: 20 173 86.6 174 1207
Unladen Q1
Concurrency Level: 10
Time taken for tests: 44.839 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 11526000 bytes
HTML transferred: 11389000 bytes
Requests per second: 22.30 [#/sec] (mean)
Time per request: 448.391 [ms] (mean)
Time per request: 44.839 [ms] (mean, across all concurrent requests)
Transfer rate: 251.03 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.6 0 10
Processing: 49 448 141.6 462 1407
Waiting: 49 447 141.7 461 1406
Total: 50 448 141.5 462 1408
I have no clue why the performance of /faq was slow on Q1. I was able to reproduce this in 2 reruns.
Unladen Q2
Concurrency Level: 10
Time taken for tests: 16.678 seconds
Complete requests: 1000
Failed requests: 0
Write errors: 0
Total transferred: 11526000 bytes
HTML transferred: 11389000 bytes
Requests per second: 59.96 [#/sec] (mean)
Time per request: 166.780 [ms] (mean)
Time per request: 16.678 [ms] (mean, across all concurrent requests)
Transfer rate: 674.89 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.0 0 11
Processing: 26 166 87.5 167 1149
Waiting: 25 165 87.6 167 1148
Total: 26 166 87.5 167 1149
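To put the two ab workloads side by side, the requests-per-second figures can be expressed relative to Python 2.5.2. This is only a rough sketch: the numbers are copied from the ab output above, while the `RESULTS` layout and the `relative_speed` helper are made up for illustration.

```python
# Requests per second (mean), copied from the ab runs above.
# Python 2.6.2 is missing from the cached profile run because it crashed.
RESULTS = {
    "profile rendering, cached": {
        "Python 2.5.2": 25.82,
        "Unladen Swallow Q1": 31.61,
        "Unladen Swallow Q2": 66.79,
    },
    "/faq rendering": {
        "Python 2.5.2": 52.80,
        "Python 2.6.2": 57.71,
        "Unladen Swallow Q1": 22.30,
        "Unladen Swallow Q2": 59.96,
    },
}


def relative_speed(results, baseline="Python 2.5.2"):
    """Return each version's requests/second divided by the baseline's."""
    base = results[baseline]
    return dict((name, rps / base) for name, rps in results.items())


for workload in sorted(RESULTS):
    print(workload)
    ratios = relative_speed(RESULTS[workload])
    for name in sorted(ratios):
        print("  %-20s %.2fx" % (name, ratios[name]))
```

On these numbers Q2 comes out at roughly 2.6x Python 2.5.2 for the cached profile request and about 1.14x for /faq, while Q1 drops to about 0.42x on /faq. Keep in mind that the Q2 profile figure only covers the first 1000 requests.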
Resource usage

Python 2.5.2
PID    COMMAND  %CPU  TIME     #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
74080  python   0.0%  1:14.32  9    77     298     46M    688K   49M    68M

Unladen Swallow Q1
PID    COMMAND  %CPU  TIME     #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
74273  python   0.0%  3:55.68  9    63     302     65M+   188K-  68M    87M

Unladen Swallow Q2
PID    COMMAND  %CPU  TIME     #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
74380  python   0.0%  1:09.29  9    70     356     75M    188K   83M    123M

Conclusion

Like all benchmarks you should take this benchmark with a grain of salt! That said, let's conclude some things about performance:
Conclusions about resource usage:
All in all, I must say I am impressed by the progress that Unladen Swallow Q2 shows. A 2x performance improvement is a great deal, and I think we can expect much more from the Q3 release.
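The memory columns from the resource usage listing can be compared the same way. A small sketch, assuming the sizes follow the usual K/M/G suffix convention and that a trailing + or - only marks a change since the previous sample (the columns look like Mac OS X top output, but that is an assumption); the helper name `top_size_to_kb` is made up for illustration.

```python
UNITS_IN_KB = {"K": 1, "M": 1024, "G": 1024 * 1024}


def top_size_to_kb(text):
    """Convert a size such as '46M' or '188K-' to kilobytes."""
    # Assumption: a trailing + or - is only a change marker, not part of the value.
    text = text.rstrip("+-")
    return int(text[:-1]) * UNITS_IN_KB[text[-1]]


# RPRVT and RSIZE columns copied from the resource usage listing above.
MEMORY = {
    "Python 2.5.2": {"RPRVT": "46M", "RSIZE": "49M"},
    "Unladen Swallow Q1": {"RPRVT": "65M+", "RSIZE": "68M"},
    "Unladen Swallow Q2": {"RPRVT": "75M", "RSIZE": "83M"},
}

baseline = MEMORY["Python 2.5.2"]
for name in sorted(MEMORY):
    for column in ("RPRVT", "RSIZE"):
        size = top_size_to_kb(MEMORY[name][column])
        ratio = float(size) / top_size_to_kb(baseline[column])
        print("%-20s %s %.2fx" % (name, column, ratio))
```

These are single snapshots of one process each, so the ratios are only a rough indication of the extra memory the Unladen Swallow builds use.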
Benchmarks · Code · Plurk · Python • 29. Aug 2009