Finding bottlenecks in WSGI applications

To build fast web applications it's really important to be able to find bottlenecks, so you don't spend time optimizing something that isn't an issue. The web-application profiling solutions out there aren't that good, and even big frameworks like Django have very poor support for profiling.

I will now present a simple WSGI profiling middleware that can be used to find bottlenecks in any WSGI application. WSGI is a Python specification that standardizes the interface between web servers and web applications. It's really one of the coolest things that has happened to Python web development in recent years. WSGI is used by Pylons and Django, just to name a few.
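
For context, a WSGI application is nothing more than a callable that takes the request environ and a start_response callback and returns an iterable of body chunks. A minimal example (not related to the profiler itself) looks like this:

def simple_app(environ, start_response):
    # Send the status line and headers, then return the body as an iterable.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, WSGI!']

Because the interface is just a callable, it's easy to wrap one application inside another - which is exactly what a profiling middleware does.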

The profiler I present has the following features:

  • It logs requests and times them, so you can view the average time a given request has taken (the basic timing idea is sketched below)
  • It can log custom columns (such as database updates or memcache requests done during the request)
  • It can display the profile data in an informative manner, and it's possible to sort by any column
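
To make the timing part concrete, here is a minimal sketch of such a middleware. This is not the actual RequestProfiler - the class name, the in-memory stats dict and the averages() helper are my own assumptions - but it shows the basic idea of wrapping an app and aggregating time per request path:

import time
from collections import defaultdict

class SimpleProfiler(object):
    # Minimal timing middleware sketch (not the real RequestProfiler):
    # it records call counts and total time per request path.

    def __init__(self, app):
        self.app = app
        self.stats = defaultdict(lambda: [0, 0.0])  # path -> [count, total seconds]

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '')
        start = time.time()
        try:
            return self.app(environ, start_response)
        finally:
            entry = self.stats[path]
            entry[0] += 1
            entry[1] += time.time() - start  # note: response iteration is not timed

    def averages(self):
        # Average seconds per path, slowest first.
        return sorted(((total / count, path)
                       for path, (count, total) in self.stats.items()),
                      reverse=True)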

Sample output from the profiler looks like this (this is from Plurk.com):

[Screenshot: Plurk profiling table]

As you can see, /TimeLine/getPlurksById is the most expensive call we make and has to be optimized (it does 40 database calls and 204 memcache calls on average...!)

Code and how to use it

You can grab the WSGI application wrapper here:

To use it, simply wrap your WSGI application:

wsgi_app = RequestProfiler(wsgi_app, custom_cols=['db_sel', 'db_update',
                                                  'memc_get', 'memc_set', 'memc_del'])

You can visit /_RequestProfiler to view the profile data.
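
The post doesn't show how the custom columns get their values. One plausible approach - purely my assumption, not necessarily how RequestProfiler does it - is to let the application bump per-request counters stored in the WSGI environ, which the middleware then reads when the request finishes:

# Hypothetical helper (assumed API, not part of RequestProfiler):
# the app increments counters in the environ; a profiling middleware
# could collect them after the request.
def incr_profile_col(environ, col, amount=1):
    counters = environ.setdefault('profiler.cols', {})  # 'profiler.cols' is an assumed key
    counters[col] = counters.get(col, 0) + amount

# e.g. right after a memcache get inside a request handler:
#     incr_profile_col(environ, 'memc_get')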

The profiler is about 150 lines of code and can be extended in various ways - mail me if you write an extension :)

Also, let me know if there is a WSGI profiling library that I have missed (I am especially interested in aggregated profiling).

29. Oct 2008 Benchmarks · Code · Python · Tips
© Amir Salihefendic