Doing optimizations and doing it in Python
Here are a couple of really good quotes on optimization:
And these are from people that are programming-demi-gods :-) So what's really important: to know where to optimize. Optimizing things blindly is one of the most stupid things one can do. How to do this in Python?In order to do it smartly we use the standard library and a smart decorator: import hotshot, hotshot.stats
def profileit(printlines=20):
def _my(func):
def _func(*args, **kargs):
prof = hotshot.Profile("profiling.data")
res = prof.runcall(func, *args, **kargs)
prof.close()
stats = hotshot.stats.load("profiling.data")
stats.strip_dirs()
stats.sort_stats('time', 'calls')
print ">>>---- Begin profiling print"
stats.print_stats(printlines)
print ">>>---- End profiling print"
return res
return _func
return _my
We can then profile functions/methods easily by just attaching this decorator on them: import urllib
@profileit(20)
def my_expensive_function():
for i in range(0, 10):
html = urllib.urlopen("http://yahoo.com/").read()
my_expensive_function()
When you run it, it prints out a report over where time is spent:
$ python test_rendering.py
>>>---- Begin profiling print
2417 function calls (2367 primitive calls) in 1.660 CPU seconds
Ordered by: internal time, call count
List reduced from 141 to 20 due to restriction <20>
ncalls tottime percall cumtime percall filename:lineno(function)
180 0.693 0.004 0.693 0.004 socket.py:321(readline)
40 0.673 0.017 0.673 0.017 <string>:1(connect)
20 0.264 0.013 0.264 0.013 socket.py:278(read)
20 0.018 0.001 0.691 0.035 httplib.py:663(connect)
20/10 0.001 0.000 1.396 0.140 urllib.py:270(open_http)
20 0.001 0.000 0.008 0.000 httplib.py:224(readheaders)
1 0.001 0.001 0.003 0.003 httplib.py:67(<module>)
1 0.001 0.001 0.001 0.001 random.py:39(<module>)
1 0.001 0.001 0.002 0.002 mimetools.py:1(<module>)
1 0.001 0.001 0.002 0.002 tempfile.py:18(<module>)
20 0.000 0.000 0.000 0.000 socket.py:152(__init__)
20/10 0.000 0.000 1.396 0.140 urllib.py:163(open)
20 0.000 0.000 0.696 0.035 httplib.py:378(begin)
140 0.000 0.000 0.000 0.000 rfc822.py:193(isheader)
30 0.000 0.000 0.000 0.000 socket.py:229(close)
20 0.000 0.000 0.697 0.035 httplib.py:894(getresponse)
20 0.000 0.000 0.687 0.034 httplib.py:341(_read_status)
20 0.000 0.000 0.008 0.000 mimetools.py:15(__init__)
20 0.000 0.000 0.000 0.000 urllib.py:1016(splittype)
20 0.000 0.000 0.000 0.000 socket.py:159(close)
>>>---- End profiling print
And we can see that most time is spent in connecting and fetching data - which generally means that if you want to optimize urlopen, then you should optimize socket.connect and socket.readline.
Benchmarks
·
Code
·
Design
·
Python
·
Tips
•
17. Mar 2009
Post a comment
Commenting on this post has expired.
|
Blog labels |