Categories
English

Python Code Optimization

Code optimization is very important, especially for dynamic language like python. Recently I finished a python program of Approximate Dynamic Programming. One part of the program was to use dijkstra algorithm to calculate the shortest path distance in the network. Since the network is really large(about 1 million node), the first version of code was really slow which took about 2000 seconds each run. After code optimization(it took me about 6 hours), each run just takes 0.2 second. This is incredible improvement.

I used the  the following tools during the process, which I highly recommend:

  1. python cprofile module  –> this is a builtin profile module
  2. runsnakerun –> this is a viewer of the cProfile output
  3. line_profiler –> this is line by line profiler


The usage of runsnakerun is quite easy, just use the following two commands
$ python -m cProfile -o <outputfilename> <script-name> <options>
$ runsnake <outputfilename>

line_profil\ is a line by line python profiler and its webpage is:
http://packages.python.org/line_profiler/

add profile decorator to the function you want to optimize
$kernprof.py [-l/–line-by-line] script_to_profile.py

$ python -m line_profiler script_to_profile.py.lprof

 

Here are some tips to improve python performance:

1. in operator is really slow when the list is large. You may often use

if a in S:

           somehting

else:

          other thing

When S is very large, thousands, for instance, in operator will become a bottleneck.

 

A good way to optimize this is have list with binary value to indicate whether a element is in the set or not. Then the code will be:

if I[s]:

        do something

else:

        do some other thing

Of course, it will use more memory.

2.   Call the library function as much as possbile.

Most of the library functions are highly optimized, or implemented in C++.  So if you can call library functions, don’t implement it by you self in python. The for loop in python is slow. You can use scipy vectors to accelerate the program. You should try your best to reduce the for loop in python.

3  Don’t check each time for rare case.
you may have list with 100 elemnent, you will receive a index may or may not within range(0, 100). If it is in the range return the corresponding value, otherwise return -1. but 99.99% of the time the index is in the range.  The bad is:
if idx < 1000:
B = A[idx]
else:
B = -1
A better approach is:
try:
B = A[idx]
except IndexError:
B = 1
Be careful that don’t use broad Exception, use specific type of Exception.

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s