Using Cython for Great Speedups
May 03, 2012About Cython
Python is a great dynamic language; it provides layers of abstractions (dynamic typing and memory management to name two) over the underlying hardware and makes it easier and faster to write productive code compared to C, Java and other strongly typed languages. Unfortunately, the layers of abstractions generally means it is slower. Many times this does not matter, but sometimes you want a piece of Python code to be fast and would be willing to make some restrictions to the code, trading Python’s dynamism, for faster runtimes. This is essentially what Python C-Extensions provide and using Cython is a very easy way to build C-extensions from existing Python code.
Cython allows you to take your existing working Python code, add type declarations, thereby restricting what can be done with the code, and creates a C-extension for you. This C-extension can be imported like any other Python module. Depending on your code, you can get a 4-5x performance improvement immediately with just several annotations. Your code will also use less memory and will likely be faster the equivalent node.js code.
Using Cython
Here is a simple example of how to use Cython. Here I use Cython to annotate a function to find the nth prime number. Note, there are more efficient algorithms to determine if a number is prime, this algorithm is used mostly for simplicity and pedagogical reasons:
Here is the program’s entry point. I added a flag, --with-cython
, to test the Cython optimized-code side-by-side with the pure-Python original:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
Here is the "unoptimized" pure Python code to find the nth prime number:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
|
Here is the "optimized" Cython code with the type annotations (NOTE: the .pyx
file extension which means this is no longer pure Python and must be pre-processed using the Cython code generator):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
|
A setup.py
is needed facilitate building the Cython C-extension:
1 2 3 4 5 6 7 8 9 10 |
|
To build the Cython C-extension module use:
$ python setup.py build_ext --inplace
running build_ext
cythoning libprimes_fast.pyx to libprimes_fast.c
building 'libprimes_fast' extension
llvm-gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch x86_64 -pipe -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c libprimes_fast.c -o build/temp.macosx-10.7-intel-2.7/libprimes_fast.o
llvm-gcc-4.2 -Wl,-F. -bundle -undefined dynamic_lookup -Wl,-F. -arch i386 -arch x86_64 build/temp.macosx-10.7-intel-2.7/libprimes_fast.o -o ./libprimes_fast.so
The above tells Cython to create a Python C-extension module for the libprimes_fast
module containing the function nth_prime
. The generated source of the C-extension is in the intermediate source file libprimes_fast.c
. When compiled using the system C compiler, LLVM here, this will produce the C-extension module libprimes_fast.so
. The cdef
annotation restricts the type of a variable to native C types (all unsigned long long
in this example.) In the body of nth_prime
, the value of the annotated variables (ex. prime_count
) are marshalled to and remain the native C type which is simpler and faster to manage on the underlying hardware. When the value is returned to the caller it will get marshalled back to a Python Object.
Here is a test run on my MacBook with a 2.2GHz i7 Processor using Python 2.7. First running the unoptimized Python code:
$ time python nthprime.py 100000
1299709
real 0m5.607s
user 0m5.597s
sys 0m0.010s
Then with the Cython created C-extension:
$ time python nthprime.py --with-cython 100000
1299709
real 0m0.715s
user 0m0.704s
sys 0m0.010s
That is a quick and nearly 90% reduction in the runtime with only 5 lines of annotations. There are some limitations to Cython, so read the Cython FAQ and Cython Docs if you are going to do anything involved. Finally, if building C-extensions is not your thing, give PyPy a try. It is an implementation of Python in Python with JIT Compilation and may give you similar if not better speed ups without annotating your code.
Tags: python, cython, performance, speed, optimization