Cython Is As Good As Advertised

I have been aware of Cython for a few years but never had the chance to really test it in practice (apart from a few toy exercises).  Recently I decided to look at it again and test it on my old project, adecaptcha. I was quite pleased with the results: I was able to speed up the program significantly with minimal changes to the code.

I used the following approach to improve the app's performance with Cython:

  1. Profile the application (or part of it) with cProfile and gprof2dot, which gives a nice graphical view of how computing time is split across functions. (Alternatively you can use pyvmmonitor, which integrates nicely with PyDev or PyCharm.)  Below is the profiling result of adecaptcha before Cython optimization:
    Here we can see two branches that take up the majority of the program's time. One loads the audio file, but it spends most of its time in the Python standard module wave (a faster alternative exists, but it is not compatible with all wav files).  The second branch, however, is completely under our control.  We can see that a lot of time is spent in the twin and wf functions (triangular window calculations).
  2. Let’s look at the functions identified by profiling:

    This is obviously not very efficient code. (It could be marginally optimized with a list comprehension, but performance would be approximately the same: 24% of total time vs 27%.)
  3. To get a significant performance change we have to implement this function in Cython:

    As you can see, the changes are minimal: we just reimplemented the twin function, this time with a statically typed loop control variable and a numpy array.
  4. Profile again to see the changes:
    As you can see, the twin function is no longer an issue (the whole calc_mfcc branch is now basically spent in FFT calculation).
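
The profiling step above can be sketched roughly as follows. This is not the adecaptcha code: `tri_weight`, `window_loop`, `window_comprehension` and `profile_call` are hypothetical names made up for illustration, standing in for the kind of triangular window calculation that twin and wf perform. Only cProfile and pstats from the standard library are used.

```python
import cProfile
import io
import pstats


def tri_weight(n, start, peak, end):
    """Weight of sample n in a triangular window (hypothetical stand-in).

    The window rises linearly from `start` to `peak` and falls back to
    zero at `end`. Assumes start < peak < end.
    """
    if n <= start or n >= end:
        return 0.0
    if n <= peak:
        return (n - start) / (peak - start)
    return (end - n) / (end - peak)


def window_loop(start, peak, end):
    """Build the window with an explicit loop and list.append."""
    weights = []
    for n in range(start, end + 1):
        weights.append(tri_weight(n, start, peak, end))
    return weights


def window_comprehension(start, peak, end):
    """Same result with a list comprehension (only marginally faster)."""
    return [tri_weight(n, start, peak, end) for n in range(start, end + 1)]


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_call(window_loop, 0, 50000, 100000)
    print(report)
```

In a sketch like this the per-sample Python function call and the list growth dominate the report, which is the kind of overhead a statically typed Cython loop writing into a numpy array avoids.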

Cython can also be used to create Python bindings to C and C++ libraries.  The interface has to be written manually (whereas some other tools, like SWIG, can generate it automatically), which requires some effort. On the other hand, it forces authors to think about the interface design, and it can result in very nice ‘pythonic’ interfaces, which can save you a lot of time later. The key insight here is that we do not need to wrap every function and class in the C or C++ library, only those that matter for our needs.  Recently I created a binding to libpoppler (a PDF parsing library) and found the task fairly easy and quite enjoyable.


Even with a few very simple changes to the original code I was able to achieve a significant performance improvement (run time of 0.27s vs 0.41s for one audio captcha, which is roughly a 34% reduction).
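
For reference, the improvement figure follows directly from the two run times quoted above. The snippet below is a minimal sketch of that arithmetic plus one possible way to take such timings with timeit; `relative_improvement` and `time_call` are hypothetical helper names, not part of adecaptcha.

```python
import timeit


def relative_improvement(before, after):
    """Fraction of run time saved when going from `before` to `after`."""
    return (before - after) / before


def time_call(func, repeat=5, number=1):
    """Best-of-`repeat` wall time in seconds for one call of func."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


if __name__ == "__main__":
    # Run times quoted in the post: 0.41s before, 0.27s after.
    print(f"{relative_improvement(0.41, 0.27):.1%}")  # prints 34.1%
```

Taking the minimum over several repeats is a common choice because it is least affected by other activity on the machine.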
Programming in Cython is relatively easy, so I was able to implement these changes quickly and without particular issues.  Compilation errors were fairly well described, so I was able to fix problems quickly. The build can be defined in a straightforward manner or can even be automatic (with pyximport; however, the pyximport machinery takes some time, so for smaller programs manual compilation is more effective).

However, it has to be said that the path to performance improvements with Cython is not always so straightforward.  In another project (a Voronoi diagram in a bounded box, part of the myplaces project) I tried it too and was not able to achieve any performance gain (quite the opposite: the cythonized version was about 30% slower). I probably should try harder, but clearly it is sometimes difficult to achieve the required improvements.

Cython can also be used to create nice pythonic interfaces to existing C and C++ libraries.

Overall I was quite impressed and I am looking forward to using Cython in future projects.
