Cython Is As Good As Advertised

I've been aware of Cython for a few years but never had a chance to really test it in practice (apart from a few dummy exercises). Recently I decided to look at it again and test it on my old project adecapcha. I was quite pleased with the results: I was able to speed up the program significantly with minimal changes to the code.

Download Email Attachments Automagically

Email is still one of the most important means of electronic communication. Apart from everyday use with a convenient client (like the superb Thunderbird), from time to time one might need to get message content out of the mailbox and perform some bulk action on it. An example could be downloading all image attachments from your mailbox into a folder: this is easy to do manually for a few emails, but what if there are ten thousand of them? Your mailbox is usually hosted on a server and you can access it via the IMAP protocol. There are many possible ways to achieve this, but most of them require downloading or synchronizing the full mailbox locally and then extracting the required parts from the messages and processing them. This can be very inefficient. Recently I needed an automated task like the one above: search messages in a particular IMAP mailbox, identify attachments of a certain type and name, download them and run a command on them, and after the command finishes successfully, delete the email (or move it to another folder). Looking around, I did not find anything suitable that would meet my requirements (Linux, command line, simple yet powerful). So, having some experience with IMAP and Python, I decided to write such a tool myself. It's called imap_detach, and you can check the details on its page. Here I'd like to present a couple of use cases for this tool in the hope that they might be useful for people with similar email processing needs.
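To make the "identify attachments of a certain type and name" step concrete, here is a minimal sketch using only the standard library email package. It is not how imap_detach is implemented; the function name matching_attachments and the sample message are made up for illustration, and the message is built locally so that the example runs without any mail server.

```python
from email.message import EmailMessage
from fnmatch import fnmatch


def matching_attachments(message, mime_prefix="image/", name_pattern="*"):
    """Yield (filename, payload_bytes) for attachments matching both filters."""
    for part in message.iter_attachments():
        filename = part.get_filename() or ""
        type_ok = part.get_content_type().startswith(mime_prefix)
        if type_ok and fnmatch(filename, name_pattern):
            # decode=True undoes the base64 transfer encoding
            yield filename, part.get_payload(decode=True)


# Build a small sample message with two attachments (hypothetical data).
msg = EmailMessage()
msg["Subject"] = "Holiday photos"
msg.set_content("See attached.")
msg.add_attachment(b"\x89PNG fake image bytes",
                   maintype="image", subtype="png", filename="beach.png")
msg.add_attachment(b"%PDF fake document bytes",
                   maintype="application", subtype="pdf", filename="invoice.pdf")

found = list(matching_attachments(msg, "image/", "*.png"))
```

In a real tool the message would come from an IMAP fetch rather than being built in memory, and ideally only the matching parts would be fetched instead of the whole message.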


Writing Simple Parser in Python

From time to time one might need to write a simple language parser to implement a domain-specific language for an application. As always, the Python ecosystem offers various solutions; an overview of Python parser generators is available here. In this article I'd like to describe my experience with the parsimonious package. For a recent project of mine (imap_detach, a tool to automatically download attachments from an IMAP mailbox) I needed simple expressions to specify which emails and which exact parts should be downloaded.

Terminal Interfaces in Python

Although there is a fair choice of GUI libraries for Python (a good overview of Python GUI libraries is here), sometimes we need just a slightly enhanced terminal interface, as in my recent project, an XMPP test client, where the requirements were quite simple: split the terminal screen into two areas, a main screen where messages are displayed (possibly asynchronously) and a bottom line where commands and messages can be entered:

[Figure: layout]

OpenShift Experiences

PaaS is happily buzzing in the Cloud and it seems to be the hottest topic in infrastructure services today, so I decided to test OpenShift, the PaaS offering from Red Hat. A couple of reasons make this platform interesting: firstly, it's an open source solution, so we can use it to build our own private solution; secondly, on the public service we get 3 gears (Linux containers with predefined configuration) for free forever, so it's easy to experiment with this platform. As a sample project we will create a very simple Python Flask web application with MongoDB.

Video Streaming from File Sharing Servers

As I’ve written, video files can be streamed via the BitTorrent protocol. Although responsiveness (time to start, time to seek) is notably worse than in specialized solutions, it is still usable for a normal user with a bit of patience.

Video files are also provided by file sharing servers, but in many cases the download rate is limited, so it's not enough to stream a video file. However, it's often possible to open several requests for the same file and combine their download rates; this method is quite common in download managers. And if we add the possibility to stream the downloaded content to a video player, we can achieve satisfactory results, possibly similar to or better than streaming via BitTorrent.
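The idea of several requests for the same file can be sketched as splitting the file into byte ranges, each fetched by a separate HTTP request with a Range header. The sketch below only computes the ranges and headers and makes no network calls; the function names and the chunk size are illustrative assumptions, and it presumes the server honours Range requests.

```python
def chunk_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte offsets covering the file in order."""
    if total_size <= 0 or chunk_size <= 0:
        raise ValueError("total_size and chunk_size must be positive")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start, end):
    """Build the HTTP header asking for one inclusive byte range."""
    return {"Range": f"bytes={start}-{end}"}


# A 10-byte "file" split into chunks of at most 4 bytes.
ranges = list(chunk_ranges(10, 4))
headers = [range_header(start, end) for start, end in ranges]
```

Handing out small chunks in file order to a pool of connections, rather than giving each connection one large contiguous part, keeps the beginning of the file arriving first, which is what a player needs.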

OpenSubtitles Provides Easy to Use API

When working on btclient, I was interested in the possibility of downloading subtitles for the video file being played. This seems to be a common option in many players. I've found that opensubtitles.org provides an XML-RPC remote API, which is very easy to use. With the help of the Python xmlrpclib module (xmlrpc.client in Python 3), it's really a matter of minutes to create a simple working client.
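To give a feeling for how little code an XML-RPC call needs, the sketch below builds the request body for a login call with xmlrpc.client and parses it back, without touching the network. The endpoint URL, the method name LogIn, its parameter order and the user agent string are assumptions for illustration; check the API documentation for the current values.

```python
import xmlrpc.client

API_URL = "https://api.opensubtitles.org/xml-rpc"  # assumed endpoint
USER_AGENT = "MyTestAgent"  # hypothetical; the service expects a registered agent


def build_login_request(username="", password="", language="en",
                        user_agent=USER_AGENT):
    """Return the XML body of an assumed LogIn(username, password, language, agent) call."""
    params = (username, password, language, user_agent)
    return xmlrpc.client.dumps(params, methodname="LogIn")


payload = build_login_request()
# Round-trip the payload to show what the server would decode.
params, method = xmlrpc.client.loads(payload)

# A live call would look roughly like this (not executed here):
#   server = xmlrpc.client.ServerProxy(API_URL)
#   session = server.LogIn("", "", "en", USER_AGENT)
```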

Subtle evil of close_fds parameter in subprocess.Popen

In Python, a newly created subprocess inherits file descriptors from the parent process, and these descriptors are left open; at least this was the default up to Python 3.3. The subprocess.Popen constructor has a close_fds parameter (it defaults to False on Python 2.7), which controls whether inherited FDs are closed. Leaving FDs open in the child process can lead to many problems, as explained here and here.
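The effect can be observed with a small experiment. This is a sketch for Python 3 on a POSIX system: the parent creates a pipe, explicitly marks the write end as inheritable (newer Python versions create descriptors non-inheritable), and starts a child that tries to write to that descriptor number. The helper name child_can_write is made up for this example.

```python
import os
import subprocess
import sys

# Child program: try to write one byte to the descriptor number given in argv.
CHILD_CODE = (
    "import os, sys\n"
    "try:\n"
    "    os.write(int(sys.argv[1]), b'x')\n"
    "except OSError:\n"
    "    pass\n"
)


def child_can_write(close_fds):
    """Return True if the child process could write to the parent's pipe."""
    read_fd, write_fd = os.pipe()
    os.set_inheritable(write_fd, True)  # emulate the old inherit-by-default behaviour
    try:
        subprocess.run([sys.executable, "-c", CHILD_CODE, str(write_fd)],
                       close_fds=close_fds, check=True)
    finally:
        os.close(write_fd)  # close our copy so the read below reaches EOF
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read() == b"x"


leaked = child_can_write(close_fds=False)
closed = child_can_write(close_fds=True)
```

With close_fds=False the child shares the pipe and its byte arrives in the parent; with close_fds=True the descriptor does not exist in the child and nothing is written.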

Accessing Oracle from Python (with proper unicode support)

It’s not obvious how to set it up right, so I’m putting some notes here:

Installation is described here.
A few comments:

  • ORACLE_HOME is needed only for installation
  • If you add the client library path to /etc/ld.so.conf.d/oracle.conf and run ldconfig, you don’t need to export a modified LD_LIBRARY_PATH
  • Once the Oracle client library is installed and the environment is set, you can also install cx_Oracle via pip install cx_Oracle

The crucial step not mentioned in the installation guide is to set the NLS_LANG environment variable; it must be present in the environment of the Python program that uses cx_Oracle. So for instance for Flask+SQLAlchemy you can have:
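A minimal sketch of one way to do this from Python, assuming the variable is set at the top of the application module before the Oracle driver connects. The helper name configure_oracle_client and the value AMERICAN_AMERICA.AL32UTF8 (language, territory, and Oracle's UTF-8 character set) are assumptions; pick the language and territory that suit your setup.

```python
import os


def configure_oracle_client(nls_lang="AMERICAN_AMERICA.AL32UTF8"):
    """Set NLS_LANG for this process; call before the Oracle driver connects."""
    # Assumed value: LANGUAGE_TERRITORY.CHARSET with Oracle's UTF-8 charset.
    os.environ["NLS_LANG"] = nls_lang
    return os.environ["NLS_LANG"]


# Run this first in the application module, before creating the
# Flask app and the SQLAlchemy engine.
active_nls_lang = configure_oracle_client()
```

Setting the variable in the shell or service configuration that starts the application works just as well.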

Without this variable the Oracle client uses 7-bit ASCII! So any non-ASCII character will raise a “UnicodeEncodeError: ‘ascii’ codec can’t encode character” error.

Streaming video file from BitTorrent P2P network

Although the BitTorrent (BT) protocol was not designed for media streaming, in practice it can be used, to a certain extent, to stream a video file from a P2P network. The key trick is to force sequential download in the BT client. Normally a BT client first selects the pieces that are least available in the swarm, which contributes to better distribution of the file; sequential download works against this, so it is not enabled in regular BT clients.

But if we force the BT client to download sequentially, cache incoming pieces, and have enough incoming bandwidth from peers, we can stream the incoming video directly into a video player. Admittedly it's a poor man's streaming, because it lacks advanced features like stream synchronization and seeking, but in many cases it works well enough.
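The difference between the two piece selection strategies can be shown with a toy sketch. It is heavily simplified (real clients, for example, randomize among equally rare pieces), and the function names and availability numbers are invented for illustration.

```python
def rarest_first(missing, availability):
    """Pick the missing piece held by the fewest peers (lowest index on ties)."""
    return min(missing, key=lambda piece: (availability[piece], piece))


def sequential(missing):
    """Pick the missing piece with the lowest index, as a player needs it."""
    return min(missing)


# Hypothetical swarm: piece index -> number of peers that have the piece.
availability = {0: 5, 1: 2, 2: 7, 3: 1}
missing = {0, 1, 2, 3}

next_rarest = rarest_first(missing, availability)  # piece 3, held by one peer
next_in_order = sequential(missing)                # piece 0, start of the file
```

Rarest-first would fetch the end of the file first here, which is good for the swarm but useless for playback; sequential selection always asks for the piece the player will need next.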