All posts by admin

Not Always PyPy Is Faster

PyPy is an alternative Python interpreter known for its speed. However, it is not always faster than the ‘classic’ Python interpreter (called CPython here). For one small project of mine, PDF Checker, I tested PyPy hoping to speed up PDF document processing (basically parsing to extract text; the pdfminer library is used, and document parsing takes the majority of the time). Below are the results of running the program on two different files, in CPython and in PyPy (with and without JIT compilation):

File               CPython   PyPy      PyPy with JIT disabled
Small PDF (110kB)  1.1 s     2.4 s     2.5 s
Big PDF (996kB)    16.6 s    10.9 s    36.5 s
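
Timings like these can be collected with a small harness. The following is only a minimal sketch, not the script used for the table above: time_call and fake_parse are hypothetical names, and fake_parse is a stand-in workload. In a real measurement the callable would be the pdfminer-based text extraction.

```python
import time


def time_call(func, *args, repeat=3):
    """Run func(*args) `repeat` times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def fake_parse(n):
    # Hypothetical stand-in for document parsing: a pure-Python loop,
    # used here only so the sketch runs without any PDF file.
    total = 0
    for i in range(n):
        total += i % 7
    return total


if __name__ == "__main__":
    # Compare a short run and a long run, like the small and big PDF.
    for size in (10_000, 1_000_000):
        print(f"n={size}: {time_call(fake_parse, size):.3f} s")
```

Running the same script under each interpreter gives directly comparable numbers, as long as the inputs and the repeat count stay the same.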

Continue reading Not Always PyPy Is Faster

Decoding Audio Captchas in Python

For good or bad, many sites now use CAPTCHAs to determine whether a visitor is a human or a computer program. A captcha presents a task, usually reading some distorted letters and typing them into a form. This is considered hard for a computer to do, so the user must be human. To improve accessibility, visual captchas are accompanied by audio captchas, where the letters are spelled out (usually with some background noise to make letter recognition more difficult). However, audio captchas are known to be easier to break. Inspired by this article [1], I created a Python implementation of audio captcha decoding using commonly available libraries and just a general knowledge of speech recognition technologies. The software is called adecaptcha, and I tested it on a couple of sites: I got 99.5% accuracy of decoded letters for one site and 90% accuracy for another (which has much more distorted audio). Continue reading Decoding Audio Captchas in Python

Running uWSGI for gevent enabled application

Gevent is a great library that uses greenlets (a Python coroutine library) to enable asynchronous I/O while providing an API that looks like a normal synchronous API, so it’s easier to use and understand. The async magic is done automatically by Gevent, which runs an event loop in the background and switches between coroutines as necessary.

This approach can be very useful for concurrent applications that spend a lot of time waiting for I/O, such as web applications, which is why Gevent is popular there. For certain types of workloads it can enable higher concurrency while using fewer resources (a greenlet is much lighter than a thread or a process). Continue reading Running uWSGI for gevent enabled application

Unity – adding unknown applications to Launcher/Dash

Numerous posts describe how to add a new application to Unity so that it’s searchable in Dash. Unity works with .desktop files, which define how applications should be launched from Unity. These files are located in /usr/share/applications (system-wide definitions) or ~/.local/share/applications (user-specific applications). So if you add a well-formatted .desktop file to either of these locations, Unity will be aware of it (you may need to restart Unity).

Recently I found one more interesting behavior of Unity: if you start an unknown application from a terminal, it will appear in Launcher (in the ‘Running applications’ section), and Unity even makes some effort to find the correct icon for it. Now you can lock it to Launcher (right-click and choose ‘Lock to Launcher’ from the menu). In the background Unity creates a new .desktop entry for this program in ~/.local/share/applications, so it can stay locked to Launcher in the future. This new .desktop file contains the title of the application, taken from the title of the window in which the application is running; the executable path and parameters are taken from the process properties, and even the icon path is stored if Unity was able to find one. When you unlock the application from Launcher, the .desktop file still remains in the user’s applications, so you can search for it in Dash. This approach can make adding a new unknown application easier: just run it, then lock and unlock it from Launcher, and you’ll have a new entry in ~/.local/share/applications. You can then edit it a bit manually to make it perfect, and that’s it.
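
For comparison, a .desktop file can also be written by hand or generated. The following is a minimal sketch that builds the text of such an entry using a few standard keys from the freedesktop.org Desktop Entry format (Type, Name, Exec, Terminal, Icon). The helper name desktop_entry and the example values are hypothetical, and the file Unity generates itself may contain more keys than these.

```python
def desktop_entry(name, exec_line, icon=None, terminal=False):
    """Return the text of a minimal .desktop file for an application."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        f"Exec={exec_line}",
        f"Terminal={'true' if terminal else 'false'}",
    ]
    if icon:
        # Icon is optional; it is only written when a path or name is given.
        lines.append(f"Icon={icon}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "My Tool" and /opt/mytool/run.sh are made-up example values.
    print(desktop_entry("My Tool", "/opt/mytool/run.sh --verbose"), end="")
```

Saving the returned text as a file with the .desktop extension in ~/.local/share/applications should have the same effect as the lock and unlock trick, just with full control over the content.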

 

APEX Application to View Log Files

Oracle APEX keeps all data in the database and makes it easy to create various reports for tables or views. But what if we want to present something from outside the database, like text log files? How could this be done in APEX? For a regular web server it is a trivial task: usually a simple configuration of the web server is enough to list a directory and download any files from it (and that would probably be the easiest way to do it). But what if we need to integrate log browsing into an APEX application? Actually, there is a way to list and serve files even in APEX, if required. Continue reading APEX Application to View Log Files

Oracle DB – Compression of LOBs

We have some APEX applications for our colleagues where people can attach files (usually MS Word documents or PDF files). A straightforward way to implement this was to use a table with a BLOB field and connect it to the file item in an upload form. All works well, but as time goes on the BLOBs are starting to take up a significant amount of space. I have been looking for a way to reduce the space taken by attached documents and explored Oracle 11g compression capabilities.
For LOBs, 11g (Enterprise Edition) provides a new feature called SecureFiles, which also enables compression of LOB data. We have migrated our tables to SecureFiles and saved approximately 50% of space. Continue reading Oracle DB – Compression of LOBs

Why GMail is not changing all server certificates in synch?

I’m accessing my Gmail account from behind an HTTPS proxy, as described in this post. Thunderbird does not support it; for IMAP and SMTP only a SOCKS proxy works. To cope with this, I’m using a small local proxy that redirects any connection to the remote host:port via the proxy’s CONNECT method.
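
The core of such a local proxy is the CONNECT handshake with the upstream proxy. Below is a minimal sketch of that step only, not the actual proxy I use: the function names build_connect_request and open_tunnel are hypothetical, proxy authentication is left out, and a real proxy would then copy bytes in both directions between the local client and the returned socket.

```python
import socket


def build_connect_request(host, port):
    """Return the bytes of an HTTP CONNECT request for host:port."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def open_tunnel(proxy_host, proxy_port, host, port, timeout=10.0):
    """Connect to the proxy and ask it to tunnel to host:port.

    Returns the connected socket on success; raises ConnectionError otherwise.
    """
    sock = socket.create_connection((proxy_host, proxy_port), timeout=timeout)
    try:
        sock.sendall(build_connect_request(host, port))
        # Read the proxy's response headers up to the blank line.
        # Any bytes after the headers would already belong to the tunneled
        # stream; this sketch ignores that case for simplicity.
        response = b""
        while b"\r\n\r\n" not in response:
            chunk = sock.recv(4096)
            if not chunk:
                break
            response += chunk
        status_line = response.split(b"\r\n", 1)[0]
        parts = status_line.split()
        if len(parts) < 2 or parts[1] != b"200":
            raise ConnectionError(f"proxy refused CONNECT: {status_line!r}")
        return sock
    except Exception:
        sock.close()
        raise
```

For example, open_tunnel("proxy.example", 3128, "imap.gmail.com", 993) would return a socket tunneled to the IMAP server, where proxy.example and port 3128 are placeholder values for the real proxy address.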

This works fine, but in the email client I had to set both the IMAP server and the SMTP server to localhost. Thunderbird is cautious about this: since both connections use TLS/SSL, there is a security issue, because I’m connecting to localhost but the certificates are for the *.gmail.com domain. Luckily, Thunderbird lets me set a security exception: it asks whether I’ll allow that certificate for that host address, and if I confirm, everything works like a charm until Gmail changes the certificate on its servers (which happens a couple of times per year or so). Continue reading Why GMail is not changing all server certificates in synch?

Convert Dictionary for PocketBook eBook Reader

I’m a great fan of ebooks, and recently I replaced my old Hanlin with a new reader, the PocketBook 626. PocketBook is a European company that distributes PocketBook readers; the actual development of the devices is done by the Ukrainian company Obreey Products. Ukrainian and Russian programmers have always been very active in ebook technology (FB2 format, CoolReader, FBReader, OpenInkPot (open-source firmware for ebook readers) and more), so no wonder they have been able to produce quite a nice device. I personally prefer it to the Kindle because of its wider format support (especially for the epub format) and the broader possibilities for customizing the device (price-wise they are basically similar to the Kindle).

One of the first tasks was to get more useful language dictionaries onto it (there are some built in, but better ones are available). Many high-quality free dictionaries are available in stardict format (for instance here for Czech language dictionaries; other sites offer dictionaries in other formats as well). Continue reading Convert Dictionary for PocketBook eBook Reader

Learning OCAML

I decided to broaden my perspective in programming and this time look at a functional language. Functional languages have been around for quite some time but never got into the mainstream; recently, however, they have been getting more popular. Some features of functional programming have also made their way into other languages (first-class functions, closures, higher-order functions, …), so I was curious to learn a bit more about functional programming. For my exercise I decided to try OCaml.
Continue reading Learning OCAML

Eclipse Help Browser And Proxy

It’s quite pathetic that HTTP proxy settings cause problems again and again in various applications, like UbuntuOne, pip … Maybe it is just a problem on Ubuntu/Linux platforms, where proxy settings live in separate places (dconf keys for the desktop; the http_proxy, HTTP_PROXY and no_proxy environment variables).

This time it was the Eclipse IDE. The problem is this: Eclipse has proxy settings in Preferences / General / Network Connections, but these settings are not applied to the Help Browser (started via Help / Help Contents). That browser uses the system settings (I believe from the dconf key system.proxy in my case), but not in a consistent way: while a regular browser is fine with a subnet entry such as 127.0.0.1/8 in system.proxy.ignore-hosts, the Eclipse help browser is not; it requires just the server part of the URL (e.g. just 127.0.0.1).
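
The difference between the two behaviors can be sketched in a few lines. This is only an illustration of subnet matching versus exact host matching, not the actual logic of Eclipse or the desktop; the function bypasses_proxy and its understand_subnets switch are hypothetical.

```python
import ipaddress


def bypasses_proxy(host, ignore_hosts, understand_subnets=True):
    """Return True if `host` matches an entry of an ignore-hosts style list."""
    for entry in ignore_hosts:
        if host == entry:
            # Exact match on the server part works for every consumer.
            return True
        if understand_subnets and "/" in entry:
            try:
                # strict=False accepts entries such as 127.0.0.1/8,
                # where host bits are set in the network address.
                network = ipaddress.ip_network(entry, strict=False)
                if ipaddress.ip_address(host) in network:
                    return True
            except ValueError:
                # Entry or host is not an IP address/network; skip it.
                continue
    return False


if __name__ == "__main__":
    # A consumer that understands subnets versus one that does not.
    print(bypasses_proxy("127.0.0.1", ["127.0.0.1/8"]))
    print(bypasses_proxy("127.0.0.1", ["127.0.0.1/8"], understand_subnets=False))
```

With understand_subnets=False the subnet entry is never matched, so the list must also contain the plain 127.0.0.1 entry for the host to bypass the proxy.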

Also, the Native option for proxy settings in Eclipse (which is used for updates and plugin installation) does not seem to work on Linux.

I spent some time fixing this, another victim of inconsistent proxy handling.