Category Archives: Tools

SQL or NoSQL – Why not to use both (in PostgreSQL)

NoSQL databases have become very popular in recent years and there are plenty of options available. It can look as if traditional relational databases (RDBMSs) are hardly needed any more. NoSQL solutions are advertised as faster, more scalable and easier to use. So who would care about relations, joins, foreign keys and similar stuff (not to mention ACID properties, transactions and transaction isolation)? Who would, if NoSQL can make your life so much easier? But there is a key insight about NoSQL databases: their wonderful achievements are possible because they made their own life easier too in some aspects. And that comes at a price. Would you be happy if your bank stored your savings in MongoDB?

However, there are many environments where NoSQL databases shine, especially where huge amounts of simple data structures need to be scaled massively across the globe and where the data are not of much value. Solutions like social networks, instant messaging etc. are not much concerned about data consistency or data loss, because these data are basically valueless. (Their business model is based on sharing absolutely trivial data, where one piece can easily be replaced with another and it does not matter if some pieces are lost. Consider: what would happen if the whole of Facebook went away in one minute? Nothing! A few people would be pissed off because they thought their online profile was cool, a few would be sad that they could not share their meaningless achievements with so-called ‘friends’, but generally nothing special would happen and no real value would be lost. People would just switch to another provider, fill its database with tons of trivialities and easily forget about the data in their previous account.)

I don’t want to create the impression that NoSQL databases are useless. They are very good for certain scenarios (and we need to remember that NoSQL is a rather broad category that includes structured document stores, key-value stores, object databases etc., each with its particular niche where it excels). But relational databases are also good, actually very good. The relational model is a fairly good abstraction of very many real-world situations, data structures, entities, however we call them. And relational databases provide solid tools to work with them. So it makes sense to use them in many cases. It might be a bit more difficult to start with a relational database than with a schema-less document store, but in the long run it should pay off. And what is really nice is that it’s not about one solution or the other: we can use both and combine them smartly and inventively.

So enough of the general mumbo jumbo, let’s get to my particular case. I’ve been looking for a data store for my new project and considered trying MongoDB this time (while in the past I stuck to relational DBs), but finally decided on PostgreSQL (again), and I’d like to share some tests, findings and thoughts. Continue reading SQL or NoSQL – Why not to use both (in PostgreSQL)

Download Email Attachments Automagically

Email is still one of the most important means of electronic communication. Apart from everyday usage with some convenient client (like the superb Thunderbird), from time to time one might need to get message content out of the mailbox and perform some bulk action(s) with it. An example could be downloading all image attachments from your mailbox into some folder. This can easily be done manually for a few emails, but what if there are 10 thousand emails? Your mailbox is usually hosted on some server and you can access it via the IMAP protocol. There are many possible ways to achieve this, however most of them require downloading or synchronizing the full mailbox locally and then extracting the required parts from messages and processing them. This can be very inefficient indeed. Recently I needed an automated task like the one above: search messages in a particular IMAP mailbox, identify attachments of a certain type and name, download them and run a command with them, and after the command has finished successfully, delete the email (or move it to another folder). Looking around, I did not find anything suitable that would meet my requirements (Linux, command line, simple yet powerful). So, having some experience with IMAP and Python, I decided to write such a tool myself. It’s called imap_detach, and you can check the details on its page. Here I’d like to present a couple of use cases for this tool in the hope that they might be useful for people with similar email processing needs.
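To make the task a bit more concrete, here is a minimal Python sketch using only the standard library. It is not the imap_detach code, just an illustration under simple assumptions: the function names, the default folder and the file name pattern are made up for this example, and it takes the naive route of fetching whole messages. A more efficient tool would first inspect the message structure (IMAP BODYSTRUCTURE) and fetch only the parts it needs.

```python
import email
import fnmatch
import imaplib
from email.message import Message


def matching_attachments(msg: Message, name_pattern: str = "*.jpg"):
    """Yield (filename, payload_bytes) for parts whose file name matches."""
    for part in msg.walk():
        filename = part.get_filename()
        # Compare case-insensitively so that PHOTO.JPG matches *.jpg too.
        if filename and fnmatch.fnmatch(filename.lower(), name_pattern.lower()):
            payload = part.get_payload(decode=True)
            if payload is not None:
                yield filename, payload


def fetch_attachments(host, user, password, folder="INBOX", name_pattern="*.jpg"):
    """Hypothetical helper: walk a mailbox and yield matching attachments.

    Naive on purpose: every message is downloaded in full (RFC822).
    """
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select(folder, readonly=True)  # read-only: nothing gets flagged
        _status, data = imap.search(None, "ALL")
        for num in data[0].split():
            _status, parts = imap.fetch(num, "(RFC822)")
            msg = email.message_from_bytes(parts[0][1])
            yield from matching_attachments(msg, name_pattern)
```

Saving each payload to a folder and running a command on it would be the next step; deleting or moving the processed message is left out of the sketch.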

Continue reading Download Email Attachments Automagically

Do We Trust Cloud Storage For Privacy?

With more generous offerings from cloud storage providers (up to 50GB free), cloud storage is a tempting alternative for storing some of our data. I have some data which I really do not want to lose. I already have them stored on several devices, however an additional copy in the cloud could help. But how much can I trust cloud providers to keep my data private, even from their own employees? Not that I have something super secret, but somehow I do not like the idea that some bored sysadmin will be browsing my family photos. Or that the provider will use my photos for some machine learning algorithms.

Main providers like Dropbox or Google do use some encryption, however they control the encryption keys, so they can theoretically access your data at any time and in the worst case provide them to third parties, like government agencies. From what I have seen looking around, only a few providers like Mega or SpiderOak offer privacy by design, which means all encryption is done on the client and they should not have any access to your keys (zero knowledge). However, how much can we trust that their implementation is flawless or that no intentional back-doors are left? There were some concerns about Mega security a couple of years ago, but no major issues have appeared since then.

So rather than trusting those guys fully, why not take an additional step and also encrypt our data before sending them to the cloud? Additional encryption will not cost us much CPU time on current hardware (from tests: 11% of one core of an old AMD CPU) and will not slow down transfers, because they are limited rather by Internet connection bandwidth. And on Linux we have quite a few quality encryption tools like gpg or openssl, which can be relatively easily integrated into our backup/restore chains. In the rest of this article I’ll describe my PoC shell script that backs up/restores a whole directory to MEGA, while providing additional encryption/decryption on the client side. Continue reading Do We Trust Cloud Storage For Privacy?
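The client-side encryption step can be sketched roughly like this. The PoC in the article is a shell script; the Python below is only an illustration of the same idea (tar the directory and pipe it through gpg with symmetric encryption). The function names, the paths and the choice of AES256 are assumptions of this example, and the upload to MEGA is left out entirely.

```python
import subprocess


def backup_commands(src_dir, out_file, passphrase_file):
    """Return the (tar, gpg) argument lists for an encrypted archive."""
    # tar writes a gzipped archive of src_dir to stdout ("-").
    tar_cmd = ["tar", "-czf", "-", "-C", src_dir, "."]
    # gpg reads the archive from stdin and encrypts it symmetrically.
    # --pinentry-mode loopback is needed by GnuPG 2.1+ so that
    # --passphrase-file is honoured in batch mode.
    gpg_cmd = [
        "gpg", "--batch", "--yes",
        "--pinentry-mode", "loopback",
        "--passphrase-file", passphrase_file,
        "--symmetric", "--cipher-algo", "AES256",
        "--output", out_file,
    ]
    return tar_cmd, gpg_cmd


def encrypt_directory(src_dir, out_file, passphrase_file):
    """Stream src_dir through tar and gpg without a temporary plain archive."""
    tar_cmd, gpg_cmd = backup_commands(src_dir, out_file, passphrase_file)
    tar = subprocess.Popen(tar_cmd, stdout=subprocess.PIPE)
    gpg = subprocess.Popen(gpg_cmd, stdin=tar.stdout)
    tar.stdout.close()  # so tar notices if gpg exits early
    gpg_rc = gpg.wait()
    tar_rc = tar.wait()
    if tar_rc != 0 or gpg_rc != 0:
        raise RuntimeError(f"backup failed (tar={tar_rc}, gpg={gpg_rc})")
```

Because the data are streamed, the unencrypted archive never touches the disk, which matters when the directory is large.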

Opus Audio Codec for Audio Books And More

Opus is a relatively new lossy audio codec from the Xiph.Org Foundation, successor to the Vorbis and Speex codecs. It provides very good quality for low-bandwidth (<=32kbps) speech streams, but also high quality for broader bandwidth (>64kbps) and more demanding data like music. So it can be a universal solution for any digital audio encoding. According to some tests presented on its site, it is comparable with HE-AAC for higher-bandwidth, higher-quality data, while it additionally provides better results for lower-bandwidth speech data (this is something xHE-AAC is addressing too, however I have not seen an available codec yet). And what is most appealing about Opus is that it’s free, not encumbered by patent royalties, and open source. (Meanwhile the majority of common audio codecs, e.g. MP3 and AAC, are restricted by patents and subject to royalties. I think Fraunhofer holds the basic patents, but the situation is quite complex and differs per country.)

Based on positive reviews, I thought that Opus could be the ideal codec for audio books, where it can provide good quality at low bit rates. At least for me, I really do not need top quality for audio books (say MP3 at 320kbps), where a book takes gigabytes of space. On the other hand, I do appreciate good quality: with low-quality audio I have problems understanding it and cannot really enjoy the book.

So how can Opus help and is it ready for everyday use? Continue reading Opus Audio Codec for Audio Books And More

Media Server For Music And Audio-Books

Having recently updated my mobile (still staying on Android) to a 4G device, I thought it was about time to make my audio collection available outside of my home network. At home I use a Samba share, which is quite fine for most uses, however enabling access from the internet required a bit more effort. In the following article I’d like to describe the options I’ve been looking at and the final solution. Continue reading Media Server For Music And Audio-Books

Why GMail is not changing all server certificates in synch?

I’m accessing my Gmail account from behind an HTTPS proxy, as described in this post. Thunderbird does not support it: for IMAP and SMTP only a SOCKS proxy can work. To cope with this I’m using a small local proxy that forwards any connection through the proxy to a remote host:port using the CONNECT method.
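For readers who have not met the CONNECT method, the idea of such a local forwarder can be sketched in a few lines of Python. This is not the tool I’m actually using, only an illustration: all function names, the listening port and the proxy address are assumptions, proxy authentication is not handled, and error handling is minimal.

```python
import socket
import threading


def connect_request(host: str, port: int) -> bytes:
    """Build the HTTP CONNECT request asking the proxy for a tunnel."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def tunnel_established(response_head: bytes) -> bool:
    """True if the proxy answered the CONNECT with a 2xx status."""
    status_line = response_head.split(b"\r\n", 1)[0].split()
    return len(status_line) >= 2 and status_line[1].startswith(b"2")


def open_tunnel(proxy_addr, host, port):
    """Connect to the proxy and return a socket tunnelled to host:port."""
    sock = socket.create_connection(proxy_addr)
    sock.sendall(connect_request(host, port))
    head = b""
    # Read byte by byte so nothing belonging to the tunnel is consumed.
    while not head.endswith(b"\r\n\r\n"):
        byte = sock.recv(1)
        if not byte:
            break
        head += byte
    if not tunnel_established(head):
        sock.close()
        raise ConnectionError(f"proxy refused CONNECT: {head[:80]!r}")
    return sock


def pump(src, dst):
    """Copy bytes from src to dst until src is closed."""
    try:
        while (data := src.recv(65536)):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def serve(listen_port, proxy_addr, host, port):
    """Accept local connections and forward each one through the proxy."""
    with socket.create_server(("127.0.0.1", listen_port)) as server:
        while True:
            client, _addr = server.accept()
            upstream = open_tunnel(proxy_addr, host, port)
            for a, b in ((client, upstream), (upstream, client)):
                threading.Thread(target=pump, args=(a, b), daemon=True).start()
```

With something like serve(1993, ("proxy.example", 3128), "imap.gmail.com", 993) running, the mail client would be pointed at localhost:1993; the proxy address here is a placeholder.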

This works fine, but in the email client I had to set both the IMAP server and the SMTP server to localhost. Thunderbird is cautious about this: since both connections use TLS/SSL, there is a security issue, because I’m connecting to localhost but the certificates are for the *.gmail.com domain. Luckily Thunderbird lets me set a security exception. It asks me whether I’ll allow that certificate for that host address, and if I confirm, everything works like a charm until Gmail changes the certificate on its servers (which happens a couple of times per year or so). Continue reading Why GMail is not changing all server certificates in synch?

Convert Dictionary for PocketBook eBook Reader

I’m a great fan of ebooks and recently I replaced my old Hanlin with a new reader, the PocketBook 626. PocketBook is a European company which distributes PocketBook readers; the actual development of the devices is done by the Ukrainian company Obreey Products. Ukrainian and Russian programmers have always been very active in ebook technology (FB2 format, CoolReader, FBReader, OpenInkPot (open source firmware for ebook readers) and more), so no wonder they have been able to produce quite a nice device. I personally prefer it to the Kindle because of wider format support (especially the epub format) and broader possibilities for customizing the device (price-wise they are basically similar to the Kindle).

One of the first tasks was to get more useful language dictionaries onto it (there are some built in, but one can get better ones). Many high quality free dictionaries are available in stardict format (for instance here for Czech language dictionaries; other sites are available which also offer other dictionary formats). Continue reading Convert Dictionary for PocketBook eBook Reader

Eclipse Help Browser And Proxy

It’s quite pathetic that HTTP proxy settings cause problems again and again in various applications, like UbuntuOne, pip … Maybe it is just a problem of Ubuntu/Linux platforms, where proxy settings live in separate places (dconf keys for the desktop; http_proxy, HTTP_PROXY, no_proxy environment variables).

This time it was the Eclipse IDE. The problem is this: Eclipse has proxy settings in Preferences / General / Network Connections, however these settings are not applied to the Help Browser (started via Help / Help Contents). This browser uses the system settings (I believe from the dconf key system.proxy in my case), but not in a consistent way: while a browser is fine with a subnet entry in system.proxy.ignore-hosts like 127.0.0.1/8, the Eclipse help browser is not; it requires just the server part of the URL (e.g. just 127.0.0.1).
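To illustrate what a subnet entry in an ignore list is supposed to mean, here is a small Python sketch of the matching logic. It is not how Eclipse or the desktop implement it; the function name and the accepted entry forms (exact host names and CIDR subnets only) are assumptions of this example.

```python
import ipaddress


def bypasses_proxy(host, ignore_hosts):
    """Return True if host matches an ignore entry (exact name or subnet)."""
    for entry in ignore_hosts:
        if host == entry:
            return True
        try:
            # strict=False accepts entries with host bits set,
            # such as 127.0.0.1/8 instead of 127.0.0.0/8.
            network = ipaddress.ip_network(entry, strict=False)
            address = ipaddress.ip_address(host)
        except ValueError:
            continue  # entry or host is a name, not an IP address
        if address in network:
            return True
    return False
```

An application that only does the exact comparison in the first branch will accept 127.0.0.1 but silently ignore 127.0.0.1/8, which is the kind of inconsistency described above.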

Also, the Native option for proxy settings in Eclipse (which is used for updates and plugin installs) seems not to work on Linux.

I spent some time fixing this, another victim of inconsistent proxy handling.

Playing with maps (QGIS, PostGIS, OSM)

I do like maps. Recently I was looking at some digital maps, got into more detail and found several nice open source tools which I’d like to write about in this article. As in any area, open source provides interesting alternatives for working with digital maps and geographical data, and what is also very interesting, there are now free sources of good quality geographical data which can be used freely by everybody to create their own maps. In this article I explain how to use data from the OpenStreetMap (OSM) project for creating maps in QGIS and how to use PostGIS to store geographic information, and have fun playing around with the different features of map data and tools. This tutorial is focused on the Linux desktop (Debian based: Ubuntu 12.04 resp. Mint in my case) and assumes some basic knowledge of Linux desktop administration. Continue reading Playing with maps (QGIS, PostGIS, OSM)