All posts by admin

Simple statistics from nginx access logs

I needed some simple statistics (visits per day for selected pages) from web server logs. I looked at some web log analyzer packages such as AWStats, but they seemed like overkill in my case: I would probably have spent more time trying to make one work than putting together a small script. So here it is, a simple bash script that takes all available access logs (by default on Debian, nginx uses logrotate to rotate logs daily and keeps 52 daily logs; old logs are gzipped) and calculates page visits for a given request pattern.
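The same idea can be sketched in Python. This is not the bash script from the post, only a minimal illustration that assumes the default nginx "combined" log format and English month names in the timestamps; the function names and the `access.log*` glob are hypothetical choices.

```python
import gzip
import re
from collections import Counter
from datetime import datetime
from pathlib import Path

# Assumption: default nginx "combined" format, e.g.
# 1.2.3.4 - - [10/Oct/2013:13:55:36 +0200] "GET /page HTTP/1.1" 200 ...
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?P<method>\S+) (?P<path>\S+)'
)


def open_log(path):
    """Open a plain or gzipped (rotated) log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "rt", encoding="utf-8", errors="replace")


def count_visits(lines, pattern):
    """Count requests per day whose path matches the regex `pattern`."""
    wanted = re.compile(pattern)
    per_day = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and wanted.search(m.group("path")):
            # %b relies on English month names (the default C locale).
            day = datetime.strptime(m.group("day"), "%d/%b/%Y").date()
            per_day[day] += 1
    return per_day


def count_visits_in_dir(log_dir, pattern, glob="access.log*"):
    """Sum per-day counts over the current log and all rotated logs."""
    total = Counter()
    for path in sorted(Path(log_dir).glob(glob)):
        with open_log(path) as f:
            total.update(count_visits(f, pattern))
    return total
```

Calling `count_visits_in_dir("/var/log/nginx", r"^/blog/")` would then return a counter keyed by date, which can be printed in sorted order to get one line per day.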

Poor man’s backup for XenServer

I’m running several XenServer hosts and wanted to provide some basic backup of the VMs. I decided to use a USB disk, since XenServer 6.2 provides great support for external disks. I looked around for a simple free backup tool (to back up several VMs from different servers in scheduled batches) but did not find anything suitable: simple scripts were not flexible enough, and bigger solutions were overkill in my case. So I created my own solution, xapi-back.

My setup is:

  • I created a small Debian VM and attached the USB disk to it (in XS 6.2 the external disk stays connected to the VM after a VM or host reboot)
  • Installed xapi-back
  • Created a special user for backups
  • Scheduled VM backups with cron

Main advantages of xapi-back compared to other similar solutions:

  • easy to install – just download and run python setup.py
  • easy to configure – just one simple configuration file with details of the Xen servers and some basic backup parameters
  • self-contained – does not need xe or other tools (as many other solutions do) and can run on any computer (not only in the XenServer Dom0, as some scripts require; generally I think it’s not good practice to run backups in Dom0, and it is better to run them separately)
  • complete – you can do all basic tasks from xapi-back via the simple command line interface xb: list VMs and their last backups, back up, restore, and set VMs for scheduled backup (with the help of cron). You will not need any other management tools (xe, XenCenter, ...) to make backups.
  • self-maintaining – xapi-back can be scheduled and run automatically. It maintains the backup storage, keeps the last N backups and removes older ones, so it can run unattended for months.
  • compact – it’s a very small solution, so it can run on any machine; only Python is needed (it runs easily on a minimal Debian install or even on a NAS)
  • universal – can run on any POSIX system where Python runs (any Linux, FreeBSD, Solaris ...)
  • multiple servers – can handle multiple XenServers and server pools
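The "keep the last N backups" rule from the self-maintaining point can be illustrated with a short sketch. This is not xapi-back's actual code; the function name `prune_backups` and the assumption that one directory holds one file per backup are hypothetical.

```python
from pathlib import Path


def prune_backups(backup_dir, keep):
    """Delete all but the `keep` newest files in backup_dir.

    Hypothetical helper: "newest" is judged by file modification time.
    Returns the list of removed paths.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    # Newest first, so everything after index `keep` is surplus.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    removed = files[keep:]
    for p in removed:
        p.unlink()
    return removed
```

Running such a step after every scheduled backup keeps disk usage bounded, which is what allows an unattended setup to run for months.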

Plugins in OCAML with Dynlink library

I am slowly continuing to learn OCaml. As a training project I am working on a simplified Map-Reduce framework (using the Core and Async libraries). Here I needed to plug selectable code (a map-reduce algorithm) into the main program. OCaml provides the Dynlink library, which can dynamically link either a bytecode or a native object/library into a running program. This can be used to create a simple plugin framework, as explained in the full post.

Changing Management Interface on XenServer 6.2

If you change the management interface on XS (from eth1 to eth2 in my case, because eth1 was not connected to the right subnet), be aware that the management console may not update the routing table for local subnet access appropriately: it keeps the record for the previous interface there with the same metric value, so the old record actually takes precedence. You might then end up in a situation where you cannot access the local subnet.

You can check with the route command:

route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.163.127.32   *               255.255.255.224 U     0      0        0 xenbr1
10.163.127.32   *               255.255.255.224 U     0      0        0 xenbr2
default         prague03-lab-co 0.0.0.0         UG    0      0        0 xenbr2

Temporarily it can be fixed by:

route del -net 10.163.127.32  netmask 255.255.255.224 dev xenbr1

However, this change is not saved, so it will not survive a reboot.

The permanent solution is to use xsconsole and “Network and Management Interface” / “Emergency Network Reset”. Note that this function reboots the host.
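The duplicate entry visible in the route output above can also be spotted by a script. A minimal sketch, assuming the eight-column layout printed by the route command as shown; `find_duplicate_routes` is a hypothetical helper name.

```python
from collections import defaultdict


def find_duplicate_routes(route_output):
    """Map (destination, genmask, metric) to interfaces when more than one has it."""
    routes = defaultdict(list)
    for line in route_output.splitlines():
        fields = line.split()
        # Data rows have exactly 8 columns; skip the title and header rows.
        if len(fields) != 8 or fields[0] == "Destination":
            continue
        dest, _gw, genmask, _flags, metric, _ref, _use, iface = fields
        routes[(dest, genmask, metric)].append(iface)
    return {key: ifaces for key, ifaces in routes.items() if len(ifaces) > 1}


SAMPLE = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.163.127.32   *               255.255.255.224 U     0      0        0 xenbr1
10.163.127.32   *               255.255.255.224 U     0      0        0 xenbr2
default         prague03-lab-co 0.0.0.0         UG    0      0        0 xenbr2
"""

print(find_duplicate_routes(SAMPLE))
# {('10.163.127.32', '255.255.255.224', '0'): ['xenbr1', 'xenbr2']}
```

Any key reported here is a subnet reachable through two bridges with the same metric, which is the situation described above.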

Streaming video file from BitTorrent P2P network

Although the BitTorrent (BT) protocol was not designed for media streaming, in practice it can be used, to a certain extent, to stream a video file from the P2P network. The key trick is to force sequential download in the BT client. Normally a BT client first selects the pieces that are least available in the swarm, which contributes to better distribution of the file; sequential download works against this, so it is not enabled in regular BT clients.

But if we force the BT client to download sequentially, cache incoming pieces, and have enough incoming bandwidth from peers, we can stream the incoming video directly into a video player. Admittedly it’s poor man’s streaming, because it lacks advanced features such as stream synchronization and seeking, but in many cases it works well enough.
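The difference between the two piece-selection strategies can be shown with a toy sketch. It is not taken from any real BT client; the function names and the `availability` list (number of peers holding each piece) are assumptions made for illustration.

```python
def pick_rarest_first(missing, availability):
    """Pick the missing piece held by the fewest peers (ties: lowest index)."""
    return min(missing, key=lambda i: (availability[i], i))


def pick_sequential(missing):
    """Pick the missing piece with the lowest index, i.e. the next one to play."""
    return min(missing)


def playable_prefix(have):
    """Count the contiguous pieces from the start of the file that are present."""
    n = 0
    while n in have:
        n += 1
    return n


missing = {0, 1, 2, 3}
availability = [5, 1, 3, 1]  # peers holding pieces 0..3

print(pick_rarest_first(missing, availability))  # 1
print(pick_sequential(missing))                  # 0
print(playable_prefix({0, 1, 3}))                # 2
```

A player can only consume the contiguous prefix, so rarest-first tends to leave gaps near the start of the file, while sequential selection extends the playable part with every piece received.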

Migrating xend virtual machines to xapi platform (XCP/XenServer)

The Xen hypervisor currently has 3 toolstacks: xend + xm (now deprecated), xl (the new low-level tool) and xapi + xe. xapi is the most advanced and is used in XenServer (and in XCP, which is now also deprecated because XenServer is now open source). Recently I have been migrating some virtual machines from Xen 4 with xend to XenServer 6.2. The full post gives details of migrating Linux machines to the new environment.

Not Always PyPy Is Faster

PyPy is an alternative Python interpreter known for its speed. However, it is not always faster than the ‘classic’ Python interpreter (called CPython here). For one small project of mine, PDF Checker, I tested PyPy hoping to speed up PDF document processing (basically parsing to extract text; the pdfminer library is used, and document parsing takes the majority of the time). Below are results from running the program on two different files, in the CPython interpreter and in PyPy (with and without JIT compilation):

                    CPython   PyPy     PyPy with JIT disabled
Small PDF (110kB)   1.1 s     2.4 s    2.5 s
Big PDF (996kB)     16.6 s    10.9 s   36.5 s
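Measurements like these can be collected with a small timing helper run under each interpreter. This is only a sketch, not the method used for the table above; `time_calls` is a hypothetical name. It records each repetition separately, because with a JIT the first run may include warm-up cost that later runs do not.

```python
import time


def time_calls(func, *args, repeat=3):
    """Return wall-clock seconds for each of `repeat` calls to func(*args)."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return timings


def sample_workload(n):
    """Stand-in workload; replace with the real document parsing call."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for run, seconds in enumerate(time_calls(sample_workload, 200_000), start=1):
        print(f"run {run}: {seconds:.4f} s")
```

Running the same file with each interpreter and comparing the first timing with the later ones gives a rough idea of how much of a short run is start-up and warm-up overhead.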


Decoding Audio Captchas in Python

For better or worse, many sites now use CAPTCHAs to determine whether a visitor is a human or a computer program. A captcha presents a task, usually reading some distorted letters and typing them back into a form. This is considered hard for a computer to do, so the user must be human. To improve accessibility, visual captchas are accompanied by audio captchas, where the letters are spelled out (usually with some background noise to make letter recognition more difficult). However, audio captchas are known to be easier to break. Inspired by this article [1], I created a Python implementation of audio captcha decoding using commonly available libraries and just a general knowledge of speech recognition technologies. The software is called adecaptcha, and I tested it on a couple of sites, where I got 99.5% accuracy of decoded letters for one site and 90% accuracy for another site (which has much more distorted audio).