Do We Trust Cloud Storage For Privacy?

With cloud storage providers making ever more generous offers (up to 50GB free), cloud storage is a tempting alternative for storing some of our data. I have some data that I really do not want to lose. I already have it stored on several devices, but an additional copy in the cloud could help. But how much can I trust cloud providers to keep my data private, even from their own employees? Not that I have anything super secret, but I do not like the idea of some bored sysadmin browsing my family photos, or of the provider using my photos for some machine learning algorithms.

Major providers like Dropbox and Google do use some encryption, but they control the encryption keys, so they can theoretically access your data at any time and, in the worst case, hand it over to third parties such as government agencies. From what I have seen, only a few providers, like Mega or SpiderOak, offer privacy by design, which means that all encryption is done on the client and they should not have any access to your keys (zero knowledge). But how much can we trust that their implementation is flawless, or that no intentional back doors have been left in? There were some concerns about Mega's security a couple of years ago, but no major issues have appeared since then.

So rather than trusting those guys fully, why not take an additional step and encrypt our data ourselves before sending it to the cloud? Additional encryption will not cost us much CPU time on current hardware (in my tests, 11% of one core of an old AMD CPU) and will not slow down transfers, because those are limited by Internet connection bandwidth anyway. And on Linux we have quite a few quality encryption tools, like gpg or openssl, which can be integrated relatively easily into our backup/restore chains. In the rest of this article I'll describe my PoC shell script, which backs up and restores a whole directory to MEGA while providing additional encryption and decryption on the client side.

The script does the following:

  • Creates a compressed archive of the given directory (tar.gz)
  • Splits the archive into files of a given size
  • Encrypts each file with AES-256
  • Calculates a SHA1 checksum for each file
  • Stores the files and their checksums on MEGA

Recovery is done similarly, taking the steps in reverse order.
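
The steps above can be sketched roughly as follows. This is not the script from this article, only a minimal illustration under several assumptions: GNU tar, GNU split (for its --filter option), sha1sum, and OpenSSL 1.1.1 or newer (for -pbkdf2). The function names, the chunk naming and the BACKUP_PASS variable are made up for the example, CBC mode is my choice (the list above only says AES-256), and the upload to MEGA is left out entirely.

```shell
# Illustrative sketch only; names and layout are hypothetical, not the author's script.
# The passphrase is taken from the exported environment variable BACKUP_PASS.

# backup_dir SRC_DIR OUT_DIR CHUNK_BYTES
backup_dir() (
  set -eu
  # Enable pipefail where the shell supports it, so a failing tar is noticed.
  if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi
  src=$1 out=$2 size=$3
  mkdir -p "$out"
  # tar.gz stream -> split into chunks -> every chunk is encrypted by the filter.
  # GNU split exports $FILE (the chunk name) to each run of the filter command.
  tar -czf - -C "$src" . |
    split -b "$size" -d -a 4 \
      --filter='openssl enc -aes-256-cbc -pbkdf2 -salt -pass env:BACKUP_PASS -out "$FILE.enc"' \
      - "$out/chunk."
  # SHA1 checksum of every encrypted chunk, stored next to the chunks.
  cd "$out"
  sha1sum chunk.*.enc > SHA1SUMS
)

# restore_dir IN_DIR DEST_DIR
restore_dir() (
  set -eu
  if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi
  in=$1 dest=$2
  mkdir -p "$dest"
  # Verify the chunks first, then decrypt them in order into one tar.gz stream.
  ( cd "$in" && sha1sum -c --quiet SHA1SUMS )
  for f in "$in"/chunk.*.enc; do
    openssl enc -d -aes-256-cbc -pbkdf2 -pass env:BACKUP_PASS -in "$f"
  done | tar -xzf - -C "$dest"
)
```

With BACKUP_PASS exported, something like `backup_dir ~/photos /tmp/out 104857600` would leave 100MB encrypted chunks plus a SHA1SUMS file in /tmp/out, and `restore_dir /tmp/out ~/restored` would reverse it. Nothing is ever written to disk unencrypted, because the archive only exists as a stream between tar, split and openssl.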

It is also possible to share this backup with somebody who does not have an account on MEGA. This works thanks to a unique feature of MEGA, sharing links: each file in MEGA is encrypted with a unique key (which is in turn encrypted with your master key). MEGA can export links that include these keys, so the recipient can download and decrypt the files (although in our case they will still be protected by our additional encryption). When backing up to MEGA with our script we can create a so-called manifest file, which contains the links to the files and also the additional secret used for our private encryption. If this file is shared with somebody who has this script, they can easily download and restore the backup.
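
To make the idea of a manifest more concrete, here is a small sketch of how such a file could be written and read. The layout (a first line with the secret, followed by one line per file with its link and name) and the function names are assumptions made up for this example; the real script's manifest format may differ.

```shell
# Illustrative sketch only; the manifest layout here is hypothetical.

# write_manifest SECRET LINKS_FILE > MANIFEST
# LINKS_FILE holds one "export-link chunk-name" pair per line.
# The secret must not contain newlines.
write_manifest() (
  set -eu
  printf 'secret %s\n' "$1"
  while read -r link name; do
    printf 'file %s %s\n' "$link" "$name"
  done < "$2"
)

# manifest_secret MANIFEST: print the passphrase stored on the first line.
manifest_secret() {
  sed -n '1s/^secret //p' "$1"
}

# manifest_links MANIFEST: print the download links, one per line, in stored order.
manifest_links() {
  awk '$1 == "file" { print $2 }' "$1"
}
```

On the receiving side, each line printed by manifest_links could then be handed to a downloader such as megatools' megadl, and the value from manifest_secret used for decryption. Since the manifest carries both the links and the secret, it should be treated like a password (for example, created with restrictive file permissions and sent only over a trusted channel).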

The script is designed for efficiency: it processes data through piped streams, so it can handle large backups.

The script requires megatools, an open source client for MEGA. And here is the script:

 

Usage is pretty simple:

To back up:

And then to restore:

Optionally, you can back up and get a manifest:

Then anybody who has the manifest file and this script can recover the backup:

I personally tested it with a 32GB directory. It took some time (several hours to back up and much longer, more than half a day, to restore; it looks like download speed from MEGA is much more limited), but it generally works fine.
