Download Email Attachments Automagically

Email is still one of the most important means of electronic communication. Apart from everyday use with a convenient client (like the superb Thunderbird), from time to time one might need to get message content out of the mailbox and perform some bulk action on it. An example would be downloading all image attachments from your mailbox into a folder: this is easy to do manually for a few emails, but what if there are ten thousand of them?

Your mailbox is usually hosted on a server and you can access it via the IMAP protocol. There are many possible ways to achieve this, but most of them require downloading or synchronizing the full mailbox locally and then extracting the required parts from the messages and processing them. This can be very inefficient.

Recently I needed an automated task like the one above: search messages in a particular IMAP mailbox, identify attachments of a certain type and name, download them and run a command on them, and after the command finishes successfully, delete the email (or move it to another folder). Looking around, I did not find anything suitable that would meet my requirements (Linux, command line, simple yet powerful). So, having some experience with IMAP and Python, I decided to write such a tool myself. It’s called imap_detach, and you can check the details on its page. Here I’d like to present a couple of use cases for this tool in the hope that they might be useful for people with similar email processing needs.

Let’s start with a simple example:

This will download all attachments from all emails in the user’s inbox and save them in subdirectories, grouped first by year and then by sender. If there are many emails, it can take quite some time. In some cases you might notice error messages complaining that the output file is a directory, which means that the attachment does not have a name defined within the email.
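To illustrate where the "output file is a directory" error comes from, here is a minimal Python sketch that builds year/sender/name paths for the attachments of a single message using only the standard library. The `target_paths` function and the `attachments` root directory are hypothetical names chosen for this illustration; this is not the code imap_detach uses, and it assumes the year is taken from the message's Date header.

```python
from email.message import EmailMessage
from email.utils import parseaddr, parsedate_to_datetime


def target_paths(msg, root="attachments"):
    """Return one 'root/year/sender/name' path per attachment.

    The name component is empty when the attachment carries no file name,
    so the resulting path points at the sender directory itself.
    """
    year = parsedate_to_datetime(str(msg["Date"])).year
    sender = parseaddr(str(msg["From"]))[1]  # bare address without display name
    return [
        "/".join([root, str(year), sender, part.get_filename() or ""])
        for part in msg.iter_attachments()
    ]


# Build a small test message: one named and one unnamed attachment.
message = EmailMessage()
message["From"] = "Alice <alice@example.com>"
message["Date"] = "Mon, 05 Mar 2018 10:00:00 +0000"
message.set_content("See attached.")
message.add_attachment(b"\x89PNG", maintype="image", subtype="png", filename="cat.png")
message.add_attachment(b"raw", maintype="application", subtype="octet-stream")

for path in target_paths(message):
    print(path)
```

The second path ends with a slash, that is, it names a directory rather than a file, which is the situation the error message describes.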

This error is resolved in the next example by a more sophisticated naming of the output file, using the {name|subject+section} replacement (| serves as ‘or’ and + joins two variables, so if the attachment does not have a name, we use the subject and section as the file name, which can look like “Important message_2.1”).
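The fallback logic of such a placeholder can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not the tool's actual implementation: the `render_name` function is a hypothetical name, and joining with an underscore is inferred from the “Important message_2.1” example above.

```python
import re


def render_name(template, values):
    """Expand {a|b+c} placeholders in a file name template.

    '|' picks the first alternative whose variables are all non-empty,
    '+' joins the variables of one alternative with an underscore.
    """
    def expand(match):
        for alternative in match.group(1).split("|"):
            parts = [str(values.get(var.strip()) or "") for var in alternative.split("+")]
            if all(parts):
                return "_".join(parts)
        return ""  # no alternative could be filled in

    return re.sub(r"\{([^{}]+)\}", expand, template)


# An attachment without a name falls back to subject and section.
print(render_name("{name|subject+section}",
                  {"name": "", "subject": "Important message", "section": "2.1"}))
# prints: Important message_2.1
```

When the attachment does have a name, the first alternative wins and the subject and section are never consulted.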

We can also try adding the --threads argument, which enables concurrent download of attachments in separate threads:

In my tests with my Gmail mailbox, concurrent download with 5 threads was 3.7 times faster than single-threaded download (~1200 files, ~450 MB).
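Downloading is mostly waiting for the network, which is why threads help here. The general pattern can be sketched in Python with a thread pool; `download_all` and `fetch_part` are hypothetical names, and the fetch function is a stand-in that only simulates network latency. In a real IMAP client each worker would most likely need its own connection, since a single connection object should not be shared between threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download_all(part_ids, fetch_part, workers=5):
    """Fetch all parts concurrently and return {part_id: content}."""
    part_ids = list(part_ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the inputs
        return dict(zip(part_ids, pool.map(fetch_part, part_ids)))


def fake_fetch(part_id):
    """Stand-in for a network fetch: sleeps briefly, then returns bytes."""
    time.sleep(0.05)
    return ("content of " + part_id).encode()


results = download_all(["1.1", "1.2", "2", "3"], fake_fetch, workers=5)
print(sorted(results))
```

With five workers the four simulated fetches overlap instead of running one after another, which is the effect measured in the test above.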

But we are not limited to email attachments; all email parts are available to us. What about getting all plain text parts and putting them into one big file, which we can later use for some analysis:

We might be more specific about which messages to get. For instance, we are interested only in junk messages from this year:

Text parts of an email can have different charset encodings (for instance, for the Czech language we can have iso-8859-2, win-1250 or UTF-8). The tool solves this by re-encoding text to UTF-8, so that all text in the output file is in this charset.
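The re-encoding step can be sketched in Python as follows. This is a minimal illustration and not the tool's own code: the `to_utf8` function and the `ALIASES` table are assumptions made for this example, as is the choice to fall back to UTF-8 with replacement characters when the declared charset is unknown.

```python
import codecs

# Charset labels seen in emails that Python does not know under that name
# (illustrative; extend as needed).
ALIASES = {"win-1250": "cp1250"}


def to_utf8(raw, charset=None):
    """Decode bytes using the declared charset and re-encode them as UTF-8."""
    name = (charset or "utf-8").lower()
    name = ALIASES.get(name, name)
    try:
        codecs.lookup(name)
    except LookupError:
        name = "utf-8"  # unknown label: fall back rather than fail
    return raw.decode(name, errors="replace").encode("utf-8")


czech = "Příliš žluťoučký kůň"
# The same text arrives in two different encodings and ends up identical.
print(to_utf8(czech.encode("iso-8859-2"), "iso-8859-2") == to_utf8(czech.encode("cp1250"), "win-1250"))
```

Using `errors="replace"` means a wrongly declared charset produces a few replacement characters instead of aborting the whole run, which seems a reasonable trade-off for bulk processing.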

Similarly, we can look at messages in other folders, say the folder Spam and all its sub-folders, look only for text in the first sub-part of the email message (that should be the text of the email), and get only emails whose subject starts with “Re:”:

And what about finding all links in your mailbox (with a bit of quote-escaping madness):
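For comparison, the link-extraction step on its own can be sketched in Python without any shell quoting. This assumes the text parts have already been collected into a string; the `find_links` function and the regular expression are illustrative and deliberately simple, not part of imap_detach.

```python
import re

# Match http(s) URLs up to whitespace, angle brackets or quotes.
URL_RE = re.compile(r"""https?://[^\s<>"']+""")


def find_links(text):
    """Return the unique URLs found in text, in order of first appearance."""
    urls = (url.rstrip(".,;:!?)") for url in URL_RE.findall(text))
    return list(dict.fromkeys(urls))  # dict keys keep insertion order


sample = ('See http://example.com/a, and <a href="https://example.org/b?x=1">this</a>. '
          'Once more: http://example.com/a')
print(find_links(sample))
```

Trailing punctuation is stripped because links in running text are often followed by a comma or a full stop that is not part of the address.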

Or using a fairly complex filter:

And there are many more possibilities; check the details on the tool’s home page.

11 thoughts on “Download Email Attachments Automagically”

  1. I am using Python 2.7. I went to your GitHub.
    I did a pip install.

    I cannot seem to use it. Could you write the steps for how I can use it on Windows 10?

    Normally i run scripts using python xyz.py

    This is not working for me

    1. This is not working for me

      It’s a pretty broad statement: what exactly is not working? On Linux it installs the runnable script detach.py to /usr/bin. I’m not sure what is happening on Windows because I’m not using it. So look where it gets installed (I think it’s something like C:/Python27/Scripts) and run it with the full path. My primary platform is Linux and I have never tested it on Windows, so I cannot guarantee it’ll work there.

  2. Thank you very much for that fantastic script!!

    It works fine and fulfills all my needs!

    One question: I could not work out how to mark a mail as read or delete it after it has been processed.

    Could you please explain it in an example command line?

    This would help me a lot!!

    Thank you!!

  3. Wow, please forget about my previous question!

    I asked too quickly. I have found it out by myself (RFM) :).

    But there is another thing I can’t figure out.

    How do I tell the program to process only unread messages?

    Thank you once again!

    1. Use the seen variable in the filter; you can learn more details by following the link at the bottom of the article.

      1. Thank you!

        I tried for a few hours to use “seen” but I couldn’t make it work.

        detach.py -H xxxx:993 -u xxx.com -p xxxxxxx -f /var/opt/processing/{to}/{name} --seen --log-file /var/log/detach.log -v 'attached'

        This is how far I got but I am in trouble with the syntax of the filtering. I always end up in syntax errors and so on…

        One example would help me a lot. In the above line I would like to process only unseen mails.

        Thank you!!!

    1. You can specify the folder to use with the --folder parameter, but there is no way to download from all folders in one go with the tool. However, you can create a shell script to loop through a list of known folders.

  4. I tried to execute below code but got an error message as syntax error in imap

    detach.py -H imap.gmail.com -u python37@gmail.com -p Python3.7 -f ~/tmp/attachments/{2018}/{kk@ao.com/{name.subject_section}  -v --threads 5 'attached'.

    1. The file name is clearly incorrect: even the brackets are not balanced (and stray characters got there through a copy-and-paste error). In the file name, {x} is a placeholder to be replaced by the value of a variable, not a place for a filter!
      If you need to filter mails, add the condition to the filter expression (the last argument), so the command line should look rather like:

      detach.py -H imap.gmail.com -u python37@gmail.com -p Python3.7 -f ~/tmp/attachments/{name|subject+section} -v --threads 5 'attached & year=2018 & from="kk@ao.com"'

      See the link at the end of this article for more details about this tool.

      1. Thank You for your immediate response.
        Since I’m new to Python, I am still finding it difficult to understand. I have modified the command as per your reply; please see the command below. What happens is that once it is executed, a Notepad file opens with the content “#!d:\kk\python_installation\python.exe
        from imap_detach.cmd import main

        if __name__ == '__main__':
            main() ”

        Command:
        detach.py -H imap.gmail.com -u python37@gmail.com -p Python3.7 -f ~/tmp/attachments/{2018}/{kk2@co.com} -v 'attached'
