Nathan Grigg


Fastmail JMAP backup


[Update: Since I first wrote this, Fastmail switched from using HTTP BasicAuth to Bearer Authorization. I have updated the script to match.]

I use Fastmail for my personal email, and I like to keep a backup of my email on my personal computer. Why make a backup? When I am done reading or replying to an email, I make a split-second decision on whether to delete or archive it on Fastmail’s server. If it turns out I deleted something that I need later, I can always look in my backup. The backup also predates my use of Fastmail and serves as a service-independent store of my email.

My old method of backing up the email was to forward all my email to a Gmail account, then use POP to download the email with a hacked-together script. This had the added benefit that the Gmail account also served as a searchable backup.

Unfortunately the Gmail account ran out of storage and the POP script kept hanging for some reason, which together motivated me to get away from this convoluted backup strategy.

The replacement script uses JMAP to connect directly to Fastmail and download all messages. It is intended to run periodically. Each run picks an end time 24 hours in the past, downloads all email received before that (on the very first run, only the preceding week), and then records the end time. The next time it runs, it searches for mail between the previous end time and a new end time, which is again 24 hours in the past.
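To make the windowing concrete, here is a minimal sketch of just the date arithmetic, separate from any network code. The function name `compute_window` and its parameters are made up for illustration; the one-week look-back on the first run mirrors the default in the script below.

```python
import datetime


def compute_window(now, last_end_time=None, delay_hours=24, first_run_weeks=1):
    """Return (start, end) for one backup run.

    `end` trails `now` by `delay_hours`; `start` is the previous run's end
    time, or a fixed look-back before `end` on the first run.
    """
    end = now.replace(microsecond=0) - datetime.timedelta(hours=delay_hours)
    if last_end_time is None:
        start = end - datetime.timedelta(weeks=first_run_weeks)
    else:
        start = last_end_time
    return start, end


# Two consecutive daily runs: the second window starts where the first ended.
run1 = datetime.datetime(2024, 1, 10, 8, 0, 0)
start1, end1 = compute_window(run1)
start2, end2 = compute_window(run1 + datetime.timedelta(days=1), last_end_time=end1)
```

Because each window starts exactly where the previous one ended, consecutive runs leave no gap between windows, provided the search treats the end of each window as exclusive.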

Why pick a time in the past? Well, I’m not confident that if you search up until this exact moment, you are guaranteed to get every message. A message could come in, then two seconds later you send a query, but it hits a server that doesn’t know about your message yet. I’m sure an hour is more than enough leeway, but since this is a backup, we might as well make it a 24-hour delay.

Note that I am querying all mail, regardless of which mailbox it is in, so even if I have put a message in the trash, my backup script will find it and download it.

JMAP is a modern JSON-based replacement for IMAP and much easier to use, such that the entire script is 140 lines, even with my not-exactly-terse use of Python.

Here is the script, with some notes below.

import argparse
import collections
import datetime
import os
import requests
import string
import sys
import yaml

Session = collections.namedtuple('Session', 'headers account_id api_url download_template')


def get_session(token):
    headers = {'Authorization': 'Bearer ' + token}
    r = requests.get('https://api.fastmail.com/.well-known/jmap', headers=headers)
    [account_id] = list(r.json()['accounts'])
    api_url = r.json()['apiUrl']
    download_template = r.json()['downloadUrl']
    return Session(headers, account_id, api_url, download_template)


Email = collections.namedtuple('Email', 'id blob_id date subject')


def query(session, start, end):
    json_request = {
        'using': ['urn:ietf:params:jmap:core', 'urn:ietf:params:jmap:mail'],
        'methodCalls': [
            [
                'Email/query',
                {
                    'accountId': session.account_id,
                    'sort': [{'property': 'receivedAt', 'isAscending': False}],
                    'filter': {
                        'after': start.isoformat() + 'Z',
                        'before': end.isoformat() + 'Z',
                    },
                    'limit': 50,
                },
                '0',
            ],
            [
                'Email/get',
                {
                    'accountId': session.account_id,
                    '#ids': {
                        'name': 'Email/query',
                        'path': '/ids/*',
                        'resultOf': '0',
                    },
                    'properties': ['blobId', 'receivedAt', 'subject'],
                },
                '1',
            ],
        ],
    }

    while True:
        full_response = requests.post(
            session.api_url, json=json_request, headers=session.headers
        ).json()

        if any(x[0].lower() == 'error' for x in full_response['methodResponses']):
            sys.exit(f'Error received from server: {full_response!r}')

        response = [x[1] for x in full_response['methodResponses']]

        if not response[0]['ids']:
            return

        for item in response[1]['list']:
            date = datetime.datetime.fromisoformat(item['receivedAt'].rstrip('Z'))
            yield Email(item['id'], item['blobId'], date, item['subject'])

        # Set anchor to get the next set of emails.
        query_request = json_request['methodCalls'][0][1]
        query_request['anchor'] = response[0]['ids'][-1]
        query_request['anchorOffset'] = 1


def email_filename(email):
    subject = (
            email.subject.translate(str.maketrans('', '', string.punctuation))[:50]
            if email.subject else '')
    date = email.date.strftime('%Y%m%d_%H%M%S')
    return f'{date}_{email.id}_{subject.strip()}.eml'


def download_email(session, email, folder):
    r = requests.get(
        session.download_template.format(
            accountId=session.account_id,
            blobId=email.blob_id,
            name='email',
            type='application/octet-stream',
        ),
        headers=session.headers,
    )

    with open(os.path.join(folder, email_filename(email)), 'wb') as fh:
        fh.write(r.content)


if __name__ == '__main__':
    # Parse args.
    parser = argparse.ArgumentParser(description='Backup jmap mail')
    parser.add_argument('--config', help='Path to config file', nargs=1)
    args = parser.parse_args()

    # Read config.
    with open(args.config[0], 'r') as fh:
        config = yaml.safe_load(fh)

    # Compute window.
    session = get_session(config['token'])
    delay_hours = config.get('delay_hours', 24)

    end_window = datetime.datetime.utcnow().replace(microsecond=0) - datetime.timedelta(
        hours=delay_hours
    )

    # On first run, 'last_end_time' won't exist; download the most recent week.
    start_window = config.get('last_end_time', end_window - datetime.timedelta(weeks=1))

    folder = config['folder']

    # Do backup.
    num_results = 0
    for email in query(session, start_window, end_window):
        # We want our search window to be exclusive of the right endpoint.
        # It should be this way in the server, according to the spec, but
        # Fastmail's query implementation is inclusive of both endpoints.
        if email.date == end_window:
            continue
        download_email(session, email, folder)
        num_results += 1
    print(f'Archived {num_results} emails')

    # Write config
    config['last_end_time'] = end_window
    with open(args.config[0], 'w') as fh:
        yaml.dump(config, fh)

The get_session function is run once at the beginning of the script, and fetches some important data from the server, including the account ID and the URLs to use.
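As a rough illustration of what get_session pulls out of the session response, the sketch below runs the same extraction against a hand-written dictionary instead of a live server. The sample values (account ID, hostnames, blob ID) are invented; only the key names and the `{accountId}`, `{blobId}`, `{name}`, `{type}` placeholders come from JMAP.

```python
# A made-up, trimmed-down session object; a real one has many more fields.
sample_session = {
    'accounts': {'u1234abcd': {'name': 'user@example.com'}},
    'apiUrl': 'https://api.example.com/jmap/api/',
    'downloadUrl': 'https://www.example.com/jmap/download/{accountId}/{blobId}/{name}?type={type}',
}


def parse_session(session_json):
    # Like the script, this assumes exactly one account; the unpacking
    # raises ValueError if there are zero or several.
    [account_id] = list(session_json['accounts'])
    return account_id, session_json['apiUrl'], session_json['downloadUrl']


account_id, api_url, download_template = parse_session(sample_session)

# The download URL is a template; str.format fills in the placeholders.
download_url = download_template.format(
    accountId=account_id,
    blobId='Gabc123',
    name='email',
    type='application/octet-stream',
)
```

Note that str.format does no URL escaping, which is fine for the simple values used here.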

The query function does the bulk of the work, sending a single JSON request multiple times to page through the search results. It is actually a two-part request: first Email/query, which returns a list of ids, and then Email/get, which gets some email metadata for each result. I wrote this as a generator to make the main part of my script simpler. The paging is performed by capturing the ID of the final result of one query, and asking the next query to start at that position plus one (lines 77-78, where anchor and anchorOffset are set). We are done when the query returns no results (line 69, the bare return).
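The anchor-based paging can be tried out without a server. The sketch below swaps the real Email/query call for a hypothetical in-memory stand-in, `fake_query`, which follows the JMAP rule that the first result is at the anchor's index plus the offset; `page_through` then has the same loop shape as the generator in the script.

```python
def fake_query(all_ids, limit, anchor=None, anchor_offset=0):
    """Stand-in for Email/query: return one page of ids from an in-memory list."""
    position = 0
    if anchor is not None:
        # The first result sits at the anchor's index plus the offset,
        # clamped so it never goes below zero.
        position = max(all_ids.index(anchor) + anchor_offset, 0)
    return all_ids[position:position + limit]


def page_through(all_ids, limit):
    """Yield every id, one page at a time, using anchor-based paging."""
    anchor = None
    while True:
        offset = 0 if anchor is None else 1
        ids = fake_query(all_ids, limit, anchor=anchor, anchor_offset=offset)
        if not ids:
            return  # An empty page means we have seen everything.
        yield from ids
        anchor = ids[-1]  # The next page starts one past the last id seen.


mailbox = [f'id{n}' for n in range(7)]
pages = list(page_through(mailbox, limit=3))
```

Anchoring on an ID rather than a numeric position means the next page is located relative to a specific message, not to a count of messages already seen.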

The download_email function uses the blob ID to fetch the entire email and saves it to disk. This doesn’t really need to be its own function, but it will help if I later decide to use multiple threads to do the downloading.
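For what it is worth, a threaded version might look something like the following sketch, which uses the standard library's thread pool. The helper `download_all` is hypothetical and not part of the script; here it is exercised with a stand-in that records ids instead of touching the network.

```python
import concurrent.futures


def download_all(emails, download_one, max_workers=4):
    """Call download_one(email) for each email on a small thread pool.

    Returns the number of completed downloads; an exception raised by any
    download is re-raised here.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(download_one, email) for email in emails]
        count = 0
        for future in concurrent.futures.as_completed(futures):
            future.result()  # Propagate any download error.
            count += 1
    return count


# Stand-in for download_email: record the ids instead of hitting the network.
seen = []
num_downloaded = download_all(['a', 'b', 'c'], seen.append)
```

In the script itself, `download_one` could be something like `lambda email: download_email(session, email, folder)`, with the end-of-window check applied before emails are handed to the pool.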

Finally, the main part of the script reads configuration from a YAML file, including the last end time. It loops through the results of query, calling download_email on each result. It then writes the configuration data back out to the YAML file, including the updated last_end_time.

To run this, you will need to first populate a config file with the destination folder and your API token, like this:

token: ffmu-xxxxx-your-token-here
folder: /path/to/destination/folder

You will also need to install the ‘requests’ and ‘pyyaml’ packages using python -m pip install requests pyyaml. Copy the above script onto your computer and run it using python script.py --config=config_file. Note that everything here uses Python 3, so you may have to replace ‘python’ with ‘python3’ in these commands.


To be demolished






Productive couple of days for my rather neglected Linode instance. Upgraded the distro from Ubuntu 14.04 to Debian 10. Moved DNS from Amazon to Google. Moved various static sites from S3 to Linode. Somehow it all still works.


Reading feeds in a world of newsletters

I understand the popularity of email newsletters, especially for publishers. It’s a simple way to get paid content out, easier for users than a private RSS feed. But that doesn’t mean I want to read newsletters in my email app.

Feedbin, which I am already using for my regular RSS subscriptions, bridges the gap. As part of my Feedbin account, I get a secret email address, and anything sent to that address ends up in my RSS reader. Problem solved!

But it quickly gets annoying to sign up for newsletters (often creating an account) with an email address that is neither memorable nor truly mine. Fastmail, which I am already using for my regular email, makes it easy to find specified emails sent to my regular address, forward them to my Feedbin address, and put the original in the trash.

In fact, Fastmail lets me use “from a member of a given contact group” as the trigger for this automatic rule, which makes the setup for a new newsletter very simple:

  1. Subscribe to the newsletter
  2. Add the sender to my Fastmail address book
  3. Add the newly created contact to my “Feedbin” group

This is very convenient, for newsletters as well as other mail that is more of a notification than an email. Here are some of the emails that I now read as though they were feeds:


I installed homebridge on my Synology so that I could connect my thermostat with HomeKit. Surprisingly easy, minus a few rounds of trial and error on getting the configs right.


In the most recent iOS update, the Files app can render HTML pages. (Before then, only iPadOS could do this.) So now I have an easy way to periodically generate a dashboard or report on my iMac and have it sync via iCloud to my mobile devices.


Maybe Washington Post should separate headlines more clearly; this one had me scratching my head for a while.


I literally bought new AirPods yesterday.


I highly recommend Footpath for planning a running route. Draw your route with your finger and you get mileage and elevation profile.


We have been having so much fun with Zelda at our house. I would usually consider myself a casual gamer, but I can’t put this down.