From Letterboxd to Day One

I use Day One a lot, jotting down thoughts or memories, or anything that I might forget. It is often useful to be able to remember what I was doing or thinking around a given time in the past.

I also use Letterboxd to track the movies that I watch. Sometimes it is helpful to remember whether or not I have seen a movie and whether or not I enjoyed it. It also helps manage my watchlist and gives me recommendations from friends.

I thought it would be fun to combine the two, so that I can use Day One to remember when or if I saw a movie. Letterboxd has RSS feeds and Day One has a macOS command line interface (you can install it from a menu item in the app).

So here is a Python script to parse the RSS feed, download the poster, and create a new Day One entry.

#!/usr/bin/env python3

from tempfile import NamedTemporaryFile
from xml.etree import ElementTree
import re
import subprocess
import urllib.request

# Letterboxd rejects requests with the "python" user agent.
def curl(url):
    req = urllib.request.Request(url, headers={'User-Agent': 'curl/7.64.1'})
    return urllib.request.urlopen(req)


# Returns None if it can't find an image.
def extract_image_url(html):
    m = re.search(r'img src="([^"]+)"', html)
    return m.group(1) if m else None


def create_entry(item):
    # Letterboxd uses a custom XML namespace for some elements.
    ns = {'ns': 'https://letterboxd.com'}

    # Extract the watched date. If there isn't one, I don't want it.
    o = item.find('ns:watchedDate', ns)
    if o is None:
        return
    date = o.text

    # Get the title and year.
    title = item.find('ns:filmTitle', ns).text
    year = int(item.find('ns:filmYear', ns).text)

    # Get my rating if there is one, and translate it into stars.
    o = item.find('ns:memberRating', ns)
    stars = ''
    if o is not None:
        rating = float(o.text)
        stars = ' - ' + '★' * int(rating)
        if rating != int(rating):
            stars += '½'

    # The RSS description has a poster image.
    o = item.find('description')
    image = o if o is None else extract_image_url(o.text)

    # Prepare the Day One command.
    text = f'Watched {title} ({year}){stars}'
    command = [
            '/usr/local/bin/dayone2', '--journal', 'Media',
            '--date', date, '--all-day', 'new']

    with NamedTemporaryFile('wb', suffix='.jpg', delete_on_close=False) as fh:
        if image:
            fh.write(curl(image).read())
            fh.close()
            command.extend(['--attachments', fh.name])
        subprocess.run(command, input=text, check=True, text=True)


with open('/path/to/state/file', 'r+') as fh:
    already_downloaded = set(fh.read().split('\n'))
    root = ElementTree.parse(curl('https://letterboxd.com/username/rss/'))
    for item in root.findall('./channel/item'):
        guid = item.find('guid').text
        if guid in already_downloaded:
            continue
        create_entry(item)
        fh.write(f'{guid}\n')
        fh.flush()

I created a separate journal in Day One called “Media” for these, so that they can be separate from my normal entries.

The GUID of each entry is written into a state file, so that I don’t download anything more than once. I used 'r+' so that I can both read and write. After reading the entire file, the cursor is at the end, which is where the writes happen. It also requires that the file already exist, which I appreciate because it means if I give it the wrong filename, it will crash instead of creating the entire set of entries again.
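To see the 'r+' behavior in isolation, here is a minimal sketch. The remember() helper, the state.txt name, and the GUID strings are all made up for illustration; only the open mode matches the script above.

```python
import os
import tempfile


def remember(path, guid):
    """Record guid in the state file at path; return True if it was new."""
    # 'r+' raises FileNotFoundError for a missing file instead of creating it.
    with open(path, 'r+') as fh:
        seen = set(fh.read().split('\n'))
        if guid in seen:
            return False
        # The read above left the cursor at the end, so this write appends.
        fh.write(f'{guid}\n')
        return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'state.txt')  # hypothetical state file
    open(path, 'w').close()                # the file has to exist first
    first = remember(path, 'guid-1')       # new, gets written
    second = remember(path, 'guid-1')      # already there, skipped
    other = remember(path, 'guid-2')       # new, appended after guid-1
    with open(path) as fh:
        contents = fh.read()
    try:
        remember(os.path.join(tmp, 'typo.txt'), 'guid-1')
        crashed = False
    except FileNotFoundError:
        crashed = True
```

The last call stands in for the wrong-filename case: it fails loudly rather than starting over with an empty state.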

NamedTemporaryFile creates and opens a temp file where the poster image can be stored. By setting delete_on_close=False, I can close the file and it stays around until the end of the context block. If you are reading closely, you may notice that it creates and then deletes a temp file even if there is no image. I’m okay with that.
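The delete_on_close parameter needs Python 3.12 or newer. On older versions, one alternative (not what the script above does) is to put the poster in a TemporaryDirectory, which is removed along with its contents at the end of the block. A sketch, with placeholder bytes standing in for the downloaded image and poster.jpg as an arbitrary name:

```python
import os
from tempfile import TemporaryDirectory

fake_poster = b'not really a jpeg'  # placeholder for curl(image).read()

with TemporaryDirectory() as tmp:
    poster_path = os.path.join(tmp, 'poster.jpg')
    with open(poster_path, 'wb') as fh:
        fh.write(fake_poster)
    # The file is closed but still on disk, so another program
    # (such as the Day One CLI) could read it at this point.
    existed_inside = os.path.exists(poster_path)
    with open(poster_path, 'rb') as fh:
        round_trip = fh.read()

# Leaving the block removed the directory and the poster with it.
exists_after = os.path.exists(poster_path)
```

The tradeoff is one extra directory per run, in exchange for not depending on a recent Python.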

ElementTree does a weird thing where the object returned by item.find() evaluates to False when it has no child elements, even though it exists. This is why there are a bunch of if o is not None checks instead of the simpler if o.
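A small sketch of why, using a made-up item. An element's truth value has historically followed its number of child elements, and a leaf like guid has none. Recent Python versions also emit a deprecation warning when an element is truth-tested, so the sketch checks the length rather than calling bool() on it.

```python
from xml.etree import ElementTree

# A made-up item, shaped like one from an RSS feed.
item = ElementTree.fromstring('<item><guid>abc</guid></item>')

guid = item.find('guid')      # exists, but has no child elements
missing = item.find('nope')   # does not exist, so find() returns None

guid_found = guid is not None        # the reliable existence check
guid_children = len(guid)            # 0, which is why a bare "if guid" misfires
missing_found = missing is not None
```

Comparing against None gives the right answer for both the leaf element and the missing one.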

Lastly, the RSS feed only has about 25 entries. If you want older data, you’ll have to get that some other way. Letterboxd doesn’t have a publicly available API, but they will give you a CSV file, and each row has a link to the movie page. With a little work (and don’t forget to set the user agent), you can scrape a bunch more posters.
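A rough sketch of the first step, reading the export and preparing the requests. The column names (Name, Year, Letterboxd URI), the sample row, and the build_requests() helper are assumptions for illustration, so check the header against your own file. Nothing is fetched here.

```python
import csv
import io
import urllib.request

# Stand-in for the exported CSV; the header names are assumed, not verified.
sample = io.StringIO(
    'Date,Name,Year,Letterboxd URI\n'
    '2020-01-01,Example Film,1999,https://example.com/film/example\n'
)


def build_requests(fh, link_column='Letterboxd URI'):
    """Return one Request per row, each with a non-Python user agent."""
    requests = []
    for row in csv.DictReader(fh):
        req = urllib.request.Request(
            row[link_column], headers={'User-Agent': 'curl/7.64.1'})
        requests.append(req)
    return requests


reqs = build_requests(sample)
```

From there, urllib.request.urlopen(req) would fetch each page, and the poster URL would still have to be pulled out of the HTML.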

As a bonus, the image view in Day One gives me a nice-looking table of my watch history. Here is what I was doing about nine months ago.