Nathan Grigg

From Letterboxd to Day One

I use Day One a lot, jotting down thoughts or memories, or anything that I might forget. It is often useful to be able to remember what I was doing or thinking around a given time in the past.

I also use Letterboxd to track the movies that I watch. Sometimes it is helpful to remember whether or not I have seen a movie and whether or not I enjoyed it. It also helps manage my watchlist and gives me recommendations from friends.

I thought it would be fun to combine the two, so that I can use Day One to remember when or if I saw a movie. Letterboxd has RSS feeds and Day One has a macOS command line interface (you can install it from a menu item in the app).

So here is a Python script to parse the RSS feed, download the poster, and create a new Day One entry.

#!/usr/bin/env python3

from tempfile import NamedTemporaryFile
from xml.etree import ElementTree
import re
import subprocess
import urllib.request

# Letterboxd rejects requests with the "python" user agent.
def curl(url):
    req = urllib.request.Request(url, headers={'User-Agent': 'curl/7.64.1'})
    return urllib.request.urlopen(req)


# Returns None if it can't find an image.
def extract_image_url(html):
    m = re.search(r'img src="([^"]+)"', html)
    return m if m is None else m.group(1)


def create_entry(item):
    # Letterboxd uses a custom XML namespace for some elements.
    ns = {'ns': 'https://letterboxd.com'}

    # Extract the watched date. If there isn't one, I don't want it.
    o = item.find('ns:watchedDate', ns)
    if o is None: return
    date = o.text

    # Get the title and year.
    title = item.find('ns:filmTitle', ns).text
    year = int(item.find('ns:filmYear', ns).text)

    # Get my rating if there is one, and translate it into stars.
    o = item.find('ns:memberRating', ns)
    stars = ''
    if o is not None:
        rating = float(o.text)
        stars = ' - ' + '★' * int(rating)
        if rating != int(rating):
            stars += '½'

    # The RSS description has a poster image.
    o = item.find('description')
    image = o if o is None else extract_image_url(o.text)

    # Prepare the Day One command.
    text = f'Watched {title} ({year}){stars}'
    command = [
            '/usr/local/bin/dayone2', '--journal', 'Media',
            '--date', date, '--all-day', 'new']

    with NamedTemporaryFile('wb', suffix='.jpg', delete_on_close=False) as fh:
        if image:
            fh.write(curl(image).read())
            fh.close()
            command.extend(['--attachments', fh.name])
        subprocess.run(command, input=text, check=True, text=True)


with open('/path/to/state/file', 'r+') as fh:
    already_downloaded = set(fh.read().split('\n'))
    root = ElementTree.parse(curl('https://letterboxd.com/username/rss/'))
    for item in root.findall('./channel/item'):
        guid = item.find('guid').text
        if guid in already_downloaded:
            continue
        create_entry(item)
        fh.write(f'{guid}\n')
        fh.flush()

I created a separate journal in Day One called “Media” for these, so that they can be separate from my normal entries.

The GUID of each entry is written into a state file, so that I don’t download anything more than once. I used 'r+' so that I can both read and write. After reading the entire file, the cursor is at the end, which is where the writes happen. It also requires that the file already exist, which I appreciate because it means if I give it the wrong filename, it will crash instead of creating the entire set of entries again.
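To see the 'r+' behavior in isolation, here is a small sketch using a throwaway state file. The file name and the GUID strings are made up for illustration.

```python
import os
import tempfile

# Hypothetical state file with two GUIDs already recorded.
path = os.path.join(tempfile.mkdtemp(), 'state.txt')
with open(path, 'w') as fh:
    fh.write('guid-1\nguid-2\n')

with open(path, 'r+') as fh:
    seen = set(fh.read().split('\n'))  # reading everything leaves the cursor at the end
    fh.write('guid-3\n')               # so this write lands after the existing lines

with open(path) as fh:
    contents = fh.read()

# 'r+' refuses to create a file that does not exist.
try:
    open(path + '.missing', 'r+')
    created = True
except FileNotFoundError:
    created = False
```

The existing lines survive, the new GUID is added after them, and the mistyped path raises an error instead of silently starting from an empty state.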

NamedTemporaryFile creates and opens a temp file where the poster image can be stored. By setting delete_on_close=False, I can close the file and it stays around until the end of the context block. If you are reading closely, you may notice that it creates and then deletes a temp file even if there is no image. I’m okay with that.

ElementTree does a weird thing where the object returned by item.find() evaluates to False if it has no child elements, even though it exists. This is why there are a bunch of if o is not None checks instead of the simpler if o.
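Here is a minimal sketch of that quirk, using a made-up item rather than real Letterboxd data. It avoids testing the element's truth value directly, since newer Python versions warn about doing that.

```python
from xml.etree import ElementTree

# Made-up RSS-like item for illustration.
item = ElementTree.fromstring(
    '<item><title>Example Film</title><description>text</description></item>')

title = item.find('title')    # present, but it has no child elements
rating = item.find('rating')  # absent, so find() returns None

# len() counts child elements, and an element's truth value has historically
# followed len(), so a found-but-childless element looks "empty".
# Comparing against None avoids the ambiguity.
title_found = title is not None
title_children = len(title)
rating_found = rating is not None
```

The title element is found even though it has zero children, which is exactly the case where a bare if o would take the wrong branch.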

Lastly, the RSS feed only has about 25 entries. If you want older data, you’ll have to get that some other way. Letterboxd doesn’t have a publicly available API, but they will give you a CSV file, and each row has a link to the movie page. With a little work (and don’t forget to set the user agent), you can scrape a bunch more posters.

As a bonus, the image view in Day One gives me a nice looking table of my watch history. Here is what I was doing about nine months ago.


It’s sad to see so many people going along with or cheering on Trump’s inhumane and cruel policies. I like to imagine that someday they’ll realize what he is and feel shame for their support, but who am I kidding? Even those who were on the side of racism during the civil rights movement never felt a bit of remorse.

Also going through my head this morning:

I hope you’re proud how you would grovel in submission
To feed your own ambition
So though I can’t imagine how
I hope you’re happy now


I’m trying a couple of new things on this site.

One is to be a little less guarded about myself. I’m sometimes afraid to overshare and so end up overly reserved. I’ve started with a more comprehensive and personal about me page.

I’ve been posting some photos using the Glass app over the last couple of years, and I’ve now copied them here as well, along with an RSS feed.


More Charging Plots

Since I last analyzed my electricity usage two years ago, several things have changed:

  1. I had a 240V charger installed, which speeds up charge time and draws more power.
  2. My electric company changed the XML data slightly, with an entry every 15 minutes instead of 60, and with entries for power returned to the grid. For me, the latter are always zero because my house does not generate electricity.
  3. We replaced our family-hauling gas minivan with an electric vehicle, so now there are two cars to charge. This happened at the end of the year, so you don’t really see it in the data yet.

To extract the data in Python and convert it to a Pandas series using the built-in xml.etree module, I reused most of the code from last time:

from xml.etree import ElementTree
import datetime
import pandas as pd

ns = {'atom': 'http://www.w3.org/2005/Atom', 'espi': 'http://naesb.org/espi'}
root = ElementTree.parse('/path/to/file.xml').getroot()
prefix = "./atom:entry/atom:content/espi:IntervalBlock[@rel='Delivered']/espi:IntervalReading"
times = [datetime.datetime.fromtimestamp(int(x.text))
         for x in root.findall(prefix + "/espi:timePeriod/espi:start", ns)]
values = [float(x.text)
          for x in root.findall(prefix + "/espi:value", ns)]
ts = pd.Series(values, index=times).sort_index()

The main difference is the addition of [@rel='Delivered'] to filter to only power delivered to me and not the other way around. I also added the sort_index command, because for some reason the dates are not entirely in order in the XML.
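To illustrate the attribute filter and the sorting on their own, here is a sketch against a tiny made-up document shaped like the paths above. The sample values and the "Received" label on the second block are assumptions for illustration, not taken from my utility's file, and plain sorted() stands in for sort_index.

```python
from xml.etree import ElementTree

ns = {'atom': 'http://www.w3.org/2005/Atom', 'espi': 'http://naesb.org/espi'}

# Made-up sample; the readings are deliberately out of order.
SAMPLE = """\
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:espi="http://naesb.org/espi">
  <entry><content>
    <espi:IntervalBlock rel="Delivered">
      <espi:IntervalReading>
        <espi:timePeriod><espi:start>1700000900</espi:start></espi:timePeriod>
        <espi:value>450</espi:value>
      </espi:IntervalReading>
      <espi:IntervalReading>
        <espi:timePeriod><espi:start>1700000000</espi:start></espi:timePeriod>
        <espi:value>1900</espi:value>
      </espi:IntervalReading>
    </espi:IntervalBlock>
    <espi:IntervalBlock rel="Received">
      <espi:IntervalReading>
        <espi:timePeriod><espi:start>1700000000</espi:start></espi:timePeriod>
        <espi:value>0</espi:value>
      </espi:IntervalReading>
    </espi:IntervalBlock>
  </content></entry>
</feed>
"""

root = ElementTree.fromstring(SAMPLE)
prefix = ("./atom:entry/atom:content/"
          "espi:IntervalBlock[@rel='Delivered']/espi:IntervalReading")

# Keep only readings from the Delivered block, then sort by start time.
readings = sorted(
    (int(r.find('espi:timePeriod/espi:start', ns).text),
     float(r.find('espi:value', ns).text))
    for r in root.findall(prefix, ns)
)
```

Only the two readings from the Delivered block come back, ordered by timestamp.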

At this point, I wanted to determine when I was charging a car. Charge loads are probably pretty easy to isolate, because they last for a long time and are relatively constant. If I were trying to be robust, I would probably figure out the expected power draw for a given time of year and time of day, and then find periods where the draw is significantly higher than that. Using something simple like the median of the surrounding 14 days at a given time of day would probably work, since I charge on fewer than half of the days.
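I did not end up building this, but the median idea could be sketched roughly as follows in plain Python. The function name, the window of 7 days on either side, and the toy readings are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from statistics import median

def baseline(readings, when, days=7):
    """Median of the readings at the same time of day within +/- `days` days."""
    nearby = []
    for offset in range(-days, days + 1):
        t = when + timedelta(days=offset)
        if t in readings:
            nearby.append(readings[t])
    return median(nearby)

# Toy data: one reading per day at 10 p.m., in Wh per 15 minutes.
# Most nights use 500 Wh; the four charging nights use 1900 Wh.
start = datetime(2024, 1, 1, 22, 0)
readings = {start + timedelta(days=d): 500.0 for d in range(15)}
for d in (2, 5, 9, 12):
    readings[start + timedelta(days=d)] = 1900.0

day = start + timedelta(days=5)
expected = baseline(readings, day)
excess = readings[day] - expected
```

Because charging happens on fewer than half of the nights in the window, the median stays at the ordinary household level, and the excess on a charging night stands out clearly.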

But in my case, the 7.5 kW of power that our electric Mini draws is more than our entire house uses over any 30-minute period. There are five 15-minute periods that reach that level, but these are relatively easy to filter out.

I wrote this code to compute the charge state. I wanted to separate it into cycles of “off”, “start”, “on”, and “stop”. My thinking was that these “start” and “stop” periods are probably times where I was charging the car for some but not all of the 15-minute period. I used a threshold of 1800 Wh, which is 7.2 kW over a 15-minute period.
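The conversion between the threshold in Wh and the average power in kW is just energy divided by time. A quick check, where the 5-minute partial interval is a made-up example:

```python
INTERVAL_MINUTES = 15
THRESHOLD_WH = 1800

# Energy (Wh) divided by duration (h) gives average power (W).
average_kw = THRESHOLD_WH / (INTERVAL_MINUTES / 60) / 1000

# A car charging at that power for only 5 minutes of an interval
# contributes far less energy, so that interval falls below the threshold.
partial_wh = average_kw * 1000 * 5 / 60
```

This is why a "start" or "stop" interval, where the car charged for only part of the 15 minutes, tends to land under the threshold.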

THRESHOLD = 1800
charge_state = pd.Series('off', index=ts.index)
charging = False
for i in range(len(ts)):
    if charging:
        if ts.iloc[i] >= THRESHOLD:
            charge_state.iloc[i] = 'on'
        else:
            charge_state.iloc[i] = 'stop'
            charging = False
    # Look at the two entries after this one to see if
    # they are both above the threshold.
    elif all(ts.iloc[i+1:i+3] >= THRESHOLD) and i+1 != len(ts):
        charge_state.iloc[i] = 'start'
        charging = True

Line 2 creates a new series with the same index as our time series. We then look at the entries one by one and determine when to transition to “start” (Line 13, if we are not already charging and we see two upcoming entries above the threshold), when to stay “on” (Line 6, as long as we stay above the threshold), and when to transition to “stop” (Line 8, as soon as we first go below the threshold). Note that Pandas uses iloc to look up an entry by integer offset, rather than by time.
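To watch the transitions on a toy series, here is a plain-list sketch of the same loop. The function name and the readings are made up for illustration, and it works on a list instead of a Pandas series.

```python
def label_charge_states(readings, threshold):
    """Label each reading as 'off', 'start', 'on', or 'stop'."""
    states = ['off'] * len(readings)
    charging = False
    for i, value in enumerate(readings):
        if charging:
            if value >= threshold:
                states[i] = 'on'
            else:
                states[i] = 'stop'
                charging = False
        # Same look-ahead as above: the readings that follow this one
        # must clear the threshold, and this must not be the last reading.
        elif i + 1 != len(readings) and all(
                v >= threshold for v in readings[i + 1:i + 3]):
            states[i] = 'start'
            charging = True
    return states

# Wh per 15-minute interval: idle, partial, full, full, partial, idle.
toy = [400, 900, 1900, 1850, 1000, 350]
states = label_charge_states(toy, 1800)
```

The 900 Wh and 1000 Wh readings, where the car was presumably charging for only part of the interval, come out as "start" and "stop", with the two full intervals between them marked "on".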

With this charge_state series, it is easy to play around with the data. For example, to count how many charge sessions:

sum(charge_state == 'start')

To look at the entries where usage is high but you aren’t charging, filter ts to the points where it is above the threshold but charge_state is “off”:

ts[(ts > THRESHOLD) & (charge_state == 'off')]

Finally, a good visualization is always helpful to understand the data. I don’t usually use ChatGPT much while programming, because at work I am usually dealing with complicated code that I don’t want to mess up. But it is impossible to remember how to do anything in Matplotlib, and I confess that I asked ChatGPT for a lot of help, and it did a pretty good job most of the time.

Here my goal is to draw dots at start and stop time for each charge, and connect them with a line. I really just have three arrays here:

  1. start_time is the full date and time when I started charging. This is used as the x axis.
  2. start_hour is the time of day when I started charging.
  3. charge_hours is the number of hours that I charged.

Note that since charging often happens overnight, I’m using the end time as start_hour + charge_hours, which might be greater than 24, but I think that makes a better visualization than wrapping around to the bottom.

import matplotlib.pyplot as plt
import numpy as np

start_time = ts[charge_state == 'start'].index
start_hour = start_time.hour + start_time.minute / 60
charge_hours = (
    ts[charge_state == 'stop'].index -
    ts[charge_state == 'start'].index).total_seconds() / 3600

fig, ax = plt.subplots(figsize=(15, 8))
# The bars
bars = ax.bar(start_time, charge_hours, bottom=start_hour, color='b')
# Draw the number on top of the bar
ax.bar_label(bars, labels=[f'{h:.0f}' for h in charge_hours], padding=4)
# Bottom dot
ax.scatter(start_time, start_hour, marker='.', color='b')
# Top dot
ax.scatter(start_time, start_hour+charge_hours, marker='.', color='b')
# Add a grid and make it denser than it wants to be.
plt.grid(True)
plt.yticks(np.arange(0, 29, 2));

And here is the final result. The Mini only has a 32 kWh battery, so it can always fill in four hours or so. The longer lines from December are for the new car, which has triple the battery size, but also can max out the 50-amp circuit that my charger is on by pulling 9.5 kW. (If you do the math, that is only 40 amps, because code requires that a continuous load use only 80% of the rated amperage.)
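For anyone who wants to do the math, here is the arithmetic as a quick sketch. It assumes a nominal 240 V and ignores charging losses, so the numbers are approximate.

```python
VOLTS = 240          # nominal voltage of the charger circuit (assumed exact)
CIRCUIT_AMPS = 50    # rating of the circuit the charger is on

# A continuous load may use only 80% of the rated amperage.
continuous_limit_amps = 0.8 * CIRCUIT_AMPS

# Current drawn by the new car at 9.5 kW.
new_car_amps = 9.5 * 1000 / VOLTS

# Hours to fill the Mini's 32 kWh battery from empty at 7.5 kW.
mini_fill_hours = 32 / 7.5
```

The new car's draw comes out just under the 40-amp continuous limit, and the Mini needs a bit more than four hours for a full charge.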

The Mini used to be scheduled to start at 11 p.m., because it draws 30 amps on our 100 amp service, and I was afraid that if I charged it while everyone was still awake it might trip the breaker. In November, I decided to stop babying our house, and scheduled the charge to start at 9:15 instead. Cheap electricity starts at 9:00.

Also, a quirk of the Mini app is that if you plug it in on a Saturday night (the way my schedule is set), it won’t start charging until Sunday morning at 3:00. That is a long story for another time.


The back side of a clock from inside Musée d’Orsay. In the distance, the Louvre.


Welcoming courtyard


Ancient Roman Arena, Arles, France


The Camargues, France


Cloître Saint-Trophime, Arles, France


View of the river from Pont du Gard, France


Pont Neuf and Square du Vert-Galant, Paris


Boats along the Seine


New growth


Scooter in the rain


The DMV

I had the “pleasure” of visiting the DMV this week to apply for a driving permit for my oldest kid. By a miraculous sequence of events, we got the permit in a single visit, but it was a close call.

Because the permit will eventually turn into a driver’s license and therefore a REAL ID, the application required two different documents to provide proof of address. There is a long list of valid documents, such as utility bills and property tax bills, and furthermore the DMV recognizes that not everyone living at an address receives such bills:

What if I do not have one of the above residency documents?

You can use a relative’s (parent, child, spouse/domestic partner) residency document if you live at the same address and provide a document (such as a birth or marriage certifcate [sic]) that shows that relationship.

It all seems reasonable enough, but the rules are implemented like a poorly-written computer program.

The question is, can I use my driver’s license (together with a birth certificate) as proof of my teen’s residency? In theory, this should count as definitive proof of address, since they required me to show two address documents in order to receive the license in the first place. At the very least, it should count as one of the two factors, at least as valid as a SoCalGas bill that anyone with a basic PDF editor could easily doctor.

In practice, as you have probably guessed, it counts as nothing. Why? Because the main list of documents is written assuming that they are in the name of the applicant, and this “relative’s residency document” special case is tacked on at the end. And of course, it would be silly to say that you could use your current REAL ID as proof of address to get a REAL ID, so you cannot use a relative’s REAL ID as proof of address to get your REAL ID.

Being a paranoid person, I brought two documents in addition to my driver’s license, but even that was almost not enough. See, my address can be written as either 221B Baker St or 221 Baker St #B. The two bills that I brought didn’t match, which was apparently a problem, and one that my driver’s license wasn’t going to get me out of. The only thing that saved me (this is the miraculous part) was that one of the two bills had the address written both ways.

(For completeness, two other miracles. One, that my kid passed the ridiculous written exam on the first try. A test that did have a question about NEVs without explaining the acronym, and is known for questions like “In which of these locations is it illegal to park? (a) blocking an unmarked crosswalk (b) in a bicycle lane or (c) within three feet of a driveway.” The answer is (a). Nobody knows why. The second miracle is that my teen even got to take the test in the first place, because the DMV shut down the testing center at 4:30 on the dot, sending away everyone who was in line at the time. Credit for this miracle goes to the employee who processed our application, because she shut down her station and went over to the photo station to clear out the queue, getting us through and into the testing center with less than a minute to spare. At the time, we had no idea that we were up against a clock, but I’m pretty sure that she knew and intervened.)

Anyway, now it is time for 50 hours of supervised (by me) driving practice. Wish us luck!