Nathan Grigg

Margin-free Latex for on-screen viewing

I’m spending some time typing up some recent research. Of course, it’s all very mathematical, so I am using Latex. Latex is a markup language, which means that you write the document in a plain text editor, using codes to indicate font changes or things like that. For example, to typeset some text in italics, instead of pushing a button in some program, you type \emph{word} to mean that word should be emphasized (italicized, it turns out). When you are ready to see your document, you run a program which reads in the text file and outputs a PDF file. It is very useful because it creates well-typeset documents and also has features to make typing math really easy.

The rest of this post is meant for people who already know how to use Latex.

One thing that has often bugged me about my Latex workflow is margins. Not margins in the actual printed document, which you don’t really want to change because that would make your lines too long to easily read. No, I’m talking about the margins on screen: while I am working on the document, I keep a preview window open so I can see what the thing will look like when it’s done, and 40% of the preview window is wasted on margins.

So here is how you remove margins without changing the line length: by changing the paper size. The easiest way is to use the geometry package, which comes standard with any modern Tex distribution. Just place the following in your preamble:

\usepackage[paperwidth=\textwidth + 50pt,
            paperheight=\textheight + 50pt,
            margin=25pt]{geometry}

Boom. Text the same width and height as before, but with tiny margins and a smaller page.


ASCII

I’ve always loved ASCII. As a kid, I spent a considerable amount of time studying the code chart that was printed in our Epson dot matrix printer’s user manual.

ASCII code chart

The one thing that I always sort of wondered, but never really asked myself, was “Why do they leave space between the uppercase letters and lowercase letters?” (I’m talking about [, \, ], ^, _, and `.) I thought it was a little annoying, actually, but I never questioned it, because that was just the way it was.

I can’t believe that it is only now that I find out that they wanted the lowercase letters and uppercase letters to have only a one bit difference. For example, the code for N is 4E, and the code for n is 6E. In binary, then, N is 1001110 and n is 1101110. And if you want to change something to all caps? Just change that second 1 to a 0, and you are good.
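
To see the one-bit difference concretely, here is a small Python sketch. The names CASE_BIT and to_upper are mine, chosen just for illustration. It prints the two bit patterns and capitalizes a letter by clearing the single bit that differs.

```python
# In ASCII, a letter's uppercase and lowercase codes differ only in
# the bit worth 0x20 (the "second 1" in 1101110).
CASE_BIT = 0x20

def to_upper(ch):
    """Uppercase one ASCII letter by clearing its case bit."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch  # anything that is not a lowercase letter is left alone

print(format(ord("N"), "07b"))  # 1001110
print(format(ord("n"), "07b"))  # 1101110
print(to_upper("n"))            # N
```

The range check matters: clearing the same bit on a character like ` would turn it into @, so only a through z should be touched.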


Printing git information in Latex

What I wanted was a way to print git information in a Latex file in a way that (1) doesn’t modify the actual source and (2) degrades gracefully, that is, my document will still compile for someone else, even if they do not do things my way.

Setting up the Latex source

I start by putting the macro \RevisionInfo where I want it in my Latex source file. I also put a \providecommand command in the preamble to define the default value and ensure that the document compiles even when the git information is not given.

For example:

\documentclass{amsart}
\providecommand{\RevisionInfo}{}
...
\begin{document}
\maketitle
\RevisionInfo
...
\end{document}

Inserting the git information

With a little effort, you can coax git to output information about the most recent commit in the form you want. For example:

git log -1 --date=short \
    --format='format:\newcommand{\RevisionInfo}{Revision %h on %ad}'

Then you get Latex to put this at the beginning of the source file as you are compiling:

latex $(git log -1 .....) \input{document.tex}

I only do this if I’m planning on printing or emailing the PDF. The nice thing is that if I’m working on the project with someone else, and they aren’t using git, it doesn’t matter. Everything still works just fine for them, except copies they compile don’t get commit information on them.

Since I use BBEdit to write most of my Latex, it is easy to make a script that will “Typeset inserting git info.”
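
Such a script could take many forms. As a rough sketch in Python (not the script I actually use; the function names and the default file name document.tex are placeholders), one might ask git for the macro definition and hand it to latex ahead of the \input:

```python
import subprocess

# Same format string as the git command above.
GIT_FORMAT = r"format:\newcommand{\RevisionInfo}{Revision %h on %ad}"

def revision_macro():
    """Ask git for the macro definition describing the latest commit."""
    output = subprocess.check_output(
        ["git", "log", "-1", "--date=short", "--format=" + GIT_FORMAT],
        universal_newlines=True,
    )
    return output.strip()

def latex_command(macro, source="document.tex"):
    """Build the latex argument list: the macro first, then the source."""
    return ["latex", macro + r"\input{" + source + "}"]

# Typical use, from inside the repository:
#     subprocess.check_call(latex_command(revision_macro()))
```

Passing the arguments as a list sidesteps shell quoting, which is the fiddly part when the text contains backslashes and braces.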

In the time between when I figured this stuff out and I wrote this post, a package called gitinfo by Brent Longborough was posted on CTAN. It is almost exactly what I wanted to do, but in package form. It will also compile fine even when not inside the repository and it has the added benefit of being much more automatic (once you set it up). The downside is that whoever compiles it needs a copy of the gitinfo package.


What I want from a weather app

I am always a little disappointed when I look up the current temperature on the internet or a weather app. One number can only tell you so much about what’s going on outside. We try to make up for it by reporting the high and low temperature for the day, but there’s a lot more to a function than one data point plus two extreme values. Luckily the University of Washington records the temperature on the roof of the ATG every minute and allows you to download it in csv format. From there, a little messing with gnuplot makes it readable, and I really know what the temperature is doing. Here’s an example:

Current Weather

The Python script

The Python script downloads the last 12 hours’ worth of temperature readings from the University of Washington weather station. The readings are available as a csv file. The script then extracts the useful information from the csv file and converts the times into a format that gnuplot understands. Also, it deals with time zone issues. It then feeds the data through gnuplot to draw the graph and outputs the graph to the user. It also caches the graph to prevent unnecessary strain on my server or the weather station’s.
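
The conversion step might look something like the sketch below. Everything about the input is an assumption made for illustration: I am pretending each csv row is a UTC timestamp followed by a temperature, and that a fixed offset is good enough for the time zone. The station’s real file layout may well differ, and the function name is made up.

```python
import csv
import io
from datetime import datetime, timedelta

# The time format given to gnuplot via "set timefmt".
GNUPLOT_TIMEFMT = "%Y-%m-%d-%H-%M"

def to_gnuplot_rows(csv_text, utc_offset_hours=-8):
    """Turn 'UTC timestamp,temperature' rows into 'time temp' lines.

    The two-column layout and the fixed offset are assumptions.
    """
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue  # skip blank lines
        stamp, temp = record[0], record[1]
        utc = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        local = utc + timedelta(hours=utc_offset_hours)
        rows.append("%s %s" % (local.strftime(GNUPLOT_TIMEFMT), float(temp)))
    return rows
```

A fixed offset ignores daylight saving time, so a real script would want a proper time zone library instead.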

The gnuplot instruction

The main plot command is

plot "-" using 1:2 smooth bezier lt 3 lw 2 notitle

The "-" means the data file will be given on stdin, but you could also use a filename here. The using 1:2 tells it to use columns 1 and 2 for the x and y data, respectively. Then smooth bezier tells it to smooth the data instead of just connecting all the dots. Color is controlled by lt 3 and line weight by lw 2. Counterintuitively, notitle eliminates the key.
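
When the data comes in on stdin this way, gnuplot reads rows until it sees a line containing only the letter e. Here is a minimal sketch of putting the pieces together in Python; the function name is hypothetical and the subprocess call in the comment is just one way to run it.

```python
def gnuplot_input(script, rows):
    """Join gnuplot commands and inline data rows into one stdin payload."""
    # Inline data for plot "-" ends at a line containing only "e".
    lines = [script.rstrip("\n")] + list(rows) + ["e"]
    return "\n".join(lines) + "\n"

# Typical use (requires gnuplot to be installed):
#     import subprocess
#     payload = gnuplot_input(script_text, data_rows)
#     svg = subprocess.run(["gnuplot"], input=payload, stdout=subprocess.PIPE,
#                          universal_newlines=True).stdout
```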

Here is the entire gnuplot code:

reset

# configure svg output
set term svg size 600 480 dynamic fname 'Helvetica'

# tell it that the x-axis represents time
set xdata time

# set the format of the data file
set timefmt "%Y-%m-%d-%H-%M"

# set the format of the axis labels
set format x "%l%p"

# display y-axis labels on the right side with gridlines
set y2tics border
set noytics
set grid y2tics

# axis labels and plot title
set xlabel "Time"
set ylabel "degrees Fahrenheit"
set title "Last 12 hours temperature at UW weather station"

# draw the plot
plot "-" using 1:2 smooth bezier lt 3 lw 2 notitle

Private key authentication for ssh using ssh-keygen

Private key authentication is a way to log into another computer via SSH, and is an alternative to the username/password authentication. It can be more secure, because no one will ever guess your private key, and your private key is never sent over the network, so it cannot be intercepted. It can also be more convenient, because if you don’t assign a password to the private key, you don’t have to type a password to use it.

I create a separate key pair for each computer I use, so that I can always adjust which computers are allowed to log into which computer. I always forget how the ssh-keygen command works, though, and that is the main reason I’m writing this down.

Creating a key pair

The command you want to use is

ssh-keygen -t rsa -b 2048 -C comment

The first two options may be unnecessary because on my computer they are the default values. On at least one of the servers I use, however, they are required. The comment is also unnecessary, but helpful.

Using the keys

If you want to use this key to connect to another computer, that computer needs to have a copy of your public key, usually stored in the file ~/.ssh/authorized_keys.

Once I create a keypair for each computer I use, I copy all the public keys into a subdirectory of ~/.ssh that I call authorized_keys.d. It helps to give each key a more useful name like iMac.pub or office.pub. Then I run

cat authorized_keys.d/* > authorized_keys

Repeat for each host that you want to connect to. The good thing is, if I want to authorize (or unauthorize) another computer, I just add (or remove) its public key in the directory and rerun this command.
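
If you would rather not depend on the shell, the same rebuild can be sketched in Python. This is only an illustration (the function name is mine), but it has one small advantage over cat: a key file that is missing its final newline will not get glued to the next key.

```python
import glob
import os

def rebuild_authorized_keys(ssh_dir):
    """Concatenate every file in authorized_keys.d into authorized_keys."""
    pattern = os.path.join(ssh_dir, "authorized_keys.d", "*")
    keys = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            text = handle.read().strip()
        if text:
            keys.append(text)
    target = os.path.join(ssh_dir, "authorized_keys")
    with open(target, "w") as out:
        # one key per line, each ending in a newline
        out.write("".join(key + "\n" for key in keys))
    return len(keys)

# Typical use:
#     rebuild_authorized_keys(os.path.expanduser("~/.ssh"))
```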


What you should know about keeping your passwords safe

If you want to safely guard your passwords, you should first understand how your password could be “stolen” or discovered. Here are some scenarios.

You tell someone.

Oops. Either you actually tell them (be careful who you trust) or you enter it on a phishing site or respond to an email (don’t do it!).

What you can do: protect your passwords by never telling anyone, for any reason. Minimize the potential damage by using different passwords for different sites.

Someone guesses your password.

Maybe they try your phone number or your birthday or something else that they know about you.

What you can do: try to choose passwords that aren’t about you. Choose random words from the dictionary. If your brother could guess in 5 tries what your password is (or all but one letter of your password), then you should use a different password. Not just because your brother might one day try to steal your identity, but because if he knows something about you, then your Facebook friends can probably guess it too.

Someone steals your password over wireless internet.

There are two main kinds of encryption happening when you use wireless internet. First: if you are visiting a “secure” site, the kind where the URL starts with https, then the stuff you send is encrypted from the moment it leaves your computer until it is received by Google’s or your bank’s computer. Big companies (Facebook, Google, Microsoft, Amazon, your bank) will at the very least make sure your password is sent this way. Often they will encrypt everything you send or receive. Smaller websites may not.

The second encryption happens when you are using secured wireless, the kind where you have to enter a password. In this case everything you do is encrypted from your computer to the wireless access point.

If you are using unsecured wireless and entering your password into an unsecured site, then anybody on the same wireless network as you could be running a program that intercepts your password and steals it.

What you can do: Don’t mix passwords. If you can’t use a different password for everything, you should at least not mix important passwords (which are likely to be protected by the first kind of encryption) with less important passwords. If you use the same password to log into your bank or email as you do to log into some Harry Potter fan site, you are asking for trouble.

Someone hacks into one of the websites you use and discovers your password.

This is much less likely to be a problem for reputable websites for many reasons.

What you can do: Again, don’t mix passwords. If you are dead-set on using the same password for everything, possibly changing the last number at each website just to make things slightly different, at least increase your password pool to two. Use one password for your bank and email and the other for everything else.

Note: I’m not actually recommending this. I’m saying this is the least you should do.

Summary

Use a complicated password that no one can guess. Make it kind of random, not about you. If they let you, make it a phrase, like “trees eat ice cream.” This is easy to remember, easy to type, and much harder to guess than “(your-middle-name)2!”.

Use different passwords for different places. Even if you have to write it down somewhere. Use 1Password or something similar to keep track of your passwords. Or if you’d rather, write them in a notebook that you keep in that locked desk drawer that you never knew what the lock was for.
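
If you want the computer to pick the random words for you, a sketch like the following would do it. The tiny word list is only a stand-in I made up; a real one should contain thousands of words, since the strength of the phrase comes from how many words there are to choose from.

```python
import secrets

# A tiny stand-in word list; a real one should hold thousands of words.
WORDS = ["trees", "eat", "ice", "cream", "lamp", "river", "orange", "violin"]

def passphrase(count=4, words=WORDS):
    """Join `count` words, each picked uniformly at random."""
    return " ".join(secrets.choice(words) for _ in range(count))

print(passphrase())
```

The secrets module draws from the operating system’s random source, which is the right choice for anything password-related.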


Using Unison to sync files

I have been using Unison to sync files for the past several years. It does a great job, and can sync between Windows, OS X, and Linux computers. Of course, nowadays you can also use Dropbox for this sort of thing, if you don’t mind the space constraints and security issues. Allway Sync was once my favorite sync program, but it only syncs Windows machines. It took a bit to get Unison going, and I never got the GUI to work, but for the past 3 years it has synced my files both ways without any problems. I have always used these binaries. If you are going to be syncing from one computer to another, you will need to install the same version of Unison on both machines. It syncs via ssh, and only sends the pieces of the files that have changed. I always run Unison from the command line (usually through a LaunchAgent), as follows:

unison -options /local/folder ssh://remote.host/path

The options I use are


Creating private links to files using Python

For a while I’ve been wanting to create a private link system. Google Docs, Dropbox, YouTube, and others all give you the option to make a file public but “unlisted,” with a long link that no one will likely guess. You can email the link to others, and no one has to worry about usernames or passwords. This week I implemented a rudimentary system as a Python cgi script.

Schematic

Each file is assigned an id. The ids and corresponding filenames are stored in a text file. When a user requests an id, the Python script checks if the id is in the table, and, if so, serves up the appropriate file. If the id does not have a corresponding file, the user gets an error message.

The id

You can use anything you want here, really. I use a 10-byte id encoded in base 32 as a length-16 string. You could really use a shorter id and still be okay. The nice thing about base 32 is that it is URL safe, and it doesn’t use 0’s, 1’s or 8’s, to avoid confusion with O’s, I’s, and B’s. You can generate an id using the following code:

import os, base64
# 10 random bytes become a 16-character base 32 string
id = base64.b32encode(os.urandom(10)).decode("ascii")

I store the ids in a text file that looks something like this

NRTDBP5QYKN3WGYP some-file.pdf
WMADW3QOSHSCATWY another-file.pdf
UEGGUKOMB5FXWNR2 a third file.pdf
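
Reading a table like this takes only a few lines. The sketch below is an illustration rather than the code from my script, and the function names are made up. The one thing to get right is splitting at the first space only, because filenames may contain spaces.

```python
def load_table(text):
    """Parse 'ID filename' lines into a dict mapping id to filename."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split at the first space only; the rest is the filename.
        file_id, _, filename = line.partition(" ")
        table[file_id] = filename
    return table

def lookup(table, file_id):
    """Return the filename for an id, or None if the id is unknown."""
    return table.get(file_id)
```

A None result is the cue to send the error message instead of a file.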

Serving up the file

As with any cgi script, you just need to print everything to stdout, starting with the headers. The headers I want to use are

Content-Type: application/pdf
Content-Disposition: inline; filename="name of file.pdf"
Content-Length: (size of file in bytes)

You can replace “inline” with “attachment” if you want the browser to download the file instead of displaying it in the browser. Don’t forget the quotes around the file name if it has any spaces or special characters in it. Also, don’t forget to send a blank line after the headers and before sending the content. Then you finish it off with

print file.read()

The script is here: private-link.py
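
The linked script is not reproduced here. As a rough sketch of the response format described above, written for Python 3 and assuming an ASCII filename (the function name is hypothetical), the whole output can be assembled as bytes:

```python
def build_response(filename, data, disposition="inline"):
    """Return the full CGI output: headers, a blank line, then the file."""
    headers = [
        "Content-Type: application/pdf",
        'Content-Disposition: %s; filename="%s"' % (disposition, filename),
        "Content-Length: %d" % len(data),
    ]
    # The empty line after the headers marks where the content starts.
    head = "\n".join(headers) + "\n\n"
    return head.encode("ascii") + data

# In a CGI script, write the bytes without any re-encoding:
#     import sys
#     with open(path, "rb") as pdf:
#         sys.stdout.buffer.write(build_response(name, pdf.read()))
```

Building everything as bytes keeps the Content-Length honest, since nothing gets added or re-encoded on the way out.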

A little mod_rewrite

So far, the user needs to enter a URL in the form http://example.com/?id=NRTDBP. With the help of mod_rewrite, we can accept URLs like http://example.com/NRTDBP. Here is the relevant .htaccess file, taking into account that the Python script is named index.cgi.

RewriteEngine On
RewriteBase /path/to/folder/
RewriteRule ^index\.cgi - [L]
RewriteRule ^([A-Z0-9a-z]+)/?$ index.cgi?id=$1 [L]

If you are confused about the last line, here is some help on regular expressions.
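
You can also try the pattern out in Python, whose regular expressions behave the same way for a simple pattern like this one. The rewrite function below is only a stand-in for what Apache does: one or more letters or digits, optionally followed by a slash, get captured and passed along as the id.

```python
import re

# Same pattern as the last RewriteRule.
PATTERN = re.compile(r"^([A-Z0-9a-z]+)/?$")

def rewrite(path):
    """Mimic the rule: map 'NRTDBP' to 'index.cgi?id=NRTDBP'."""
    match = PATTERN.match(path)
    if match is None:
        return path  # the rule does not apply
    return "index.cgi?id=" + match.group(1)

print(rewrite("NRTDBP"))   # index.cgi?id=NRTDBP
print(rewrite("NRTDBP/"))  # index.cgi?id=NRTDBP
print(rewrite("a/b"))      # a/b
```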


Managing my library books with a Python script

[As of the most recent upgrade of the Seattle Public Library’s website, you can no longer access your checkouts or holds by RSS, so this no longer works. Sad.]

Seattle Public Library

When I was a kid, my mom used to save all of the receipts from the library and when it was time to take the books back, we would check each one off to make sure none were left behind. Nowadays, you can just check the library website, but that can get tedious: log into my account, find out which books I have checked out, find out which books are on hold, log out of my account, log into my wife’s account, repeat. And soon my kids will have accounts too? So much clicking! Ahh! Luckily, the Seattle Public Library offers both your holds list and your checked-out list in RSS/XML format. It was not hard to write a script to download the RSS file, extract the useful information, and display it nicely. For a long time, I ran this once a day using a LaunchAgent on my home computer. This was inefficient, so I finally decided I should understand how cgi scripting works, because up till now PHP was the only web scripting I had done. Of course, I was embarrassed at how easy cgi scripting really is.

The Python script

The script uses Feed Parser to parse the RSS, which makes things easy. The main idea is this:

import feedparser

feed = feedparser.parse("http://example.com/feed/")
booklist = feed.entries
for book in booklist:
    print(book.title)    # the title of the RSS entry
    print(book.summary)  # the summary of the RSS entry

Other than that, the script is doing some basic extraction using str.find and some list sorting.

Making it work as a cgi

This program is the simplest possible cgi script, because it requires no input. The idea behind cgi is that everything that the program outputs is served to the user. The only thing you have to do is begin your output with an HTTP header like this:

print("Content-Type: text/html; charset=UTF-8\n")

Remember that your header should be followed by a blank line, as above. Of course, you should also be careful about catching errors so they aren’t inserted into the html. The script is here: library.cgi
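
The linked script is not shown here, but the error-catching idea can be sketched in a few lines. The names are hypothetical, and build_body stands for whatever function produces the page. If anything goes wrong while building it, the user gets a short apology instead of a traceback in the middle of the html.

```python
HEADER = "Content-Type: text/html; charset=UTF-8\n"

def render_page(build_body):
    """Return the header, a blank line, and the body or an error notice."""
    try:
        body = build_body()
    except Exception:
        # Never let a traceback leak into the page.
        body = "<p>Sorry, something went wrong.</p>"
    return HEADER + "\n" + body
```

Because the body is built completely before anything is printed, a failure halfway through cannot leave a half-finished page behind.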


Remapping modifier keys in Mac OS X

I feel dumb.

The computer people just installed an iMac in my office to replace a very old computer that was running Ubuntu. Unfortunately, they installed it with a run-of-the-mill keyboard that has the Alt key next to the space bar and the Windows/Super/Command key between Ctrl and Alt. My brain can’t handle it, so I started searching for a keyboard remapper. Eventually I discovered that the ability to remap the modifier keys is built in to Mac OS X. You just go to System Preferences -> Keyboard and click “Modifier Keys.”