Here is my technology setup. Amy always makes fun of me because she thinks I change things around too often. “If the act of optimizing takes more time than the optimization saves, what’s the point?” she often asks. She has a good point. But I’ll continue to ignore it.
At home, I have a 2009 21-inch iMac. For the most part, I stick with the standard tools—I use Safari, Mail, iPhoto, Terminal, and Preview. For Latex, which I use to typeset everything math, I use BBEdit to edit and TeXShop to compile and preview. I use BibDesk to keep track of papers I read, and I read most papers in full-screen Preview, unless I’m reading on my iPad. I also use BBEdit for pretty much all the rest of my text-editing needs.
Some other apps that I love and use often: FaceTime, to talk to family, Caffeine, to keep my Mac awake, and Notational Velocity for small notes. I use Soulver as my calculator and Day One for keeping a journal. Oh, and 1Password for passwords.
On my iPad, I use PDF Expert to read and annotate papers. I use Instapaper, Twitter, and Reeder to keep up with goings-on in the world. I play Carcassonne and Lab Solitaire. I use Day One for my journal, the Kindle app to read books, and Verbs for instant messaging, mostly with Amy.
I use my iPod touch to listen to podcasts and check my mail or Twitter at times when the iPad would be inconvenient. I use TweetBot because I love its “View replies to this tweet” feature. I use OneBusAway when I take the bus. I use Evernote to take pictures of the whiteboard when it looks like it might be worth saving.
Before we talk math, you should know that part of why I care about this is that I take issue with the Malthusian prophecies and general spreading of fear. Many people believe that if we don’t take drastic action (on population growth) soon, the fragile planet will collapse under the weight of its huge population. I think this is not only false, but also misguided. If we are worried about the planet’s resources (and I think we should be), we should be concentrating on limiting our aggregate effect on the environment, instead of assuming that a smaller population will fix all (or any) of our problems.
I do not think that word means what you think it means
Population growth is exponential. It always has been and always will be. But “exponential” does not mean “huge”. It means that the change in population (number of births minus number of deaths) is a multiple of the current population. This is why we usually talk about population growth in terms of percentages instead of absolute growth.
Of course, when we see a big round number like 7 billion, the percentage talk goes out the window and we start comparing how long it took the world to reach each successive big round number. Did you know that population increased from 6 to 7 billion in less time than it increased from 4 to 5 billion? Therefore population is growing faster now than it was in the 60’s, right?
If we were talking about the number of tater tots produced in a factory, then adding a billion to 4 billion to get 5 billion would be exactly the same as adding a billion to 6 billion to get 7 billion. But people aren’t produced by machines. We are the machines. So even though birth rates are lower today than in the 60’s, the population adds each billion faster, because there are more of us to do the adding. In other words, adding a billion to 4 billion (25%) is harder than adding a billion to 6 billion (17%).
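Here is a quick sketch of that arithmetic in Python. The function names are made up for illustration. The second function measures the gap between two populations as the log of their ratio, which is one way to make precise the idea that the same billion counts for less when the starting population is larger.

```python
import math


def relative_increase(start, end):
    """Fractional growth needed to get from start to end."""
    return (end - start) / start


def log_distance(start, end):
    """Gap between two values when measured on a logarithmic scale."""
    return math.log(end / start)


for start, end in [(4e9, 5e9), (6e9, 7e9)]:
    print(
        f"{start / 1e9:.0f} to {end / 1e9:.0f} billion: "
        f"{relative_increase(start, end):.0%} increase, "
        f"log distance {log_distance(start, end):.3f}"
    )
```

Both measures agree that going from 6 to 7 billion is a smaller step than going from 4 to 5 billion.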
Visualizing the data
Mathematicians would say that in the realm of exponential growth, 6 and 7 billion are closer together than 4 and 5 billion are. Then they would write down some formulas and theorems to indicate exactly what they mean by “closer together”. At the end of the process, you’d get what we call a log graph. This is just like any other graph, except we stretch out the y-axis (by applying the log function) to reflect our new understanding of distance. Here’s what you get.
The graph on the bottom is the growth rate of the population. You see here that population growth peaked in the 60’s and has been decreasing since. You can see this in the population graph as an inflection point, where the graph starts to level out. In fact, the UN predicts that before the end of this century, the graph will hit its maximum (of about 10 billion) and then start to decrease slightly, finally stabilizing at about 9 billion. Of course, this is just an extrapolation of past trends, and no one knows how accurate these predictions will be.
The point I’m trying to make, though, is that it is hard to see these trends by looking at the population graph you usually see. Looking at those graphs, you would say that anyone who believes that world population is “leveling off” is way off track.
It is the people who treat population growth as a giant odometer that are not seeing things clearly.
The graph uses numbers from the U.S. Census Bureau and HYDE, from this Wikipedia page. Also, tater tots were invented by my uncle.
I’m spending some time typing up some recent research. Of course, it’s all very mathematical, so I am using Latex. Latex is a markup language, which means that you write the document in a plain text editor, using codes to indicate font changes or things like that. For example, to typeset some text in italics, instead of pushing a button in some program, you type \emph{word} to mean that word should be emphasized (italicized, it turns out). When you are ready to see your document, you run a program which reads in the text file and outputs a PDF file. It is very useful because it creates well-typeset documents and also has features to make typing math really easy.
The rest of this post is meant for people who already know how to use Latex.
One thing that has often bugged me about my Latex workflow is margins. Not margins in the actual printed document, which you don’t really want to change because that would make your lines too long to easily read. No, I’m talking about the margins I see while I am working on the document: I have a preview window open so I can see what the thing will look like when it’s done, and 40% of the preview window is wasted on margins.
So here is how you remove margins without changing the line length: by changing the paper size. The easiest way is to use the geometry package, which comes standard with any modern Tex distribution. Just place the following in your preamble:
I’ve always loved ASCII. As a kid, I spent a considerable amount of time studying the code chart that was printed in our Epson dot matrix printer’s user manual.
The one thing that I always sort of wondered, but never really asked myself, was “Why do they leave space between the uppercase letters and lowercase letters?” (I’m talking about [, \, ], ^, _, and `.) I thought it was a little annoying, actually, but I never questioned it, because that was just the way it was.
I can’t believe that it is only now that I find out that they wanted the lowercase letters and uppercase letters to differ by only one bit.
For example, the code for N is 4E, and the code for n is 6E. In binary, then, N is 1001110 and n is 1101110. And if you want to change something to all caps? Just change that second 1 to a 0, and you are good.
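The same trick in a few lines of Python, as a minimal sketch. The names CASE_BIT, to_upper, and to_lower are invented for this example, and it only handles plain ASCII letters.

```python
CASE_BIT = 0x20  # the single bit that separates 'N' (0x4E) from 'n' (0x6E)


def to_upper(ch):
    """Clear the case bit of a lowercase ASCII letter; leave anything else alone."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch


def to_lower(ch):
    """Set the case bit of an uppercase ASCII letter; leave anything else alone."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) | CASE_BIT)
    return ch


print(format(ord("N"), "07b"), format(ord("n"), "07b"))  # 1001110 1101110
print("".join(to_upper(c) for c in "hello, world"))      # HELLO, WORLD
```

The range check matters: clearing that bit on a character that is not a letter would turn it into something unrelated.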
What I wanted was a way to print git information in a Latex file in a way that (1) doesn’t modify the actual source and (2) degrades gracefully, that is, my document will still compile for someone else, even if they do not do things my way.
Setting up the Latex source
I start by putting the macro \RevisionInfo where I want it in
my Latex source file. I also put a \providecommand command in the
preamble to define the default value and ensure that the document
compiles even when the git information is not given.
With a little effort, you can coax git to output information about the most recent commit in the form you want. For example:
git log -1 --date=short --format=format:\
'\newcommand{\RevisionInfo}{Revision %h on %ad}'
Then you get Latex to put this at the beginning of the source file as you are compiling:
latex $(git log -1 .....)\input{document.tex}
As I said, I only do this if I’m planning on printing or emailing the pdf. The nice thing is that if I’m working on the project with someone else, and they aren’t using git, it doesn’t matter. Everything still works just fine for them, except copies they compile don’t get commit information on them.
Some Related Ideas
Since I use BBEdit to write most of my Latex, it is easy to make a script that will “Typeset inserting git info.”
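Such a script might look roughly like the following Python sketch. It is not the BBEdit script itself; the function names are hypothetical, and it assumes that git and latex are both on the PATH. If git fails (for example, outside a repository), it passes an empty macro so that the \providecommand default in the preamble takes over.

```python
import subprocess

# Same format string as the git command above.
GIT_FORMAT = r"format:\newcommand{\RevisionInfo}{Revision %h on %ad}"


def git_revision_macro():
    """Return the \\newcommand line for the latest commit, or '' if git fails."""
    try:
        result = subprocess.run(
            ["git", "log", "-1", "--date=short", "--format=" + GIT_FORMAT],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return ""  # not a repository, or git is not installed
    return result.stdout.strip()


def latex_argument(macro, filename):
    """Build the single argument handed to latex: the macro, then \\input{file}."""
    return macro + r"\input{" + filename + "}"


def typeset(filename="document.tex"):
    """Compile the document with the revision macro defined up front."""
    return subprocess.run(["latex", latex_argument(git_revision_macro(), filename)])
```

Calling typeset("document.tex") from the project directory does the same job as the one-line shell command.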
In the time between when I figured this stuff out and when I wrote this post, a
package called gitinfo by Brent Longborough was posted on CTAN. It is
almost exactly what I wanted to do, but in package form. It will also compile
fine even when not inside the repository and it has the added benefit of being
much more automatic (once you set it up). The downside is that whoever compiles
it needs a copy of the gitinfo package.
I am always a little disappointed when I look up the current temperature
on the internet or a weather app. One number can only tell you so much
about what’s going on outside. We try to make up for it by reporting the
high and low temperature for the day, but there’s a lot more to a
function than one data point plus two extreme values. Luckily the
University of Washington records the temperature on the roof of the ATG
every minute and allows you to download it in csv format. From there, a
little messing with gnuplot makes it readable, and I really know what
the temperature is doing. Here’s an example:
The Python script
The Python script downloads the last 12 hours’ worth of temperature
readings from the University of Washington weather station. The readings
are available as a csv file. The script then extracts the useful
information from the csv file and converts the times into a format that
gnuplot understands. Also, it deals with time zone issues.
It then feeds the data through gnuplot to draw the
graph and outputs the graph to the user. It also caches the graph to
prevent unnecessary strain on my or the weather station’s server.
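The conversion step might look something like this minimal sketch. The csv layout is an assumption made for illustration (a UTC timestamp followed by a temperature, with no header row); the real file may be laid out differently. The fixed UTC-8 offset is also a simplification that ignores daylight saving time.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

GNUPLOT_TIMEFMT = "%Y-%m-%d-%H-%M"  # the format gnuplot is told to expect
PACIFIC_STANDARD = timezone(timedelta(hours=-8))  # assumed fixed offset, no DST


def to_gnuplot_lines(csv_text, local_tz=PACIFIC_STANDARD):
    """Turn 'UTC timestamp,temperature' rows into 'local-time temperature' lines."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        utc_time = datetime.strptime(row[0].strip(), "%Y-%m-%d %H:%M:%S")
        local_time = utc_time.replace(tzinfo=timezone.utc).astimezone(local_tz)
        lines.append(f"{local_time.strftime(GNUPLOT_TIMEFMT)} {float(row[1])}")
    return lines


# Made-up sample readings, not real weather station data.
sample = "2012-01-15 20:00:00,41.2\n2012-01-15 20:01:00,41.3\n"
print("\n".join(to_gnuplot_lines(sample)))
```

The resulting lines are what gets piped to gnuplot on stdin.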
The gnuplot instruction
The main plot command is
plot "-" using 1:2 smooth bezier lt 3 lw 2 notitle
The "-" means the data file will be given on stdin, but you could also use a filename here. The using 1:2 tells it to use columns 1 and 2 for the x and y data, respectively. Then smooth bezier tells it to smooth the data instead of just connecting all the dots. Color is controlled by lt 3 and line weight by lw 2. Counterintuitively, notitle eliminates the key.
reset
# configure svg output
set term svg size 600 480 dynamic fname 'Helvetica'
# tell it that the x-axis represents time
set xdata time
# set the format of the data file
set timefmt "%Y-%m-%d-%H-%M"
# set the format of the axis labels
set format x "%l%p"
# display y-axis labels on the right side with gridlines
set y2tics border
set noytics
set grid y2tics
# axis labels and plot title
set xlabel "Time"
set ylabel "degrees Fahrenheit"
set title "Last 12 hours temperature at UW weather station"
# draw the plot
plot "-" using 1:2 smooth bezier lt 3 lw 2 notitle
Private key authentication is a way to log into another computer via SSH, and is an alternative to the username/password authentication. It can be more secure, because no one will ever guess your private key, and your private key is never sent over the network, so it cannot be intercepted. It can also be more convenient, because if you don’t assign a password to the private key, you don’t have to type a password to use it.
I create a separate key pair for each computer I use, so that I can always adjust which computers are allowed to log into which computer. I always forget how the ssh-keygen command works, though, and that is the main reason I’m writing this down.
Creating a key pair
The command you want to use is
ssh-keygen -t rsa -b 2048 -C comment
The first two options may be unnecessary because on my computer they are the default values. On at least one of the servers I use, however, they are required. The comment is also unnecessary, but helpful.
Using the keys
If you want to use this key to connect to another computer, that computer needs to have a copy of your public key, usually stored in the file
~/.ssh/authorized_keys.
Once I create a keypair for each computer I use, I copy all the public keys into a subdirectory of ~/.ssh that I call authorized_keys.d. It helps to give each key a more useful name like iMac.pub or office.pub. Then I run
cat authorized_keys.d/* > authorized_keys
Repeat for each host that you want to connect to. The good thing is, if I want to authorize (or unauthorize) another computer, I just add (or remove) the new public key to the directory and rerun this command.
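If you would rather not rely on the shell, the same rebuild step can be sketched in Python. The function name is hypothetical. Unlike plain cat, this version makes sure each key ends with a newline, and it tightens the file permissions, since sshd is often configured to ignore key files that other users can write to.

```python
from pathlib import Path


def rebuild_authorized_keys(ssh_dir):
    """Concatenate every file in authorized_keys.d into authorized_keys."""
    ssh_dir = Path(ssh_dir)
    key_files = sorted(
        p for p in (ssh_dir / "authorized_keys.d").iterdir() if p.is_file()
    )
    keys = [p.read_text().strip() for p in key_files]
    target = ssh_dir / "authorized_keys"
    target.write_text("\n".join(keys) + "\n")
    target.chmod(0o600)  # readable and writable by the owner only
    return len(keys)
```

Calling rebuild_authorized_keys(Path.home() / ".ssh") rewrites the real file, so try it on a scratch directory first.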
If you want to safely guard your passwords, you should first understand
how your password could be “stolen” or discovered. Here are some
scenarios.
You tell someone.
Oops. Either you actually tell them (be careful who you trust) or you enter it on a phishing site or respond to an email (don’t do it!).
What you can do: protect your passwords by never telling anyone, for any reason. Minimize the potential damage by using different passwords for different sites.
Someone guesses your password.
Maybe they try your phone number or your birthday or something else that they know about you.
What you can do: try to choose passwords that aren’t about you. Choose random words from the dictionary. If your brother could guess in 5 tries what your password is (or all but one letter of your password), then you should use a different password. Not just because your brother might one day try to steal your identity, but because if he knows something about you, then your Facebook friends probably do too.
Someone steals your password over wireless internet.
There are two main kinds of encryption happening when you use wireless internet. First: if you are visiting a “secure” site, the kind where the URL starts with https, then the stuff you send is encrypted from the moment it leaves your computer until it is received by Google’s or your bank’s computer. Big companies (Facebook, Google, Microsoft, Amazon, your bank) will at the very least make sure your password is sent in this secure method. Often they will encrypt everything you send or receive. Smaller websites may not.
The second encryption happens when you are using secured wireless, the kind where you have to enter a password. In this case everything you do is encrypted from your computer to the wireless access point.
If you are using unsecured wireless and entering your password into an unsecured site, then anybody on the same wireless network as you could be running a program that intercepts your password and steals it.
What you can do: Don’t mix passwords. If you can’t use a different password for everything, you should at least not mix important passwords (which are likely to be safe by method one) with less important passwords. If you use the same password to log into your bank or email as you do to log into some Harry Potter fan site, you are asking for trouble.
Someone hacks into one of the websites you use and discovers your password.
This is much less likely to be a problem for reputable websites for many reasons.
What you can do: Again, don’t mix passwords. If you are dead-set on using the same password for everything, possibly changing the last number at each website just to make things slightly different, at least increase your password pool to two. Use one password for your bank and email and the other for everything else.
Note: I’m not actually recommending this. I’m saying this is the least you should do.
Summary
Use a complicated password that no one can guess. Make it kind of random, not about you. If they let you, make it a phrase, like “trees eat ice cream.” This is easy to remember, easy to type, and much harder to guess than “(your-middle-name)2!”.
Use different passwords for different places. Even if you have to write it down somewhere. Use 1Password or something similar to keep track of your passwords. Or if you’d rather, write them in a notebook that you keep in that locked desk drawer that you never knew what the lock was for.
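If you want the computer to pick the random words for you, here is a minimal sketch. The eight-word list is a stand-in made up for this example; a real word list should contain thousands of words, or the phrase will be easy to guess.

```python
import secrets

# Hypothetical tiny word list, for illustration only.
WORDS = ["trees", "eat", "ice", "cream", "planet", "ladder", "violet", "harbor"]


def random_passphrase(word_count=4, words=WORDS):
    """Join words picked with a cryptographically secure random generator."""
    return " ".join(secrets.choice(words) for _ in range(word_count))


print(random_passphrase())
```

The secrets module is used instead of random because it is designed for security-sensitive choices.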
I have been using Unison to
sync files for the past several years. It does a great job, and can sync
between Windows, OS X, and Linux computers. Of course, nowadays you can
also use Dropbox for this sort of thing, if you
don’t mind the space constraints and security issues. Allway
Sync was once my favorite sync program, but it
only syncs Windows machines. It took a bit to get Unison going, and I
never got the GUI to work, but for the past 3 years it has synched my
files both ways without any problems. I have always used these
binaries. If you are
going to be synching from one computer to another, you will need to
install the same version of Unison on both machines. It syncs via ssh,
and only sends the pieces of the files that have changed. I always run
unison from the command line (usually through a LaunchAgent), as
follows:
-perms 960 This mask is applied to permissions of everything. Note
960=0o1700, so in my case I am making sure that my local files
(which are usually world-readable by default) are only readable by
me on the server.
-auto syncs without asking, unless there are conflicts
-addversionno calls unison-40 instead of unison on the remote
server. I need this because the remote server has a really old
version of unison installed.
-batch -silent Ignores conflicts completely, instead of asking the
user, and prints no output. I only use these in the automated
version that runs once an hour. I rarely (less than twice a year) have
conflicts.
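The octal arithmetic behind the -perms value is easy to check in Python. This sketch only illustrates the bit mask itself; the helper name masked is made up, and the sample mode 0o644 stands in for a typical world-readable file.

```python
import stat

PERMS_MASK = 960  # the decimal value passed to -perms

# 960 in decimal is 1700 in octal: the sticky bit plus the owner's rwx bits.
assert PERMS_MASK == 0o1700 == stat.S_ISVTX | stat.S_IRWXU


def masked(mode, mask=PERMS_MASK):
    """Keep only the permission bits selected by the mask."""
    return mode & mask


print(oct(PERMS_MASK))     # 0o1700
print(oct(masked(0o644)))  # 0o600: the group and world bits are dropped
```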
For a while I’ve been wanting to create a private link system. Google
Docs, Dropbox, YouTube, and others all give you the option to make a
file public but “unlisted,” with a long link that no one will likely
guess. You can email the link to others, and no one has to worry about
usernames or passwords. This week I implemented a rudimentary system as
a Python cgi script.
Schematic
Each file is assigned an id. The ids and corresponding filenames are
stored in a text file. When a user requests an id, the Python script
checks if the id is in the table, and, if so, serves up the appropriate
file. If the id does not have a corresponding file, the user gets an
error message.
The id
You can use anything you want here, really. I use a 10-byte id encoded
in base 32 as a length-16 string. You could really use a shorter id and
still be okay. The nice thing about base 32 is that it is URL safe, and
it doesn’t use 0’s, 1’s or 8’s, to avoid confusion with O’s, I’s, and
B’s. You can generate an id using the following code:
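A minimal sketch of such a generator, using only the standard library. The function name new_id is made up for illustration. Ten random bytes are 80 bits, which base 32 encodes as exactly 16 characters with no padding.

```python
import base64
import secrets


def new_id(num_bytes=10):
    """Return a random id: num_bytes random bytes encoded in base 32."""
    return base64.b32encode(secrets.token_bytes(num_bytes)).decode("ascii")


print(new_id())
```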
I store the ids in a text file that looks something like this
NRTDBP5QYKN3WGYP some-file.pdf
WMADW3QOSHSCATWY another-file.pdf
UEGGUKOMB5FXWNR2 a third file.pdf
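Reading that table and looking up an id could be sketched as follows. The function names are hypothetical. Each line is split only at the first space, so file names that contain spaces survive intact.

```python
def load_table(text):
    """Parse 'ID filename' lines into a dictionary mapping id to file name."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        file_id, _, filename = line.partition(" ")
        table[file_id] = filename.strip()
    return table


def lookup(table, requested_id):
    """Return the file name for an id, or None if the id is unknown."""
    return table.get(requested_id)


table = load_table(
    "NRTDBP5QYKN3WGYP some-file.pdf\n"
    "WMADW3QOSHSCATWY another-file.pdf\n"
    "UEGGUKOMB5FXWNR2 a third file.pdf\n"
)
print(lookup(table, "UEGGUKOMB5FXWNR2"))  # a third file.pdf
```

A result of None is the cue to send the error message instead of a file.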
Serving up the file
As with any cgi script, you just need to print everything to stdout,
starting with the headers. The headers I want to use are
Content-Type: application/pdf
Content-Disposition: inline; filename="name of file.pdf"
Content-Length: (size of file in bytes)
You can replace “inline” with “attachment” if you want the browser to
download the file instead of displaying it in the browser. Don’t forget
the quotes around the file name if it has any spaces or special
characters in it. Also, don’t forget to send a blank line after the
headers and before sending the content. Then you finish it off with
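Putting the headers and the file together might look like this sketch. The function name serve_pdf and its parameters are made up for illustration. It writes bytes rather than text, because a PDF is binary data.

```python
import os
import sys


def serve_pdf(path, display_name, out=None):
    """Write the headers, a blank line, and then the raw file contents."""
    out = sys.stdout.buffer if out is None else out
    size = os.path.getsize(path)
    headers = (
        "Content-Type: application/pdf\n"
        f'Content-Disposition: inline; filename="{display_name}"\n'
        f"Content-Length: {size}\n"
        "\n"  # the blank line that separates headers from content
    )
    out.write(headers.encode("utf-8"))
    with open(path, "rb") as pdf:
        out.write(pdf.read())
    out.flush()
```

In the cgi script, a call such as serve_pdf("files/some-file.pdf", "some-file.pdf") would send everything to stdout; the files/ directory here is a placeholder.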
So far, the user needs to enter a URL in the form
http://example.com/?id=NRTDBP. With the help of mod_rewrite, we can
accept URLs like http://example.com/NRTDBP. Here is the relevant
.htaccess file, taking into account that the Python script is named
index.cgi.
[As of the most recent upgrade of the Seattle Public Library’s
website, you can no longer access your checkouts or holds by RSS, so
this no longer works. Sad.]
When I was a kid, my mom used to save all of the receipts from the library
and when it was time to take the books back, we would check each one off
to make sure none were left behind. Nowadays, you can just check the
library website, but that can get tedious: log into my account, find out
which books I have checked out, find out which books are on hold, log
out of my account, log into my wife’s account, repeat. And soon my kids
will have accounts too? So much clicking! Ahh! Luckily, the Seattle
Public Library offers both your holds list and your checked-out list in
RSS/XML format. It was not hard to write a script to download
the RSS file, extract the useful information, and display it nicely. For
a long time, I ran this once a day using a LaunchAgent on my home computer. This was inefficient, so I finally decided I should understand how cgi scripting
works, because up till now php was the only web scripting I had done. Of
course, I was embarrassed at how easy
cgi scripting really is.
The Python script
The script uses Feed Parser to parse the
RSS, which makes things easy. The main idea is this:
import feedparser

feed = feedparser.parse("http://example.com/feed/")
booklist = feed.entries
for book in booklist:
    print(book.title)    # the title of the RSS entry
    print(book.summary)  # the summary of the RSS entry
Other than that, the script is doing some basic extraction using
str.find and some list sorting.
Making it work as a cgi
This program is the simplest possible cgi script, because it requires no
input. The idea behind cgi is that everything that the program outputs
is served to the user. The only thing you have to do is begin your
output with an HTTP header like this:
print("Content-Type: text/html; charset=UTF-8\n")
Remember that your header should be followed by a blank line, as above.
Of course, you should also be careful about catching errors so they aren’t
inserted into the html. The script is here:
library.cgi
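One way to keep errors out of the page is to build the whole body before printing anything. This is a minimal sketch of that idea, not the code in library.cgi; the function names are hypothetical.

```python
def render_page(build_body):
    """Return the full cgi response, replacing any error with a plain apology."""
    try:
        body = build_body()
    except Exception:
        # Never show the traceback to the visitor.
        body = "<p>Sorry, something went wrong.</p>"
    return "Content-Type: text/html; charset=UTF-8\n\n" + body


def example_body():
    """Stand-in for the code that fetches and formats the book list."""
    return "<p>No books checked out.</p>"


print(render_page(example_body))
```

Because nothing is printed until the body is complete, a failure halfway through cannot leave a half-written page behind.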
The computer people just installed an iMac in my office to
replace a very old computer that was running Ubuntu. Unfortunately, they
installed it with a run-of-the-mill keyboard that has the Alt key next
to the space bar and the Windows/Super/Command key between Ctrl and
Alt. My brain can’t handle it, so I started searching for a keyboard
remapper. Eventually I discovered that the ability to remap the modifier
keys is built in to Mac OS X. You just go to System Preferences ->
Keyboard and click “Modifier Keys.”