Nathan Grigg

Removing Latex log and auxiliary files

(updated

How to write a shell script to delete Latex log files. Also, why you should think about using zsh. [Update: In addition, I reveal my complete ignorance of Bash. See the note at the end.]

I try not to write a lot of shell scripts, because they get long and complicated quickly and they are a pain to debug. I made an exception recently because Latex auxiliary files were annoying me, and a zsh script seemed to be a better match than Python for what I wanted to do. Of course, by the time I was finished adding in the different options I wanted, Python may have been the better choice. Oh well.

For a long time I have had an alias named rmtex which essentially did rm *.aux *.log *.out *.synctex.gz to rid the current directory of Latex droppings. This is a dangerous alias because it assumes that all *.log files in the directory come from Latex files and are thus unimportant. But I’m careful and have never accidentally deleted anything (at least not in this way). What I really wanted, though, was a way to make rmtex recurse through subdirectories, which requires more safety.

Here is what I came up with. (I warned you it was long!) I will point out some of the key points, especially the useful things that zsh provides.

#!/usr/local/bin/zsh

# suppress error message on nonmatching globs
setopt local_options no_nomatch

USAGE='USAGE: rmtex [-h] [-r] [-a] [-n] [-v] [foo]

Argument:
    [foo]   file or folder (default: current directory)

Options:
    [-h]    Show help and exit
    [-r]    Recurse through directories
    [-a]    Include files that do not have an associated tex file
    [-n]    Dry run
    [-v]    Verbose
'

# Option defaults
folders=(.)
recurse=false
all=false
dryrun=false
verb=false
exts=(aux synctex.gz log out)

# Process options
while getopts ":ranvh" opt; do
    case $opt in
    r)
        recurse=true
        ;;
    a)
        all=true
        ;;
    n)
        dryrun=true
        verb=true
        ;;
    v)
        verb=true
        ;;
    h)
        echo $USAGE
        exit 0
        ;;
    \?)
        echo "rmtex: Invalid option: -$OPTARG" >&2
        exit 1
        ;;
    esac
done

# clear the options from the argument string
shift $((OPTIND-1))

# set the folders or files if given as arguments
if [ $# -gt 0 ]; then
    folders=($@)
fi

# this function performs the rm and prints the verbose messages
function my_rm {
    if $verb; then
        for my_rm_g in $@; do   # a glob may expand to several arguments
            if [ -f $my_rm_g ]; then
                echo rm $my_rm_g
            fi
        done
    fi

    if ! $dryrun; then
        rm -f -- $@
    fi
}

# if all, then just do the removing without checking for the tex file
if $all; then
    for folder in $folders; do
        if [[ -d $folder ]]; then
            if $recurse; then
                for ext in $exts; my_rm $folder/**/*.$ext
            else
                for ext in $exts; my_rm $folder/*.$ext
            fi
        else
            # handle the case that they gave a file rather than folder
            for ext in $exts; my_rm "${folder%%.tex}".$ext
        fi
    done

else
    # loop through folders
    for folder in $folders; do
        # set list of tex files inside folder
        if [[ -d $folder ]]; then
            if $recurse; then
                files=($folder/**/*.tex)
            else
                files=($folder/*.tex)
            fi
        else
            # handle the case that the "folder" is actually a single file
            files=($folder)
        fi
        for f in $files; do
            for ext in $exts; do
                my_rm "${f%%.tex}".$ext
            done
        done
    done
fi

# print a reminder at the end of a dry run
if $dryrun; then
    echo "(Dry run)"
fi

It starts out nice and easy with a usage message. (Always include a usage message!) Then it processes the options using getopts.

Zsh has arrays! Notice line 20 defines the default $folders variable to be an array containing only the current directory. Similarly, line 25 defines the extensions we are going to delete, again using an array.

On the subject of arrays, notice that $@ in line 59, which represents the entire list of arguments passed to rmtex, is also an array. So you don’t have to worry about writing "$@" to account for arguments with spaces, like you would have to in Bash.
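To make the quoting difference concrete, here is a small sketch that runs in both Bash and zsh. The count_args helper and the sample arguments are made up for illustration; they are not part of rmtex.

```sh
# Sketch: how quoting affects the number of arguments a function receives.
# count_args is a hypothetical helper that just reports its argument count.
count_args() { echo "$#"; }

# Two positional parameters, one of which contains a space.
set -- "my notes" exams

quoted=$(count_args "$@")    # 2 in Bash and in zsh
unquoted=$(count_args $@)    # 3 in Bash (word splitting), 2 in zsh
echo "quoted: $quoted, unquoted: $unquoted"
```

The quoted form behaves the same everywhere, so it is the safer habit if a script might ever be run by something other than zsh.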

Lines 63 to 75 define a function my_rm which runs rm, but optionally prints the name of each file that it is deleting. It also allows a “dry run” mode.

On to the deleting. First I handle the dangerous case, which is when the -a option is given. This deletes all files of the given extensions, like my old alias. Notice the extremely useful zsh glob in line 82. The double star means to look in all subdirectories for a match. This is one of the most useful features of zsh and keeps me away from unnecessary use of find.

In lines 93 through 112, I treat the default case. The $files variable is set to an array of all the .tex files in a given folder, optionally using the double star to recurse through subdirectories. We will only delete auxiliary files that live in the same directory as a tex file of the same name. Notice lines 98 and 100, where the arrays are defined using globs.

In line 108, I delete each file using the parameter expansion ${f%%.tex}, which removes the .tex extension from $f so I can replace it with the extension to be deleted. This syntax is also available in Bash.
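Here is a minimal sketch of that expansion on its own, which should behave the same in Bash and zsh. The path is a made-up example, not a file the script assumes exists.

```sh
# Sketch: derive the auxiliary file names that belong to one tex file.
f="notes/week1/lecture.tex"   # hypothetical path, for illustration only
base="${f%%.tex}"             # strip a trailing .tex, if there is one

for ext in aux log out synctex.gz; do
    echo "$base.$ext"         # the names rmtex would try to remove
done
```

Because the pattern .tex contains no wildcards, ${f%.tex} and ${f%%.tex} give the same result here. If $f does not end in .tex, the expansion leaves it unchanged.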

My most common use of this is as rmtex -r to clean up a tree full of class notes, exams, and quizzes that I have been working on, so that I can find the PDF files more easily. If I’m feeling especially obsessive, I can always run rmtex -r ~, which takes a couple of minutes but leaves everything squeaky clean.

[Update: While zsh is the shell where I learned how to use arrays and advanced globs, that doesn’t mean that Bash doesn’t have the same capabilities. Turns out I should have done some Bash research.

Bash has arrays too! Arrays can be defined by globs, just as in zsh. The syntax is slightly different, but works just the same. Version 4 of Bash can even use ** for recursive globbing, once you turn it on with shopt -s globstar.
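As a rough sketch of what that looks like, assuming Bash 4 or later, the following builds an array of tex files from a recursive glob. The temporary directory and the file names in it are invented purely so the example has something to match.

```bash
#!/usr/bin/env bash
# Sketch: a Bash 4+ version of the zsh array-from-glob idiom.
shopt -s globstar nullglob    # enable **, and let nonmatching globs expand to nothing

demo=$(mktemp -d)             # throwaway tree, purely for illustration
mkdir -p "$demo/notes/week1"
touch "$demo/exam.tex" "$demo/notes/week1/lecture.tex" "$demo/exam.log"

files=("$demo"/**/*.tex)      # array defined by a recursive glob
tex_count=${#files[@]}

for f in "${files[@]}"; do    # the quotes keep names with spaces intact
    echo "found: ${f#"$demo"/}"
done
echo "count: $tex_count"

rm -rf "$demo"                # clean up the throwaway tree
```

The main differences from the zsh version are the shopt line and the need to write "${files[@]}" rather than a bare $files when looping.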

Thanks to John Purnell for the very gracious email. My horizons are expanded.]