Saturday, May 17, 2008

bash history forever

In what seems eerily like a burst of structured procrastination, I recently got around to a task I'd been meaning to do for ages: infinite, global bash history across all of my shells.
I started by reading over how the history mechanism in bash is supposed to work, and then a co-worker and I set about writing the simplest hack that could actually record all my command lines and aggregate them for me.
Step 1 is to make sure that every typed command gets recorded somewhere. Extending my PROMPT_COMMAND to invoke a function that in turn calls other functions is a good starting place:

export PROMPT_COMMAND='process_prompt'

function process_prompt {
    write_last_command      # record this shell's most recent command line
    write_global_history    # re-merge every shell's records into one file
    read_global_history     # reload the merged history into this shell
    xsettitle $HN $(shortenpath $PWD)
}

I'm going to ignore the titlebar-setting stuff, as it's not related to this post. As you can see, the basic structure at the top level is to record the last command typed in the current shell, recompute the global history, and then load the global history back into this shell. Inefficient? Horribly so. Can I tell? Not on today's monster workstations.

Now, recording the commands of the current shell requires a file to write to:

# One command file per shell, named by host and process ID;
# remove it when the shell exits.
COMMANDFILE=~/.histories/$HN.$$
trap "rm -f $COMMANDFILE" INT TERM EXIT

where in the above snippet $HN is my hostname after processing. If you're actually following along, you will also want some lines like this:

export HN=$(hostname)
HN=${HN%%.*}    # strip the domain: "foo.example.com" becomes "foo"

to set the hostname. The commands from each shell, then, are written to my home directory in a file named .histories/hostname.processid that gets deleted when that shell exits. The actual writing of each command line is, amusingly enough, accomplished via the history bash built-in:

function write_last_command {
    # Prefix the current epoch time, strip the leading history number,
    # and append the resulting "timestamp.command" line to this shell's file.
    echo $(date +%s).$(history 1 | sed -e 's/^[ 0-9]*//') >> $COMMANDFILE
}
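
Each line of the file pairs an epoch timestamp with the command text, separated by a dot. So, for instance (with a made-up timestamp), a recorded line looks like:

1211049057.grep -r PROMPT_COMMAND ~/.bashrc

That timestamp prefix is what lets the merge step below order and deduplicate everything.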

The next step is to merge all the files in ~/.histories into a master file:

function dedup_history {
    # Sort newest-first, then run a stable unique sort keyed on the
    # command text (field 2 onward, dot-separated) so only the newest
    # copy of each command survives, then restore chronological order.
    sort -nr | sort -t\. -k 2 -s -u | sort -n
}

function write_global_history {
    # Merge every per-shell file, plus the previous global file (which
    # is how commands from long-exited shells live on), then swap the
    # result into place with a rename.
    cat $(find ~/.histories -type f) | dedup_history > ~/.histories/global.$$
    mv ~/.histories/global.$$ ~/.histories/global
}
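
To see what dedup_history actually buys you, here's a made-up run over three recorded lines, two of which are the same command:

printf '%s\n' '1211000001.ls -la' '1211000002.make test' '1211000003.ls -la' | dedup_history

The reverse numeric sort puts the newest entries first, the stable unique sort on the command text then keeps only that newest copy of each command, and the final sort restores chronological order, printing:

1211000002.make test
1211000003.ls -la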

And finally, load the history into RAM:

function read_global_history {
    # Strip the timestamp prefixes and write the merged history out to
    # $HISTFILE (which defaults to ~/.bash_history).
    sed -e 's/[0-9]*\.//' < ~/.histories/global > $HISTFILE
    echo '-------------------' >> $HISTFILE
    # Append this shell's last ten commands after the separator so that
    # up-arrow reaches them first.
    cat $COMMANDFILE | dedup_history |
        sed -e 's/[0-9]*\.//' | tail -10 >> $HISTFILE
    history -c    # clear the in-memory history list...
    history -r    # ...and re-read it from $HISTFILE
}
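
The net result is that $HISTFILE ends up looking something like this (contents invented for illustration): the merged global history, the separator, then the current shell's own recent tail:

make test
git diff
ssh build-box
-------------------
cd ~/src
make test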

I personally did all this so that I'd have all my command lines, forever, across all of my computers. So, clearly the default history size was far too small:

export HISTCONTROL="erasedups"
export HISTFILESIZE=10000
export HISTSIZE=10000
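
One gap if you're pasting all of these snippets into a ~/.bashrc: nothing above creates the history directory itself, so you'll want something like

mkdir -p ~/.histories

near the top, before the COMMANDFILE assignment, or the very first write will fail with "No such file or directory".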

And that's a wrap! A simple script that copies the global history from each workstation up to a central repository, and pulls the other workstations' histories back down, makes the whole thing cross-machine. For example, you can periodically copy ~/.histories/global to a central location as global.$HN; copying the global.* files back into your history directory gets all the command lines from your other workstations merged into your current environment.
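A minimal sketch of that sync step, assuming passwordless ssh to a hypothetical central host named hub with a histories directory in its home, is just two copies:

# Push this machine's merged history up under its own name...
scp ~/.histories/global hub:histories/global.$HN
# ...and pull every machine's copy back down; the next
# write_global_history run folds them into the local global file.
scp 'hub:histories/global.*' ~/.histories/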
I've had this scheme running for a few days, and love it! The idea that I now have a persistent store of all useful command lines may even change how I choose to write those command lines, making me deliberately include extra but useful information in them. Enjoy.


1 Comment:

Blogger Shlomo said...

Cool. I had a similar impulse and wrote a client/server tool to achieve something similar. It would be great if you checked it out. http://www.cuberick.com/2009/02/bash-history-in-cloud.html

11:57 PM  
