Git-Style Automatic Paging in Ruby
I was using Chris Wanstrath’s cheat the other day, seeing if there were any cool git features I was missing out on (did you know you can color the output?). If you haven’t come across it yet, cheat is a nice little Ruby utility that displays “cheat sheets”— user-generated pages of text that serve as miniature reference manuals.
Unfortunately, some of these cheat sheets can get pretty long.
The git one is 228 lines.
Some of the text went off the top of my terminal.
I sighed and typed in cheat git | less,
thinking once again how nice it would be if more programs followed git’s example
and automatically paged their output.
Although git’s not usually held up as a paragon of usability [1],
there are a few places where it shines.
My favorite is how it’ll run less on its output
when the output is too big to fit on my terminal screen.
Then I can easily scroll and search through the text.
Since I had a bit of time with nothing urgent to do, I decided to take a crack at making cheat page like git. Two and a half hours later, after digging through git’s source, getting help from the good folks in #git and #ruby-lang on Freenode, and receiving tons of bug fixes from Kevin Ballard, I got it to work.
I was actually surprised at how short the code was. It’s not simple, but it is small. The version below is nicer than the bare minimum, but I think you could make do with about eight lines of code.
def run_pager
  return if PLATFORM =~ /win32/
  return unless STDOUT.tty?

  read, write = IO.pipe

  unless Kernel.fork # Child process
    STDOUT.reopen(write)
    STDERR.reopen(write) if STDERR.tty?
    read.close
    write.close
    return
  end

  # Parent process, become pager
  STDIN.reopen(read)
  read.close
  write.close

  ENV['LESS'] = 'FSRX' # Don't page if the input is short enough

  Kernel.select [STDIN] # Wait until we have input before we start the pager
  pager = ENV['PAGER'] || 'less'
  exec pager rescue exec "/bin/sh", "-c", pager
end
Upon getting this to work, I promptly forked cheat
and added it there.
The really cool thing about this method is that it can be dropped into any Ruby app.
Just call run_pager, and everything you print to standard output will be paginated.
How this works is a bit tricky. It does some dark magic with Unix processes. At a high level, it:
- Creates a child process that’s a copy of the current process.
- Hooks the child’s standard output to the original process’s standard input.
- Replaces the original process with the pager program.
Then the child process continues on, unaware that anything has changed. The only difference between it and the original program is that its output is being sent to the pager, which is now the program the user is directly interacting with. This clever trick is pretty much the same thing Git does.
The Code
Now, let’s see how this works.
return if PLATFORM =~ /win32/
return unless STDOUT.tty?
The stream- and process-munging we do is only really possible on Unix, so we give up if we’re running under Windows. We also don’t want to bother invoking the pager if we aren’t actually talking to a terminal (a.k.a. tty).
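Here is a small sketch of those two guard checks pulled out into a helper, so they can be tried on their own. The name pager_usable? and its parameters are mine, not part of run_pager. One thing worth knowing: the bare PLATFORM constant was removed in Ruby 1.9, so on newer Rubies the check needs RUBY_PLATFORM instead. The wider regex is an assumption about which platform strings mean Windows.

```ruby
# Hypothetical helper (not part of the original run_pager).
# RUBY_PLATFORM replaces the bare PLATFORM constant, which was removed in Ruby 1.9.
def pager_usable?(io = STDOUT, platform = RUBY_PLATFORM)
  # Assumed set of Windows platform names; adjust for your targets.
  return false if platform =~ /mswin|mingw|win32/
  io.tty?
end

# A file such as the null device is not a terminal, so we would not page.
File.open(File::NULL, "w") do |null|
  puts pager_usable?(null)
end

# A Windows platform string rules paging out regardless of the stream.
puts pager_usable?(STDOUT, "i386-mswin32")
```

Passing the stream and platform in as arguments is only there to make the checks easy to poke at; inside run_pager the two return lines do the same job.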
read, write = IO.pipe
This sets up an input-output pipe that we can use
to send our output to the pager.
read and write are the output and input ends,
respectively.
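If pipes are new to you, here is a minimal sketch of one working inside a single process, with no forking yet. The helper name pipe_round_trip is made up for the illustration.

```ruby
# Hypothetical helper: push some text through a pipe and read it back out.
def pipe_round_trip(text)
  # IO.pipe returns two connected IO objects: [read_end, write_end].
  read, write = IO.pipe

  # Anything written to the write end...
  write.puts text
  write.close # closing the write end signals end-of-input to the reader

  # ...comes out of the read end.
  result = read.read
  read.close
  result
end

puts pipe_round_trip("hello from the write end")
```

In run_pager the two ends simply live in different processes: the child holds the write end and the pager holds the read end.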
unless Kernel.fork
This is where it starts to get really tricky.
Kernel.fork splits off the child process.
This child process is almost identical to the original parent process.
It even begins at the same spot: right after fork returns.
The only difference is the return value of fork.
In the child process, it’s nil;
in the parent process, it’s not.
The upshot of this is that the stuff in the unless statement
is only run in the child process.
Since we return in there,
the stuff afterwards is only run in the parent process.
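To see the two return values for yourself, here is a rough sketch in which the child reports what fork returned to it through a pipe. The helper name fork_report is hypothetical, and like everything else here it assumes a Unix-like system.

```ruby
# Hypothetical helper: returns [pid_seen_by_parent, report_written_by_child].
def fork_report
  read, write = IO.pipe
  STDOUT.flush # avoid duplicating any buffered output in the child

  pid = Kernel.fork
  if pid.nil?
    # Child process: fork returned nil.
    read.close
    write.puts "child saw #{pid.inspect}"
    write.close
    exit!(0) # leave immediately so the child never runs the parent's code
  end

  # Parent process: fork returned the child's process ID.
  write.close
  report = read.read
  read.close
  Process.wait(pid)
  [pid, report]
end

pid, report = fork_report
puts report
puts "parent saw an Integer: #{pid.is_a?(Integer)}"
```

The if pid.nil? branch plays the same role as the unless Kernel.fork block in run_pager.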
STDOUT.reopen(write)
This hooks up the child’s output to the input end of our pipe. Because our pipe is also hooked up to the parent process, this means that now anything printed by the child is read in by the parent.
STDERR.reopen(write) if STDERR.tty?
This hooks up the child’s error output to the input end of our pipe as well, as long as the error is actually going to the terminal. If our child process has a problem, we want to tell the user about it.
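Here is a sketch of the reopen trick in isolation. Instead of becoming a pager, the parent just reads whatever the child prints, which makes the redirection easy to observe. The helper capture_child_output is hypothetical, and it uses the block form of fork rather than the unless Kernel.fork form from run_pager.

```ruby
# Hypothetical helper: run the block in a child whose STDOUT is the pipe,
# and return whatever the child printed.
def capture_child_output
  read, write = IO.pipe
  STDOUT.flush # don't let the child inherit buffered output

  pid = Kernel.fork do
    STDOUT.reopen(write) # the child's standard output now feeds the pipe
    read.close
    write.close
    yield
    STDOUT.flush         # exit! skips the usual flush, so do it by hand
    exit!(0)
  end

  write.close            # the parent keeps only the read end
  output = read.read
  read.close
  Process.wait(pid)
  output
end

captured = capture_child_output { puts "printed by the child" }
puts "parent read: #{captured.inspect}"
```

Note that the block calls plain puts and has no idea its output is going anywhere unusual, which is exactly what lets run_pager work without changes to the rest of the program.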
read.close
write.close
Now we close up the pipe to the parent process. This seems a little strange at first (this is the part that took me the longest to figure out) - don’t we still want to send text through the pipe?
The way to think about this is that read and write are just handles onto the pipe, not the pipe itself. STDOUT.reopen(write) made the child’s standard output a second handle onto the pipe’s input end, so the child no longer needs the write variable, and it never needed read, since only the parent does the reading. Closing these spare handles leaves the pipe itself intact. It also matters later on: the pager only sees end-of-input once every handle onto the pipe’s input end has been closed.
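Here is a small single-process sketch of why stray handles matter: a reader only sees end-of-input once every handle onto the pipe's input end is closed. The helpers at_eof? and eof_demo are hypothetical, and write.dup stands in for the second handle that STDOUT.reopen(write) creates.

```ruby
# Hypothetical helper: true if the reader would see end-of-input right now.
def at_eof?(read, timeout = 0.2)
  # IO.select returns nil if nothing becomes readable before the timeout.
  ready = IO.select([read], nil, nil, timeout)
  !ready.nil? && read.eof?
end

# Hypothetical demo: returns [eof_with_spare_handle_open, eof_after_closing_it].
def eof_demo
  read, write = IO.pipe
  spare = write.dup # a second handle onto the pipe's input end

  write.close
  before = at_eof?(read) # the spare handle is still open

  spare.close
  after = at_eof?(read)  # now every handle onto the input end is closed

  read.close
  [before, after]
end

p eof_demo
```

In run_pager terms, a parent that forgot to close its own copy of write would leave the pager waiting for more input forever, even after the child had finished.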
return
Remember that the child process is identical to the parent.
Now that we’re done with run_pager,
it’ll just continue on its merry way,
as if the method hadn’t done anything.
In cheat’s case, this means printing out a cheat sheet.
Then whatever it prints will be sent back to the parent.
STDIN.reopen(read)
We’re back in the parent process. First we hook up the output end of the pipe to our process’s input. Remember that the current process will eventually become the pager, so this is how it’ll get input from the child process.
read.close
write.close
Now we close up the pipe in the parent process, as well.
Although we’re referring to the same pipe in both processes,
when we forked, the ends of the pipe
(our read and write variables) got copied.
We closed the child’s ends above;
now we need to close the parent’s.
ENV['LESS'] = 'FSRX'
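This is pretty much a magic incantation: each letter is one of less’s command-line options.
F tells less, the most common pager, to quit right away
if the output will fit on the terminal screen, so short output isn’t paged at all.
S chops long lines instead of wrapping them, R passes color escape sequences through, and X skips the terminal setup and teardown that would otherwise typically clear the screen when less exits.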
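One variation you might consider, which is not what run_pager above does, is to leave the user's own LESS setting alone and only supply FSRX as a default. A minimal sketch, with a made-up helper name and an env parameter so it can be tried against a plain Hash:

```ruby
# Hypothetical variation: only supply FSRX when the user hasn't set LESS.
# `env` defaults to the real environment; a plain Hash works for experiments.
def apply_default_less(env = ENV)
  env['LESS'] ||= 'FSRX'
  env['LESS']
end

puts apply_default_less({})                  # nothing set, so the default applies
puts apply_default_less({ 'LESS' => '-i' })  # the user's own setting is kept
```

The trade-off is that a user with their own LESS options may lose the quit-if-short behavior unless they include F themselves.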
Kernel.select [STDIN]
This oddly-named function tells our current process, the parent,
not to continue doing anything until there’s input ready to be read.
This isn’t strictly necessary,
but according to the git source code,
it works around a bug in less.
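To get a feel for what select does, here is a sketch where a background thread plays the part of the child process and writes to a pipe after a short delay. The helper wait_then_read is hypothetical; the point is only that the select call blocks until there is something to read.

```ruby
# Hypothetical helper: block until the pipe has input, then read one line.
def wait_then_read
  read, write = IO.pipe

  # Stand-in for the child process: produce output a little later.
  writer = Thread.new do
    sleep 0.1
    write.puts "finally, some input"
    write.close
  end

  # Blocks here until `read` has data waiting (or hits end-of-input).
  ready, = IO.select([read])
  line = ready.first.gets

  writer.join
  read.close
  line
end

puts wait_then_read
```

In run_pager the same wait happens on STDIN, so the pager isn't started until the child has actually printed something.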
pager = ENV['PAGER'] || 'less'
Here we choose what pager program to use.
If the user has a preferred pager defined, we’ll use that;
otherwise, we’ll use less, the typical pager.
exec pager
We finally replace the parent process with the pager.
exec just means “replace the current process with this program.”
It makes the parent process get rid of all its state
and literally become the new program.
Fortunately, our input stream stays hooked up to the child process,
so anything it prints will be read in by our new pager and paged.
rescue exec "/bin/sh", "-c", pager
Unfortunately, the previous call to exec sometimes fails for mysterious reasons on OS X.
We want to catch that exception and try running the pager through sh, the standard shell.
This usually helps.
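Here is a sketch of the fallback in action. It tries to exec a command inside a throwaway child process and, if that raises, falls back to running a shell snippet through /bin/sh, the same shape as the last line of run_pager. The helper name and the example commands are hypothetical, and it assumes a Unix-like system with true and /bin/sh available.

```ruby
# Hypothetical helper: exit status of a child that tries `command` first and
# falls back to running `fallback` through the shell if exec raises.
def exec_with_fallback_status(command, fallback)
  pid = Kernel.fork do
    # exec raises a SystemCallError (such as Errno::ENOENT) when it can't
    # start the program; the rescue modifier catches that and tries the shell.
    exec command rescue exec "/bin/sh", "-c", fallback
  end
  Process.wait(pid)
  $?.exitstatus
end

# `true` exists, so the first exec succeeds and the fallback never runs.
puts exec_with_fallback_status("true", "exit 7")

# This program doesn't exist, so the shell fallback runs instead.
puts exec_with_fallback_status("no-such-pager-program", "exit 7")
```

When exec succeeds it never returns, so the rescue part is only ever reached on failure.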
Update: Added in a bunch of bug fixes found by Kevin Ballard. Thanks, Kevin!
[1] Not that it’s unusable; it just lacks polish in some ways and takes longer to get accustomed to than, say, darcs.



I think it’s misleading to say that when you
you are closing the pipe because you “don’t need it anymore”. What you don’t need is the write end in the parent process, because it is going to do the reading (you could also close the read end in the child process).
Nice article anyway, thanks!
Although I don’t fully understand all the pipe munging that’s going on here, I don’t think that’s quite accurate, cyclotron. I think the pipe itself is managed by the OS, and thus is independent of either process. Closing it in the child process actually has the same effect.
Nathan seems to be right, there’s some deeper process to it. Actually closing both ends in parent and child doesn’t affect the pager at all:
It keeps on working..
Nice functionality, and that’s a fancy trick that may come in handy later. Also, I never knew about Kernel#exec and it’s going to solve a little problem for me. Thanks for the write-up!
Nathan, thanks for such a cool snippet. I’ve included it on
http://github.com/vic/buildr/tree/master/doc/scripts/buildr-git.rb
We provide some git-workflow paged tips to buildr developers when they get a buildr fork.
You can try it at http://balloon.hobix.com/buildr-git
Thanks for the tip, I wrote about using this with ack and rak over at http://potatosaladx.blogspot.com/2008/04/automatic-paging-with-ack-and-rak-git.html
The results are awesome.
Hey, thanks for some cool code. :)
Any ideas how to make this work in Readline? I have a command-line application using Readline and I’d like to use something like this to paginate long output from commands. I tried using it as is but as one would expect it continues to do paging after the output text, negatively affecting the command-line of the application. I tried messing around with the code to redirect STDOUT back to the original STDOUT and kill the extra process, but that didn’t work either.
Any suggestions?!
— Thanks! Bryan
The problem is that the pager process replaces the top-level process. Not only that, but once the pager’s the top process, control never passes back to the process that spawned it. The only way you might possibly be able to get this to work is wrapping the pager in another program that re-raises the Ruby process. That would be very complicated, and I’m not really sure that it would work at all. Good luck, though!