Caching with Haml

Posted November 24, 2007

Ah… there’s nothing like a four-day weekend to allow me to forget about homework and spend some quality time hacking Haml. Over the course of the past two days, I have

  • Fixed the benchmark code (multiple times).
  • Ripped out all the automatic caching.
  • Added support for evaluating templates in the context of Bindings.
  • Cleaned up all sorts of internal organizational stuff.
  • Added several methods that could be used to implement caching outside of Engine.
  • Changed the way local variables are handled.
  • Reimplemented caching for Rails using the new methods.

It’s been a busy couple days.

What does all this mean for you? Well, if you don’t use Haml at all and just read my blog for Emacs tidbits or make_resourceful tutorials, probably nothing. I guess it means you don’t have to read through this post.

If you just use Haml with Rails or Merb, the language itself isn’t changing. All your code will still work. You don’t really have to think about this. However, there are some interesting performance benefits, so you may want to stick around to hear about those.

It’s those of you who build stuff using Haml’s Ruby API that should be perking up your ears. Ever called Haml::Engine.new? Then pay attention.

A bit of context: this stuff is all currently implemented in trunk. You can get it from Subversion via

svn co svn://hamptoncatlin.com/haml/trunk haml
cd haml
rake install

If you’re patient, though, we’ll probably also release all this as version 1.8 reasonably soon.

Changes in the Old Stuff

First of all, Haml::Engine.new and Haml::Engine#render remain backwards-compatible, API-wise. The following will still work delightfully:

template = <<END
.desc
  %p== I am a \#{self.class}
  %p== When upcased, I am \#{upcase}
END
Haml::Engine.new(template, :attr_wrapper => '"').render("foobar")

This returns

<div class="desc">
  <p>I am a String</p>
  <p>When upcased, I am FOOBAR</p>
</div>

Bindings

There are a few changes, though. Now you can pass in a Binding instead of a scope object. For example, this will render the same thing as the previous example:

Haml::Engine.new(template, :attr_wrapper => '"').render("foobar".instance_eval{binding})
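
If the instance_eval{binding} trick looks opaque, here’s a small plain-Ruby sketch (no Haml involved) of what a Binding carries. The binding_with_title helper and its title variable are made up for illustration. The point is only that a Binding remembers both self and whatever locals were in scope when it was created.

```ruby
scope = "foobar"

# instance_eval runs the block with self set to the string,
# so the Binding it returns remembers that self.
scope_binding = scope.instance_eval { binding }

class_seen  = eval("self.class", scope_binding) # evaluated against the string
upcase_seen = eval("upcase", scope_binding)     # implicit receiver is the string

# A Binding also captures local variables, which a bare scope object can't.
# (binding_with_title is a hypothetical helper for this example.)
def binding_with_title
  title = "Caching with Haml"
  binding
end

title_seen = eval("title", binding_with_title)
```

Code eval’d against scope_binding sees the string as self, which is the same thing the template sees in the examples above.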

Locals

Template-local variables are also done differently now. They’re now passed as the second argument to #render, rather than as an option to new.

# RIGHT
Haml::Engine.new("%p= foo").render(Object.new, :foo => "Hello!")
  #=> "<p>Hello!</p>" 

# WRONG
Haml::Engine.new("%p= foo", :locals => {:foo => "Hello!"}).render(Object.new)

That second one will still work for compatibility reasons, but it’ll let off a warning. A couple of releases from now, it’ll just break. So if you’re using it in your code, switch it around.

No More Caching…

In version 1.7, Haml::Engine internally cached all the templates it rendered based on the text of the template and its options hash.

However, this had various issues. The code for it was nasty, and it necessitated defining _render_haml methods on the scope object.

Worst of all, it meant that external users of Haml::Engine couldn’t implement their own caching. However, these users potentially have more information that could make caching more efficient.

For instance, Rails and Merb both have access to the filename and modification time of the template. If they use this to cache, they could avoid reading in the template file altogether.
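
To make that concrete, here’s a rough sketch of a cache keyed on filename and modification time. TemplateCache and its compiler block are hypothetical names for this example, not anything Rails, Merb, or Haml actually ship. The block stands in for whatever expensive parse/compile step the framework would run.

```ruby
require "tempfile"

# Hypothetical cache: compiled templates keyed by path,
# thrown away when the file's modification time changes.
class TemplateCache
  Entry = Struct.new(:mtime, :compiled)

  def initialize(&compiler)
    @compiler = compiler # turns template source into something reusable
    @entries = {}
  end

  def fetch(path)
    mtime = File.mtime(path)
    entry = @entries[path]
    # Cache hit: the file is only stat'ed, never read.
    return entry.compiled if entry && entry.mtime == mtime

    compiled = @compiler.call(File.read(path))
    @entries[path] = Entry.new(mtime, compiled)
    compiled
  end
end

compile_count = 0
cache = TemplateCache.new do |source|
  compile_count += 1
  "compiled(#{source})" # stand-in for the real compile step
end

Tempfile.create("demo") do |file|
  file.write("%p hello")
  file.flush
  2.times { cache.fetch(file.path) } # the second call is a cache hit
end
```

After this runs, compile_count is 1: the template source was read and compiled only once, even though it was fetched twice.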

So now there’s no internal caching.

...Except for Rails

There is, however, built-in external caching. Haml uses the full extent of its snazzy new methods (see below) and Rails’ built-in caching mechanisms to keep as much information at-the-ready as possible.

This creates wonderful performance gains. Check out the benchmarks below for the full story, but when rendered via ActionView, Haml’s almost on par with ERB.

Snazzy New Stuff

The main additions to Haml are two new public methods: Haml::Engine#render_proc and Haml::Engine#def_method. These are both ways of getting compiled functions that will evaluate the template when run.

To understand why this produces such a speed gain, you’ve got to understand something about the lifecycle of a Haml template. When a template is first passed into Haml::Engine.new, Haml “precompiles” the template into a string of Ruby code. This code, when run, will produce the rendered XHTML as output.

Haml::Engine#render is a pretty straightforward method. It pretty much just evaluates the precompiled code. However, evaluating code is slow. You really don’t want to do it more than once per template if you can help it.

So render_proc and def_method let you keep ahold of the compiled code. Then you can run it as many times as you want without having to eval over and over again.
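
The difference is easy to see in plain Ruby. This is only a sketch of the idea: the precompiled string below is a made-up stand-in, not what Haml really generates, and render_by_eval is a hypothetical helper.

```ruby
# Stand-in for the Ruby code a template gets precompiled into.
precompiled = '"<p>" + name.upcase + "</p>\n"'

# Roughly the old approach: eval the string on every single render.
def render_by_eval(code, name)
  eval(code) # re-parses code each time; the local name is visible to it
end

# Roughly the new approach: eval once, keep the resulting proc around.
compiled = eval("proc { |name| #{precompiled} }")

slow = render_by_eval(precompiled, "foobar") # parses the code again
fast = compiled.call("foobar")               # just runs the stored proc
```

Both calls produce the same string. The second one simply skips the parsing step on every call after the first.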

render_proc

render_proc returns a proc… that renders. It takes a scope object or binding, just like render. For example:

string = "peanut butter" 
proc = Haml::Engine.new("%p\n  I like\n  %em= self").render_proc(string)
proc.call #=> "<p>\n  I like\n  <em>peanut butter</em>\n</p>\n"

string.replace "chocolate sauce"
proc.call #=> "<p>\n  I like\n  <em>chocolate sauce</em>\n</p>\n"

It’s a pretty simple method.

def_method

def_method is a little more complicated. As the name suggests, it actually defines a method. It takes an object, class, or module and a method name, and defines a method with that name that runs the template. For example:

# You can define the method on an individual object...
string = "peanut butter" 
Haml::Engine.new("%p\n  I like\n  %em= self").def_method(string, :haml)
string.haml #=> "<p>\n  I like\n  <em>peanut butter</em>\n</p>\n"

# ...or on a class or module,
Haml::Engine.new("%p\n  I like\n  %em= self").def_method(String, :i_like)
"jellybeans".i_like #=> "<p>\n  I like\n  <em>jellybeans</em>\n</p>\n"

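For the curious, here’s one way a def_method-style helper could tell the two cases apart. This is a guess at the general shape, not Haml’s actual implementation: define_template_method is a hypothetical name, and the code string is a toy stand-in for precompiled template code.

```ruby
# Hypothetical helper, not Haml's actual implementation.
def define_template_method(target, name, code)
  source = "def #{name}\n#{code}\nend"
  if target.is_a?(Module)
    target.class_eval(source)    # classes and modules: an instance method
  else
    target.instance_eval(source) # any other object: a singleton method
  end
end

# Stand-in for precompiled template code. Single quotes keep #{self}
# from being interpolated until the defined method actually runs.
code = '"<em>#{self}</em>"'

string = "peanut butter".dup
define_template_method(string, :haml, code)
on_object = string.haml

define_template_method(String, :i_like, code)
on_class = "jellybeans".i_like
```

Either way the string of code is only eval’d once, when the method is defined. Calling the method afterwards is an ordinary method call.
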
Local Variables

Now, I actually left a bit out of those two explanations. You can pass local variables to the new procs/methods, similarly to the locals hash for render.

However, there’s a gotcha. Ruby offers no way to transform an arbitrary hash into local variables. This means that you have to pre-declare the local variables you use in your template.

This is done by passing a bunch of symbols, the names of the variables, after the other arguments to either render_proc or def_method. Then those variables can be passed in as arguments to the resulting proc or method. For example:

Haml::Engine.new("%p= foo").render_proc(Object.new, :foo).call(:foo => "Pizza")
  #=> "<p>Pizza</p>" 

obj = Object.new
Haml::Engine.new("%p= foo").def_method(obj, :haml, :foo)
obj.haml(:foo => "Bolognese") #=> "<p>Bolognese</p>" 

Haml::Engine.new("%p= foo").render_proc(Object.new).call(:foo => "Failure")
  #=> NameError: undefined local variable or method `foo'
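
Why the pre-declaration? A plausible way to implement it (a sketch under my own assumptions, with compile_with_locals and _locals as made-up names) is to write an assignment for each declared variable into the code before it gets eval’d. Anything not declared never gets an assignment, so the template code can’t see it.

```ruby
# Hypothetical sketch: build a proc whose body assigns each declared
# local from a hash before running the template code.
def compile_with_locals(code, *local_names)
  assignments = local_names.map { |name| "#{name} = _locals[:#{name}]" }
  body = (assignments + [code]).join("\n")
  eval("proc do |_locals = {}|\n#{body}\nend")
end

# Stand-in for precompiled template code that uses a local called foo.
template_code = '"<p>#{foo}</p>"'

with_foo = compile_with_locals(template_code, :foo)
declared = with_foo.call(:foo => "Pizza")

without_foo = compile_with_locals(template_code)
undeclared_error = begin
  without_foo.call(:foo => "Failure")
  nil
rescue NameError => e
  e # foo was never declared, so the body has no such local
end
```

The names have to be known at compile time because they end up as literal text in the generated code. The values can still change on every call.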

Benchmarks

So, what does performance look like for all this new stuff? Sunshine and rainbows is what. There are a bunch of benchmarks, so we’ll go through them one-by-one.

No Caching

              user     system      total        real
haml:     0.380000   0.030000   0.410000 (  0.411133)
erb:      0.370000   0.010000   0.380000 (  0.389706)
erubis:   0.150000   0.020000   0.170000 (  0.154595)
mab:      0.670000   0.040000   0.710000 (  0.730434)
Haml/ERB:     1.05498          ERB/Haml:     0.947883
Haml/Erubis:  2.65942          Erubis/Haml:  0.376022
Haml/Markaby: 0.562861         Markaby/Haml: 1.77664

These templates weren’t cached at all. Every time through, they were parsed, eval’d, and rendered. Haml’s only a little slower than ERB, which is especially delightful considering that Haml has so much more to do by way of parsing than ERB.

Erubis is clearly better, though. Ezra says that most of Erubis’ speed comes from its efficient parsing, so it’s not surprising that it has such a lead here.

Markaby is pretty darn slow.

With Caching

              user     system      total        real
haml:     0.040000   0.000000   0.040000 (  0.049254)
erb:      0.040000   0.000000   0.040000 (  0.048326)
erubis:   0.040000   0.000000   0.040000 (  0.044333)
Haml/ERB:     1.0192           ERB/Haml:     0.98116
Haml/Erubis:  1.111            Erubis/Haml:  0.90009

I dumped Markaby here because it doesn’t appear to support any sort of caching. ERB doesn’t support caching an eval’d method, either, but I did it manually using ERB’s precompiled Ruby string. Rails caches ERB using the same mechanism.
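
For reference, the manual ERB caching looks roughly like this. ERB#src is real stdlib API and returns the generated Ruby code. The View class and the render_cached_erb method name are made up for the example, and this isn’t the exact benchmark code.

```ruby
require "erb"

template = "<p>I am a <%= self.class %></p>"

# ERB#src is the Ruby code that ERB generates for the template.
src = ERB.new(template).src

# Hypothetical class to hang the cached method on.
class View; end

# Wrap the generated code in a method once; later calls skip
# both the ERB parsing and the eval.
View.class_eval("def render_cached_erb\n#{src}\nend")

rendered = View.new.render_cached_erb
```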

Haml’s once again barely slower than ERB. In this case, less than 2%. It’s even faring well against Erubis, which is cached using its own def_method.

Via ActionView

            user     system      total        real
haml:   0.050000   0.010000   0.060000 (  0.063137)
erb:    0.050000   0.010000   0.060000 (  0.056374)
Haml/ERB:     1.11997          ERB/Haml:     0.89288

No more Erubis; I couldn’t figure out how to get it set up with ActionView without getting rid of ERB.

This benchmark sets up Haml as an ActionView template handler and renders templates using ActionView#render. It’s similar to the performance you’d see using Haml with Rails.

Haml isn’t doing quite as well against ERB as it was before. I’m not sure why this is; they’re handled in essentially the same way, so you’d think the times would match those for the caching benchmark.

However, 1.12x slower is a lot better than the 1.7x we were seeing in version 1.7.

Via ActionView with Nested Partials

            user     system      total        real
haml:   0.640000   0.100000   0.740000 (  0.750011)
erb:    0.390000   0.050000   0.440000 (  0.434607)
Haml/ERB:     1.72572          ERB/Haml:     0.579468

One of the main things that slows Haml down relative to ERB and Erubis is its insistence on nicely formatted output. In accordance with this insistence, it re-indents every line of Ruby code it inserts so that the resulting template is properly nested.

This can cause some performance loss when you have lots of nested partials. This benchmark was run with five includes of partials nested five-deep.
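
Here’s a toy model of why nesting hurts. The reindent helper is hypothetical and much simpler than what Haml does, but it shows the shape of the problem: every level of nesting walks over all the output beneath it again.

```ruby
# Hypothetical helper: indent every line of some rendered output
# so it nests properly under its parent.
def reindent(text, depth)
  prefix = "  " * depth
  text.each_line.map { |line| prefix + line }.join
end

partial = "<p>\n  hello\n</p>\n"

# Wrap the partial five levels deep. Each level re-indents everything
# inside it, so the innermost lines get rewritten five times.
nested = partial
5.times { nested = "<div>\n" + reindent(nested, 1) + "</div>\n" }
```

The innermost lines are touched once per level, so the work grows with both the size of the output and the depth of the nesting.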

Eventually we’ll have an :ugly option that will allow users to forgo pretty output in favor of increased performance.

Via ActionView with Insanely Deeply Nested Partials

            user     system      total        real
haml:  17.760000   0.700000  18.460000 ( 18.976264)
erb:    4.160000   0.340000   4.500000 (  4.532037)
Haml/ERB:     4.18714          ERB/Haml:     0.238827

This is the same thing as the last benchmark, but instead of five-deep partials, they’re fifty-deep. Haml’s clearly not doing well against ERB, but ten times the partials only leads to 2.5x the difference in rendering speed. That’s not bad scaling.

Pete Forde said November 24, 2007:

Nathan, you are the rock on which we all camp.

I’d just like to say thanks, and point out that this post is current as of Haml svn revision 666.
