Highlighting Ruby in haml-mode

Posted March 6, 2009

Along with Haml, I maintain the Haml language mode for Emacs. It’s reasonably featureful, including support for managing indentation, cutting and pasting nested regions (although by default there are no keybindings for this), and some syntax highlighting.

One thing it’s sadly lacked, though (until now!), has been support for highlighting embedded Ruby code in a reasonable manner. Emacs’ syntax-highlighting system, known as font-lock, is quite powerful and flexible, allowing all sorts of constructions to be highlighted using a reasonably straightforward, declarative syntax. On top of that, it ensures that even very large documents are highlighted as well as possible without parsing the entire thing.

Unfortunately, for all its flexibility, font-lock doesn’t include any mechanism for highlighting a subsection of the document based on different rules than the rest of the document. So what I had going for a long time was a cop-out: I had a few random Ruby items (instance variables, strings, etc.) manually coded into the Haml font-lock specification, and the rest of it just went unhighlighted.

MuMaMo

Embedding one sort of syntax-highlighting in another is, of course, a rather common issue. It pops up all over the place when embedding stuff in HTML, like PHP, or Ruby via ERB, or even just JavaScript via <script> tags or CSS via <style>. The standard Emacs answer for this is MuMaMo mode, which stands for “Multi Major Modes.”

MuMaMo is part of nXhtml mode, a gargantuan package of Emacs Lisp code that is generally aimed at dealing with XML and HTML code. This often involves dealing with nested languages (as mentioned above), so MuMaMo was born.

Now, before I continue, I just want to say that MuMaMo is an impressive feat of coding. In general, it manages to not only highlight embedded languages, but switch between their key shortcuts and indentation modes when moving the point[1] between them. This is certainly not a trivial wrangling of Emacs’ internal workings, and it’s impressive that it works as well as it does as often as it does.

The problem is that it doesn’t work well all the time. In fact, relative to most other Emacs packages, it’s really quite buggy. The highlighting sometimes randomly doesn’t work, and it comes bundled with all sorts of features that themselves introduce subtle nasty bugs and may just be flat-out unwanted.

All this was compounded when I tried to add my own code for telling MuMaMo where Haml’s Ruby sections were. Rather than hooking into the standard font-lock system and providing some way to say “here’s a section of the buffer, treat it as Ruby,” MuMaMo insists that one create four search functions that it runs in its own, non-font-lock way to find the boundaries of the section. I got a set of definitions that seemed to work sporadically, but the MuMaMo source was so complex and messy that I couldn’t stand the thought of debugging it. I gave up and decided to see if I could coerce font-lock to do my bidding.

font-lock

One of the huge benefits of font-lock is that it allows the user to drop into actual Emacs Lisp functions whenever the standard declarative, regular-expression-based syntax just won’t cut it. font-lock expects these functions to just select a section of the document to be highlighted in a certain way, but there’s no reason the function can’t be used in other ways.
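As a rough illustration of what "dropping into a function" means here, the sketch below registers a function where a regular expression would normally go in a keywords list. All of the demo-* names are hypothetical and exist only for this example; they are not part of haml-mode.

```elisp
;; Sketch only: the demo-* names are hypothetical, not part of haml-mode.

(defun demo-match-script-line (limit)
  "Search forward to LIMIT for a line beginning with -, = or ~.
Return non-nil and set the match data when such a line is found."
  (re-search-forward "^ *\\([-=~]\\) \\(.*\\)$" limit t))

(defconst demo-script-font-lock-keywords
  ;; (MATCHER SUBEXP FACE): a function name stands in for the usual regexp.
  ;; The face is quoted because font-lock evaluates this element.
  '((demo-match-script-line 1 'font-lock-preprocessor-face))
  "Font-lock keywords for `demo-script-mode'.")

(define-derived-mode demo-script-mode fundamental-mode "DemoScript"
  "Toy mode that highlights only the script character of Haml-like lines."
  ;; The second element, t, asks for keyword-only fontification,
  ;; so strings and comments are not highlighted syntactically.
  (setq-local font-lock-defaults '(demo-script-font-lock-keywords t)))
```

font-lock calls the matcher with point at the start of the text to search and the end of that text as LIMIT. Whenever the matcher returns non-nil, font-lock applies the listed face to the first subexpression of the match data it left behind.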

The question now was whether or not it would work to manually highlight text using these functions. font-lock includes a mechanism for manually highlighting text, but it’s designed for highlighting text as it’s inserted. So my first step was to test if it would work when running a highlighter function:

;; This function is added to the font-lock keywords list
;; as (haml-highlight-ruby-script 1 font-lock-preprocessor-face),
;; which tells font-lock to run haml-highlight-ruby-script
;; and highlight the first subexpression with font-lock-preprocessor-face.
;;
;; The function is run repeatedly
;; with the point at the beginning of a section of code to highlight
;; and passing the end of the section as the argument.
(defun haml-highlight-ruby-script (limit)
  "Highlight a Ruby script expression (-, =, or ~)." 
  (when (re-search-forward "^ *\\([-=~]\\) \\(.*\\)$" limit t)
    (put-text-property (match-beginning 2) (match-end 2) 'font-lock-face
                       font-lock-keyword-face)
    t))

It worked. The script character was highlighted with font-lock-preprocessor-face, and the rest was highlighted with font-lock-keyword-face. Now all that remained was to use Ruby’s highlighting rather than just applying a single face. To do so, I took advantage of Emacs’s dynamically-scoped variables to locally rebind various font-lock variables. Then I used font-lock-fontify-region to fontify the region as Ruby. I wrapped this up in a function:

(defun haml-fontify-region-as-ruby (beg end)
  "Use Ruby's font-lock variables to fontify the region between BEG and END." 
  ;; font-lock-fontify-region mucks with the point
  ;; and regular expression match data,
  ;; so we need to save and restore them.
  (save-excursion
    (save-match-data
      (let (;; The keywords are the most important part;
            ;; they specify how to highlight basic structures
            (font-lock-keywords ruby-font-lock-keywords)
            ;; The syntactic-keywords deal with stuff like strings and regexps
            (font-lock-syntactic-keywords ruby-font-lock-syntactic-keywords)
            ;; We need to make this nil so that it only font-locks
            ;; the region we care about
            font-lock-extend-region-functions
            ;; haml-mode doesn't care about case,
            ;; but ruby-mode does for stuff like constants
            font-lock-keywords-case-fold-search)
        ;; font-lock-fontify-region apparently isn't inclusive,
        ;; so we have to move the beginning back one char
        (font-lock-fontify-region (- beg 1) end)))))

Which I then used in haml-highlight-ruby-script:

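(defun haml-highlight-ruby-script (limit)
  "Highlight a Ruby script expression (-, =, or ~)."
  (when (re-search-forward "^ *\\([-=~]\\) \\(.*\\)$" limit t)
    (haml-fontify-region-as-ruby (match-beginning 2) (match-end 2))
    ;; Return non-nil explicitly so font-lock treats this as a match,
    ;; whatever haml-fontify-region-as-ruby happens to return.
    t))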

After a fair amount of parameter-tweaking and edge-case-handling, I got this to the working form you see above. I added a more complicated function for highlighting tags with attribute hashes, object refs, and script, and then called it good.

What’s really cool about this is that the general technique can be widely applied: it highlights regions using faces derived from another mode, and it does so in a much more lightweight manner than MuMaMo offers. It could be used in ruby-mode to highlight the contents of #{} as Ruby code rather than a uniform font-lock-variable-name-face, or to highlight here-docs like <<RUBY or <<HTML properly[2]. It could even be a less-featureful but also less-buggy replacement for MuMaMo itself.
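To make the generalization concrete, here is a minimal sketch of the same idea with the keywords passed in as a parameter, applied to the contents of #{} in a toy mode. Every demo-* name is hypothetical, and the tiny keyword list only stands in for a real mode's keywords. This is one possible shape for such a helper, not the code that ships in haml-mode.

```elisp
;; Sketch only: every demo-* name is hypothetical.

(defconst demo-embedded-keywords
  '(("\\_<\\(?:if\\|unless\\|nil\\)\\_>" (0 'font-lock-keyword-face)))
  "Stand-in for the font-lock keywords of some other mode.")

(defun demo-fontify-region-with (keywords beg end)
  "Fontify the text between BEG and END using KEYWORDS only.
Meant to be called from a font-lock matcher, that is, after
font-lock has already been set up in the current buffer."
  (when (< beg end)
    (save-excursion
      (save-match-data
        (let ((font-lock-keywords keywords)
              ;; Skip string and comment fontification in this sketch.
              (font-lock-keywords-only t)
              ;; Bound to nil so that only BEG..END is touched.
              font-lock-extend-region-functions
              font-lock-keywords-case-fold-search)
          (font-lock-fontify-region beg end))))))

(defun demo-match-interpolation (limit)
  "Find the next #{...} before LIMIT and fontify its contents."
  (when (re-search-forward "\\(#{\\)\\([^}\n]*\\)\\(}\\)" limit t)
    (demo-fontify-region-with demo-embedded-keywords
                              (match-beginning 2) (match-end 2))
    ;; Return non-nil so font-lock highlights the delimiters as well.
    t))

(defconst demo-interp-font-lock-keywords
  '((demo-match-interpolation (1 'font-lock-preprocessor-face)
                              (3 'font-lock-preprocessor-face)))
  "Font-lock keywords for `demo-interp-mode'.")

(define-derived-mode demo-interp-mode fundamental-mode "DemoInterp"
  "Toy mode that fontifies the inside of #{} with a second keyword list."
  (setq-local font-lock-defaults '(demo-interp-font-lock-keywords t)))
```

The outer keyword only colors the #{ and } delimiters. Everything between them is handed to the inner keyword list, which is the same division of labor as in haml-highlight-ruby-script above.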

1. The cursor, for those unfamiliar with Emacs.

2. I may well try implementing these two myself.

Mig said March 10, 2009:

Great job Nathan, this has made my day!

Yong Bakos said March 15, 2009:

Thank you… an older version’s use of font-locks ended up with lots of aquamacs default highlighting in cyan. Cyan!

Now highlighting in haml/sass modes is really nice.

Yong Bakos said March 16, 2009:

Crap. Both haml-mode and sass-mode are crashing my Aquamacs.

Nathan said March 16, 2009:

Yeah, there are some issues. This is the last good revision. I’ll look into it.

Nathan said March 16, 2009:

Okay, I’m pretty sure I’ve fixed the issue in the latest master. Let me know if that works.

glauber said January 28, 2010:

whoa very nice!

Steve Purcell said February 02, 2010:

A version of haml-mode that uses this technique to colorize block filter regions for ruby, javascript, css, textile and markdown is here. It’s been working nicely for me for a long time now, and degrades gracefully if the corresponding emacs modes for textile etc. are not installed.

Nathan said February 03, 2010:

Steve: I would have merged those changes into haml-mode long ago if I had known about them. As it is, I’ve merged them in just now. Thanks for the contribution!

Steve Purcell said February 05, 2010:

Hey Nathan, great that you merged the changes – thanks!

Entertainingly, I actually sent a pull request about 10 months ago, but I think at the time you declined it. I might have done some subsequent work to tidy the code up… not sure now.

Cheers!

Mariusz Nowak said March 12, 2010:

It looks like it doesn’t support indentation with tabs. Why do you force spaces for indentation? I work with tabs and I have no issues with Haml. Is there any chance of changing this?
