And now, the rules

In my last post I wrote about how to write your own mini-language in Ruby by abusing method_missing and operator overloading. I know, I know, it totally blew your mind and whatever, but I missed out this huge part of the language I was demonstrating: the rules. And without the rules, all it’s good for is some silly little shorthands in ActionController. Big deal.

So let’s make some rules. We’re going to do this by creating an extension of the expression language that can create rules and store them in some environment. These will be our shiny new building blocks:

module Consent

  class Rule
    class Expression < Consent::Expression
      class Group < Consent::Expression::Group
      end
    end

    module Generator
    end
  end

end

We’ve got a Rule class to store rules, and it has a Generator; we will embed Rule::Generator in environments that should generate rules, just like we did for Expression::Generator. And, we have some new Expression and Group classes that extend the old ones and provide rule-specific functionality.

To get started, we need to figure out how rules are stored. Well, we’ve got Expressions and we’re going to add blocks, so let’s say a Rule has an expression and a block. You can add your own methods to figure out whether a request matches the expression and run the block to test the rule, but for now we’ll just figure out how to store things:

class Consent::Rule
  def initialize(expression, block)
    @expression, @predicate = expression, block
  end
end

Note I’m not using the &block parameter format because we’re not going to be calling Rule.new(expr) { code }; the blocks will be attached to method calls in the expression language, then passed around for storage as Procs. Also, omitting the ampersand makes the Proc a required parameter.
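
To make the difference concrete, here’s a small sketch of a block being captured with & in one place and then handed to a constructor as an ordinary argument. SketchRule and capture are hypothetical names used only for this illustration; they aren’t part of Consent.

```ruby
# Hypothetical stand-in for Consent::Rule, reduced to storage only.
class SketchRule
  attr_reader :expression, :predicate

  # `block` is a plain positional argument, so a Proc must be supplied.
  def initialize(expression, block)
    @expression, @predicate = expression, block
  end
end

# Hypothetical helper: receives a block via & and returns it as a Proc
# (or nil when no block was given).
def capture(&block)
  block
end

check = capture { 1 + 1 == 2 }
rule  = SketchRule.new(:some_expression, check)
rule.predicate.call # runs the stored block

begin
  SketchRule.new(:some_expression) # no Proc supplied
rescue ArgumentError => e
  missing = e.message # Ruby complains about the missing argument
end
```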

The other little bit of setup is to note that the generated rules need to be stored somewhere, but the language does not use assignment. For example, if we call this in a controller:

redirect_to admin/users.find(:id => 'twitter')

The expression generated by admin/users.find(:id => 'twitter') is passed as a parameter to the redirect_to method where it can be interpreted. In contrast, our rule language is declarative: you simply say

admin/users.find(:id => 'twitter') { deny unless has_dictionary? }

and this should store a rule somewhere, even though there’s no assignment. The way I’ve chosen to do this is to make sure that when we mix in Rule::Generator, a @rules array is created in the host class, and expressions generated have a reference to the environment in which they were created so they can push rules into it. In other words:

class Consent::Rule

  class Expression
    def initialize(env, name, params = {})
      @env = env
      super(name, params)
    end
  end

  module Generator
    def self.included(host)
      host.module_eval do
        def rules
          @rules ||= []
        end
      end
    end

    def method_missing(name, params = {})
      # Fully qualified: `class Consent::Rule` leaves Consent out of the lexical scope
      Consent::Rule::Expression.new(self, name, params)
    end
  end

end

As before, Generator#method_missing simply creates an expression of the correct type, only this time the expression is passed a reference to the object where it was created. This object will have a rules accessor thanks to Generator.included, so the expression can use @env.rules to push new rules into the host environment.
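
Here’s a stripped-down, self-contained sketch of that wiring, assuming a throwaway EnvSketch namespace and a Description host class (both names are mine, chosen for the example). It only shows that each expression remembers where it was created and that the host picks up a rules accessor.

```ruby
module EnvSketch
  class Expression
    attr_reader :env, :controller, :params

    def initialize(env, name, params = {})
      @env, @controller, @params = env, name.to_s, params.dup
    end
  end

  module Generator
    # Runs when the module is mixed in; gives the host a lazy rules array.
    def self.included(host)
      host.module_eval do
        def rules
          @rules ||= []
        end
      end
    end

    # Unknown top-level names become expressions that remember their environment.
    def method_missing(name, params = {})
      Expression.new(self, name, params)
    end
  end

  # Hypothetical host class for the example.
  class Description
    include Generator
  end
end

desc = EnvSketch::Description.new
expr = desc.instance_eval { users(:id => 'twitter') }

expr.env.equal?(desc) # the expression points back at its host
desc.rules            # an empty array until something pushes a rule
```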

At this point we need to stop and make some changes. Generator#method_missing, as we know, is responsible for creating controller-only expressions. The language needs to support blocks at this level so we can write rules like:

profiles { deny if session[:user].nil? }

Remember, the blocks don’t just float in space; Ruby will pass them to the preceding word-based method call in the expression (it won’t pass them to operator methods). So we’d better let Generator#method_missing take a block, and use it to generate a rule. I’m going to add a method to Expression for doing just that:

class Consent::Rule

  class Expression
    def rule!(block)
      return if block.nil?
      @env.rules << Consent::Rule.new(self, block)
    end
  end

  module Generator
    def method_missing(name, params = {}, &block)
      expression = Consent::Rule::Expression.new(self, name, params)
      expression.rule!(block)
      expression
    end
  end

end

See how here we use the ampersand in &block, making the block optional. We need to do the same thing in Expression#method_missing to support rules with actions:

class Consent::Rule::Expression
  def method_missing(name, params = {}, &block)
    rule!(block)
    super(name, params)
  end
end
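
Putting the pieces so far together, here’s a self-contained sketch of rules being collected from blocks at both the controller and the action level. The RuleSketch namespace and Description class are hypothetical, and the rules accessor is defined directly in the Generator to keep things short. The blocks are only stored here, never run.

```ruby
module RuleSketch
  class Rule
    attr_reader :expression, :predicate

    def initialize(expression, block)
      @expression, @predicate = expression, block
    end
  end

  class Expression
    attr_reader :controller, :action

    def initialize(env, name, params = {})
      @env, @controller, @params = env, name.to_s, params.dup
    end

    # Store a rule in the host environment, but only if a block was given.
    def rule!(block)
      return if block.nil?
      @env.rules << Rule.new(self, block)
    end

    # Action calls: record the action, merge params, maybe create a rule.
    def method_missing(name, params = {}, &block)
      @action = name.to_s
      @params.update(params)
      rule!(block)
      self
    end
  end

  module Generator
    def rules
      @rules ||= []
    end

    # Controller calls: build an expression, maybe create a rule.
    def method_missing(name, params = {}, &block)
      expression = Expression.new(self, name, params)
      expression.rule!(block)
      expression
    end
  end

  class Description
    include Generator
  end
end

desc = RuleSketch::Description.new
desc.instance_eval do
  profiles { deny if session[:user].nil? }   # controller-level rule
  tags.find(:q => 'ruby') { request.get? }   # action-level rule
  users.list                                 # no block, so no rule
end

pairs = desc.rules.map { |r| [r.expression.controller, r.expression.action] }
# pairs is [["profiles", nil], ["tags", "find"]]
```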

Okay, we’ve dealt with the easy stuff, so we can handle single expressions with controllers, actions and params. (We can’t handle formats properly yet, as I’ll explain later.) The next step is to handle Groups. Take this expression:

users(:source => 'facebook') + tags.find { request.get? }

This will evaluate users(:source => 'facebook'), then tags.find { request.get? } (the latter of which will generate a rule), then combine them using addition. We need to somehow propagate the block back to the other Expressions in a Group so we can generate Rules for them. We can do this by storing a reference to the block in the final expression, and overriding Group#+ so that it checks the incoming Expression for a block. If the Expression has a block, we can iterate over the Group and generate Rules for its Expressions.

class Consent::Rule::Expression
  attr_reader :block

  def rule!(block)
    return if block.nil?
    @block = block
    @env.rules << Consent::Rule.new(self, block)
  end

  class Group
    attr_reader :block

    def +(expression)
      rule!(expression.block)
      super
    end

    def rule!(block)
      return if block.nil?
      @block = block
      each { |exp| exp.rule!(block) }
    end
  end

end
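
To see the propagation in action, here’s a self-contained sketch with everything flattened into a single hypothetical GroupSketch namespace (no inheritance, no params). One assumption to flag: Expression#+ here builds the rule-aware Group directly, which is how I’d expect the pieces to be wired together, but it isn’t spelled out in the snippets above.

```ruby
module GroupSketch
  Rule = Struct.new(:expression, :predicate)

  class Expression
    attr_reader :controller, :action, :block

    def initialize(env, name)
      @env, @controller = env, name.to_s
    end

    def rule!(block)
      return if block.nil?
      @block = block # remembered so a Group can pick it up later
      @env.rules << Rule.new(self, block)
    end

    def method_missing(name, &block)
      @action = name.to_s
      rule!(block)
      self
    end

    def +(other)
      Group.new + self + other
    end
  end

  class Group
    include Enumerable
    attr_reader :block

    def initialize
      @exprs = []
    end

    def each(&block)
      @exprs.each(&block)
    end

    def +(other)
      rule!(other.block) # apply the incoming block to what we already hold
      if other.is_a?(Group)
        other.each { |exp| @exprs << exp }
      else
        @exprs << other
      end
      self
    end

    def rule!(block)
      return if block.nil?
      @block = block
      each { |exp| exp.rule!(block) }
    end
  end

  # Hypothetical host environment.
  class Description
    def rules
      @rules ||= []
    end

    def method_missing(name, &block)
      expression = Expression.new(self, name)
      expression.rule!(block)
      expression
    end
  end
end

desc = GroupSketch::Description.new
desc.instance_eval { users + tags.find { request.get? } }

desc.rules.map { |r| r.expression.controller }
# ["tags", "users"]: tags.find made its rule first, then the block reached users
```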

Notice how both Expression and Group now support the methods +, block and rule!. This means the addition operator can cope with adding any combination of Expressions and Groups. The Group#rule! method comes in particularly handy when implementing HTTP verb filters, as shown below. But before we get there, there’s one tricky issue to sort out: response formats. Consider the following expression:

categories * xml { params[:debug].nil? }

Ruby will evaluate categories and xml { params[:debug].nil? }, then multiply the resulting Expressions. Clearly we need to override Expression#* to propagate the rule, but there’s another problem: we’ve generated a Rule in memory matching XmlController, which is not what we wanted (the above expression should match any CategoriesController action where the response format is XML). So, we need to tell this Rule that it’s no longer valid, but we don’t have a reference to it (this is also why we can’t just remove it from @env.rules). Expressions don’t hold references to Rules, it’s the other way around!

One way to solve this is to use the observer pattern. The Rule can observe its @expression, and the expression can publish messages to declare itself invalid.

require 'observer'

class Consent::Rule

  def initialize(expression, block)
    @expression, @predicate = expression, block
    @expression.add_observer(self)
  end

  def update(message)
    @invalid = true if message == :destroyed
  end

  class Expression
    include Observable

    def *(expression)
      expression.destroy!
      rule!(expression.block)
      super
    end

    def destroy!
      changed(true)
      notify_observers(:destroyed)
    end
  end

end

This sets up a Rule to observe the Expression it relates to, and to render itself invalid if the Expression is “destroyed”. The multiplication operator destroys the incoming Expression, creates a new rule using self if the expression had a block, then calls super to carry out the inherited meaning of Expression#*.
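
Here’s a self-contained sketch of that invalidation, using a hypothetical ObserverSketch namespace. To keep it dependency-free it swaps in a hand-rolled add_observer and destroy! pair instead of the Observable module; the valid? helper is also my own addition for the example. The idea is the same: the Rule subscribes to its expression and flips a flag when told the expression is gone.

```ruby
module ObserverSketch
  class Rule
    attr_reader :expression

    def initialize(expression, block)
      @expression, @predicate = expression, block
      @expression.add_observer(self)
    end

    # Called by the expression when it publishes a message.
    def update(message)
      @invalid = true if message == :destroyed
    end

    # Hypothetical helper for the example.
    def valid?
      !@invalid
    end
  end

  class Expression
    attr_reader :controller, :format, :block

    def initialize(env, name)
      @env, @controller = env, name.to_s
      @observers = []
    end

    # Minimal stand-ins for Observable's add_observer / notify_observers.
    def add_observer(observer)
      @observers << observer
    end

    def destroy!
      @observers.each { |observer| observer.update(:destroyed) }
    end

    def rule!(block)
      return if block.nil?
      @block = block
      @env.rules << Rule.new(self, block)
    end

    def *(other)
      other.destroy!            # invalidate the rule made for the format "controller"
      rule!(other.block)        # re-create the rule against the real controller
      @format = other.controller
      self
    end
  end

  # Hypothetical host environment.
  class Description
    def rules
      @rules ||= []
    end

    def method_missing(name, &block)
      expression = Expression.new(self, name)
      expression.rule!(block)
      expression
    end
  end
end

desc = ObserverSketch::Description.new
desc.instance_eval { categories * xml { params[:debug].nil? } }

live = desc.rules.select(&:valid?)
# two rules were stored, but only the one for "categories" (format "xml") is still valid
```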

Phew! Almost there, just one final piece of the puzzle: HTTP verbs. These are top-level methods in the expression language, and it didn’t really make sense to put them in Expression::Generator since you can’t, say, call redirect_to post(tags.list), it just doesn’t make any sense. But they can be used to filter incoming requests, so we’ll put them in Rule::Generator. Each HTTP method takes a list of Expressions and/or Groups, combines them into a single Group, sets the verb on that group and returns it. Since they’re top-level functions, they also need to take blocks and generate rules (using Group#rule!). I’m going to put the verb setter methods on the base Expression classes as they’re not really specific to Rule::Expression.

module Consent

  class Expression
    def verb=(verb)
      @verb = verb.to_s
    end

    class Group
      def verb=(verb)
        each { |exp| exp.verb = verb }
      end
    end
  end

  module Rule::Generator
    %w(get post put head delete).each do |verb|
      module_eval <<-EOS
        def #{verb}(*exprs, &block)
          group = exprs.inject { |grp, exp| grp + exp }
          group.verb = :#{verb}
          group.rule!(block)
          group
        end
      EOS
    end
  end

end
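
As a quick check of how the verb methods behave, here’s a self-contained sketch in a hypothetical VerbSketch namespace. It generates the methods with define_method rather than a module_eval string, purely to keep the example short; the behaviour it shows (combine, set the verb, generate rules) follows the description above.

```ruby
module VerbSketch
  Rule = Struct.new(:expression, :predicate)

  class Expression
    attr_reader :controller, :verb

    def initialize(env, name)
      @env, @controller = env, name.to_s
    end

    def verb=(verb)
      @verb = verb.to_s
    end

    def rule!(block)
      return if block.nil?
      @env.rules << Rule.new(self, block)
    end

    def +(other)
      Group.new + self + other
    end
  end

  class Group
    include Enumerable

    def initialize
      @exprs = []
    end

    def each(&block)
      @exprs.each(&block)
    end

    def +(other)
      if other.is_a?(Group)
        other.each { |exp| @exprs << exp }
      else
        @exprs << other
      end
      self
    end

    def verb=(verb)
      each { |exp| exp.verb = verb }
    end

    def rule!(block)
      return if block.nil?
      each { |exp| exp.rule!(block) }
    end
  end

  # Hypothetical host environment.
  class Description
    def rules
      @rules ||= []
    end

    def method_missing(name)
      Expression.new(self, name)
    end

    %w(get post put head delete).each do |verb|
      define_method(verb) do |*exprs, &block|
        group = exprs.inject { |grp, exp| grp + exp }
        group.verb = verb
        group.rule!(block)
        group
      end
    end
  end
end

desc  = VerbSketch::Description.new
group = desc.instance_eval { post(tags, users) { !session[:user].nil? } }

group.map(&:verb) # ["post", "post"]
desc.rules.size   # 2, one rule per expression in the group
```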

And that pretty much wraps it up. Interpreting rules is another problem in its own right, and to be honest you’re better off reading the Consent source code if you want the complete picture. In the meantime, I’m doing a JSON and JSONQuery interpreter for Ruby that I hope to release soonish, and also a Scheme implementation which will almost certainly never see the light of day but is a lot of fun nonetheless. Looks like 2009 is “tool around with languages” year for me.

Source code for this and the previous post is up on Github if you want to grab it all at once. I’ve added a few handy inspection methods so rules look nicer in irb and I’ve done some cursory testing but if any of it is broken please let me know.

Writing your own expression language in Ruby

The last few days, I’ve been writing Consent, a tool for writing declarative firewalls for Rails apps. I thought it would be interesting to dig into its implementation now that the code’s settled down, as it’s one of the more complicated DSLs I’ve written, and certainly the first one that makes decent use of Ruby’s features. Turns out method_missing and operator overloading can get you a ton of expressive power for writing your own mini-languages.

The grammar for Consent’s expression language (not that it’s strictly specified, but this will do) looks something like this:

list          ::=   expression [+ expression]*
expression    ::=   controller[action]?[params]?[format]?
controller    ::=   name[/name]*
action        ::=   .name
params        ::=   (:name => value[, :name => value]*)
format        ::=   [.|*]name
name          ::=   [a-z][a-z_]*
value         ::=   integer|string|regexp|range

Consent expressions are capable of matching controller and action names, parameters, response formats and HTTP verbs (though I’ve left the latter out of the grammar for now), and if the above looks a little abstract there are examples on Github. Just to get us started, here’s a typical expression that uses most of the syntax available in the language (I’m leaving rules blocks out for this post):

  ajax/maps + tags.create + users(:id => /admin/)*json

This matches requests for:

  • Any action in Ajax::MapsController
  • TagsController#create
  • Any UsersController action where params[:id] matches /admin/ and the response format is JSON

When we talk about doing DSLs in Ruby, we typically mean that we’re going to provide an API for writing Ruby code that’s idiomatic for the problem we’re trying to solve. It does not mean writing a parser and interpreting trees, though we will in effect be writing an interpreter of sorts. The problem Consent solves is that of declaring access control rules for actions in a Rails app, and its expression language was dictated by that and by the constraints of what constitutes valid Ruby code.

So, we’ve established that expressions need to be valid Ruby. Ruby will parse our DSL’s code but it’s up to us to provide all the methods to output useful data structures from the DSL. We want Consent expressions to generate objects storing request data: controller and action names, etc. Now we run into our first problem: the environment the above expression runs in will need to provide the method names ajax, maps, tags, etc. These names could be pretty much anything, so it’s clear we’re going to need method_missing. But, we don’t want to go defining method_missing up at the top level as that could well wreak havoc; we want to be able to inject our expression language into carefully chosen classes so they only work in certain places. In Consent this is handled something like this:

module Consent
  
  class Expression
    module Generator
      # method_missing etc
    end
    # Expression methods - we'll get to these later
  end
  
  class Description
    include Expression::Generator
    attr_reader :rules
  end
  
  def self.rules(&block)
    desc = Description.new
    desc.instance_eval(&block)
    @rules = desc.rules
  end
  
end

When you call Consent.rules, the block is instance_evaled inside a Description object. Description includes the Expression::Generator module, which contains the method_missing that we need to get our DSL started.

(Bear in mind, this code will not end up looking exactly like Consent. It’s supposed to be a demonstration of how to build up a DSL from scratch, which reminds me: always write your tests first when designing a DSL. It makes it so much easier to figure out what you want and how you might go about building it.)

What does this method_missing need to handle? Well, the top-level method calls in the above expression are (broadly speaking) controller names, and they can take an optional parameter list. So we might guess that method_missing should generate a new Expression using the controller name and params.

module Consent::Expression::Generator
  def method_missing(name, params = {})
    # Fully qualified: the compact module path leaves Consent out of the lexical scope
    Consent::Expression.new(name, params)
  end
end

class Consent::Expression
  def initialize(controller, params = {})
    @controller, @params = controller.to_s, params.dup
  end
end

All the top-level method calls in the language will return an Expression with at least a @controller value. We can knock the first Expression method on the head quickly by looking at the expression tags.create, which we want to match the action TagsController#create. So, we’re going to need another method_missing in Expression itself to handle action names. This should just set the action name and return the Expression so it can be further combined with others. Action calls can also take param lists, so we factor that in:

class Consent::Expression
  def method_missing(name, params = {})
    @action = name.to_s
    @params.update(params)
    self
  end
end
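
Here’s a self-contained sketch of those two method_missing hooks working together, using a hypothetical BasicSketch namespace and Description host class so it can be run on its own.

```ruby
module BasicSketch
  class Expression
    attr_reader :controller, :action, :params

    def initialize(controller, params = {})
      @controller, @params = controller.to_s, params.dup
    end

    # Any call on an expression is treated as an action name.
    def method_missing(name, params = {})
      @action = name.to_s
      @params.update(params)
      self
    end

    module Generator
      # Any unknown top-level call is treated as a controller name.
      def method_missing(name, params = {})
        Expression.new(name, params)
      end
    end
  end

  # Hypothetical host class for the example.
  class Description
    include Expression::Generator
  end
end

desc = BasicSketch::Description.new
expr = desc.instance_eval { tags.create(:name => 'ruby') }

expr.controller # "tags"
expr.action     # "create"
expr.params     # {:name => "ruby"}
```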

Next, notice the expression ajax/maps, which we want to match Ajax::MapsController. Ruby will interpret the expression by calling ajax and maps, then combining the two resulting Expressions using the division operator. We need to define this operator in Expression, but what should it do? We know the first operand (the receiver of the / call) will simply have a @controller value, whereas the second operand may be more complex, e.g. the expression ajax/maps.find(:user => "James"). Let’s implement it by having the first operand modify the second’s @controller value, then returning the second operand:

class Consent::Expression
  def /(expression)
    expression.nesting = @controller
    expression
  end
  
  def nesting=(name)
    @controller = "#{ name }/#{ @controller }"
  end
end

We modify the nesting of the second expression and return it. To do this, we implement a nesting= method, that adds a namespace onto the front of the expression’s @controller.
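
A short sketch shows how this plays out when namespaces are chained. NestingSketch and its Description class are hypothetical names, and params are left out to keep it small.

```ruby
module NestingSketch
  class Expression
    attr_reader :controller, :action

    def initialize(controller)
      @controller = controller.to_s
    end

    def method_missing(name)
      @action = name.to_s
      self
    end

    # The receiver only contributes its name; the right-hand operand survives.
    def /(expression)
      expression.nesting = @controller
      expression
    end

    def nesting=(name)
      @controller = "#{name}/#{@controller}"
    end
  end

  # Hypothetical host environment.
  class Description
    def method_missing(name)
      Expression.new(name)
    end
  end
end

desc = NestingSketch::Description.new
expr = desc.instance_eval { admin/ajax/maps.find }

expr.controller # "admin/ajax/maps"
expr.action     # "find"
```

Because / is left-associative, admin/ajax is evaluated first and returns the ajax expression (now named "admin/ajax"), which then prefixes maps.find in the same way.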

The last thing we need to deal with before thinking about how to combine expressions is response formats. There are two ways to write these: profiles.list.json matches a JSON request for ProfilesController#list, while profiles*json matches a JSON request for any action in ProfilesController. To deal with the first case, we need to change Expression#method_missing so that if the Expression already has an @action, we set the format instead:

class Consent::Expression
  def method_missing(name, params = {})
    @format = name.to_s if @action
    @action ||= name.to_s
    @params.update(params)
    self
  end
end

Notice the ||= assignment so we only set @action if it’s not already set. Also, note that this allows format calls to take param lists, but I’m not particularly bothered about that. The implementation will really allow a fair number of slightly nonsensical things, but right now I’m just concerned with getting the language to work.

The other format method is the multiplication operator. Again, Ruby will interpret profiles*json by creating a ProfilesController expression and a JsonController expression before combining them. This time, it’s the first operand that contains information we want to keep; we just want to modify it using the second’s @controller value. The second operand is effectively thrown away. Also, the * is supposed to signify “any action” so let’s nullify the @action.

class Consent::Expression

  def *(expression)
    @format = expression.instance_eval { @controller }
    @action = nil
    self
  end
end
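
Here are both format styles side by side in a self-contained sketch. FormatSketch and Description are hypothetical names, and the sketch reads the second operand’s controller through an attr_reader rather than instance_eval, just for brevity.

```ruby
module FormatSketch
  class Expression
    attr_reader :controller, :action, :format

    def initialize(controller)
      @controller = controller.to_s
    end

    # First call sets the action; a second call sets the format.
    def method_missing(name)
      @format = name.to_s if @action
      @action ||= name.to_s
      self
    end

    # The receiver survives; the operand only contributes a format name.
    def *(expression)
      @format = expression.controller
      @action = nil
      self
    end
  end

  # Hypothetical host environment.
  class Description
    def method_missing(name)
      Expression.new(name)
    end
  end
end

desc    = FormatSketch::Description.new
dotted  = desc.instance_eval { profiles.list.json }
starred = desc.instance_eval { profiles*json }

[dotted.action, dotted.format]   # ["list", "json"]
[starred.action, starred.format] # [nil, "json"]
```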

Finally, we need to combine expressions into lists, which we do using the addition operator. To do this, we’re going to need a new class to represent groups of expressions. This will have a + method that simply takes an expression or a group and adds it to a list:

class Consent::Expression::Group
  include Enumerable
  
  def initialize
    @exprs = []
  end
  
  def each(&block)
    @exprs.each(&block)
  end
  
  def +(expression)
    if expression.is_a?(Enumerable)
      expression.each { |exp| @exprs << exp }
    else
      @exprs << expression
    end
    self
  end
end

This allows a Group to be combined with other Groups by adding the Expressions from one to another. Now we can implement Expression#+ by simply creating a new group and adding the two operand expressions to it:

class Consent::Expression
  def +(expression)
    Group.new + self + expression
  end
end
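
To check that groups really do flatten when combined, here’s a self-contained sketch in a hypothetical ListSketch namespace (the Description host and the controller reader are there just for the example).

```ruby
module ListSketch
  class Expression
    attr_reader :controller

    def initialize(controller)
      @controller = controller.to_s
    end

    def +(expression)
      Group.new + self + expression
    end
  end

  class Group
    include Enumerable

    def initialize
      @exprs = []
    end

    def each(&block)
      @exprs.each(&block)
    end

    # Groups are flattened into this one; single expressions are appended.
    def +(expression)
      if expression.is_a?(Enumerable)
        expression.each { |exp| @exprs << exp }
      else
        @exprs << expression
      end
      self
    end
  end

  # Hypothetical host environment.
  class Description
    def method_missing(name)
      Expression.new(name)
    end
  end
end

desc  = ListSketch::Description.new
group = desc.instance_eval { maps + (tags + users) }

group.map(&:controller) # ["maps", "tags", "users"]
```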

And that's everything you need for a basic implementation of the grammar I showed you up top, in barely 60 lines of Ruby. I've left HTTP verbs out of the mix but hopefully you can see how you might implement them (hint: you need some extra methods in Generator that take Expressions and Groups). I've also neglected rule blocks for the time being as they introduce a lot of complication and I wanted to focus on the expression interpreter in this post. Maybe another time, eh?

Oh, and as an exercise for the reader: see if you can write an inspect method for Expression and Group that returns the original source code that generated the expression; it makes for really helpful log messages.

Consent: a little firewall DSL for your Rails app

Well, it’s been a couple of months. Rest assured I’ve still been hacking away; JS.Class will be getting hashes and constants at some point in the future, I’ve got a bunch of improvements to make on Bluff, and I’ve been contributing to PDoc which is a really promising JavaScript doc engine from Tobie Langel that uses the excellent Treetop parsing engine. But enough about all that.

As a little new year present, I hacked up Consent for a friend. He was telling me he had this nice resourceful, RESTful Rails app but there was some logic, mostly permissions-style stuff, that he wasn’t sure how to do cleanly and without making his app kinda messy. I’m not sure if Consent is the answer he was after, but it’s one stab at a solution.

Consent is basically a declarative firewall for ActionController: you write a bunch of rules in a single config file, and these rules are processed on every request before the request gets to your controller. Using combinations of controller and action names, parameter matches and HTTP verbs, you can select actions from your app and use Ruby blocks to decide whether requests should be processed or whether you should be bounced to a 403 page. It’s an alternative to using verify and method-specific access control logic; you can still access the request and session data but you’ve factored your access control logic out of the controller into a separate layer.

As a quick example, here’s a rule that says all requests to SiteController#hello must be Ajax requests:

  site.hello { request.xhr? }

It’s currently reasonably well-tested and documented but needs some polish; the rule language basically works but we could do with stuff like customisable 403 responses and the like. Check it out on Github, hack around, tell me if it sucks, you know the drill by now.