ActiveRecord.where.not(:sane => true)

October 07, 2013

TL;DR - ActiveRecord: This is yak country.

One of the really “fun” things about writing libraries that extend ActiveRecord is that I’m constantly running up against interesting problems that require a choice between multiple less-than-stellar options to resolve. One such case arose not too long ago when working on Rails 4 compatibility for Squeel. Since the problem is niche enough (and my chosen solution weird enough) that I don’t expect anyone else to write about this, I figured I may as well. I have no idea if you’ll find it useful. Read on at your peril.

Joining the WhereChain Gang

When it came time to make Squeel Rails-4.0-compatible, I needed to avoid causing breakage of standard ActiveRecord 4 features where possible. One such feature is the new where.not syntax for negating conditions. Briefly, let’s look at its implementation in ActiveRecord 4.

In query_methods.rb, ActiveRecord defines the ActiveRecord::QueryMethods module. It’s included into ActiveRecord::Relation, and it’s what provides the support for the chainable where, select, and so on. Here’s what ActiveRecord::QueryMethods#where looks like:

def where(opts = :chain, *rest)
  if opts == :chain
    WhereChain.new(spawn)
  elsif opts.blank?
    self
  else
    spawn.where!(opts, *rest)
  end
end

As you can see, there’s an arbitrary default symbol, :chain, that gets assigned to opts if no arguments are supplied. This is what triggers the new behavior, which returns a new instance of ActiveRecord::QueryMethods::WhereChain with a newly-spawned scope. Note that spawn is “ActiveRecord” for “build a new copy of the existing scope, so that we don’t modify this one in place.”
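If the sentinel-default trick is new to you, here's a minimal sketch of it that doesn't touch ActiveRecord at all. ToyRelation and ToyChain are hypothetical stand-ins I've made up for illustration; only the shape of where mirrors the method above, and the blank check is simplified to nil? / empty?.

```ruby
# Hypothetical stand-ins; these are not ActiveRecord classes.
class ToyChain
  attr_reader :scope

  def initialize(scope)
    @scope = scope
  end
end

class ToyRelation
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions
  end

  # Like spawn: hand back a copy so the receiver is never modified in place.
  def spawn
    ToyRelation.new(conditions.dup)
  end

  def where(opts = :chain, *rest)
    if opts == :chain
      ToyChain.new(spawn)             # no arguments: return the chain object
    elsif opts.nil? || opts.empty?
      self                            # blank conditions: nothing to do
    else
      copy = spawn
      copy.conditions << [opts, *rest]
      copy
    end
  end
end

relation = ToyRelation.new
chain    = relation.where               # a ToyChain wrapping a fresh copy
filtered = relation.where(:name => 'x') # a new ToyRelation with one condition
```

The receiver comes out of both calls untouched, which is the whole point of spawning first.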

What’s WhereChain do? Let’s have a look at its full implementation:

class WhereChain
  def initialize(scope)
    @scope = scope
  end

  def not(opts, *rest)
    where_value = @scope.send(:build_where, opts, rest).map do |rel|
      case rel
      when Arel::Nodes::In
        Arel::Nodes::NotIn.new(rel.left, rel.right)
      when Arel::Nodes::Equality
        Arel::Nodes::NotEqual.new(rel.left, rel.right)
      when String
        Arel::Nodes::Not.new(Arel::Nodes::SqlLiteral.new(rel))
      else
        Arel::Nodes::Not.new(rel)
      end
    end
    @scope.where_values += where_value
    @scope
  end
end

That’s it. Its sole purpose is to expose a not method, which sends a message to the private build_where method on the original ActiveRecord::Relation, takes each of the resulting values (which may be an Arel::Nodes::Node, a String, or something else, depending on input), and does its best to negate them in a suitably Arel-like way. Given typical Hash input, build_where will use ActiveRecord::PredicateBuilder to build and return an array of Arel::Nodes::Node objects.

Problem?

The astute readers among you might quickly recognize that a message to where with no arguments is exactly what you have when you write something in the Squeel DSL, such as:

Article.where { created_at > 2.days.ago }

This is a minor problem, one that can be easily solved by checking for a block using block_given?. The real issue comes when we consider the original Squeel implementation of build_where:

def build_where(opts, other = [])
  case opts
  when String, Array
    super
  else  # Let's prevent PredicateBuilder from doing its thing <=== PROBLEM
    [opts, *other].map do |arg|
      case arg
      when Array  # Just in case there's an array in there somewhere
        @klass.send(:sanitize_sql, arg)
      when Hash
        @klass.send(:expand_hash_conditions_for_aggregates, arg)
      else
        arg
      end
    end
  end
end

The issue is that one of the fundamental things that enables Squeel to do what Squeel does is bypassing PredicateBuilder altogether, which lets us handle the relation’s where_values when we have more information about the entire query. Initially, I struggled with whether or not a user should reasonably expect where.not to accept a block as other chainable methods do, and then negate it. Pretty quickly, I ruled that out, since Squeel’s functionality supersedes what where.not provides. Even so, just permitting the new behavior to work as it would without Squeel installed proves tricky, because of the way that it’s implemented.

We have a few options:

Refactor ActiveRecord to make life just a little less crazy, by at least keeping methods that need access to private methods in the same class. I tried that, and it didn’t take, due to an apparent desire for future extensibility. It would have still meant some hackage, but at least it wouldn’t have required something like…
…monkey patching yet another class, making WhereChain#not send a special parameter to build_where that triggers the super implementation. Since there was supposedly a reasonable chance that other methods would later be implemented, if not in core then by other gems, this was deemed too fidgety to allow me to sleep at night.
Something else. Something that would say to ActiveRecord: “You know what? I give up. If you insist on using a WhereChain, I wash my hands of it. Here, have your original implementation of build_where back. I won’t even mess with its method signature.”

Conveniently, WhereChain uses dependency injection already, taking a scope to which it will send the build_where message. So, you might think that I would just provide a wrapper object that implements a build_where which then sends to the private version of the method with a flag that tells it to call super. What?! More sending to private methods? And introducing an object that has no other purpose than to emulate a questionable design choice to begin with? Pfft! Where’s the excitement in that?

Don’t Try This At Home

So, let’s say you have an object, and you’d like to replace one of its methods with a version from a module that was already included in the object’s class, but has since been overridden.

How would you do it? If you said “You wouldn’t, that’s crazy talk!” then give yourself a cookie. You’ve earned it. But now you’re curious, aren’t you? How would you do something like that?

Here’s a simplified scenario that is not unlike the situation we have with ActiveRecord, for the purposes of this discussion.

module Foo
  def do_something
    puts 'Foo#do_something'
  end
end

module Bar
  def do_something
    super
    puts 'Bar#do_something'
  end
end

class Baz
  include Foo
  include Bar
end

Here, we want to have an instance of Baz that, when sent the do_something message, only prints Foo#do_something.

You might assume that you could extend the module onto the object itself, which would include the module into its singleton class, which would be the first implementation of do_something encountered during message dispatch.

Two problems with that approach:

If it worked, it would replace all of the methods with those from the extended module, not just the one you want.
It doesn’t work, anyway.

Let’s try it:

baz = Baz.new
baz.extend Foo
baz.do_something
# => Foo#do_something
# => Bar#do_something

And, if you take a look at its singleton class’s ancestors, you’ll see a clue as to why:

puts baz.singleton_class.ancestors.inspect
# => [Baz, Bar, Foo, Object, Kernel, BasicObject]

Nope, Foo stays right where it was, before. No cutting to the front of the line for Foo!

“Aha,” you say, “I’ll include it in another module!”

module Pow
  include Foo
end

baz = Baz.new
baz.extend Pow

baz.do_something
# => Foo#do_something
# => Bar#do_something
puts baz.singleton_class.ancestors.inspect
# => [Pow, Baz, Bar, Foo, Object, Kernel, BasicObject]

Still no dice. And still, even if it would have worked, it would have brought along with it all of the module’s methods, not just the one you’re interested in.
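If you want to see who is actually answering the message, Method#owner will tell you. Here's a small sketch of the same setup; the only change I've made is that the methods return arrays of strings instead of printing, so the results are easy to compare.

```ruby
module Foo
  def do_something
    ['Foo#do_something']
  end
end

module Bar
  def do_something
    super + ['Bar#do_something']
  end
end

class Baz
  include Foo
  include Bar
end

module Pow
  include Foo
end

baz = Baz.new
baz.extend Pow

chain = baz.singleton_class.ancestors
chain.index(Pow) < chain.index(Bar)  # => true, Pow made it to the front...
chain.index(Bar) < chain.index(Foo)  # => true, ...but Foo is still behind Bar
chain.count(Foo)                     # => 1, Foo was not inserted a second time

baz.method(:do_something).owner      # => Bar
baz.do_something
# => ["Foo#do_something", "Bar#do_something"]
```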

Why didn’t it work? The object already had Foo included. Extending another module that also includes Foo won’t shift Foo’s position in the ancestor chain. Of course, if Pow#do_something was defined, it would have responded to the message.

And therein lies the answer:

module Pow
  include Foo
  define_method :do_something, Foo.instance_method(:do_something)
end

baz = Baz.new
baz.extend Pow

baz.do_something
# => Foo#do_something

Success! Because define_method can bind a method from Foo to any Module (and therefore Class, which is a kind of Module) that is a “subclass” of Foo, and Pow includes Foo (and module inclusion is inheritance), it defines the method inside Pow and, well… pow!
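Two things are worth checking here: that only the one method got copied into Pow, and that instances you didn't extend still behave as before. A quick sketch, again with the methods returning arrays of strings rather than printing (that change is mine, purely to make the results comparable):

```ruby
module Foo
  def do_something
    ['Foo#do_something']
  end

  def do_something_else
    ['Foo#do_something_else']
  end
end

module Bar
  def do_something
    super + ['Bar#do_something']
  end
end

class Baz
  include Foo
  include Bar
end

module Pow
  include Foo
  # Copy just this one method from Foo into Pow itself.
  define_method :do_something, Foo.instance_method(:do_something)
end

patched = Baz.new.extend(Pow)
plain   = Baz.new

patched.do_something         # => ["Foo#do_something"]
plain.do_something           # => ["Foo#do_something", "Bar#do_something"]
Pow.instance_methods(false)  # => [:do_something]
```

Because extend only touches the singleton class of the one object, every other Baz keeps the Bar behavior.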

An interesting side note: in Ruby 2.0.0, define_method no longer requires that you bind a method to a module that is a subclass of the method’s source. In 1.9.3, however, this will happen:

module Pow
  define_method :do_something, Foo.instance_method(:do_something)
end
# => in `define_method': bind argument must be a subclass of Foo (TypeError)

Wrapping Up

So, with that little experiment out of the way, this actually works for Squeel’s needs:

module WhereChainCompatibility
  include ::ActiveRecord::QueryMethods
  define_method :build_where,
    ::ActiveRecord::QueryMethods.instance_method(:build_where)
end

def where(opts = :chain, *rest)
  if block_given?
    super(DSL.eval &Proc.new)
  else
    if opts == :chain
      scope = spawn
      scope.extend(WhereChainCompatibility)
      ::ActiveRecord::QueryMethods::WhereChain.new(scope)
    else
      super
    end
  end
end
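To watch the whole flow without loading ActiveRecord or Squeel, here's a toy version. Every name in it (CoreMethods, DslMethods, ChainCompatibility, ToyChain, ToyRelation) is a hypothetical stand-in, and build_where just tags its input so you can tell which implementation ran. Treat it as a sketch of the technique, not as Squeel's actual code.

```ruby
# All names here are hypothetical stand-ins, not ActiveRecord or Squeel code.
module CoreMethods
  def build_where(opts)
    [[:core, opts]]
  end
  private :build_where
end

module DslMethods
  def build_where(opts)   # the override that bypasses the core version
    [[:dsl, opts]]
  end
  private :build_where
end

module ChainCompatibility
  include CoreMethods
  # Put the core implementation back in front, for extended objects only.
  define_method :build_where, CoreMethods.instance_method(:build_where)
end

class ToyChain
  def initialize(scope)
    @scope = scope
  end

  def not(opts)
    @scope.send(:build_where, opts).map { |kind, value| [:not, kind, value] }
  end
end

class ToyRelation
  include CoreMethods
  include DslMethods

  def where(opts = :chain)
    if opts == :chain
      ToyChain.new(dup.extend(ChainCompatibility))  # only this copy is patched
    else
      build_where(opts)
    end
  end
end

relation = ToyRelation.new
relation.where(:a => 1)      # => [[:dsl, {:a => 1}]]
relation.where.not(:a => 1)  # => [[:not, :core, {:a => 1}]]
```

The chain gets a scope whose build_where is the core one, and the relation it came from never notices.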

It’s still super-crazy, but kind of beautiful in its minimalism, too. I daresay I like the aesthetic.

Ernie Miller

No, I don't work in NYC, DC, or the valley, and I'm cool with that.
