ActiveRecord::Relation vs Arel

Comments

Since the release of ActiveRecord Query Interface 3.0, I think the line has been pretty blurry for the Rails community about where ActiveRecord ends and Arel begins. I’ve seen folks attribute ActiveRecord::PredicateBuilder to Arel, and I’ve seen folks assume that an ActiveRecord::Relation is an Arel object. Neither is the case. Since MetaSearch and MetaWhere both use combinations of ActiveRecord::Relation and Arel under the hood, and I’ve spent some time putting together patches for both Rails core and Arel, the blurry line between them has become more distinct for me. In this post, I hope to clear up some misunderstandings before Rails 3 is released, and the AR 3.0 query interface becomes the norm.

ActiveRecord::Relation != Arel::Relation

An ActiveRecord::Relation is the basic building block of the AR 3.0 query interface. In any case where you’re chaining things like joins, where, or order on your model, a subclass of ActiveRecord::Base, you’re actually chaining them on an ActiveRecord::Relation, because the very first “relation-y” method you call will be delegated to ActiveRecord::Base#scoped, which returns an ActiveRecord::Relation. In fact, you can get a “blank” ActiveRecord::Relation by calling that method yourself:

ruby-head > Article.scoped.class
=> ActiveRecord::Relation

Now, here’s where the differences start showing up. If you continue to chain methods on this relation, at no point will you ever actually have an Arel object. As you’re probably aware, rows returned by an ActiveRecord::Relations are lazily loaded. When it comes time for an ActiveRecord::Relation to return rows to you, the Arel relation gets built. If we take a look at activerecord/lib/active_record/relation.rb, we see where this happens:

    def to_a
      return @records if loaded?

      @records = eager_loading? ? find_with_associations : @klass.find_by_sql(arel.to_sql)

      preload = @preload_values
      preload +=  @includes_values unless eager_loading?
      preload.each {|associations| @klass.send(:preload_associations, @records, associations) }

      # @readonly_value is true only if set explicity. @implicit_readonly is true if there are JOINS and no explicit SELECT.
      readonly = @readonly_value.nil? ? @implicit_readonly : @readonly_value
      @records.each { |record| record.readonly! } if readonly

      @loaded = true
      @records
    end

Without complicating this discussion with how eager loading gets handled, you’ll notice that the records are actually still being returned by trusty old find_by_sql, but arel is doing the SQL generation for us (arel.to_sql).

ActiveRecord::Relation#arel is defined over in activerecord/lib/active_record/relation/query_methods.rb, and it’s short and sweet:

    def arel
      @arel ||= build_arel
    end

build_arel is way too large to paste here, but the general idea is that all that your ActiveRecord::Relation was doing while you were happily chaining joins and wheres and includes on it was to keep track of those values in corresponding @joins_values, @where_values arrays. See?

    ruby-head > Article.joins(:comments).joins(:moderations).joins_values
    => [:comments, :moderations]

build_arel is (unsurprisingly) the method that does the job of building up an Arel::Relation from all of those accumulated values, and Arel is what then generates the SQL on ActiveRecord’s behalf. We can even call the ActiveRecord::Relation#arel on our relation to get that object back and have a look at it.

    ruby-head > Article.joins(:comments).joins(:moderations).arel.class
     => Arel::From
    ruby-head > Article.joins(:comments).joins(:moderations).arel.is_a?(Arel::Relation)
    => true

So you don’t actually deal with Arel::Relations. They’re used behind the scenes, as a very real replacement for ActiveRecord’s own custom SQL generation code before. In fact, construct_finder_sql became construct_finder_arel in AR 3.0.

Now that that’s out of the way…

So, where does ActiveRecord::PredicateBuilder fit in? The stock PredicateBuilder is a really short and sweet piece of code (only 43 lines!) that sits in activerecord/lib/active_record/relation/predicate_builder.rb and is responsible for building where and having clauses, as evidenced by it only being used in two places in Rails: ActiveRecord::Base#sanitize_sql_hash_for_conditions (called when a hash is sent to sanitize_sql) and ActiveRecord::Relation#build_where (called by build_arel for where and having clauses).

Anyway, ActiveRecord::PredicateBuilder takes advantage of another feature of Arel: predications. Let’s step through PredicateBuilder a bit, and in the process, learn about Arel predication methods. The important method (and only method, besides initialize) of PredicateBuilder is build_from_hash. Here it is, in its entirety:

    def build_from_hash(attributes, default_table)
      predicates = attributes.map do |column, value|
        table = default_table

        if value.is_a?(Hash)
          table = Arel::Table.new(column, :engine => @engine)
          build_from_hash(value, table)
        else
          column = column.to_s

          if column.include?('.')
            table_name, column = column.split('.', 2)
            table = Arel::Table.new(table_name, :engine => @engine)
          end

          unless attribute = table[column]
            raise StatementInvalid,
                  "No attribute named `#{column}` exists for table `#{table.name}`"
          end

          case value
          when Array, ActiveRecord::Associations::AssociationCollection,
               ActiveRecord::Relation
            values = value.to_a
            attribute.in(values)
          when Range
            attribute.in(value)
          else
            attribute.eq(value)
          end
        end
      end

      predicates.flatten
    end

So, as you can see, this method handles the key/value pairs in the hash recursively. It takes a default Arel::Table to start. If it encounters a key whose corresponding value is a hash, then it assumes the key to be a table name, instantiates a new Arel::Table and calls itself again. Otherwise, it expects keys to be the names of columns in that table, apart from one convenience provided (presumably for quicker reference of a one-off condition on another table) where if the column name contains a period, it’s assumed to be a table/column pair.

Arel::Tables have an overloaded [] operator, which expects the value inside the [] to be a column name, and returns an Arel::Attribute corresponding to that column. The attribute knows about the table it came from, so when an Arel::Attribute gets converted to SQL, it knows how to supply its table name as well. Arel::Attributes have a number of methods defined on them, including the predication methods I mentioned earlier. These methods return a kind of Arel::Predicate (things like Equality, In, Match, etc) condition with two operands. Predicates take two operands. In the case of Predicates created by predication methods, the first operand will be the attribute itself. The second is supplied to the predication method by PredicateBuilder.

We can see this further down in the method, where it calls attribute.in or attribute.eq depending on the type of value received. You can see this reflected in an ActiveRecord::Relation‘s @where_values array:

ruby-head > Article.where(
  :title => 'foo',
  :created_at => 3.days.ago..Time.now
).where_values
=> [#<Arel::Predicates::Equality:0x000001028fe8c0
     @operand1=<Attribute title>, @operand2="foo">,
   #<Arel::Predicates::In:0x000001028fe370
     @operand1=<Attribute created_at>,
     @operand2=Sun, 08 May 2010 19:28:03 EDT -04:00..
       2010-05-11 19:28:03 -0400>]

Incidentally, it should come as no surprise that a lot of the “magic” provided by MetaWhere comes from a rewritten PredicateBuilder, MetaWhere::Builder.

Nifty Arel tricks

ActiveRecord::Relation does the Rails developer a great service by hiding much of the complexity of Arel, and keeping the various parts of your query in their most malleable form (what’s simpler to work with than several arrays?) until the last possible moment. However, there may be cases where you want or need to work with Arel directly. Here are a few handy tricks, and some gotchas to keep in mind.

Get the Arel object representing your current query (restated from above):

    ruby-head > arel = Article.where(
      :title => 'foo', :created_at => 3.days.ago..Time.now
    ).arel

Get the where clauses from the arel object:

    ruby-head > arel.send(:where_clauses)
    => ["("articles"."title" = 'foo')",
         "("articles"."created_at" BETWEEN '2010-05-08 20:42:53.042691'
           AND '2010-05-11 20:42:53.043081')"]

This also works for from_clauses, select_clauses, group_clauses, having_clauses, and order_clauses. For join clauses, see the next tip.

Get the join sql from the arel object as a string:

    ruby-head > arel = Article.joins(:comments).arel
    ruby-head > arel.joins(arel)
    => "INNER JOIN "comments"
       ON "comments"."article_id" = "articles"."id""

Get the Arel::Table for an ActiveRecord model:

ruby-head > Article.arel_table
=> #<Arel::Table:0x00000102b81bb0 @name="articles", ...>
# or, with an alias:
ruby-head > Article.arel_table.as('awesome_articles')
=> #<Arel::Table:0x00000102f0f2f0 @name="articles", ...
   :as=>"awesome_articles" ... >

Create your own conditions using predicates other than Equality and In

ruby-head > Article.where(Article.arel_table[:title].matches('%foo%')).to_sql
=> "SELECT     "articles".* FROM       "articles"
    WHERE     ("articles"."title" LIKE '%foo%')"

See this post for more details on using Arel in your conditions.

See the generated SQL for a given query:
I use this one all the time on this site, but still I think it’s worth mentioning.

    ruby-head > Article.where(:title => 'foo').to_sql
     => "SELECT     "articles".* FROM       "articles"
       WHERE     ("articles"."title" = 'foo')"

One gotcha of note with to_sql:
Near the beginning of this post, you may remember I mentioned that I didn’t want to complicate the discussion of the to_a method with eager loading. Well, now it’s time to complicate things.

If eager loading is a factor in your query, either due to explicit eager_loading values, or includes values that are also used in your where conditions, then you’ll notice that the value returned by to_sql doesn’t reflect what actually gets run on the server:

ruby-head > Article.includes(:comments).where(:comments => {:body => 'hey'}).to_sql
=> "SELECT     \"articles\".* FROM       \"articles\"
   WHERE     (\"comments\".\"body\" = 'hey')"
ruby-head > Article.includes(:comments).where(:comments => {:body => 'hey'}).all
# From the development.log
  Article Load (0.4ms)  SELECT "articles"."id" AS t0_r0, "articles"."title" AS t0_r1,
"articles"."body" AS t0_r2, "articles"."created_at" AS t0_r3, "articles"."updated_at"
AS t0_r4, "articles"."lookup_id" AS t0_r5, "comments"."id" AS t1_r0,
"comments"."article_id" AS t1_r1, "comments"."body" AS t1_r2,
"comments"."created_at" AS t1_r3, "comments"."updated_at" AS t1_r4
FROM "articles" LEFT OUTER JOIN "comments" ON 
"comments"."article_id" = "articles"."id" WHERE ("comments"."body" = 'hey')

Of course, given the nature of how eager loading and including works, you’d expect the query to differ when they’re involved. So to_sql is a useful tool to display what’s being generated in your queries — just be aware of its limitations and remember that it’s no substitute for checking the logs when confusion arises.

Gotchas (reinforcing that Arel != ActiveRecord::Relation since 2010)

Chaining multiple orders on an Arel::Relation will not result in them lining up left to right as they do in ActiveRecord::Relation.

    ruby-head > Article.order('id').order('title').to_sql
    => "SELECT     "articles".* FROM       "articles"
       ORDER BY  id, title" 
    ruby-head > articles = Article.arel_table
    ruby-head > articles.order('id').order('title').to_sql
    => "SELECT     "articles"."id", "articles"."title",
       "articles"."body", "articles"."created_at", "articles"."updated_at",
       "articles"."lookup_id"
       FROM       "articles" ORDER BY  title, id"

With Arel, the last applied order is given precedence. As Bryan Helmkamp explained it to me:

Think of it like a spreadsheet. You click on a column, and it sorts by that column. You click on another column, and that sort takes precedence.

Some methods don’t build on the previous values at all, but replace them entirely.

Adding additional wheres to an Arel::Relation works as you might expect from dealing with ActiveRecord::Relation, but adding new groups, havings, etc will only use the last supplied value.

    ruby-head > articles.where('id > 1').where('id < 3').to_sql
    => "SELECT     "articles"."id", "articles"."title", "articles"."body",
       "articles"."created_at", "articles"."updated_at",
       "articles"."lookup_id" FROM       "articles"
       WHERE     id > 1 AND id < 3" 
    uby-head > articles.having('id > 1').having('id < 3').to_sql
    => "SELECT     "articles"."id", "articles"."title", "articles"."body",
       "articles"."created_at", "articles"."updated_at",
       "articles"."lookup_id" FROM       "articles"
       HAVING    id < 3"

That’s about it, for now. Hope this helps sharpen up that blurry line between ActiveRecord 3.0 and Arel for a few of you out there!

comments powered by Disqus