How do I add a condition to the ON clause generated by includes in active record while retaining eager loading?
Let's say I have these classes:
class Car
has_many :inspections
end
class Inspection
belongs_to :car
end
Now I can do:
Car.includes(:inspections)
Select * from cars LEFT OUTER JOIN inspections ON cars.id = inspections.car_id
But I want to generate this sql:
Select * from cars LEFT OUTER JOIN inspections ON cars.id = inspections.car_id
AND inspections.month = '2013-04-01'
(this doesn't work):
Car.includes(:inspections).where("inspections.month = '2013-04-01'")
Select * from cars LEFT OUTER JOIN inspections ON cars.id = inspections.car_id
WHERE inspections.month = '2013-04-01'
I don't know this exactly, but what you are trying to do is probably not recommended i.e. violates one of Rails' conventions. According to this answer in a related question, the default behavior for such queries is to use two queries, like:
SELECT "cars".* FROM "cars";
SELECT "inspections".* FROM "inspections" WHERE "inspections"."car_id" IN (1, 2, 3, 4, 5);
This decision was made for performance reasons. That makes me guess that the exact type of query (JOIN or multiple queries) is an implementation detail that you cannot count on. Going along this train of thought, ActiveRecord::Relation probably wasn't designed for your use case, there is probably no way to add an ON condition in the query.
Going along this sequence of guesses, if you truly believe that your use case is unique, the best thing to do is probably for you to craft your own SQL query as follows:
Car.joins(Car.send(:sanitize_sql_array, ["LEFT OUTER JOIN inspections ON inspections.car_id = cars.id AND inspections.month = ?", "2013-04-01"]))
(Update: this was asked last year and did not receive a good answer.)
Alternative 1
As Carlos Drew suggested,
@cars = Car.all
car_ids = @cars.map(&:id)
@inspections = Inspection.where(inspections: {month: '2013-04-01', car_id: car_ids})
# with scopes: Inspection.for_month('2013-04-01').where(car_id: car_ids)
However, in order to prevent car.inspections from triggering unnecessary SQL calls, you also need to do
# app/models/car.rb
has_many :inspections, inverse_of: :car
# app/models/inspection.rb
belongs_to :car, inverse_of: :inspections
Alternative 2
Perhaps you can find a way to cache the inspections for the current month, and then don't worry about eager loading. This might be the best solution, since the cache can be reused in various places.
@cars = Car.all
@cars.each do |car|
car.inspections.where(month: '2013-04-01')
end
I've rethought your question more broadly. I think you are facing a code design problem as well as (instead of?) an ActiveRecord query problem.
You are asking to return a relation of Cars on which .inspections has been redefined to mean those Inspections matching a specific date. ActiveRecord does not allow you to redefine a model association on the fly, based on a query.
If you were not asking for a dynamic condition on the inspection date, I would tell you to use a has_many :through with a :condition.
has_many :passed_inspections, :through => :inspections, :conditions => {:passed => true}
@cars = Car.includes(:passed_inspections)
Obviously, that would not work if you need to supply an inspection date on the fly.
So, in the end, I would tell you to do something like this:
@cars = Car.all
@inspections = Inspection.where(inspections: {month: '2013-04-01', car_id: @cars.pluck(:id)})
(The exact, best implementation of that car_id where condition is up for debate. And you'll then need to group the @inspections by car_id to get the right subset at a given moment.)
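The grouping step mentioned above could look roughly like this. It is a plain-Ruby sketch, not ActiveRecord code: the Inspection struct and the inspections_by_car helper are made-up stand-ins for the records the where call would return.

```ruby
# Hypothetical stand-in for the Inspection records returned by the query.
Inspection = Struct.new(:id, :car_id, :month)

# Builds a car_id => [inspections] lookup from one flat result list.
def inspections_by_car(inspections)
  inspections.group_by(&:car_id)
end

inspections = [
  Inspection.new(1, 7, '2013-04-01'),
  Inspection.new(2, 7, '2013-04-01'),
  Inspection.new(3, 9, '2013-04-01')
]

by_car = inspections_by_car(inspections)
p by_car.fetch(7, []).map(&:id) # => [1, 2]
p by_car.fetch(8, []).map(&:id) # => [] (a car with no inspection that month)
```

Using fetch with an empty-array default keeps cars without inspections from returning nil.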
Alternately, in a production environment, you might be able to rely on some fairly good/clever ActiveRecord caching. I'm not certain of this.
def inspections_dated(month)
inspections.where(month: month)
end
Car.includes(:inspections).each{|car| car.inspections_dated(month).each.etc. }
Alternately, Alternately
You can, through manual SQL, trick ActiveRecord into giving you extended Car objects with an unclear interface:
@cars_with_insp = Car.joins("LEFT OUTER JOIN inspections ON inspections.car_id = cars.id AND inspections.month = '2013-04-01'").select("cars.*, inspections.*")
@cars_with_insp.each{|c| puts c.name; puts c.month}
You'll see, in that .each, that you have the inspection's attributes available directly on car, because you've convinced ActiveRecord with a join to return two records of one class as a single row. Rails will tell you its class is Car, but it's more than a Car. You'll either get each Car once, for no matching Inspections, or multiple times for each matching Inspection.
This should work:
Car.includes(:inspections).where( inspections: { month: '2013-04-01' })
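The authors of Rails did not build this functionality into ActiveRecord, presumably because filtering with WHERE covers the common case and they felt no need to have an alternative. (Strictly speaking, the WHERE form also drops cars that have no matching inspections, so the two result sets are not identical.)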
In the docs and code, we find the two "official" methods of adding conditions to included models.
In the actual source code: https://github.com/rails/rails/blob/5245648812733d2c31f251de3e05e78e68bfa3a5/activerecord/lib/active_record/relation/query_methods.rb we find them using WHERE to accomplish this:
And I quote: "
=== conditions
#
# If you want to add conditions to your included models you'll have
# to explicitly reference them. For example:
#
# User.includes(:posts).where('posts.name = ?', 'example')
#
# Will throw an error, but this will work:
#
# User.includes(:posts).where('posts.name = ?', 'example').references(:posts)
_END_QUOTE_
The docs mention another approach: http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html under the header "Eager loading of associations"
QUOTE:
If you do want eager load only some members of an association it is usually more natural to include an association which has conditions defined on it:
class Post < ActiveRecord::Base
has_many :approved_comments, -> { where approved: true }, class_name: 'Comment'
end
Post.includes(:approved_comments)
This will load posts and eager load the approved_comments association, which contains only those comments that have been approved.
END QUOTE
You can technically use such an approach, but in your case it may not be so useful if you are using dynamic month values.
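These are the only options. The conditional association loads the same inspections as your AND-based query; the WHERE form does too, except that it also leaves out cars that have no matching inspections.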
Related
For someone who is coming from a non ActiveRecord environment, complex queries are challenging. I know my way quite well in writing SQL's, however I'm having difficulties figuring out how to achieve certain queries in solely AREL. I tried figuring out the examples below by myself, but I can't seem to find the correct answers.
Here are some reasons as to why I'd opt for the AREL way instead of my current find_by_sql-way:
Cleaner code in my model.
Simpler code (when this query is used in combination with pagination because of chaining.)
More multi-db-compatibility (e.g. I'm used to GROUP BY topics.id instead of specifying all the columns I'm using in my SELECT clause).
Here is a simplified version of my models:
class Support::Forum < ActiveRecord::Base
has_many :topics
def self.top
Support::Forum.find_by_sql "SELECT forum.id, forum.title, forum.description, SUM(topic.replies_count) AS count FROM support_forums forum, support_topics topic WHERE forum.id = topic.forum_id AND forum.group = 'theme support' GROUP BY forum.id, forum.title, forum.description ORDER BY count DESC, id DESC LIMIT 4;"
end
def ordered_topics
Support::Topic.find_by_sql(["SELECT topics.* FROM support_forums forums, support_topics topics, support_replies replies WHERE forums.id = ? AND forums.id = topics.forum_id AND topics.id = replies.topic_id GROUP BY topics.id ORDER BY topics.pinned DESC, MAX(replies.id) DESC;", self.id])
end
def last_topic
Support::Topic.find_by_sql(["SELECT topics.id, topics.title FROM support_forums forums, support_topics topics, support_replies replies WHERE forums.id = ? AND forums.id = topics.forum_id AND topics.id = replies.topic_id GROUP BY topics.id, topics.title, topics.pinned ORDER BY MAX(replies.id) DESC LIMIT 1;", self.id]).first
end
end
class Support::Topic < ActiveRecord::Base
belongs_to :forum, counter_cache: true
has_many :replies
end
class Support::Reply < ActiveRecord::Base
belongs_to :topic, counter_cache: true
end
Whenever I can, I try to write stuff like this via AREL and not in SQL (for the reasons mentioned before), but I just can't get my head around the non-basic examples such as the ones above.
Fyi I'm not really looking for straight conversions of these methods to AREL, any directions or insight towards a solution are welcome.
Another remark: if you think it is perfectly acceptable to write these queries with an SQL finder, please share your thoughts.
Note: If I need to provide additional examples, please say so and I will :)
For anything that doesn't require custom joins or on clauses - i.e. can be mapped to AR relations - you might want to use squeel instead of arel. AREL is basically a heavyweight relational algebra DSL which you can use to write SQL queries from scratch in ruby. Squeel is more of a fancier DSL for active record queries that eliminates most cases where you would use SQL literal statements.
I've got a fairly complex sql query that I'm pretty sure I can't accomplish with ARel (Rails 3.0.10)
Check out the link, but it has a few joins and a where exists clause, and that I'm pretty sure is too complex for ARel.
My problem however is that, before this query was so complex, with ARel I could use includes to add other models that I needed to avoid n+1 issues. Now that I'm using find_by_sql, includes don't work. I still want to be able to fetch these records and attach them to my model instances, the way includes does, but I'm not quite sure how to achieve this.
Can someone point me in the right direction?
I haven't tried joining them in the same query yet. I'm just not sure how they would be mapped to objects (ie. if ActiveRecord would properly map them to the proper class)
I know that when using includes ActiveRecord actually makes a second query, then somehow attaches those rows to the corresponding instances from the original query. Can someone instruct me on how I might do this? Or do I need to join in the same query?
Let's pretend that the SQL really can't be reduced to Arel. Not everything can, and we happen to really really want to keep our custom find_by_sql but we also want to use includes.
Then preload_associations is your friend:
(Updated for Rails 3.1)
class Person
def self.custom_query
friends_and_family = find_by_sql("SELECT * FROM people")
# Rails 3.0 and lower use this:
# preload_associations(friends_and_family, [:car, :kids])
# Rails 3.1 and higher use this:
ActiveRecord::Associations::Preloader.new(friends_and_family, [:car, :kids]).run
friends_and_family
end
end
Note that the 3.1 method is much better, because you can apply the eager loading at any time. Thus you can fetch the objects in your controller, and then just before rendering, you can check the format and eager-load more associations. That's what happens for me: HTML doesn't need the eager loading, but the .json does.
That help?
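For anyone wondering what "a second query, then attach the rows" amounts to, here is a rough plain-Ruby sketch of the idea. It is not the Preloader's actual implementation: Person, Kid, KIDS_TABLE, fetch_kids_for and preload_kids are hypothetical names, and the array lookup stands in for a single SELECT with an IN list.

```ruby
# Hypothetical stand-ins for two models; :kids holds the preloaded children.
Person = Struct.new(:id, :name, :kids)
Kid = Struct.new(:id, :person_id, :name)

KIDS_TABLE = [
  Kid.new(1, 1, 'Cy'),
  Kid.new(2, 1, 'Di'),
  Kid.new(3, 3, 'Ed')
].freeze

# Stands in for: SELECT * FROM kids WHERE person_id IN (...)
def fetch_kids_for(person_ids)
  KIDS_TABLE.select { |kid| person_ids.include?(kid.person_id) }
end

# One extra lookup for all parents, then each parent gets its own rows.
def preload_kids(people)
  kids_by_person = fetch_kids_for(people.map(&:id)).group_by(&:person_id)
  people.each { |person| person.kids = kids_by_person.fetch(person.id, []) }
end

people = preload_kids([Person.new(1, 'Ann'), Person.new(2, 'Bob')])
p people.map { |person| [person.name, person.kids.map(&:name)] }
# => [["Ann", ["Cy", "Di"]], ["Bob", []]]
```

The point is that the parents can come from anywhere, including find_by_sql, because the attach step only needs their ids.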
I am pretty sure that you can do even the most complex queries with Arel. Maybe you are being over-skeptical about it.
Check these:
Rails 3: Arel for NOT EXISTS?
How to do "where exists" in Arel
@pedrorolo thanks for the heads up on that not exists arel query, helped me achieve what I needed. Here's the final solution (the key is the final .exists on the GroupChallenge query):
class GroupChallenge < ActiveRecord::Base
belongs_to :group
belongs_to :challenge
def self.challenges_for_contact(contact_id, group_id=nil)
group_challenges = GroupChallenge.arel_table
group_contacts = GroupContact.arel_table
challenges = Challenge.arel_table
groups = Group.arel_table
query = group_challenges.project(1).
join(group_contacts).on(group_contacts[:group_id].eq(group_challenges[:group_id])).
where(group_challenges[:challenge_id].eq(challenges[:id])).
where(group_challenges[:restrict_participants].eq(true)).
where(group_contacts[:contact_id].eq(contact_id))
query = query.join(groups).on(groups[:id].eq(group_challenges[:group_id])).where(groups[:id].eq(group_id)) if group_id
query
end
end
class Challenge < ActiveRecord::Base
def self.open_for_participant(contact_id, group_id = nil)
open.
joins("LEFT OUTER JOIN challenge_participants as cp ON challenges.id = cp.challenge_id AND cp.contact_id = #{contact_id.to_i}").
where(['cp.accepted != ? or cp.accepted IS NULL', false]).
where(GroupChallenge.challenges_for_contact(contact_id, group_id).exists.or(table[:open_to_all].eq(true)))
end
end
I've got a model (a Feature) that can have many Assets. These Assets each have an issue_date. I'm struggling with what seems like a simple ActiveRecord query to find all Features and their Assets with an issue_date of tomorrow, regardless of if there are Assets or not — preferably with one query.
Here's my query right now.
Feature.includes(:assets).where(:assets => { :issue_date => Date.tomorrow })
Unfortunately, this returns only the Features that have Assets with an issue_date of tomorrow. Even stranger, the generated SQL looks like this (tomorrow's obviously the 19th).
SELECT `features`.* FROM `features` WHERE `assets`.`issue_date` = '2011-08-19'
Shouldn't this have an LEFT JOIN in there somewhere? That's the sort of thing I'm going for. Using joins instead of includes does an INNER JOIN, but that's not what I want. Strangely enough, it seems like I'm getting an INNER JOIN-type of behavior. When I run that includes query above, the actual SQL that's spit out looks something like this...
SELECT `features`.`id` AS t0_r0, `features`.`property_id` AS t0_r1,
// every other column from features truncated for sanity
`assets`.`feature_id` AS t1_r1, `assets`.`asset_type` AS t1_r2,
// all other asset columns truncated for sanity
FROM `features`
LEFT OUTER JOIN `assets` ON `assets`.`feature_id` = `features`.`id`
WHERE `assets`.`issue_date` = '2011-08-19'
Which looks like it should work right but it doesn't. I get only the Features that have Assets with an issue_date of tomorrow. Any idea what I'm doing wrong?
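For what it's worth, this is how SQL treats a LEFT OUTER JOIN: a condition in WHERE is applied after the join, so the NULL-padded rows for Features without a matching Asset are discarded, while the same condition in ON is applied during the join and keeps them. The plain-Ruby simulation below is only a sketch of that difference; the structs, helper names and sample rows are invented for illustration.

```ruby
# Hypothetical stand-ins for rows of the two tables.
Feature = Struct.new(:id, :name)
Asset = Struct.new(:id, :feature_id, :issue_date)

# LEFT OUTER JOIN: each feature is paired with every asset that passes the
# ON test, or with nil when none does.
def left_outer_join(features, assets, &on_condition)
  features.flat_map do |feature|
    matches = assets.select { |a| a.feature_id == feature.id && on_condition.call(a) }
    matches.empty? ? [[feature, nil]] : matches.map { |a| [feature, a] }
  end
end

# Date test in WHERE: runs after the join, so NULL-padded rows are dropped.
def ids_with_where(features, assets, date)
  rows = left_outer_join(features, assets) { |_a| true }
  rows.select { |_f, a| a && a.issue_date == date }.map { |f, _a| f.id }.uniq
end

# Date test in ON: runs while joining, so every feature keeps a row.
def ids_with_on(features, assets, date)
  left_outer_join(features, assets) { |a| a.issue_date == date }.map { |f, _a| f.id }.uniq
end

features = [
  Feature.new(1, 'has an asset tomorrow'),
  Feature.new(2, 'has no assets'),
  Feature.new(3, 'has an asset on another day')
]
assets = [Asset.new(10, 1, '2011-08-19'), Asset.new(11, 3, '2011-08-25')]

p ids_with_where(features, assets, '2011-08-19') # => [1]
p ids_with_on(features, assets, '2011-08-19')    # => [1, 2, 3]
```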
I've tried the older, Rails v2 way of doing it…
Feature.find(:all,
:include => :assets,
:conditions => ['assets.issue_date = ?', Date.tomorrow])
Which gives me the same results. There's one Feature I know that doesn't have any Assets for tomorrow, and it's not in that list.
I've also poked around and found similar questions, but I couldn't seem to find one that explained this opposite behavior I'm seeing.
Edit: I'm so close. This gets me all the Feature objects.
Feature.joins("LEFT OUTER JOIN assets ON assets.feature_id = features.id AND assets.issue_date = '#{Date.tomorrow}'")
It does not, however, get me the matching Assets bundled into the object. With feature as a returned item in the query, feature.assets makes another call to the database, which I don't want. I want feature.assets to return only those I've specified in that LEFT OUTER JOIN call. What else do I need to do to my query?
I thought this would get me what I needed, but it doesn't. Calling feature.assets (with feature as an item returned in my query) does another query to look for all assets related to that feature.
Feature.joins("LEFT OUTER JOIN assets ON assets.feature_id = features.id AND assets.issue_date = '#{Date.tomorrow}'")
So here's what does work. Seems a little cleaner, too. My Feature model already has a has_many :assets set on it. I've set up another association with has_many :tomorrows_assets that points to Assets, but with a condition on it. Then, when I ask for Feature.all or Feature.name_of_scope, I can specify .includes(:tomorrows_assets). Winner winner, chicken dinner.
has_many :tomorrows_assets,
:class_name => "Asset",
:readonly => true,
:conditions => "issue_date = '#{Date.tomorrow.to_s}'"
I can successfully query Features and get just what I need included with it, only if it matches the specified criteria (and I've set :readonly because I know I'll never want to edit Assets like this). Here's an IRB session that shows the magic.
features = Feature.includes(:tomorrows_assets)
feature1 = features.find_all{ |f| f.name == 'This Feature Has Assets' }.first
feature1.tomorrows_assets
=> [#<Asset id:1>, #<Asset id:2>]
feature2 = features.find_all{ |f| f.name == 'This One Does Not' }.first
feature2.tomorrows_assets
=> []
And all in only two SQL queries.
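One thing worth checking with this approach: a double-quoted string in a class body is interpolated once, when the class is loaded, so in a long-running process the date in :conditions may go stale after midnight. The sketch below shows the difference between a string built once and one built on demand in plain Ruby. FakeCalendar is a hypothetical stand-in for Date so that the example can move time forward; wrapping the condition in a lambda, as in the has_many form quoted from the Rails docs elsewhere in this thread, is the usual way to get the on-demand behavior.

```ruby
require 'date'

# Hypothetical stand-in for the current date, so the example can advance it.
class FakeCalendar
  class << self
    attr_accessor :today

    def tomorrow
      today + 1
    end
  end
end

FakeCalendar.today = Date.new(2011, 8, 18)

# Built once, like a string passed to :conditions in a class body.
EAGER_CONDITION = "issue_date = '#{FakeCalendar.tomorrow}'"

# Rebuilt on every call, like a condition wrapped in a lambda.
LAZY_CONDITION = -> { "issue_date = '#{FakeCalendar.tomorrow}'" }

# A day passes while the process keeps running.
FakeCalendar.today = Date.new(2011, 8, 19)

puts EAGER_CONDITION     # issue_date = '2011-08-19' (now stale)
puts LAZY_CONDITION.call # issue_date = '2011-08-20'
```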
I had a very similar problem and managed to solve it using the following query;
Feature.includes(:assets).where('assets.id IS NULL OR assets.issue_date = ?', Date.tomorrow)
This will load all features, regardless of whether they have any assets. Calling feature.assets will return an array of assets, if available, without running another query.
Hope that helps someone!
You have to specify the SQL for outer joins yourself; the joins method only uses inner joins.
Feature.joins("LEFT OUTER JOIN assets ON assets.feature_id = features.id").
where(:assets => {:issue_date => Date.tomorrow})
Have you tried:
Feature.joins(:assets).where(:assets => { :issue_date => Date.tomorrow })
The guide here suggests the includes method is used to reduce the number of queries on a secondary table, rather than to join the two tables in the way you're attempting.
http://guides.rubyonrails.org/active_record_querying.html
See the updates at the bottom. I've narrowed this down significantly.
I've also created a barebones app demonstrating this bug: https://github.com/coreyward/bug-demo
And I've also created a bug ticket in the official tracker: https://rails.lighthouseapp.com/projects/8994/tickets/6611-activerecord-query-changing-when-a-dotperiod-is-in-condition-value
If someone can either tell me how to monkey-patch this or explain where this is happening in Rails I'd be very grateful.
I'm getting some bizarre/unexpected behavior. That'd lead me to believe either there is a bug (confirmation that this is a bug would be a perfect answer), or I am missing something that is right under my nose (or that I don't understand).
The Code
class Gallery < ActiveRecord::Base
belongs_to :portfolio
default_scope order(:ordinal)
end
class Portfolio < ActiveRecord::Base
has_many :galleries
end
# later, in a controller action
scope = Portfolio.includes(:galleries) # eager load galleries
if some_condition
@portfolio = scope.find_by_domain('domain.com')
else
@portfolio = scope.find_by_vanity_url('vanity_url')
end
I have Portfolios which can have multiple Galleries each.
The galleries have ordinal, vanity_url, and domain attributes.
The gallery ordinals are set as integers from zero on up. I've confirmed that this works as expected by checking Gallery.where(:portfolio_id => 1).map &:ordinal, which returns [0,1,2,3,4,5,6] as expected.
Both vanity_url and domain are t.string, :null => false columns with unique indexes.
The Problem
If some_condition is true and find_by_domain is run, the galleries returned do not respect the default scope. If find_by_vanity_url is run, the galleries are ordered according to the default scope. I looked at the queries being generated, and they are very different.
The Queries
# find_by_domain SQL: (edited out additional selected columns for brevity)
Portfolio Load (2.5ms) SELECT DISTINCT `portfolios`.id FROM `portfolios` LEFT OUTER JOIN `galleries` ON `galleries`.`portfolio_id` = `portfolios`.`id` WHERE `portfolios`.`domain` = 'lvh.me' LIMIT 1
Portfolio Load (0.4ms) SELECT `portfolios`.`id` AS t0_r0, `portfolios`.`vanity_url` AS t0_r2, `portfolios`.`domain` AS t0_r11, `galleries`.`id` AS t1_r0, `galleries`.`portfolio_id` AS t1_r1, `galleries`.`ordinal` AS t1_r6 FROM `portfolios` LEFT OUTER JOIN `galleries` ON `galleries`.`portfolio_id` = `portfolios`.`id` WHERE `portfolios`.`domain` = 'lvh.me' AND `portfolios`.`id` IN (1)
# find_by_vanity_url SQL:
Portfolio Load (0.4ms) SELECT `portfolios`.* FROM `portfolios` WHERE `portfolios`.`vanity_url` = 'cw' LIMIT 1
Gallery Load (0.3ms) SELECT `galleries`.* FROM `galleries` WHERE (`galleries`.portfolio_id = 1) ORDER BY ordinal
So the query generated by find_by_domain doesn't have an ORDER statement, hence things aren't being ordered as desired. My question is...
Why is this happening? What is prompting Rails 3 to generate different queries to these two columns?
Update
This is really strange. I've considered and ruled out all of the following:
Indexes on the columns
Reserved/special words in Rails
A column name collision between the tables (ie. domain being on both tables)
The field type, both in the DB and Schema
The "allow null" setting
The separate scope
I get the same behavior as find_by_vanity_url with location, phone, and title; I get the same behavior as find_by_domain with email.
Another Update
I've narrowed it down to when the parameter has a period (.) in the name:
find_by_something('localhost') # works fine
find_by_something('name_routed_to_127_0_0_1') # works fine
find_by_something('my_computer.local') # fails
find_by_something('lvh.me') #fails
I'm not familiar enough with the internals to say where the query formed might change based on the value of a WHERE condition.
The difference between the two strategies for eager loading is discussed in the comments here
https://github.com/rails/rails/blob/3-0-stable/activerecord/lib/active_record/association_preload.rb
From the documentation:
# The second strategy is to use multiple database queries, one for each
# level of association. Since Rails 2.1, this is the default strategy. In
# situations where a table join is necessary (e.g. when the +:conditions+
# option references an association's column), it will fallback to the table
# join strategy.
I believe that the dot in "foo.bar" is causing Active Record to think that you are putting a condition on a table outside of the originating model, which prompts the fallback to the table join strategy described in the documentation.
The first form runs two separate queries, one against the Person model and the second against the Item model.
Person.includes(:items).where(:name => 'fubar')
Person Load (0.2ms) SELECT "people".* FROM "people" WHERE "people"."name" = 'fubar'
Item Load (0.4ms) SELECT "items".* FROM "items" WHERE ("items".person_id = 1) ORDER BY items.ordinal
Because you run the second query against the Item model, it inherits the default scope where you specified order(:ordinal).
The second form, which eager loads with a full join, runs off the Person model and will not use the default scope of the association.
Person.includes(:items).where(:name => 'foo.bar')
Person Load (0.4ms) SELECT "people"."id" AS t0_r0, "people"."name" AS t0_r1,
"people"."created_at" AS t0_r2, "people"."updated_at" AS t0_r3, "items"."id" AS t1_r0,
"items"."person_id" AS t1_r1, "items"."name" AS t1_r2, "items"."ordinal" AS t1_r3,
"items"."created_at" AS t1_r4, "items"."updated_at" AS t1_r5 FROM "people" LEFT OUTER JOIN
"items" ON "items"."person_id" = "people"."id" WHERE "people"."name" = 'foo.bar'
That heuristic is a little buggy, but I can see how it came about: conditions can be supplied in several different forms, and the only way to be sure of catching all of them is to scan the completed "WHERE" conditions for a dot and fall back to the join strategy when one is found. It has been left that way because both strategies are functional. I would actually go as far as saying that the aberrant behavior is in the first query, not the second. If you would like the ordering to persist for this query, I recommend one of the following:
1) If you want the association to have an order by when it is called, then you can specify that with the association. Oddly enough, this is in the documentation, but I could not get it to work.
Source: http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html#method-i-has_many
class Person < ActiveRecord::Base
has_many :items, :order => 'items.ordinal'
end
2) Another method would be to just add the order statement to the query in question.
Person.includes(:items).where(:name => 'foo.bar').order('items.ordinal')
3) Along the same lines would be setting up a named scope
class Person < ActiveRecord::Base
has_many :items
scope :with_items, includes(:items).order('items.ordinal')
end
And to call that:
Person.with_items.where(:name => 'foo.bar')
This is issue #950 on the Rails GitHub project. It looks like implicit eager loading (which is what causes this bug) was deprecated in Rails 4.0 and removed in Rails 4.1. Instead, you explicitly tell Rails that you need a JOIN for the WHERE clause, e.g.:
Post.includes(:comments).where("comments.title = 'lol'").references(:comments)
If you desperately need this bug fixed in Rails 3.1.*, you can hack ActiveRecord::Relation#tables_in_string to be less aggressive in matching table names. I created a Gist of my (inelegant and slow) solution. This is the diff:
diff --git a/activerecord/lib/active_record/relation.rb b/activerecord/lib/active_record/relation.rb
index 30f1824..d7335f3 100644
--- a/activerecord/lib/active_record/relation.rb
+++ b/activerecord/lib/active_record/relation.rb
@@ -528,7 +528,13 @@ module ActiveRecord
return [] if string.blank?
# always convert table names to downcase as in Oracle quoted table names are in uppercase
# ignore raw_sql_ that is used by Oracle adapter as alias for limit/offset subqueries
- string.scan(/([a-zA-Z_][.\w]+).?\./).flatten.map{ |s| s.downcase }.uniq - ['raw_sql_']
+ candidates = string.scan(/([a-zA-Z_][.\w]+).?\./).flatten.map{ |s| s.downcase }.uniq - ['raw_sql_']
+ candidates.reject do |t|
+ s = string.partition(t).first
+ s.chop! if s.last =~ /['"]/
+ s.reverse!
+ s =~ /^\s*=/
+ end
end
end
end
It only works for my very specific case (Postgres and an equality condition), but maybe you can alter it to work for you.
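To see why a dotted value trips the table detection, the scan from the unpatched line in the diff can be run on its own. This is a minimal plain-Ruby sketch: the method body copies that line, with ActiveSupport's blank? swapped for a plain nil/empty check so it runs without Rails.

```ruby
# Copy of the unpatched scan from the diff, minus the ActiveSupport dependency.
def tables_in_string(string)
  return [] if string.nil? || string.strip.empty?
  string.scan(/([a-zA-Z_][.\w]+).?\./).flatten.map { |s| s.downcase }.uniq - ['raw_sql_']
end

p tables_in_string(%q{"people"."name" = 'fubar'})   # => ["people"]
p tables_in_string(%q{"people"."name" = 'foo.bar'}) # => ["people", "foo"]
```

In the second call the value's prefix "foo" is reported as a table name, which is what pushes the query onto the JOIN strategy.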
Say @news_writers is an array of records. I then want to use @news_writers to find all news items that are written by the news writers contained in @news_writers.
So I want something like this (but this is syntactically incorrect):
@news = News.find_all_by_role_id(@news_writers.id)
Note that
class Role < ActiveRecord::Base
has_many :news
end
and
class News < ActiveRecord::Base
belongs_to :role
end
Like ennen, I'm unsure what relationships your models are supposed to have. But in general, you can find all models with a column value from a given set like this:
News.all(:conditions => {:role_id => @news_writers.map(&:id)})
This will create a SQL query with a where condition like:
WHERE role_id IN (1, 10, 13, ...)
where the integers are the ids of the @news_writers.
I'm not sure if I understand you - @news_writers is a collection of Role models? If that assumption is correct, your association appears to be backwards - if these represent authors of news items, shouldn't News belong_to Role (being the author)?
At any rate, I would assume the most direct approach would be to use an iterator over #news_writers, calling on the association for each news_writer (like news_writer.news) in turn and pushing it into a separate variable.
Edit: Daniel Lucraft's suggestion is a much more elegant solution than the above.