Rails join - find records that don't have a join matching a specific condition - sql

In my Rails 2.2.2 app I have two tables/models joined like so:
School
has_many :licenses, :as => :licensable
License
belongs_to :licensable, :polymorphic => true
#important fields for this question:
#start_date: datetime
#end_date: datetime
If I want to search for all schools with a current license, it's simple enough:
licensed_schools = School.find(:all, :include => [:licenses], :conditions => ["licenses.start_date < ? and licenses.end_date > ?", Time.now, Time.now])
This finds me all the schools that have a valid license in the join table. So far so good.
However, if I want to find all the schools that don't have a valid license, it's more difficult (so far). For example, if I do
unlicensed_schools = School.find(:all, :include => [:licenses], :conditions => ["licenses.id is null or licenses.start_date > ? or licenses.end_date < ?", Time.now, Time.now])
then I get back any school that a) doesn't have any licenses at all (fine) or b) has at least one invalid license, including schools which have an old (invalid) license AND a new (valid) license.
In other words it's returning all schools that either have no license OR have one or more invalid licenses (regardless of whether they have a valid license as well). It should be returning all schools that have 0 valid licenses.
I can't quite figure out how to do this. Any help anyone?
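To make the difference between the two conditions concrete, here is a minimal in-memory sketch in plain Ruby. The School and License structs, the valid_at? method and the sample dates are hypothetical stand-ins, not the real models; the point is only that "has any invalid license" and "has no valid license" select different sets.

```ruby
# Hypothetical in-memory stand-ins for the License and School records.
License = Struct.new(:start_date, :end_date) do
  # A license is valid when `now` falls strictly inside its date range.
  def valid_at?(now)
    start_date < now && end_date > now
  end
end
School = Struct.new(:name, :licenses)

# What the OR-based conditions effectively ask for:
# no licenses at all, or ANY license that is invalid.
def schools_with_any_invalid(schools, now)
  schools.select do |s|
    s.licenses.empty? || s.licenses.any? { |l| !l.valid_at?(now) }
  end
end

# What is actually wanted: schools with NO valid license.
def schools_with_no_valid(schools, now)
  schools.select { |s| s.licenses.none? { |l| l.valid_at?(now) } }
end

def sample_schools
  old_license     = License.new(Time.utc(2018, 1, 1), Time.utc(2019, 1, 1))
  current_license = License.new(Time.utc(2020, 1, 1), Time.utc(2021, 1, 1))
  [
    School.new("none",    []),
    School.new("expired", [old_license]),
    School.new("renewed", [old_license, current_license]),
    School.new("current", [current_license])
  ]
end

now = Time.utc(2020, 6, 1)
p schools_with_any_invalid(sample_schools, now).map(&:name)
# => ["none", "expired", "renewed"]   ("renewed" should not be here)
p schools_with_no_valid(sample_schools, now).map(&:name)
# => ["none", "expired"]
```

The second method is the anti-join: a school qualifies only when the set of its valid licenses is empty.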

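You can use Arel for this with NOT EXISTS (Arel needs Rails 3 or later; the polymorphic column names are taken from the question):
licenses = License.arel_table
School.where(
License.where(
licenses[:licensable_id].eq(School.arel_table[:id]).
and(licenses[:licensable_type].eq("School")).
and(licenses[:start_date].lt(Time.now)).
and(licenses[:end_date].gt(Time.now))).arel.exists.not)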
Alternatively you can build a custom LEFT JOIN that matches valid licenses only, and check for NULLs:
licenses = License.arel_table
schools = School.arel_table
custom_join = schools.join(licenses, Arel::Nodes::OuterJoin).
on(licenses[:licensable_id].eq(schools[:id]).
and(licenses[:licensable_type].eq("School")).
and(licenses[:start_date].lt(Time.now)).
and(licenses[:end_date].gt(Time.now))).join_sources
School.joins(custom_join).where(licenses: {id: nil})
You can also skip Arel completely and build the LEFT JOIN by hand:
School.joins("LEFT JOIN licenses ON licenses.licensable_id = schools.id AND licenses.licensable_type = 'School' AND '#{Time.now.to_s(:db)}' BETWEEN licenses.start_date AND licenses.end_date").
where(licenses: {id: nil})
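A LEFT JOIN with a NULL check is often no faster than NOT EXISTS and can be slower, depending on the database and its indexes, so compare the query plans on your own data. You might still find it simpler to understand.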

This isn't very satisfactory (I'd really like to know how to do it with just SQL), but I ended up caching (with memcache) the ids of all schools that do have a valid license, and then just testing whether the school's id is in that list.
Like I say, I'd still like to see a pure SQL solution, particularly one that doesn't involve more nested queries. I've got a feeling I could do it with COALESCE() but haven't quite been able to figure it out.

How about this?
School.find_by_sql(["SELECT * FROM schools WHERE id NOT IN (SELECT licensable_id FROM licenses WHERE licensable_type = 'School' AND licenses.start_date < ? AND licenses.end_date > ?)", Time.now, Time.now])
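A subquery like this can also be written as NOT EXISTS with bind parameters, so that the timestamps are quoted by the database adapter instead of being interpolated into the string. The sketch below is only a sketch: unlicensed_schools_query is a hypothetical helper name, and it assumes the polymorphic columns licensable_id and licensable_type from the question. It only builds the array that find_by_sql accepts, so it runs without Rails.

```ruby
# Hypothetical helper: returns [sql, bind, bind], the array form accepted
# by find_by_sql, with one "?" placeholder per bind value.
def unlicensed_schools_query(now)
  sql = <<~SQL.gsub(/\s+/, " ").strip
    SELECT schools.* FROM schools
    WHERE NOT EXISTS (
      SELECT 1 FROM licenses
      WHERE licenses.licensable_id = schools.id
        AND licenses.licensable_type = 'School'
        AND licenses.start_date < ?
        AND licenses.end_date > ?
    )
  SQL
  [sql, now, now]
end

# Intended use inside the Rails app (not executed here):
#   School.find_by_sql(unlicensed_schools_query(Time.now))
puts unlicensed_schools_query(Time.now).first
```

NOT EXISTS is used here rather than NOT IN because NOT IN returns no rows at all if the subquery ever yields a NULL.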

Related

Get records with no related data using activerecord and RoR3?

I am making scopes for a model that looks something like this:
class PressRelease < ActiveRecord::Base
has_many :publications
end
What I want to get is all press_releases that do not have publications, but from a scope method, so it can be chained with other scopes. Any ideas?
Thanks!
NOTE: I know that there are methods like present? or any? and so on, but these methods do not return an ActiveRecord::Relation as a scope does.
NOTE: I am using RoR 3
Avoid eager_loading if you do not need it (it adds overhead). Also, there is no need for subselect statements.
scope :without_publications, -> { joins("LEFT OUTER JOIN publications ON publications.press_release_id = press_releases.id").where(publications: { id: nil }) }
Explanation and response to comments
My initial thoughts about eager loading overhead is that ActiveRecord would instantiate all the child records (publications) for each press release. Then I realized that the query will never return press release records with publications. So that is a moot point.
There are some points and observations to be made about the way ActiveRecord works. Some things I had previously learned from experience, and some things I learned exploring your question.
The query from includes(:publications).where(publications: {id: nil}) is actually different from my example. It will return all columns from the publications table in addition to the columns from press_releases. The publication columns are completely unnecessary because they will always be null. However, both queries ultimately result in the same set of PressRelease objects.
With the includes method, if you add any sort of limit, for example chaining .first, .last or .limit(), then ActiveRecord (4.2.4) will resort to executing two queries. The first query returns IDs, and the second query uses those IDs to get results. Using the SQL snippet method, ActiveRecord is able to use just one query. Here is an example of this from one of my applications:
Profile.includes(:positions).where(positions: { id: nil }).limit(5)
# SQL (0.8ms) SELECT DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "positions" ON "positions"."profile_id" = "profiles"."id" WHERE "positions"."id" IS NULL LIMIT 5
# SQL (0.8ms) SELECT "profiles"."id" AS t0_r0, ..., "positions"."end_year" AS t1_r11 FROM "profiles" LEFT OUTER JOIN "positions" ON "positions"."profile_id" = "profiles"."id" WHERE "positions"."id" IS NULL AND "profiles"."id" IN (107, 24, 7, 78, 89)
Profile.joins("LEFT OUTER JOIN positions ON positions.profile_id = profiles.id").where(positions: { id: nil }).limit(5)
# Profile Load (1.0ms) SELECT "profiles".* FROM "profiles" LEFT OUTER JOIN positions ON positions.profile_id = profiles.id WHERE "positions"."id" IS NULL LIMIT 5
Most importantly
eager_loading and includes were not intended to solve the problem at hand. And for this particular case I think you are much more aware of what is needed than ActiveRecord is. You can therefore make better decisions about how to structure the query.
You can do the following in your PressRelease:
scope :your_scope, -> { where('id NOT IN(select press_release_id from publications)') }
This will return all PressRelease records without publications.
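One caveat with NOT IN is worth checking before relying on it: if publications.press_release_id can ever be NULL (an assumption about the schema, not something the question states), the subquery contains a NULL and SQL's three-valued logic makes the whole predicate UNKNOWN for every non-matching row. The plain-Ruby sketch below models that rule; sql_not_in and ids_without_publications are hypothetical names used only for illustration.

```ruby
# Models SQL's three-valued logic for `value NOT IN (list)`.
# Returns true, false, or nil (standing in for SQL's UNKNOWN).
def sql_not_in(value, list)
  return true if list.empty?            # NOT IN over an empty set is TRUE
  return nil if value.nil?              # NULL compared to anything is UNKNOWN
  return false if list.include?(value)  # a definite match
  return nil if list.any?(&:nil?)       # a NULL in the list might have matched
  true
end

# A WHERE clause keeps a row only when the predicate is TRUE,
# so rows whose predicate is UNKNOWN (nil) are dropped.
def ids_without_publications(press_release_ids, publication_foreign_keys)
  press_release_ids.select { |id| sql_not_in(id, publication_foreign_keys) }
end

press_release_ids = [1, 2, 3]

# Every publication points at a press release: works as expected.
p ids_without_publications(press_release_ids, [1])      # => [2, 3]

# One publication has a NULL press_release_id: no rows come back.
p ids_without_publications(press_release_ids, [1, nil]) # => []
```

If the column is nullable, adding "WHERE press_release_id IS NOT NULL" to the subquery, or using NOT EXISTS, avoids the problem.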
Couple ways to do this, first one requires two db queries:
PressRelease.where.not(id: Publication.distinct.pluck(:press_release_id))
or if you don't want to hardcode the association's foreign key:
PressRelease.where.not(id: PressRelease.distinct.joins(:publications).pluck(:id))
Another one is to do a left join and pick those without associated elements - you get a relation object, but it will be tricky to work with it as it already has a join on it:
PressRelease.eager_load(:publications).where(publications: {id: nil})
Another one is to use the counter_cache feature. You will need to add a publications_count column to your press_releases table.
class Publication < ActiveRecord::Base
belongs_to :press_release, counter_cache: true
end
Rails will keep this column in sync with the number of records associated with the given model, so then you can simply do:
PressRelease.where(publications_count: [nil, 0])

SQL LEFT JOIN value NOT in either join column

I suspect this is a rather common scenario and may show my ineptitude as a DB developer, but here goes anyway ...
I have two tables: Profiles and HiddenProfiles and the HiddenProfiles table has two relevant foreign keys: profile_id and hidden_profile_id that store ids from the Profiles table.
As you can imagine, a user can hide another user (wherein his profile ID would be the profile_id in the HiddenProfiles table) or he can be hidden by another user (wherein his profile ID would be put in the hidden_profile_id column). Again, a pretty common scenario.
Desired Outcome:
I want to do a join (or to be honest, whatever would be the most efficient query) on the Profiles and HiddenProfiles table to find all the profiles that a given profile is both not hiding AND not hidden from.
In my head I thought it would be pretty straightforward, but the iterations I came up with kept seeming to miss one half of the problem. Finally, I ended up with something that looks like this:
SELECT "profiles".* FROM "profiles"
LEFT JOIN hidden_profiles hp1 on hp1.profile_id = profiles.id and (hp1.hidden_profile_id = 1)
LEFT JOIN hidden_profiles hp2 on hp2.hidden_profile_id = profiles.id and (hp2.profile_id = 1)
WHERE (hp1.hidden_profile_id is null) AND (hp2.profile_id is null)
Don't get me wrong, this "works" but in my heart of hearts I feel like there should be a better way. If in fact there is not, I'm more than happy to accept that answer from someone with more wisdom than myself on the matter. :)
And for what it's worth these are two RoR models sitting on a Postgres DB, so solutions tailored to those constraints are appreciated.
Models are as such:
class Profile < ActiveRecord::Base
...
has_many :hidden_profiles, dependent: :delete_all
scope :not_hidden_to_me, -> (profile) { joins("LEFT JOIN hidden_profiles hp1 on hp1.profile_id = profiles.id and (hp1.hidden_profile_id = #{profile.id})").where("hp1.hidden_profile_id is null") }
scope :not_hidden_by_me, -> (profile) { joins("LEFT JOIN hidden_profiles hp2 on hp2.hidden_profile_id = profiles.id and (hp2.profile_id = #{profile.id})").where("hp2.profile_id is null") }
scope :not_hidden, -> (profile) { self.not_hidden_to_me(profile).not_hidden_by_me(profile) }
...
end
class HiddenProfile < ActiveRecord::Base
belongs_to :profile
belongs_to :hidden_profile, class_name: "Profile"
end
So to get the profiles I want I'm doing the following:
Profile.not_hidden(given_profile)
And again, maybe this is fine, but if there's a better way I'll happily take it.
If you want to get this list just for a single profile, I would implement an instance method to perform effectively the same query in ActiveRecord. The only modification I made is to perform a single join onto a union of subqueries and to apply the conditions on the subqueries. This should reduce the columns that need to be loaded into memory, and hopefully be faster (you'd need to benchmark against your data to be sure):
class Profile < ActiveRecord::Base
def visible_profiles
Profile.joins("LEFT OUTER JOIN (
SELECT profile_id p_id FROM hidden_profiles WHERE hidden_profile_id = #{id}
UNION ALL
SELECT hidden_profile_id p_id FROM hidden_profiles WHERE profile_id = #{id}
) hp ON hp.p_id = profiles.id").where("hp.p_id IS NULL")
end
end
Since this method returns an ActiveRecord scope, you can chain additional conditions if desired:
Profile.find(1).visible_profiles.where("created_at > ?", Time.new(2015,1,1)).order(:name)
Personally I've never liked the join = null approach. I find it counter intuitive. You're asking for a join, and then limiting the results to records that don't match.
I'd approach it more as
SELECT id FROM profiles p
WHERE
NOT EXISTS
(SELECT * FROM hidden_profiles hp1
WHERE hp1.hidden_profile_id = 1 and hp1.profile_id = p.id)
AND
NOT EXISTS (SELECT * FROM hidden_profiles hp2
WHERE hp2.hidden_profile_id = p.id and hp2.profile_id = 1)
But you're going to need to run some EXPLAINs with realistic volumes to be sure which works best.
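The two NOT EXISTS checks amount to "drop every profile that is linked to me in either direction". A minimal in-memory sketch of that rule, in plain Ruby, may make it easier to test the intended result set against sample data. HiddenProfileRow and visible_profile_ids are hypothetical names, and the rows are made up.

```ruby
require "set"

# Hypothetical in-memory rows standing in for the hidden_profiles table.
HiddenProfileRow = Struct.new(:profile_id, :hidden_profile_id)

# Returns the ids from `profile_ids` that have no hidden_profiles row
# linking them to `me_id` in either direction.
def visible_profile_ids(profile_ids, hidden_rows, me_id)
  blocked = Set.new
  hidden_rows.each do |row|
    # Rows where the other profile is hiding me.
    blocked << row.profile_id        if row.hidden_profile_id == me_id
    # Rows where I am hiding the other profile.
    blocked << row.hidden_profile_id if row.profile_id == me_id
  end
  profile_ids.reject { |id| blocked.include?(id) }
end

rows = [
  HiddenProfileRow.new(2, 1), # profile 2 has hidden profile 1
  HiddenProfileRow.new(1, 3)  # profile 1 has hidden profile 3
]
p visible_profile_ids([1, 2, 3, 4], rows, 1) # => [1, 4]
```

As in the SQL versions, the given profile itself stays in the result unless it is filtered out separately.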

Rails: Many to one ( 0 - n ) , finding records

I've got tables items and cards where a card belongs to a user and a item may or may not have any cards for a given user.
The basic associations are set up as follows:
Class User
has_many :cards
Class Item
has_many :cards
Class Card
belongs_to :user
has_and_belongs_to_many :items
I've also created a join table, items_cards with the columns item_id and card_id. I'd like to make a query that tells me if there's a card for a given user/item. In pure SQL I can accomplish this pretty easily:
SELECT count(id)
FROM cards
JOIN items_cards
ON items_cards.card_id = cards.id
WHERE cards.user_id = ?
AND items_cards.item_id = ?
I'm looking for some guidance as to how I'd go about doing this via ActiveRecord. Thanks!
Assuming you have an Item in @item and a User in @user, this will return 'true' if a card exists for that user and that item:
Card.joins(:items).where('cards.user_id = :user_id and items.id = :item_id', :user_id => @user, :item_id => @item).exists?
Here's what's going on:
Card. - You're making a query about the Card model.
joins(:items) - Rails knows how to put together joins for the association types it supports (usually - at least). You're telling it to do whatever joins are required to allow you to query the associated items as well. This will, in this case, result in JOIN items_cards ON items_cards.card_id = cards.id JOIN items ON items_cards.item_id = items.id.
where('cards.user_id = :user_id and items.id = :item_id', :user_id => @user, :item_id => @item) - Your conditional, pretty much the same as in pure SQL. Rails will interpolate the values you specify with a colon (:user_id) using the values in the hash (:user_id => @user). If you give an ActiveRecord object as the value, Rails will automatically use the id of that object. Here, you're saying you only want results where the card belongs to the user you specify, and there is a row for the item you want.
.exists? - Loading ActiveRecord objects is inefficient, so if you only want to know whether something exists, Rails can save some time by asking the database for at most one matching row (a SELECT 1 ... LIMIT 1 style query) instead of instantiating records. There's also a .count, which you could use instead if you wanted the query to return the number of results, as your SQL version does, rather than true or false.

ActiveRecord: Adding condition to ON clause for includes

I have a model offers and another historical_offers, one offer has_many historical_offers.
Now I would like to eager load the historical_offers of one given day for a set of offers, if it exists. For this, I think I need to pass the day to the ON clause, not the WHERE clause, so that I get all offers, also when there is no historical_offer for the given day.
With
Offer.where(several_complex_conditions).includes(:historical_offers).where("historical_offers.day = ?", Date.today)
I would get
SELECT * FROM offers
LEFT OUTER JOIN historical_offers
ON offers.id = historical_offers.offer_id
WHERE day = '2012-11-09' AND ...
But I want to have the condition in the ON clause, not in the WHERE clause:
SELECT * FROM offers
LEFT OUTER JOIN historical_offers
ON offers.id = historical_offers.offer_id AND day = '2012-11-09'
WHERE ...
I guess I could alter the has_many definition with a lambda condition for a specific date, but how would I pass in a date then?
Alternatively I could write the joins myself like this:
Offer.where(several_complex_conditions)
.joins(["historical_offers ON offers.id = historical_offers.offer_id AND day = ?", Date.today])
But how can I hook this up so that eager loading is done?
After a few hours of head-scratching and trying all sorts of ways to accomplish eager loading of a constrained set of associated records, I came across @dbenhur's answer in this thread, which works fine for me. However, the condition isn't something I'm passing in (it's a date relative to Date.today). Basically it creates an association whose has_many conditions are the ones I wanted to put into the LEFT JOIN ON clause.
has_many :prices, order: "rate_date"
has_many :future_valid_prices,
class_name: 'Price',
conditions: ['rate_date > ? and rate is not null', Date.today-7.days]
And then in my controller:
@property = current_agent.properties.includes(:future_valid_prices).find_by_id(params[:id])
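One thing that may be worth checking with this approach: an expression such as Date.today-7.days written directly in the has_many options is ordinary Ruby, so it is evaluated when the class body is loaded, not on each query. In a long-running process the cutoff could therefore go stale. The plain-Ruby sketch below shows the difference between load-time and call-time evaluation; Clock and PriceRules are hypothetical names, and a settable clock replaces Date.today so the effect is visible immediately.

```ruby
require "date"

# Hypothetical settable clock standing in for Date.today.
module Clock
  class << self
    attr_accessor :today
  end
end
Clock.today = Date.new(2020, 1, 10)

class PriceRules
  # Evaluated ONCE, when the class body is loaded
  # (like an expression placed directly in association options).
  FROZEN_CUTOFF = Clock.today - 7

  # Evaluated on every call (like the body of a lambda).
  def self.fresh_cutoff
    Clock.today - 7
  end
end

# Time passes while the process keeps running.
Clock.today = Date.new(2020, 2, 1)

puts PriceRules::FROZEN_CUTOFF # => 2020-01-03 (stale)
puts PriceRules.fresh_cutoff   # => 2020-01-25
```

Later Rails versions (4 and up) accept a lambda as the association scope, which defers the date calculation to query time.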

SQL: how to find a complement to a set with a derived function/value

This one has me stumped, so I'm hoping someone who's smarter than me can help me out.
I'm working on a rails project in which I've got a User model which has an association of clock_periods joined to it, having the following partial definition:
User
has_many :clock_periods
#clock_periods has the following properties:
#clock_in_time:datetime
#clock_out_time:datetime
named_scope :clocked_in, :select => "users.*",
:joins => :clock_periods, :conditions => 'clock_periods.clock_out_time IS NULL'
def clocked_in?
#default scope on clock periods sorts by date
clock_periods.last.clock_out_time.nil?
end
The SQL query to retrieve all clocked in users is trivial:
SELECT users.* FROM users INNER JOIN clock_periods ON clock_periods.user_id = users.id
WHERE clock_periods.clock_out_time IS NULL
The converse, however (finding all users who are currently clocked out), is deceptively difficult. I ended up using the following named scope definition, though it's hackish:
named_scope :clocked_out, lambda{{
:conditions => ["users.id not in (?)", clocked_in.map(&:id)+ [-1]]
}}
What bothers me about it is that it seems like there ought to be a way to do this in SQL without resorting to generating statements like
SELECT users.* FROM users WHERE users.id NOT IN (1,3,5)
Anybody got a better way, or is this really the only way to handle it?
Besides @Eric's suggestion there's the issue (unless you've guaranteed against it in some other way you're not showing us) that a user might not have any clock period. In that case the inner join would fail to include that user, and he wouldn't show either as clocked in or as clocked out. Assuming you also want to show those users as clocked out, the SQL should be something like:
SELECT users.*
FROM users
LEFT JOIN clock_periods ON clock_periods.user_id = users.id
WHERE (clock_periods.user_id IS NULL) OR
(getdate() BETWEEN clock_periods.clock_out_time AND
clock_periods.clock_in_time)
(this kind of thing is the main use of outer joins such as LEFT JOIN).
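The intended result set can be pinned down with a small in-memory sketch in plain Ruby, under the question's own definition that a user is clocked in when some period has no clock_out_time. ClockPeriodRow and clocked_out_user_ids are hypothetical names and the rows are made up.

```ruby
# Hypothetical in-memory rows standing in for the clock_periods table.
ClockPeriodRow = Struct.new(:user_id, :clock_in_time, :clock_out_time)

# A user counts as clocked in when some period has no clock_out_time.
# Everyone else, including users with no periods at all, is clocked out.
def clocked_out_user_ids(user_ids, periods)
  clocked_in = periods.select { |p| p.clock_out_time.nil? }.map(&:user_id)
  user_ids - clocked_in
end

periods = [
  ClockPeriodRow.new(1, Time.utc(2020, 1, 1, 9), nil),                     # still open
  ClockPeriodRow.new(2, Time.utc(2020, 1, 1, 9), Time.utc(2020, 1, 1, 17)) # closed
]

# User 3 has never clocked in, so has no rows at all.
p clocked_out_user_ids([1, 2, 3], periods) # => [2, 3]
```

In SQL the same anti-join could be written as a LEFT JOIN on clock_periods.user_id = users.id AND clock_periods.clock_out_time IS NULL, keeping only the rows where the joined side is NULL; that form avoids building a list of ids in Ruby.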
Assuming getdate() is the function in your SQL implementation that returns a datetime representing right now:
SELECT users.* FROM users INNER JOIN clock_periods ON clock_periods.user_id = users.id
WHERE getdate() > clock_periods.clock_out_time and getdate() < clock_periods.clock_in_time
In rails, Eric H's answer should look something like:
users = ClockPeriod.find(:all, :select => 'users.*', :include => :user,
:conditions => ['? > clock_periods.clock_out_time AND ? < clock_periods.clock_in_time',
Time.now, Time.now])
At least, I think that would work...