Rails/SQL: Find parents that have no child or have all children with a condition

I need to find parents that either have no child OR have all children exclusively with condition (status = 1).
class Parent < ApplicationRecord
  has_many :children
end
class Child < ApplicationRecord
  enum status: [ :confirmed, :not_confirmed ]
  belongs_to :parent
end
I know the first part, which is finding parents with no children.
Parent.joins(:children).where('count(children) = 0')
Rails answer.

Since you're using Postgres, you can use a NOT EXISTS query:
# Parents with no children
Parent.where.not('exists (?)', Child.where('children.parent_id = parents.id').select(1))
This query tends to perform well: an EXPLAIN will show you that Postgres handles it with an anti join operation (a Nested Loop Anti Join in this case).

Here is a solution in Rails:
grp = Parent.left_outer_joins(:children).distinct
models = grp.where('children.id IS NULL').or(grp.where('children.status = 1'))
Basically, you need to use a LEFT OUTER JOIN (see left_outer_joins in the Rails 5 reference).
I do not think your example for the first part works. It would raise an error like
# ActiveRecord::StatementInvalid (PG::GroupingError: ERROR:
# aggregate functions are not allowed in WHERE
and, besides, Rails' joins produces an INNER JOIN.
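For the combined requirement, note that a joined condition such as children.status = 1 matches any parent with at least one such child, not only parents whose children all have that status. One way to think about the full rule is "there is no child whose status differs from 1", which also covers parents without children. The following plain-Ruby sketch models that rule on in-memory data; ParentRow, ChildRow and childless_or_all_status are hypothetical names used only for illustration, and no database is involved.

```ruby
# Plain-Ruby model of the intended result set (no database involved).
# ParentRow and ChildRow are hypothetical stand-ins for the two tables.
ParentRow = Struct.new(:id)
ChildRow  = Struct.new(:id, :parent_id, :status)

parents = [ParentRow.new(1), ParentRow.new(2), ParentRow.new(3)]
children = [
  ChildRow.new(10, 2, 1),
  ChildRow.new(11, 2, 1),
  ChildRow.new(12, 3, 1),
  ChildRow.new(13, 3, 0)
]

# Equivalent of: NOT EXISTS (a child of this parent whose status is not `status`).
# A parent without children has no offending child, so it is kept as well.
def childless_or_all_status(parents, children, status)
  parents.reject do |parent|
    children.any? { |c| c.parent_id == parent.id && c.status != status }
  end
end

matching_ids = childless_or_all_status(parents, children, 1).map(&:id)
```

In SQL, the same idea would be a NOT EXISTS subquery over children with status <> 1, assuming status is never NULL.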

Related

Rails - scope for records that are not in a join table alongside a specific association

I have two models in a Rails app - Tournament and Player associated through a join table:
class Tournament < ApplicationRecord
  has_many :tournament_players
  has_many :players, through: :tournament_players
end
class Player < ApplicationRecord
  has_many :tournament_players
  has_many :tournaments, through: :tournament_players
  scope :selected, -> (tournament) { includes(:tournaments).where(tournaments: {id: tournament.id}) }
end
I have lots of Tournaments, and each one can have lots of Players. Players can play in lots of Tournaments. The scope
scope :selected, -> (tournament) { includes(:tournaments).where(tournaments: {id: tournament.id}) }
successfully finds all the players already added to a tournament, given that tournament as an argument.
What I'd like is a scope that does the opposite - returns all the players not yet added to a given tournament. I've tried
scope :not_selected, -> (tournament) { includes(:tournaments).where.not(tournaments: {id: tournament.id}) }
but that returns many of the same players, I think because the players exist as part of other tournaments. The SQL for that looks something like:
SELECT "players".*, "tournaments".* FROM "players" LEFT OUTER JOIN
"tournament_players" ON "tournament_players"."player_id" =
"players"."id" LEFT OUTER JOIN "tournaments" ON "tournaments"."id" =
"tournament_players"."tournament_id" WHERE ("tournaments"."id" != $1)
ORDER BY "players"."name" ASC [["id", 22]]
I've also tried the suggestions on this question - using
scope :not_selected, -> (tournament) { includes(:tournaments).where(tournaments: {id: nil}) }
but that doesn't seem to work - it just returns an empty array, again I think because the Players exist in the join table as part of a separate Tournament. The SQL for that looks something like:
SELECT "players".*, "tournaments".* FROM "players" LEFT OUTER JOIN
"tournament_players" ON "tournament_players"."player_id" =
"players"."id" LEFT OUTER JOIN "tournaments" ON "tournaments"."id" =
"tournament_players"."tournament_id" WHERE "tournaments"."id" IS NULL
ORDER BY "players"."name" ASC
What you need to do is:
Make a left join with the reference table, with an additional condition on the tournament ID matching the one that you want to find the not-selected players for
Apply a WHERE clause indicating that there was no JOIN made.
This code should do it:
# player.rb
scope :not_selected, -> (tournament) do
  joins("LEFT JOIN tournament_players tp ON players.id = tp.player_id AND tp.tournament_id = #{tournament.id}").where(tp: {tournament_id: nil})
end
If only Rails had a nicer way to write LEFT JOIN queries with additional conditions...
A few notes:
Don't join the actual relation (i.e. Tournament). It dramatically decreases the performance of your query, and it's unnecessary, because everything your condition needs is inside the reference table. Besides, all the rows you're interested in return NULL data from the tournaments table.
Don't use eager_load. Besides the fact that, to the best of my knowledge, it doesn't support custom conditions, it would create models for all related objects, which you don't need.
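To see why the tournament condition has to sit in the ON clause rather than in WHERE, here is a small plain-Ruby model of a LEFT JOIN over in-memory rows. The data and the left_join helper are hypothetical and only mimic the SQL semantics; this is a sketch, not how ActiveRecord runs the query.

```ruby
# Hypothetical in-memory rows; no database involved.
players = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }, { id: 3, name: "Cy" }]
tournament_players = [
  { player_id: 1, tournament_id: 22 },
  { player_id: 1, tournament_id: 23 },
  { player_id: 2, tournament_id: 23 }
]

# LEFT JOIN: every left row appears at least once; unmatched rows pair with nil.
# The block plays the role of the ON clause.
def left_join(left, right)
  left.flat_map do |l|
    matches = right.select { |r| yield(l, r) }
    matches.empty? ? [[l, nil]] : matches.map { |r| [l, r] }
  end
end

# Tournament filter inside ON, then WHERE tp.tournament_id IS NULL.
with_on = left_join(players, tournament_players) do |player, tp|
  player[:id] == tp[:player_id] && tp[:tournament_id] == 22
end
not_selected = with_on.select { |_, tp| tp.nil? }.map { |player, _| player[:name] }

# Tournament filter in WHERE instead (tournament_id != 22), as in the question.
# A nil right side models NULL, for which the comparison is not true.
plain = left_join(players, tournament_players) do |player, tp|
  player[:id] == tp[:player_id]
end
where_only = plain.select { |_, tp| tp && tp[:tournament_id] != 22 }
                  .map { |player, _| player[:name] }
```

With this sample data, the ON-clause version keeps only the players without a row for tournament 22, while the WHERE version still returns Ann (through her other tournament) and drops Cy, who has no rows at all.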
ok try this:
includes(:tournaments).distinct.where.not(tournaments: {id: tournament.id})

Get records with no related data using activerecord and RoR3?

I am making scopes for a model that looks something like this:
class PressRelease < ActiveRecord::Base
  has_many :publications
end
What I want to get is all press_releases that do not have publications, but from a scope method, so it can be chained with other scopes. Any ideas?
Thanks!
NOTE: I know that there are methods like present? or any? and so on, but these methods do not return an ActiveRecord::Relation as a scope does.
NOTE: I am using RoR 3
Avoid eager_loading if you do not need it (it adds overhead). Also, there is no need for subselect statements.
scope :without_publications, -> { joins("LEFT OUTER JOIN publications ON publications.press_release_id = press_releases.id").where(publications: { id: nil }) }
Explanation and response to comments
My initial thought about eager-loading overhead was that ActiveRecord would instantiate all the child records (publications) for each press release. Then I realized that the query will never return press release records with publications, so that is a moot point.
There are some points and observations to be made about the way ActiveRecord works. Some things I had previously learned from experience, and some things I learned exploring your question.
The query from includes(:publications).where(publications: {id: nil}) is actually different from my example. It will return all columns from the publications table in addition to the columns from press_releases. The publication columns are completely unnecessary because they will always be null. However, both queries ultimately result in the same set of PressRelease objects.
With the includes method, if you add any sort of limit, for example chaining .first, .last or .limit(), then ActiveRecord (4.2.4) will resort to executing two queries. The first query returns IDs, and the second query uses those IDs to get results. Using the SQL snippet method, ActiveRecord is able to use just one query. Here is an example of this from one of my applications:
Profile.includes(:positions).where(positions: { id: nil }).limit(5)
# SQL (0.8ms) SELECT DISTINCT "profiles"."id" FROM "profiles" LEFT OUTER JOIN "positions" ON "positions"."profile_id" = "profiles"."id" WHERE "positions"."id" IS NULL LIMIT 5
# SQL (0.8ms) SELECT "profiles"."id" AS t0_r0, ..., "positions"."end_year" AS t1_r11 FROM "profiles" LEFT OUTER JOIN "positions" ON "positions"."profile_id" = "profiles"."id" WHERE "positions"."id" IS NULL AND "profiles"."id" IN (107, 24, 7, 78, 89)
Profile.joins("LEFT OUTER JOIN positions ON positions.profile_id = profiles.id").where(positions: { id: nil }).limit(5)
# Profile Load (1.0ms) SELECT "profiles".* FROM "profiles" LEFT OUTER JOIN positions ON positions.profile_id = profiles.id WHERE "positions"."id" IS NULL LIMIT 5
Most importantly
eager_loading and includes were not intended to solve the problem at hand. And for this particular case I think you are much more aware of what is needed than ActiveRecord is. You can therefore make better decisions about how to structure the query.
You can do the following in your PressRelease model:
scope :your_scope, -> { where('id NOT IN(select press_release_id from publications)') }
This will return all PressRelease records without publications.
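One caveat with NOT IN, as far as standard SQL semantics go: if the subquery returns any NULL (for example, a publication row whose press_release_id is NULL), the comparison is never true and no rows come back. Below is a rough plain-Ruby model of that three-valued logic; sql_not_in is a hypothetical helper written for this illustration, with nil standing in for NULL/UNKNOWN.

```ruby
# Models SQL's three-valued `value NOT IN (list)`: true, false, or nil (UNKNOWN).
def sql_not_in(value, list)
  return nil if value.nil?               # NULL compared with anything is UNKNOWN
  return false if list.include?(value)   # a definite match, so NOT IN is false
  return nil if list.include?(nil)       # no match, but a NULL leaves it UNKNOWN
  true
end

press_release_ids = [1, 2, 3]
fk_values = [1, nil] # one publication row has a NULL press_release_id

# WHERE keeps only rows whose condition is exactly true.
kept = press_release_ids.select { |id| sql_not_in(id, fk_values) == true }
kept_without_null = press_release_ids.select { |id| sql_not_in(id, fk_values.compact) == true }
```

Adding a press_release_id IS NOT NULL filter to the subquery, or switching to NOT EXISTS, avoids the problem.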
There are a couple of ways to do this. The first one requires two db queries:
PressRelease.where.not(id: Publication.uniq.pluck(:press_release_id))
or, if you don't want to hardcode the association foreign key:
PressRelease.where.not(id: PressRelease.uniq.joins(:publications).pluck(:id))
Another one is to do a left join and pick those without associated elements - you get a relation object, but it will be tricky to work with it as it already has a join on it:
PressRelease.eager_load(:publications).where(publications: {id: nil})
Another one is to use the counter_cache feature. You will need to add a publications_count column to your press_releases table.
class Publication < ActiveRecord::Base
  belongs_to :press_release, counter_cache: true
end
Rails will keep this column in sync with the number of records associated with the given model, so then you can simply do:
PressRelease.where(publications_count: [nil, 0])

LEFT OUTER JOIN in Rails 4

I have 3 models:
class Student < ActiveRecord::Base
  has_many :student_enrollments, dependent: :destroy
  has_many :courses, through: :student_enrollments
end
class Course < ActiveRecord::Base
  has_many :student_enrollments, dependent: :destroy
  has_many :students, through: :student_enrollments
end
class StudentEnrollment < ActiveRecord::Base
  belongs_to :student
  belongs_to :course
end
I wish to query for a list of courses in the Courses table, that do not exist in the StudentEnrollments table that are associated with a certain student.
I found that perhaps a LEFT JOIN is the way to go, but it seems that joins() in Rails only accepts a table as an argument.
The SQL query that I think would do what I want is:
SELECT *
FROM Courses c LEFT JOIN StudentEnrollment se ON c.id = se.course_id
WHERE se.id IS NULL AND se.student_id = <SOME_STUDENT_ID_VALUE> and c.active = true
How do I execute this query the Rails 4 way?
Any input is appreciated.
You can pass a string that is the join-sql too. eg joins("LEFT JOIN StudentEnrollment se ON c.id = se.course_id")
Though I'd use rails-standard table naming for clarity:
joins("LEFT JOIN student_enrollments ON courses.id = student_enrollments.course_id")
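Note that the WHERE clause in the question asks for se.id IS NULL and se.student_id = <value> at the same time, which no row can satisfy: when the join finds no enrollment, every se column is NULL. Moving the student filter into the ON clause is one way around this. Below is a minimal sketch of building such a join string; enrollment_join_sql is a hypothetical helper, and it assumes the student id is an integer.

```ruby
# Builds a LEFT JOIN fragment with the student filter inside the ON clause.
# Integer() raises ArgumentError on non-numeric strings, so only a number
# can end up interpolated into the SQL.
def enrollment_join_sql(student_id)
  "LEFT JOIN student_enrollments " \
  "ON student_enrollments.course_id = courses.id " \
  "AND student_enrollments.student_id = #{Integer(student_id)}"
end

join_sql = enrollment_join_sql(42)
# Assumed usage inside a Rails app (not executed here):
#   Course.joins(join_sql).where(student_enrollments: { id: nil }, active: true)
```

With the filter in ON, the remaining WHERE only has to check that no enrollment row was matched (student_enrollments.id IS NULL) and that the course is active.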
If anyone came here looking for a generic way to do a left outer join in Rails 5, you can use the #left_outer_joins function.
Multi-join example:
Ruby:
Source.
  select('sources.id', 'count(metrics.id)').
  left_outer_joins(:metrics).
  joins(:port).
  where('ports.auto_delete = ?', true).
  group('sources.id').
  having('count(metrics.id) = 0').
  all
SQL:
SELECT sources.id, count(metrics.id)
FROM "sources"
INNER JOIN "ports" ON "ports"."id" = "sources"."port_id"
LEFT OUTER JOIN "metrics" ON "metrics"."source_id" = "sources"."id"
WHERE (ports.auto_delete = 't')
GROUP BY sources.id
HAVING (count(metrics.id) = 0)
ORDER BY "sources"."id" ASC
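The HAVING clause counts metrics.id rather than using COUNT(*) for a reason: after a LEFT OUTER JOIN, a source without metrics still produces one row, with NULLs in the metrics columns, and COUNT(column) skips NULLs while COUNT(*) does not. A plain-Ruby sketch of that difference, on made-up rows and leaving out the ports join:

```ruby
# Hypothetical in-memory rows; no database involved.
sources = [{ id: 1 }, { id: 2 }]
metrics = [{ id: 100, source_id: 1 }, { id: 101, source_id: 1 }]

# Rows produced by the LEFT OUTER JOIN: unmatched sources get a nil metric.
joined = sources.flat_map do |source|
  matches = metrics.select { |m| m[:source_id] == source[:id] }
  matches.empty? ? [[source, nil]] : matches.map { |m| [source, m] }
end

# GROUP BY sources.id
grouped = joined.group_by { |source, _| source[:id] }

# COUNT(*) counts joined rows; COUNT(metrics.id) counts only non-NULL metrics.
count_star      = grouped.transform_values(&:size)
count_metric_id = grouped.transform_values { |rows| rows.count { |_, metric| metric } }

# HAVING count(metrics.id) = 0
without_metrics = count_metric_id.select { |_, n| n.zero? }.keys
```

Here source 2 has no metrics, yet COUNT(*) would report 1 for it, so a HAVING on COUNT(*) = 0 would never match.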
There is actually a "Rails Way" to do this.
You could use Arel, which is what Rails uses to construct queries for ActiveRecord.
I would wrap it in a method so that you can call it nicely and pass in whatever argument you would like, something like:
class Course < ActiveRecord::Base
  ....
  # Class method, so that joins is called on the Course relation.
  def self.left_join_student_enrollments(some_user)
    courses = Course.arel_table
    student_enrollments = StudentEnrollment.arel_table
    enrollments = courses.join(student_enrollments, Arel::Nodes::OuterJoin).
                  on(courses[:id].eq(student_enrollments[:course_id])).
                  join_sources
    joins(enrollments).where(
      student_enrollments: {student_id: some_user.id, id: nil},
      active: true
    )
  end
  ....
end
There is also the quick (and slightly dirty) way that many use
Course.eager_load(:students).where(
  student_enrollments: {student_id: some_user.id, id: nil},
  active: true
)
eager_load works great; it just has the "side effect" of loading models into memory that you might not need (like in your case).
Please see Rails ActiveRecord::QueryMethods .eager_load
It does exactly what you are asking in a neat way.
Combining includes and where results in ActiveRecord performing a LEFT OUTER JOIN behind the scenes (without the where this would generate the normal set of two queries).
So you could do something like:
Course.includes(:student_enrollments).where(student_enrollments: { course_id: nil })
Docs here: http://guides.rubyonrails.org/active_record_querying.html#specifying-conditions-on-eager-loaded-associations
Adding to the answer above, to use includes, if you want an OUTER JOIN without referencing the table in the where (like id being nil) or the reference is in a string you can use references. That would look like this:
Course.includes(:student_enrollments).references(:student_enrollments)
or
Course.includes(:student_enrollments).references(:student_enrollments).where('student_enrollments.id IS NULL')
http://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-references
You'd execute the query as:
Course.joins('LEFT JOIN student_enrollments ON courses.id = student_enrollments.course_id')
  .where(active: true, student_enrollments: { student_id: SOME_VALUE, id: nil })
I know that this is an old question and an old thread but in Rails 5, you could simply do
Course.left_outer_joins(:student_enrollments)
You could use left_joins gem, which backports left_joins method from Rails 5 for Rails 4 and 3.
Course.left_joins(:student_enrollments)
  .where('student_enrollments.id' => nil)
I've been struggling with this kind of problem for quite some while, and decided to do something to solve it once and for all. I published a Gist that addresses this issue: https://gist.github.com/nerde/b867cd87d580e97549f2
I created a little AR hack that uses Arel Table to dynamically build the left joins for you, without having to write raw SQL in your code:
class ActiveRecord::Base
  # Does a left join through an association. Usage:
  #
  #     Book.left_join(:category)
  #     # SELECT "books".* FROM "books"
  #     # LEFT OUTER JOIN "categories"
  #     # ON "books"."category_id" = "categories"."id"
  #
  # It also works through association's associations, like `joins` does:
  #
  #     Book.left_join(category: :master_category)
  def self.left_join(*columns)
    _do_left_join columns.compact.flatten
  end
  private
  def self._do_left_join(column, this = self) # :nodoc:
    collection = self
    if column.is_a? Array
      column.each do |col|
        collection = collection._do_left_join(col, this)
      end
    elsif column.is_a? Hash
      column.each do |key, value|
        assoc = this.reflect_on_association(key)
        raise "#{this} has no association: #{key}." unless assoc
        collection = collection._left_join(assoc)
        collection = collection._do_left_join value, assoc.klass
      end
    else
      assoc = this.reflect_on_association(column)
      raise "#{this} has no association: #{column}." unless assoc
      collection = collection._left_join(assoc)
    end
    collection
  end
  def self._left_join(assoc) # :nodoc:
    source = assoc.active_record.arel_table
    pk = assoc.association_primary_key.to_sym
    joins source.join(assoc.klass.arel_table,
      Arel::Nodes::OuterJoin).on(source[assoc.foreign_key].eq(
        assoc.klass.arel_table[pk])).join_sources
  end
end
Hope it helps.
See below my original post to this question.
Since then, I have implemented my own .left_joins() for ActiveRecord v4.0.x (sorry, my app is frozen at this version so I've had no need to port it to other versions):
In file app/models/concerns/active_record_extensions.rb, put the following:
module ActiveRecordBaseExtensions
  extend ActiveSupport::Concern
  def left_joins(*args)
    self.class.left_joins(args)
  end
  module ClassMethods
    def left_joins(*args)
      all.left_joins(args)
    end
  end
end
module ActiveRecordRelationExtensions
  extend ActiveSupport::Concern
  # a #left_joins implementation for Rails 4.0 (WARNING: this uses Rails 4.0 internals
  # and so probably only works for Rails 4.0; it'll probably need to be modified if
  # upgrading to a new Rails version, and will be obsolete in Rails 5 since it has its
  # own #left_joins implementation)
  def left_joins(*args)
    eager_load(args).construct_relation_for_association_calculations
  end
end
ActiveRecord::Base.send(:include, ActiveRecordBaseExtensions)
ActiveRecord::Relation.send(:include, ActiveRecordRelationExtensions)
Now I can use .left_joins() everywhere I'd normally use .joins().
----------------- ORIGINAL POST BELOW -----------------
If you want OUTER JOINs without all the extra eagerly loaded ActiveRecord objects, use .pluck(:id) after .eager_load() to abort the eager load while preserving the OUTER JOIN. Using .pluck(:id) thwarts eager loading because the column name aliases (items.location AS t1_r9, for example) disappear from the generated query when used (these independently named fields are used to instantiate all the eagerly loaded ActiveRecord objects).
A disadvantage of this approach is that you then need to run a second query to pull in the desired ActiveRecord objects identified in the first query:
# first query
idents = Course
  .eager_load(:students) # eager load for OUTER JOIN
  .where(
    student_enrollments: {student_id: some_user.id, id: nil},
    active: true
  )
  .distinct
  .pluck(:id) # abort eager loading but preserve OUTER JOIN
# second query
Course.where(id: idents)
It's a join query in Active Record.
See the link for more info about the Active Record query format.
@course = Course.joins("LEFT OUTER JOIN StudentEnrollment
  ON StudentEnrollment.id = Courses.user_id").
  where("StudentEnrollment.id IS NULL AND StudentEnrollment.student_id =
  <SOME_STUDENT_ID_VALUE> and Courses.active = true")
Use Squeel:
Person.joins{articles.inner}
Person.joins{articles.outer}
If anyone out there still needs true left_outer_joins support in Rails 4.2 then if you install the gem "brick" on Rails 4.2.0 or later it automatically adds the Rails 5.0 implementation of left_outer_joins. You would probably want to turn off the rest of its functionality, that is unless you want an automatic "admin panel" kind of thing available in your app!

ActiveRecord: Adding condition to ON clause for includes

I have a model offers and another historical_offers, one offer has_many historical_offers.
Now I would like to eager load the historical_offers of one given day for a set of offers, if it exists. For this, I think I need to pass the day to the ON clause, not the WHERE clause, so that I get all offers, also when there is no historical_offer for the given day.
With
Offer.where(several_complex_conditions).includes(:historical_offers).where("historical_offers.day = ?", Date.today)
I would get
SELECT * FROM offers
LEFT OUTER JOIN historical_offers
ON offers.id = historical_offers.offer_id
WHERE day = '2012-11-09' AND ...
But I want to have the condition in the ON clause, not in the WHERE clause:
SELECT * FROM offers
LEFT OUTER JOIN historical_offers
ON offers.id = historical_offers.offer_id AND day = '2012-11-09'
WHERE ...
I guess I could alter the has_many definition with a lambda condition for a specific date, but how would I pass in a date then?
Alternatively, I could write the joins myself like this:
Offer.where(several_complex_conditions)
.joins(["historical_offers ON offers.id = historical_offers.offer_id AND day = ?", Date.today])
But how can I hook this up so that eager loading is done?
After a few hours of head-scratching and trying all sorts of ways to accomplish eager loading of a constrained set of associated records, I came across @dbenhur's answer in this thread, which works fine for me. However, the condition isn't something I'm passing in (it's a date relative to Date.today). Basically, it creates an association whose has_many conditions are the ones I wanted to put into the LEFT JOIN's ON clause.
has_many :prices, order: "rate_date"
has_many :future_valid_prices,
  class_name: 'Price',
  conditions: ['rate_date > ? and rate is not null', Date.today-7.days]
And then in my controller:
@property = current_agent.properties.includes(:future_valid_prices).find_by_id(params[:id])

ActiveRecord::Relation joins with more conditions than just the foreign key

Is there any way to specify more than one conditions for a left outer join using ActiveRecord::Relation?
Take the following SQL statement for example. How can anyone rewrite this using ActiveRecord::Relation objects?
SELECT `texts`.*, `text_translations`.translation FROM `texts` LEFT OUTER JOIN `text_translations` ON `text_translations`.`id` = `texts`.`id` AND `text_translations`.`locale` = 'en'
Is there any way to do this under ActiveRecord 3.0.3+?
Thanks in advance.
First, you should consider following Rails/ActiveRecord naming conventions for your relations. This means the foreign key in the text_translations table should be called text_id.
Create your models and associations like this:
class Text < ActiveRecord::Base
  # all possible translations!
  has_many :text_translations
  scope :with_translation_for, lambda { |lang| {
    :select => "texts.*, tt.translation",
    :joins => "LEFT OUTER JOIN text_translations AS tt ON tt.text_id = texts.id AND tt.locale = #{ActiveRecord::Base.sanitize(lang)}"
  }}
  # return nil if translation hasn't been loaded, otherwise you get a nasty NoMethod exception
  def translation
    read_attribute(:translation)
  end
end
and
class TextTranslation < ActiveRecord::Base
  # every translation belongs to a text
  belongs_to :text
  # define a scope for the language
  scope :language, lambda { |lang| where(['locale = ?', lang]) }
end
How to use:
texts = Text.with_translation_for('en')
texts.each do |c_text|
  unless c_text.translation.nil?
    puts c_text.translation
  else
    puts "No translation available!"
  end
end
Now to the pros and cons: the LEFT OUTER JOIN approach will load all texts, even if there isn't a translation for a text in the desired language. The con is that you won't get the "TextTranslation" model object.
Another way is to load only the texts that have the desired translation. You can do it like:
texts = Text.includes(:text_translations).where(:text_translations => {:locale => 'en'})
Now texts[i].text_translations will return an array with all TextTranslation model objects for this text matching the locale 'en'. But texts without a translation in the locale "en" won't show up.
Edit
Connected to your comment:
The problem with using .joins(:tablename) on a relation is that it will result in an INNER JOIN, so this is not an option. You have to explicitly declare the LEFT join. Another thing is that if you use something like Text.includes(:text_translations).where(['text_translations.locale = ?', 'en']), the condition will be applied to the SQL query as a whole and not to the LEFT join itself. What you actually can do is declare associations like
has_many :english_translations, :class_name => 'TextTranslation', :conditions => ['locale = ?', 'en']
This way you can manage to load only english translations by eager loading (without any joins at all):
Text.includes(:english_translations).all
Check this out:
Ruby On Rails Guide about Joining Tables
ActiveRecord Association Docs, Search for LEFT OUTER JOIN