Rails scope for parent records that do not have particular child records - sql

I have a parent model Effort that has_many split_times:
class Effort
has_many :split_times
end
class SplitTime
belongs_to :effort
belongs_to :split
end
class Split
has_many :split_times
enum kind: [:start, :finish, :intermediate]
end
I need a scope that will return efforts that do not have a start split_time. This seems like it should be possible, but so far I'm not able to do it.
I can return efforts with no split_times with this:
scope :without_split_times, -> { includes(:split_times).where(split_times: {:id => nil}) }
And I can return efforts that have at least one split_time with this:
scope :with_split_times, -> { joins(:split_times).uniq }
Here's my attempt at the scope I want:
scope :without_start_time, -> { joins(split_times: :split).where(split_times: {:id => nil}).where('splits.kind != ?', Split.kinds[:start]) }
But that doesn't work. I need something that will return all efforts that do not have a split_time that has a split with kind: :start even if the efforts have other split_times. I would prefer a Rails solution but can go to raw SQL if necessary. I'm using Postgres if it matters.

You can left join on your criteria (i.e. the split's kind is start), which keeps rows with nulls where there was no matching row to join. The difference is that Rails' joins gives you an inner join by default (only rows with matches in both tables), but you want a left join because you need to check that there is no matching row in the right table.
With the results of that join you can group by effort and then count the number of matching splits: if it's 0, there are no matching start splits for that effort!
This might do the trick for you:
scope :without_start_time, -> {
joins("LEFT JOIN split_times ON split_times.effort_id = efforts.id").
joins("LEFT OUTER JOIN splits ON split_times.split_id = splits.id " \
"AND splits.kind = 0").
group("efforts.id").
having("COUNT(splits.id) = 0")
}
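To see why COUNT(splits.id) = 0 picks out the right efforts, here is a plain-Ruby sketch of what the two left joins, the GROUP BY and the HAVING do. The sample rows are invented for illustration, and START_KIND = 0 assumes that start is stored as 0, as in the scope above; nothing here talks to a database.

```ruby
# Illustrative in-memory rows; ids and values are invented for this sketch.
START_KIND = 0 # assumes enum kind: [:start, ...] stores :start as 0

efforts     = [{ id: 1 }, { id: 2 }, { id: 3 }]
splits      = [{ id: 10, kind: 0 }, { id: 11, kind: 2 }, { id: 12, kind: 1 }]
split_times = [
  { effort_id: 1, split_id: 10 }, # effort 1 has a start time
  { effort_id: 1, split_id: 12 },
  { effort_id: 2, split_id: 11 }  # effort 2 has only an intermediate time
]                                 # effort 3 has no split_times at all

start_split_ids = splits.select { |s| s[:kind] == START_KIND }.map { |s| s[:id] }

# GROUP BY efforts.id, then count the joined rows whose split is a start split.
# This mirrors COUNT(splits.id): rows where the LEFT JOIN found no matching
# split contribute NULL and are not counted.
start_counts = efforts.map do |effort|
  times = split_times.select { |st| st[:effort_id] == effort[:id] }
  [effort[:id], times.count { |st| start_split_ids.include?(st[:split_id]) }]
end.to_h

# HAVING COUNT(splits.id) = 0
without_start = start_counts.select { |_id, count| count.zero? }.keys
```

Efforts 2 and 3 end up in without_start: one has split_times but none for a start split, the other has no split_times at all.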

Related

Group entries based on associated model fields in aggregate

I have two models, Keyword and Company, associated using has_and_belongs_to_many.
On the Keyword model, there is a boolean field included. If this is set to true, any Company associated with that Keyword should be considered included, unless it has any associated Keywords with included set to false. included can also be set to nil which I consider a state of "empty".
To summarize:
Company A has 2 associated Keywords, 1 included = true and 1 included = nil. This company is INCLUDED.
Company B has 2 associated Keywords, 1 included = false and 1 included = true. This company is EXCLUDED.
Company C has 2 associated Keywords, both included = nil. This company is EMPTY.
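Spelled out as plain Ruby, the rule reads roughly as follows. This is only a sketch: company_status is a made-up helper name, and treating a company with no keywords at all as empty is an assumption that the description above does not settle.

```ruby
# Classify one company from the `included` values of its keywords.
# Hypothetical helper for illustration; not part of the models.
def company_status(included_values)
  return :excluded if included_values.include?(false) # any false wins
  return :included if included_values.include?(true)  # otherwise any true
  :empty # only nils (or, by assumption, no keywords at all)
end

status_a = company_status([true, nil])   # Company A
status_b = company_status([false, true]) # Company B
status_c = company_status([nil, nil])    # Company C
```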
What is the best way to 1) count the number of Included, Excluded, and Empty companies, and 2) query/scope the Company model to Included, Excluded, or Empty?
The current solution I have hacked together is causing expensive queries that are resulting (usually) in request timeouts. Models follow:
class Company < ApplicationRecord
has_and_belongs_to_many :keywords
scope :included, -> { joins(:keywords).merge(Keyword.included).group('id').reorder('') }
scope :excluded, -> { joins(:keywords).merge(Keyword.excluded).group('id').reorder('') }
scope :empty, -> { joins(:keywords).merge(Keyword.empty).group('id').reorder('') }
end
class Keyword < ApplicationRecord
has_and_belongs_to_many :companies
scope :excluded, -> { where(included: false) }
scope :included, -> { where(included: true) }
scope :empty, -> { where(included: nil) }
end
(reorder is included in Company model to resolve a quirk with pg_search gem and grouping)
I'm always trying to avoid joins because they create duplicates that you'd have to remove with a group('id').
I'd rather use EXISTS.
class Company < ApplicationRecord
has_and_belongs_to_many :keywords
scope :included, -> do
where(
Keyword
.included
.joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
.arel.exists
).where.not(
Keyword
.excluded
.joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
.arel.exists
)
end
scope :excluded, -> do
where(
Keyword
.excluded
.joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
.arel.exists
)
end
scope :empty, -> do
where.not(
Keyword
.joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
.where(included: [true, false])
.arel.exists
)
end
end
Also, I recently discovered this gem, which simplifies EXISTS queries. I find it particularly helpful.
You could just write:
scope :included, -> { where_exists(:keywords, &:included).where_not_exists(:keywords, &:excluded) }
I didn't test the performance of these, but it should be pretty solid (as long as you have added the right indexes).
Hope this answers your questions.

Rails - scope for records that are not in a join table alongside a specific association

I have two models in a Rails app - Tournament and Player associated through a join table:
class Tournament < ApplicationRecord
has_many :tournament_players
has_many :players, through: :tournament_players
end
class Player < ApplicationRecord
has_many :tournament_players
has_many :tournaments, through: :tournament_players
scope :selected, -> (tournament) { includes(:tournaments).where(tournaments: {id: tournament.id}) }
end
I have lots of Tournaments, and each one can have lots of Players. Players can play in lots of Tournaments. The scope
scope :selected, -> (tournament) { includes(:tournaments).where(tournaments: {id: tournament.id}) }
successfully finds all the players already added to a tournament, given that tournament as an argument.
What I'd like is a scope that does the opposite - returns all the players not yet added to a given tournament. I've tried
scope :not_selected, -> (tournament) { includes(:tournaments).where.not(tournaments: {id: tournament.id}) }
but that returns many of the same players, I think because the players exist as part of other tournaments. The SQL for that looks something like:
SELECT "players".*, "tournaments".* FROM "players" LEFT OUTER JOIN
"tournament_players" ON "tournament_players"."player_id" =
"players"."id" LEFT OUTER JOIN "tournaments" ON "tournaments"."id" =
"tournament_players"."tournament_id" WHERE ("tournaments"."id" != $1)
ORDER BY "players"."name" ASC [["id", 22]]
I've also tried the suggestions on this question - using
scope :not_selected, -> (tournament) { includes(:tournaments).where(tournaments: {id: nil}) }
but that doesn't seem to work - it just returns an empty array, again I think because the Players exist in the join table as part of a separate Tournament. The SQL for that looks something like:
SELECT "players".*, "tournaments".* FROM "players" LEFT OUTER JOIN
"tournament_players" ON "tournament_players"."player_id" =
"players"."id" LEFT OUTER JOIN "tournaments" ON "tournaments"."id" =
"tournament_players"."tournament_id" WHERE "tournaments"."id" IS NULL
ORDER BY "players"."name" ASC
What you need to do is:
Make a left join with the reference table, with an additional condition on the tournament ID matching the one that you want to find the not-selected players for
Apply a WHERE clause indicating that there was no JOIN made.
This code should do it:
# player.rb
scope :not_selected, -> (tournament) do
joins("LEFT JOIN tournament_players tp ON players.id = tp.player_id AND tp.tournament_id = #{tournament.id}").where(tp: {tournament_id: nil})
end
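Because the tournament id is interpolated straight into the SQL text, it may be worth coercing it to an integer first. A minimal sketch, assuming a helper named not_selected_join_sql (a made-up name, not a Rails API):

```ruby
# Hypothetical helper: builds the LEFT JOIN fragment for :not_selected.
# Integer() raises for nil and for strings that are not integer literals,
# so only a number can end up inside the SQL text.
def not_selected_join_sql(tournament_id)
  id = Integer(tournament_id)
  "LEFT JOIN tournament_players tp " \
    "ON players.id = tp.player_id AND tp.tournament_id = #{id}"
end

join_sql = not_selected_join_sql(22)
```

The returned string can be passed to joins(...) in the scope in place of the inline literal.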
If only Rails had a nicer way to write LEFT JOIN queries with additional conditions...
A few notes:
Don't join the actual relation (i.e. Tournament), it dramatically decreases performance of your query, and it's unnecessary, because all your condition prerequisites are inside the reference table. Besides, all the rows you're interested in return NULL data from the tournaments table.
Don't use eager_load. To the best of my knowledge it doesn't support custom join conditions, and it would also instantiate models for all related objects, which you don't need.
ok try this:
includes(:tournaments).distinct.where.not(tournaments: {id: tournament.id})

SQL LEFT JOIN value NOT in either join column

I suspect this is a rather common scenario and may show my ineptitude as a DB developer, but here goes anyway ...
I have two tables: Profiles and HiddenProfiles and the HiddenProfiles table has two relevant foreign keys: profile_id and hidden_profile_id that store ids from the Profiles table.
As you can imagine, a user can hide another user (wherein his profile ID would be the profile_id in the HiddenProfiles table) or he can be hidden by another user (wherein his profile ID would be put in the hidden_profile_id column). Again, a pretty common scenario.
Desired Outcome:
I want to do a join (or to be honest, whatever would be the most efficient query) on the Profiles and HiddenProfiles table to find all the profiles that a given profile is both not hiding AND not hidden from.
In my head I thought it would be pretty straightforward, but the iterations I came up with kept seeming to miss one half of the problem. Finally, I ended up with something that looks like this:
SELECT "profiles".* FROM "profiles"
LEFT JOIN hidden_profiles hp1 on hp1.profile_id = profiles.id and (hp1.hidden_profile_id = 1)
LEFT JOIN hidden_profiles hp2 on hp2.hidden_profile_id = profiles.id and (hp2.profile_id = 1)
WHERE (hp1.hidden_profile_id is null) AND (hp2.profile_id is null)
Don't get me wrong, this "works" but in my heart of hearts I feel like there should be a better way. If in fact there is not, I'm more than happy to accept that answer from someone with more wisdom than myself on the matter. :)
And for what it's worth these are two RoR models sitting on a Postgres DB, so solutions tailored to those constraints are appreciated.
Models are as such:
class Profile < ActiveRecord::Base
...
has_many :hidden_profiles, dependent: :delete_all
scope :not_hidden_to_me, -> (profile) { joins("LEFT JOIN hidden_profiles hp1 on hp1.profile_id = profiles.id and (hp1.hidden_profile_id = #{profile.id})").where("hp1.hidden_profile_id is null") }
scope :not_hidden_by_me, -> (profile) { joins("LEFT JOIN hidden_profiles hp2 on hp2.hidden_profile_id = profiles.id and (hp2.profile_id = #{profile.id})").where("hp2.profile_id is null") }
scope :not_hidden, -> (profile) { self.not_hidden_to_me(profile).not_hidden_by_me(profile) }
...
end
class HiddenProfile < ActiveRecord::Base
belongs_to :profile
belongs_to :hidden_profile, class_name: "Profile"
end
So to get the profiles I want I'm doing the following:
Profile.not_hidden(given_profile)
And again, maybe this is fine, but if there's a better way I'll happily take it.
If you want to get this list just for a single profile, I would implement an instance method to perform effectively the same query in ActiveRecord. The only modification I made is to perform a single join onto a union of subqueries and to apply the conditions on the subqueries. This should reduce the columns that need to be loaded into memory, and hopefully be faster (you'd need to benchmark against your data to be sure):
class Profile < ActiveRecord::Base
def visible_profiles
Profile.joins("LEFT OUTER JOIN (
SELECT profile_id p_id FROM hidden_profiles WHERE hidden_profile_id = #{id}
UNION ALL
SELECT hidden_profile_id p_id FROM hidden_profiles WHERE profile_id = #{id}
) hp ON hp.p_id = profiles.id").where("hp.p_id IS NULL")
end
end
Since this method returns an ActiveRecord scope, you can chain additional conditions if desired:
Profile.find(1).visible_profiles.where("created_at > ?", Time.new(2015,1,1)).order(:name)
Personally I've never liked the join = null approach. I find it counter intuitive. You're asking for a join, and then limiting the results to records that don't match.
I'd approach it more as
SELECT id FROM profiles p
WHERE
NOT EXISTS
(SELECT * FROM hidden_profiles hp1
WHERE hp1.hidden_profile_id = 1 and hp1.profile_id = p.id)
AND
NOT EXISTS (SELECT * FROM hidden_profiles hp2
WHERE hp2.hidden_profile_id = p.id and hp2.profile_id = 1)
But you're going to need to run some EXPLAINs on it with realistic volumes to be sure which works best.
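Whichever form you pick, the expected result is easy to state in plain Ruby, which can serve as a check when comparing the queries on sample data. The rows and the visible_ids helper below are invented for illustration.

```ruby
# Invented sample data: each pair is [profile_id, hidden_profile_id],
# meaning "profile_id hides hidden_profile_id".
profile_ids     = [1, 2, 3, 4, 5]
hidden_profiles = [[1, 2], [3, 1], [4, 5]]

# Profiles that are neither hidden by `me` nor hiding `me`.
def visible_ids(profile_ids, hidden_profiles, me)
  hidden_by_me = hidden_profiles.select { |p, _h| p == me }.map { |_p, h| h }
  hiding_me    = hidden_profiles.select { |_p, h| h == me }.map { |p, _h| p }
  profile_ids - hidden_by_me - hiding_me
end

visible_for_1 = visible_ids(profile_ids, hidden_profiles, 1)
```

Like the SQL versions, this does not exclude the given profile itself from the result.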

How to select items from one table based on HABTM relation in another? Postgres "ALL" does not work

I am trying to retrieve mangas (comics) that have certain categories. For example in the code below, I am trying to search for Adventure(id=29) and Comedy(id=25) mangas. I am using "ALL" operator because I want BOTH categories be in mangas. (i.e return all Manga that have both a category of 25 AND 29 through the relation table, but can also have other categories attached to them)
@search = Manga.find_by_sql("
SELECT m.*
FROM mangas m
JOIN categorizations c ON c.manga_id = m.id AND c.category_id = ALL (array[29,25])
")
Problems? The query is not working as I am expecting (maybe I misunderstand something about ALL operator). I am getting nothing back from the query.
So I tried to change it to
JOIN categorizations c ON c.manga_id = m.id AND c.category_id >= ALL (array[29,25])
I get back mangas whose IDs are GREATER than 29. I am not even getting category #29.
Is there something I am missing here?
Also the query is... VERY slow. I would appreciate it if someone comes with a query that return back what I want.
I am using Ruby on Rails 4.2 and postgresql
Thanks
Update: (posting models relationship)
class Manga < ActiveRecord::Base
has_many :categorizations, :dependent => :destroy
has_many :categories, through: :categorizations
end
class Category < ActiveRecord::Base
has_many :categorizations, :dependent => :destroy
has_many :mangas, through: :categorizations
end
class Categorization < ActiveRecord::Base
belongs_to :manga
belongs_to :category
end
My attempt based on @Beartech answer:
wheres = categories_array.join(" = ANY (cat_ids) AND ")+" = ANY (cat_ids)"
@m = Manga.find_by_sql("
SELECT mangas.*
FROM
(SELECT manga_id, cat_ids
FROM
(
SELECT c.manga_id, array_agg(c.category_id) cat_ids
FROM categorizations c GROUP BY c.manga_id
)
AS sub_table1 WHERE #{wheres}
)
AS sub_table2
INNER JOIN mangas ON sub_table2.manga_id = mangas.id
")
I'm adding this as a different answer, because I like to have the other one for historic reasons. It gets the job done, but not efficiently, so maybe someone will see where it can be improved. That said...
THE ANSWER IS!!!
It all comes back around to the PostgreSQL functions: ALL is not what you want. You want the "contains" operator, which is @>. You also need some sort of aggregate function, because you want to match each Manga with all of its categories and select only the ones that contain both 25 and 29.
Here is the sql for that:
SELECT mangas.*
FROM
(SELECT manga_id, cat_ids
FROM
(SELECT manga_id, array_agg(category_id) cat_ids
FROM categorizations GROUP BY manga_id)
AS sub_table1 WHERE cat_ids @> ARRAY[25,29] )
AS sub_table2
INNER JOIN mangas
ON sub_table2.manga_id = mangas.id
;
So you are pulling a subquery that grabs all of the rows in the join table, groups them by the manga id, and puts each manga's category ids into an array. Now you can join that against the mangas table to get the actual manga records.
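The same grouping and containment check can be mimicked in plain Ruby to see which rows survive. This is only an in-memory sketch with invented join-table rows, not a replacement for the query.

```ruby
# Invented join-table rows: [manga_id, category_id]
categorizations = [[1, 25], [1, 29], [1, 31], [2, 25], [3, 29], [4, 29], [4, 25]]
wanted = [25, 29]

# GROUP BY manga_id with array_agg(category_id)
cat_ids_by_manga = categorizations
  .group_by { |manga_id, _| manga_id }
  .map { |manga_id, rows| [manga_id, rows.map { |_, category_id| category_id }] }
  .to_h

# The array-contains check (@> in Postgres): every wanted id must be present,
# in any order, and extra categories are allowed.
matching_manga_ids = cat_ids_by_manga
  .select { |_, cat_ids| (wanted - cat_ids).empty? }
  .keys
```

Mangas 1 and 4 match: both have 25 and 29, and manga 1 also has a third category.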
The ruby looks like:
@search = Manga.find_by_sql("SELECT mangas.* FROM (SELECT manga_id, cat_ids FROM (SELECT manga_id, array_agg(category_id) cat_ids FROM categorizations GROUP BY manga_id) AS sub_table1 WHERE cat_ids @> ARRAY[25,29] ) AS sub_table2 INNER JOIN mangas ON sub_table2.manga_id = mangas.id")
It's fast and clean, doing it all in the native SQL.
You can interpolate variables into the .find_by_sql() text. This gives you an instant search function since @> is asking if the array of categories contains all of the search terms.
terms = [25,29]
q = %Q(SELECT mangas.* FROM (SELECT manga_id, cat_ids FROM (SELECT manga_id, array_agg(category_id) cat_ids FROM categorizations GROUP BY manga_id) AS sub_table1 WHERE cat_ids @> ARRAY#{terms} ) AS sub_table2 INNER JOIN mangas ON sub_table2.manga_id = mangas.id)
Manga.find_by_sql(q)
Important
I am fairly certain that the above code is in some way insecure. I would assume that you are going to validate the input of the array in some way, i.e.
terms = [] unless terms.all? {|term| term.is_a? Integer}
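One way to make that validation explicit is to build the array literal through a small helper that coerces every term. This is a sketch under assumptions: pg_int_array is a made-up name, and the ::integer[] cast assumes category_id is an integer column.

```ruby
# Hypothetical helper: turns user-supplied terms into a Postgres array
# literal. Integer() raises for nil and for strings that are not integer
# literals, so nothing but numbers reaches the SQL text.
def pg_int_array(terms)
  ids = Array(terms).map { |term| Integer(term) }
  # The cast keeps an empty list valid SQL (a bare ARRAY[] has no type).
  "ARRAY[#{ids.join(',')}]::integer[]"
end

literal = pg_int_array([25, "29"])
```

The result can be interpolated in place of ARRAY followed by the raw terms in the query text.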
Third time's the charm, right? LOL
OK, totally changing my answer because it seems like this should be SUPER EASY in Rails, but it has stumped the heck out of me...
I am heavily depending on This answer to come up with this. You should put a scope in your Manga model:
class Manga < ActiveRecord::Base
has_many :categorizations, :dependent => :destroy
has_many :categories, through: :categorizations
scope :in_categories, lambda { |*search_categories|
joins(:categories).where(:categorizations => { :category_id => search_categories } )
}
end
Then call it like:
@search = Manga.in_categories(25,29).group_by {|manga| ([25,29] & manga.category_ids) == [25,29]}
This iterates through all of the Manga that have at least one of the two categories, takes the intersection of [25,29] with manga.category_ids, and checks whether that intersection equals your requested set. This separates out all Manga that have only one of the two keys.
@search will now be a hash with two keys true and false:
{true => [#<Manga id: 9, name: 'Guardians of...
.... multiple manga objects that belong to at least
the two categories requested but not eliminated if
they also belong to a third or fourth category ... ]
false => [ ... any Manga that only belong to ONE of the two
categories requested ... ]
}
Now to get just the unique Mangas that belong to BOTH categories use .uniq:
@search[true].uniq
BOOM!! You have an array of your Manga objects that match BOTH of your categories.
OR
You can simplify it with:
@search = Manga.in_categories(25,29).to_a.keep_if {|manga| ([25,29] & manga.category_ids) == [25,29]}
@search.uniq!
I like that a little bit better, it looks cleaner.
AND NOW FOR YOU SQL JUNKIES
@search = Manga.find_by_sql("SELECT mangas.*
FROM categorizations
JOIN mangas ON categorizations.manga_id = mangas.id
WHERE categorizations.category_id IN (25,29)").keep_if {|manga| ([25,29] & manga.category_ids) == [25,29]}
@search.uniq!
* OK OK OK I'll stop after this one. :-) *
Roll it all into the scope in Manga.rb:
scope :in_categories, lambda { |*search_categories|
joins(:categories).where(:categorizations => { :category_id => search_categories } ).uniq.to_a.keep_if {|manga| manga.category_ids.include? search_categories[0] and manga.category_ids.include? search_categories[1]} }
THERE HAS GOT TO BE AN EASIER WAY??? (actually that last one is pretty easy)

ActiveRecord::Relation joins with more conditions than just the foreign key

Is there any way to specify more than one conditions for a left outer join using ActiveRecord::Relation?
Take the following SQL statement for example. How can anyone rewrite this using ActiveRecord::Relation objects?
SELECT `texts`.*, `text_translations`.translation FROM `texts` LEFT OUTER JOIN `text_translations` ON `text_translations`.`id` = `texts`.`id` AND `text_translations`.`locale` = 'en'
Is there any way to do this under ActiveRecord 3.0.3+?
Thanks in advance.
First, you should consider using relations that conform to Rails/ActiveRecord conventions. This means the foreign key in the text_translations table should be called text_id
Create your models and associations like this:
class Text < ActiveRecord::Base
# all possible translations!
has_many :text_translations
scope :with_translation_for, lambda { |lang| {
:select => "texts.*, tt.translation",
:joins => "LEFT OUTER JOIN text_translations AS tt ON tt.text_id = texts.id AND tt.locale = #{ActiveRecord::Base.sanitize(lang)}"
}}
# return nil if translation hasn't been loaded, otherwise you get a nasty NoMethodError
def translation
read_attribute(:translation)
end
end
and
class TextTranslation < ActiveRecord::Base
# every translation belongs to a text
belongs_to :text
# define a scope for the language
scope :language, lambda { |lang| where(['locale = ?', lang]) }
end
How to use:
texts = Text.with_translation_for('en')
texts.each do |c_text|
unless c_text.translation.nil?
puts c_text.translation
else
puts "No translation available!"
end
end
Now to the pros and cons. Using the LEFT OUTER JOIN loads all texts, even those without a translation in the desired language. The con is that you won't get TextTranslation model objects.
Another way is to load only the texts that have the desired translation. You can do it like this:
texts = Text.includes(:text_translations).where(:text_translations => {:locale => 'en'})
Now texts[i].text_translations returns an array of all TextTranslation model objects for this text matching the locale 'en'. But texts without a translation in the locale 'en' won't show up.
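The difference between the two approaches comes down to where the locale condition sits: in the join's ON clause or in the WHERE clause. A plain-Ruby sketch with invented rows (no database involved) shows the effect:

```ruby
# Invented rows for illustration.
texts = [{ id: 1 }, { id: 2 }]
text_translations = [
  { text_id: 1, locale: 'en', translation: 'Hello' },
  { text_id: 1, locale: 'de', translation: 'Hallo' },
  { text_id: 2, locale: 'de', translation: 'Welt' }
]

# LEFT OUTER JOIN ... ON tt.text_id = texts.id AND tt.locale = 'en'
# Every text survives; a missing translation shows up as nil.
left_join_on = texts.flat_map do |text|
  matches = text_translations.select { |tt| tt[:text_id] == text[:id] && tt[:locale] == 'en' }
  matches = [nil] if matches.empty?
  matches.map { |tt| [text[:id], tt && tt[:translation]] }
end

# LEFT OUTER JOIN ... ON tt.text_id = texts.id WHERE tt.locale = 'en'
# The WHERE runs after the join, so texts without an 'en' row are filtered out.
left_join_where = texts.flat_map do |text|
  matches = text_translations.select { |tt| tt[:text_id] == text[:id] }
  matches = [nil] if matches.empty?
  matches.map { |tt| [text[:id], tt && tt[:locale], tt && tt[:translation]] }
end.select { |_id, locale, _t| locale == 'en' }.map { |id, _locale, t| [id, t] }
```

With the condition in ON, text 2 is kept with a nil translation; with the condition in WHERE, text 2 disappears from the result.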
Edit
Connected to your comment:
The problem with using .joins(:tablename) on a relation is that it results in an INNER JOIN, so this is not an option. You have to declare the LEFT join explicitly. Another thing is that if you use something like Text.includes(:text_translations).where(['text_translations.locale = ?', 'en']), the condition is applied to the SQL query as a whole and not to the LEFT join itself. What you actually can do is to declare associations like
has_many :english_translations, :class_name => 'TextTranslation', :conditions => ['locale = ?', 'en']
This way you can load only the English translations by eager loading (without any joins at all):
Text.includes(:english_translations).all
Check this out:
Ruby On Rails Guide about Joining Tables
ActiveRecord Association Docs, Search for LEFT OUTER JOIN