Rails 5 - ActiveRecord - Filtering has_many relation with includes and not exclude empty values - sql

Let me explain the title.
I have model A that has_many model B.
I want to filter model B by month and year of the date while showing all model A's
so for example:
A1 -> 3 Bs
A2 -> 0 Bs
A3 -> 1 B
This is my query right now:
A.includes(:b_relation)
.where("extract(month from b.date) = #{month}").references(:b_relation)
.where("extract(year from b.date) = #{year}").references(:b_relation)
.all
It works! BUT it only gives me the A that has at least one B. The A's that have none don't show.
How can I make the query include the model A's that don't have any B's?

The query you're running now effectively behaves like an INNER JOIN, which excludes records from A that have no associated Bs. What you want instead is a LEFT OUTER JOIN (also known as a LEFT JOIN). Left joins include all rows from the parent table, whether or not there are any associated records in the associated table.
I always find this image useful for visualizing SQL join types:
Rails 5 has a left_outer_joins method for this (alias: left_joins):
A.left_outer_joins(:b_relation)
In earlier versions of Rails, it's more manual (I'm just making up the table names here):
A.joins('LEFT OUTER JOIN "bs" ON "bs"."a_id" = "as"."id"')
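To make the difference concrete, here is a minimal plain-Ruby sketch (no database involved) that mimics both join types over in-memory rows. The sample data mirrors the A1/A2/A3 example from the question; the helper names inner_join and left_join and the row layout are made up for illustration.

```ruby
# Hypothetical in-memory stand-ins for the two tables (A1 -> 3 Bs, A2 -> 0 Bs, A3 -> 1 B).
a_rows = [{ id: 1, name: "A1" }, { id: 2, name: "A2" }, { id: 3, name: "A3" }]
b_rows = [
  { id: 10, a_id: 1 }, { id: 11, a_id: 1 }, { id: 12, a_id: 1 },
  { id: 13, a_id: 3 }
]

# INNER JOIN: a row is emitted only when a matching B exists.
def inner_join(as, bs)
  as.flat_map do |a|
    bs.select { |b| b[:a_id] == a[:id] }.map { |b| [a[:name], b[:id]] }
  end
end

# LEFT OUTER JOIN: a parent with no match still yields one row, padded with nil.
def left_join(as, bs)
  as.flat_map do |a|
    matches = bs.select { |b| b[:a_id] == a[:id] }
    matches.empty? ? [[a[:name], nil]] : matches.map { |b| [a[:name], b[:id]] }
  end
end

inner_names = inner_join(a_rows, b_rows).map(&:first).uniq # A2 is dropped
left_names  = left_join(a_rows, b_rows).map(&:first).uniq  # A2 is kept
```

With this data, inner_names contains only A1 and A3, while left_names also contains A2, paired with a nil B.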

Related

How to connect ransacker query to ransack sort search parameter

Problem:
I am using the ransack gem to sort columns in a table. I have 2 models: Campaign and Course. A campaign has many courses, and a course belongs to one campaign. Each course has a number of total_attendees. My Campaigns table has a column for Total Attendees, and I want it to be sortable. So it would sum up the total_attendees field for each course that belongs to a single campaign, and sort based on that sum.
Ex. A campaign has 3 courses, each with 10 attendees. The Total Attendees column on the campaign table would show 30 and it would be sortable against total attendees for all the other campaigns.
I found ransackers:
https://github.com/activerecord-hackery/ransack/wiki/Using-Ransackers
and this SO question: Ransack sort by sum of relation
and from that put together a lot of what is below.
From Model - campaign.rb:
class Campaign < ApplicationRecord
has_many :courses
ransacker :sum_of_total_attendees do
query = "SELECT SUM(r.total_attendees)
FROM campaigns c
LEFT OUTER JOIN courses r
ON r.campaign_id = c.id
GROUP BY c.id"
Arel.sql(query)
end
end
From Model - course.rb:
class Course < ApplicationRecord
belongs_to :campaign, optional: true
end
View:
<th scope="col"><%= sort_link(@q, :sum_of_total_attendees, 'Total Attendees') %></th>
Controller - campaigns_controller.rb:
all_campaigns = Campaign.all
@q = all_campaigns.ransack(params[:q])
@campaigns = @q.result
Errors:
The ransacker query gives me the data I want, but I don't know what to do to get the right information.
Originally, when I clicked on the th link to sort the data, I got this error:
PG::CardinalityViolation: ERROR: more than one row returned by a
subquery used as an expression
I don't know what changed, but now I'm getting this error:
PG::SyntaxError: ERROR: syntax error at or near "SELECT"
LINE 1: SELECT "campaigns".* FROM "campaigns" ORDER BY SELECT SUM(r....
^
: SELECT "campaigns".* FROM "campaigns" ORDER BY SELECT
SUM(r.total_attendees)
FROM campaigns c
LEFT OUTER JOIN courses r
ON r.campaign_id = c.id
GROUP BY c.id ASC
This error seems to say that the ransack search parameter, @q, and the ransacker query don't work together. There are two selects in this request, when there should definitely be only one, but the first one is coming from ransack, so I'm not sure how to address it.
How do I get my query to sort correctly with ransack?
Articles I've looked at but did not seem to apply to what I was looking to accomplish with this story:
Ransack Sort By Sum of Relation: This is the one I worked from a lot, but I'm not sure why it works for this user and not for me. They don't show what is changed, if anything, in the controller
Ransack Github Issue For Multiple Params: This doesn't cover the issue of summing table columns.
Rails Ransack Sorting Searching Based On A Definition In The Model: This didn't apply to my need to sort based on summed data.
Three Ways to Bend The Ransack Gem: This looks like what I was doing, but I'm not sure why theirs is working but mine isn't.
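Both error messages above point at the same constraint: whatever the ransacker returns is pasted directly after ORDER BY, so it has to be a single scalar expression per campaign row. A bare SELECT is a syntax error there, and a parenthesized subquery with GROUP BY returns one row per campaign, hence the cardinality error. A hedged sketch of a string that satisfies both constraints follows; it only builds SQL text, the helper name sum_of_total_attendees_sql is made up, and the use of COALESCE to treat campaigns without courses as 0 is an assumption about the desired sort order.

```ruby
# Hypothetical helper: a correlated scalar subquery, wrapped in parentheses so it
# is a valid ORDER BY expression. Table and column names come from the question.
def sum_of_total_attendees_sql
  "(SELECT COALESCE(SUM(courses.total_attendees), 0) " \
    "FROM courses WHERE courses.campaign_id = campaigns.id)"
end

# Roughly how the final statement would read once a sort direction is appended.
order_sql = "SELECT campaigns.* FROM campaigns " \
            "ORDER BY #{sum_of_total_attendees_sql} ASC"
```

Inside the ransacker block, this string would take the place of the query passed to Arel.sql. The subquery is correlated on campaigns.id instead of grouped, so it yields exactly one value for each outer row.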

Many to many query joins in aqueduct

I have A -> AB <- B many to many relationship between 2 ManagedObjects (A and B), where AB is the junction table.
When querying A from the db, how do I join B values to the AB join objects?
Query<A> query = await Query<A>(context)
..join(set: (a) => a.ab);
It gives me a list of A objects that contain AB join objects, but the AB objects don't include full B objects, only b.id (none of the other fields from class B).
Cheers
When you call join, a new Query<T> is created and returned from that method, where T is the joined type. So if a.ab is of type AB, Query<A>.join returns a Query<AB> (it is linked to the original query internally).
Since you have a new Query<AB>, you can configure it like any other query, including initiating another join, adding sorting descriptors and where clauses.
There are some stylistic syntax choices to be made. You can condense this query into a one-liner:
final query = Query<A>(context)
..join(set: (a) => a.ab).join(object: (ab) => ab.b);
final results = await query.fetch();
This is OK if the query remains as-is, but as you add more criteria to a query, the difference between the dot operator and the cascade operator becomes harder to track. I often pull the join query into its own variable. (Note that you don't call any execution methods on the join query):
final query = Query<A>(context);
final join = query.join(set: (a) => a.ab)
..join(object: (ab) => ab.b);
final results = await query.fetch();

Problems with Rails 3 Active Record Query Interface with join on ID

I have been having a problem with the Rails 3 Active Record Query Interface. I have a lookup table (lookups), a Main table (through_references), and a through/join table called through_tables. Thus this is a HABTM configuration that I have set up using has_many :through.
Update: Of special note here is that when I am doing these joins, I have been joining on IDs, to provide filtering of records. It seems that this does not work with Active Record Query Interface. If you do not want to see the gory details of my travails, you can skip down to see my workaround below.
We are also going to have a number of Main Items (through_references table) that should be able to have any combination of lookup items, and it should be convenient to select the relevant lookup items, say through check boxes.
I have posted the code on GitHub. There are quite a lot more explanations in the GitHub source code. To see the results, go to the lookups index page. Note that you will need to create the records using the scaffold code.
I also have the code up and running on heroku, with more explanations and examples.
class Lookup < ActiveRecord::Base
has_many :fk_references
has_many :through_tables
has_many :through_references, :through => :through_tables
attr_accessible :name, :value
end
class ThroughTable < ActiveRecord::Base
belongs_to :through_reference
belongs_to :lookup
attr_accessible :description, :through_reference_id, :lookup_id
end
class ThroughReference < ActiveRecord::Base
has_many :through_tables
has_many :lookups, :through => :through_tables
attr_accessible :description
end
If we want to have a listing of all the lookup items, and the Main Items that correspond with them, we can LEFT JOIN the 'lookups' table with the Main Items (through_references) table.
Corresponding SQL:
SELECT * FROM lookups
LEFT OUTER JOIN through_tables ON (lookups.id = through_tables.lookup_id AND through_tables.through_reference_id = 1)
LEFT OUTER JOIN through_references ON through_references.id = through_tables.through_reference_id
ORDER BY lookups.id
Returned records:
1;“Lookup Item 1”;“1”;“2012-06-06 17:14:40.819791”;“2012-06-06 17:14:40.819791”;1;1;1;“Main Item 1 has Lookup item 1”;“2012-06-06 17:17:31.355425”;“2012-06-06 17:17:31.355425”;1;“Main Item 1”;“2012-06-06 17:16:30.004375”;“2012-06-06 17:16:30.004375”
2;“Lookup Item 2”;“2”;“2012-06-06 17:14:59.584756”;“2012-06-06 17:14:59.584756”;;;;“”;“”;“”;;“”;“”;“”
3;“Lookup Item 3”;“3”;“2012-06-06 17:15:14.700239”;“2012-06-06 17:15:14.700239”;2;1;3;“Main Item 1 has Lookup item 3”;“2012-06-06 17:17:53.169715”;“2012-06-06 17:17:53.169715”;1;“Main Item 1”;“2012-06-06 17:16:30.004375”;“2012-06-06 17:16:30.004375”
This is what I expected.
=== Active Record Query Interface using custom left join
Lookup.joins("LEFT OUTER JOIN through_tables ON (lookups.id = through_tables.lookup_id AND through_tables.through_reference_id = 1)").includes(:through_references).order('lookups.id')
What is returned from Active Record Query Interface (note I navigate down through the Active Record hierarchy):
Lookup ID | Lookup Name | Lookup Value | Through Table ID | Through Table Description | Main Item ID | Main Item Description
1 | Lookup Item 1 | 1 | 1 | Main Item 1 has Lookup item 1 | 1 | Main Item 1
1 | Lookup Item 1 | 1 | 3 | Main Item 2 has Lookup item 1 | 2 | Main Item 2
2 | Lookup Item 2 | 2 | 4 | Main Item 2 has Lookup item 2 | 2 | Main Item 2
3 | Lookup Item 3 | 3 | 2 | Main Item 1 has Lookup item 3 | 1 | Main Item 1
This is NOT what I expected.
What we have here is identical to the simple left join (without the AND clause). This tells me that the AND clause is being ignored in the Active Record Query Interface.
=== Active Record Query Interface using find_by_sql approach
Lookup.find_by_sql("SELECT * FROM lookups LEFT OUTER JOIN through_tables ON (through_tables.lookup_id = lookups.id AND through_tables.through_reference_id = 1) LEFT OUTER JOIN through_references ON through_references.id = through_tables.through_reference_id ORDER BY lookups.value, through_references.id" )
What is returned from Active Record Query Interface (note I navigate down through the Active Record hierarchy):
Lookup ID | Lookup Name | Lookup Value | Through Table ID | Through Table Description | Main Item ID | Main Item Description
1 | Lookup Item 1 | 1 | 3 | Main Item 2 has Lookup item 1 | 2 | Main Item 2
1 | Lookup Item 1 | 1 | 1 | Main Item 1 has Lookup item 1 | 1 | Main Item 1
 | Lookup Item 2 | 2 | No through_tables entry | | |
1 | Lookup Item 3 | 3 | 3 | Main Item 2 has Lookup item 1 | 2 | Main Item 2
1 | Lookup Item 3 | 3 | 1 | Main Item 1 has Lookup item 1 | 1 | Main Item 1
The results here are crazy!
Is this a bug, is this the intended effect, or am I missing something?
I hope there is a clean way of doing this, without having to generate two result sets, and merge them by code.
I have found a work-around. The issue seems to be that Active Record will not recognize joins that filter on an ID (LEFT OUTER JOIN xyz ON xyz.id = ID).
My work-around involves creating a stored procedure or function that takes the ID in as a parameter, does the join in the Database, and returns a nice flat recordset.
see: Heroku demo page (skip to bottom)
Note, I am not marking this as a solution, because this is a work-around, and nothing to do with active record.
Well, reading the github project, I see this:
What I really want to do is have a list of all of the lookup items,
and if there are matching Main Items, have them appended on to the
returned record, and if not, I want nulls. This is a technique that I
have used for over 10 years.
I'm thinking the problem is exactly that you want to do it that way, when it would be more natural to let Rails eager loading handle it, and so you've gotten fixated on fetching everything in a single massive join.
What I would do is something like:
Lookup.where( .. insert any needed conditions here ...).includes(:through_tables)
ActiveRecord will then fetch all the Lookups in one query, and use eager loading to fetch any associations named in the includes statement, one query per association.
Note I'm not saying that joins are bad, just saying that this is a more natural way to do it in rails. I like to use the Preloader http://apidock.com/rails/ActiveRecord/Associations/Preloader to separate out the decision about what to eager load from the decision about which data to fetch. I find that helpful in controllers - let the model decide what the conditions are, but let the controller decide which objects it'll need to eager load.
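As a rough illustration of the "one query per association" idea, here is a plain-Ruby sketch of what eager loading does conceptually: fetch the parents, fetch the children whose foreign key is in the collected parent ids, then stitch the two together in memory. Nothing here touches ActiveRecord; the preload_children helper is hypothetical, the rows are modelled on the through_tables data shown in the question, and restricting the children to Main Item 1 is an extra assumption added to mirror the expected result.

```ruby
# Hypothetical in-memory stand-ins for the lookups and through_tables tables.
lookups = [
  { id: 1, name: "Lookup Item 1" },
  { id: 2, name: "Lookup Item 2" },
  { id: 3, name: "Lookup Item 3" }
]
through_tables = [
  { id: 1, lookup_id: 1, through_reference_id: 1 },
  { id: 2, lookup_id: 3, through_reference_id: 1 },
  { id: 3, lookup_id: 1, through_reference_id: 2 },
  { id: 4, lookup_id: 2, through_reference_id: 2 }
]

# Conceptual preload: one pass for parents, one for children, merged in memory.
def preload_children(parents, children, foreign_key)
  ids = parents.map { |p| p[:id] }
  by_parent = children.select { |c| ids.include?(c[foreign_key]) }
                      .group_by { |c| c[foreign_key] }
  # Parents without children get an empty list instead of disappearing.
  parents.map { |p| p.merge(children: by_parent.fetch(p[:id], [])) }
end

# Assumption: only the join rows for Main Item 1 are wanted.
for_item_one = through_tables.select { |t| t[:through_reference_id] == 1 }
result = preload_children(lookups, for_item_one, :lookup_id)
summary = result.map { |l| [l[:id], l[:children].map { |c| c[:id] }] }
```

Every lookup stays in the result, and Lookup Item 2 simply carries an empty list, which is the shape the original LEFT JOIN was after.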
HTH

ActiveRecord/ARel modify `ON` in a left outer join from includes

I'm wondering if it's possible to specify additional JOIN ON criteria using ActiveRecord includes?
ex: I'm fetching a record and including an association with some conditions
record.includes(:other_record).where(:other_record => {:something => :another})
This gives me (roughly):
select * from records
left outer join other_records on other_records.records_id = records.id
where other_records.something = another
Does anyone know how I can specify an extra join condition so I could achieve something like.
select * from records
left outer join other_records on other_records.records_id = records.id
and other_records.some_date > now()
where other_records.something = another
I want my includes to pull in the other_records but I need additional criteria in my join. Anything using ARel would also be great, I've just never known how to plug a left outer join from ARel into an ActiveRecord::Relation
I can get you close with ARel. NOTE: My code ends up calling two queries behind the scenes, which I'll explain.
I had to work out LEFT JOINs in ARel myself, recently. Best thing you can do when playing with ARel is to fire up a Rails console or IRB session and run the #to_sql method on your ARel objects to see what kind of SQL they represent. Do it early and often!
Here's your SQL, touched up a bit for consistency:
SELECT *
FROM records
LEFT OUTER JOIN other_records ON other_records.record_id = records.id
AND other_records.some_date > now()
WHERE other_records.something = 'another'
I'll assume your records model is Record and other_records is OtherRecord. Translated to ARel and ActiveRecord:
class Record < ActiveRecord::Base
  # Class method (used like a named scope) that LEFT JOINs OtherRecords
  # with some_date in the future
  def self.left_join_other_in_future
    # Handy ARel aliases
    records = Record.arel_table
    other = OtherRecord.arel_table
    # Join criteria
    record_owns_other = other[:record_id].eq(records[:id])
    other_in_future = other[:some_date].gt(Time.now)
    # ARel's #join method lets you specify the join node type. Defaults to InnerJoin.
    # The #join_sources method extracts the ARel join node. You can plug that node
    # into ActiveRecord's #joins method. If you call #to_sql on the join node,
    # you'll get 'LEFT OUTER JOIN other_records ...'
    left_join_other = records.join(other, Arel::Nodes::OuterJoin).
      on(record_owns_other.and(other_in_future)).
      join_sources
    # Pull it together back in regular ActiveRecord and eager-load OtherRecords.
    joins(left_join_other).includes(:other_records)
  end
end
# MEANWHILE...
# Elsewhere in your app
Record.left_join_other_in_future.where(other_records: {something: 'another'})
I bottled the join in a named scope so you don't need to have all that ARel mixed in with your application logic.
My ARel ends up calling two queries behind the scenes: the first fetches Records using your JOIN and WHERE criteria, the second fetches all OtherRecords "WHERE other_records.record_id IN (...)" using a big list of all the Record IDs from the first query.
Record.includes() definitely gives you the LEFT JOIN you want, but I don't know of a way to inject your own criteria into the join. You could use Record.joins() instead of ARel if you wanted to write the SQL yourself:
Record.joins('LEFT OUTER JOIN other_records' +
' ON other_records.record_id = records.id' +
' AND other_records.some_date > NOW()')
I really, really prefer to let the database adapter write my SQL, so I used ARel.
If it were me, I'd consider putting the additional join criterion in the WHERE clause. I assume you're asking because putting the additional criterion on the join makes the query's EXPLAIN look better or because you don't want to deal with NULLs in the other_records.some_date column when there aren't any related other_records.
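To see what the choice between ON and WHERE changes, here is a small plain-Ruby sketch over in-memory rows. It is not ActiveRecord or ARel code; the left_join helper, the sample rows, and the use of an integer "days from now" in place of a real date are all assumptions made to keep it self-contained.

```ruby
# LEFT JOIN over in-memory rows; the block plays the role of the extra ON condition.
def left_join(records, other_records)
  records.flat_map do |r|
    matches = other_records.select { |o| o[:record_id] == r[:id] && yield(o) }
    matches.empty? ? [[r, nil]] : matches.map { |o| [r, o] }
  end
end

# Hypothetical data: some_date is stored as "days from now" to keep the sketch simple.
records = [{ id: 1 }, { id: 2 }]
other_records = [
  { id: 10, record_id: 1, some_date: 5 },  # in the future
  { id: 11, record_id: 2, some_date: -5 }  # in the past
]

# Condition inside ON: record 2 has no qualifying match, so it is kept with nil.
on_rows = left_join(records, other_records) { |o| o[:some_date] > 0 }

# Condition in WHERE: join on the key alone, then filter the joined rows.
# Rows whose other_record is nil or fails the test are discarded.
where_rows = left_join(records, other_records) { |_o| true }
             .select { |_r, o| o && o[:some_date] > 0 }

on_ids    = on_rows.map { |r, _| r[:id] }
where_ids = where_rows.map { |r, _| r[:id] }
```

Here on_ids contains both records while where_ids contains only record 1. Note that any WHERE test on an other_records column, such as the something = 'another' condition in the original query, discards the nil-padded rows in the same way.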
If you have a simple (equality) extra join condition it could simply be
record.includes(:other_record).where(:other_record => {:something => :another,
:some_date => Time.now})
But if you need the greater than comparison the following should do it.
record.includes(:other_record).where([
'other_records.something = ? and other_records.some_date > ?',
another, Time.now])
Hope that helps.

How do I write a named scope to filter by all of an array passed in, and not just by matching one element (using IN)

I have two models, Project and Category, which have a many-to-many relationship between them. The Project model is very simple:
class Project < ActiveRecord::Base
has_and_belongs_to_many :categories
scope :in_categories, lambda { |categories|
joins(:categories).
where("categories.id in (?)", categories.collect(&:to_i))
}
end
The :in_categories scope takes an array of Category IDs (as strings), so using this scope I can get back every project that belongs to at least one of the categories passed in.
But what I'm actually trying to do is filter (a better name would be :has_categories). I want to just get the projects that belong to all of the categories passed in. So if I pass in ["1", "3", "4"] I only want to get the projects that belong to all of the categories.
There are two common solutions in SQL to do what you're describing.
Self-join:
SELECT ...
FROM Projects p
JOIN Categories c1 ON c1.project_id = p.id
JOIN Categories c3 ON c3.project_id = p.id
JOIN Categories c4 ON c4.project_id = p.id
WHERE (c1.id, c3.id, c4.id) = (1, 3, 4);
Note I'm using syntax to compare tuples. This is equivalent to:
WHERE c1.id = 1 AND c3.id = 3 AND c4.id = 4;
In general, the self-join solution has very good performance if you have a covering index. Probably an index on Categories(project_id, id) would be the right one, but analyze the SQL with EXPLAIN to be sure.
The disadvantage of this method is that you need four joins if you're searching for projects that match four different categories. Five joins for five categories, etc.
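Since the number of joins grows with the number of categories, the self-join SQL usually has to be generated. Below is a minimal Ruby sketch of such a generator. It makes two assumptions that are not in the answer above: it joins the categories_projects table (the conventional join table name for this has_and_belongs_to_many pair) once per category, and it puts each category test in the ON clause, which is equivalent to the WHERE form for inner joins. The name self_join_sql is hypothetical.

```ruby
# Builds one JOIN per requested category id (hypothetical helper).
def self_join_sql(category_ids)
  # Integer() raises ArgumentError on non-numeric input, so nothing
  # unexpected is interpolated into the SQL text.
  ids = category_ids.map { |id| Integer(id) }
  joins = ids.each_with_index.map do |id, i|
    "JOIN categories_projects cp#{i} " \
      "ON cp#{i}.project_id = p.id AND cp#{i}.category_id = #{id}"
  end
  (["SELECT p.* FROM projects p"] + joins).join("\n")
end

sql = self_join_sql(["1", "3", "4"])
```

For three ids the generated statement contains three joins, aliased cp0 to cp2.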
Group-by:
SELECT ...
FROM Projects p
JOIN Categories c ON c.project_id = p.id
WHERE c.id IN (1, 3, 4)
GROUP BY p.id
HAVING COUNT(*) = 3;
If you're using MySQL (I assume you are), most GROUP BY queries invoke a temp table and this kills performance.
I'll leave it as an exercise for you to adapt one of these SQL solutions to the equivalent Rails ActiveRecord API.
It seems like in ActiveRecord you would do it like so:
scope :has_categories, lambda { |categories|
  # De-duplicate so the count matches the number of distinct categories
  ids = categories.collect(&:to_i).uniq
  joins(:categories).
    where("categories.id in (?)", ids).
    group("projects.id").
    having("COUNT(categories.id) = ?", ids.size)
}
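The group-by logic can be checked without a database. The plain-Ruby sketch below applies the same three steps (filter with IN, group by project, keep groups whose size equals the number of requested categories) to in-memory join rows. The projects_in_all helper and the sample rows are made up for illustration.

```ruby
# Hypothetical join-table rows: [project_id, category_id].
rows = [[1, 1], [1, 3], [1, 4], [2, 1], [2, 3], [3, 4]]

def projects_in_all(rows, category_ids)
  wanted = category_ids.map(&:to_i).uniq
  rows.select { |_pid, cid| wanted.include?(cid) }  # WHERE category_id IN (...)
      .group_by(&:first)                            # GROUP BY project_id
      .select { |_pid, grp| grp.map(&:last).uniq.size == wanted.size } # HAVING COUNT = n
      .keys
end

all_three = projects_in_all(rows, ["1", "3", "4"])
first_two = projects_in_all(rows, ["1", "3"])
```

With this data only project 1 belongs to categories 1, 3 and 4, while projects 1 and 2 both belong to categories 1 and 3.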