Rails 3 HABTM query with array, need AND not OR/IN - ruby-on-rails-3

Say I have two models with a HABTM relationship. Teacher and Student. Here is an example of what I currently have working:
student_ids = [1,2,3,4]
Teacher.joins(:students).where("students.id" => student_ids)
The problem is that this will return all Teacher objects with with ANY of those student ids, but not require ALL of them:
SELECT `teachers`.* FROM `teachers` INNER JOIN `students_teachers` ON `students_teachers`.`teacher_id` = `teachers`.`id` INNER JOIN `students` ON `students`.`id` = `students_teachers`.`student_id` WHERE `students`.`id` IN (1, 2, 3, 4)
I have two cases, one of which is an OR condition, which the above handles fine since I just need to find Teachers with Student.id 1 OR 2 OR 3 OR 4. The other is AND, where I need to ensure that the the Teachers being returned include ALL of the student_ids, so Teachers with Student.id 1 AND 2 AND 3 AND 4.

I would use an include
example here:
clients = Client.includes(:address).limit(10)
this is what would happen:
SELECT * FROM clients LIMIT 10
SELECT addresses.* FROM addresses
WHERE (addresses.client_id IN (1,2,3,4,5,6,7,8,9,10))
you can read more about it here
http://guides.rubyonrails.org/active_record_querying.html

You can do something like:
teachers = nil # so it wont be used in the first pass
Student.include(:teachers).where(id: student_ids).each do |student|
teachers = (teachers || s.teachers) & s.teachers
end

Related

Cypher - Add multiple connections

I have 2 nodes:
Students and Subjects.
I want to be able to add multiple student names to multiple subjects at the same time using cypher query.
So far I have done it by iterating through the list of names of students and subjects and executing the query for each. but is there a way to do the same in the query itself?
This is the query I use for adding 1 student to 1 subject:
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WHERE
s.id = ${"$"}studentId
AND c.id = ${"$"}classroomId
AND u.name = ${"$"}subjectNames
AND NOT (s)-[:IN_SUBJECT]->(u)
CREATE (s)-[:IN_SUBJECT]->(u)
So I want to be able to receive multiple subjectNames and studentIds at once to create these connections. Any guidance for multi relationships in cypher ?
I think what you are looking for is UNWIND. If you have an array as parameter to your query:
studentList :
[
studentId: "sid1", classroomId: "cid1", subjectNames: ['s1','s2'] },
studentId: "sid2", classroomId: "cid2", subjectNames: ['s1','s3'] },
...
]
You can UNWIND that parameter in the beginning of your query:
UNWIND $studentList as student
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WHERE
s.id = student.studentId
AND c.id = student.classroomId
AND u.name = in student.subjectNames
AND NOT (s)-[:IN_SUBJECT]->(u)
CREATE (s)-[:IN_SUBJECT]->(u)
You probably need to use UNWIND.
I haven't tested the code, but something like this might work:
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WITH
s AS student, COLLECT(u) AS subjects
UNWIND subjects AS subject
CREATE (student)-[:IN_SUBJECT]->(subject)

Select Related With Multiple Conditions

Using the Django ORM is it possible to perform a select_related (left join) with conditions additional to the default table1.id = table2.fk
Using the example models:
class Author(models.Model):
name = models.TextField()
age = models.IntegerField()
class Book(models.Model):
title = models.TextField()
and the raw sql
SELECT 'Book'.*, 'Author'.'name'
FROM 'Book'
LEFT JOIN
'Author'
ON 'Author'.'id' = 'Book'.'author_id'
AND 'Author'.'age' > 18 ;<---this line here is what id like to use via the ORM
I understand that in this simple example you can perform the filtering after the join, but that hasn't worked in my specific case. As i am doing sums across multiple left joins that require filters.
# gets all books which has author with age higher than 18
books = Book.objects.filter(author__age__gt=18)
returns queryset.
Then you can loop trough the queryset to access specific values and print them:
for b in books:
print(b.title, b.author.name, b.author.age)

Filtering model with HABTM relationship

I have 2 models - Restaurant and Feature. They are connected via has_and_belongs_to_many relationship. The gist of it is that you have restaurants with many features like delivery, pizza, sandwiches, salad bar, vegetarian option,… So now when the user wants to filter the restaurants and lets say he checks pizza and delivery, I want to display all the restaurants that have both features; pizza, delivery and maybe some more, but it HAS TO HAVE pizza AND delivery.
If I do a simple .where('features IN (?)', params[:features]) I (of course) get the restaurants that have either - so or pizza or delivery or both - which is not at all what I want.
My SQL/Rails knowledge is kinda limited since I'm new to this but I asked a friend and now I have this huuuge SQL that gets the job done:
Restaurant.find_by_sql(['SELECT restaurant_id FROM (
SELECT features_restaurants.*, ROW_NUMBER() OVER(PARTITION BY restaurants.id ORDER BY features.id) AS rn FROM restaurants
JOIN features_restaurants ON restaurants.id = features_restaurants.restaurant_id
JOIN features ON features_restaurants.feature_id = features.id
WHERE features.id in (?)
) t
WHERE rn = ?', params[:features], params[:features].count])
So my question is: is there a better - more Rails even - way of doing this? How would you do it?
Oh BTW I'm using Rails 4 on Heroku so it's a Postgres DB.
This is an example of a set-iwthin-sets query. I advocate solving these with group by and having, because this provides a general framework.
Here is how this works in your case:
select fr.restaurant_id
from features_restaurants fr join
features f
on fr.feature_id = f.feature_id
group by fr.restaurant_id
having sum(case when f.feature_name = 'pizza' then 1 else 0 end) > 0 and
sum(case when f.feature_name = 'delivery' then 1 else 0 end) > 0
Each condition in the having clause is counting for the presence of one of the features -- "pizza" and "delivery". If both features are present, then you get the restaurant_id.
How much data is in your features table? Is it just a table of ids and names?
If so, and you're willing to do a little denormalization, you can do this much more easily by encoding the features as a text array on restaurant.
With this scheme your queries boil down to
select * from restaurants where restaurants.features #> ARRAY['pizza', 'delivery']
If you want to maintain your features table because it contains useful data, you can store the array of feature ids on the restaurant and do a query like this:
select * from restaurants where restaurants.feature_ids #> ARRAY[5, 17]
If you don't know the ids up front, and want it all in one query, you should be able to do something along these lines:
select * from restaurants where restaurants.feature_ids #> (
select id from features where name in ('pizza', 'delivery')
) as matched_features
That last query might need some more consideration...
Anyways, I've actually got a pretty detailed article written up about Tagging in Postgres and ActiveRecord if you want some more details.
This is not "copy and paste" solution but if you consider following steps you will have fast working query.
index feature_name column (I'm assuming that column feature_id is indexed on both tables)
place each feature_name param in exists():
select fr.restaurant_id
from
features_restaurants fr
where
exists(select true from features f where fr.feature_id = f.feature_id and f.feature_name = 'pizza')
and
exists(select true from features f where fr.feature_id = f.feature_id and f.feature_name = 'delivery')
group by
fr.restaurant_id
Maybe you're looking at it backwards?
Maybe try merging the restaurants returned by each feature.
Simplified:
pizza_restaurants = Feature.find_by_name('pizza').restaurants
delivery_restaurants = Feature.find_by_name('delivery').restaurants
pizza_delivery_restaurants = pizza_restaurants & delivery_restaurants
Obviously, this is a single instance solution. But it illustrates the idea.
UPDATE
Here's a dynamic method to pull in all filters without writing SQL (i.e. the "Railsy" way)
def get_restaurants_by_feature_names(features)
# accepts an array of feature names
restaurants = Restaurant.all
features.each do |f|
feature_restaurants = Feature.find_by_name(f).restaurants
restaurants = feature_restaurants & restaurants
end
return restaurants
end
Since its an AND condition (the OR conditions get dicey with AREL). I reread your stated problem and ignoring the SQL. I think this is what you want.
# in Restaurant
has_many :features
# in Feature
has_many :restaurants
# this is a contrived example. you may be doing something like
# where(name: 'pizza'). I'm just making this condition up. You
# could also make this more DRY by just passing in the name if
# that's what you're doing.
def self.pizza
where(pizza: true)
end
def self.delivery
where(delivery: true)
end
# query
Restaurant.features.pizza.delivery
Basically you call the association with ".features" and then you use the self methods defined on features. Hopefully I didn't misunderstand the original problem.
Cheers!
Restaurant
.joins(:features)
.where(features: {name: ['pizza','delivery']})
.group(:id)
.having('count(features.name) = ?', 2)
This seems to work for me. I tried it with SQLite though.

Certain size from collection, including second filter with HQL?

I'm struggling with following scenario: Let's assume I have the two entities Classroom and Member, mapped with many-to-many. Classroom has the collection Members, containing the entities Member.
I would like to get the classrooms which do have members of a certain count. That would result in something like:
FROM Classroom cr WHERE cr.Members.size < 10
Now I have a Type on Classroom. I'd like to filter first on Type, then the size. This won't work:
FROM Classroom cr WHERE cr.Members.size < 10 AND cr.Members.Type = 1
Results in: illegal attempt to dereference collection
How could I write such a query?
I would imagine you need to do a join
from Classroom as cr left join cr.Members as m
where cr.Members.size < 10 and m.Type = 1

How do I write a named scope to filter by all of an array passed in, and not just by matching one element (using IN)

I have two models, Project and Category, which have a many-to-many relationship between them. The Project model is very simple:
class Project < ActiveRecord::Base
has_and_belongs_to_many :categories
scope :in_categories, lambda { |categories|
joins(:categories).
where("categories.id in (?)", categories.collect(&:to_i))
}
end
The :in_categories scope takes an array of Category IDs (as strings), so using this scope I can get back every project that belongs to at least one of the categories passed in.
But what I'm actually trying to do is filter (a better name would be :has_categories). I want to just get the projects that belong to all of the categories passed in. So if I pass in ["1", "3", "4"] I only want to get the projects that belong to all of the categories.
There are two common solutions in SQL to do what you're describing.
Self-join:
SELECT ...
FROM Projects p
JOIN Categories c1 ON c1.project_id = p.id
JOIN Categories c3 ON c3.project_id = p.id
JOIN Categories c4 ON c4.project_id = p.id
WHERE (c1.id, c3.id, c4.id) = (1, 3, 4);
Note I'm using syntax to compare tuples. This is equivalent to:
WHERE c1.id = 1 AND c3.id = 3 AND c4.id = 4;
In general, the self-join solution has very good performance if you have a covering index. Probably Categories.(project_id,id) would be the right index, but analyze the SQL with EXPLAIN to be sure.
The disadvantage of this method is that you need four joins if you're searching for projects that match four different categories. Five joins for five categories, etc.
Group-by:
SELECT ...
FROM Projects p
JOIN Categories cc ON c.project_id = p.id
WHERE c.id IN (1, 3, 4)
GROUP BY p.id
HAVING COUNT(*) = 3;
If you're using MySQL (I assume you are), most GROUP BY queries invoke a temp table and this kills performance.
I'll leave it as an exercise for you to adapt one of these SQL solutions to equivalent Rails ActiveRecord API.
It seems like in ActiveRecord you would do it like so:
scope :has_categories, lambda { |categories|
joins(:categories).
where("categories.id in (?)", categories.collect(&:to_i)).
group("projects.id HAVING COUNT(projects.id) = #{categories.count}")
}