How to eager load associated table with conditions and current user? - sql

I'm trying to reduce the number of queries from N+1 to a couple for hacks that have favorites.
User has many hacks. (the creator of the hack)
Hack has many favorites
User has many favorites
Favorites belongs to hack and user.
Favorites is just a join table. It only contains foreign keys, nothing else.
On the hacks index page, I display a star icon based on whether a favorite record exists for the given hack and current_user.
index.html
- @hacks.each do |h|
  - if h.favorites(current_user)
    #star image
How do I use active record or raw SQL in a '.SQL' method call to eager load the relevant favorites?
I'm thinking first get the paginated hacks, get the current user, and query all of the favorites that have the hack and the current user.
I'm not sure about the syntax. Also, current_user cannot be called from models.
Update:
I tried these queries
@hacks = Hack.includes(:tags, :users, :favorites).where("favorites.user_id = ?", current_user.id).references(:favorites)
I also tried the following, and tried both of them without the .references method:
@hacks = Hack.includes(:tags, :users, :favorites).where(favorites: { user_id: current_user.id }).references(:favorites)
The WHERE clause acts on hacks to limit the type of hacks.
But what I want is to limit the favorites to the ones with the condition instead (i.e. favorites.user_id = current_user.id).
(After all, many thousands of users may favorite a hack, but the client side is only concerned with the current user, so loading every user who favorited the hack could be very expensive.)
How can I make conditions apply to eager loaded associations rather than the original model?

If you have belongs_to :user and belongs_to :hack, would it be safe to assume that you have a "has many through" relationship setup between favorites, hack and user? Following is how you could eager load Hack with Favorites and Users.
# app/controllers/hacks_controller.rb
def index
  @hacks = Hack.includes(favorites: :user)
end
The above code runs three queries: one selects the hacks, the second selects the favorites whose hack_id is among those hacks, and the third selects the users referenced by those favorites.
Another way to execute these joins in a single query would be to use an inner join using joins:
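# app/controllers/hacks_controller.rb
def index
  @hacks = Hack.joins(favorites: :user)
end
This produces a single query with inner joins between the three tables: users, favorites and hacks. Note that joins only joins the tables; it does not preload the associations, and hacks without any favorites are excluded from the result.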
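For the original goal (showing a star only for the current user's favorites), one possible approach is to skip eager loading the favorites association and instead fetch just the ids of the hacks on the page that the current user has favorited. The sketch below simulates this in plain Ruby with in-memory records. The Hack and Favorite structs and the favorited_hack_ids helper are hypothetical stand-ins, and the ActiveRecord call shown in the comment is an assumption about how the same lookup might be written, not something taken from the question.

```ruby
require "set"

# Hypothetical in-memory stand-ins for the hacks and favorites tables.
Hack = Struct.new(:id, :title)
Favorite = Struct.new(:user_id, :hack_id)

HACKS = [Hack.new(1, "a"), Hack.new(2, "b"), Hack.new(3, "c")]
FAVORITES = [Favorite.new(10, 1), Favorite.new(10, 3), Favorite.new(20, 2)]

# Collect the ids of the given hacks that one user has favorited.
# With ActiveRecord this could plausibly be a single query (assumption):
#   Favorite.where(user_id: user_id, hack_id: hacks.map(&:id)).pluck(:hack_id).to_set
def favorited_hack_ids(favorites, hacks, user_id)
  page_ids = hacks.map(&:id).to_set
  favorites
    .select { |f| f.user_id == user_id && page_ids.include?(f.hack_id) }
    .map(&:hack_id)
    .to_set
end

# The view only needs a membership check per hack, not the favorite records.
starred = favorited_hack_ids(FAVORITES, HACKS, 10)
HACKS.each do |hack|
  puts "#{hack.title}#{starred.include?(hack.id) ? ' *' : ''}"
end
```

Because the controller passes the user id in, the model layer never needs to call current_user, and only the current user's favorites for the visible page are loaded.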

Related

Ruby on Rails where query with relations

I am trying to use where query with relationships.
How can I query using where with relations in this case?
This is model
User
has_many :projects
has_many :reasons, through: :projects
Project
belongs_to :user
has_many :reasons
Reasons
belongs_to :project
This is the code that doesn't work:
# GET /reasons
def index
  reasons = current_user.reasons
  updated_at = params[:updated_at]
  # Filter with updated_at for reloading from mobile app
  if updated_at.present?
    # This one doesn't work!!!!!!!!!!!!
    reasons = reasons.includes(:projects).where("updated_at > ?", Time.at(updated_at.to_i))
  # Get all non deleted objects when logging in from mobile app
  else
    reasons = reasons.where(deleted: false)
  end
  render json: reasons
end
---Update---
This is correct, thanks to @AmitA.
reasons = reasons.joins(:project).where("projects.updated_at > ?", Time.at(updated_at.to_i))
If you want to query all reasons whose projects have some constraints, you need to use joins instead of includes:
reasons = reasons.joins(:project).where("projects.updated_at > ?", Time.at(updated_at.to_i))
Note that when both includes and joins receive a symbol they look for association with that precise name. That's why you can't actually do includes(:projects), but must do includes(:project) or joins(:project).
Also note that the constraints on joined tables specified by where must refer to the table name, not the association name. That's why I used projects.updated_at (in plural) rather than anything else. In other words, when calling the where method you are in "SQL domain".
There is a difference between includes and joins. includes runs a separate query to load the dependents, and then populates them into the fetched active record objects. So:
reasons = Reason.where('id IN (1, 2, 3)').includes(:project)
Will do the following:
Run the query SELECT * FROM reasons WHERE id IN (1,2,3), and construct the ActiveRecord objects Reason for each record.
Look into each reason fetched and extract its project_id. Let's say these are 11,12,13. Then run the query SELECT * FROM projects WHERE id IN (11,12,13) and construct the ActiveRecord objects Project for each record.
Pre-populate the project association of each Reason ActiveRecord object fetched in step 1.
The last step above means you can then safely do:
reasons.first.project
And no query will be initiated to fetch the project of the first reason. This is why includes is used to solve N+1 queries. However, note that no JOIN clauses appear in these SQL statements; they are separate queries. So you cannot add SQL constraints on the projects table when includes loads the association this way.
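The three steps described above can be imitated in plain Ruby to make the mechanism concrete. This is only a simulation under stated assumptions: the Project and Reason structs, the PROJECTS hash and the REASONS_TABLE rows are hypothetical in-memory stand-ins for the database, not ActiveRecord internals.

```ruby
# Hypothetical in-memory tables standing in for the database.
Project = Struct.new(:id, :name)
Reason = Struct.new(:id, :project_id, :project)

PROJECTS = { 11 => Project.new(11, "alpha"), 12 => Project.new(12, "beta") }
REASONS_TABLE = [[1, 11], [2, 12], [3, 11]]

# Step 1: "query" the reasons and build one object per row.
reasons = REASONS_TABLE.map { |id, project_id| Reason.new(id, project_id, nil) }

# Step 2: collect the distinct foreign keys and fetch all projects in one batch.
project_ids = reasons.map(&:project_id).uniq
projects_by_id = PROJECTS.slice(*project_ids)

# Step 3: attach each project to its reason so later access needs no lookup.
reasons.each { |reason| reason.project = projects_by_id[reason.project_id] }

puts reasons.first.project.name
```

The batch in step 2 is what replaces the per-record lookups of an N+1 pattern: the number of fetches stays at two regardless of how many reasons were loaded.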
That's where joins comes in. It simply joins the tables so that you can add where constraints on the joined tables. However, it does not pre-populate the associations for you. In fact, Reason.joins(:project) will never instantiate Project ActiveRecord objects.
If you want to do both joins and includes, you can use a third method called eager_load. You can read more about the differences here.

Deleting an active record entry from a joined table without deleting the original

I am working on a web application written in ruby on rails
There is a model of courses that exists in the database.
A user has the ability to save courses that he/she plans to take.
I accomplish this with a join table (courses_users), accessed as current_user.courses.
However, if a user wishes to remove a saved course, and I issue a delete request with that id like so
current_user.courses.find_by(id: params[:id]).destroy
I end up deleting both the entry in the joined table and the entry in the courses table. I confirmed this by looking at the server logs and found
DELETE FROM `courses_users` WHERE `courses_users`.`course_id` = 219
SQL (0.4ms) DELETE FROM `courses` WHERE `courses`.`id` = 219
So both of the actions described above are performed, when I only want the first.
Is there a way to remove just the entry in the joined table? Or does that go against the nature of a joined table? If so, is there a more efficient way of achieving this functionality?
Any help would be appreciated and if you would have me post anything else please ask.
Thank you,
EDIT: The relationships between the two models:
a user has_and_belongs_to_many courses
and a course has_and_belongs_to_many users
Your code is saying to destroy the course, so that's what ActiveRecord is doing.
You want to delete the association, not destroy the user nor the course.
The solution is to use collection.delete(object, …)
This removes each object from the collection by deleting the association from the join table.
This does not destroy the objects.
Example code:
def delete_user_course_association(user_id, course_id)
  user = User.find(user_id)
  # find_by returns nil (instead of raising) when the user has not saved this course
  course = user.courses.find_by(id: course_id)
  user.courses.delete(course) if course
end
See Rails API ActiveRecord Associations has_and_belongs_to_many
The line below deletes the courses record with id params[:id]
current_user.courses.find_by(id: params[:id]).destroy
I think you meant to do:
current_user.courses.delete(params[:id])
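To make the difference between removing an association and destroying a record concrete, here is a small plain-Ruby simulation. The courses hash, the courses_users rows and the remove_saved_course helper are hypothetical stand-ins for the two tables, not the Rails API itself.

```ruby
# Hypothetical in-memory tables: courses keyed by id, and join rows of
# [user_id, course_id] standing in for courses_users.
courses = { 219 => "Algebra", 220 => "Biology" }
courses_users = [[1, 219], [1, 220], [2, 219]]

# Remove only the association between one user and one course.
# The courses table is never touched.
def remove_saved_course(join_rows, user_id, course_id)
  join_rows.reject { |uid, cid| uid == user_id && cid == course_id }
end

courses_users = remove_saved_course(courses_users, 1, 219)

puts courses.key?(219)      # whether the course record still exists
puts courses_users.inspect  # the remaining join rows
```

Only the row linking user 1 to course 219 is removed; the course itself and other users' saved copies of it remain.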

Is it bad data design to duplicate Rails associations?

Let's say I have a Rails 3 application with the following model associations:
user
belongs_to :group
item
belongs_to :group
belongs_to :user
If code is not carefully written, this can result in data discrepancies where:
item.group
and
item.user.group
no longer return the same group, when they should. An item should always only belong to only 1 group.
My understanding is that this duplicate association may have been created to make querying simpler (reduce the number of tables joined).
So my question is, is this just an outright terrible practice or is this a question of valid trade-offs, that there are cases where the data and association duplication are acceptable because we can make querying simpler with fewer joins.
UPDATE
So far seems like the answer is "trade offs" and not "bad practice/code smell".
There seems to be multiple ways this can be handled, probably with a mix of constraints, advantages, disadvantages, use cases, etc:
1) denormalized, duplicated data as above
2) item has_one :group, :through => :user
3) item delegate :group, :to => :user
I'm trying to understand the differences between approach #2 and #3. After experimenting with both approaches in the console, it seems the queries produced by Rails when item.group is called are different. (2) produces a single query that joins groups and users. (3) produces two queries, first to find the user and then to find the group based on the user.
I think this is a question of valid trade-offs. Strictly speaking, in a fully normalized database your items table wouldn't have a group column, instead it would always go through the users table to find the group. That has the least amount of duplication, and thus the highest data integrity, but at the cost of doing that extra join every time you want to find an item's group. I'm assuming that a user also only belongs to one group. If a user can belong to many groups, then I think you would have to have that items.group_id column to know to which of those groups an item belongs.
If you want the faster query performance on lookup, you can keep the extra association like you have, and add an extra before_* hook to make sure that item.group_id = item.user.group_id, and raise a validation error if they don't match. This would make validating/inserting slightly slower, but would maximize your data integrity and still let you get slightly better performance when reading from the database.
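The consistency check suggested above can be sketched in plain Ruby. The User and Item structs and the group_errors method are hypothetical stand-ins; in a Rails model the same comparison would presumably live in a validation or before_* callback.

```ruby
# Hypothetical plain-Ruby stand-ins for the models.
User = Struct.new(:id, :group_id)
Item = Struct.new(:id, :group_id, :user) do
  # Returns a list of error messages; an empty list means the duplicated
  # group_id agrees with the group of the item's user.
  def group_errors
    return [] if user.nil? || group_id == user.group_id

    ["group_id #{group_id} does not match the user's group_id #{user.group_id}"]
  end
end

owner = User.new(1, 5)
consistent = Item.new(1, 5, owner)
drifted = Item.new(2, 7, owner)

puts consistent.group_errors.empty?
puts drifted.group_errors.inspect
```

The check costs one extra comparison (and a user lookup) on write, which is the trade described above: slightly slower inserts in exchange for join-free reads.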

How to prevent rails `has_many` relation joining two huge tables

I am using Ruby on Rails 3.1.10 in developing a web application.
Objective is to find all users that a user is following.
Let there be two models User and Following
In User model:
has_many :following_users, :through => :followings
When calling user.following_users, Rails by default generates a query with an INNER JOIN between the users and followings tables.
When users table has over 50,000 records while followings table has over 10,000,000 records, the inner join generated is resource demanding.
Any thoughts on how to optimize the performance by avoiding inner joining two big tables?
To avoid a single query with inner join, you can do 2 select queries by using the following method
# User.rb
# assuming that Following has a followed_id column for user that is being followed
def following_users_nojoin
  @following_users_nojoin ||= User.where("id IN (?)", followings.map(&:followed_id))
end
This will not perform a join, but it will make two SQL queries: one to get all the followings that belong to the user (unless they are already cached) and a second to find all the followed users. A user_id index on followings, as suggested in the comment, would speed up the first query, where we get all the followings for the user.
The above method would be faster than a single join query if the followings of a user have already been retrieved.
Read this for details on whether it is faster to make multiple select queries over a single query with join. The best way to find out which one is faster is to benchmark both methods on your production database.

Eager loading with polymorphic association using conditions of another included association

I have an activity feed and I want to eager-load all the activities that are present.
Each activity has an actor, which is polymorphic. Suppose there can be users or groups as actors.
Then I have a UserActivity table to know which activities are copied to which user.
Now, for the activity feed, I want to eager-load all the users and groups that are actors for each activity.
I have been reading and I know it will generate multiple SQL queries, because the include involves a polymorphic association. My statement is now:
@activities = Activity.find(:all, :order=>"activities.updated_at DESC", :conditions => ['user_activities.user_id=?', @user.id], :include=> [:user_activities, :actor])
However, this raises an error:
"Can not eagerly load the polymorphic association :actor"
I tried the same without the conditions and it does work, generating only the SQL queries for the associations that are actors in the activities taken into account. (The next statement does work.)
@activities = Activity.find(:all, :order=>"activities.updated_at DESC", :include=> [ :user_activities, :actor])
But of course, this isn't what I want to do, because I want to have the condition, but I don't see a way of making it work. I have tried with where instead of the conditions option and it doesn't work, and I have used includes instead because I read that :include was deprecated for Rails 3.
It is very important for the application because it is the activity feed, so you can imagine it will generate a lot of traffic.
And I don't want to do a query each time the feed is reloaded for getting all users and groups that are in the activities, because if all the activities that are shown at one time didn't have one type of actor, I would do a SQL query that I could be saving.