Counting Child Rows As a Condition in a Query - sql

I have a Groups table with a "capacity" column and a has_many relationship with Enrollments. I want to find Groups whose enrollment count is less than their capacity. Using ActiveRecord and Ruby I can do this:
Group.all.select {|g| g.enrollments.count < g.capacity }.first
But that loads every group and runs a count query for each one. It seems like there should be a way to do this in SQL; I just don't know how. Any ideas?

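The pure SQL way of doing this would be
select groups.* from groups
left join enrollments on enrollments.group_id = groups.id
group by groups.id
having count(enrollments.id) < groups.capacity
The left join (rather than an inner join) keeps groups that have no enrollments yet, and count(enrollments.id) counts them as 0.
Or in ActiveRecord
Group.joins('left join enrollments on enrollments.group_id = groups.id').group('groups.id').having('count(enrollments.id) < groups.capacity')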
A counter cache with an index on the counter column will be faster, though you then must not create or delete enrollments behind ActiveRecord's back (for example with raw SQL), or the counter will drift out of sync.
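One thing worth checking with any join-based count is how groups with no enrollments at all are handled. The following is a minimal sketch, not part of the original answer: it uses Python's built-in sqlite3 module purely as a convenient way to run the SQL, and the sample rows and group names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "groups" is quoted because GROUPS is a keyword in some SQL dialects.
conn.executescript("""
    CREATE TABLE "groups" (id INTEGER PRIMARY KEY, name TEXT, capacity INTEGER);
    CREATE TABLE enrollments (id INTEGER PRIMARY KEY, group_id INTEGER);
    INSERT INTO "groups" (id, name, capacity) VALUES
        (1, 'full', 2), (2, 'has_room', 3), (3, 'empty', 5);
    INSERT INTO enrollments (group_id) VALUES (1), (1), (2);
""")

# Inner join: a group without enrollments produces no joined row at all.
inner_join_sql = """
    SELECT g.name FROM "groups" AS g
    INNER JOIN enrollments AS e ON e.group_id = g.id
    GROUP BY g.id
    HAVING COUNT(*) < g.capacity
    ORDER BY g.id
"""

# Left join: every group survives the join; COUNT(e.id) ignores the NULLs
# that come from unmatched groups, so an empty group counts as 0.
left_join_sql = """
    SELECT g.name FROM "groups" AS g
    LEFT JOIN enrollments AS e ON e.group_id = g.id
    GROUP BY g.id
    HAVING COUNT(e.id) < g.capacity
    ORDER BY g.id
"""

inner_names = [row[0] for row in conn.execute(inner_join_sql)]
left_names = [row[0] for row in conn.execute(left_join_sql)]
print("inner join:", inner_names)
print("left join: ", left_names)
```

With this data the inner-join form never sees the 'empty' group, while the left-join form reports it as having room.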

You can also use the :counter_cache option.
Add an enrollments_count column to the groups table in a migration:
add_column :groups, :enrollments_count, :integer, :default => 0
Set the :counter_cache option on Enrollment's belongs_to:
class Enrollment < ActiveRecord::Base
  belongs_to :group, :counter_cache => true
end
The query then only needs to compare two columns:
Group.where("capacity > enrollments_count").first

Related

Count records which has no association or association of association meets condition

Assume we have three classes:
class Person
  belongs_to :group
end
class Group
  has_many :people
  has_many :tags
end
class Tag
  belongs_to :group
end
It is possible for a Person to have group_id = NULL.
What I need is to COUNT the Person records that either have group_id = NULL or whose Group doesn't have any Tag record with status IN ('active', 'finished').
One more thing: it has to be as fast as possible. We are querying over millions of records.
Ideas? Thanks.
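One way the condition described here could be written is with NOT EXISTS. This is only a sketch under assumptions: the table and column names (people, tags, group_id, status) follow Rails conventions for the models above, the index name and the sample rows are invented, and Python's built-in sqlite3 module is used only to run the SQL. No claim is made about how it performs on millions of rows; that would need to be measured on the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, group_id INTEGER);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, group_id INTEGER, status TEXT);
    -- Hypothetical index so the subquery can look tags up by group and status.
    CREATE INDEX index_tags_on_group_id_and_status ON tags (group_id, status);

    INSERT INTO people (id, group_id) VALUES
        (1, NULL),  -- no group: counted
        (2, 10),    -- group has an 'active' tag: not counted
        (3, 20),    -- group has a 'finished' tag: not counted
        (4, 30),    -- group only has an 'archived' tag: counted
        (5, 40);    -- group has no tags at all: counted
    INSERT INTO tags (group_id, status) VALUES
        (10, 'active'), (20, 'archived'), (20, 'finished'), (30, 'archived');
""")

count_sql = """
    SELECT COUNT(*) FROM people
    WHERE people.group_id IS NULL
       OR NOT EXISTS (
            SELECT 1 FROM tags
            WHERE tags.group_id = people.group_id
              AND tags.status IN ('active', 'finished')
          )
"""
count = conn.execute(count_sql).fetchone()[0]
print(count)
```

The groups table is not needed in the query, because tags can be matched to people directly through group_id.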

Sort records based on number of rows in another table - Rails

So I have a users table and in my relationship, I have defined that a user has many submissions and submissions belong to a user. I want to sort the users table based on how many submissions they have.
Submission model
class Submission < ActiveRecord::Base
  belongs_to :user
end
User model
class User < ActiveRecord::Base
  has_many :submissions, dependent: :destroy
end
The furthest I have got is counting how many submissions each user has with this query
Submission.all.count(:group => "user_id")
This gives me a hash of user id to number of submissions, for example
{1=>3, 2=>5}
I want a sorted list of users, with the user who has the most submissions first. How can this be achieved in Rails ActiveRecord?
You can do what you want in two ways:
Using join, group by and order by count
User.select('users.*, COUNT(submissions.id) AS count_all')
  .joins('LEFT JOIN submissions ON submissions.user_id = users.id')
  .group('users.id')
  .order('COUNT(submissions.id) DESC')
This generates SQL along these lines:
SELECT users.*, COUNT(submissions.id) AS count_all FROM "users" LEFT JOIN submissions ON submissions.user_id = users.id GROUP BY users.id ORDER BY COUNT(submissions.id) DESC
The LEFT JOIN keeps the users with 0 submissions too (if you have that situation). Grouping by users.id and counting submissions.id, rather than using COUNT(*), gives each of those users its own row with a count of 0.
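To see how the join-and-count approach treats a user without submissions, here is a small self-contained sketch. It assumes Rails-style users and submissions tables and made-up sample data, and it uses Python's built-in sqlite3 module only as a way to execute the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE submissions (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
""")
# alice has 3 submissions, bob has 5, carol has none.
conn.executemany(
    "INSERT INTO submissions (user_id) VALUES (?)",
    [(1,)] * 3 + [(2,)] * 5,
)

ranking_sql = """
    SELECT users.id, COUNT(submissions.id) AS submission_total
    FROM users
    LEFT JOIN submissions ON submissions.user_id = users.id
    GROUP BY users.id
    ORDER BY submission_total DESC, users.id
"""
ranking = conn.execute(ranking_sql).fetchall()
print(ranking)
```

Because the query groups by users.id and counts submissions.id, carol still appears in the result, with a total of 0.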
Using counter_cache
The most efficient solution for querying, in this context, is to use counter_cache.
This will enable you to run a query like this:
User.order('submissions_count DESC')
which translates to:
SELECT * FROM users ORDER BY submissions_count DESC
Warning: if you want to implement this, especially in production, back up your database before starting.
Read counter_cache docs to understand what it is and how it can help you.
Add a new column on users table named submissions_count.
class AddSubmissionsCountToUsers < ActiveRecord::Migration
  def change
    add_column :users, :submissions_count, :integer, default: 0
    add_index :users, :submissions_count
  end
end
Modify your Submission model and add counter_cache.
class Submission < ActiveRecord::Base
  belongs_to :user, counter_cache: true
end
If you already have data (for example in production), update submissions_count to reflect the number of existing submissions:
User.find_in_batches do |batch|
  batch.each do |user|
    # find how many submissions this user has
    user_submissions_count = Submission.where(user_id: user.id).count
    user.update_column(:submissions_count, user_submissions_count)
  end
end
Every time a submission is created or destroyed through ActiveRecord, submissions_count is incremented or decremented for its user to reflect the change.
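The backfill step can also be expressed as a single UPDATE with a correlated subquery, which avoids one count query per user. The sketch below is an illustration only: the sample data is invented and Python's built-in sqlite3 module is used just to run the statement. In a Rails app the same SQL could be run from a migration or console, and Rails also provides User.reset_counters(user.id, :submissions) for resetting one record at a time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        submissions_count INTEGER DEFAULT 0
    );
    CREATE TABLE submissions (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users (id) VALUES (1), (2), (3);
""")
# Existing rows created before the counter column was added.
conn.executemany(
    "INSERT INTO submissions (user_id) VALUES (?)",
    [(1,)] * 3 + [(2,)] * 5,
)

# Recompute every counter from the submissions table in one statement.
conn.execute("""
    UPDATE users
    SET submissions_count = (
        SELECT COUNT(*) FROM submissions
        WHERE submissions.user_id = users.id
    )
""")

counts = dict(conn.execute("SELECT id, submissions_count FROM users"))
print(counts)
```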

Find list of groups where at least one member is part of list (rails, SQL)

I have a simple has_many :through arrangement, as shown below
# employee.rb
class Employee < ActiveRecord::Base
  has_many :group_assignments
  has_many :groups, through: :group_assignments
  # ...
end
# group.rb
class Group < ActiveRecord::Base
  has_many :group_assignments
  has_many :employees, through: :group_assignments
  # ...
end
# group_assignment.rb
class GroupAssignment < ActiveRecord::Base
  belongs_to :employee
  belongs_to :group
end
I have a list of employees. For that list, I want to grab every group that contains at least one of the employees on that list. How would I accomplish this in a manner that isn't horridly inefficient? I'm newish to Rails and very new to SQL, and I'm pretty much at a loss. I'm using SQLite in development and PostgreSQL in production.
For a list of employees named employees_list, this will work:
Group.includes(:employees).where('employees.id' => employees_list.map(&:id))
This is roughly the kind of SQL you will get:
SELECT "groups"."id" AS t0_r0,
"groups"."created_at" AS t0_r1, "groups"."updated_at" AS t0_r2,
"employees"."id" AS t1_r0, "employees"."created_at" AS t1_r1, "employees"."updated_at" AS t1_r2
FROM "groups"
LEFT OUTER JOIN "group_assignments" ON "group_assignments"."group_id" = "groups"."id"
LEFT OUTER JOIN "employees" ON "employees"."id" = "group_assignments"."employee_id"
WHERE "employees"."id" IN (1, 3)
So what is happening is that groups and group_assignments tables are first being joined with a left outer join (matching the group_id column in the group_assignments table to the id column in the groups table), and then employees again with a left outer join (matching employee_id in the group_assignments table to the id column in the employees table).
Then after that we're selecting all rows where "employees"."id" (the id of the employee) is in the list of ids from the employee list, which we get by mapping employees_list to their ids using map: employees_list.map(&:id). The map(&:id) here is shorthand for: map { |e| e.id }.
Note that you could use joins instead of includes here, but then you would get the same group more than once whenever it contains several of the listed employees, unless you also select distinct. Kind of subtle but useful thing to know.
Hope that makes sense!
This is the general idea, but depending on your data, you may need to select distinct.
Group.includes(:group_assignments => :employee).where(:employees => {:id => @employees.map(&:id)})
Or try
Group.joins(:group_assignments).where("group_assignments.employee_id in (?)", @employees.map(&:id))
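The duplicate-row caveat with a plain join is easy to reproduce. Below is a minimal sketch, assuming Rails-style groups and group_assignments tables and made-up ids, with Python's built-in sqlite3 module used only to run the SQL. It compares the joined query with and without DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "groups" is quoted because GROUPS is a keyword in some SQL dialects.
conn.executescript("""
    CREATE TABLE "groups" (id INTEGER PRIMARY KEY);
    CREATE TABLE group_assignments (
        id INTEGER PRIMARY KEY, group_id INTEGER, employee_id INTEGER
    );
    INSERT INTO "groups" (id) VALUES (1), (2), (3);
    -- Group 1 contains employees 1 and 3, group 2 contains employee 2,
    -- group 3 contains employee 3.
    INSERT INTO group_assignments (group_id, employee_id) VALUES
        (1, 1), (1, 3), (2, 2), (3, 3);
""")

employee_ids = [1, 3]
placeholders = ", ".join("?" for _ in employee_ids)

def group_ids(select_clause):
    """Run the joined query with the given SELECT clause and return the ids."""
    sql = f"""
        {select_clause} FROM "groups" AS g
        INNER JOIN group_assignments AS ga ON ga.group_id = g.id
        WHERE ga.employee_id IN ({placeholders})
        ORDER BY g.id
    """
    return [row[0] for row in conn.execute(sql, employee_ids)]

with_duplicates = group_ids("SELECT g.id")
distinct_ids = group_ids("SELECT DISTINCT g.id")
print(with_duplicates, distinct_ids)
```

Group 1 matches two of the listed employees, so it comes back twice from the plain join and once with DISTINCT.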

Excluding records from a search which are associated with a record

I'm attempting to write a SQL search which will return any records that are tagged with certain values, and exclude any of those results that have other tags.
Tags are applied using a join model, like this:
class Customer < ActiveRecord::Base
  has_many :tag_assignments
  has_many :tags, :through => :tag_assignments
end
class Tag < ActiveRecord::Base
  has_many :tag_assignments
  has_many :customers, :through => :tag_assignments
end
class TagAssignment < ActiveRecord::Base
  belongs_to :customer
  belongs_to :tag
end
The query I currently have is:
SELECT DISTINCT customers.* FROM customers LEFT OUTER JOIN tag_assignments ON tag_assignments.customer_id = customers.id WHERE (tag_assignments.tag_id NOT IN (?))
The ? is then replaced in a query by the list of tags I don't want included.
This works fine when a customer has only a single tag applied, but as soon as they get multiple tags they'll show up despite the exclusion, since one of their other tags does match.
Something to keep in mind is that this needs to continue working when additional clauses are added (such as requiring other tags to be present, or matching on other customer attributes), but any point in the right direction would be appreciated.
I'm rusty with this, but you need to get all the customers that have the unwanted tags first and then negate:
SELECT DISTINCT customers.*
FROM customers
LEFT OUTER JOIN (
  SELECT DISTINCT customers.id
  FROM customers
  INNER JOIN tag_assignments ON tag_assignments.customer_id = customers.id
  WHERE tag_assignments.tag_id IN (?)
) AS neg_customers ON (neg_customers.id = customers.id)
WHERE neg_customers.id IS NULL;
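To make the difference between the two approaches concrete, here is a small sketch with made-up data. It runs the NOT IN query from the question, the "find the tagged customers, then negate" query, and an equivalent NOT EXISTS form. Python's built-in sqlite3 module is used only to execute the SQL, and the customer and tag ids are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE tag_assignments (
        id INTEGER PRIMARY KEY, customer_id INTEGER, tag_id INTEGER
    );
    INSERT INTO customers (id) VALUES (1), (2), (3), (4);
    -- Customer 1 has tag 1, customer 2 has tags 1 and 2,
    -- customer 3 has tag 2, customer 4 has no tags.
    INSERT INTO tag_assignments (customer_id, tag_id) VALUES
        (1, 1), (2, 1), (2, 2), (3, 2);
""")

excluded_tag_ids = [2]
marks = ", ".join("?" for _ in excluded_tag_ids)

def customer_ids(sql):
    return [row[0] for row in conn.execute(sql, excluded_tag_ids)]

# Row-by-row filter: customer 2 survives through its tag 1 row.
naive = customer_ids(f"""
    SELECT DISTINCT customers.id FROM customers
    LEFT OUTER JOIN tag_assignments
        ON tag_assignments.customer_id = customers.id
    WHERE tag_assignments.tag_id NOT IN ({marks})
    ORDER BY customers.id
""")

# Anti-join: find customers with an excluded tag, keep everyone else.
anti_join = customer_ids(f"""
    SELECT DISTINCT customers.id FROM customers
    LEFT OUTER JOIN (
        SELECT DISTINCT customer_id FROM tag_assignments
        WHERE tag_id IN ({marks})
    ) AS neg_customers ON neg_customers.customer_id = customers.id
    WHERE neg_customers.customer_id IS NULL
    ORDER BY customers.id
""")

# The same exclusion written with NOT EXISTS.
not_exists = customer_ids(f"""
    SELECT customers.id FROM customers
    WHERE NOT EXISTS (
        SELECT 1 FROM tag_assignments
        WHERE tag_assignments.customer_id = customers.id
          AND tag_assignments.tag_id IN ({marks})
    )
    ORDER BY customers.id
""")
print(naive, anti_join, not_exists)
```

The NOT EXISTS form is just another WHERE condition, so further clauses (required tags, customer attributes) can be combined with it using AND.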

Is it possible to define a single SQL query that draws a set of permissible foreign_keys from one table and then uses them to filter another?

Specifically, I'm trying to figure out if it's possible to generate SQL that does what I want to feed into Ruby-on-Rails' find_by_sql method.
Imagine there are Users, who are joined cyclically to other Users by a join table Friendships. Each User has the ability to create Comments.
I'd like a SQL query to return the latest 100 comments created by any friends of a given user so I can display them all in one convenient place for the user to see.
This is tricky, since essentially I'm looking to filter the comments by whether their author foreign keys are contained in the set of keys derived from the user's friends' primary keys.
Edit: Clarifying the setup. I'm not exactly sure how to write a schema definition, so I'll describe it in terms of Rails.
class User
  has_many :friendships
  has_many :friends, :through => :friendships
  has_many :comments
end
class Friendship
  belongs_to :user
  belongs_to :friend, :class_name => "User", :foreign_key => "friend_id"
end
class Comment
  belongs_to :user
end
It's not that tricky, you just use joins. To get just the comments you only need to join the Friendships table and the Comments table, but you probably also want some information from the Users table for the person who wrote the comment.
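This would get the last 100 comments from people who are friends with the user with id 42. Note that top 100 is SQL Server syntax (in PostgreSQL, MySQL or SQLite, put limit 100 at the end of the query instead), and the column names are generic, so adjust them to your schema: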
select top 100 c.CommentId, c.CommentText, c.PostDate, u.Name
from Friendships f
inner join Users u on u.UserId = f.FriendUserId
inner join Comments c on c.UserId = u.UserId
where f.UserId = 42
order by c.PostDate desc
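For a schema that follows the Rails conventions described in the question, the same join might look like the sketch below. The column names (friendships.user_id, friendships.friend_id, comments.user_id, comments.body, comments.created_at) and the sample rows are assumptions for illustration, LIMIT replaces TOP, and Python's built-in sqlite3 module is used only to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friendships (
        id INTEGER PRIMARY KEY, user_id INTEGER, friend_id INTEGER
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT, created_at TEXT
    );
    INSERT INTO users (id, name) VALUES
        (42, 'me'), (1, 'ann'), (2, 'ben'), (3, 'stranger');
    INSERT INTO friendships (user_id, friend_id) VALUES (42, 1), (42, 2);
    INSERT INTO comments (id, user_id, body, created_at) VALUES
        (1, 1, 'first', '2024-01-01'),
        (2, 3, 'not a friend', '2024-01-02'),
        (3, 2, 'second', '2024-01-03'),
        (4, 42, 'my own', '2024-01-04');
""")

friends_comments_sql = """
    SELECT c.id, c.body, u.name
    FROM friendships AS f
    INNER JOIN users AS u ON u.id = f.friend_id
    INNER JOIN comments AS c ON c.user_id = u.id
    WHERE f.user_id = ?
    ORDER BY c.created_at DESC
    LIMIT 100
"""
latest = conn.execute(friends_comments_sql, (42,)).fetchall()
print(latest)
```

A query like this can be passed to find_by_sql with the user id as a bind value; the user's own comments and comments from non-friends are left out because only rows reachable through friendships are joined.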