Find an existing messaging group when given the potential members - sql

I have an application where users can create messaging groups. MessageGroups have members through MessageMemberships. Each MessageMembership belongs to a 'profile', which is polymorphic because there are different types of 'profiles' in the db.
class MessageGroup < ApplicationRecord
  has_many :message_memberships, dependent: :destroy
  has_many :coach_profiles, through: :message_memberships, source: :profile, source_type: "CoachProfile"
  has_many :parent_profiles, through: :message_memberships, source: :profile, source_type: "ParentProfile"
  has_many :customers, through: :message_memberships, source: :profile, source_type: "Customer"
end
class MessageMembership < ApplicationRecord
  belongs_to :message_group
  belongs_to :profile, polymorphic: true
end
In my UI, I'd like to be able to first query to see if a messaging group exists with exactly x members so I can use that, rather than creating an entirely new messaging group (similar to how Slack or iMessages will find you an existing thread).
How would you go about querying that?

The code (not tested) below assumes:
You have (or can add) a message_memberships_count counter_cache column on the message_groups table (and perhaps an index on that column to speed up the query)
You have proper unique indexing in the message_memberships table that will prevent a profile from being added to the same message_group multiple times
How it works:
There is a loop that will do multiple inner joins on the same table to ensure that the association exists for each profile
The query will then check that the total number of members in the group is equal to the number of profiles
require "digest"
class MessageGroup < ApplicationRecord
  def self.for_profiles(profiles)
    query = "SELECT \"message_groups\".* FROM \"message_groups\""
    profiles.each do |profile|
      # assumed: the polymorphic type column stores the profile's class name
      klass = profile.class.name
      # provide an alias to the table to prevent `PG::DuplicateAlias: ERROR`
      table_alias = "message_memberships_#{Digest::SHA1.hexdigest("#{klass}_#{profile.id}")[0..6]}"
      query += " INNER JOIN \"message_memberships\" \"#{table_alias}\" ON \"#{table_alias}\".\"message_group_id\" = \"message_groups\".\"id\" AND \"#{table_alias}\".\"profile_type\" = '#{klass}' AND \"#{table_alias}\".\"profile_id\" = #{profile.id}"
    end
    query += " where \"message_groups\".\"message_memberships_count\" = #{profiles.length}"
    # run the assembled SQL and return the matching groups
    find_by_sql(query)
  end
end
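The matching rule behind this query (every given profile is a member, and the member count equals the number of profiles) can be checked in plain Ruby without a database. This is only a minimal sketch of the logic, not the query itself; Member, Group and find_exact_group are hypothetical names made up for the illustration.

```ruby
require "set"

# Hypothetical in-memory stand-ins for rows in message_memberships / message_groups.
Member = Struct.new(:profile_type, :profile_id)
Group  = Struct.new(:id, :members)

# Returns the first group whose membership is exactly the given profiles:
# every profile is present (the role of the chained INNER JOINs) and there
# are no extra members (the role of the message_memberships_count check).
def find_exact_group(groups, profiles)
  wanted = profiles.to_set
  groups.find do |group|
    group.members.size == wanted.size && wanted.subset?(group.members.to_set)
  end
end

coach  = Member.new("CoachProfile", 1)
parent = Member.new("ParentProfile", 7)
other  = Member.new("Customer", 3)

groups = [
  Group.new(1, [coach, parent, other]), # has an extra member
  Group.new(2, [coach]),                # is missing a member
  Group.new(3, [parent, coach])         # exact match, order does not matter
]

match = find_exact_group(groups, [coach, parent])
puts match.id # prints 3
```

The size comparison is only safe because of the second assumption above: if a profile could be added to the same group twice, the count would no longer equal the number of distinct members.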

Based on @AbM's answer I arrived at the following. It makes the same assumptions as the previous answer: the counter cache and unique indexing should be in place.
def self.find_direct_with_profiles!(profiles)
  # Not present, some authorization checks that may raise (hence the bang method name)
  # Loop through the profiles and join them all together so we get a join that contains
  # all the data we need in order to filter it down
  join = ""
  conditions = ""
  profiles.each_with_index do |profile, index|
    # assumed: the polymorphic type column stores the profile's class name
    klass = profile.class.name
    # provide an alias to the table to prevent `PG::DuplicateAlias: ERROR`
    table_alias = "message_memberships_#{Digest::SHA1.hexdigest("#{klass}_#{profile.id}")[0..6]}"
    join += " INNER JOIN \"message_memberships\" \"#{table_alias}\" ON \"#{table_alias}\".\"message_group_id\" = \"message_groups\".\"id\""
    condition_join = index == 0 ? 'where' : ' and'
    conditions += "#{condition_join} \"#{table_alias}\".\"profile_type\" = '#{klass}' and \"#{table_alias}\".\"profile_id\" = #{profile.id}"
  end
  # Add one more condition: the group must have exactly as many members as there are profiles
  size_conditional = " and \"message_groups\".\"message_memberships_count\" = #{profiles.size}"
  # Add any other conditions you may need
  conditions += size_conditional
  query = "SELECT \"message_groups\".* FROM \"message_groups\" #{join} #{conditions}"
  # find_by_sql returns an array with hydrated models from the select statement. In this case I am just grabbing the first one to match other finder active record method conventions
  find_by_sql(query).first
end


Get most recent records from deeply nested model

Say I have 3 models:
ModelA has many ModelB
ModelB has many ModelC
I'm querying ModelA, but in ModelC I have multiple records of the same type. Let's say I have 3, but I only need the most recent one.
I tried to do something like this...
records = ModelA.where(some query).includes(modelBs: :modelCs)
# convert activerecord collection to array
records = records.to_a
records.each do |record|
  record.modelBs.each do |modelB|
    # filter the modelCs i don't need
    modelB.modelCs = filteredModelCs
  end
end
return records
But instead of merely returning the array of records, an UPDATE SQL query is run and the db records are modified. This is a surprise, because I never called .save and I thought I had converted the ActiveRecord collection to an array.
How can I filter deeply nested records without modifying the db records, so that I can return the filtered result?
Assigning a list of instances to a has_many collection with = will immediately persist the changes to the database.
Instead, I would try to solve this with more specific associations like this:
class A
  has_many :bs
  has_many(:cs, through: :bs)
  # :source is only accepted together with :through
  has_one :recent_c, -> { order(created_at: :desc).limit(1) }, through: :bs, source: :cs
end
class B
  has_many :cs
end
With those associations, I would expect the following to work:
as = A.where(some query).includes(:recent_c)
as.each do |a|
  a.recent_c # returns the most recent c for this a
end
If I understand you correctly, you want a collection of the latest Cs, which are connected to Bs, which in turn are connected to a certain A relation? If so, you can do something like this (assuming you have tables as, bs and cs):
class A < ApplicationRecord
  has_many :bs
end
class B < ApplicationRecord
  belongs_to :a
  has_many :cs
end
class C < ApplicationRecord
  belongs_to :b
  # keep only the C with the highest id for each b_id
  scope :recent_for_bs, -> { joins(<<~SQL) }
    INNER JOIN (SELECT b_id, MAX(id) AS max_id FROM cs GROUP BY b_id) recent_cs
    ON cs.b_id = recent_cs.b_id AND cs.id = recent_cs.max_id
  SQL
end
And then you would query Cs like that:
C.recent_for_bs.joins(b: :a).merge(A.where(some_query))
You get recent Cs, inner join them with Bs and As and then get records connected to your A-relation by merging it.
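To see what the subquery in that scope selects, here is a small plain-Ruby sketch of the same "greatest row per group" idea. CRow and recent_per_b are hypothetical names, and, like the SQL above, it treats the highest id as the most recent record, which assumes ids grow with creation time.

```ruby
# Hypothetical stand-in for rows of the cs table.
CRow = Struct.new(:id, :b_id)

# Mirrors: SELECT b_id, MAX(id) AS max_id FROM cs GROUP BY b_id
# and the join back on cs.id = recent_cs.max_id.
def recent_per_b(cs)
  cs.group_by(&:b_id).map { |_b_id, rows| rows.max_by(&:id) }
end

cs = [CRow.new(1, 10), CRow.new(2, 10), CRow.new(3, 20), CRow.new(4, 10)]
recent = recent_per_b(cs)

p recent.map(&:id).sort # prints [3, 4]
```

Nothing is assigned back to an association here, so no UPDATE would be triggered; the filtered rows live in a separate array.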

Rails and SQL - get related by all elements from array, entries

I have something like this:
duplicates = ['a','b','c','d']
if duplicates.length > 4
  Photo.includes(:tags).where('tags.name IN (?)', duplicates)
    .references(:tags).limit(15).each do |f|
duplicates is an array of tags that were duplicated with other Photo tags.
What I want is to get the Photos that include all tags from the duplicates array, but right now I get every Photo that includes at least one tag from the array.
I tried a few approaches and some started to work, but they weren't very clear to me and took a while to execute.
Currently I do it by building arrays, comparing them, taking the duplicates that occur more than X times, and ending up with a unique array of photo ids.
If you want to find photos that have all the given tags you just need to apply a GROUP and use HAVING to set a condition on the group:
class Photo
  def self.with_tags(*names)
    t = Tag.arel_table
    joins(:tags)
      .where(tags: { name: names })
      .group(:id)
      .having(t[:id].count.eq(names.length)) # COUNT(tags.id) = ?
  end
end
This is somewhat like a WHERE clause but it applies to the group. Using .gteq (>=) instead of .eq will give you records that can have all the tags in the list but may have more.
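As a rough illustration of what the WHERE, GROUP and HAVING steps do together, here is the same filtering on plain Ruby arrays. The rows and the photo_ids_with_all_tags helper are hypothetical, and the sketch assumes a photo never carries the same tag twice.

```ruby
# Hypothetical join rows: one [photo_id, tag_name] pair per tag on a photo.
rows = [
  [1, "a"], [1, "b"], [1, "c"],
  [2, "a"], [2, "b"],
  [3, "a"], [3, "b"], [3, "c"], [3, "z"]
]

def photo_ids_with_all_tags(rows, names)
  rows
    .select { |_photo_id, tag| names.include?(tag) }         # WHERE tags.name IN (...)
    .group_by { |photo_id, _tag| photo_id }                  # GROUP BY photos.id
    .select { |_photo_id, group| group.size == names.size }  # HAVING COUNT(tags.id) = ?
    .keys
end

p photo_ids_with_all_tags(rows, %w[a b c]) # prints [1, 3]
```

Photo 3 still matches although it also has the tag "z", because the WHERE step discards non-matching tags before the count is taken.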
A better way to solve this is to use a better domain model that doesn't allow duplicates in the first place:
class Photo < ApplicationRecord
  has_many :taggings
  has_many :tags, through: :taggings
end
class Tag < ApplicationRecord
  has_many :taggings
  has_many :photos, through: :taggings
  validates :name,
    uniqueness: true,
    presence: true
end
class Tagging < ApplicationRecord
  belongs_to :photo
  belongs_to :tag
  validates :tag_id,
    uniqueness: { scope: :photo_id }
end
By adding a unique index on tags.name and a compound unique index on taggings.tag_id and taggings.photo_id, duplicates cannot be created.
The issue as I see it is that you're only doing one join, which means that you have to specify that tags.name is within the list of duplicates.
You could solve this in two places:
In the database query
In your application code
For your example the query is something like "find all records in the photos table which also have a relation to a specific set of records in the tags table". So we need to join the photos table to the tags table, and also specify that the only tags we join are those within the duplicate list.
We can use an inner join for this:
select photos.* from photos
inner join tags as d1 on d1.name = 'a' and d1.photo_id = photos.id
inner join tags as d2 on d2.name = 'b' and d2.photo_id = photos.id
inner join tags as d3 on d3.name = 'c' and d3.photo_id = photos.id
inner join tags as d4 on d4.name = 'd' and d4.photo_id = photos.id
In ActiveRecord it seems we can't specify aliases for joins, but we can chain queries, so we can do something like this:
query = Photo
duplicates.each_with_index do |tag, index|
  join_name = "d#{index}"
  query = query.joins("inner join tags as #{join_name} on #{join_name}.name = '#{tag}' and #{join_name}.photo_id = photos.id")
end
Ugly, but gets the job done. I'm sure there would be a better way using arel instead - but it demonstrates how to construct a SQL query to find all photos that have a relation to all of the duplicate tags.
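One thing to watch when interpolating tag names into join strings is quoting. The sketch below builds the same aliased join clauses as plain strings and escapes single quotes by doubling them. quote_literal and tag_join_clauses are hypothetical helpers written only for this illustration; in a real Rails app the connection's own quoting would be the safer choice.

```ruby
# Naive SQL string-literal quoting (doubles single quotes); illustration only.
def quote_literal(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Builds one aliased inner join per tag: d0, d1, d2, ...
def tag_join_clauses(tags)
  tags.each_with_index.map do |tag, index|
    join_name = "d#{index}"
    "inner join tags as #{join_name} " \
      "on #{join_name}.name = #{quote_literal(tag)} " \
      "and #{join_name}.photo_id = photos.id"
  end
end

clauses = tag_join_clauses(["a", "it's"])
puts clauses
```

Each clause could then be passed to joins one at a time, exactly as in the loop above.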
The other method is to extend what you have and filter in the application. As you already have the photos that have at least one of the tags, you could just select those which have all the tags.
Photo.includes(:tags)
  .where('tags.name IN (?)', duplicates).references(:tags)
  .select do |photo|
    (duplicates - photo.tags.map(&:name)).empty?
  end
(duplicates - photo.tags.map(&:name)) takes the duplicates array and removes all occurrences of any item that is also in the photo tags. If this returns an empty array then we know that the tags in the photo had all the duplicate tags as well.
This could have performance issues if the duplicates array is large, since it could potentially return all photos from the database.
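The array-difference check can be tried out in plain Ruby. In this small sketch the photo names and the has_all_tags? helper are made up for the example:

```ruby
duplicates = %w[a b c d]

# Hypothetical photos mapped to their tag names.
photos = {
  "sunset.jpg" => %w[a b c d e],
  "beach.jpg"  => %w[a b],
  "forest.jpg" => %w[d c b a]
}

# True when every wanted tag also appears in photo_tags.
def has_all_tags?(photo_tags, wanted)
  (wanted - photo_tags).empty?
end

matching = photos.select { |_name, tags| has_all_tags?(tags, duplicates) }.keys
p matching # prints ["sunset.jpg", "forest.jpg"]
```

Extra tags on a photo and the order of the tags make no difference to the result.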

Rails Many-to-many relationship with extension generating incorrect SQL

I'm having an issue where a many-to-many relationship with an "extension" is generating incorrect SQL.
class OrderItem < ApplicationRecord
  belongs_to :buyer, class_name: :User
  belongs_to :order
  belongs_to :item, polymorphic: true
end
class User < ApplicationRecord
  has_many :order_items_bought,
    -> { joins(:order).where.not(orders: { state: :expired }).order(created_at: :desc) },
    foreign_key: :buyer_id,
    class_name: :OrderItem
  has_many :videos_bought,
    -> { joins(:orders).select('DISTINCT ON (videos.id) videos.*').reorder('videos.id DESC') },
    through: :order_items_bought,
    source: :item,
    source_type: :Video do
      def confirmed
        where(orders: { state: :confirmed })
      end
    end
end
user.videos_bought.confirmed generates this SQL:
Video Load (47.0ms) SELECT DISTINCT ON (videos.id) videos.* FROM
"videos" INNER JOIN "order_items" "order_items_videos_join" ON
"order_items_videos_join"."item_id" = "videos"."id" AND
"order_items_videos_join"."item_type" = $1 INNER JOIN
"orders" ON "orders"."id" = "order_items_videos_join"."order_id" INNER JOIN
"order_items" ON "videos"."id" = "order_items"."item_id" WHERE
"order_items"."buyer_id" = $2 AND ("orders"."state" != $3) AND "order_items"."item_type" = $4 AND
"orders"."state" = $5 ORDER BY videos.id DESC, "order_items"."created_at" DESC LIMIT $6
Which returns some Video records which are joined with orders that do NOT have state confirmed. I would expect all orders to have state confirmed.
If I use raw SQL everything works fine:
has_many :videos_bought,
  -> {
    joins('INNER JOIN orders ON orders.id = order_items.order_id')
      .select('DISTINCT ON (videos.id) videos.*')
      .reorder('videos.id DESC')
  },
  through: :order_items_bought,
  source: :item,
  source_type: :Video do
    def confirmed
      where(orders: { state: :confirmed })
    end
  end
Now user.videos_bought.confirmed generates this SQL:
Video Load (5.4ms) SELECT DISTINCT ON (videos.id) videos.* FROM
"videos" INNER JOIN "order_items" ON
"videos"."id" = "order_items"."item_id" INNER JOIN orders ON orders.id = order_items.order_id WHERE
"order_items"."buyer_id" = $1 AND ("orders"."state" != $2) AND
"order_items"."item_type" = $3 AND "orders"."state" = $4 ORDER BY videos.id DESC, "order_items"."created_at" DESC LIMIT $5
Which seems more succinct because it avoids the auto generated order_items_videos_join name. It also only returns orders that have state confirmed.
Any idea what is going on? Does ActiveRecord just generate faulty SQL sometimes?
Using rails 5.1.5. Upgrading to latest made no difference.
I'm hoping to get an explanation of why Rails generates the order_items_videos_join alias in the first case but not in the second. Also, why does the first SQL query produce incorrect results? I can edit the question with more code and data samples if needed.
ActiveRecord does not just generate faulty SQL sometimes, but there's a little nuance to it such that starting simple is best when it comes to defining relationships. For example, let's rework your queries to get that DISTINCT ON out of there. I've never seen a need to use that SQL clause.
Before chaining highly customized association logic, let's just see if there's a simpler way to query first, and then check whether there's a strong case for turning your queries into associations.
Looks like you've got a schema like this:
buyer_id (any reason this is on OrderItem and not on Order?)
item_type (Video)
item_id (videos.id)
A couple of tidbits
No need to create association extensions for query conditions that would make perfectly good scopes on the model. See below.
A perfectly good query might look like
Video.joins(order_item: :order).
  where(order_items: {
    buyer_id: 123,
    order: {
      state: 'confirmed'
    }
  }).
  # The following was part of the initial logic but
  # doesn't alter the query results.
  # where.not(order_items: {
  #   order: {state: 'expired'}
  # }).
  order('id desc')
Here's another way:
class User < ActiveRecord::Base
  has_many :order_items, foreign_key: 'buyer_id'
  def videos_purchased
    Video.where(id: order_items.videos.confirmed.pluck(:id))
  end
end
class OrderItem < ActiveRecord::Base
  belongs_to :order, class_name: 'User', foreign_key: 'buyer_id'
  belongs_to :item, polymorphic: true
  scope :bought, -> {where.not(orders: {state: 'cancelled'})}
  scope :videos, -> {where(item_type: 'Video')}
end
class Video < ActiveRecord::Base
  has_many :order_items, ...
  scope :confirmed, -> {where(orders: {state: 'confirmed'})}
end
user = User.first()
I might have the syntax a little screwy when it comes to table and attribute names, but this should be a start.
Notice that I changed it from one to two queries. I suggest running with that until you really notice that you have a performance problem, and even then you might have an easier time just caching queries, than trying to optimize complex join logic.

Query for all of a specific record where its related resource all have the same value for a single attribute

class User < ActiveRecord::Base
  has_many :memberships
  # included columns
  # id: integer
end
class Membership < ActiveRecord::Base
  belongs_to :user
  # included columns
  # user_id: integer
  # active: boolean
end
I'd like to be able to grab all users where all their memberships have 'active = false' in a single query. So far the best that I've been able to come up with is:
# grab possibles
users = User.joins(:memberships).where('memberships.active = false')
# select ones that satisfy condition
users.select { |user| user.memberships.pluck(:active).uniq == [false] }
which is not that great since I have to use ruby to pluck out the valid ones.
This can do the trick:
users_with_active_membership = User.joins(:memberships).where(memberships: { active: true })
users = User.where('users.id NOT IN (?)', users_with_active_membership.pluck(:id))
I am not sure of the result but I expect it to be 2 nested queries, one selecting the User ids having an active membership, the other query to select the User not in this previous ids list.
I can't test it since I don't have an environment with these relationships. Can you try it and post the SQL query generated? (add .to_sql to see it)
Another way, I don't know which could be the most efficient:
User.where('users.id NOT IN (?)', Membership.where(active: true).group(:user_id).pluck(:user_id))
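The "not in the list of users with an active membership" logic can be sketched in plain Ruby to show one edge case: a user with no memberships at all also passes the NOT IN filter. MembershipRow and the sample data are hypothetical, chosen only to make the cases visible.

```ruby
# Hypothetical stand-in for rows of the memberships table.
MembershipRow = Struct.new(:user_id, :active)

user_ids = [1, 2, 3, 4]
memberships = [
  MembershipRow.new(1, false), MembershipRow.new(1, false), # all inactive
  MembershipRow.new(2, true),  MembershipRow.new(2, false), # mixed
  MembershipRow.new(3, true)                                # all active
  # user 4 has no memberships at all
]

# Same idea as: id NOT IN (ids of users with an active membership)
active_user_ids = memberships.select(&:active).map(&:user_id).uniq
not_in = user_ids - active_user_ids

# Additionally require that the user has at least one membership
with_memberships = memberships.map(&:user_id).uniq
all_inactive = not_in & with_memberships

p not_in       # prints [1, 4]
p all_inactive # prints [1]
```

If users without memberships should be excluded, the NOT IN condition would need to be combined with a join on memberships (or an equivalent check), as the second step does here.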

Query a 3-way relationship in Active Record

I'm trying to figure out how to query this relationship without using find_by_sql
class User < ActiveRecord::Base
  has_many :lists
end
class List < ActiveRecord::Base
  has_many :list_items
  belongs_to :user
end
class ListItem < ActiveRecord::Base
  belongs_to :list
  belongs_to :item
end
class Item < ActiveRecord::Base
  has_many :list_items
end
This is what we are using now, but how would I do this without find_by_sql?
in user.rb
def self.find_users_who_like_by_item_id item_id
  find_by_sql(["select u.* from users u, lists l, list_items li where l.list_type_id=10 and li.item_id=? and l.user_id=u.id and li.list_id=l.id", item_id])
end
I've tried several different includes / joins / merge scenarios but am not able to get at what I'm trying to do.
It's a bit difficult to tell exactly what query you're trying to do here, but it looks like you want the user records where the user has a list with a particular list_type_id and containing a particular item. That would look approximately like this:
User.joins(:lists => [:list_items]).where('lists.list_type_id = ? and list_items.item_id = ?', list_type_id, item_id)
This causes ActiveRecord to execute a query like the following:
SELECT "users".* FROM "users" INNER JOIN "lists" ON "lists"."user_id" = "users"."id" INNER JOIN "list_items" ON "list_items"."list_id" = "lists"."id" WHERE (lists.list_type_id = 10 and list_items.item_id = 6)
and return the resulting collection of User objects.