Rails and SQL - get related by all elements from array, entries - sql

I have something like this:
duplicates = ['a','b','c','d']
if duplicates.length > 4
Photo.includes(:tags).where('tags.name IN (?)',duplicates)
.references(:tags).limit(15).each do |f|
returned_array.push(f.id)
end
end
duplicates is an array of tags that were duplicated with other Photo tags
What I want is to get Photo which includes all tags from duplicates array, but right now I get every Photo that include at least one tag from array.
THANKS FOR ANSWERS:
I try them and somethings starts to work but wasn't too clear for me and take some time to execute.
Today I make it creating arrays, compare them, take duplicates which exist in array more than X times and finally have uniq array of photos ids.

If you want to find photos that have all the given tags you just need to apply a GROUP and use HAVING to set a condition on the group:
class Photo
def self.with_tags(*names)
t = Tag.arel_table
joins(:tags)
.where(tags: { name: names })
.group(:id)
.having(t[:id].count.eq(tags.length)) # COUNT(tags.id) = ?
end
end
This is somewhat like a WHERE clause but it applies to the group. Using .gteq (>=) instead of .eq will give you records that can have all the tags in the list but may have more.
A better way to solve this is to use a better domain model that doesn't allow duplicates in the first place:
class Photo < ApplicationRecord
has_many :taggings
has_many :tags, through: :taggings
end
class Tag < ApplicationRecord
has_many :taggings
has_many :photos, through: :taggings
validates :name,
uniqueness: true,
presenece: true
end
class Tagging < ApplicationRecord
belongs_to :photo
belongs_to :tag
validates :tag_id,
uniqueness: { scope: :photo_id }
end
By adding unique indexes on tags.name and a compound index on taggings.tag_id and taggings.photo_id duplicates cannot be created.

The issue as I see it is that you're only doing one join, which means that you have to specify that tags.name is within the list of duplicates.
You could solve this in two places:
In the database query
In you application code
For your example the query is something like "find all records in the photos table which also have a relation to a specific set of records in the tags table". So we need to join the photos table to the tags table, and also specify that the only tags we join are those within the duplicate list.
We can use a inner join for this
select photos.* from photos
inner join tags as d1 on d1.name = 'a' and d1.photo_id = photos.id
inner join tags as d2 on d2.name = 'b' and d2.photo_id = photos.id
inner join tags as d3 on d3.name = 'c' and d3.photo_id = photos.id
inner join tags as d4 on d4.name = 'd' and d4.photo_id = photos.id
In ActiveRecord it seems we can't specify aliases for joins, but we can chain queries, so we can do something like this:
query = Photo
duplicate.each_with_index do |tag, index|
join_name = "d#{index}"
query = query.joins("inner join tags as #{join_name} on #{join_name}.name = '#{tag}' and #{join_name}.photo_id = photos.id")
end
Ugly, but gets the job done. I'm sure there would be a better way using arel instead - but it demonstrates how to construct a SQL query to find all photos that have a relation to all of the duplicate tags.
The other method is to extent what you have and filter in the application. As you already have the photos that has at least one of the tags, you could just select those which have all the tags.
Photo
.includes(:tags)
.joins(:tags)
.where('tags.name IN (?)',duplicates)
.select do |photo|
(duplicates - photo.tags.map(&:name)).empty?
end
(duplicates - photo.tags.map(&:name)).empty? takes the duplicates array and removes all occurrences of any item that is also in the photo tags. If this returns an empty array then we know that the tags in the photo had all the duplicate tags as well.
This could have performance issues if the duplicates array is large, since it could potentially return all photos from the database.

Related

Get most recent records from deeply nested model

Say I have 3 models:
ModelA has many ModelB
ModelB has many ModelC
I'm querying ModelA, but in ModelC I have multiple ones of the same type, let's say I have 3 but I only need the most recently one.
I tried to do something like this...
records = ModelA.where(some query).includes ModelB includes ModelC
// convert activerecord collection to array
records = records.to_a
records.each do |record|
record.modelBs.each do |modelB|
filter the modelCs i don't need
modelB.modelCs = filteredModelCs
end
end
return records
but instead of merely returning the array of records, an UPDATE sql query is run and the db records are modified. this is a surprise because i never used the .save method and i thought i had converted the collection from an active record collection to an array
How can I filter deeply nested records without modifying the db records? then i can return the filtered result
Assigning a list of instances to a has_many collection with = will immediately persist the changes to the database.
Instead, I would try to solve this with more specific associations like this:
class A
has_many :bs
has_many(:cs, through: :bs)
has_one :recent_c, -> { order(created_at: :desc).limit(1) }, source: :cs
class B
has_many :cs
With those associations, I would expect the following to work:
as = A.where(some query).includes(:recent_c)
as.each do |a|
a.recent_c # returns the most recent c for this a
end
If I got you right, you want to get a collection of latest Cs, which are connected to Bs, which are connected to certain A-relation? If so, you can do something like that (considering you have tables as, bs and cs):
class A < ApplicationRecord
has_many :bs
end
class B < ApplicationRecord
belongs_to :a
has_many :cs
end
class C < ApplicationRecord
belongs_to :b
scope :recent_for_bs, -> { joins(
<<-sql
INNER JOIN (SELECT b_id, MAX(id) AS max_id FROM cs GROUP BY b_id) recent_cs
ON cs.b_id = recent_cs.b_id AND cs.id = recent_cs.max_id
sql
) }
end
And then you would query Cs like that:
C.recent_for_bs.joins(b: :a).merge(A.where(some_query))
You get recent Cs, inner join them with Bs and As and then get records connected to your A-relation by merging it.

ActiveRecord .merge not working on two relations

I have the following models in my app:
class Company < ActiveRecord::Base
has_many :gallery_cards, dependent: :destroy
has_many :photos, through: :gallery_cards
has_many :direct_photos, class_name: 'Photo'
end
class Photo < ActiveRecord::Base
belongs_to :gallery_card
belongs_to :company
end
class GalleryCard < ActiveRecord::Base
belongs_to :company
has_many :photos
end
As you can see, Company has_many :photos, through: :gallery_cards and also has_many :photos. Photo has both a gallery_card_id and a company_id column.
What I want to be able to do is write a query like #company.photos that returns an ActiveRecord::Relation of all the company's photos. In my Company model, I currently have the method below, but that returns an array or ActiveRecord objects, rather than a relation.
def all_photos
photos + direct_photos
end
I've tried using the .merge() method (see below), but that returns an empty relation. I think the reason is because the conditions that are used to select #company.photos and #company.direct_photos are different. This SO post explains it in more detail.
#company = Company.find(params[:id])
photos = #company.photos
direct_photos = #company.direct_photos
direct_photos.merge(photos) = []
photos.merge(direct_photos) = []
I've also tried numerous combinations of .joins and .includes without success.
this might be a candidate for a raw SQL query, but my SQL skills are rather basic.
For what it's worth, I revisited this and came up (with help) another query that grabs everything in one shot, rather than building an array of ids for a second query. This also includes the other join tables:
Photo.joins("
LEFT OUTER JOIN companies ON photos.company_id = #{id}
LEFT OUTER JOIN gallery_cards ON gallery_cards.id = photos.gallery_card_id
LEFT OUTER JOIN quote_cards ON quote_cards.id = photos.quote_card_id
LEFT OUTER JOIN team_cards ON team_cards.id = photos.team_card_id
LEFT OUTER JOIN who_cards ON who_cards.id = photos.who_card_id
LEFT OUTER JOIN wild_cards ON wild_cards.id = photos.wild_card_id"
).where("photos.company_id = #{id}
OR gallery_cards.company_id = #{id}
OR quote_cards.company_id = #{id}
OR team_cards.company_id = #{id}
OR who_cards.company_id = #{id}
OR wild_cards.company_id = #{id}").uniq
ActiveRecord's merge returns the intersection not the union of the two queries – counterintuitively IMO.
To find the union, you need to use OR, for which ActiveRecord has poor built-in support. So I think you're correct that its best to write the conditions in SQL:
def all_photos
Photo.joins("LEFT OUTER JOIN gallery_cards ON gallery_cards.id = photos.gallery_card_id")
.where("photos.company_id = :id OR gallery_cards.company_id = :id", id: id)
end
ETA The query associates the gallery_cards to photos with a LEFT OUTER JOIN, which preserves those photo rows without associated gallery card rows. You can then query based on either photos columns or on associated gallery_cards columns – in this case, company_id from either table.
You can leverage ActiveRecord scope chaining to join and query from additional tables:
def all_photos
Photo.joins("LEFT OUTER JOIN gallery_cards ON gallery_cards.id = photos.gallery_card_id")
.joins("LEFT OUTER JOIN quote_cards ON quote_cards.id = photos.quote_card_id")
.where("photos.company_id = :id OR gallery_cards.company_id = :id OR quote_cards.company_id = :id", id: id)
end

Query using condition within an array

I have 2 models, user and centre, which have a many to many relationship.
class User < ActiveRecord::Base
attr_accessible :name
has_and_belongs_to_many :centres
end
class Centre < ActiveRecord::Base
attr_accessible :name, :centre_id, :city_id, :state_id
has_and_belongs_to_many :users
end
Now I have an user with multiple centres, and I want to retrieve all the centres that have the same "state_id" as that user.
This is what I am doing now
state_id_array = []
user.centres.each do |centre|
state_id_array << centre.state_id
end
return Centre.where("state_id IN (?)", state_id_array).uniq
It works, but it's very ugly. Is there a better way for achieving this? Ideally a one line query.
UPDATE
Now I have
Centre.where('centres.state_id IN (?)', Centre.select('state_id').joins(:user).where('users.id=(?)', user))
The subquery work by itself, but when I tried to execute the entire query, I get NULL for the inner query.
Centre.select('state_id').joins(:user).where('users.id=(?)', user)
will generate
SELECT state_id FROM "centres" INNER JOIN "centres_users" ON "centres_users"."centre_id" = "centres"."id" INNER JOIN "users" ON "users"."id" = "centres_users"."user_id" WHERE (users.id = (5))
Which return 'SA', 'VIC', 'VIC'
but the whole query will generate
SELECT DISTINCT "centres".* FROM "centres" WHERE (centres.state_id IN (NULL,NULL,NULL))
Does user also has state_id column if yes then try this,
User.joins("LEFT OUTER JOIN users ON users.state_id = centers.state_id")
else
try User.joins(:center)
Solved.
.select(:state_id)
will retrieve a model with only the state_id column populated. To retrieve a field, use
.pluck(:state_id)
Below is the final query I had
Centre.where('centres.state_id IN (?)', Centre.joins(:user).where('users.id=(?)', user).pluck('state_id').uniq)

Filtering Parents by Children

I'm doing a simple blog application - There are posts, which have many tags through a posts_tags table (my models are below). What I have implemented is if a user clicks a tag, it will show just the posts with that tag. What I want is for the user to them be able to select another tag, and it will filter to only the posts that have both of those tags, then a third, then a fourth, etc. I'm having difficulty making the active record query - especially dynamically. The closest I've gotten is listed below - however its in pure SQL and I would like to at least have it in ActiveRecord Rubyland syntax even with the complexity it contains.
Also, the "having count 2" does not work, its saying that "count" does not exist and even if I assign it a name. However, it is outputting in my table (the idea behind count is that if it contains a number that is as much as how many tags we are searching for, then theoretically/ideally it has all the tags)
My current test SQL query
select posts_tags.post_id,count(*) from posts_tags where tag_id=1 or tag_id=3 group by post_id ### having count=2
The output from the test SQL (I know it doesnt contain much but just with some simple seed data).
post_id | count
---------+-------
1 | 2
2 | 1
My Models:
/post.rb
class Post < ActiveRecord::Base
has_many :posts_tags
has_many :tags, :through => :posts_tags
end
/tag.rb
class Tag < ActiveRecord::Base
has_many :posts_tags
has_many :posts, :through => :posts_tags
end
/poststag.rb
class PostsTag < ActiveRecord::Base
belongs_to :tag
belongs_to :post
end
Give a try to:
Post.joins(:tags).where(tags: {id: [1, 3]}).select("posts.id, count(*)").group("posts.id").having("count(*) > 2")
I think "count = 2" is not correct. It should be "count(*) = 2". Your query then will be
select post_id,count(post_id)
from posts_tags
where tag_id = 1 or tag_id = 3
group by post_id
having count(post_id) = 2
In general you want to stay away from writing raw sql when using rails. Active Record has great helper methods to make your sql more readable and maintainable.
If you only have a few tags you can create scopes for each of them (http://guides.rubyonrails.org/active_record_querying.html#scopes)
Since people are clicking on tags one at a time you could just query for each tag and then use the & operator on the arrays. Because you have already requested the exact same set of data from the database the query results should be cached meaning you are only hitting the db for the newest query.

Rails: Many to one ( 0 - n ) , finding records

I've got tables items and cards where a card belongs to a user and a item may or may not have any cards for a given user.
The basic associations are set up as follows:
Class User
has_many :cards
Class Item
has_many :cards
Class Card
belongs_to :user
has_and_belongs_to_many :items
I've also created a join table, items_cards with the columns item_id and card_id. I'd like to make a query that tells me if there's a card for a given user/item. In pure SQL I can accomplish this pretty easily:
SELECT count(id)
FROM cards
JOIN items_cards
ON items_cards.card_id = cards.id
WHERE cards.user_id = ?
AND items_cards.item_id = ?
I'm looking for some guidance as to how I'd go about doing this via ActiveRecord. Thanks!
Assuming you have an Item in #item and a User in #user, this will return 'true' if a card exists for that user and that item:
Card.joins(:items).where('cards.user_id = :user_id and items.id = :item_id', :user_id => #user, :item_id => #item).exists?
Here's what's going on:
Card. - You're making a query about the Card model.
joins(:items) - Rails knows how to put together joins for the association types it supports (usually - at least). You're telling it to do whatever joins are required to allow you to query the associated items as well. This will, in this case, result in JOIN items_cards ON items_cards.card_id = cards.id JOIN items ON items_cards.item_id = items.id.
where('cards.user_id = :user_id and items.id = :item_id', :user_id => #user, :item_id => #item) - Your conditional, pretty much the same as in pure SQL. Rails will interpolate the values you specify with a colon (:user_id) using the values in the hash (:user_id => #user). If you give an ActiveRecord object as the value, Rails will automatically use the id of that object. Here, you're saying you only want results where the card belongs to the user you specify, and there is a row for the item you want.
.exists? - Loading ActiveRecord objects is inefficient, so if you only want to know if something exists, Rails can save some time and use a count based query (much like your SQL version). There's also a .count, which you could use instead if you wanted to have the query return the number of results, rather than true or false.