What is the fastest, most standard way to "mass query" a set of grandchildren (has_many, has_many) belonging to a single object? - sql

Use case: on this site, users will be able to select a rental property for a specific number of days. Users will often be selling the same type of rental property.
Problem: Because multiple "sellers" will be renting out the same exact item, the "property detail page" will have many listings created by many different sellers (or in some cases, a seller will have multiple properties available that fall under the same "property detail page"). Each of these "listing" objects will have many pricing objects, each of which contains a date, a price, and an availability boolean.
Current models are broken down below:
property.rb
has_many :listings
has_many :prices, :through => :listings
listing.rb
belongs_to :user
belongs_to :property
has_many :prices
price.rb
belongs_to :listing
What I have tried:
If for example, I wanted to obtain the MINIMUM sum of pricing for a specific property, I had jotted down this:
# property.rb
# minimum price for a pricing set out of all of the price objects
def minimum_price(start_date, end_date)
# this would sum up each days pricing to give the rental period a final price
prices = self.prices.where("day <= ?", end_date).where("day >= ?", start_date).sum(:price)
end
When I do it like this, however, it simply combines every single user's prices, which gives nothing of use.
Any help would be greatly appreciated! Of course I could loop through a property's listings until I found a minimum price set for a given date range, but that seems as though it would take an unnecessary amount of time and be largely inefficient.
EDIT
An example of data that should be output is a set of price objects that are the cheapest ones in a specific date range from ONE particular listing. It cannot just combine the best-priced dates from all of the users and add them up, as the buyer will be renting from ONE seller.
This is an actual example of the desired output; as you can see, these prices are ALL from the same listing ID.
[#<Price id: 156, day: "2020-12-01", listing_id: 7, price: 5.0, available: true, created_at: "2020-12-17 14:22:46", updated_at: "2020-12-17 14:22:46">, #<Price id: 157, day: "2020-12-02", listing_id: 7, price: 5.0, available: true, created_at: "2020-12-17 14:22:46", updated_at: "2020-12-17 14:22:46">, #<Price id: 158, day: "2020-12-03", listing_id: 7, price: 5.0, available: true, created_at: "2020-12-17 14:22:46", updated_at: "2020-12-17 14:22:46">, #<Price id: 159, day: "2020-12-04", listing_id: 7, price: 5.0, available: true, created_at: "2020-12-17 14:22:46", updated_at: "2020-12-17 14:22:46">]

So it sounds like you are calling this on a property so:
totals = self.prices.where("prices.day >= ? AND prices.day <= ?", start_date, end_date).group("prices.listing_id").sum(:price)
Calling sum before grouping would return a single number, so the grouping has to come first. Written this way, the grouping happens in SQL and you get a hash with a key for each listing_id, and the value of that key should be the sum for that listing. I say "should" because this is a bit abstract for me to do without a system to test it on.
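Because the buyer rents from ONE seller, the per-listing totals still need two more steps: drop any listing that does not have an available price for every day of the stay, then take the smallest total. Here is a minimal plain-Ruby sketch of that logic. `PriceRow` and `cheapest_listing` are hypothetical names used only for illustration (a struct standing in for the `Price` records), and the sketch assumes at most one price row per listing per day.

```ruby
require "date"

# Hypothetical stand-in for a Price record (not the real ActiveRecord model).
PriceRow = Struct.new(:listing_id, :day, :price, :available)

# Returns [listing_id, total] for the cheapest listing that has an available
# price for every day in the range, or nil when no listing qualifies.
# Assumes at most one price row per listing per day.
def cheapest_listing(prices, start_date, end_date)
  wanted_days = (start_date..end_date).to_a

  totals = prices
    .select { |row| row.available && wanted_days.include?(row.day) }
    .group_by(&:listing_id)
    .select { |_id, rows| rows.map(&:day).sort == wanted_days }
    .transform_values { |rows| rows.sum(&:price) }

  totals.min_by { |_id, total| total }
end

prices = [
  PriceRow.new(7, Date.new(2020, 12, 1), 5.0, true),
  PriceRow.new(7, Date.new(2020, 12, 2), 5.0, true),
  PriceRow.new(8, Date.new(2020, 12, 1), 4.0, true),
  PriceRow.new(8, Date.new(2020, 12, 2), 9.0, true),
  PriceRow.new(9, Date.new(2020, 12, 1), 1.0, true), # listing 9 has no price for Dec 2
]

p cheapest_listing(prices, Date.new(2020, 12, 1), Date.new(2020, 12, 2))
# => [7, 10.0]
```

In ActiveRecord the coverage check could probably be pushed into SQL as well, for example by adding a `having("COUNT(*) = ?", number_of_days)` clause between `group` and `sum`, but that variant is untested here.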

Related

Ruby on Rails - group_by for relation and created_at

I'm looking for a Rails query that will get me the last created rating for every movie made by the current user.
For example, let's say I have this active record collection:
#<ActiveRecord::Relation [
#<Rating id: 115, score: 5, movie_id: 7, user_id: 5, created_at: "2019-09-16 16:47:55", updated_at: "2019-09-16 16:47:55">,
#<Rating id: 116, score: 3, movie_id: 7, user_id: 5, created_at: "2019-09-16 16:47:57", updated_at: "2019-09-16 16:47:57">,
#<Rating id: 117, score: 5, movie_id: 7, user_id: 5, created_at: "2019-09-16 16:50:37", updated_at: "2019-09-16 16:50:37">,
#<Rating id: 118, score: 3, movie_id: 8, user_id: 5, created_at: "2019-09-16 16:50:42", updated_at: "2019-09-16 16:50:42">
]>
We can see there are three objects with movie_id: 7 and one with movie_id: 8. This is wrong, as I need to get only the latest rating for each movie from the current_user.
For example, something along the lines of this:
Rating.order(created_at: :desc).select('DISTINCT ON (movie_id) *')
My model associations:
class Movie < ApplicationRecord
belongs_to :user
has_many :ratings, dependent: :destroy
end
class Rating < ApplicationRecord
belongs_to :user
belongs_to :movie
end
class User < ApplicationRecord
has_many :movies
has_many :ratings
end
Please explain the Rails query as I really want to understand this.
Let's do it in parts!
You need a Rails query that will get you the last created rating for every movie made by the current user.
OK, with this information, let's start with the current_user reference!
Your relation for User is like this:
A user has many movies
A user has many ratings
With these definitions, you can take advantage of ActiveRecord to run the following methods:
current_user.movies # All movies of a current_user
current_user.ratings # All ratings of a current_user
OK, now we can access all movies and ratings from a user instance, but how can we access the ratings grouped by movie?
Let's see the Rating relations:
A rating belongs to a user # this means the rating instance has the user_id
A rating belongs to a movie # this means the rating instance has the movie_id
Nice! Now we can get the user ratings grouped by movie:
current_user.ratings.group_by { |rating| rating.movie_id }
# or the shorthand syntax
grouped_ratings = current_user.ratings.group_by(&:movie_id)
This will group all user ratings by movie_id and will produce a hash like:
{
1 => [#<Rating id: 115...>, #<Rating id: 116...>], # The "1" is the movie_id used by group_by
2 => [#<Rating id: 117...>] # The "2" is the movie_id used by group_by
}
OK, now we have a hash of ratings from a User, grouped by the movie_id.
But we need the last rating created by a user for each movie, correct?
last_ratings_for_each_movie = grouped_ratings.map do |movie_id, ratings|
movie = Movie.find(movie_id)
last_rating = ratings.sort_by{ |rating| rating.created_at }.last
puts "The last Rating created by the current user for Movie #{movie.title} is the rating #{last_rating.id}" # Assuming movie has a title
last_rating
end
In the end, you will have an array with the last rating created for each movie (to access the movie you can call rating.movie, for example).
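The group-then-pick step can be tried outside Rails as well. Below is a plain-Ruby sketch in which a hypothetical `RatingRow` struct stands in for `Rating` and `latest_rating_per_movie` is an illustrative helper name. It uses `max_by` to pick the newest record of each group without sorting the whole group, and it skips the per-movie `Movie.find` lookup.

```ruby
# Hypothetical stand-in for a Rating record (assumption, not the real model).
RatingRow = Struct.new(:id, :score, :movie_id, :user_id, :created_at)

# Returns a hash of movie_id => most recently created rating.
def latest_rating_per_movie(ratings)
  ratings
    .group_by(&:movie_id)
    .transform_values { |group| group.max_by(&:created_at) }
end

ratings = [
  RatingRow.new(115, 5, 7, 5, Time.utc(2019, 9, 16, 16, 47, 55)),
  RatingRow.new(116, 3, 7, 5, Time.utc(2019, 9, 16, 16, 47, 57)),
  RatingRow.new(117, 5, 7, 5, Time.utc(2019, 9, 16, 16, 50, 37)),
  RatingRow.new(118, 3, 8, 5, Time.utc(2019, 9, 16, 16, 50, 42)),
]

latest = latest_rating_per_movie(ratings)
p latest.values.map(&:id)
# => [117, 118]
```

This still loads every rating of the user into memory, so for large tables a database-side query would likely be the better fit.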
I wasn't sure whether you must do it with SQL; if so, let me know :)
Hope it helps!

Complex rails SQL query

First of all, a user has many age_demographics. An AgeDemographic object looks like this:
#<AgeDemographic id: 4384, user_id: 799, range: "35 - 49", percentage: 3.2, created_at: "2015-05-22 04:17:10", updated_at: "2015-05-22 04:17:10">
I'm building a user search tool where someone will select multiple age ranges that they want to target ("12 - 17" and "18 - 24" for example). I need to select users that have a collection of age demographic objects with a total percentage greater than 50%.
This is what I've started with:
User.joins(:age_demographics).where("age_demographics.range IN (?)", ["12 - 17", "18 - 24", "25 - 34"])
But I can't figure out how to tie in the sum of the percentages of those age_demographics into that where clause.
Let me know if this makes absolutely no sense.
You can use having and group methods for this:
User.joins(:age_demographics)
.where("age_demographics.range IN (?)", ["12 - 17", "18 - 24", "25 - 34"])
.group("users.id")
.having("SUM(age_demographics.percentage) > 50")
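To see what the `group`/`having` pair computes, here is a plain-Ruby sketch of the same selection done in memory. `DemographicRow` and `users_over_threshold` are hypothetical names for illustration, the sample percentages are made up, and the comparison is a strict `>` to match "greater than 50%" in the question.

```ruby
# Hypothetical stand-in for an AgeDemographic row (illustration only).
DemographicRow = Struct.new(:user_id, :range, :percentage)

# Mirrors: WHERE range IN (...) GROUP BY user_id HAVING SUM(percentage) > threshold
def users_over_threshold(rows, ranges, threshold)
  rows
    .select { |row| ranges.include?(row.range) }                  # WHERE ... IN (...)
    .group_by(&:user_id)                                          # GROUP BY
    .select { |_id, group| group.sum(&:percentage) > threshold }  # HAVING SUM(...) > ?
    .keys
end

rows = [
  DemographicRow.new(799, "12 - 17", 30.0),
  DemographicRow.new(799, "18 - 24", 25.5),
  DemographicRow.new(799, "35 - 49", 3.2),
  DemographicRow.new(800, "12 - 17", 10.0),
  DemographicRow.new(800, "18 - 24", 20.0),
]

p users_over_threshold(rows, ["12 - 17", "18 - 24", "25 - 34"], 50)
# => [799]
```

User 799 reaches 55.5 across the selected ranges (the "35 - 49" row is filtered out before summing), while user 800 only reaches 30.0.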

Remove duplicate records based on multiple columns?

I'm using Heroku to host my Ruby on Rails application and for one reason or another, I may have some duplicate rows.
Is there a way to delete duplicate records based on 2 or more criteria but keep just 1 record of that duplicate collection?
In my use case, I have a Make and Model relationship for cars in my database.
Make    Model
----    -----
Name    Name
        Year
        Trim
        MakeId
I'd like to delete all Model records that have the same Name, Year and Trim but keep 1 of those records (meaning, I need the record but only once). I'm using Heroku console so I can run some active record queries easily.
Any suggestions?
class Model
def self.dedupe
# find all models and group them on keys which should be common
grouped = all.group_by{|model| [model.name,model.year,model.trim,model.make_id] }
grouped.values.each do |duplicates|
# the first one we want to keep right?
first_one = duplicates.shift # or pop for last one
# if there are any more left, they are duplicates
# so delete all of them
duplicates.each{|double| double.destroy} # duplicates can now be destroyed
end
end
end
Model.dedupe
Find All
Group them on keys which you need for uniqueness
Loop on the grouped model's values of the hash
remove the first value because you want to retain one copy
delete the rest
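A dry run of those steps in plain Ruby can help to check which rows would be destroyed before touching the database. `CarModel` and `split_duplicates` are hypothetical names for this sketch, and the sample rows are invented.

```ruby
# Hypothetical stand-in for a Model row (illustration only).
CarModel = Struct.new(:id, :name, :year, :trim, :make_id)

# Splits records into [keepers, duplicates]: the first record of each
# (name, year, trim, make_id) group is kept, the rest are duplicates.
def split_duplicates(records)
  groups = records.group_by { |r| [r.name, r.year, r.trim, r.make_id] }.values
  keepers = groups.map(&:first)
  duplicates = groups.flat_map { |group| group.drop(1) }
  [keepers, duplicates]
end

records = [
  CarModel.new(1, "Civic", 2012, "LX", 1),
  CarModel.new(2, "Civic", 2012, "LX", 1),
  CarModel.new(3, "Civic", 2012, "EX", 1),
  CarModel.new(4, "Civic", 2012, "LX", 1),
]

keepers, duplicates = split_duplicates(records)
p keepers.map(&:id)    # => [1, 3]
p duplicates.map(&:id) # => [2, 4]
```

Like the `dedupe` method above, this loads every row into memory, which is fine for a one-off cleanup on a small table but may be slow on a large one.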
If your User table data looks like this:
User.all =>
[
#<User id: 15, name: "a", email: "a@gmail.com", created_at: "2013-08-06 08:57:09", updated_at: "2013-08-06 08:57:09">,
#<User id: 16, name: "a1", email: "a@gmail.com", created_at: "2013-08-06 08:57:20", updated_at: "2013-08-06 08:57:20">,
#<User id: 17, name: "b", email: "b@gmail.com", created_at: "2013-08-06 08:57:28", updated_at: "2013-08-06 08:57:28">,
#<User id: 18, name: "b1", email: "b1@gmail.com", created_at: "2013-08-06 08:57:35", updated_at: "2013-08-06 08:57:35">,
#<User id: 19, name: "b11", email: "b1@gmail.com", created_at: "2013-08-06 09:01:30", updated_at: "2013-08-06 09:01:30">,
#<User id: 20, name: "b11", email: "b1@gmail.com", created_at: "2013-08-06 09:07:58", updated_at: "2013-08-06 09:07:58">]
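Some email addresses are duplicated, so our aim is to remove the duplicate rows from the users table.
Step 1:
Get the id of one record per distinct group. Note that grouping on both :email and :name treats rows as duplicates only when both values match; group on :email alone if the email by itself must be unique.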
ids = User.select("MIN(id) as id").group(:email,:name).collect(&:id)
=> [15, 16, 18, 19, 17]
Step 2:
Remove every record from the users table whose id is not among those distinct ids.
The ids array now holds the following ids:
[15, 16, 18, 19, 17]
User.where("id NOT IN (?)",ids) # To get all duplicate records
User.where("id NOT IN (?)",ids).destroy_all
** RAILS 4 **
ActiveRecord 4 introduces the .not method which allows you to write the following in Step 2:
User.where.not(id: ids).destroy_all
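Which rows end up destroyed depends on the columns passed to `group`. The plain-Ruby sketch below replays the two steps in memory on the sample users above; `UserRow` and `duplicate_ids` are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for a User row (illustration only).
UserRow = Struct.new(:id, :name, :email)

# Ids of rows that are NOT the smallest id within their group, i.e. the rows
# that the "id NOT IN (ids)" query would destroy.
def duplicate_ids(rows, *columns)
  keep = rows.group_by { |row| columns.map { |column| row[column] } }
             .values
             .map { |group| group.map(&:id).min }
  rows.map(&:id) - keep
end

users = [
  UserRow.new(15, "a",   "a@gmail.com"),
  UserRow.new(16, "a1",  "a@gmail.com"),
  UserRow.new(17, "b",   "b@gmail.com"),
  UserRow.new(18, "b1",  "b1@gmail.com"),
  UserRow.new(19, "b11", "b1@gmail.com"),
  UserRow.new(20, "b11", "b1@gmail.com"),
]

p duplicate_ids(users, :email, :name) # => [20]
p duplicate_ids(users, :email)        # => [16, 19, 20]
```

Grouping on both columns, as in Step 1, removes only row 20. Grouping on the email alone would also remove rows 16 and 19.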
Similar to @Aditya Sanghi's answer, but this way will be more performant because you are only selecting the duplicates, rather than loading every Model object into memory and then iterating over all of them.
# returns only duplicates in the form of [[name1, year1, trim1], [name2, year2, trim2],...]
duplicate_row_values = Model.select('name, year, trim, count(*)').group('name, year, trim').having('count(*) > 1').pluck(:name, :year, :trim)
# load the duplicates, order them however you want, and then destroy all but one
duplicate_row_values.each do |name, year, trim|
Model.where(name: name, year: year, trim: trim).order(id: :desc)[1..-1].map(&:destroy)
end
Also, if you truly don't want duplicate data in this table, you probably want to add a multi-column unique index to the table, something along the lines of:
add_index :models, [:name, :year, :trim], unique: true, name: 'index_unique_models'
You could try the following: (based on previous answers)
ids = Model.group('name, year, trim').pluck('MIN(id)')
to get all valid records. And then:
Model.where.not(id: ids).destroy_all
to remove the unneeded records. And certainly, you can make a migration that adds a unique index for the three columns so this is enforced at the DB level:
add_index :models, [:name, :year, :trim], unique: true
To run it in a migration I ended up doing the following (based on the answer above by @aditya-sanghi):
class AddUniqueIndexToXYZ < ActiveRecord::Migration
def change
# delete duplicates
dedupe(XYZ, 'name', 'type')
add_index :xyz, [:name, :type], unique: true
end
def dedupe(model, *key_attrs)
model.select(key_attrs).group(key_attrs).having('count(*) > 1').each { |duplicates|
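# splat the keys: slice(key_attrs) with an array argument matches nothing,
# so where({}) would select every row in the table
dup_rows = model.where(duplicates.attributes.slice(*key_attrs)).to_a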
# the first one we want to keep right?
dup_rows.shift
dup_rows.each{ |double| double.destroy } # duplicates can now be destroyed
}
end
end
Based on @aditya-sanghi's answer, with a more efficient way to find duplicates using SQL.
Add this to your ApplicationRecord to be able to deduplicate any model:
class ApplicationRecord < ActiveRecord::Base
# …
def self.destroy_duplicates_by(*columns)
groups = select(columns).group(columns).having(Arel.star.count.gt(1))
groups.each do |duplicates|
records = where(duplicates.attributes.symbolize_keys.slice(*columns))
records.offset(1).destroy_all
end
end
end
You can then call destroy_duplicates_by to destroy all records (except the first) that have the same values for the given columns. For example:
Model.destroy_duplicates_by(:name, :year, :trim, :make_id)
I chose a slightly safer route (IMHO). I started by getting all the unique records.
ids = Model.where(other_model_id: 1).to_a.uniq(&:field).map(&:id)
Then I got all the ids
all_ids = Model.where(other_model_id: 1).map(&:id)
This allows me to do an array subtraction to get the duplicates
dups = all_ids - ids
I then map over the duplicate ids and fetch the model because I want to ensure I have the records I am interested in.
records = dups.map do |id| Model.find(id) end
When I am sure I want to delete, I iterate again to delete.
records.map do |record| record.delete end
When deleting duplicate records on a production system, you want to be very sure you are not deleting important live data, so in this process, I can double-check everything.
So in the case above:
all_ids = Model.all.map(&:id)
uniq_ids = Model.all.group_by do |model|
[model.name, model.year, model.trim]
end.values.map do |duplicates|
duplicates.first.id
end
dups = all_ids - uniq_ids
records = dups.map { |id| Model.find(id) }
records.map { |record| record.delete }
or something like this.
You can try this SQL query (PostgreSQL syntax, assuming the table is named models) to remove all duplicate records except the latest one:
DELETE FROM models USING models newer WHERE (models.name = newer.name AND models.year = newer.year AND models.trim = newer.trim AND models.id < newer.id);

Ruby on Rails finding the number of items in a cart

I have set up the Rails models for an online store I am making, with a cart that has line items. Every time a product is clicked on, a line item is generated that has a unique cart id, matching the carts I make for user sessions (this example comes from the book Agile Web Development with Rails).
I want to count the number of items in a user's cart. What's the best way to do this?
Here's an example of what
li.each do |line|
puts li.to_yaml
end
outputs ....
- !ruby/object:LineItem
attributes:
id: 14
product_id: 81
cart_id: 11
created_at: 2012-06-27 14:10:09.060706000Z
updated_at: 2012-06-27 14:10:09.060706000Z
quantity: 1
---
- !ruby/object:LineItem
attributes:
id: 1
product_id: 2
cart_id: 6
created_at: 2012-06-25 18:29:20.726280000Z
updated_at: 2012-06-25 18:56:08.690670000Z
quantity: 2
- !ruby/object:LineItem
attributes:
id: 2
product_id: 4
cart_id: 6
created_at: 2012-06-25 18:56:10.014333000Z
updated_at: 2012-06-25 18:56:10.014333000Z
quantity: 1
So, I'd want the user with cart_id of 6 to know they have 3 items. Thanks.
Yes there is a better way. How are your models set up? Basically you'd want users to have a cart and a cart belongs_to a user. Also, since a line_item has a cart_id, you can have line_item belongs_to cart and cart has_many line_items.
With these associations, you can easily get what you need like:
cart.line_items.count
user.cart
user.cart.line_items
etc.
You can read up on rails associations here:
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html
I figured it out using the console
After choosing a cart so that @cart.id = 6 (@cart = Cart.find(6), since the console didn't set the cart using a browser session)
@count = 0
LineItem.all.each do |item|
if (item.cart_id == @cart.id)
@count += item.quantity
end
end
There must be a better, more railsy way, though...
current_cart.line_items.size
Provided you have the following in cart.rb
has_many :line_items
-edit-
Oops, sorry, you wanted the sum of the quantities.
current_cart.line_items.sum('quantity')
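The difference between counting rows and summing quantities is easy to check on the sample data from the question. This is a plain-Ruby sketch only: `LineItemRow` and `items_in_cart` are hypothetical stand-ins, not part of the Rails models.

```ruby
# Hypothetical stand-in for a LineItem row (illustration only).
LineItemRow = Struct.new(:id, :product_id, :cart_id, :quantity)

# Total quantity across all line items belonging to one cart.
def items_in_cart(line_items, cart_id)
  line_items.select { |item| item.cart_id == cart_id }.sum(&:quantity)
end

line_items = [
  LineItemRow.new(14, 81, 11, 1),
  LineItemRow.new(1, 2, 6, 2),
  LineItemRow.new(2, 4, 6, 1),
]

p items_in_cart(line_items, 6)  # => 3
p items_in_cart(line_items, 11) # => 1
```

Cart 6 has two line item rows, so `size` or `count` would report 2, while the sum of `quantity` gives the 3 items the question asks for.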

add items to cart Rails

Let's say a customer is adding multiple products to the cart:
a car, an apartment, a tour, some coupons.
Products belong to categories.
Category has a name attribute.
But how do I give different attributes to a car, an apartment, a tour, and a coupon?
I definitely can't create everything from the Product model.
So should I create a different model for each category and connect it through has_many to the Product model?
Or am I going in the wrong direction?
Thanks
Having a Category for each individual product seems to be a good approach here, since your categories differ in their attributes. What you can do is create one more model, let's say MasterCategory, which has many Category records, while Category belongs to MasterCategory. That means for your MasterCategory cars you'll have categories such as Audi, BMW, Nissan, and so on. You then link your products to their respective categories (the vendors in this example) in a products_categories table, which will have product_id and category_id.
In my opinion, schema could go like this:
ProductsCategory:
product_id, category_id
MasterCategory:
id, name, created_at, updated_at
Category:
id, master_category_id, parent_id, name, position, permalink
For example, a car MasterCategory will look like this:
#<MasterCategory id: 1, name: "cars", created_at: "2012-02-14 13:03:45", updated_at: "2012-02-14 13:03:45">
and its categories will be:
[#<Category id: 1, master_category_id: 1, parent_id: nil, position: 0, name: "cars", created_at: "2012-02-14 13:03:45", updated_at: "2012-02-14 13:03:45", permalink: "cars">, #<Category id: 2, master_category_id: 1, parent_id: 1, position: 0, name: "Audi", created_at: "2012-02-14 13:03:45", updated_at: "2012-02-14 13:32:51", permalink: "cars/audi">]
Now you can define two methods, parent and children, in your Category model that use the parent_id attribute to locate and traverse master and sub categories easily. The permalink attribute lets you find a category nested under another category with one query, Category.find_by_permalink(params[:permalink]), and then display all the products related to that category. As you scale, I can bet you'll need the position attribute to control the order in which your categories are displayed on the page. Finally, master_category_id will make your life easier with:
car = MasterCategory.find(1)
car.categories.where("parent_id IS NOT NULL")
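The parent and children lookups mentioned above could work along these lines. This is a plain-Ruby sketch over an in-memory list, with `CategoryRow` and `CategoryTree` as hypothetical names. It only illustrates how parent_id links the rows and is not the actual model code.

```ruby
# Hypothetical in-memory stand-in for the categories table (illustration only).
CategoryRow = Struct.new(:id, :master_category_id, :parent_id, :name, :permalink)

class CategoryTree
  def initialize(rows)
    @rows = rows
  end

  # The category whose id matches row.parent_id, or nil for a top-level category.
  def parent(row)
    @rows.find { |candidate| candidate.id == row.parent_id }
  end

  # All categories that point at row through their parent_id.
  def children(row)
    @rows.select { |candidate| candidate.parent_id == row.id }
  end
end

rows = [
  CategoryRow.new(1, 1, nil, "cars", "cars"),
  CategoryRow.new(2, 1, 1, "Audi", "cars/audi"),
]
tree = CategoryTree.new(rows)

p tree.parent(rows[1]).name               # => "cars"
p tree.children(rows[0]).map(&:permalink) # => ["cars/audi"]
p tree.parent(rows[0])                    # => nil
```

In the real Category model these would more likely be self-referential associations, for example `belongs_to :parent, class_name: "Category"` (with `optional: true` on Rails 5 and later so top-level categories stay valid) and `has_many :children, class_name: "Category", foreign_key: :parent_id`.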
All the best!! :)