How to check if record does NOT exist in Rails 5 with active record querying? - sql

I have an Article model and a Comment model. How do I get a list of articles that do not have any comments using Active Record?
Model Columns:
Article: body:string (has many comments)
Comment: body:string, article_id:integer (belongs to article)

If you want to get the result using a single query and want the result to be an ActiveRecord relation, use:
Article.where('id NOT IN (SELECT DISTINCT(article_id) FROM comments)')

This is the same query, written in a more Rails-like way:
Article.where.not('id IN (SELECT DISTINCT(article_id) FROM comments)')

Try the code below to fetch all articles with no comments:
Article.includes(:comments).where(comments: {article_id: nil})
OR
data = []
Article.all.each do |a|
data << a if a.comments.blank?
end
puts data
OR
ids = Comment.all.pluck(:article_id)
data = Article.where.not(id: ids)
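For intuition, the set logic behind the NOT IN subquery can be sketched in plain Ruby. This is only an illustration: ArticleRow and CommentRow are hypothetical in-memory Structs standing in for the two tables, and nothing here touches ActiveRecord or a database.

```ruby
require 'set'

# Hypothetical stand-ins for rows of the articles and comments tables.
ArticleRow = Struct.new(:id, :body)
CommentRow = Struct.new(:id, :article_id, :body)

articles = [
  ArticleRow.new(1, 'first'),
  ArticleRow.new(2, 'second'),
  ArticleRow.new(3, 'third')
]
comments = [
  CommentRow.new(1, 1, 'nice'),
  CommentRow.new(2, 1, 'agreed'),
  CommentRow.new(3, 3, 'hmm')
]

# Plays the role of SELECT DISTINCT(article_id) FROM comments.
commented_ids = comments.map(&:article_id).to_set

# Plays the role of WHERE id NOT IN (...): keep articles whose id is not in the set.
uncommented = articles.reject { |article| commented_ids.include?(article.id) }

puts uncommented.map(&:id).inspect
```

Only article 2 has no comments here, so it is the only one kept.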

Related

Return only unique records in this ActiveRecord query

I have a mildly-complex ActiveRecord query in Rails 3.2 / Postgres that returns documents that are related and most relevant to all documents a user has favorited in the past.
The problem is that despite specifying uniq my query does not return distinct document records:
Document.joins("INNER JOIN related_documents ON
documents.docid = related_documents.docid_id")
.select("documents.*, related_documents.relevance_score")
.where("related_documents.document_id IN (?)",
some_user.favorited_documents)
.order("related_documents.relevance_score DESC")
.uniq
.limit(10)
I use a RelatedDocument join table, ranking each relation by a related_document.relevance_score which I use to order the query result before sampling the top 10. (See this question for schema description.)
The problem is that because I select("documents.*, related_documents.relevance_score"), each copy of the same document record that comes back with a different relevance_score is considered a unique result (i.e. when the document is a related_document for multiple favorited documents).
How do I return unique Documents regardless of the related_document.relevance_score?
I have tried splitting the select into two separate selects, and changing the position of uniq in the query, with no success.
Unfortunately I must select("related_documents.relevance_score") so as to order the results by this field.
Thanks!
UPDATE - SOLUTION
Thanks to Jethroo below, GROUP BY is the needed addition, giving me the following working query:
Document.joins("INNER JOIN related_documents ON
documents.docid = related_documents.docid_id")
.select("documents.*, max(related_documents.relevance_score)")
.where("related_documents.document_id IN (?)",
some_user.favorited_documents)
.order("related_documents.relevance_score DESC")
.group("documents.id")
.uniq
.limit(10)
Have you tried grouping by documents.docid? See http://guides.rubyonrails.org/active_record_querying.html#group
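What grouping does to the duplicated rows can be sketched in plain Ruby. This is only a model of the idea, not the actual query: JoinedRow is a hypothetical Struct for one row of the join, and the scores are made up.

```ruby
# Hypothetical shape of one joined row: the same document can appear
# several times, once per favorited document it is related to.
JoinedRow = Struct.new(:docid, :title, :relevance_score)

rows = [
  JoinedRow.new(1, 'A', 0.4),
  JoinedRow.new(2, 'B', 0.9),
  JoinedRow.new(1, 'A', 0.7),
  JoinedRow.new(3, 'C', 0.2),
  JoinedRow.new(2, 'B', 0.5)
]

# GROUP BY docid, keeping the row with max(relevance_score) in each group.
best_per_document = rows.group_by(&:docid).map do |_docid, group|
  group.max_by(&:relevance_score)
end

# ORDER BY the aggregated score DESC, then LIMIT 10.
top = best_per_document.sort_by { |row| -row.relevance_score }.first(10)

top.each { |row| puts "#{row.docid} #{row.relevance_score}" }
```

Each document now appears once, ranked by its best score rather than by every individual score.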

How to retrieve a list of records and the count of each one's children with condition in Active Record?

There are two models with our familiar one-to-many relationship:
class Custom
has_many :orders
end
class Order
belongs_to :custom
end
I want to do the following work:
How do I get all customs whose age is over 18, together with how many big orders (paying over 1,000 dollars) each of them has?
UPDATE:
for the models:
rails g model custom name:string age:integer
rails g model orders amount:decimal custom_id:integer
I hope one left join sql statement will do all my job, and don't construct unnecessary objects like this:
Custom.where('age > ?', '18').includes(:orders).where('orders.amount > ?', '1000')
It will construct a lot of order objects which I don't need, and it will calculate the count by Array#count function which will waste time.
UPDATE 2:
My own solution is wrong: it removes customs who don't have big orders from the result.
Finding adult customers with big orders
This solution uses a single query, with the nested orders relation transformed into a sub-query.
big_customers = Custom.where("age > ?", "18").where(
id: Order.where("amount > ?", "1000").select(:custom_id)
)
Grab all adults and their # of big orders (MySQL)
This can still be done in a single query. The count comes from a join on orders: the number of matching orders is placed in a result column called big_orders_count, which ActiveRecord exposes as a method. It involves a lot more "raw" SQL. I don't know any way to avoid this with ActiveRecord except with the great squeel gem.
adults = Custom.where("age > ?", "18").select([
Custom.arel_table["*"],
"count(orders.id) as big_orders_count"
]).joins(%{LEFT JOIN orders
ON orders.custom_id = customs.id
AND orders.amount > 1000}).group("customs.id")
# group by customer so that count(orders.id) is computed per customer
# see count:
adults.first.big_orders_count
You might want to consider caching counters like this. This join will be expensive on the database, so you could keep a dedicated customs.big_order_count column that is either refreshed regularly or updated by an observer that watches for big Order records.
Grab all adults and their # of big orders (PostgreSQL)
Solution 2 is MySQL-only. To get this to work in PostgreSQL I created a third solution that uses a sub-query. Still one call to the DB :-)
adults = Custom.where("age > ?", "18").select([
%{"customs".*},
%{(
SELECT count(*)
FROM orders
WHERE orders.custom_id = customs.id
AND orders.amount > 1000
) AS big_orders_count}
])
# see count:
adults.first.big_orders_count
I have tested this against postgresql with real data. There may be a way to use more ActiveRecord and less SQL, but this works.
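The result both count queries aim for (every adult, including those with zero big orders) can be sketched in plain Ruby. This is just a model of the expected output: CustomRow and OrderRow are hypothetical Structs and the data is invented.

```ruby
# Hypothetical stand-ins for rows of the customs and orders tables.
CustomRow = Struct.new(:id, :name, :age)
OrderRow  = Struct.new(:id, :custom_id, :amount)

customs = [
  CustomRow.new(1, 'Ann', 30),
  CustomRow.new(2, 'Bob', 17),
  CustomRow.new(3, 'Cy', 45)
]
orders = [
  OrderRow.new(1, 1, 1500),
  OrderRow.new(2, 1, 200),
  OrderRow.new(3, 1, 2500),
  OrderRow.new(4, 2, 5000),
  OrderRow.new(5, 3, 999)
]

# Count big orders per customer; a missing key reads as 0, which is what
# keeps customers without big orders in the result (the LEFT JOIN behaviour).
big_order_counts = Hash.new(0)
orders.each { |order| big_order_counts[order.custom_id] += 1 if order.amount > 1000 }

adults = customs.select { |custom| custom.age > 18 }
result = adults.map { |custom| [custom.name, big_order_counts[custom.id]] }

result.each { |name, count| puts "#{name}: #{count}" }
```

Cy stays in the list with a count of 0, which is exactly what the includes/where attempt in the question loses.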
Edited.
@custom_over_18 = Order.joins(:custom).where("customs.age > ?", 18).where("orders.amount > ?", 1000).count

Filtering model with HABTM relationship

I have 2 models - Restaurant and Feature. They are connected via has_and_belongs_to_many relationship. The gist of it is that you have restaurants with many features like delivery, pizza, sandwiches, salad bar, vegetarian option,… So now when the user wants to filter the restaurants and lets say he checks pizza and delivery, I want to display all the restaurants that have both features; pizza, delivery and maybe some more, but it HAS TO HAVE pizza AND delivery.
If I do a simple .where('features IN (?)', params[:features]) I (of course) get the restaurants that have either feature (pizza, or delivery, or both), which is not at all what I want.
My SQL/Rails knowledge is kinda limited since I'm new to this but I asked a friend and now I have this huuuge SQL that gets the job done:
Restaurant.find_by_sql(['SELECT restaurant_id FROM (
SELECT features_restaurants.*, ROW_NUMBER() OVER(PARTITION BY restaurants.id ORDER BY features.id) AS rn FROM restaurants
JOIN features_restaurants ON restaurants.id = features_restaurants.restaurant_id
JOIN features ON features_restaurants.feature_id = features.id
WHERE features.id in (?)
) t
WHERE rn = ?', params[:features], params[:features].count])
So my question is: is there a better - more Rails even - way of doing this? How would you do it?
Oh BTW I'm using Rails 4 on Heroku so it's a Postgres DB.
This is an example of a set-within-sets query. I advocate solving these with group by and having, because this provides a general framework.
Here is how this works in your case:
select fr.restaurant_id
from features_restaurants fr join
features f
on fr.feature_id = f.feature_id
group by fr.restaurant_id
having sum(case when f.feature_name = 'pizza' then 1 else 0 end) > 0 and
sum(case when f.feature_name = 'delivery' then 1 else 0 end) > 0
Each condition in the having clause is counting for the presence of one of the features -- "pizza" and "delivery". If both features are present, then you get the restaurant_id.
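The way the having clause filters groups can be mimicked in plain Ruby. This is only a sketch of the logic: the rows are invented and stand for the output of the join, with hypothetical keys :restaurant_id and :feature_name.

```ruby
# Invented rows standing for features_restaurants joined with features.
rows = [
  { restaurant_id: 1, feature_name: 'pizza' },
  { restaurant_id: 1, feature_name: 'delivery' },
  { restaurant_id: 1, feature_name: 'salad bar' },
  { restaurant_id: 2, feature_name: 'pizza' },
  { restaurant_id: 3, feature_name: 'delivery' },
  { restaurant_id: 3, feature_name: 'vegetarian option' },
  { restaurant_id: 4, feature_name: 'delivery' },
  { restaurant_id: 4, feature_name: 'pizza' }
]

# GROUP BY restaurant_id.
groups = rows.group_by { |row| row[:restaurant_id] }

# HAVING: each count plays the role of one sum(case when ... then 1 else 0 end).
matching = groups.select do |_restaurant_id, group|
  pizza_count    = group.count { |row| row[:feature_name] == 'pizza' }
  delivery_count = group.count { |row| row[:feature_name] == 'delivery' }
  pizza_count > 0 && delivery_count > 0
end

matching_ids = matching.keys
puts matching_ids.inspect
```

Restaurants 2 and 3 each have only one of the two features, so only 1 and 4 survive the filter.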
How much data is in your features table? Is it just a table of ids and names?
If so, and you're willing to do a little denormalization, you can do this much more easily by encoding the features as a text array on restaurant.
With this scheme your queries boil down to
select * from restaurants where restaurants.features @> ARRAY['pizza', 'delivery']
If you want to maintain your features table because it contains useful data, you can store the array of feature ids on the restaurant and do a query like this:
select * from restaurants where restaurants.feature_ids @> ARRAY[5, 17]
If you don't know the ids up front, and want it all in one query, you should be able to do something along these lines:
select * from restaurants where restaurants.feature_ids @> (
select id from features where name in ('pizza', 'delivery')
) as matched_features
That last query might need some more consideration...
Anyways, I've actually got a pretty detailed article written up about Tagging in Postgres and ActiveRecord if you want some more details.
This is not "copy and paste" solution but if you consider following steps you will have fast working query.
index feature_name column (I'm assuming that column feature_id is indexed on both tables)
place each feature_name param in exists():
select r.id as restaurant_id
from
restaurants r
where
exists(select true from features_restaurants fr join features f on fr.feature_id = f.feature_id where fr.restaurant_id = r.id and f.feature_name = 'pizza')
and
exists(select true from features_restaurants fr join features f on fr.feature_id = f.feature_id where fr.restaurant_id = r.id and f.feature_name = 'delivery')
Maybe you're looking at it backwards?
Maybe try merging the restaurants returned by each feature.
Simplified:
pizza_restaurants = Feature.find_by_name('pizza').restaurants
delivery_restaurants = Feature.find_by_name('delivery').restaurants
pizza_delivery_restaurants = pizza_restaurants & delivery_restaurants
Obviously, this is a single instance solution. But it illustrates the idea.
UPDATE
Here's a dynamic method to pull in all filters without writing SQL (i.e. the "Railsy" way)
def get_restaurants_by_feature_names(features)
# accepts an array of feature names
restaurants = Restaurant.all
features.each do |f|
feature_restaurants = Feature.find_by_name(f).restaurants
restaurants = feature_restaurants & restaurants
end
return restaurants
end
Since it's an AND condition (OR conditions get dicey with AREL), I reread your stated problem, ignoring the SQL, and I think this is what you want.
# in Restaurant
has_many :features
# in Feature
has_many :restaurants
# this is a contrived example. you may be doing something like
# where(name: 'pizza'). I'm just making this condition up. You
# could also make this more DRY by just passing in the name if
# that's what you're doing.
def self.pizza
where(pizza: true)
end
def self.delivery
where(delivery: true)
end
# query
Restaurant.features.pizza.delivery
Basically you call the association with ".features" and then you use the self methods defined on features. Hopefully I didn't misunderstand the original problem.
Cheers!
Restaurant
.joins(:features)
.where(features: {name: ['pizza','delivery']})
.group('restaurants.id')
.having('count(features.name) = ?', 2)
This seems to work for me. I tried it with SQLite though.

Query: getting the last record for each member

Given a table ("Table") as follows (sorry about the CSV style since I don't know how to make it look like a table with the Stack Overflow editor):
id,member,data,start,end
1,001,abc,12/1/2012,12/31/2999
2,001,def,1/1/2009,11/30/2012
3,002,ghi,1/1/2009,12/31/2999
4,003,jkl,1/1/2012,10/31/2012
5,003,mno,8/1/2011,12/31/2011
If using Ruby Sequel, how should I write my query so I will get the following dataset in return.
id,member,data,start,end
1,001,abc,12/1/2012,12/31/2999
3,002,ghi,1/1/2009,12/31/2999
4,003,jkl,1/1/2012,10/31/2012
I get the most current (largest end date value) record for EACH (distinct) member from the original table.
I can get the answer if I convert the table to an Array, but I am looking for a solution in SQL or Ruby Sequel query, if possible. Thank you.
Extra credit: The title of this post is lame...but I can't come up with a good one. Please offer a better title if you have one. Thank you.
The Sequel version of this is a bit scary. The best I can figure out is to use a subselect and, because you need to join the table and the subselect on two columns, a "join block" as described in Querying in Sequel. Here's a modified version of Knut's program above:
require 'csv'
require 'sequel'
# Create Test data
DB = Sequel.sqlite()
DB.create_table(:mytable){
field :id
String :member
String :data
String :start # Treat as string to keep it simple
String :end # Ditto
}
CSV.parse(<<xx
1,"001","abc","2012-12-01","2999-12-31"
2,"001","def","2009-01-01","2012-11-30"
3,"002","ghi","2009-01-01","2999-12-31"
4,"003","jkl","2012-01-01","2012-10-31"
5,"003","mno","2011-08-01","2011-12-31"
xx
).each{|x|
DB[:mytable].insert(*x)
}
# That was all setup, here's the query
ds = DB[:mytable]
result = ds.join(ds.select_group(:member).select_append{max(:end).as(:end)}, :member=>:member) do |j, lj, js|
Sequel.expr(Sequel.qualify(j, :end) => Sequel.qualify(lj, :end))
end
puts result.all
This gives you:
{:id=>1, :member=>"001", :data=>"abc", :start=>"2012-12-01", :end=>"2999-12-31"}
{:id=>3, :member=>"002", :data=>"ghi", :start=>"2009-01-01", :end=>"2999-12-31"}
{:id=>4, :member=>"003", :data=>"jkl", :start=>"2012-01-01", :end=>"2012-10-31"}
In this case it's probably easier to replace the last four lines with straight SQL. Something like:
puts DB[
"SELECT a.* from mytable as a
join (SELECT member, max(end) AS end FROM mytable GROUP BY member) as b
on a.member = b.member and a.end=b.end"].all
Which gives you the same result.
What are the criteria for your result?
If it is the keys 1,3 and 4 you may use DB[:mytable].filter( :id => [1,3,4]) (complete example below)
For more information about filtering with Sequel, please refer to the Sequel documentation, especially Dataset Filtering.
require 'csv'
require 'sequel'
#Create Test data
DB = Sequel.sqlite()
DB.create_table(:mytable){
field :id
field :member
field :data
field :start #should be date, not implemented in example
field :end #should be date, not implemented in example
}
CSV.parse(<<xx
id,member,data,start,end
1,001,abc,12/1/2012,12/31/2999
2,001,def,1/1/2009,11/30/2012
3,002,ghi,1/1/2009,12/31/2999
4,003,jkl,1/1/2012,10/31/2012
5,003,mno,8/1/2011,12/31/2011
xx
).each{|x|
DB[:mytable].insert(*x)
}
#Create Test data - end -
puts DB[:mytable].filter( :id => [1,3,4]).all
In my opinion, you're approaching the problem from the wrong side. ORMs (and Sequel as well) represent a nice, DSL-ish layer above the database, but, underneath, it's all SQL down there. So, I would try to formulate the question and the answer in a way to get SQL query which would return what you need, and then see how it would translate to Sequel's language.
You need to group by member and get the latest record for each member, right?
I'd go with the following idea (roughly):
SELECT t1.*
FROM table t1
LEFT JOIN table t2 ON t1.member = t2.member AND t2.end > t1.end
WHERE t2.id IS NULL
Now you should see how to perform left joins in Sequel, and you'll need to alias tables as well. Shouldn't be that hard.
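The self-join idea (keep a row only if no other row for the same member has a later end date) can be checked in plain Ruby before translating it to Sequel. This is a sketch only: MemberRow is a hypothetical Struct, and the dates from the question are parsed into Date objects because comparing the m/d/yyyy strings directly would not sort chronologically.

```ruby
require 'date'

# Hypothetical stand-in for one row of the table in the question.
MemberRow = Struct.new(:id, :member, :data, :start_date, :end_date)

def parse_us_date(text)
  Date.strptime(text, '%m/%d/%Y')
end

rows = [
  [1, '001', 'abc', '12/1/2012', '12/31/2999'],
  [2, '001', 'def', '1/1/2009',  '11/30/2012'],
  [3, '002', 'ghi', '1/1/2009',  '12/31/2999'],
  [4, '003', 'jkl', '1/1/2012',  '10/31/2012'],
  [5, '003', 'mno', '8/1/2011',  '12/31/2011']
].map do |id, member, data, start_text, end_text|
  MemberRow.new(id, member, data, parse_us_date(start_text), parse_us_date(end_text))
end

# LEFT JOIN ... ON same member AND t2.end > t1.end WHERE t2.id IS NULL:
# keep t1 only when no t2 with a later end date exists for that member.
latest = rows.select do |t1|
  rows.none? { |t2| t2.member == t1.member && t2.end_date > t1.end_date }
end

puts latest.map(&:id).inspect
```

This keeps rows 1, 3 and 4, matching the dataset asked for in the question. If two rows of one member shared the same latest end date, both would be kept.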

rails 3: database query

I have an Artists model with name:string and other attributes. BUT I have multiple Artist entries under the SAME name.
Is there a way to pull an array of artist objects without any duplicates of name?
I've found ways to do this with only the name attribute, but nothing where I can get the entire artist object.
These both do just the name attribute:
@artists = Artist.select('DISTINCT name').all
@artists = Artist.all.collect{ |a| a.name }.uniq
Activerecord group does what you're looking for: Artist.group(:name).all
My Rails 3 is not so good, but it still accepts Rails 2 syntax:
@artists = Artist.find(:all, :select => 'DISTINCT name')
And then we can get some rails 3 love.
One way is to grab the ids of distinct rows and grab the rest of the data from there:
Artist.where('artists.id IN (SELECT MIN(a.id) FROM artists AS a GROUP BY a.name)').all
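What the MIN(id)-per-name subquery selects can be sketched in plain Ruby. This is only an illustration with a hypothetical ArtistRow Struct and invented rows, not something that runs against the artists table.

```ruby
# Hypothetical stand-in for rows of the artists table.
ArtistRow = Struct.new(:id, :name, :genre)

artists = [
  ArtistRow.new(1, 'Prince', 'funk'),
  ArtistRow.new(2, 'Bjork', 'pop'),
  ArtistRow.new(3, 'Prince', 'rock'),
  ArtistRow.new(4, 'Bjork', 'electronic'),
  ArtistRow.new(5, 'Moby', 'electronic')
]

# GROUP BY name, then keep the full row with MIN(id) from each group.
unique_artists = artists.group_by(&:name).map do |_name, group|
  group.min_by(&:id)
end

unique_artists.each { |artist| puts "#{artist.id} #{artist.name} #{artist.genre}" }
```

Each name appears once and the whole record is kept, not just the name. Which duplicate survives is decided by the lowest id, the same rule the subquery uses.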