What's the equivalent Rails statement for the following?
@temp = Post.find_by_sql("SELECT posts.id,posts.title, comment_count.count FROM posts INNER JOIN (SELECT post_id, COUNT(*) AS count FROM comments GROUP BY post_id) AS comment_count ON comment_count.post_id = posts.id ORDER BY count DESC LIMIT 5;")
Is it possible to convert it to find/where/select statements?
It's a complex query and I can't get it to work, but I tried something like this:
@temp = Post.select("posts.id, posts.title, comment_count.count").joins(:comments).group("post_id").order("countpages.counts desc").limit(5)
Using something like Arel may clean this up even further, I'm meaning to start using it in my next app (haven't yet).
@temp = Post.find_by_sql("SELECT posts.id,posts.title, comment_count.count FROM posts INNER JOIN (SELECT post_id, COUNT(*) AS count FROM comments GROUP BY post_id) AS comment_count ON comment_count.post_id = posts.id ORDER BY count DESC LIMIT 5;")
Could be (only a bit cleaner, but lets you chain scopes at least):
@temp = Post.joins("JOIN (SELECT post_id, COUNT(*) AS count FROM comments GROUP BY post_id) AS comment_count ON comment_count.post_id = posts.id").order("count desc").limit(5)
I'm pretty sure Arel will let you do a sub-select a little more elegantly through scopes but I'm not familiar enough with it to give you a solution.
EDIT
You may be able to just add a counter_cache to the belongs_to :post association on your Comment model (with a comments_count column on the posts table) to avoid having to calculate it every time.
class Comment < ActiveRecord::Base
  belongs_to :post, :counter_cache => true
end
From the Rails docs for belongs_to:
:counter_cache
Caches the number of belonging objects on the associate class through
the use of increment_counter and decrement_counter. The counter cache
is incremented when an object of this class is created and decremented
when it’s destroyed. This requires that a column named
#{table_name}_count (such as comments_count for a belonging Comment
class) is used on the associate class (such as a Post class). You can
also specify a custom counter cache column by providing a column name
instead of a true/false value to this option (e.g., :counter_cache =>
:my_custom_counter.) Note: Specifying a counter cache will add it to
that model’s list of readonly attributes using attr_readonly.
You don't need the subquery here, you could simply say:
Post.select("posts.id, posts.title, COUNT(*) count").
joins(:comments).
group('posts.id').
order('count DESC').
limit(5)
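To sanity-check what either form of the query should return, the same computation can be sketched in plain Ruby. This is only an illustration: the arrays below are hypothetical stand-ins for the posts and comments tables, not ActiveRecord objects.

```ruby
# Hypothetical in-memory stand-ins for the posts and comments tables.
posts = [
  { id: 1, title: "First" },
  { id: 2, title: "Second" },
  { id: 3, title: "Third" }
]
comments = [
  { post_id: 2 }, { post_id: 2 }, { post_id: 2 },
  { post_id: 1 }, { post_id: 1 }
]

# GROUP BY post_id with COUNT(*): one counter per post that has comments.
comment_counts = comments.map { |c| c[:post_id] }.tally

# INNER JOIN semantics drop posts without comments;
# then ORDER BY count DESC LIMIT 5.
top_posts = posts
  .select { |p| comment_counts.key?(p[:id]) }
  .map { |p| p.merge(count: comment_counts[p[:id]]) }
  .sort_by { |p| -p[:count] }
  .first(5)

top_posts.each { |p| puts "#{p[:id]} #{p[:title]} #{p[:count]}" }
```

Note that post 3 never appears: like the INNER JOIN, this drops posts that have no comments at all.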
I have two models, Post and Like, with a relationship between them: Post has_many likes. I want an optimal way to find which post has the most likes. One way of doing this is:
count = {}
Post.includes(:likes).each do |post|
count[post.id] = post.likes.count
end
Initially I used an array, which is not a good data structure for this, so I switched to a hash, but I am still not satisfied with this approach. What would be the best way to get the posts with the most likes?
I have also tried the following query, but it is not working as expected. Could anyone suggest a better, more optimal approach?
Post.joins("LEFT OUTER JOIN Likes ON likes.post_id =posts.id").group("posts.id").order("COUNT(likes.id) DESC")
Use counter_cache so that you always have a count of likes on the Post objects; then you can call Post.order("likes_count DESC").first to retrieve the one post that has the most likes (Post.maximum(:likes_count) returns only the highest count, not the post itself). Likewise, any Post query will include a post's like count.
You don't need joining. Group likes by post_id and count them. The resulting post_id with max count will be id of your most liked post. Then you can join or just select the post you're looking for. In pure SQL it would look like:
SELECT l.post_id, count(*) as cnt
FROM likes l
GROUP BY l.post_id
ORDER BY cnt DESC
LIMIT 1;
I'm using Rails 4.2 and PostgreSQL 9.4.
I have a basic users, reservations and events schema.
I'd like to return a list of users and the most recent event they attended, along with what date/time this was at.
I've created a query that returns the user and the time of the most recent event. However I need to return the events.id as well.
My application does not allow a user to reserve two events with the same start time, however I appreciate SQL does not know anything about this and thinks there can be multiple events in the result. Hence I am happy for the query to return an appropriate event ID at random in the case of a hypothetical 'tie' for events.starts_at.
User.all.joins(reservations: :event)
.select('users.*, max(events.starts_at)')
.where('reservations.state = ?', "attended")
.where('events.company_id = ?', 1)
.group('users.id')
The corresponding SQL query is:
SELECT users.*, max(events.starts_at) FROM "users" INNER JOIN "reservations" ON "reservations"."user_id" = "users"."id" INNER JOIN "events" ON "events"."id" = "reservations"."event_id" WHERE (reservations.state = 'attended') AND (events.company_id = 1) GROUP BY users.id
The reservations table is very large so loading the entire set into Rails and processing it via Ruby code is undesirable. I'd like to perform the entire query in SQL if it is possible to do so.
My basic model:
User
has_many :reservations
Reservation
belongs_to :user
belongs_to :event
Event
belongs_to :company
has_many :reservations
The generic sql that returns data for the most recent event looks like this:
select yourfields
from yourtables
join
(select someField
, max(datetimefield) maxDateTime
from table1
where whatever
group by someField ) temp on table1.someField = temp.somefield
and table1.dateTimeField = maxDateTime
where whatever
The two "where whatever" things should be the same. All you have to do is adapt this construct into your app. You might consider putting the query into a stored procedure which you then call from your app.
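As a rough illustration of what that construct computes, here is a plain-Ruby sketch of "latest row per group" that also keeps the event id. The rows array is a hypothetical stand-in for reservations already joined to events; on a tie, max_by simply keeps one of the tied rows, which matches the "any appropriate event id" requirement in the question.

```ruby
# Hypothetical rows, shaped like reservations joined to events.
rows = [
  { user_id: 1, event_id: 10, state: "attended", starts_at: Time.utc(2015, 1, 5, 18) },
  { user_id: 1, event_id: 11, state: "attended", starts_at: Time.utc(2015, 3, 2, 18) },
  { user_id: 1, event_id: 12, state: "booked",   starts_at: Time.utc(2015, 6, 1, 18) },
  { user_id: 2, event_id: 10, state: "attended", starts_at: Time.utc(2015, 1, 5, 18) }
]

latest_by_user = rows
  .select { |r| r[:state] == "attended" }  # the "where whatever" filter
  .group_by { |r| r[:user_id] }            # GROUP BY user
  .transform_values { |rs| rs.max_by { |r| r[:starts_at] } }  # whole row holding the max

latest_by_user.each do |user_id, row|
  puts "user #{user_id}: event #{row[:event_id]} at #{row[:starts_at]}"
end
```

Keeping the whole row (rather than only the max timestamp) is what the join back on the datetime column achieves in the SQL version.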
I think your query should focus first to retrieve the most recent reservation.
SELECT MAX(events.starts_at), events.id, user_id FROM reservations WHERE (reservations.state = 'attended')
Then JOIN the Users and Events.
Assuming the results will include every User and Event, it may be more efficient to retrieve all users and events and store them in two arrays keyed by id.
The logic behind that is rather than a separate lookup into the user and events table for each resulting reservation by the db engine, it is more efficient to get them all in a single query.
SELECT * FROM Users ORDER BY user_id
SELECT * FROM Events ORDER BY event_id
I am not familiar with Rails syntax so cannot give exact code but can show using it in PHP code, the results are put into the array with a single line of code.
while ($row = mysql_fetch_array($results, MYSQL_ASSOC)) { $users[$row['user_id']] = $row; }
Then when processing the Reservations you get the user and event data from the arrays.
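For readers on the Ruby side, the same keyed-lookup idea might look roughly like this. The data is made up for illustration; in a Rails app, ActiveSupport's index_by(&:id) builds the same kind of hash from a collection of records.

```ruby
# Hypothetical rows, as if fetched with two plain SELECTs.
users = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }]
events = [{ id: 10, title: "Launch" }, { id: 11, title: "Meetup" }]
reservations = [
  { user_id: 1, event_id: 11 },
  { user_id: 2, event_id: 10 }
]

# Build one hash per table, keyed by primary key, in a single pass each.
users_by_id = users.to_h { |u| [u[:id], u] }
events_by_id = events.to_h { |e| [e[:id], e] }

# Resolve each reservation with two hash lookups instead of two queries.
report = reservations.map do |r|
  user = users_by_id.fetch(r[:user_id])
  event = events_by_id.fetch(r[:event_id])
  "#{user[:name]} attended #{event[:title]}"
end

puts report
```

This only pays off when most users and events actually appear in the result; otherwise loading both tables in full is wasted work.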
The index on reservations is critical and may be worth profiling.
Possible choices to profile include having or omitting the 'attended' state in the index. events.starts_at should be the first column in the index, followed by user_id, but the column order should be profiled as well.
You may want to use a unique index to enforce the rule against duplicate reservation times.
I have Articles that have_many Metrics. I wish to order the Articles by a specific Metric.value when Metric.name = "score". (Metric records various article stats as 'name' and 'value' pairs. An Article can have multiple metrics, and even multiple 'scores', although I'm only interested in ordering by the most recent.)
class Article
has_many :metrics
class Metric
# name :string(255)
# value :decimal(, )
belongs_to :article
I'm struggling to write a scope to do this - any ideas? Something like this?
scope :highest_score, joins(:metrics).order('metrics.value DESC')
.where('metrics.name = "score"')
UPDATE:
An article may have many "scores" stored in the metrics table (as they are calculated weekly/monthly/yearly etc.) but I'm only interested in using the first-found (most recent) "score" for any one article. The Metric model has a default_scope that ensures DESCending ordering.
Fixed typo on quote location for 'metrics.value DESC'.
Talking to my phone-a-friend uber rails hacker, it looks likely I need a raw SQL query for this. Now I'm in way over my head... (I'm using Postgres if that helps.)
Thanks!
UPDATE 2:
Thanks to Erwin's great SQL query suggestion I have a raw SQL query that works:
SELECT a.*
FROM articles a
LEFT JOIN (
SELECT DISTINCT ON (article_id)
article_id, value
FROM metrics m
WHERE name = 'score'
ORDER BY article_id, date_created DESC
) m ON m.article_id = a.id
ORDER BY m.value DESC;
article_list_by_desc_score = ActiveRecord::Base.connection.execute(sql)
Which gives an array of hashes representing article data (but not article objects??).
Follow-up question:
Any way of translating this back into an activerecord query for Rails? (so I can then use it in a scope)
SOLUTION UPDATE:
In case anyone is looking for the final ActiveRecord query - many thanks to Mattherick who helped me in this question. The final working query is:
scope :highest_score, joins(:metrics).
  where("metrics.name" => "score").
  order("metrics.value desc").
  group("metrics.article_id", "articles.id", "metrics.value", "metrics.date_created").
  order("metrics.date_created desc")
Thanks everyone!
The query could work like this:
SELECT a.*
FROM article a
LEFT JOIN (
SELECT DISTINCT ON (article_id)
article_id, value
FROM metrics m
WHERE name = 'score'
ORDER BY article_id, date_created DESC
) m ON m.article_id = a.article_id
ORDER BY m.value DESC NULLS LAST;
First, retrieve the "most recent" value for name = 'score' per article in the subquery m. More explanation for the used technique in this related answer:
Select first row in each GROUP BY group?
You seem to fall victim to a very basic misconception though:
but I'm only interested in using the first-found (most recent) "score"
for any one article. The Metric model has a default_scope that ensures DESCending ordering.
There is no "natural order" in a table. In a SELECT, you need to ORDER BY well defined criteria. For the purpose of this query I am assuming a column metrics.date_created. If you have nothing of the sort, you have no way to define "most recent" and are forced to fall back to an arbitrary pick from multiple qualifying rows:
ORDER BY article_id
This is not reliable. Postgres will pick a row as it chooses, and the pick may change with any update to the table or any change in the query plan.
Next, LEFT JOIN to the table article and ORDER BY value. Note that in Postgres NULL sorts first in descending order by default, so use ORDER BY m.value DESC NULLS LAST if articles without a qualifying value should go last.
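To make the two steps concrete, here is a small plain-Ruby sketch of the same logic: pick the most recent score per article, then sort articles by that score in descending order with unscored articles placed last (what NULLS LAST does in SQL). The sample data and the ISO date strings are assumptions for illustration only.

```ruby
# Hypothetical stand-ins for the articles and metrics tables.
articles = [{ id: 1, title: "A" }, { id: 2, title: "B" }, { id: 3, title: "C" }]
metrics = [
  { article_id: 2, name: "score", value: 4.0,  date_created: "2013-01-01" },
  { article_id: 2, name: "score", value: 9.5,  date_created: "2013-02-01" },
  { article_id: 1, name: "score", value: 7.0,  date_created: "2013-01-15" },
  { article_id: 1, name: "views", value: 99.0, date_created: "2013-03-01" }
]

# Subquery: DISTINCT ON (article_id) ... ORDER BY article_id, date_created DESC
latest_score = metrics
  .select { |m| m[:name] == "score" }
  .group_by { |m| m[:article_id] }
  .transform_values { |ms| ms.max_by { |m| m[:date_created] }[:value] }

# LEFT JOIN + ORDER BY value DESC, with missing scores sorted last.
ordered = articles.sort_by do |a|
  score = latest_score[a[:id]]
  score.nil? ? [1, 0.0] : [0, -score]
end

puts ordered.map { |a| a[:title] }.join(", ")
```

Article 2 ranks by its newer score (9.5), not its older one (4.0), and article 3, which has no score at all, still appears at the end rather than being dropped.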
Note: some not-so-smart ORMs (and I am afraid Ruby's ActiveRecord is one of them) use the non-descriptive and non-distinctive id as name for the primary key. You'll have to adapt to your actual column names, which you didn't provide.
Performance
Should be decent. This is a "simple" query as far as Postgres is concerned. A partial multicolumn index on table metrics would make it faster:
CREATE INDEX metrics_some_name_idx ON metrics(article_id, date_created)
WHERE name = 'score';
Columns in this order. In PostgreSQL 9.2+ you could add the column value to make index-only scans possible:
CREATE INDEX metrics_some_name_idx ON metrics(article_id, date_created, value)
WHERE name = 'score';
So, I have this "advanced" query (not much, really) and I would like to translate it into Ruby Active Record's syntax.
SELECT microposts.*
FROM microposts
WHERE user_id IN
( SELECT r.followed_id as uid
FROM relationships r
WHERE follower_id = 1
UNION
SELECT u.id as uid
FROM users as u
WHERE id = 1
)
ORDER BY microposts.created_at DESC
The idea is to retrieve all microposts for user 1 and for the users that user 1 follows, in descending creation order, but I really don't know how to translate this easily into Active Record's syntax.
Any thoughts?
PS: As asked, here is some Rails context:
I have 3 models: Microposts, Users, Relationships.
Relationships is a join table handling all user relationships (follower/followed stuff).
Users have many followed_users/followers through relationships.
Users have many microposts, and microposts have one user.
Thanks.
No idea about Ruby but the SQL can be simplified to:
SELECT microposts.*
FROM microposts
WHERE user_id IN
( SELECT r.followed_id as uid
FROM relationships r
WHERE follower_id = 1
)
OR user_id = 1
ORDER BY microposts.created_at DESC
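A quick way to convince yourself that the two forms select the same rows is to model both in plain Ruby. Everything below is made-up sample data, with integers standing in for created_at timestamps; the equivalence also assumes that user 1 actually exists in users, since otherwise the UNION branch would contribute nothing.

```ruby
# Hypothetical stand-ins for the three tables.
users = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }]
relationships = [
  { follower_id: 1, followed_id: 2 },
  { follower_id: 1, followed_id: 3 },
  { follower_id: 2, followed_id: 1 }
]
microposts = [
  { id: 100, user_id: 1, created_at: 3 },
  { id: 101, user_id: 2, created_at: 5 },
  { id: 102, user_id: 4, created_at: 4 },
  { id: 103, user_id: 3, created_at: 1 }
]

uid = 1
followed_ids = relationships.select { |r| r[:follower_id] == uid }.map { |r| r[:followed_id] }
own_ids = users.select { |u| u[:id] == uid }.map { |u| u[:id] }

# Original form: user_id IN (followed ids UNION own id)
union_feed = microposts.select { |m| (followed_ids | own_ids).include?(m[:user_id]) }

# Simplified form: user_id IN (followed ids) OR user_id = uid
or_feed = microposts.select { |m| followed_ids.include?(m[:user_id]) || m[:user_id] == uid }

# ORDER BY created_at DESC, reduced to ids for easy comparison.
newest_first = ->(rows) { rows.sort_by { |m| -m[:created_at] }.map { |m| m[:id] } }

puts newest_first.call(union_feed).inspect
puts newest_first.call(or_feed).inspect
```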
My answer will assume (since you've provided no ruby/rails-context outside of your raw SQL query) you have a User model, a Micropost model through relation :microposts, and a Relationship model through relation :following. User has many Micropost and Relationship instances related. You could do
user = User.find(1)
user.microposts + user.following.microposts
or you could move this into a method within Micropost
def self.own_and_following(user)
user.microposts + user.following.microposts
end
And call Micropost.own_and_following(User.find(1)).
This may not be what you're looking for, but given the likely relations in your Rails application mentioned above, it sounds like something similar to this should work.
Your query is very specific, therefore your best bet would be to write a good portion of it using SQL, or try a gem like squeel that can help out generating very customized SQL from ActiveRecord.
Nevertheless, this should do the work with no additional gems :
user_id = ... #Get the user_id you want to test
Micropost.where("user_id IN
( SELECT r.followed_id as uid
FROM relationships r
WHERE follower_id = ? )
OR user_id = ?
", user_id, user_id).order("created_at desc")
I managed to do it using only where, seems a lot like a find_by_sql to me, and I don't know which one would be better :
Micropost.order('created_at DESC').
where('user_id in (select r.followed_id as uid from relationships as r where follower_id = ?) or user_id = ?', user.id, user.id)
Don't know how good this is, but it seems to be working.
UPDATE: So thanks to @Erwin Brandstetter, I now have this:
def self.unique_users_by_company(company)
users = User.arel_table
cards = Card.arel_table
users_columns = User.column_names.map { |col| users[col.to_sym] }
cards_condition = cards[:company_id].eq(company.id).
and(cards[:user_id].eq(users[:id]))
User.joins(:cards).where(cards_condition).group(users_columns).
order('min(cards.created_at)')
end
... which seems to do exactly what I want. There are two shortcomings that I would still like to have addressed, however:
The order() clause is using straight SQL instead of Arel (couldn't figure it out).
Calling .count on the query above gives me this error:
NoMethodError: undefined method 'to_sym' for
#<Arel::Attributes::Attribute:0x007f870dc42c50> from
/Users/neezer/.rvm/gems/ruby-1.9.3-p0/gems/activerecord-3.1.1/lib/active_record/relation/calculations.rb:227:in
'execute_grouped_calculation'
... which I believe is probably related to how I'm mapping out the users_columns, so I don't have to manually type in all of them in the group clause.
How can I fix those two issues?
ORIGINAL QUESTION:
Here's what I have so far that solves the first part of my question:
def self.unique_users_by_company(company)
users = User.arel_table
cards = Card.arel_table
cards_condition = cards[:company_id].eq(company.id)
.and(cards[:user_id].eq(users[:id]))
User.where(Card.where(cards_condition).exists)
end
This gives me 84 unique records, which is correct.
The problem is that I need those User records ordered by cards[:created_at] (whichever is earliest for that particular user). Appending .order(cards[:created_at]) to the scope at the end of the method above does absolutely nothing.
I tried adding in a .joins(:cards), but that returns 587 records, which is incorrect (duplicate Users). group_by as I understand it is practically useless here as well, because of how PostgreSQL handles it.
I need my result to be an ActiveRecord::Relation (so it's chainable) that returns a list of unique users who have cards that belong to a given company, ordered by the creation date of their first card... with a query that's written in Ruby and is database-agnostic. How can I do this?
class Company
has_many :cards
end
class Card
belongs_to :user
belongs_to :company
end
class User
has_many :cards
end
Please let me know if you need any other information, or if I wasn't clear in my question.
The query you are looking for should look like this one:
SELECT user_id, min(created_at) AS min_created_at
FROM cards
WHERE company_id = 1
GROUP BY user_id
ORDER BY min(created_at)
You can join in the users table if you need columns of that table in the result; otherwise you don't even need it for the query.
If you don't need min_created_at in the SELECT list, you can just leave it away.
Should be easy to translate to Ruby (which I am no good at).
To get the whole user record (as I derive from your comment):
SELECT u.*
FROM users u
JOIN (
SELECT user_id, min(created_at) AS min_created_at
FROM cards
WHERE company_id = 1
GROUP BY user_id
) c ON u.id = c.user_id
ORDER BY min_created_at
Or:
SELECT u.*
FROM users u
JOIN cards c ON u.id = c.user_id
WHERE c.company_id = 1
GROUP BY u.id, u.col1, u.col2, .. -- You have to spell out all columns!
ORDER BY min(c.created_at)
With PostgreSQL 9.1+ you can simply write:
GROUP BY u.id
(like in MySQL) .. provided id is the primary key.
I quote the release notes:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
The SQL standard allows this behavior, and because of the primary key,
the result is unambiguous.
The fact that you need it to be chainable complicates things, otherwise you can either drop down into SQL yourself or only select the column(s) you need via select("users.id") to get around the Postgres issue. Because at the heart of it your query is something like
SELECT users.id
FROM users
INNER JOIN cards ON users.id = cards.user_id
WHERE cards.company_id = 1
GROUP BY users.id, DATE(cards.created_at)
ORDER BY DATE(cards.created_at) DESC
Which in Arel syntax is more or less:
User.select("users.id").joins(:cards).where(:"cards.company_id" => company.id).group("users.id, DATE(cards.created_at)").order("DATE(cards.created_at) DESC")