Sequence of finder methods in Active Record Query in Rails - ruby-on-rails-3

I am writing a complex Active record query to fetch data from multiple tables, the query have joins , select , order , group ,select where.
#posts = Post.published.paginate(:order => 'popularity desc, id',
:joins => [:comments, :images, :updates, :user],
:conditions => conditions,
:group => "posts.id",
:select => "posts.id*,
:per_page => 10,
:page => params[:page])
I wanted to know what should be the sequence of where , joins etc as per the standard and to maximize the performance of the query. If someone could write a query to explain the sequence that would be really great like
#posts = Post.published.joins(:comments, :images, :updates, :user).where(....

As far as I know the sequence does not matter.
An example would be:
#posts = Post.published.select('posts.id').
joins(:comments, :images, :updates, :user).
where('users.email = ?', 'john#doe.com').
group('posts.id')

Related

SQL problems when migrating from MySQL to PostgreSQL

I have a Ruby on Rails 2.3.x application that I'm trying to migrate from my own VPS to Heroku, including porting from SQLite (development) and MySQL (production) to Postgres.
This is a typical Rails call I'm using:
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => :thing_id, :order => order, :conditions => conditions, :page => page, :per_page => per_page)
Question 1: I get a lot of errors like PG::Error: ERROR: column "spots.id" must appear in the GROUP BY clause or be used in an aggregate function. SQLite/MySQL was evidently more forgiving here. Of course I can easily fix these by adding the specified fields to my :group parameter, but I feel I'm messing up my code. Is there a better way?
Question 2: If I throw in all the GROUP BY columns that Postgres is missing I end up with the following statement (only :group has changed):
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)
This in turn produces the following SQL code:
SELECT * FROM (SELECT DISTINCT ON ("spots".id) "spots".id, spots.created_at AS alias_0 FROM "spots"
LEFT OUTER JOIN "things" ON "things".id = "spots".thing_id
WHERE (spots.recommended_to_user_id = 1 OR spots.user_id IN (1) OR things.is_featured = 't')
GROUP BY thing_id,things.id,users.id,spots.id) AS id_list
ORDER BY id_list.alias_0 DESC LIMIT 16 OFFSET 0;
...which produces the error PG::Error: ERROR: missing FROM-clause entry for table "users". How can I solve this?
Question 1:
...Is there a better way?
Yes. Since PostgreSQL 9.1 the primary key of a table logically covers all columns of a table in the GROUP BY clause. I quote the release notes for version 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Question 2:
The following statement ... produces the error
PG::Error: ERROR: missing FROM-clause entry for table "users"
How can I solve this?
First (as always!), I formatted your query to make it easier to understand. The culprit has bold emphasis:
SELECT *
FROM (
SELECT DISTINCT ON (spots.id)
spots.id, spots.created_at AS alias_0
FROM spots
LEFT JOIN things ON things.id = spots.thing_id
WHERE (spots.recommended_to_user_id = 1 OR
spots.user_id IN (1) OR
things.is_featured = 't')
GROUP BY thing_id, things.id, users.id, spots.id
) id_list
ORDER BY id_list.alias_0 DESC
LIMIT 16
OFFSET 0;
It's all obvious now, right?
Well, not all of it. There is a lot more. DISTINCT ON and GROUP BY in the same query for one, which has its uses, but not here. Radically simplify to:
SELECT s.id, s.created_at AS alias_0
FROM spots s
WHERE s.recommended_to_user_id = 1 OR
s.user_id = 1 OR
EXISTS (
SELECT 1 FROM things t
WHERE t.id = s.thing_id
AND t.is_featured = 't')
ORDER BY s.created_at DESC
LIMIT 16;
The EXISTS semi-join avoids the later need to GROUP BY a priori. This should be much faster (besides being correct) - if my assumptions about the missing table definitions hold.
Going the "pure SQL" route opened up a can of worms for me, so I tried keeping the will_paginate gem and tweak the Spot.paginate parameters instead. The :joins parameter turned out to be very helpful.
This is currently working for me:
spots = Spot.paginate(:all, :include => [:thing, {:thing => :tags}, {:thing => :brand}], :joins => [:user, :store, :thing], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)

Advanced count and join in Rails

I am try to find the top n number of categories as they relate to articles, there is a habtm relationship set up between the two. This is the SQL I want to execute, but am unsure of how to do this with ActiveRecord, aside from using the find_by_sql method. is there any way of doing this with ActiveRecord methods:
SELECT
"categories".id,
"categories".name,
count("articles".id) as counter
FROM "categories"
JOIN "articles_categories"
ON "articles_categories".category_id = "categories".id
JOIN "articles"
ON "articles".id = "articles_categories".article_id
GROUP BY "categories".id
ORDER BY counter DESC
LIMIT 5;
You can use find with options to achieve the same query:
Category.find(:all,
:select => '"categories".id, "categories".name, count("articles".id) as counter',
:joins => :articles,
:group => '"categories".id',
:order => 'counter DESC',
:limit => 5
)

Ruby on Rails Query Help

I currently have the following query
User.sum(:experience, :group => "clan", :conditions => ["created_at >= ? and created_at <= ?", "2010-02-15", "2010-02-16"])
I want to return the top 50 clans in terms of experience listed from most experience to least experience with only the top 50 in experience returning. How would I modify the query to achieve that result. I know I'll need :limit => 50 to limit the query but if I add :order => "clan DESC" I get the error column "users.experience" must appear in the GROUP BY clause or be used in an aggregate function
You need to repeat the calculation in the order clause
User.sum(:experience, :group => "clan", :order=> "sum(experience) DESC", :limit => 50, :conditions => ["created_at >= ? and created_at <= ?", "2010-02-15", "2010-02-16"])

Can I :select multiple fields (*, foo) without the extra ones being added to my instances (Instance.foo=>bar)

I'm trying to write a named scope that will order my 'Products' class based on the average 'Review' value. The basic model looks like this
Product < ActiveRecord::Base
has_many :reviews
Review < ActiveRecord::Base
belongs_to :product
# integer value
I've defined the following named scope on Product:
named_scope :best_reviews,
:select => "*, AVG(reviews.value) score",
:joins => "INNER JOIN (SELECT * FROM reviews GROUP BY reviews.product_id) reviews ON reviews.product_id = products.id",
:group => "reviews.product_id",
:order => "score desc"
This seems to be working properly, except that it's adding the 'score' value in the select to my Product instances, which causes problems if I try to save them, and makes comparisons return false (#BestProduct != Product.best_reviews.first, becuase Product.best_reviews.first has score=whatever).
Is there a better way to structure the named_scope? Or a way to make Rails ignore the extra field in the select?
I'm not a Rails developer, but I know SQL allows you to sort by a field that is not in the select-list.
Can you do this:
:select => "*",
:joins => "INNER JOIN (SELECT * FROM reviews GROUP BY reviews.product_id) reviews ON reviews.product_id = products.id",
:group => "reviews.product_id",
:order => "AVG(reviews.value) desc"
Wow, so I should really wait before asking questions. Here's one solution (I'd love to hear if there are better approaches):
I moved the score field into the inner join. That makes it available for ordering but doesn't seem to add it to the instance:
named_scope :best_reviews,
:joins => "INNER JOIN (
SELECT *, AVG(value) score FROM reviews GROUP BY reviews.product_id
) reviews ON reviews.product_id = products.id",
:group => "reviews.product_id",
:order => "reviews.score desc"

named_scope to order posts by last comment date

Posts has_many Comments
I'm using searchlogic which will order by named scopes. So, I'd like a named scope that orders by each post's most recent comment.
named_scope :ascend_by_comment, :order => ...comments.created_at??...
I'm not sure how to do a :joins and get only the most recent comment and sort by its created_at field, all in a named_scope.
I'm using mysql, fyi.
EDIT:
This is the SQL query I'd be trying to emulate:
SELECT tickets.*, comments.created_at AS comment_created_at FROM tickets
INNER JOIN
(SELECT comments.ticket_id, MAX(comments.created_at) AS created_at
FROM comments group by ticket_id) comments
ON tickets.id = comments.ticket_id ORDER BY comment_created_at DESC;
named_scope :ascend_by_comment,
:joins => "LEFT JOIN comments ON comments.post_id = posts.id",
:group => "id",
:select => "posts.*, max(comments.created_at) AS comment_created_max",
:order => "comment_created_max ASC"
You can try to optimize it, but it should work and give you some hints how to do it.
Edit:
After you edited question and shown that you want inner join (no posts without comments?), you can of course change :joins => "..." with :joins => :comments.
You can do that by joining or including the associated model through the scope, something like this will do the trick:
named_scope :ascend_by_comment, :joins => :comments, :order => "comments.created_at DESC"