Which of these two AR queries is more efficient? (i.e., is it better to outsource some query work to Ruby?)

@users = Hash.new
@users[:count] = User.count(:all, :joins => my_join, :conditions => my_conditions)
@users[:data] = User.find(:all, :joins => my_join, :conditions => my_conditions)
or
@users = Hash.new
@users[:data] = User.find(:all, :joins => my_join, :conditions => my_conditions)
@users[:count] = @users[:data].count
It seems like the first option consists of two database queries (which from what I read is expensive) while in the second one, we only make one database query and do the counting work at the Ruby level.
Which one is more efficient?

The second one is better, since, just like you said, it saves a database query.
P.S.
Be careful with the new finder methods introduced in Rails 3: calling count on their result fires a separate COUNT(*) query:
users = User.where(...) # SELECT "users".* FROM "users" WHERE ...
users_count = users.count # SELECT COUNT(*) FROM "users" WHERE ...
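To avoid the extra query, call size instead. It uses the records that are already loaded and only falls back to COUNT(*) when the relation has not been loaded yet:
users = User.where(...) # SELECT "users".* FROM "users" WHERE ...
users_count = users.size # No database query once users has been loaded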
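The difference between count and size can be illustrated without a database. The following is only a rough plain-Ruby sketch: FakeRelation and its query log are hypothetical stand-ins, not Active Record's implementation, and they merely mimic the behavior described above (count always queries, size reuses records that are already loaded).

```ruby
# Hypothetical stand-in for a Rails 3 relation. It only mimics the
# count/size behavior described above and logs every "query" it would run.
class FakeRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend these rows live in the database
    @records = nil    # nothing has been loaded yet
    @queries = []
  end

  # Loads the records once, the way iterating over a relation would.
  def to_a
    @records ||= begin
      @queries << "SELECT *"
      @rows.dup
    end
  end

  # Always asks the "database".
  def count
    @queries << "SELECT COUNT(*)"
    @rows.length
  end

  # Reuses loaded records when available, otherwise falls back to count.
  def size
    @records ? @records.length : count
  end
end

users = FakeRelation.new(%w[alice bob carol])
users.to_a    # one SELECT
users.size    # no extra query: the records are already loaded
users.count   # fires a COUNT(*) even though the records are loaded
```

After this runs, users.queries holds one SELECT and one COUNT(*): the call to size added nothing to the log.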

Related

Sequence of finder methods in Active Record Query in Rails

I am writing a complex Active Record query to fetch data from multiple tables; the query has joins, select, order, group, and where clauses.
@posts = Post.published.paginate(:order => 'popularity desc, id',
:joins => [:comments, :images, :updates, :user],
:conditions => conditions,
:group => "posts.id",
:select => "posts.*",
:per_page => 10,
:page => params[:page])
I wanted to know what the sequence of where, joins, etc. should be, both to follow the standard and to maximize the performance of the query. It would be really great if someone could write a query that shows the sequence, like
@posts = Post.published.joins(:comments, :images, :updates, :user).where(....
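As far as I know, the sequence does not matter: Active Record assembles the SQL clauses in the correct order regardless of the order in which the methods are chained.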
An example would be:
@posts = Post.published.select('posts.id').
joins(:comments, :images, :updates, :user).
where('users.email = ?', 'john@doe.com').
group('posts.id')
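Why the chaining order has no effect on the generated SQL can be sketched with a toy builder. ToyQuery below is a hypothetical, heavily simplified class, not how Active Record or Arel is actually implemented. It only illustrates the idea that each chained call records a clause, and the SQL string is assembled in a fixed clause order at the end.

```ruby
# Hypothetical, heavily simplified query builder (not Active Record's code).
# Chained calls only record clauses; to_sql emits them in a fixed order.
class ToyQuery
  def initialize(table, clauses = {})
    @table = table
    @clauses = clauses
  end

  def select(columns)
    with(:select, columns)
  end

  def joins(fragment)
    with(:joins, fragment)
  end

  def where(condition)
    with(:where, condition)
  end

  def group(columns)
    with(:group, columns)
  end

  # Clause order is decided here, not by the order of the chained calls.
  def to_sql
    columns = @clauses[:select] || "#{@table}.*"
    sql = "SELECT #{columns} FROM #{@table}"
    sql += " #{@clauses[:joins]}" if @clauses[:joins]
    sql += " WHERE #{@clauses[:where]}" if @clauses[:where]
    sql += " GROUP BY #{@clauses[:group]}" if @clauses[:group]
    sql
  end

  private

  # Returns a new builder so that chains never mutate each other.
  def with(key, value)
    ToyQuery.new(@table, @clauses.merge(key => value))
  end
end

join = "INNER JOIN users ON users.id = posts.user_id"
cond = "users.email = 'john@doe.com'"

a = ToyQuery.new("posts").select("posts.id").joins(join).where(cond).group("posts.id")
b = ToyQuery.new("posts").group("posts.id").where(cond).joins(join).select("posts.id")

puts a.to_sql == b.to_sql
```

Both chains produce the same string, so the order to pick is simply whichever reads best.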

SQL problems when migrating from MySQL to PostgreSQL

I have a Ruby on Rails 2.3.x application that I'm trying to migrate from my own VPS to Heroku, including porting from SQLite (development) and MySQL (production) to Postgres.
This is a typical Rails call I'm using:
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => :thing_id, :order => order, :conditions => conditions, :page => page, :per_page => per_page)
Question 1: I get a lot of errors like PG::Error: ERROR: column "spots.id" must appear in the GROUP BY clause or be used in an aggregate function. SQLite/MySQL was evidently more forgiving here. Of course I can easily fix these by adding the specified fields to my :group parameter, but I feel I'm messing up my code. Is there a better way?
Question 2: If I throw in all the GROUP BY columns that Postgres is missing I end up with the following statement (only :group has changed):
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)
This in turn produces the following SQL code:
SELECT * FROM (SELECT DISTINCT ON ("spots".id) "spots".id, spots.created_at AS alias_0 FROM "spots"
LEFT OUTER JOIN "things" ON "things".id = "spots".thing_id
WHERE (spots.recommended_to_user_id = 1 OR spots.user_id IN (1) OR things.is_featured = 't')
GROUP BY thing_id,things.id,users.id,spots.id) AS id_list
ORDER BY id_list.alias_0 DESC LIMIT 16 OFFSET 0;
...which produces the error PG::Error: ERROR: missing FROM-clause entry for table "users". How can I solve this?
Question 1:
...Is there a better way?
Yes. Since PostgreSQL 9.1 the primary key of a table logically covers all columns of a table in the GROUP BY clause. I quote the release notes for version 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Question 2:
The following statement ... produces the error
PG::Error: ERROR: missing FROM-clause entry for table "users"
How can I solve this?
First (as always!), I formatted your query to make it easier to understand. The culprit is users.id in the GROUP BY clause, which references a table that appears nowhere in the FROM clause:
SELECT *
FROM  (
   SELECT DISTINCT ON (spots.id)
          spots.id, spots.created_at AS alias_0
   FROM   spots
   LEFT   JOIN things ON things.id = spots.thing_id
   WHERE  (spots.recommended_to_user_id = 1 OR
           spots.user_id IN (1) OR
           things.is_featured = 't')
   GROUP  BY thing_id, things.id, users.id, spots.id
   ) id_list
ORDER  BY id_list.alias_0 DESC
LIMIT  16
OFFSET 0;
It's all obvious now, right?
Well, not all of it. There is a lot more. DISTINCT ON and GROUP BY in the same query for one, which has its uses, but not here. Radically simplify to:
SELECT s.id, s.created_at AS alias_0
FROM spots s
WHERE s.recommended_to_user_id = 1 OR
s.user_id = 1 OR
EXISTS (
SELECT 1 FROM things t
WHERE t.id = s.thing_id
AND t.is_featured = 't')
ORDER BY s.created_at DESC
LIMIT 16;
The EXISTS semi-join avoids the later need to GROUP BY a priori. This should be much faster (besides being correct) - if my assumptions about the missing table definitions hold.
Going the "pure SQL" route opened up a can of worms for me, so I tried keeping the will_paginate gem and tweaking the Spot.paginate parameters instead. The :joins parameter turned out to be very helpful.
This is currently working for me:
spots = Spot.paginate(:all, :include => [:thing, {:thing => :tags}, {:thing => :brand}], :joins => [:user, :store, :thing], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)

Rails 3.0 One-One Association Using associated model in WHERE clause

When I do:
conditions = {:first_name => 'Chris'}
Patient.joins(:user).find(:all, :conditions => conditions)
It produces the following SQL (and fails because first_name is not in the patients table):
SELECT "patients".* FROM "patients" INNER JOIN "users" ON "users"."id" = "patients"."user_id" WHERE "patients"."first_name" = 'Chris'
I need to be able to query the User model's fields also and get back Patient objects. Is this possible?
Try this:
conditions = ['users.first_name = ?', 'Chris']
Patient.joins(:user).find(:all, :conditions => conditions)
Try changing your conditions hash to:
conditions = {'users.first_name' => 'Chris'}
I've used this style in Rails 2.3, and it worked great for me. Cheers!

Advanced count and join in Rails

I am trying to find the top n categories by the number of articles they relate to; there is a habtm relationship set up between the two. This is the SQL I want to execute, but I am unsure how to do this with ActiveRecord, aside from using the find_by_sql method. Is there any way of doing this with ActiveRecord methods?
SELECT
"categories".id,
"categories".name,
count("articles".id) as counter
FROM "categories"
JOIN "articles_categories"
ON "articles_categories".category_id = "categories".id
JOIN "articles"
ON "articles".id = "articles_categories".article_id
GROUP BY "categories".id
ORDER BY counter DESC
LIMIT 5;
You can use find with options to achieve the same query:
Category.find(:all,
:select => '"categories".id, "categories".name, count("articles".id) as counter',
:joins => :articles,
:group => '"categories".id',
:order => 'counter DESC',
:limit => 5
)
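To make explicit what this query computes, here is a rough plain-Ruby equivalent over in-memory data. The sample rows, the category names, and the top_categories helper are made up for illustration, and the sketch assumes every row of articles_categories points at an existing article and category.

```ruby
# Hypothetical rows of the articles_categories join table: [article_id, category_id]
articles_categories = [[1, 10], [2, 10], [3, 10], [1, 20], [2, 20], [3, 30]]
# Hypothetical categories table: id => name
categories = { 10 => "ruby", 20 => "sql", 30 => "rails" }

def top_categories(categories, join_rows, limit)
  # GROUP BY categories.id with count(articles.id)
  counts = Hash.new(0)
  join_rows.each { |_article_id, category_id| counts[category_id] += 1 }

  # ORDER BY counter DESC
  ordered = counts.sort_by { |_category_id, counter| -counter }

  # LIMIT n, then shape each row like the SELECT list
  ordered.first(limit).map do |category_id, counter|
    { id: category_id, name: categories[category_id], counter: counter }
  end
end

top = top_categories(categories, articles_categories, 2)
top.each { |row| puts "#{row[:name]}: #{row[:counter]}" }
```

Like the inner joins in the SQL, this leaves out categories that have no articles at all.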

Complex Join Queries in Rails

I have 3 tables - venues, users, and updates (which have an integer for rating) - and I want to write a query that will return a list of all my venues as well as their average ratings, using only the most recent update for each (user, venue) pair. For example, if user 1 rates venue A once at 9 am with a 4, and then rates it again at 5 pm with a 3, I only want to use the rating of 3, since it's more recent. There are also some optional conditions, such as how recent the updates must be, and whether there is an array of user ids that the users must be within.
Does anybody have a suggestion on what the best way to write something like this is so that it is clean and efficient? I have written the following named_scope which should do the trick, but it is pretty ugly:
named_scope :with_avg_ratings, lambda { |*params|
hash = params.first || {}
has_params = hash[:user_ids] || hash[:time_ago]
dir = hash[:dir] || 'DESC'
{
:joins => %Q{
LEFT JOIN (select user_id, venue_id, max(updated_at) as last_updated_at from updates
WHERE type = 'Review' GROUP BY user_id, venue_id) lu ON lu.venue_id = venues.id
LEFT JOIN updates ON lu.last_updated_at = updates.updated_at
AND updates.venue_id = venues.id AND updates.user_id = lu.user_id
},
:select => "venues.*, ifnull(avg(rating),0) as avg_rating",
:group => "venues.id",
:order => "avg_rating #{dir}",
:conditions => Condition.block { |c|
c.or { |a|
a.and "updates.user_id", hash[:user_ids] if hash[:user_ids]
a.and "updates.updated_at", '>', hash[:time_ago] if hash[:time_ago]
} if has_params
c.or "updates.id", 'is', nil if has_params
}
}
}
I include the last "updates.id is null" condition because I still want the venues returned even if they don't have any updates associated with them.
Thanks,
Eric
Yikes, that looks like a job for find_by_sql to me. When you're doing something that complex, I find it's best to take the job away from ActiveRecord and DIY.
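Whichever route you take, it helps to pin down the logic the SQL has to implement before writing it. Below is a minimal plain-Ruby sketch of that logic over in-memory rows. The Update struct, the sample data, and the average_ratings helper are hypothetical, and the optional user and time filters from the question are left out.

```ruby
# Hypothetical in-memory rows standing in for the updates table.
# updated_at is simplified to an hour of the day.
Update = Struct.new(:user_id, :venue_id, :rating, :updated_at)

updates = [
  Update.new(1, "A", 4, 9),   # user 1 rates venue A at 9 am
  Update.new(1, "A", 3, 17),  # and again at 5 pm; only this one should count
  Update.new(2, "A", 5, 12),
  Update.new(2, "B", 2, 10)
]
venue_ids = ["A", "B", "C"]   # venue C has no updates at all

def average_ratings(venue_ids, updates)
  # Keep only the most recent update per (user, venue) pair.
  latest = updates.group_by { |u| [u.user_id, u.venue_id] }
                  .values
                  .map { |group| group.max_by(&:updated_at) }
  by_venue = latest.group_by(&:venue_id)

  # Venues without updates get 0, mirroring ifnull(avg(rating), 0).
  venue_ids.each_with_object({}) do |venue_id, result|
    ratings = by_venue.fetch(venue_id, []).map(&:rating)
    result[venue_id] = ratings.empty? ? 0 : ratings.sum.to_f / ratings.size
  end
end

averages = average_ratings(venue_ids, updates)
averages.each { |venue_id, avg| puts "#{venue_id}: #{avg}" }
```

Once the expected numbers are clear from a sketch like this, the hand-written SQL passed to find_by_sql can be checked against them.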