rails - how to specify given number of top unique results returned in a given join query - sql

I'm trying to write a query that returns the top 20 unique "members" based on their top individual "report scores". The problem is that if the top member's second-best report score is also in the top 20, the query returns only 19 members:
top_members = Member.all(:joins=>[:reports], :conditions => ["score > 0"],:order => ["score DESC"],:limit => 20).uniq
What do I need to do to get the query to keep going until I have 20 members?
Thanks!

At the moment you're grabbing 20 results from the database and then discarding the duplicates when calling uniq(). Instead you can select distinct results from the database, something like:
top_members = Member.all(:joins=>[:reports],
:conditions => ["score > 0"],
:order => ["score DESC"],
:limit => 20,
:select => 'DISTINCT members.*').uniq

I think you'll have to do this in two stages:
1) id_list = Member.all(:select => 'distinct id', :conditions => ["score > 0"], :order => ["score DESC"], :limit => 20)
2) top_members = Member.find(id_list, :joins => [:reports])
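To see why the original query comes up short, here is a plain-Ruby sketch with no database involved. The `Report` struct and the sample scores are made up for illustration. The second approach reduces the rows to one per member (their best score) before ordering and limiting, which in SQL terms would roughly be a GROUP BY on the member id ordered by MAX(score); that is an alternative to the answers above, not something they propose.

```ruby
# Hypothetical in-memory stand-in for the joined members/reports rows.
Report = Struct.new(:member_id, :score)

reports = [
  Report.new(1, 99), Report.new(1, 95), # member 1 holds the two best scores
  Report.new(2, 90), Report.new(3, 80),
  Report.new(4, 70)
]

limit = 3

# What the original query does: take the top rows first, de-duplicate afterwards.
top_rows = reports.select { |r| r.score > 0 }
                  .sort_by { |r| -r.score }
                  .first(limit)
naive_ids = top_rows.map(&:member_id).uniq

# Alternative: reduce to one row per member (best score), then order and limit.
best_by_member = reports.select { |r| r.score > 0 }
                        .group_by(&:member_id)
                        .map { |id, rs| [id, rs.map(&:score).max] }
top_ids = best_by_member.sort_by { |_id, best| -best }
                        .first(limit)
                        .map(&:first)
```

With this sample data `naive_ids` ends up with fewer than `limit` members because member 1 occupies two of the top rows, while `top_ids` always fills the limit as long as enough members have a positive score.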

Related

Sequence of finder methods in Active Record Query in Rails

I am writing a complex Active Record query to fetch data from multiple tables; the query has joins, select, order, group and where clauses.
@posts = Post.published.paginate(:order => 'popularity desc, id',
:joins => [:comments, :images, :updates, :user],
:conditions => conditions,
:group => "posts.id",
:select => "posts.*",
:per_page => 10,
:page => params[:page])
I wanted to know what the sequence of where, joins, etc. should be, both as per the standard and to maximize the performance of the query. If someone could write a query to explain the sequence, that would be really great, like:
@posts = Post.published.joins(:comments, :images, :updates, :user).where(....
As far as I know the sequence does not matter.
An example would be:
@posts = Post.published.select('posts.id').
joins(:comments, :images, :updates, :user).
where('users.email = ?', 'john@doe.com').
group('posts.id')
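One way to picture why the order of the chained calls should not matter: each call only records a clause, and the SQL text is assembled in a fixed clause order at the end. The `ToyQuery` class below is a hypothetical, heavily simplified illustration of that idea (joins have no ON condition, for instance); it is not how ActiveRecord is actually implemented.

```ruby
# Hypothetical toy query builder: every method records a clause and returns self.
class ToyQuery
  def initialize(table)
    @table = table
    @select = "*"
    @joins = []
    @wheres = []
    @group = nil
  end

  def select(columns)
    @select = columns
    self
  end

  def joins(*tables)
    @joins.concat(tables)
    self
  end

  def where(condition)
    @wheres << condition
    self
  end

  def group(column)
    @group = column
    self
  end

  # Clauses are emitted in SQL order, regardless of the order they were added in.
  def to_sql
    parts = ["SELECT #{@select} FROM #{@table}"]
    parts.concat(@joins.map { |t| "JOIN #{t}" })
    parts << "WHERE #{@wheres.join(' AND ')}" unless @wheres.empty?
    parts << "GROUP BY #{@group}" if @group
    parts.join(" ")
  end
end

a = ToyQuery.new("posts").select("posts.id").joins("comments").where("x = 1").group("posts.id").to_sql
b = ToyQuery.new("posts").group("posts.id").where("x = 1").joins("comments").select("posts.id").to_sql
```

Both chains produce the same SQL string, so under this model the call order is a readability choice rather than a performance one.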

SQL problems when migrating from MySQL to PostgreSQL

I have a Ruby on Rails 2.3.x application that I'm trying to migrate from my own VPS to Heroku, including porting from SQLite (development) and MySQL (production) to Postgres.
This is a typical Rails call I'm using:
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => :thing_id, :order => order, :conditions => conditions, :page => page, :per_page => per_page)
Question 1: I get a lot of errors like PG::Error: ERROR: column "spots.id" must appear in the GROUP BY clause or be used in an aggregate function. SQLite/MySQL was evidently more forgiving here. Of course I can easily fix these by adding the specified fields to my :group parameter, but I feel I'm messing up my code. Is there a better way?
Question 2: If I throw in all the GROUP BY columns that Postgres is missing I end up with the following statement (only :group has changed):
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)
This in turn produces the following SQL code:
SELECT * FROM (SELECT DISTINCT ON ("spots".id) "spots".id, spots.created_at AS alias_0 FROM "spots"
LEFT OUTER JOIN "things" ON "things".id = "spots".thing_id
WHERE (spots.recommended_to_user_id = 1 OR spots.user_id IN (1) OR things.is_featured = 't')
GROUP BY thing_id,things.id,users.id,spots.id) AS id_list
ORDER BY id_list.alias_0 DESC LIMIT 16 OFFSET 0;
...which produces the error PG::Error: ERROR: missing FROM-clause entry for table "users". How can I solve this?
Question 1:
...Is there a better way?
Yes. Since PostgreSQL 9.1 the primary key of a table logically covers all columns of a table in the GROUP BY clause. I quote the release notes for version 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Question 2:
The following statement ... produces the error
PG::Error: ERROR: missing FROM-clause entry for table "users"
How can I solve this?
First (as always!), I formatted your query to make it easier to understand. The culprit is users.id in the GROUP BY clause: the users table never appears in the FROM list of the subquery.
SELECT *
FROM (
SELECT DISTINCT ON (spots.id)
spots.id, spots.created_at AS alias_0
FROM spots
LEFT JOIN things ON things.id = spots.thing_id
WHERE (spots.recommended_to_user_id = 1 OR
spots.user_id IN (1) OR
things.is_featured = 't')
GROUP BY thing_id, things.id, users.id, spots.id
) id_list
ORDER BY id_list.alias_0 DESC
LIMIT 16
OFFSET 0;
It's all obvious now, right?
Well, not all of it. There is a lot more. DISTINCT ON and GROUP BY in the same query for one, which has its uses, but not here. Radically simplify to:
SELECT s.id, s.created_at AS alias_0
FROM spots s
WHERE s.recommended_to_user_id = 1 OR
s.user_id = 1 OR
EXISTS (
SELECT 1 FROM things t
WHERE t.id = s.thing_id
AND t.is_featured = 't')
ORDER BY s.created_at DESC
LIMIT 16;
The EXISTS semi-join avoids the later need to GROUP BY a priori. This should be much faster (besides being correct) - if my assumptions about the missing table definitions hold.
Going the "pure SQL" route opened up a can of worms for me, so I tried keeping the will_paginate gem and tweaking the Spot.paginate parameters instead. The :joins parameter turned out to be very helpful.
This is currently working for me:
spots = Spot.paginate(:all, :include => [:thing, {:thing => :tags}, {:thing => :brand}], :joins => [:user, :store, :thing], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)

How do I select a certain field from an array of objects of a certain class in Ruby on Rails 2.3.14

If I have an array of 100 objects of a class called Stream, like this:
User.find(1001).streams.find( :all, :order => "id", :limit => 100 )
which is basically a data table, how would I select a certain field, such as rating, from it? In SQL, all I need to do is, SELECT rating FROM streams WHERE user_id=1001 ORDER BY id LIMIT 100, but I don't know how to do that in Ruby on Rails. Using the command above returns all the fields, which I can't use.
The code below returns an array of ratings:
User.find(1001).streams.all(:order => "id", :limit => 100).map(&:rating)
If you want to avoid the cost of retrieving all attributes of each Stream object:
User.find(1001).streams.all(:select => "rating",
:order => "id", :limit => 100).map(&:rating)
You can use the standard SELECT clause syntax in the value for the key :select.
If you want to avoid constructing Stream objects altogether:
User.connection.select_values("select rating from streams where user_id = 1001 order by id limit 100")

Advanced count and join in Rails

I am trying to find the top n categories by number of related articles; there is a habtm relationship set up between the two. This is the SQL I want to execute, but I am unsure how to do this with ActiveRecord, aside from using the find_by_sql method. Is there any way of doing this with ActiveRecord methods?
SELECT
"categories".id,
"categories".name,
count("articles".id) as counter
FROM "categories"
JOIN "articles_categories"
ON "articles_categories".category_id = "categories".id
JOIN "articles"
ON "articles".id = "articles_categories".article_id
GROUP BY "categories".id
ORDER BY counter DESC
LIMIT 5;
You can use find with options to achieve the same query:
Category.find(:all,
:select => '"categories".id, "categories".name, count("articles".id) as counter',
:joins => :articles,
:group => '"categories".id',
:order => 'counter DESC',
:limit => 5
)

Ruby on Rails Query Help

I currently have the following query
User.sum(:experience, :group => "clan", :conditions => ["created_at >= ? and created_at <= ?", "2010-02-15", "2010-02-16"])
I want to return the top 50 clans by total experience, listed from most experience to least. How would I modify the query to achieve that result? I know I'll need :limit => 50 to limit the query, but if I add :order => "clan DESC" I get the error: column "users.experience" must appear in the GROUP BY clause or be used in an aggregate function
You need to repeat the calculation in the order clause:
User.sum(:experience, :group => "clan", :order=> "sum(experience) DESC", :limit => 50, :conditions => ["created_at >= ? and created_at <= ?", "2010-02-15", "2010-02-16"])
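As a rough illustration of why the ordering has to use the aggregate, here is a plain-Ruby sketch of the same group, sum, order and limit steps on in-memory data. The `UserRow` struct and the sample values are hypothetical; no database or ActiveRecord is involved.

```ruby
# Hypothetical stand-in for rows of the users table.
UserRow = Struct.new(:clan, :experience)

users = [
  UserRow.new("red", 10), UserRow.new("red", 20),
  UserRow.new("blue", 100),
  UserRow.new("green", 5), UserRow.new("green", 5),
  UserRow.new("zeta", 1)
]

limit = 2

# GROUP BY clan with SUM(experience)
totals = Hash.new(0)
users.each { |u| totals[u.clan] += u.experience }

# Ordering by the clan name says nothing about experience.
by_clan_name = totals.sort_by { |clan, _sum| clan }.reverse.first(limit)

# Ordering by the summed value gives the actual top clans.
by_total = totals.sort_by { |_clan, sum| -sum }.first(limit)
```

Here `by_clan_name` starts with "zeta", the clan with the least experience, while `by_total` starts with "blue", which is why the order clause needs to refer to the sum.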