I have a simple relationship
Questionnaire has_many Answers
I have multiple questionnaires. I want to get the latest 5 answers across distinct questionnaires.
If I do:
Answer.find(:all, :order => "id desc" , :limit => 5)
I get the last 5 answers, but most of the time all 5 belong to the same questionnaire. How can I query the latest answers from distinct questionnaires? Something like:
Answer.find(:all, :order => "id desc" , :limit => 5, :conditions => "DISTINCT questionnaire.id") ??
(the idea is to show an activity feed to an administrator e.g. user A replied to answers in questionnaire X on 11-11-2012, user B replied to answers in questionnaire Y on 01-11-2012 etc.)
If you're using Postgres I would do
Answer.find(:all,
:order => "answers.attr1 asc, answers.attr2 asc, answers.attr3 asc, ..." , :limit => 5,
:select => "DISTINCT ON (answers.questionnaire_id) answers.*")
Note that Postgres requires the DISTINCT ON expressions to match the leftmost ORDER BY expressions, so answers.questionnaire_id has to be the first column in the order clause; otherwise you get a SQL error. The columns listed after it decide which answer is kept for each questionnaire, so without them the choice is ambiguous.
And for god's sake upgrade to Rails 3!
Rails 3:
Answer.order("answers.attr1 asc, answers.attr2 asc, answers.attr3 asc, ...").
limit(5).select("DISTINCT ON (answers.questionnaire_id) answers.*")
Use GROUP BY to get the latest answer for each questionnaire, and limit to the latest 5:
latest_answer_ids = Answer.group(:questionnaire_id).
  order('MAX(answers.id) DESC').
  limit(5).maximum(:id).values
latest_answers = Answer.where(:id => latest_answer_ids)
If you need to do this as one query, you can use Arel to build a subquery like this:
answers = Answer.arel_table
latest_answers = Answer.where(:id =>
answers.
group(:questionnaire_id).
project('max(id)')
).order('id desc').limit(5)
The benefit here is that the subquery will work in any DBMS.
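To make the intent of these queries concrete, here is a minimal in-memory sketch in plain Ruby, with no database or ActiveRecord involved. AnswerRow, latest_per_questionnaire and the sample rows are made up for illustration; the logic mirrors taking MAX(id) per questionnaire_id and then keeping the five highest ids.

```ruby
# Hypothetical stand-in for rows of the answers table (not an ActiveRecord model).
AnswerRow = Struct.new(:id, :questionnaire_id)

# Returns the newest answer (highest id) of each questionnaire,
# newest first, capped at `limit` rows.
def latest_per_questionnaire(rows, limit = 5)
  rows.group_by(&:questionnaire_id)          # like GROUP BY questionnaire_id
      .map { |_, group| group.max_by(&:id) } # like MAX(id) per group
      .sort_by { |row| -row.id }             # like ORDER BY id DESC
      .first(limit)                          # like LIMIT 5
end

# Made-up sample data: three questionnaires, six answers.
sample_rows = [
  AnswerRow.new(1, 10), AnswerRow.new(2, 10), AnswerRow.new(3, 20),
  AnswerRow.new(4, 10), AnswerRow.new(5, 30), AnswerRow.new(6, 20)
]

latest_per_questionnaire(sample_rows).each do |row|
  puts "questionnaire #{row.questionnaire_id}: answer #{row.id}"
end
```

Each questionnaire shows up at most once in the result, which is the property the activity feed needs.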
Related
I have a Ruby on Rails 2.3.x application that I'm trying to migrate from my own VPS to Heroku, including porting from SQLite (development) and MySQL (production) to Postgres.
This is a typical Rails call I'm using:
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => :thing_id, :order => order, :conditions => conditions, :page => page, :per_page => per_page)
Question 1: I get a lot of errors like PG::Error: ERROR: column "spots.id" must appear in the GROUP BY clause or be used in an aggregate function. SQLite/MySQL was evidently more forgiving here. Of course I can easily fix these by adding the specified fields to my :group parameter, but I feel I'm messing up my code. Is there a better way?
Question 2: If I throw in all the GROUP BY columns that Postgres is missing I end up with the following statement (only :group has changed):
spots = Spot.paginate(:all, :include => [:thing, :user, :store, {:thing => :tags}, {:thing => :brand}], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)
This in turn produces the following SQL code:
SELECT * FROM (SELECT DISTINCT ON ("spots".id) "spots".id, spots.created_at AS alias_0 FROM "spots"
LEFT OUTER JOIN "things" ON "things".id = "spots".thing_id
WHERE (spots.recommended_to_user_id = 1 OR spots.user_id IN (1) OR things.is_featured = 't')
GROUP BY thing_id,things.id,users.id,spots.id) AS id_list
ORDER BY id_list.alias_0 DESC LIMIT 16 OFFSET 0;
...which produces the error PG::Error: ERROR: missing FROM-clause entry for table "users". How can I solve this?
Question 1:
...Is there a better way?
Yes. Since PostgreSQL 9.1 the primary key of a table logically covers all columns of a table in the GROUP BY clause. I quote the release notes for version 9.1:
Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)
Question 2:
The following statement ... produces the error
PG::Error: ERROR: missing FROM-clause entry for table "users"
How can I solve this?
First (as always!), I formatted your query to make it easier to understand. The culprit is users.id in the GROUP BY clause, because no users table appears in the FROM clause:
SELECT *
FROM  (
   SELECT DISTINCT ON (spots.id)
          spots.id, spots.created_at AS alias_0
   FROM   spots
   LEFT   JOIN things ON things.id = spots.thing_id
   WHERE (spots.recommended_to_user_id = 1 OR
          spots.user_id IN (1) OR
          things.is_featured = 't')
   GROUP  BY thing_id, things.id, users.id, spots.id
   ) id_list
ORDER  BY id_list.alias_0 DESC
LIMIT  16
OFFSET 0;
It's all obvious now, right?
Well, not all of it. There is a lot more. DISTINCT ON and GROUP BY in the same query for one, which has its uses, but not here. Radically simplify to:
SELECT s.id, s.created_at AS alias_0
FROM   spots s
WHERE  s.recommended_to_user_id = 1 OR
       s.user_id = 1 OR
       EXISTS (
          SELECT 1 FROM things t
          WHERE  t.id = s.thing_id
          AND    t.is_featured = 't')
ORDER  BY s.created_at DESC
LIMIT  16;
The EXISTS semi-join avoids the later need to GROUP BY a priori. This should be much faster (besides being correct) - if my assumptions about the missing table definitions hold.
Going the "pure SQL" route opened up a can of worms for me, so I tried keeping the will_paginate gem and tweaking the Spot.paginate parameters instead. The :joins parameter turned out to be very helpful.
This is currently working for me:
spots = Spot.paginate(:all, :include => [:thing, {:thing => :tags}, {:thing => :brand}], :joins => [:user, :store, :thing], :group => 'thing_id,things.id,users.id,spots.id', :order => order, :conditions => conditions, :page => page, :per_page => per_page)
I have a voting system with two models: Item(id, name) and Vote(id, item_id, user_id).
Here's the code I have so far:
class Item < ActiveRecord::Base
  has_many :votes
  def self.most_popular
    items = Item.all #where can I optimize here?
    items.sort {|x,y| x.votes.length <=> y.votes.length}.first #so I don't need to do anything here?
  end
end
There's a few things wrong with this, mainly that I retrieve all the Item records, THEN use Ruby to compute popularity. I am almost certain there is a simple solution to this, but I can't quite put my finger on it.
I'd much rather gather records and run the calculations in the initial query. This way, I can add a simple :limit => 1 (or LIMIT 1) to the query.
Any help would be great, either a rewrite in all ActiveRecord or even in raw SQL. The latter would actually give me a much clearer picture of the nature of the query I want to execute.
Group the votes by item id, order them by count and then take the item of the first one. In Rails 3 the code for this is:
Vote.group(:item_id).order("count(*) DESC").first.item
In Rails 2, this should work:
Vote.all(:order => "count(*) DESC", :group => :item_id).first.item
sepp2k has the right idea. In case you're not using Rails 3, the equivalent is:
Vote.first(:group => :item_id, :order => "count(*) DESC", :include => :item).item
Probably there's a better way to do this in ruby, but in SQL (mysql at least) you could try something like this to get a top 10 ranking:
SELECT i.id, i.name, COUNT( v.id ) AS total_votes
FROM Item i
LEFT JOIN Vote v ON ( i.id = v.item_id )
GROUP BY i.id
ORDER BY total_votes DESC
LIMIT 10
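For readers who want to see what this ranking computes without a database, here is a minimal plain-Ruby sketch over in-memory rows. ItemRow, VoteRow and ranking are hypothetical names for illustration, not part of the asker's models, and the tie-break on id is an added assumption to keep the order deterministic.

```ruby
# Hypothetical stand-ins for rows of the Item and Vote tables.
ItemRow = Struct.new(:id, :name)
VoteRow = Struct.new(:id, :item_id, :user_id)

# Ranks items by vote count, most voted first. Items without votes are kept
# with a count of 0, which is what the LEFT JOIN achieves in the SQL version.
# Ties are broken by item id (an assumption, the SQL above leaves them open).
def ranking(items, votes, limit = 10)
  counts = Hash.new(0)
  votes.each { |vote| counts[vote.item_id] += 1 }   # COUNT(v.id) per item
  items.map { |item| [item, counts[item.id]] }
       .sort_by { |item, total| [-total, item.id] } # ORDER BY total_votes DESC
       .first(limit)                                # LIMIT 10
end

# Made-up sample data.
sample_items = [ItemRow.new(1, "A"), ItemRow.new(2, "B"), ItemRow.new(3, "C")]
sample_votes = [VoteRow.new(1, 2, 7), VoteRow.new(2, 2, 8),
                VoteRow.new(3, 1, 7), VoteRow.new(4, 2, 9)]

ranking(sample_items, sample_votes).each do |item, total|
  puts "#{item.name}: #{total}"
end
```

Passing a limit of 1 gives the single most popular item, which is what the original question asks for.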
One easy way of handling this is to add a vote count field to the Item, and update that each time there is a vote. Rails used to do that automatically for you, but not sure if it's still the case in 2.x and 3.0. It's easy enough for you to do it in any case using an Observer pattern or else just by putting in a "after_save" in the Vote model.
Then your query is very easy, by simply adding a "VOTE_COUNT DESC" order to your query.
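The Rails feature alluded to here is most likely the :counter_cache option of belongs_to, which keeps a count column on the parent record up to date. The underlying idea can be sketched in plain Ruby without Rails; CountedItem, record_vote and most_popular are hypothetical names used only for this illustration.

```ruby
# Hypothetical in-memory item that carries its own vote counter,
# playing the role of a vote count column on the items table.
class CountedItem
  attr_reader :name, :votes_count

  def initialize(name)
    @name = name
    @votes_count = 0
  end

  # Called whenever a vote is saved (the job an after_save hook would do).
  def record_vote
    @votes_count += 1
  end
end

# With the counter stored on the item, finding the most popular one
# needs no grouping or counting over the votes.
def most_popular(items)
  items.max_by(&:votes_count)
end

# Made-up sample data.
apple = CountedItem.new("apple")
pear  = CountedItem.new("pear")
2.times { pear.record_vote }
apple.record_vote

puts most_popular([apple, pear]).name
```

The trade-off is the usual one for denormalized data: reads get cheaper, but every code path that creates or deletes a vote has to keep the counter in sync.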
I have a budgets table with the columns emptype_id, calendar_id, actual_head and estimated_head.
When I do Budgets.sum(:actual_head, :group => "emptype_id,calendar_id") the result is not grouped by both columns, only by emptype_id.
However, when I check the log, the SQL query is right:
SELECT sum(`budgets`.actual_head) AS sum_actual_head, emptype_id,calendar_id AS emptype_id_calendar_id FROM `budgets` GROUP BY emptype_id,calendar_id
It returns 103 rows.
I wanted to iterate through each emptype_id and calendar_id to get a sum of actual_head
and do some calculations on it.
Grouping by multiple columns is not supported by Rails here. You have to use a regular find all:
budgets = Budgets.find(:all,
:select => "emptype_id, calendar_id, sum(budgets.actual_head) AS sum_actual_head",
:group => "emptype_id, calendar_id")
budgets.each { |budget| puts budget.sum_actual_head }
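What the asker is after is one sum per (emptype_id, calendar_id) pair. As a minimal sketch of that aggregation in plain Ruby, independent of ActiveRecord, the following uses a two-element array as the hash key; BudgetRow and sum_actual_head are hypothetical names for illustration.

```ruby
# Hypothetical stand-in for rows of the budgets table.
BudgetRow = Struct.new(:emptype_id, :calendar_id, :actual_head)

# Sums actual_head for every distinct (emptype_id, calendar_id) pair.
# The two-element array key is what keeps both grouping columns apart.
def sum_actual_head(rows)
  totals = Hash.new(0)
  rows.each do |row|
    totals[[row.emptype_id, row.calendar_id]] += row.actual_head
  end
  totals
end

# Made-up sample data.
sample_budgets = [
  BudgetRow.new(1, 1, 10), BudgetRow.new(1, 1, 5),
  BudgetRow.new(1, 2, 7),  BudgetRow.new(2, 1, 3)
]

sum_actual_head(sample_budgets).each do |(emptype_id, calendar_id), total|
  puts "emptype #{emptype_id}, calendar #{calendar_id}: #{total}"
end
```

If the key were only emptype_id, the rows for calendars 1 and 2 would collapse into one total, which matches the symptom described in the question.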
I cheat. Do :group => ["emptype_id,calendar_id"].
Not what you or I want, but this works at least.
I'm not sure of this, but try :group => [:emptype_id, :calendar_id]
I want this SQL query to be written in rails controller using find:
select id,name from questions
where id not in (select question_id from levels_questions where level_id=15)
How will I do this? I am using Rails framework and MySQL.
Thanks in advance.
Question.find_by_sql('select id,name from questions where id not in (select question_id from levels_questions where level_id=15)')
This is admittedly non-ActiveRecord-ish, but I find that complicated queries such as this tend to be LESS clear/efficient when using the AR macros. If you already have the SQL constructed, you might as well use it.
Some suggestions: encapsulate this find call in a method INSIDE the Question class to hide the details from the controller/view, and consider other SQL constructions that may be more efficient (eg, an OUTER JOIN where levels_questions.question_id is null)
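Both the NOT IN subquery and the OUTER JOIN ... IS NULL variant express the same anti-join: keep the questions that have no link row for the given level. A minimal in-memory sketch in plain Ruby, with hypothetical names (QuestionRow, LevelQuestionRow, questions_not_in_level) and made-up data, could look like this:

```ruby
require 'set'

# Hypothetical stand-ins for rows of questions and levels_questions.
QuestionRow = Struct.new(:id, :name)
LevelQuestionRow = Struct.new(:level_id, :question_id)

# Returns the questions that are NOT linked to the given level,
# i.e. the rows a NOT IN subquery or a LEFT JOIN ... IS NULL would keep.
def questions_not_in_level(questions, links, level_id)
  taken = links.select { |link| link.level_id == level_id }
               .map(&:question_id)
               .to_set
  questions.reject { |question| taken.include?(question.id) }
end

# Made-up sample data.
sample_questions = [QuestionRow.new(1, "q1"), QuestionRow.new(2, "q2"),
                    QuestionRow.new(3, "q3")]
sample_links = [LevelQuestionRow.new(15, 2), LevelQuestionRow.new(16, 3)]

questions_not_in_level(sample_questions, sample_links, 15).each do |question|
  puts "#{question.id} #{question.name}"
end
```

Wrapping the lookup in one method like this is also the shape the encapsulation advice above points to: the controller only calls the method and never sees the query details.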
Simple way:
ids = LevelsQuestion.all(:select => "question_id",
:conditions => "level_id = 15").collect(&:question_id)
Question.all(:select => "id, name", :conditions => ["id not in (?)", ids])
One shot:
Question.all(:select => "id, name",
:conditions => ["id not in (select question_id from levels_questions where level_id=15)"])
And the rails 3 way:
ids = LevelsQuestion.select(:question_id).where(:level_id => 15).collect(&:question_id)
Question.select("id, name").where("id not in (?)", ids)
I can't get Rails to return combined ('AND') searches on the associated join table of an object.
E.g. I have Books that are in Categories. Let's say Book 1 is in categories 5 and 8.
But I can't get 'AND' to filter results using the join table. E.g.:
class Book < ActiveRecord::Base
  has_and_belongs_to_many :categories, :join_table => "book_categories"
end
Book.find :all, :conditions => "book_categories.category_id = 5 AND book_categories.category_id = 8", :include => "categories"
... returns nil
(why does it not return all books that are in both 5 & 8 ??)
However: 'OR' does work:
Book.find :all, :conditions => "book_categories.category_id = 5 OR book_categories.category_id = 8"
... returns all books in category 5 and 8
I must be missing something?
The problem is at the SQL level. That condition runs on a link table row, and any individual link table row can never have a category_id of both 5 and 8. You really want separate link table rows to have these IDs.
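One way to act on this is to look at all link rows of a book together instead of one row at a time. In SQL that usually becomes WHERE category_id IN (5, 8) combined with GROUP BY book_id HAVING COUNT(DISTINCT category_id) = 2. The plain-Ruby sketch below shows the same logic on in-memory rows; BookCategoryRow and book_ids_in_all are hypothetical names for illustration.

```ruby
require 'set'

# Hypothetical stand-in for rows of the book_categories link table.
BookCategoryRow = Struct.new(:book_id, :category_id)

# Returns the ids of books linked to ALL of the wanted categories.
# Each link row holds a single category_id, so the rows are first grouped
# per book and the check is made on the whole group.
def book_ids_in_all(links, wanted)
  wanted = wanted.to_set
  links.group_by(&:book_id)
       .select { |_, rows| wanted.subset?(rows.map(&:category_id).to_set) }
       .keys
end

# Made-up sample data: only book 1 is in both categories 5 and 8.
sample_links = [
  BookCategoryRow.new(1, 5), BookCategoryRow.new(1, 8),
  BookCategoryRow.new(2, 5),
  BookCategoryRow.new(3, 8), BookCategoryRow.new(3, 9)
]

puts book_ids_in_all(sample_links, [5, 8]).inspect
```

The 'OR' query from the question corresponds to checking for any overlap with the wanted categories rather than a full subset.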
Try looking into Rails' named_scope, specifically the part that allows filtering with a lambda (so you can take an argument). I've never tried it out myself, but if I had to implement what you're looking for, that's what I'd look into.