Rails query - latest Post for each day - sql

A User has_many Posts. I want to retrieve the latest Post for each day (using created_at), ignoring other posts that were written earlier that day. Another way to pose this question might be to ask for the top salary-earning employee in each department; I think it is the same problem.
How do I write this query in Rails (4.0 preferably)? I think it has something to do with group and maximum but I can't seem to get it. Is there a way to do it without resorting to SQL?
To clarify, what I'd like returned is an array of post objects that are the last ones written on their respective date.
Thanks!

Something like this. You can convert this to AREL syntax as needed:
SELECT posts.created_at, *
FROM posts
INNER JOIN (
SELECT MAX(created_at) AS max_order_date FROM posts
GROUP BY DATE(posts.created_at)
) AS last_postings ON last_postings.max_order_date = posts.created_at
ORDER BY DATE(created_at) DESC
LIMIT 10
AREL syntax might be:
join_sql = <<-SQL
INNER JOIN (
SELECT MAX(created_at) AS max_order_date FROM posts
GROUP BY DATE(posts.created_at)
) AS last_postings ON last_postings.max_order_date = posts.created_at
SQL
Post.joins(join_sql).order('DATE(created_at) DESC')
Remove the LIMIT as it suits you.
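If the posts are already loaded in memory, the same "newest row per calendar day" selection can be sketched in plain Ruby. This is only an illustration of the logic behind the join: the Post struct and the sample data are hypothetical stand-ins for the real model, and for large tables the SQL version above is the better fit.

```ruby
require 'date'

# Hypothetical stand-in for the Post model; only the fields used here.
Post = Struct.new(:id, :created_at)

# Returns the newest post for each calendar day, most recent day first.
def latest_post_per_day(posts)
  posts
    .group_by { |post| post.created_at.to_date }               # bucket posts by day
    .map { |_day, day_posts| day_posts.max_by(&:created_at) }  # newest post in each bucket
    .sort_by(&:created_at)
    .reverse
end

# Made-up sample data: two posts on Jan 1, one on Jan 2, two on Jan 3.
posts = [
  Post.new(1, Time.utc(2014, 1, 1, 9, 0)),
  Post.new(2, Time.utc(2014, 1, 1, 18, 30)),
  Post.new(3, Time.utc(2014, 1, 2, 8, 15)),
  Post.new(4, Time.utc(2014, 1, 3, 7, 0)),
  Post.new(5, Time.utc(2014, 1, 3, 23, 59))
]

puts latest_post_per_day(posts).map(&:id).inspect
```

The group_by step plays the role of GROUP BY DATE(created_at), and max_by plays the role of MAX(created_at) joined back to the full row.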

It's not very clean, but this works in Rails 3 (taken from a Book model in my case) using PostgreSQL syntax for truncating the created_at to the date:
max_created_at_list = Book.select("max(created_at) as created_at").group("date_trunc('day',created_at)")
last_books = Book.where(:created_at => max_created_at_list)
... or just:
Book.where(:created_at =>Book.select("max(created_at) as created_at").group("date_trunc('day',created_at)"))
You'd want an index on created_at for large data sets, and either a NOT NULL constraint on created_at at the database level or an "is not null" predicate if the RDBMS you use does not index nulls (e.g. Oracle).

Try this
Post.select("user_id, max(created_at) as created_at").group(:user_id)

Related

PGError: ERROR: column “recipes.id” must appear in the GROUP BY clause or be used in an aggregate function

I have this query in my app which is causing a problem with PostgreSQL on Heroku: the page does not load, and the above error appears in the Heroku logs. I am using PostgreSQL 9.1.6, so previous bugs have apparently been fixed.
def self.top_countries
joins(:recipes).
select('countries.*, count(*) AS recipes_count').
group('countries.id').
order('recipes_count DESC')
end
I am unsure how to refactor this so that it will work. Could anyone advise please?
Thank You
def self.top_countries
  joins(:recipes).
    select('countries.id, count(*) AS recipes_count').
    group('countries.id').
    order('recipes_count DESC')
end
This generates the SQL
select countries.id, count(*) AS recipes_count
from countries
join recipes on countries.id = recipes.country_id
group by countries.id
order by recipes_count DESC
You'll notice that you only have 2 columns in the SELECT.
I'm not a Heroku expert, but I suspect you can get it to work by explicitly listing all the columns that you need from countries and grouping by the full column list, i.e.
def self.top_countries
  joins(:recipes).
    select('countries.id, countries.name, countries.other, count(*) AS recipes_count').
    group('countries.id, countries.name, countries.other').
    order('recipes_count DESC')
end
There might be a more concise way to join the original answer (top part) with another join to top_countries on countries.id to get the rest of the columns after the group by.
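The "aggregate on the key first, then join back for the remaining columns" idea can be sketched in plain Ruby. The Country and Recipe structs and the sample rows are hypothetical; the point is only to show the two steps, not how ActiveRecord builds the query.

```ruby
# Hypothetical stand-ins for the countries and recipes tables.
Country = Struct.new(:id, :name)
Recipe  = Struct.new(:id, :country_id)

# Step 1: aggregate on the key only (like GROUP BY countries.id).
# Step 2: join the counts back to the full country rows.
# Returns [country, recipes_count] pairs, highest count first.
def top_countries(countries, recipes)
  counts = recipes.group_by(&:country_id).transform_values(&:size)
  countries
    .select { |country| counts.key?(country.id) }      # inner join: skip countries without recipes
    .map { |country| [country, counts[country.id]] }
    .sort_by { |_country, count| -count }
end

# Made-up sample data.
countries = [Country.new(1, 'Italy'), Country.new(2, 'Japan'), Country.new(3, 'Peru')]
recipes = [
  Recipe.new(10, 2), Recipe.new(11, 1), Recipe.new(12, 2), Recipe.new(13, 2)
]

top_countries(countries, recipes).each do |country, count|
  puts "#{country.name}: #{count}"
end
```

Because the grouping only ever sees country_id, every other country column is picked up in the second step, which is what the extra join in SQL would do.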

Group by query in Rails 3

I have the (working) code
counts = Registration.select('regulator_id').group('regulator_id').count
@regulators.each { |r| r.registration_count = counts[r.id] }
which allows me to show how many Registrations there are per Regulator. The query it generates is:
SELECT COUNT("registrations"."regulator_id") AS count_regulator_id, regulator_id AS regulator_id FROM "registrations" GROUP BY regulator_id
I would like to restrict my count to those registrations from the last scrape only, with a query like:
select
regulator_id, count(*)
from
registrations inner join
regulators on regulators.id = registrations.regulator_id
where
registrations.updated_at > regulators.last_scrape_start
group by
regulator_id
but I cannot get the syntax to work either using arel or find_by_sql. I am sure this is simple when you know the answer but it has cost me ages so far.
Thanks in advance.
Just add 'joins' and 'where'
Registration.joins(:regulator).where('registrations.updated_at > regulators.last_scrape_start').select('regulators.id').group('regulators.id').count
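To make the shape of the result concrete, here is a plain-Ruby sketch of what the filtered, grouped count computes. The structs and sample rows are hypothetical; a grouped count in ActiveRecord returns a hash keyed by the grouped column, which is what counts[r.id] in the question relies on.

```ruby
# Hypothetical stand-ins for the regulators and registrations tables.
Regulator    = Struct.new(:id, :last_scrape_start)
Registration = Struct.new(:id, :regulator_id, :updated_at)

# Counts registrations updated after their regulator's last scrape started.
# Returns a hash of regulator_id => count.
def counts_since_last_scrape(regulators, registrations)
  scrape_start = regulators.to_h { |r| [r.id, r.last_scrape_start] }
  registrations
    .select do |reg|
      # inner join on regulator_id plus the WHERE condition
      scrape_start.key?(reg.regulator_id) && reg.updated_at > scrape_start[reg.regulator_id]
    end
    .group_by(&:regulator_id)
    .transform_values(&:size)
end

# Made-up sample data.
regulators = [
  Regulator.new(1, Time.utc(2012, 5, 1)),
  Regulator.new(2, Time.utc(2012, 5, 10))
]
registrations = [
  Registration.new(1, 1, Time.utc(2012, 5, 2)),
  Registration.new(2, 1, Time.utc(2012, 4, 30)),
  Registration.new(3, 1, Time.utc(2012, 5, 3)),
  Registration.new(4, 2, Time.utc(2012, 5, 9)),
  Registration.new(5, 2, Time.utc(2012, 5, 11))
]

counts = counts_since_last_scrape(regulators, registrations)
puts counts.inspect
```

A regulator with no registrations since its last scrape is simply absent from the hash, so counts[r.id] would be nil for it; counts.fetch(r.id, 0) is one way to handle that.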

Joining tables, counting, and group to return a Model

So I've got a SQL query I'd like to duplicate in rails:
select g.*
from gamebox_favorites f
inner join gameboxes g on f.gamebox_id = g.id
group by f.gamebox_id
order by count(f.gamebox_id) desc;
I've been reading over the rails Active Record Query Interface site, but can't quite seem to put this together. I'd like the query to return a collection of Gamebox records, sorted by the number of 'favorites' a gamebox has. What is the cleanest way to do this in rails?
I believe this will work (it works on a similarly structured database locally), though I'm not sure I have the proper models in the proper spots for what you're trying to do, so you might need to move a couple of things around:
Gamebox.joins(:gamebox_favorites).
  group('"gamebox_favorites"."gamebox_id"').
  order('count("gamebox_favorites"."gamebox_id") DESC')
On the console, this should compile to (in the case of PostgreSQL on the back end):
SELECT "gameboxes".* FROM "gameboxes"
INNER JOIN "gamebox_favorites"
ON "gamebox_favorites"."gamebox_id" = "gameboxes"."id"
GROUP BY "gamebox_favorites"."gamebox_id"
ORDER BY count("gamebox_favorites"."gamebox_id") DESC
...and I'm guessing that you don't want to just wrap it in a find_by_sql call, such as:
Gamebox.find_by_sql("select g.* from gamebox_favorites f
inner join gameboxes g
on f.gamebox_id = g.id
group by f.gamebox_id
order by count(f.gamebox_id) desc")

ORDER BY issue in Postgres (Heroku)

The following code works for Postgres (Heroku):
@messages = Message.select("DISTINCT ON (messages.conversation_id) *").
  where("messages.sender_id = (?) OR messages.recipient_id = (?)",
        current_user.id, current_user.id)
However, when attempting to order the results by appending .order("messages.read_at DESC") I receive the following error:
ActionView::Template::Error (PGError: ERROR: column id_list.alias_0 does not exist)
In looking at the generated SQL, I see that an alias is being created around the ORDER BY statement when not asked for:
messages.recipient_id = (32))) AS id_list ORDER BY id_list.alias_0 DESC)
I've not been able to figure out a workaround short of using "find_by_sql" for the entire statement - which takes a heavy toll on the app.
Don't vote this, I only post because posting many lines in comments does not show very well.
I would write a "query that returns messages grouped by their conversation_id, so that the last message in each conversation is shown" like this:
SELECT m.*
FROM messages m
JOIN
( SELECT conversation_id
, MAX(created_date) AS maxdate
FROM messages
WHERE ...
GROUP BY conversation_id
) AS grp
ON grp.conversation_id = m.conversation_id
AND grp.maxdate = m.created_date
ORDER BY m.read_at DESC
No idea how this can be done on Heroku or if it is even possible, but it avoids the DISTINCT ON. If that's causing the error, it may be of help.

Sql query has unwanted results - Where is the problem?

I have a table which records users' scores at a game (a user may submit 5, 10, 20, or as many scores as he wants).
I need to show the 20 top scores of a game, but only one per user (a single user may have submitted, e.g., 4 scores that all rank above other users' scores).
The query I have written is:
SELECT DISTINCT
`table_highscores`.`userkey`,
max(`table_highscores`.`score`),
`table_users`.`username`,
`table_highscores`.`dateachieved`
FROM
`table_highscores`, `table_users`
WHERE
`table_highscores`.`userkey` = `table_users`.`userkey`
AND
`table_highscores`.`gamekey` = $gamekey
GROUP BY
`userkey`
ORDER BY
max(`table_highscores`.`score`) DESC
LIMIT 0, 20;
The output is OK, but there is a problem. When I calculate the difference in days (today minus dateachieved), the result is wrong (e.g. instead of saying "the score was submitted 22 days ago", it says 43 days ago). So I have to do a second query for each score to find the true date (meaning 20 more queries). Is there any shorter way to find the correct date?
Thanks.
there is a problem. When i calculate the difference of days (today-this of dateachieved) the result is wrong.
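There are two issues:
the dateachieved isn't likely to be the value associated with the high score: it is neither aggregated nor listed in the GROUP BY, so MySQL returns it from an arbitrary row in each group
you can use MySQL's DATEDIFF to return the number of days between the current date and the dateachieved value.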
Use:
SELECT u.username,
       hs.userkey,
       hs.score,
       DATEDIFF(NOW(), hs.dateachieved)
FROM table_highscores hs
JOIN table_users u ON u.userkey = hs.userkey
JOIN (SELECT ths.userkey,
             ths.gamekey,
             ms.max_score,
             MAX(ths.dateachieved) AS max_date
      FROM table_highscores ths
      JOIN (SELECT t.userkey,
                   t.gamekey,
                   MAX(t.score) AS max_score
            FROM table_highscores t
            GROUP BY t.userkey, t.gamekey) ms ON ms.userkey = ths.userkey
                                             AND ms.gamekey = ths.gamekey
                                             AND ms.max_score = ths.score
      GROUP BY ths.userkey, ths.gamekey, ms.max_score
     ) x ON x.userkey = hs.userkey
        AND x.gamekey = hs.gamekey
        AND x.max_score = hs.score
        AND x.max_date = hs.dateachieved
WHERE hs.gamekey = $gamekey
ORDER BY hs.score DESC
LIMIT 20
I also changed your query to use ANSI-92 JOIN syntax, from ANSI-89 syntax. It's equivalent performance, but it's easier to read, syntax is supported on Oracle/SQL Server/Postgres/etc, and provides consistent LEFT JOIN support.
Another thing - you only need to use backticks when tables and/or column names are MySQL keywords.
In your query you should use an explicit JOIN and you don't need the DISTINCT keyword.
This query should solve your problem. I am assuming here that it is possible for a user to submit the same highscore more than once on different dates, and if that happens then you want the oldest date:
SELECT T1.userkey, T1.score, username, dateachieved FROM (
(SELECT userkey, max(score) AS score
FROM table_highscores
WHERE gamekey = $gamekey
GROUP BY userkey) AS T1
JOIN
(SELECT userkey, score, min(dateachieved) as dateachieved
FROM table_highscores
WHERE gamekey = $gamekey
GROUP BY userkey, score) AS T2
ON T1.userkey = T2.userkey AND T1.score = T2.score
) JOIN table_users ON T1.userkey = table_users.userkey
ORDER BY T1.score DESC
LIMIT 20
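The selection rule used here (best score per user, earliest date on which that score was reached, then the top N overall) can be checked against a small in-memory sketch. It is written in Ruby purely as an illustration; the Score struct, the sample rows, and the fixed "today" are all made up.

```ruby
require 'date'

# Hypothetical stand-in for one row of table_highscores (single game).
Score = Struct.new(:userkey, :score, :dateachieved)

# For each user, keeps the row with the best score; if that score was
# reached more than once, keeps the earliest date. Returns the top `limit`
# rows, highest score first.
def top_scores(scores, limit = 20)
  best_rows = scores.group_by(&:userkey).map do |_userkey, rows|
    top = rows.map(&:score).max
    rows.select { |row| row.score == top }.min_by(&:dateachieved)
  end
  best_rows.sort_by { |row| -row.score }.first(limit)
end

# Whole days between the date the score was achieved and `today`.
def days_ago(row, today)
  (today - row.dateachieved).to_i
end

# Made-up sample data.
scores = [
  Score.new('ann', 900, Date.new(2010, 3, 1)),
  Score.new('ann', 950, Date.new(2010, 3, 10)),
  Score.new('ann', 950, Date.new(2010, 3, 20)),
  Score.new('bob', 990, Date.new(2010, 2, 15)),
  Score.new('bob', 400, Date.new(2010, 3, 25))
]
today = Date.new(2010, 4, 1)

top_scores(scores).each do |row|
  puts "#{row.userkey}: #{row.score} (#{days_ago(row, today)} days ago)"
end
```

The important part is that the date comes from the same row as the best score, rather than from whichever row the database happens to pick for the group.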
You didn't say what language you are using to calculate the difference but I'm guessing it's PHP because of the $gamekey you used there (which should be escaped properly, btw).
If your dateachieved field is in the DATETIME format, you can calculate the difference like this:
$diff = round((time() - strtotime($row['dateachieved'])) / 86400);
I think you need to clarify your question a little better. Can you provide some data and expected outputs and then I should be able to help you further?