How to rewrite this SQL query in Rails 3? - sql

Suppose I have two models, where Submission has_many SubmissionStates,
and the table submissionstates has the following columns:
id | submission_id | state_id | created_at
The query is:
SELECT submission_id, state_id
FROM submissionstates ss
JOIN (
SELECT MAX(created_at) as created_at
FROM submissionstates ss
GROUP BY submission_id
) x
ON ss.created_at = x.created_at
WHERE state_id = 0

As in the link Saurabh gave, you end up putting your SQL fragments into the Rails query methods, something like this:
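# Alias the table as ss via from, so the ON clause of the hand-written join can reference it.
list = SubmissionState.select("ss.submission_id, ss.state_id")
.from("submissionstates ss")
.where("ss.state_id = ?", 0)
.joins("
JOIN (
SELECT MAX(created_at) AS created_at
FROM submissionstates
GROUP BY submission_id
) x
ON ss.created_at = x.created_at
")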
puts list.length
To be honest, at this point you might be better off just using find_by_sql:
sql = "
SELECT submission_id, state_id
FROM submissionstates ss
JOIN (
SELECT MAX(created_at) as created_at
FROM submissionstates ss
GROUP BY submission_id
) x
ON ss.created_at = x.created_at
WHERE state_id = ?
AND some_other_value = ?
"
list = SubmissionState.find_by_sql([sql, 0, 'something-else'])
puts list.length
NOTE: once you start using joins with a custom select, or find_by_sql, Rails still acts as if it gives you model objects back, but they only contain the attributes named in the SELECT clause. find_by_sql also returns all attributes as strings, which can be annoying.
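To see what the underlying SQL returns independently of Rails, here is a minimal sketch that runs a latest-state-per-submission query against an in-memory SQLite database using Python's sqlite3 module. The sample rows are made up, and the join on submission_id in addition to created_at is an assumption about the intent; the query in the question joins on created_at only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submissionstates (
    id INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL,
    state_id INTEGER NOT NULL,
    created_at TEXT NOT NULL
);
-- Hypothetical sample data
INSERT INTO submissionstates (submission_id, state_id, created_at) VALUES
    (1, 1, '2012-01-01 10:00:00'),
    (1, 0, '2012-01-02 10:00:00'),  -- latest state of submission 1 is 0
    (2, 0, '2012-01-01 11:00:00'),
    (2, 3, '2012-01-03 11:00:00');  -- latest state of submission 2 is 3
""")

sql = """
SELECT ss.submission_id, ss.state_id
FROM submissionstates ss
JOIN (
    SELECT submission_id, MAX(created_at) AS created_at
    FROM submissionstates
    GROUP BY submission_id
) x
  ON ss.submission_id = x.submission_id  -- assumption: not in the original query
 AND ss.created_at = x.created_at
WHERE ss.state_id = ?
"""

# Only submission 1 currently sits in state 0.
rows = conn.execute(sql, (0,)).fetchall()
print(rows)  # [(1, 0)]
```

Joining on created_at alone would also match a row whose timestamp happens to equal the latest timestamp of a different submission, which is why the sketch adds submission_id to the join condition.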

Related

Get result where exist on date X and not exist on day Y

I have a table where I store some info about events. The table has the columns id, created_at (a date field), key (varchar(255)), quantity (integer) and event (varchar(255)). I want a query that returns all keys which exist on date X (for example 2022-09-05) and do NOT exist on date Y (for example 2022-09-06). The table has no relation to other tables.
The query that I tried is:
SELECT s.key
FROM stats s
WHERE created_at = '2022-09-05'
AND NOT EXISTS(
SELECT *
FROM stats s
WHERE s.created_at = '2022-09-06'
)
GROUP BY s.key
;
The problem is that this returns 0 results, but I expect at least 1.
You have to check that the key from 2022-09-05 does not appear on 2022-09-06, which means the subquery must be correlated on the key. So the query changes to:
SELECT s.key
FROM stats s
WHERE s.created_at = '2022-09-05' AND NOT EXISTS
(SELECT 1 FROM stats st WHERE st.key = s.key AND st.created_at = '2022-09-06');
You can also try a LEFT JOIN that keeps only the keys with no match on the following day:
SELECT s.key
FROM stats s
LEFT JOIN (
SELECT st.key FROM stats st
WHERE st.created_at = '2022-09-06'
) dayAfter ON s.key = dayAfter.key
WHERE s.created_at = '2022-09-05'
AND dayAfter.key IS NULL
GROUP BY s.key
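The difference between the uncorrelated and the correlated NOT EXISTS can be checked with a small sketch. It uses Python's sqlite3 module with an in-memory table; the sample rows are made up, and the key column is quoted defensively because KEY is a keyword in some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stats (
    id INTEGER PRIMARY KEY,
    created_at TEXT NOT NULL,
    "key" TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    event TEXT NOT NULL
);
-- Hypothetical sample data: 'a' only on the 5th, 'b' on both days, 'c' only on the 6th
INSERT INTO stats (created_at, "key", quantity, event) VALUES
    ('2022-09-05', 'a', 1, 'view'),
    ('2022-09-05', 'b', 2, 'view'),
    ('2022-09-06', 'b', 3, 'view'),
    ('2022-09-06', 'c', 4, 'view');
""")

# Uncorrelated: the subquery finds *any* row on the 6th, so NOT EXISTS is
# false for every outer row and nothing comes back.
uncorrelated = conn.execute("""
    SELECT s."key"
    FROM stats s
    WHERE s.created_at = '2022-09-05'
      AND NOT EXISTS (SELECT 1 FROM stats st
                      WHERE st.created_at = '2022-09-06')
""").fetchall()

# Correlated: the subquery only looks for the *same key* on the 6th.
correlated = conn.execute("""
    SELECT s."key"
    FROM stats s
    WHERE s.created_at = '2022-09-05'
      AND NOT EXISTS (SELECT 1 FROM stats st
                      WHERE st."key" = s."key"
                        AND st.created_at = '2022-09-06')
""").fetchall()

print(uncorrelated)  # []
print(correlated)    # [('a',)]
```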

Where to insert a GROUP BY in a complex PostgreSQL query which aggregate two tables

In the PostgreSQL query below I need to add a GROUP BY member.id, but I can't work out where to put it: wherever I try, I get a syntax error on the GROUP BY.
The query works on two tables and updates message, excluding rows where the member has the USER type. The script complains about the missing GROUP BY member.id, but I have no idea at the moment where to put it and need ideas.
update
message m
set
status = 'DEFAULT'
where
status = 'PENDING'
and conversation_id = '1'
and not exists (
select
1
from
"member"
having
id = m.member_id
and origin_type = 'USER')
and m.updated_at < (
select
max(m.updated_at)
from
message
having
conversation_id = m.conversation_id
and origin_type = 'USER')
The tables involved, shown as screenshots: Message and Member.
I think there are some mistakes in the query:
having => where
max(m.updated_at) => max(updated_at)
Is and origin_type = 'USER' really needed?
The following query will not result in an error.
However, I think it is not the query you actually want.
update
message m
set
status = 'DEFAULT'
where
status = 'PENDING'
and conversation_id = '1'
and not exists (
select
1
from
"member"
--having
where
id = m.member_id
and origin_type = 'USER')
and m.updated_at < (
select
--max(m.updated_at)
max(updated_at)
from
message
--having
where
conversation_id = m.conversation_id
--and origin_type = 'USER')
)
Usually the GROUP BY clause is placed between the WHERE and HAVING clauses.
In your case, however, I believe you just need to replace your HAVING clauses with WHERE.
The WHERE clause filters the rows of the original dataset before aggregation.
The HAVING clause filters the groups produced by the aggregation and is usually dedicated to conditions on aggregated results.
In your case, since you just want to filter the rows of the original dataset before aggregation, WHERE should be used:
update
message m
set
status = 'DEFAULT'
where
status = 'PENDING'
and conversation_id = '1'
and not exists (
select
1
from
"member"
WHERE
id = m.member_id
and origin_type = 'USER')
and m.updated_at < (
select
max(updated_at)
from
message
WHERE
conversation_id = m.conversation_id
and origin_type = 'USER')
According to the SQL syntax, the GROUP BY should come before the HAVING. But are you sure that the problem is a missing GROUP BY? Could it instead be that you should use WHERE instead of HAVING? To me this seems to make much more sense.
update
message m
set
status = 'DEFAULT'
where
status = 'PENDING'
and conversation_id = '1'
and not exists (
select
1
from
"member"
where
id = m.member_id
and origin_type = 'USER'
)
and m.updated_at < (
select
max(updated_at)
from
message
where
conversation_id = m.conversation_id
and origin_type = 'USER'
)
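The WHERE versus HAVING distinction that these answers rely on can be illustrated with a short sketch. It runs two SELECT statements against an in-memory SQLite table via Python's sqlite3 module; the message table here is a simplified, hypothetical version of the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (
    id INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL,
    status TEXT NOT NULL,
    updated_at TEXT NOT NULL
);
-- Hypothetical sample data
INSERT INTO message (conversation_id, status, updated_at) VALUES
    (1, 'PENDING', '2022-01-01'),
    (1, 'PENDING', '2022-01-02'),
    (1, 'DEFAULT', '2022-01-03'),
    (2, 'PENDING', '2022-01-04');
""")

# WHERE filters individual rows *before* they are grouped.
pending_per_conversation = conn.execute("""
    SELECT conversation_id, COUNT(*)
    FROM message
    WHERE status = 'PENDING'
    GROUP BY conversation_id
    ORDER BY conversation_id
""").fetchall()

# HAVING filters whole groups *after* aggregation, which is why it
# normally appears together with GROUP BY.
busy_conversations = conn.execute("""
    SELECT conversation_id, COUNT(*)
    FROM message
    GROUP BY conversation_id
    HAVING COUNT(*) > 1
    ORDER BY conversation_id
""").fetchall()

print(pending_per_conversation)  # [(1, 2), (2, 1)]
print(busy_conversations)        # [(1, 3)]
```

The conditions in the question's subqueries (id = m.member_id, origin_type = 'USER') are row-level filters of the first kind, so no GROUP BY is needed once they move into WHERE.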

postgresql/sql - Improve query where same sub select used for IN & NOT IN select

How could I improve this query? The problem is that the same sub-select is used twice, first for an IN and then for a NOT IN:
SELECT
"activities".*
FROM "activities"
WHERE (
user_id IN (
SELECT followed_id
FROM relationships
WHERE follower_id = 1 AND blocked = false)
AND
targeted_user_id NOT IN (
SELECT followed_id
FROM relationships
WHERE follower_id = 1 AND blocked = false )
)
Using a common table expression will help:
WITH users_cte AS (
SELECT followed_id
FROM relationships
WHERE follower_id = 1 AND blocked = false)
SELECT "activities".*
FROM "activities"
WHERE user_id IN (SELECT followed_id FROM users_cte)
AND targeted_user_id NOT IN (SELECT followed_id FROM users_cte)
I would phrase the query using exists:
SELECT a.*
FROM activities a
WHERE EXISTS (SELECT 1
FROM relationships r
WHERE r.followed_id = a.user_id AND
r.follower_id = 1 and
r.blocked = false
) AND
NOT EXISTS (SELECT 1
FROM relationships r
WHERE r.followed_id = a.targeted_user_id AND
r.follower_id = 1 and
r.blocked = false
);
Then, I would create an index on relationships(followed_id, follower_id, blocked):
create index idx_relationships_3 on relationships(followed_id, follower_id, blocked);
From a performance perspective, the index should be much better than using a CTE (assuming you are really using Postgres; MySQL versions before 8.0 don't support CTEs).
In addition to indexes you could try rewriting the query as follows:
SELECT distinct a.*
FROM activities a
join relationships x
on a.user_id = x.followed_id
left join relationships r
on a.targeted_user_id = r.followed_id
and r.follower_id = 1
and r.blocked = false
where r.followed_id is null
and x.follower_id = 1
and x.blocked = false
If the inner join to relationships (x) above does not result in repeated rows of activities, you can get rid of the DISTINCT.
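As a quick sanity check that the CTE form and the EXISTS form select the same rows, here is a minimal sketch using Python's sqlite3 module. The table layouts and sample rows are assumptions made for illustration, and blocked is stored as 0/1 rather than a boolean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relationships (
    follower_id INTEGER NOT NULL,
    followed_id INTEGER NOT NULL,
    blocked INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE activities (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    targeted_user_id INTEGER NOT NULL
);
-- Hypothetical data: user 1 follows 2 and 3; the relationship with 4 is blocked.
INSERT INTO relationships (follower_id, followed_id, blocked) VALUES
    (1, 2, 0), (1, 3, 0), (1, 4, 1);
INSERT INTO activities (id, user_id, targeted_user_id) VALUES
    (10, 2, 5),  -- followed actor, unfollowed target: kept
    (11, 2, 3),  -- target is also followed: dropped
    (12, 4, 5),  -- actor relationship is blocked: dropped
    (13, 3, 4);  -- followed actor, blocked target: kept
""")

cte_sql = """
WITH users_cte AS (
    SELECT followed_id
    FROM relationships
    WHERE follower_id = 1 AND blocked = 0
)
SELECT a.id
FROM activities a
WHERE a.user_id IN (SELECT followed_id FROM users_cte)
  AND a.targeted_user_id NOT IN (SELECT followed_id FROM users_cte)
ORDER BY a.id
"""

exists_sql = """
SELECT a.id
FROM activities a
WHERE EXISTS (SELECT 1 FROM relationships r
              WHERE r.followed_id = a.user_id
                AND r.follower_id = 1 AND r.blocked = 0)
  AND NOT EXISTS (SELECT 1 FROM relationships r
                  WHERE r.followed_id = a.targeted_user_id
                    AND r.follower_id = 1 AND r.blocked = 0)
ORDER BY a.id
"""

cte_ids = [row[0] for row in conn.execute(cte_sql)]
exists_ids = [row[0] for row in conn.execute(exists_sql)]
print(cte_ids, exists_ids)  # [10, 13] [10, 13]
```

The two forms agree here because the columns involved are declared NOT NULL; with NULLs in followed_id, NOT IN and NOT EXISTS can give different results.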

Complex conditional SQL statement in SQLite

I'm trying to build a support system in which I now face a complex query. I've got a couple of tables in my SQLite database which look like so (slightly simplified):
CREATE TABLE "assign" (
"id" INTEGER NOT NULL PRIMARY KEY,
"created" DATETIME NOT NULL,
"is_assigned" SMALLINT NOT NULL,
"user_id" INTEGER NOT NULL REFERENCES "user" ("id")
);
CREATE TABLE "message" (
"id" INTEGER NOT NULL PRIMARY KEY,
"created" DATETIME NOT NULL,
"user_id" INTEGER REFERENCES "user" ("id") ,
"text" TEXT NOT NULL
);
CREATE TABLE "user" (
"id" INTEGER NOT NULL PRIMARY KEY,
"name" VARCHAR(255) NOT NULL
);
I now want to do a query which gives me *a list of users for which the last created Assign.is_assigned == False and the last created Message is later than the last created Assign*. So I now have the following (pseudo) query:
SELECT *
FROM user
WHERE ((IF (
SELECT is_assigned
FROM assign
WHERE assign.user_id = user.id
ORDER BY created DESC LIMIT 1
) = False)
AND ((
SELECT created
FROM message
WHERE message.user_id = user.id
ORDER BY created DESC
LIMIT 1
) > (
SELECT created
FROM assign
WHERE assign.user_id = user.id
ORDER BY created DESC
LIMIT 1))
);
This makes sense to me, but unfortunately not to the computer. I guess I need to make use of case statements or even joins or something but I have no clue how. Does anybody have a tip on how to do this?
You don't need the IF in there, and SQLite has no False keyword (versions before 3.23.0 do not recognize TRUE and FALSE), but otherwise your query is quite correct:
SELECT *
FROM "user"
WHERE NOT (SELECT is_assigned
FROM assign
WHERE user_id = "user".id
ORDER BY created DESC
LIMIT 1)
AND (SELECT created
FROM message
WHERE user_id = "user".id
ORDER BY created DESC
LIMIT 1
) > (
SELECT created
FROM assign
WHERE user_id = "user".id
ORDER BY created DESC
LIMIT 1)
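A minimal sketch to try this query out, using Python's sqlite3 module and the schema from the question. The three users and their rows are made-up sample data chosen so that exactly one user satisfies both conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "user" (
    "id" INTEGER NOT NULL PRIMARY KEY,
    "name" VARCHAR(255) NOT NULL
);
CREATE TABLE "assign" (
    "id" INTEGER NOT NULL PRIMARY KEY,
    "created" DATETIME NOT NULL,
    "is_assigned" SMALLINT NOT NULL,
    "user_id" INTEGER NOT NULL REFERENCES "user" ("id")
);
CREATE TABLE "message" (
    "id" INTEGER NOT NULL PRIMARY KEY,
    "created" DATETIME NOT NULL,
    "user_id" INTEGER REFERENCES "user" ("id"),
    "text" TEXT NOT NULL
);
-- Hypothetical sample data
INSERT INTO "user" (id, name) VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO assign (created, is_assigned, user_id) VALUES
    ('2020-01-01 09:00:00', 1, 1),
    ('2020-01-02 09:00:00', 0, 1),  -- alice: last assign is unassigned
    ('2020-01-02 09:00:00', 1, 2),  -- bob: last assign is still assigned
    ('2020-01-05 09:00:00', 0, 3);  -- carol: unassigned, but no later message
INSERT INTO message (created, user_id, "text") VALUES
    ('2020-01-03 09:00:00', 1, 'hello'),
    ('2020-01-03 09:00:00', 2, 'hi'),
    ('2020-01-04 09:00:00', 3, 'hey');
""")

sql = """
SELECT name
FROM "user"
WHERE NOT (SELECT is_assigned
           FROM assign
           WHERE user_id = "user".id
           ORDER BY created DESC
           LIMIT 1)
  AND (SELECT created
       FROM message
       WHERE user_id = "user".id
       ORDER BY created DESC
       LIMIT 1
      ) > (
       SELECT created
       FROM assign
       WHERE user_id = "user".id
       ORDER BY created DESC
       LIMIT 1)
"""

names = [row[0] for row in conn.execute(sql)]
print(names)  # ['alice']
```

The timestamps are stored as ISO-8601 text, so the > comparison is a string comparison that happens to order them chronologically; other date formats would need care.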
Try the following query, which I wrote for MySQL:
SELECT u.id AS 'user',u.name AS 'User_Name', ass.created AS 'assign_created',ass.is_assigned AS 'is_assigned',
msg.created AS 'message_created'
FROM `user` AS u
LEFT JOIN `assign` AS ass ON ass.`user_id` = u.`id`
LEFT JOIN `message` AS msg ON msg.`user_id` = u.id
LEFT JOIN (SELECT u.id AS 'user_id',u.name AS 'username',ass.created AS 'max_ass_created',ass.is_assigned AS 'assigned'
FROM `user` AS u
LEFT JOIN `assign` AS ass ON ass.`user_id` = u.`id`
LEFT JOIN `message` AS msg ON msg.`user_id` = u.`id`
GROUP BY u.id ORDER BY ass.created DESC) AS sub ON sub.user_id = u.id
WHERE (sub.assigned IS FALSE AND msg.created > sub.max_ass_created)
Check the SQL Fiddle for your scenario.
Hope this solves your problem!

Complex ARel query

I've got a complicated query that I can't wrap my head around (using either SQL or ActiveRecord). Here are my models:
class Contact
has_many :profile_answers
end
class ProfileAnswer
belongs_to :contact
belongs_to :profile_question
end
class ProfileQuestion
has_many :profile_answers
end
I'm trying to find the number of ProfileAnswers for two contacts that have the same value for a particular ProfileQuestion. In other words:
Get the total number of profile answers that two contacts have answered with the same value for a particular profile_question
I don't want to make multiple queries and filter, as I know this is possible with SQL only; I just don't know how to do it.
I had considered a self join of profile_answers on profile_question_id and then filtering by the value being equal, but I still can't wrap my head around that. Any help is greatly appreciated.
I think this will do:
SELECT COUNT(DISTINCT profile_question_id)
FROM
( SELECT profile_question_id
FROM ProfileAnswer an
JOIN ProfileQuestion qu
ON qu.id = an.profile_question_id
WHERE contact_id IN ( id1, id2 )
GROUP BY profile_question_id
, value
HAVING COUNT(*) = 2
) AS grp
The JOIN does not seem to be used, though. So, if ProfileAnswer.profile_question_id is NOT NULL, this will suffice:
SELECT COUNT(*)
FROM
( SELECT profile_question_id
FROM ProfileAnswer
WHERE contact_id IN ( id1, id2 )
GROUP BY profile_question_id
, value
HAVING COUNT(*) = 2
) AS grp
EDITED for two specific contacts (with ids id1 and id2).
Added the WHERE and changed the COUNT(DISTINCT ...) to COUNT(*).
Perhaps this version with JOIN can be more easily adapted to ActiveRecord.
Using JOIN
SELECT COUNT(*)
FROM ProfileAnswer a
JOIN ProfileAnswer b
ON a.profile_question_id = b.profile_question_id
AND a.value = b.value
WHERE a.contact_id = id1
AND b.contact_id = id2
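Both variants can be compared on a small dataset. The sketch below uses Python's sqlite3 module; the table name profile_answers (the Rails default for the ProfileAnswer model) and the sample rows are assumptions, and contact ids 1 and 2 stand in for id1 and id2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profile_answers (
    id INTEGER PRIMARY KEY,
    contact_id INTEGER NOT NULL,
    profile_question_id INTEGER NOT NULL,
    value TEXT NOT NULL
);
-- Hypothetical sample data
INSERT INTO profile_answers (contact_id, profile_question_id, value) VALUES
    (1, 100, 'red'), (2, 100, 'red'),  -- same answer
    (1, 101, 'cat'), (2, 101, 'dog'),  -- different answers
    (1, 102, 'tea'), (2, 102, 'tea'),  -- same answer
    (3, 100, 'red');                   -- a third contact, not part of the comparison
""")

# Variant 1: group by (question, value) and keep groups containing both contacts.
grouped_sql = """
SELECT COUNT(*)
FROM (
    SELECT profile_question_id
    FROM profile_answers
    WHERE contact_id IN (?, ?)
    GROUP BY profile_question_id, value
    HAVING COUNT(*) = 2
) AS grp
"""

# Variant 2: self join on question and value.
join_sql = """
SELECT COUNT(*)
FROM profile_answers a
JOIN profile_answers b
  ON a.profile_question_id = b.profile_question_id
 AND a.value = b.value
WHERE a.contact_id = ?
  AND b.contact_id = ?
"""

grouped_count = conn.execute(grouped_sql, (1, 2)).fetchone()[0]
join_count = conn.execute(join_sql, (1, 2)).fetchone()[0]
print(grouped_count, join_count)  # 2 2
```

Both variants assume each contact answers a given question at most once; duplicate answers would inflate the counts in different ways.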
Here's how I ended up doing it, thanks again @ypercube:
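class ProfileAnswer < ActiveRecord::Base
  # Answers given by any of the passed contacts.
  def self.for_contacts(*contacts)
    where :contact_id => contacts.collect(&:id)
  end
  # One row per (question, value) pair that every passed contact answered identically.
  def self.common_for_contacts(*contacts)
    select(:profile_question_id).for_contacts(*contacts).group(:profile_question_id, :value).having("count(*) = #{contacts.length}")
  end
  # Counts those rows. The derived table gets an alias because some databases (e.g. MySQL) require one.
  def self.common_count_for_contacts(*contacts)
    find_by_sql("select count(*) as answer_count from (#{common_for_contacts(*contacts).to_sql}) as common_answers").first.answer_count
  end
end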
# Usage
ProfileAnswer.common_count_for_contacts(contact1, contact2[, contact3...])
I still had to use find_by_sql in the end for the nested select; I'm not sure if there's any way around that.
It's also annoying that find_by_sql returns an array, so I had to use .first, which then gives me the object that has my answer_count attribute on it.