PostgreSQL Filter condition by subquery output - sql

I have a query with a subquery whose result is output as the yesterday_sum column. I need to keep only rows where yesterday_sum > 1, but I can't add HAVING sum(p.profit_percent) / :workDays > 1 AND yesterday_sum > 1 because yesterday_sum is not part of the GROUP BY. I also can't add a condition on positions p, because yesterday_sum is not a column of positions.
SELECT u.id AS id,
u.nickname AS title,
sum(p.profit_percent) / :workDays AS middle,
(
SELECT sum(ps.profit_percent)
FROM positions ps
WHERE ps.user_id = u.id
AND ps.open_at BETWEEN
:dateYesterday AND
:dateYesterday + INTERVAL '1 day'
GROUP BY (ps.user_id)
) AS yesterday_sum
FROM positions p
INNER JOIN users u ON u.id = p.user_id
AND p.profit_percent IS NOT NULL
AND p.parent_ticket IS NULL
AND p.close_at IS NOT NULL
AND p.open_at BETWEEN :dateFrom AND :dateTo
GROUP BY (u.id, u.nickname)
HAVING sum(p.profit_percent) / :workDays > 1
ORDER BY middle DESC;
How can I get rid of rows where yesterday_sum is less than 1 or NULL?

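To filter on yesterday_sum in a WHERE clause, wrap the whole query in an outer query (a derived table) so that the alias becomes a real column name. Filtering is then trivial, and rows where yesterday_sum is NULL are dropped as well, because a comparison with NULL is never true.
For example, you can do: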
select *
from (
SELECT u.id AS id,
u.nickname AS title,
sum(p.profit_percent) / :workDays AS middle,
(
SELECT sum(ps.profit_percent)
FROM positions ps
WHERE ps.user_id = u.id
AND ps.open_at BETWEEN
:dateYesterday AND
:dateYesterday + INTERVAL '1 day'
GROUP BY (ps.user_id)
) AS yesterday_sum
FROM positions p
INNER JOIN users u ON u.id = p.user_id
AND p.profit_percent IS NOT NULL
AND p.parent_ticket IS NULL
AND p.close_at IS NOT NULL
AND p.open_at BETWEEN :dateFrom AND :dateTo
GROUP BY (u.id, u.nickname)
HAVING sum(p.profit_percent) / :workDays > 1
) x
where yesterday_sum >= 1 -- yesterday_sum is available here
ORDER BY middle DESC;
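The derived-table pattern above can be tried out on a toy data set. This is a minimal sketch, assuming Python's built-in sqlite3 module instead of PostgreSQL; the table layout is simplified, and the open_day column and the function name users_with_yesterday_sum are hypothetical stand-ins for the date-range logic in the question.

```python
import sqlite3

def users_with_yesterday_sum(conn, min_sum):
    """Return (id, title, middle, yesterday_sum) rows with yesterday_sum >= min_sum."""
    query = """
        SELECT *
        FROM (
            SELECT u.id AS id,
                   u.nickname AS title,
                   SUM(p.profit_percent) AS middle,
                   (SELECT SUM(ps.profit_percent)
                    FROM positions ps
                    WHERE ps.user_id = u.id
                      AND ps.open_day = 'yesterday') AS yesterday_sum
            FROM positions p
            INNER JOIN users u ON u.id = p.user_id
            GROUP BY u.id, u.nickname
        ) x
        -- the alias is an ordinary column here; NULL sums never satisfy the comparison
        WHERE yesterday_sum >= ?
        ORDER BY middle DESC
    """
    return conn.execute(query, (min_sum,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, nickname TEXT);
    -- open_day is a hypothetical simplification of the open_at date range
    CREATE TABLE positions (user_id INTEGER, profit_percent REAL, open_day TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cid');
    INSERT INTO positions VALUES
        (1, 5.0, 'yesterday'), (1, 2.0, 'earlier'),
        (2, 0.5, 'yesterday'), (2, 9.0, 'earlier'),
        (3, 4.0, 'earlier');
""")
rows = users_with_yesterday_sum(conn, 1)
print(rows)
```

Only the user whose yesterday_sum is at least 1 survives; the user with a sum of 0.5 and the user with no positions yesterday (a NULL sum) are both filtered out by the outer WHERE.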

Related

SQL query to conditionally select a field value

I have an SQL query that joins 3 tables to return me the required data. The query is as follows:
SELECT (s.user_created,
u.first_name,
u.last_name,
u.email,
s.school_name,
p.plan_name,
substring( a.message, 11), u.phone1)
FROM cb_school s
inner join ugrp_user u
on s.user_created = u.user_id
inner join cb_plan p
on s.current_plan = p.plan_id
inner join audit a
on u.user_id = a.user_id
where s.type = 'sample'
and a.module_short = 'sample-user'
and s.created_time > current_timestamp - interval '10 day';
The query works fine if all the attributes are present. For a few of my rows, however, there is no audit entry with a.module_short = 'sample-user'. Because I have included that as an AND condition, those rows are not returned. I am trying to return an empty string for that field if the value is missing, and otherwise the value as per my current query. Is there any way to achieve this?
I think you could use a CASE expression, like this:
SELECT CASE WHEN a.module_short = 'sample-user' THEN a.module_short
ELSE '' END AS module_short
FROM TableA AS a
You can use COALESCE, which returns its first non-NULL argument:
SELECT COALESCE(a.module_short,'')
FROM TableA AS a
SELECT s.user_created,
u.first_name,
u.last_name,
u.email,
s.school_name,
p.plan_name,
COALESCE(substring(a.message, 11), ''), -- empty string when no audit row matched
u.phone1
FROM cb_school s
INNER JOIN ugrp_user u
ON s.user_created = u.user_id
INNER JOIN cb_plan p
ON s.current_plan = p.plan_id
LEFT JOIN audit a -- LEFT JOIN keeps rows without a matching audit entry
ON u.user_id = a.user_id
AND a.module_short = 'sample-user'
WHERE s.type = 'sample'
AND s.created_time > current_timestamp - interval '10 day';
You want to show all users that have at least one module_short.
If the module_short contains 'sample-user' then it should show it, else it should show NULL as module_short. You only want 1 row per user, even if it has multiple module_shorts.
You can use a CTE, ROW_NUMBER() and the CASE clause for this question.
Example Question
I have 3 tables.
Users: Users with an ID
Modules: Modules with an ID
UserModules: The link between users and modules. A user can have multiple modules.
I need a query that returns all users that have at least 1 module, with 2 columns: UserName and ModuleName.
I only want 1 row for each user. The ModuleName should display SQL only if the user has that module; otherwise it should display no module (NULL).
Example Tables:
Users:
id name
1 Manuel
2 John
3 Doe
Modules:
id module
1 StackOverflow
2 SQL
3 StackExchange
4 SomethingElse
UserModules:
id module_id user_id
1 1 2
2 1 3
4 2 2
5 2 3
6 3 1
7 3 3
8 4 1
9 4 3
Example Query:
with CTE as (
select
u.name as UserName
, CASE
WHEN m.module = 'SQL' THEN 'SQL' ELSE NULL END as ModuleName
, ROW_NUMBER() OVER(PARTITION BY u.id
ORDER BY (CASE
WHEN m.module = 'SQL' THEN 'Ja' ELSE NULL END) DESC) as rn
from UserModules as um
inner join Users as u
on um.user_id = u.id
inner join Modules as m
on um.module_id = m.id
)
select UserName, ModuleName from CTE
where rn = 1
Example Result:
UserName ModuleName
Manuel NULL
John SQL
Doe SQL
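The example can be reproduced end to end. This is a minimal sketch, assuming Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). One deliberate change from the query above: the ordering key is a 0/1 flag rather than a NULL-producing CASE sorted DESC, because databases disagree on where NULLs sort (PostgreSQL, for instance, puts NULLs first under DESC). The CTE name ranked and the user_id column are added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Modules (id INTEGER PRIMARY KEY, module TEXT);
    CREATE TABLE UserModules (id INTEGER PRIMARY KEY, module_id INTEGER, user_id INTEGER);
    INSERT INTO Users VALUES (1, 'Manuel'), (2, 'John'), (3, 'Doe');
    INSERT INTO Modules VALUES (1, 'StackOverflow'), (2, 'SQL'),
                               (3, 'StackExchange'), (4, 'SomethingElse');
    INSERT INTO UserModules VALUES (1, 1, 2), (2, 1, 3), (4, 2, 2), (5, 2, 3),
                                   (6, 3, 1), (7, 3, 3), (8, 4, 1), (9, 4, 3);
""")

query = """
    WITH ranked AS (
        SELECT u.id AS user_id,
               u.name AS UserName,
               CASE WHEN m.module = 'SQL' THEN 'SQL' END AS ModuleName,
               ROW_NUMBER() OVER (
                   PARTITION BY u.id
                   -- 0 for the SQL row, 1 for any other row, so the SQL row gets rn = 1
                   ORDER BY CASE WHEN m.module = 'SQL' THEN 0 ELSE 1 END
               ) AS rn
        FROM UserModules AS um
        INNER JOIN Users AS u ON um.user_id = u.id
        INNER JOIN Modules AS m ON um.module_id = m.id
    )
    SELECT UserName, ModuleName
    FROM ranked
    WHERE rn = 1
    ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each user appears once; users without the SQL module still appear, with ModuleName left as NULL (None in Python).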
Your query would look like this:
with UsersWithRownumbersBasedOnModule_short as (
SELECT s.user_created,
u.first_name,
u.last_name,
u.email,
s.school_name,
p.plan_name,
substring( a.message, 11),
u.phone1,
CASE
WHEN a.module_short = 'sample-user'
THEN a.module_short
ELSE NULL
END AS ModuleShort,
-- NULLS LAST: PostgreSQL would otherwise sort NULLs first under DESC
ROW_NUMBER() OVER(PARTITION BY u.user_id ORDER BY (
CASE
WHEN a.module_short = 'sample-user'
THEN a.module_short
ELSE NULL
END) DESC NULLS LAST) as rn
FROM cb_school s
inner join ugrp_user u
on s.user_created = u.user_id
inner join cb_plan p
on s.current_plan = p.plan_id
inner join audit a
on u.user_id = a.user_id
where s.type = 'sample'
and s.created_time > current_timestamp - interval '10 day'
)
select * from UsersWithRownumbersBasedOnModule_short
where rn = 1;
PS: I removed a loose bracket after SELECT. Also note that SUBSTRING() needs 3 parameters in some databases (SQL Server, for example); PostgreSQL accepts the two-argument form used here.

Remove grouped data set when total of count is zero with subquery

I'm generating a data set that looks like this
category user total
1 jonesa 0
2 jonesa 0
3 jonesa 0
1 smithb 0
2 smithb 0
3 smithb 5
1 brownc 2
2 brownc 3
3 brownc 4
Where a particular user has 0 records in all categories, is it possible to remove their rows from the set? If a user has some activity, as smithb does, I'd like to keep all of their records, even the rows with zeroes. I'm not sure how to go about that. I thought a CASE statement might help, but this is pretty complicated for me. Here is my query:
SELECT DISTINCT c.category,
u.user_name,
CASE WHEN (
SELECT COUNT(e.entry_id)
FROM category c1
INNER JOIN entry e1
ON c1.category_id = e1.category_id
WHERE c1.category_id = c.category_id
AND e.user_name = u.user_name
AND e1.entered_date >= TO_DATE ('20140625','YYYYMMDD')
AND e1.entered_date <= TO_DATE ('20140731', 'YYYYMMDD')) > 0 -- I know this won't work
THEN 'Yes'
ELSE NULL
END AS TOTAL
FROM user u
INNER JOIN role r
ON u.id = r.user_id
AND r.id IN (1,2),
category c
LEFT JOIN entry e
ON c.category_id = e.category_id
WHERE c.category_id NOT IN (19,20)
I realise the case statement won't work, but it was an attempt on how this might be possible. I'm really not sure if it's possible or the best direction. Appreciate any guidance.
Try this:
delete from t1
where user in (
select user
from t1
group by user
having count(distinct category) = sum(case when total=0 then 1 else 0 end) )
The subquery finds all users that meet your removal requirement.
count(distinct category) gives the number of categories a user has.
sum(case when total=0 then 1 else 0 end) gives the number of rows without activity (total = 0) a user has. When the two numbers are equal, every category of that user is zero.
There are a number of ways to do this, but the less verbose the SQL is, the harder it may be for you to follow along with the logic. For that reason, I think that using multiple Common Table Expressions will avoid the need to use redundant joins, while being the most readable.
-- assuming user_name and category_name are unique on [user] and [category] respectively.
WITH valid_categories (category_id, category_name) AS
(
-- get set of valid categories
SELECT c.category_id, c.category AS category_name
FROM category c
WHERE c.category_id NOT IN (19,20)
),
valid_users ([user_name]) AS
(
-- get set of users who belong to valid roles
SELECT u.[user_name]
FROM [user] u
WHERE EXISTS (
SELECT *
FROM [role] r
WHERE u.id = r.[user_id] AND r.id IN (1,2)
)
),
valid_entries (entry_id, [user_name], category_id, entry_count) AS
(
-- provides a flag of 1 for easier aggregation
SELECT e.[entry_id], e.[user_name], e.category_id, CAST( 1 AS INT) AS entry_count
FROM [entry] e
WHERE e.entered_date BETWEEN TO_DATE('20140625','YYYYMMDD') AND TO_DATE('20140731', 'YYYYMMDD')
-- determines if entry is within date range
),
user_categories ([user_name], category_id, category_name) AS
( SELECT u.[user_name], c.category_id, c.category_name
FROM valid_users u
-- get the cartesian product of users and categories
CROSS JOIN valid_categories c
-- get only users with a valid entry
WHERE EXISTS (
SELECT *
FROM valid_entries e
WHERE e.[user_name] = u.[user_name]
)
)
/*
You can use these for testing.
SELECT COUNT(*) AS valid_categories_count
FROM valid_categories
SELECT COUNT(*) AS valid_users_count
FROM valid_users
SELECT COUNT(*) AS valid_entries_count
FROM valid_entries
SELECT COUNT(*) AS users_with_entries_count
FROM valid_users u
WHERE EXISTS (
SELECT *
FROM user_categories uc
WHERE uc.user_name = u.user_name
)
SELECT COUNT(*) AS users_without_entries_count
FROM valid_users u
WHERE NOT EXISTS (
SELECT *
FROM user_categories uc
WHERE uc.user_name = u.user_name
)
SELECT uc.[user_name], uc.[category_name], e.[entry_count]
FROM user_categories uc
INNER JOIN valid_entries e ON (uc.[user_name] = e.[user_name] AND uc.[category_id] = e.[category_id])
*/
-- Finally, the results:
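SELECT uc.[user_name], uc.[category_name], SUM(NVL(e.[entry_count],0)) AS [entry_count]
FROM user_categories uc
LEFT OUTER JOIN valid_entries e ON (uc.[user_name] = e.[user_name] AND uc.[category_id] = e.[category_id])
-- aggregate per user and category; SUM() with non-aggregated columns needs a GROUP BY
GROUP BY uc.[user_name], uc.[category_name];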
Here's another method:
WITH totals AS (
SELECT
c.category,
u.user_name,
COUNT(e.entry_id) AS total,
SUM(COUNT(e.entry_id)) OVER (PARTITION BY u.user_name) AS user_total
FROM
user u
INNER JOIN
role r ON u.id = r.user_id
CROSS JOIN
category c
LEFT JOIN
entry e ON c.category_id = e.category_id
AND u.user_name = e.user_name
AND e.entered_date >= TO_DATE ('20140625', 'YYYYMMDD')
AND e.entered_date <= TO_DATE ('20140731', 'YYYYMMDD')
WHERE
r.id IN (1, 2)
AND c.category_id NOT IN (19, 20)
GROUP BY
c.category,
u.user_name
)
SELECT
category,
user_name,
total
FROM
totals
WHERE
user_total > 0
;
The totals CTE calculates the totals per user and category as well as the total across all categories per user (using SUM() OVER ...). The main query returns only rows where the user total is greater than zero.
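The per-user window total can be checked against the data set from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or newer for window functions) and assuming the per-category counts are already materialized in a table, here hypothetically named totals, so the window SUM runs over a plain column rather than over COUNT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table holding the data set shown in the question
    CREATE TABLE totals (category INTEGER, user_name TEXT, total INTEGER);
    INSERT INTO totals VALUES
        (1, 'jonesa', 0), (2, 'jonesa', 0), (3, 'jonesa', 0),
        (1, 'smithb', 0), (2, 'smithb', 0), (3, 'smithb', 5),
        (1, 'brownc', 2), (2, 'brownc', 3), (3, 'brownc', 4);
""")

query = """
    WITH with_user_total AS (
        SELECT category, user_name, total,
               -- total across all categories of the same user, repeated on each row
               SUM(total) OVER (PARTITION BY user_name) AS user_total
        FROM totals
    )
    SELECT category, user_name, total
    FROM with_user_total
    WHERE user_total > 0
    ORDER BY user_name, category
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

All of jonesa's rows disappear, while smithb keeps the two zero rows because the user total is 5.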

oracle group by can't get last payment_date, payment_sum [duplicate]

I have the following query:
SELECT
p.id,
last_date_ps.pay_date last_pay_date
FROM projects p
LEFT JOIN
(
SELECT
pp.project_id,
max(pp.pay_date) AS pay_date,
pp.pay_sum
FROM project_partuals pp
WHERE pp.status IN (2, 4) AND pp.pay_sum > 0 AND pp.pay_date IS NOT NULL
GROUP BY pp.project_id
) last_date_ps ON last_date_ps.project_id = p.id,
contacts c
WHERE (p.debtor_contact_id = c.id)
ORDER BY priority_value DESC, name_f ASC;
and I get this error:
Error: ORA-00979: not a GROUP BY expression
SQLState: 42000
ErrorCode: 979
Position: 216
When I remove pp.pay_sum, the query works. How can I get, in the left join, the pay_date and pay_sum of the row with the latest (maximum) pay_date?
If you want pay_sum per project as a result from the inner query you need to aggregate it:
(select pp.project_id, max(pp.pay_date) as pay_date, sum(pp.pay_sum) as pay_sum
from project_partuals pp
where pp.status in (2,4) and pp.pay_sum > 0 and pp.pay_date is not null
group by pp.project_id ) last_date_ps
If you want only the last payment per project, the inner query should be:
(select project_id, pay_date, pay_sum FROM
(select pp.project_id, pp.pay_date, pp.pay_sum,
row_number() over (PARTITION by pp.project_id order by pp.pay_date desc) rn
from project_partuals pp where pp.status in (2,4) and pp.pay_sum > 0 and pp.pay_date is not null
) X where X.rn = 1)
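The "last payment per project" variant can be exercised on a few sample rows. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of Oracle (SQLite 3.25 or newer for window functions); the sample rows are invented, and dates are stored as ISO-8601 text so that string order matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_partuals (
        project_id INTEGER, pay_date TEXT, pay_sum REAL, status INTEGER
    );
    -- invented sample rows
    INSERT INTO project_partuals VALUES
        (1, '2015-01-10', 100.0, 2),
        (1, '2015-03-05', 250.0, 4),
        (1, '2015-04-01', 999.0, 1),   -- status not in (2, 4): ignored
        (2, '2015-02-20',  40.0, 2);
""")

query = """
    SELECT project_id, pay_date, pay_sum
    FROM (
        SELECT pp.project_id, pp.pay_date, pp.pay_sum,
               -- number each project's payments from newest to oldest
               ROW_NUMBER() OVER (PARTITION BY pp.project_id
                                  ORDER BY pp.pay_date DESC) AS rn
        FROM project_partuals pp
        WHERE pp.status IN (2, 4) AND pp.pay_sum > 0 AND pp.pay_date IS NOT NULL
    ) x
    WHERE x.rn = 1
    ORDER BY project_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because pay_date and pay_sum come from the same numbered row, the sum always belongs to the latest qualifying payment, which a plain GROUP BY with max(pay_date) cannot guarantee.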
If you use a GROUP BY clause, every selected column other than constants and columns inside aggregate functions must appear in it, so add pp.pay_sum to the GROUP BY. Note that this returns one row per (project_id, pay_sum) combination rather than one row per project:
select
p.id,
last_date_ps.pay_date last_pay_date
from projects p
left join
(select pp.project_id,max(pp.pay_date) as pay_date, pp.pay_sum
from project_partuals pp
where pp.status in (2,4) and pp.pay_sum > 0 and pp.pay_date is not null
group by pp.project_id, pp.pay_sum
) last_date_ps on last_date_ps.project_id = p.id
join contacts c on (p.debtor_contact_id=c.id)
order by priority_value desc, name_f asc

SQL Null Data Canceling Out All of Calculation

I have this calculation/query here:
SELECT u.username,
(a.totalCount * 7) +
(b.totalCount * 3) +
(c.totalCount * 1) AS totalScore
FROM users u
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM items
GROUP BY user_id
) a ON a.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM comments
GROUP BY user_id
) b ON b.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM ratings
GROUP BY user_id
) c ON c.user_id = u.user_id
ORDER BY totalScore DESC LIMIT 10;
The problem is, if either a,b, or c returns 0, the entire totalScore is 0. I can't figure out what is going on? I am not multiplying the final tally by 0 I don't think?
I think it's a NULL problem instead: with a LEFT JOIN, a user with no matching rows gets NULL for totalCount, and NULL + 1 + 2 = NULL.
So use the COALESCE function (if null, then ...):
SELECT u.username,
(COALESCE(a.totalCount, 0) * 7) +
(COALESCE(b.totalCount, 0) * 3) +
(COALESCE(c.totalCount, 0) * 1) AS totalScore
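The effect of NULL on the arithmetic, and the COALESCE fix, can be seen side by side. This is a minimal sketch, assuming Python's built-in sqlite3 module and a cut-down schema with only the items and comments tables; the sample users and the raw_score and safe_score aliases are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE items (user_id INTEGER);
    CREATE TABLE comments (user_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO items VALUES (1), (1), (2);
    INSERT INTO comments VALUES (1);   -- bob has no comments
""")

query = """
    SELECT u.username,
           -- NULL from an unmatched LEFT JOIN poisons the whole sum
           (a.totalCount * 7) + (b.totalCount * 3) AS raw_score,
           -- COALESCE turns a missing count into 0 before the arithmetic
           (COALESCE(a.totalCount, 0) * 7) + (COALESCE(b.totalCount, 0) * 3) AS safe_score
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(user_id) AS totalCount
               FROM items GROUP BY user_id) a ON a.user_id = u.user_id
    LEFT JOIN (SELECT user_id, COUNT(user_id) AS totalCount
               FROM comments GROUP BY user_id) b ON b.user_id = u.user_id
    ORDER BY u.user_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

For bob the uncorrected score comes back as NULL (None in Python) even though he has an item, while the COALESCE version yields 7.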

Postgres CASE in ORDER BY using an alias

I have the following query which works great in Postgres 9.1:
SELECT users.id, GREATEST(
COALESCE(MAX(messages.created_at), '2012-07-25 16:05:41.870117'),
COALESCE(MAX(phone_calls.created_at), '2012-07-25 16:05:41.870117')
) AS latest_interaction
FROM users LEFT JOIN messages ON users.id = messages.user_id
LEFT JOIN phone_calls ON users.id = phone_calls.user_id
GROUP BY users.id
ORDER BY latest_interaction DESC
LIMIT 5;
But what I want to do is something like this:
SELECT users.id, GREATEST(
COALESCE(MAX(messages.created_at), '2012-07-25 16:05:41.870117'),
COALESCE(MAX(phone_calls.created_at), '2012-07-25 16:05:41.870117')
) AS latest_interaction
FROM users LEFT JOIN messages ON users.id = messages.user_id
LEFT JOIN phone_calls ON users.id = phone_calls.user_id
GROUP BY users.id
ORDER BY
CASE WHEN(
latest_interaction > '2012-09-05 16:05:41.870117')
THEN 0
WHEN(latest_interaction > '2012-09-04 16:05:41.870117')
THEN 2
WHEN(latest_interaction > '2012-09-04 16:05:41.870117')
THEN 3
ELSE 4
END
LIMIT 5;
And I get the following error:
ERROR: column "latest_interaction" does not exist
It seems like I cannot use the alias for the aggregate latest_interaction in the order by clause with a CASE statement.
Are there any workarounds for this?
Try to wrap it as a subquery:
SELECT *
FROM
(
SELECT users.id,
GREATEST(
COALESCE(MAX(messages.created_at), '2012-07-25 16:05:41.870117'),
COALESCE(MAX(phone_calls.created_at), '2012-07-25 16:05:41.870117')
) AS latest_interaction
FROM users LEFT JOIN messages ON users.id = messages.user_id
LEFT JOIN phone_calls ON users.id = phone_calls.user_id
GROUP BY users.id
) Sub
ORDER BY
CASE WHEN(
latest_interaction > '2012-09-05 16:05:41.870117')
THEN 0
WHEN(latest_interaction > '2012-09-04 16:05:41.870117')
THEN 2
WHEN(latest_interaction > '2012-09-04 16:05:41.870117')
THEN 3
ELSE 4
END
LIMIT 5;
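The PostgreSQL manual says this about the ORDER BY expression:
Each expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values.
In other words, an output-column alias such as latest_interaction is only recognized when it stands alone in ORDER BY; inside an expression such as CASE, names are resolved as input columns, which is why the alias is not found.
The subquery solution from @Mahmoud will work, or you can build the ORDER BY expression from the original columns messages.created_at and phone_calls.created_at.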
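The shape of the subquery workaround can be sketched on a small data set. This assumes Python's built-in sqlite3 module, which is more permissive about aliases than PostgreSQL, so it illustrates only the structure of the fix and not the original error. The schema is reduced to users and messages, the sample rows are invented, and timestamps are stored as ISO-8601 text so that string comparison follows time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE messages (user_id INTEGER, created_at TEXT);
    INSERT INTO users VALUES (1), (2), (3);
    -- invented sample rows
    INSERT INTO messages VALUES
        (1, '2012-09-01 10:00:00'),
        (2, '2012-09-06 10:00:00'),
        (3, '2012-09-05 08:00:00');
""")

query = """
    SELECT id, latest_interaction
    FROM (
        SELECT users.id AS id,
               COALESCE(MAX(messages.created_at), '2012-07-25 16:05:41') AS latest_interaction
        FROM users
        LEFT JOIN messages ON users.id = messages.user_id
        GROUP BY users.id
    ) sub
    -- latest_interaction is an input column of the outer query, so CASE may use it
    ORDER BY CASE WHEN latest_interaction > '2012-09-05 16:05:41' THEN 0
                  WHEN latest_interaction > '2012-09-04 16:05:41' THEN 2
                  ELSE 4
             END,
             id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The inner query computes the alias once; the outer query then sees it as an ordinary column and can sort by any expression built on it.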