Aggregating SQL results, two tables, then summarizing results - sql

Using MSSQL
I have a table of users, and a table of products to which they are subscribed. Those subscriptions are either Free (F) or Paid (P). I have joined the tables, converted the F/P value to numeric using a case statement, and then summed up those values by user ID, the idea being that anyone with only free accounts will have a sum of 0, those with at least one paid account a value 1 or greater. I've gotten this far with the following:
SELECT t1.user_id, SUM(
CASE
WHEN t2.free_paid = 'P'
THEN 1
ELSE 0
END ) as highest
FROM users t1 INNER JOIN accounts t2
ON t1.user_id = t2.user_id
WHERE t2.account = 'A'
GROUP BY t1.user_id
ORDER BY t1.user_id
This yields a result like:
755 2
1259 2
2031 1
3888 0
Meaning that all but 3888 have at least one paid account.
But now I would like to simply add those up somehow to get our two values, one a count of users with at least one paid account (3 in example), and a count of those with only free accounts (1 in example).
I tried declaring two variables, e.g. #free and #paid, and wrapping a CASE statement around the query above (treating it as a subquery) to add to those values, but I am unable to get that to run.
Any ideas?

Re-using the query from the question you can create a CTE (or a subquery) and aggregate the results:
;WITH CTE_UserAccounts AS (
SELECT t1.user_id, SUM(
CASE
WHEN t2.free_paid = 'P'
THEN 1
ELSE 0
END ) as highest
FROM users t1 INNER JOIN accounts t2
ON t1.user_id = t2.user_id
WHERE t2.account = 'A'
GROUP BY t1.user_id
)
SELECT
COUNT(CASE WHEN highest > 0 THEN 1 END) AS [Paid],
COUNT(CASE WHEN highest = 0 THEN 1 END) AS [Free]
FROM
CTE_UserAccounts;
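To check the two-level aggregation end to end, here is a minimal sketch that runs the same CTE against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MSSQL, and the account rows are hypothetical, chosen to reproduce the per-user sums shown in the question (755 -> 2, 1259 -> 2, 2031 -> 1, 3888 -> 0).

```python
import sqlite3

# Hypothetical sample data; only the per-user sums come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE accounts (user_id INTEGER, account TEXT, free_paid TEXT);
INSERT INTO users VALUES (755), (1259), (2031), (3888);
INSERT INTO accounts VALUES
    (755, 'A', 'P'), (755, 'A', 'P'),
    (1259, 'A', 'P'), (1259, 'A', 'P'),
    (2031, 'A', 'P'), (2031, 'A', 'F'),
    (3888, 'A', 'F');
""")

# Inner level: one row per user with the number of paid accounts.
# Outer level: count users with at least one paid account vs. none.
paid, free = conn.execute("""
WITH CTE_UserAccounts AS (
    SELECT t1.user_id,
           SUM(CASE WHEN t2.free_paid = 'P' THEN 1 ELSE 0 END) AS highest
    FROM users t1
    INNER JOIN accounts t2 ON t1.user_id = t2.user_id
    WHERE t2.account = 'A'
    GROUP BY t1.user_id
)
SELECT COUNT(CASE WHEN highest > 0 THEN 1 END) AS paid,
       COUNT(CASE WHEN highest = 0 THEN 1 END) AS free
FROM CTE_UserAccounts
""").fetchone()

print(paid, free)  # 3 1
```

COUNT skips the NULL that CASE returns when no branch matches, which is why no ELSE is needed in the outer query.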

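SELECT
COUNT(DISTINCT CASE WHEN t2.free_paid = 'P' THEN t1.user_id END) as atleast_one_paid,
-- users with no paid account: all users minus those with at least one paid account
-- (testing free_paid <> 'P' would also count users who have both kinds)
COUNT(DISTINCT t1.user_id) - COUNT(DISTINCT CASE WHEN t2.free_paid = 'P' THEN t1.user_id END) as onlyfree
FROM users t1
INNER JOIN accounts t2 ON t1.user_id = t2.user_id
WHERE t2.account = 'A'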
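One thing to watch with the single-pass form: a condition like free_paid <> 'P' matches any user who has at least one free account, which is not the same as having only free accounts. A minimal sketch with hypothetical rows (SQLite via Python's sqlite3 as a stand-in for MSSQL), where user 2031 has one paid and one free account:

```python
import sqlite3

# Hypothetical rows: 755 is paid only, 2031 has both kinds, 3888 is free only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (user_id INTEGER, account TEXT, free_paid TEXT);
INSERT INTO accounts VALUES
    (755, 'A', 'P'), (2031, 'A', 'P'), (2031, 'A', 'F'), (3888, 'A', 'F');
""")

any_free, only_free = conn.execute("""
SELECT
    -- users with at least one non-paid row (includes mixed users)
    COUNT(DISTINCT CASE WHEN free_paid <> 'P' THEN user_id END),
    -- all users minus users with at least one paid row
    COUNT(DISTINCT user_id)
      - COUNT(DISTINCT CASE WHEN free_paid = 'P' THEN user_id END)
FROM accounts
WHERE account = 'A'
""").fetchone()

print(any_free, only_free)  # 2 1
```

Only the subtraction form gives the "only free" figure the question asks for; the first expression also counts user 2031.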

You could just wrap your original query in a derived table and sum up the results:
SELECT SUM (case when highest > 0 THEN 1 else 0 END) as UsersWithPaidAccount,
SUM (case when highest = 0 THEN 1 else 0 END) as UsersWithOnlyFreeAccount
FROM (SELECT t1.user_id, SUM(
CASE
WHEN t2.free_paid = 'P'
THEN 1
ELSE 0
END ) as highest
FROM users t1 INNER JOIN accounts t2
ON t1.user_id = t2.user_id
WHERE t2.account = 'A'
GROUP BY t1.user_id)
as DerivedTable

Related

SQL select and order by (count of related records inside a second table) subtracted by (count of related records inside second table with condition)

I am trying to SELECT every record in table one, ORDERed BY the COUNT of records in a second table that store the related primary key value WHERE 'positive' is true, minus the COUNT of records in the second table that store the related primary key value WHERE 'positive' is false.
Here is my database structure
Table 1
id data
0 zero
1 one
2 two
3 three
Table 2
id related_tableone_id positive
0 1 0
1 2 1
2 2 0
3 2 1
4 3 1
5 3 1
Here is what I am trying to get (I don't need the subtracted_counts column itself, but its values are what the records should be ordered by):
id data subtracted_counts
3 three 2
2 two 1
0 zero 0
1 one -1
To make clearer what I want to achieve:
This database structure can be compared to a voting system, where
Table 1 holds the entities that can be voted up or down.
In this case, Table 2 would store the votes, with positive=true for an upvote and positive=false for a downvote.
The goal is to get all entities ORDERed BY their summarized vote value
(within a single query).
My research
I found the post "SQL - How To Order Using Count From Another Table", though it has no subtraction logic.
I tried this query:
SELECT
tableone.*,
COUNT(related_tableone_id) - COUNT(negative_related_tableone_id) AS subtracted_count
FROM
tableone
LEFT JOIN
(SELECT related_tableone_id
FROM tabletwo
WHERE positive = true) AS positives ON tableone.id = positives.related_tableone_id
LEFT JOIN
(SELECT related_tableone_id AS negative_related_tableone_id
FROM tabletwo
WHERE positive = false) AS negatives ON tableone.id = negatives.negative_related_tableone_id
GROUP BY
tableone.id
ORDER BY
subtracted_count DESC;
But for some reason it doesn't subtract the counts correctly, and there is probably a clearer solution.
Use a LEFT join of Table1 to Table2 and conditional aggregation in the ORDER BY clause:
SELECT t1.id, t1.data
FROM Table1 t1 LEFT JOIN Table2 t2
ON t2.related_tableone_id = t1.id
GROUP BY t1.id
ORDER BY SUM(CASE t2.positive WHEN true THEN 1 WHEN false THEN -1 ELSE 0 END) DESC;
or, a correlated subquery in the ORDER BY clause (which may perform better):
SELECT t1.*
FROM Table1 t1
ORDER BY (
SELECT COALESCE(SUM(CASE t2.positive WHEN true THEN 1 WHEN false THEN -1 END) , 0)
FROM Table2 t2
WHERE t2.related_tableone_id = t1.id
) DESC;
See the demo (works in MySQL, PostgreSQL and SQLite).
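As a quick check of the conditional-aggregation approach, the sketch below loads the sample rows from the question into an in-memory SQLite database (through Python's sqlite3) and exposes the score as a column so the ordering is visible. The `score` alias is illustrative, and positive is stored as 1/0 rather than true/false to stay compatible with older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableone (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE tabletwo (
    id INTEGER PRIMARY KEY,
    related_tableone_id INTEGER,
    positive INTEGER  -- 1 = upvote, 0 = downvote
);
INSERT INTO tableone VALUES (0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three');
INSERT INTO tabletwo VALUES
    (0, 1, 0), (1, 2, 1), (2, 2, 0), (3, 2, 1), (4, 3, 1), (5, 3, 1);
""")

# Each upvote contributes +1 and each downvote -1; rows without votes
# produce a NULL sum, which COALESCE turns into 0.
rows = conn.execute("""
SELECT t1.id, t1.data,
       COALESCE(SUM(CASE t2.positive WHEN 1 THEN 1 WHEN 0 THEN -1 END), 0) AS score
FROM tableone t1
LEFT JOIN tabletwo t2 ON t2.related_tableone_id = t1.id
GROUP BY t1.id, t1.data
ORDER BY score DESC
""").fetchall()

for row in rows:
    print(row)
# (3, 'three', 2)
# (2, 'two', 1)
# (0, 'zero', 0)
# (1, 'one', -1)
```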
A single subquery can count both upvotes and downvotes, using conditional aggregation. I would use a lateral join to do the computation:
select t1.*, t2.*
from tableone t1
cross join lateral (
select
sum(case when t2.positive = true then 1 else 0 end) upvotes,
sum(case when t2.positive = false then 1 else 0 end) downvotes
from tabletwo t2
where t2.related_tableone_id = t1.id
) t2
order by t2.upvotes - t2.downvotes desc, t1.id
Depending on your database, the lateral join might be introduced by cross apply instead (e.g. in Oracle or SQL Server).
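As for why the query in the question miscounts: joining the positives and the negatives to the same row multiplies them, so an entity with 2 upvotes and 1 downvote yields 2 x 1 = 2 joined rows, and both COUNTs then see 2 non-NULL values. A minimal sketch reproducing this on the question's sample data (SQLite via Python's sqlite3, with positive stored as 1/0; the aliases p and n are shorthand for the two subqueries):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableone (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE tabletwo (id INTEGER PRIMARY KEY, related_tableone_id INTEGER, positive INTEGER);
INSERT INTO tableone VALUES (0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three');
INSERT INTO tabletwo VALUES
    (0, 1, 0), (1, 2, 1), (2, 2, 0), (3, 2, 1), (4, 3, 1), (5, 3, 1);
""")

# Same shape as the query in the question, but showing both counts separately.
rows = conn.execute("""
SELECT t1.id,
       COUNT(p.related_tableone_id) AS pos_count,
       COUNT(n.related_tableone_id) AS neg_count
FROM tableone t1
LEFT JOIN (SELECT related_tableone_id FROM tabletwo WHERE positive = 1) p
       ON t1.id = p.related_tableone_id
LEFT JOIN (SELECT related_tableone_id FROM tabletwo WHERE positive = 0) n
       ON t1.id = n.related_tableone_id
GROUP BY t1.id
ORDER BY t1.id
""").fetchall()

print(rows)  # [(0, 0, 0), (1, 0, 1), (2, 2, 2), (3, 2, 0)]
```

Entity 2 really has 2 upvotes and 1 downvote, but the joined rows report 2 and 2, so the difference comes out as 0 instead of 1. Aggregating in a single pass over tabletwo avoids the multiplication.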

How to avoid `missing expression` error in sql

I'd like to count the number of each product with the following SQL,
but I get the following error:
ORA-00936: Missing Expression
What is wrong with my SQL?
If anyone has an idea, please let me know.
Thanks
select t.customer,
count(case when left(t.product,2)='AD' then 1 end) as A,
count(case when left(t.product,1)='F' then 1 end) as B,
count(case when left(t.product,1)='H' then 1 end) as C,
from table t
left join table2 t2
on t.customer = t2.customer
where t2.type='VLI'
group by t.customer
Oracle doesn't have a LEFT() function, while MySQL does; use SUBSTR() instead. Also remove the comma at the end of the fourth line, which is a typo:
SELECT t.customer,
COUNT(CASE
WHEN SUBSTR(t.product, 1, 2) = 'AD' THEN
1
END) AS A,
COUNT(CASE
WHEN SUBSTR(t.product, 1, 1) = 'F' THEN
1
END) AS B,
COUNT(CASE
WHEN SUBSTR(t.product, 1, 1) = 'H' THEN
1
END) AS C
FROM yourtable t
LEFT JOIN table2 t2
ON t.customer = t2.customer
WHERE t2.type = 'VLI'
GROUP BY t.customer
You have an extra comma at the end of the line count(case when left(t.product,1)='H' then 1 end) as C,.
Delete that last comma and this error will go away.
I would suggest LIKE:
select t.customer,
sum(case when t.product like 'AD%' then 1 else 0 end) as A,
sum(case when t.product like 'F%' then 1 else 0 end) as B,
sum(case when t.product like 'H%' then 1 else 0 end) as C
from table t left join
table2 t2
on t.customer = t2.customer and t2.type='VLI'
group by t.customer;
Notes:
With like, you don't have to worry about an argument to a substring function matching the length of the matched string. I learned about the issues of inconsistencies there a long, long, long time ago.
I prefer sum() in this case over count(), but that is mostly aesthetic.
You are using a left join, but filtering in the where clause. This turns the join into an inner join. Either change the join to inner join or move the condition to the on clause (as this does).
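The last note can be seen on a small sample. This is a minimal sketch using SQLite through Python's sqlite3 rather than Oracle; the table names `purchases` and `customer_types` and all rows are hypothetical stand-ins for `table` and `table2`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (customer TEXT, product TEXT);
CREATE TABLE customer_types (customer TEXT, type TEXT);
INSERT INTO purchases VALUES ('c1', 'AD1'), ('c1', 'H2'), ('c2', 'H9');
INSERT INTO customer_types VALUES ('c1', 'VLI'), ('c2', 'XYZ');
""")

count_h = "SUM(CASE WHEN t.product LIKE 'H%' THEN 1 ELSE 0 END) AS C"

# Filter in WHERE: rows without a matching 'VLI' type are removed after the
# join, so the LEFT JOIN behaves like an INNER JOIN and c2 disappears.
where_rows = conn.execute(f"""
SELECT t.customer, {count_h}
FROM purchases t
LEFT JOIN customer_types t2 ON t.customer = t2.customer
WHERE t2.type = 'VLI'
GROUP BY t.customer
ORDER BY t.customer
""").fetchall()

# Filter in ON: every purchases row is kept, matched or not.
on_rows = conn.execute(f"""
SELECT t.customer, {count_h}
FROM purchases t
LEFT JOIN customer_types t2 ON t.customer = t2.customer AND t2.type = 'VLI'
GROUP BY t.customer
ORDER BY t.customer
""").fetchall()

print(where_rows)  # [('c1', 1)]
print(on_rows)     # [('c1', 1), ('c2', 1)]
```

Which of the two results is wanted depends on whether customers without the 'VLI' type should appear at all; the point is only that the two placements are not equivalent.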

SQL using COUNT and CASE for related table - error is returned

I have table T1:
ID GROUPINFO STATUS
1 GROUP1 NEW
2 GROUP1 INPROG
3 GROUP2 INPROG
I have table T2 also
T2ID T1ID STATUS
1 1 NEW
2 2 NEW
3 2 VENDOR
4 3 NEW
5 3 VENDOR
I want to count, per group, how many of those records have the status NEW, and how many of those records were ever in the status VENDOR (information from the T2 table).
This SQL works
SELECT T1.GROUPINFO,
count (case when T1.status='NEW' then 1 end) as New
FROM T1
GROUP BY T1.GROUPINFO
However, when I try to add how many of those records were in the VENDOR status, the query fails:
SELECT T1.GROUPINFO,
count (case when T1.status='NEW' then 1 end) as New,
count (case when T1.ID in (select T2.ID from T2 where T2.status='VENDOR') then 1 end) as Vendor
FROM T1
GROUP BY T1.GROUPINFO
The error is:
Bad Plan; Unresolved QNC found
I am not sure what I am doing wrong, or how I should organize my SQL query to get this result for this example:
GROUPINFO NEW VENDOR
GROUP1 1 1
GROUP2 0 1
If an ID exists multiple times in T2, it should be counted only once, because it is a single record for which I want to check whether it was ever in that status.
Update:
I also tried EXISTS:
SELECT T1.GROUPINFO,
count (case when T1.status='NEW' then 1 end) as New,
count (case when exists (select 1 from T2 where T2.ID=T1.ID and T2.status='VENDOR') then 1 end) as Vendor
FROM T1
GROUP BY T1.GROUPINFO
but I again get an SQL error, this time regarding the ID column.
Update:
I also tried EXISTS and JOIN:
SELECT T1.GROUPINFO,
count (case when T1.status='NEW' then 1 end) as New,
count (case when exists (select 1 from T2 where T2.ID=T1.ID and T2.status='VENDOR') then 1 end) as Vendor
FROM T1 LEFT OUTER JOIN T2 ON T1.ID=T2.ID
GROUP BY T1.GROUPINFO
but I again get an SQL error regarding the ID column.
select t3.groupinfo,
count (distinct(case when t3.status='NEW' then t3.id end)) as new,
count (distinct(case when t3.status_1='VENDOR' then t3.id end)) as Vendor
from
(select t1.groupinfo,t1.id,t1.status,t2.t2id as id1,t2.status as status_1 from
t1 join t2
on t1.id=t2.t1id
) as t3
group by t3.groupinfo;
Something similar to this might help you, I guess.
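A minimal sketch of the join-plus-COUNT(DISTINCT) idea, run against the sample rows from the question in an in-memory SQLite database (through Python's sqlite3). Note that T2 as shown has no ID column, only T2ID and T1ID, so the join here is on T2.T1ID. The LEFT JOIN and the aliases new_count and vendor_count are choices made for this sketch, and whether the asker's database engine accepts the same statement is not something it can confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (ID INTEGER PRIMARY KEY, GROUPINFO TEXT, STATUS TEXT);
CREATE TABLE T2 (T2ID INTEGER PRIMARY KEY, T1ID INTEGER, STATUS TEXT);
INSERT INTO T1 VALUES (1, 'GROUP1', 'NEW'), (2, 'GROUP1', 'INPROG'), (3, 'GROUP2', 'INPROG');
INSERT INTO T2 VALUES
    (1, 1, 'NEW'), (2, 2, 'NEW'), (3, 2, 'VENDOR'), (4, 3, 'NEW'), (5, 3, 'VENDOR');
""")

# The join repeats a T1 row once per T2 row, so DISTINCT on T1.ID makes sure
# each record is counted at most once per column.
rows = conn.execute("""
SELECT T1.GROUPINFO,
       COUNT(DISTINCT CASE WHEN T1.STATUS = 'NEW' THEN T1.ID END) AS new_count,
       COUNT(DISTINCT CASE WHEN T2.STATUS = 'VENDOR' THEN T1.ID END) AS vendor_count
FROM T1
LEFT JOIN T2 ON T2.T1ID = T1.ID
GROUP BY T1.GROUPINFO
ORDER BY T1.GROUPINFO
""").fetchall()

print(rows)  # [('GROUP1', 1, 1), ('GROUP2', 0, 1)]
```

The LEFT JOIN keeps T1 records that have no T2 rows at all, so they still count toward the NEW column.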

SQL JOIN WITH TWO WHERE CLAUSES

I would like to rewrite the following SQL query, presumably using a JOIN clause, because it currently runs quite slowly:
SELECT ID_USER, NICK
FROM TABLE1
WHERE ID_USER IN
(
SELECT ID_INDEX1
FROM TABLE2
WHERE ID_INDEX2 = '2'
)
AND ID_USER NOT IN
(
SELECT ID_INDEX2
FROM TABLE2
WHERE ID_INDEX1 = '2' AND GO ='NO'
)
ORDER BY NICK ASC
You could do the "including" part with INNER JOIN and the "excluding" part with a "LEFT JOIN" + filtering:
SELECT DISTINCT t1.ID_USER, t1.NICK
FROM TABLE1 t1
INNER JOIN TABLE2 t2IN
ON t1.ID_USER = t2IN.ID_INDEX1
AND t2IN.ID_INDEX2 = '2'
LEFT JOIN TABLE2 t2OUT
ON t1.ID_USER = t2OUT.ID_INDEX2
AND t2OUT.ID_INDEX1 = '2'
AND t2OUT.GO = 'NO'
WHERE t2OUT.ID_INDEX2 IS NULL
ORDER BY t1.NICK ASC
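To compare the join form with the original IN / NOT IN query, here is a minimal sketch on hypothetical rows, using SQLite through Python's sqlite3. The anti-join filter tests the join column ID_INDEX2 for NULL, on the assumption that this is the column the IS NULL check is meant to use.

```python
import sqlite3

# Hypothetical rows: user 10 qualifies (twice), user 11 qualifies but is
# excluded by a GO = 'NO' row, user 12 never qualifies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE1 (ID_USER TEXT, NICK TEXT);
CREATE TABLE TABLE2 (ID_INDEX1 TEXT, ID_INDEX2 TEXT, GO TEXT);
INSERT INTO TABLE1 VALUES ('10', 'anna'), ('11', 'bob'), ('12', 'cara');
INSERT INTO TABLE2 VALUES
    ('10', '2', 'YES'), ('10', '2', 'NO'), ('11', '2', 'YES'),
    ('2', '11', 'NO'), ('2', '10', 'YES');
""")

in_rows = conn.execute("""
SELECT ID_USER, NICK
FROM TABLE1
WHERE ID_USER IN (SELECT ID_INDEX1 FROM TABLE2 WHERE ID_INDEX2 = '2')
  AND ID_USER NOT IN (SELECT ID_INDEX2 FROM TABLE2
                      WHERE ID_INDEX1 = '2' AND GO = 'NO')
ORDER BY NICK ASC
""").fetchall()

# INNER JOIN keeps qualifying users; LEFT JOIN + IS NULL drops excluded ones.
# DISTINCT removes the duplicates created by multiple qualifying rows.
join_rows = conn.execute("""
SELECT DISTINCT t1.ID_USER, t1.NICK
FROM TABLE1 t1
INNER JOIN TABLE2 t2IN
        ON t1.ID_USER = t2IN.ID_INDEX1 AND t2IN.ID_INDEX2 = '2'
LEFT JOIN TABLE2 t2OUT
       ON t1.ID_USER = t2OUT.ID_INDEX2
      AND t2OUT.ID_INDEX1 = '2' AND t2OUT.GO = 'NO'
WHERE t2OUT.ID_INDEX2 IS NULL
ORDER BY t1.NICK ASC
""").fetchall()

print(in_rows, join_rows)  # [('10', 'anna')] [('10', 'anna')]
```

Whether the join form is actually faster depends on the database and its indexes, so it is worth comparing execution plans on the real data.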
Assuming that you want to filter by ID_INDEX1 in both cases (see my comment on your question), you can:
count the number of rows per user in table2 with id_index2 = '2'
count the number of rows per user in table2 with id_index2 = '2' and go = 'NO'
return only those users for which the first count is greater than 0 and the second count equals 0
i.e.:
select * from (
select
id_user,
nick,
sum(case when table2.id_index2 = '2' then 1 else 0 end) as count2_overall,
sum(case when table2.id_index2 = '2' and go = 'NO' then 1 else 0 end) as count2_no
from table1
join table2 on table1.id_user = table2.id_index1
group by id_user, nick
) counts
where count2_overall > 0 and count2_no = 0

GROUP BY CLAUSE USE

I have a table with different payment method types: CHECK, MONEY ORDER, CASH. I need to show the breakdown by payment type; my query follows. Could anyone suggest or comment on how to do this optimally?
select
COUNT(T3.PAY_METHOD),
T1.CLAIM_ID
from TB1 T3
JOIN TB2 T2 ON T3.CLAIM_ID = T2.CLAIM_ID
GROUP BY T3.PAYMENT_METHOD_CD
The output columns should look like the following:
|CHECK|MONEY ORDER|CASHIER'S CHECK|CASH|CREDIT CARD|
Displays the total count of all payments received via CHECK that were applied to the specific LIABLE INDIVIDUAL.
If you really need the output as columns, then you will want a SUM(case when method = '...' then 1 else 0 end) type of solution, like this:
select t2.claim_id,
sum(case when t3.pay_method = 'CASH' then 1 else 0 end) as "CASH",
sum(case when t3.pay_method = 'MONEY ORDER' then 1 else 0 end) as "MONEY ORDER",
sum(case when t3.pay_method = 'CHECK' then 1 else 0 end) as "CHECK"
from TB1 T3
JOIN TB2 T2 ON T3.CLAIM_ID = T2.CLAIM_ID
GROUP BY T2.CLAIM_ID
But it would be much simpler if you just want each of them as rows:
select T2.CLAIM_ID, T3.PAY_METHOD, COUNT(*)
from TB1 T3
JOIN TB2 T2 ON T3.CLAIM_ID = T2.CLAIM_ID
GROUP BY T2.CLAIM_ID, T3.PAY_METHOD
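Both shapes can be tried on a small sample. This is a minimal sketch using SQLite through Python's sqlite3; the rows are hypothetical, the column aliases (cash_count and so on) are illustrative, and the claim id is taken from the T2 alias since that is one of the two aliases the queries define.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TB2 (CLAIM_ID INTEGER PRIMARY KEY);
CREATE TABLE TB1 (CLAIM_ID INTEGER, PAY_METHOD TEXT);
INSERT INTO TB2 VALUES (1), (2);
INSERT INTO TB1 VALUES (1, 'CASH'), (1, 'CHECK'), (1, 'CHECK'), (2, 'MONEY ORDER');
""")

# One column per payment method (conditional aggregation).
as_columns = conn.execute("""
SELECT T2.CLAIM_ID,
       SUM(CASE WHEN T3.PAY_METHOD = 'CASH' THEN 1 ELSE 0 END) AS cash_count,
       SUM(CASE WHEN T3.PAY_METHOD = 'MONEY ORDER' THEN 1 ELSE 0 END) AS money_order_count,
       SUM(CASE WHEN T3.PAY_METHOD = 'CHECK' THEN 1 ELSE 0 END) AS check_count
FROM TB1 T3
JOIN TB2 T2 ON T3.CLAIM_ID = T2.CLAIM_ID
GROUP BY T2.CLAIM_ID
ORDER BY T2.CLAIM_ID
""").fetchall()

# One row per claim and payment method.
as_rows = conn.execute("""
SELECT T2.CLAIM_ID, T3.PAY_METHOD, COUNT(*)
FROM TB1 T3
JOIN TB2 T2 ON T3.CLAIM_ID = T2.CLAIM_ID
GROUP BY T2.CLAIM_ID, T3.PAY_METHOD
ORDER BY T2.CLAIM_ID, T3.PAY_METHOD
""").fetchall()

print(as_columns)  # [(1, 1, 0, 2), (2, 0, 1, 0)]
print(as_rows)     # [(1, 'CASH', 1), (1, 'CHECK', 2), (2, 'MONEY ORDER', 1)]
```

The column form needs one CASE per known method, while the row form picks up new methods automatically and leaves the pivoting to the caller.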
The SQL would be:
select T3.PAY_METHOD
, COUNT(T3.PAY_METHOD) records
from TB1 T3
JOIN TB2 T2 ON T3.CLAIM_ID = T2.CLAIM_ID
GROUP BY T3.PAY_METHOD
You can use application code to format the output.