Using CASE to properly count items with if/else logic in SQL

Say I have a table named table of the form:
| user | class |
|------|-------|
| 1 | a |
| 1 | b |
| 1 | b |
| 2 | b |
| 3 | a |
There are only two classes.
I want to write a query that counts the number of users in each class, where any user who has both labels a and b is counted under a, any user with only a is counted under a, and any user with only b is counted under b. Applied to the table snippet above, the result would be:
| class | count |
|-------|-------|
| a | 2 |
| b | 1 |
Also acceptable is the transpose, like:
| a | b |
|---|---|
| 2 | 1 |
My current solution involves two CTEs:
WITH a_users AS
(
SELECT
user,
SUM(CASE WHEN class = 'a' THEN 1 ELSE 0 END) AS a_class
FROM
table
WHERE
class in ('a', 'b')
GROUP BY
user
),
labeled_users as (
SELECT
user,
CASE WHEN a_class >=1 then 'a' ELSE 'b' END as label
FROM
a_users
)
SELECT
label,
COUNT(DISTINCT user)
FROM
labeled_users
GROUP BY
label;
Is there (1) a more efficient way to solve this or (2) a more concise/readable solution?

Basically, you want "a" for a user who has "a" at all. A subquery is the first approach:
select sum(case when num_as > 0 then 1 else 0 end) as num_class_a,
sum(case when num_as = 0 then 1 else 0 end) as num_class_b
from (select user, sum(case when class = 'a' then 1 else 0 end) as num_as
from t
group by user
) t;
With a little trick, you can eliminate the subquery:
select count(distinct case when class = 'a' then user end) as num_as,
count(distinct user) - count(distinct case when class = 'a' then user end) as num_bs
from t;
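As a quick check of the subquery-free version, here is a minimal sketch that runs it against the sample rows from the question in an in-memory SQLite database. The table name `t` follows the answer; the column name `user_id` is an assumption, chosen because `user` is a reserved word in several databases.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
# Assumption: the column is called user_id rather than user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_id INTEGER, class TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "b"), (2, "b"), (3, "a")],
)

# Users with at least one 'a' row count as a; every remaining user counts as b.
num_as, num_bs = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE WHEN class = 'a' THEN user_id END) AS num_as,
           COUNT(DISTINCT user_id)
             - COUNT(DISTINCT CASE WHEN class = 'a' THEN user_id END) AS num_bs
    FROM t
    """
).fetchone()

print(num_as, num_bs)  # prints: 2 1
```

The CASE expression yields NULL for non-'a' rows, and COUNT(DISTINCT ...) ignores NULLs, which is why no subquery is needed.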

Something like this should work, if a and b really are your classes. Otherwise adjust the min/max as needed.
; with CTE as (
Select user, min(class) as Class
from Labeled_Users
group by user)
Select Class, count(*)
from CTE
group by Class

Here is a straightforward query that gets the job done using a subquery and conditional aggregation. It should return the second version of your expected result (pivoted):
SELECT
SUM(CASE WHEN x.minclass <> x.maxclass OR x.maxclass = 'a' THEN 1 ELSE 0 END) a,
SUM(CASE WHEN x.minclass = x.maxclass AND x.maxclass = 'b' THEN 1 ELSE 0 END) b
FROM (
SELECT user, MAX(class) maxclass, MIN(class) minclass
FROM mytable
GROUP BY user
) x
The subquery computes the minimum and maximum class of each user. The outer query then separately counts users:
a : users that belong to both classes or just to class a
b : users that belong to class b only
This is standard SQL syntax that will work on most RDBMSs, even those that do not support CTEs, such as pre-8.0 MySQL versions.
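To see what the two stages produce, the following sketch runs the min/max approach on the question's sample rows in an in-memory SQLite database. The column name `user_id` is an assumption (the question calls it `user`), and the outer query uses the subquery's alias names `minclass` and `maxclass` throughout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: user_id stands in for the question's user column.
conn.execute("CREATE TABLE mytable (user_id INTEGER, class TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "b"), (2, "b"), (3, "a")],
)

# Stage 1: smallest and largest class per user.
per_user = conn.execute(
    """
    SELECT user_id, MIN(class) AS minclass, MAX(class) AS maxclass
    FROM mytable
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

# Stage 2: a user is 'a' if the two differ (both classes) or both are 'a';
# a user is 'b' only if both are 'b'.
counts = conn.execute(
    """
    SELECT SUM(CASE WHEN x.minclass <> x.maxclass OR x.maxclass = 'a'
                    THEN 1 ELSE 0 END) AS a,
           SUM(CASE WHEN x.minclass = x.maxclass AND x.maxclass = 'b'
                    THEN 1 ELSE 0 END) AS b
    FROM (
        SELECT user_id, MIN(class) AS minclass, MAX(class) AS maxclass
        FROM mytable
        GROUP BY user_id
    ) x
    """
).fetchone()

print(per_user)  # [(1, 'a', 'b'), (2, 'b', 'b'), (3, 'a', 'a')]
print(counts)    # (2, 1)
```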

Using String_agg():
with usr_class as(
SELECT DISTINCT usr,
string_agg(txt,':') as all_class
FROM abc
GROUP BY usr
)
select count(usr),
case when POSITION('a' in all_class)>0 THEN 'a'
ELSE 'b'
END AS CLASS
FROM usr_class
GROUP BY case when POSITION('a' in all_class)>0 THEN 'a'
ELSE 'b'
END;

Related

Calculate difference in Oracle table from sum

I have a table which looks as follows:
ID | Value
A | 2
A | 5
A | 6
B | 1
B | 7
B | -3
I am currently using a statement as follows:
select ID, sum(VALUE)
where ...
group by ID.
Now I need the difference between the sums for A and B.
Could anyone point me in the right direction? I am working with Oracle.
Use conditional aggregation:
SELECT SUM(CASE WHEN id = 'A' THEN "Value" ELSE 0 END) -
SUM(CASE WHEN id = 'B' THEN "Value" ELSE 0 END) "Difference"
FROM tablename;
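A small sketch of the same conditional aggregation, run against the question's sample values in an in-memory SQLite database rather than Oracle. The table name `tablename` follows the answer; the lowercase, unquoted column names are an assumption made for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: plain lowercase column names instead of the quoted "Value".
conn.execute("CREATE TABLE tablename (id TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [("A", 2), ("A", 5), ("A", 6), ("B", 1), ("B", 7), ("B", -3)],
)

# Sum the A rows and the B rows in one pass, then subtract.
difference = conn.execute(
    """
    SELECT SUM(CASE WHEN id = 'A' THEN value ELSE 0 END)
         - SUM(CASE WHEN id = 'B' THEN value ELSE 0 END) AS difference
    FROM tablename
    """
).fetchone()[0]

print(difference)  # 13 - 5 = 8
```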

Improving query that counts distinct values that have particular values in another column

Say I have a table in the format:
| id | category|
|----|---------|
| 10 | A |
| 10 | B |
| 10 | C |
| 2 | C |
I want to count the number of distinct id's that have all three values A, B, and C in the category variable. In this case, the query would return 1 since only for id = 10 is this true.
My intuition is to write the following query to get this value:
SELECT
COUNT(DISTINCT id),
SUM(CASE WHEN category = 'A' THEN 1 else 0 END) AS A,
SUM(CASE WHEN category = 'B' THEN 1 else 0 END) AS B,
SUM(CASE WHEN category = 'C' THEN 1 else 0 END) AS C
FROM
table
GROUP BY
id
HAVING
A >= 1
AND
B >= 1
AND
C >= 1
This feels a bit overwrought though -- is there a simpler way to achieve the desired outcome?
You are close, but you need two levels of aggregation. Assuming no duplicate rows:
SELECT COUNT(*)
FROM (SELECT id
FROM t
WHERE Category IN ('A', 'B', 'C')
GROUP BY id
HAVING COUNT(*) = 3
) t;
I assume this is part of a larger table, that your id and category values can appear multiple times and still be distinct because of other fields, and that you know how many categories you're looking for.
SELECT ID, COUNT(ID)
FROM(
SELECT DISTINCT ID, CATEGORY
FROM TABLE)
GROUP BY ID
HAVING COUNT(ID) = 3 --or however many categories you want
Your subquery here removes extraneous info and forces your id to show up once per category. You then count up the number of times it shows up and look up the ones that show up 3 or however many times you want.
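The difference between the two answers only shows up when (id, category) pairs repeat. The sketch below, using an in-memory SQLite database, adds a few hypothetical duplicate rows to the question's sample data (they are not in the original) and compares counting raw rows with counting de-duplicated pairs. The table name `t` is taken from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (10, "A"), (10, "B"), (10, "C"), (2, "C"),  # rows from the question
        (10, "A"), (2, "C"), (2, "C"),              # hypothetical duplicates
    ],
)

# Counting raw rows: id 2 has three 'C' rows and is wrongly accepted,
# while id 10 has four rows and is wrongly rejected.
naive_ids = [r[0] for r in conn.execute(
    """
    SELECT id FROM t
    WHERE category IN ('A', 'B', 'C')
    GROUP BY id
    HAVING COUNT(*) = 3
    ORDER BY id
    """
)]

# De-duplicating (id, category) first makes COUNT(*) a count of categories.
dedup_ids = [r[0] for r in conn.execute(
    """
    SELECT id
    FROM (SELECT DISTINCT id, category
          FROM t
          WHERE category IN ('A', 'B', 'C')) d
    GROUP BY id
    HAVING COUNT(*) = 3
    ORDER BY id
    """
)]

print(naive_ids)  # [2]
print(dedup_ids)  # [10]
```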

How do you select a group that doesn't contain certain values but must have specific values

I have a table Order:
ID | State |
===================
1 | A |
1 | B |
1 | C |
1 | D |
1 | E |
2 | A |
2 | B |
2 | E |
3 | A |
3 | B |
3 | E |
4 | A |
4 | B |
4 | C |
4 | D |
From it I would like to select the group of Ids which must have state values B and E and must not have state values C or D.
From the above table, the correct result should be ids 2 and 3.
Thanks,
SELECT *
FROM Order
WHERE State IN ('B','E')
That's it. The fact that you're stating the value can only be 'B' or 'E' means you're already excluding any values of 'C' or 'D', or anything else really.
Hope this helps:
SELECT id FROM ORDER
WHERE STATE = E AND STATE = B
You may use the set operator: EXCEPT
SELECT ID FROM Order WHERE State IN ('B','E')
EXCEPT
SELECT ID FROM Order WHERE State IN ('C','D')
The following should work (there might be a better alternative using windowing functions or depending on the specific features available in your dbms)
SELECT ID FROM
(
SELECT
ID,
CASE STATE
WHEN 'B' THEN 'Y'
ELSE 'N'
END AS HasB,
CASE STATE
WHEN 'E' THEN 'Y'
ELSE 'N'
END AS HasE,
CASE STATE
WHEN 'C' THEN 'Y'
ELSE 'N'
END AS HasC,
CASE STATE
WHEN 'D' THEN 'Y'
ELSE 'N'
END AS HasD
FROM TABLE
)
GROUP BY ID
HAVING MAX(HasB) = 'Y' AND MAX(HasE) = 'Y' AND MAX(HasC) = 'N' AND MAX(HasD) = 'N'
A simple way to do this uses aggregation and a having clause:
select id
from t
group by id
having sum(case when state = 'B' then 1 else 0 end) > 0 and
sum(case when state = 'E' then 1 else 0 end) > 0 and
sum(case when state = 'C' then 1 else 0 end) = 0 and
sum(case when state = 'D' then 1 else 0 end) = 0;
Each condition in the having clause counts the number of times that a given value is present. The = 0 means there are no matches and > 0 means there is at least one.
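For reference, a minimal sketch of the aggregation-plus-HAVING approach on the question's sample data, run in an in-memory SQLite database. The table name `orders` is an assumption, used because `Order` is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the table is named orders, since ORDER is a reserved word.
conn.execute("CREATE TABLE orders (id INTEGER, state TEXT)")
rows = {
    1: "ABCDE",
    2: "ABE",
    3: "ABE",
    4: "ABCD",
}
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, s) for i, states in rows.items() for s in states],
)

# One group per id; each HAVING term counts how often a state occurs.
# > 0 means "must be present", = 0 means "must be absent".
ids = [r[0] for r in conn.execute(
    """
    SELECT id
    FROM orders
    GROUP BY id
    HAVING SUM(CASE WHEN state = 'B' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN state = 'E' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN state = 'C' THEN 1 ELSE 0 END) = 0
       AND SUM(CASE WHEN state = 'D' THEN 1 ELSE 0 END) = 0
    ORDER BY id
    """
)]

print(ids)  # [2, 3]
```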

Multiple Groupings with a sum

My Table :
ID | TIME OF CREATION | OWNER | STATE
1 2015-1-1 arpan A
2 2015-1-2 arpan B
My desired output from my query is:
DATE | OWNER | COUNT(STATE = A) | COUNT(STATE = B) | ...
I checked out SUM(CASE ...), but you can't group by date and sum by owner, right?
Stuck here. :(
Can someone help?
I think you just want conditional aggregation:
select date, owner, sum(case when state = 'A' then 1 else 0 end) as state_A,
sum(case when state = 'B' then 1 else 0 end) as state_b
from table t
group by date, owner;
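A sketch of that query in an in-memory SQLite database, to show that grouping by both date and owner works with per-state sums. The table name `tickets` and the column name `created` are assumptions, and the last two rows are hypothetical additions to the question's two sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumptions: table name tickets; created stands in for TIME OF CREATION.
conn.execute(
    "CREATE TABLE tickets (id INTEGER, created TEXT, owner TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?, ?)",
    [
        (1, "2015-01-01", "arpan", "A"),  # from the question
        (2, "2015-01-02", "arpan", "B"),  # from the question
        (3, "2015-01-01", "arpan", "A"),  # hypothetical
        (4, "2015-01-01", "bob", "B"),    # hypothetical
    ],
)

# One output row per (date, owner) pair, with one counter column per state.
result = conn.execute(
    """
    SELECT created, owner,
           SUM(CASE WHEN state = 'A' THEN 1 ELSE 0 END) AS state_a,
           SUM(CASE WHEN state = 'B' THEN 1 ELSE 0 END) AS state_b
    FROM tickets
    GROUP BY created, owner
    ORDER BY created, owner
    """
).fetchall()

for row in result:
    print(row)
```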

SQL - Return A when B is exactly x,y,z

Consider the following table with column A and B:
A | B
--+--
1 | A
1 | B
1 | C
2 | A
2 | B
3 | A
3 | C
4 | B
4 | C
I would like to get the value 2 from column A in case my set is [A,B].
IN
select a from table where b IN ('A','B'), that will return value 1 and 2.
Intersect
select a from table where b = 'A'
intersect
select a from table where b = 'B'
intersect
select a from table where b = 'C', that will return 1 however it will not work if I for instance remove the 'B' criteria and only look for [A,C]. Such a query will return 1 and 3.
Is there a smarter way of using sets with one to many relations, or perhaps another approach I just did not think of? I will be using Oracle btw in case any Oracle specific solution should be available.
This is an example of a "set-within-sets" query. I like to solve these using aggregation and a having clause for each condition. In this case, there are three conditions: Does a given value for A have a B with a value of 'A'? For 'B'? For anything else?
This results in the query:
select A
from t
group by A
having sum(case when B = 'A' then 1 else 0 end) > 0 and
sum(case when B = 'B' then 1 else 0 end) > 0 and
sum(case when B not in ('A', 'B') then 1 else 0 end) = 0;
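The same three-condition pattern can be checked on the question's data for both [A,B] and [A,C]. The sketch below uses an in-memory SQLite database; the helper `exact_match` is hypothetical and only wraps the query above with bound parameters for a two-element set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B"),
     (3, "A"), (3, "C"), (4, "B"), (4, "C")],
)


def exact_match(first, second):
    """Return the values of a whose set of b values is exactly {first, second}."""
    cur = conn.execute(
        """
        SELECT a
        FROM t
        GROUP BY a
        HAVING SUM(CASE WHEN b = ? THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN b = ? THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN b NOT IN (?, ?) THEN 1 ELSE 0 END) = 0
        ORDER BY a
        """,
        (first, second, first, second),
    )
    return [row[0] for row in cur]


print(exact_match("A", "B"))  # [2]
print(exact_match("A", "C"))  # [3]
```

The third condition is what excludes a = 1, which contains both requested values but also a third one.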
Try this
select A
from t
group by A
having count(*)=2 and min(B)='A' and Max(B)='B'