What does THEN do in this CASE statement? - sql

Hi folks!
Could someone please explain this CASE statement to me? I'm puzzled by the THEN user_id part; what exactly does it do?
SELECT modal_text,
COUNT(DISTINCT CASE
WHEN ab_group = 'control' THEN user_id
END) AS 'control_clicks'
FROM onboarding_modals
GROUP BY 1
ORDER BY 1;
Thanks in advance!

This is simple aggregation:
COUNT(DISTINCT user_id)
and it counts all the distinct non-null user_ids.
But this is conditional aggregation:
COUNT(DISTINCT CASE WHEN ab_group = 'control' THEN user_id END)
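and it counts the distinct non-null user_ids only from rows where the column ab_group contains the value 'control'. For every other row the CASE expression has no matching WHEN and no ELSE, so it returns NULL, which COUNT ignores.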
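To make the difference concrete, here is a minimal sketch that runs both aggregations side by side. It assumes SQLite through Python's sqlite3 module, and the sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not the asker's data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE onboarding_modals (modal_text TEXT, ab_group TEXT, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO onboarding_modals VALUES (?, ?, ?)",
    [
        ("A", "control", 1),
        ("A", "control", 1),  # same user again: DISTINCT counts it once
        ("A", "control", 2),
        ("A", "variant", 3),  # CASE returns NULL here, so COUNT skips it
        ("B", "variant", 4),
    ],
)

rows = conn.execute(
    """
    SELECT modal_text,
           COUNT(DISTINCT user_id) AS all_users,
           COUNT(DISTINCT CASE WHEN ab_group = 'control' THEN user_id END) AS control_users
    FROM onboarding_modals
    GROUP BY modal_text
    ORDER BY modal_text
    """
).fetchall()

for row in rows:
    print(row)
```

With these rows, modal A has three distinct users overall but only two in the control group, and modal B has one user and none in control.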

For an A/B test, the select statement is trying to find the count of distinct users in the control group.
So instead of counting all distinct users for each modal_text, the CASE counts a user only if it is in the control group, i.e. the column ab_group = 'control'.

THEN is part of the CASE expression: it gives the value that is returned when the WHEN condition is true.
To put it more plainly:
if the ab_group column has the value 'control', the expression returns the user_id column.
It's similar to an if statement:
if (ab_group = 'control')
{
user_id
}
See the link below to learn more:
https://www.w3schools.com/sql/sql_case.asp

Related

Sum of multiple select count distinct with case function

I try to make a sum of multiple select count distinct with case function. For example:
SELECT id_dept,
count(DISTINCT case when e.statut='pub' then id_patients end) AS nb_patients_pub,
count(DISTINCT case when e.statut='priv' then id_patients end) AS nb_patients_priv
FROM venues
I would like to combine these two results into a single column.
Is it possible?
I think that you want IN:
SELECT
id_dept,
COUNT(DISTINCT CASE WHEN e.statut IN ('pub', 'priv') THEN id_patients END) AS nb_patients_pub_and_venues
FROM venues
GROUP BY id_dept
Note that I added a GROUP BY clause to the query, which was initially missing (this is a syntax error in almost all databases).
Depending on your data, this might not do exactly what you want: if a given id_patients value has both statuses, it will be counted only once, whereas your code counted it once in each count(distinct ...). If that matters, you can keep the two separate counts and sum them:
SELECT
id_dept,
COUNT(DISTINCT CASE WHEN e.statut = 'pub' THEN id_patients END)
+ COUNT(DISTINCT CASE WHEN e.statut = 'priv' THEN id_patients END)
AS nb_patients_pub_and_venues
FROM venues
GROUP BY id_dept
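The two variants only differ when a patient appears under both statuses. The following is a minimal sketch of that case, assuming SQLite through Python's sqlite3 module and invented sample rows; the `e.` alias from the question is dropped because this toy table has no join.

```python
import sqlite3

# Made-up rows (assumption): patient 10 in department 1 has both statuses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE venues (id_dept INTEGER, statut TEXT, id_patients INTEGER)")
conn.executemany(
    "INSERT INTO venues VALUES (?, ?, ?)",
    [
        (1, "pub", 10),
        (1, "priv", 10),  # same patient, second status
        (1, "pub", 11),
        (2, "priv", 20),
    ],
)

rows = conn.execute(
    """
    SELECT id_dept,
           -- one count over both statuses: each patient counted once
           COUNT(DISTINCT CASE WHEN statut IN ('pub', 'priv') THEN id_patients END) AS combined,
           -- two counts added together: a patient with both statuses counted twice
           COUNT(DISTINCT CASE WHEN statut = 'pub' THEN id_patients END)
         + COUNT(DISTINCT CASE WHEN statut = 'priv' THEN id_patients END) AS summed
    FROM venues
    GROUP BY id_dept
    ORDER BY id_dept
    """
).fetchall()

for row in rows:
    print(row)
```

For department 1 the combined count is 2 while the summed count is 3, because patient 10 is counted under both 'pub' and 'priv'. For department 2 both give 1.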
If you're happy with the current code, then either sum those counts (using +), or use that query as a CTE (or an inline view) and add the two columns in the outer query:
with test as
(SELECT id_dept,
count(DISTINCT case when e.statut='pub' then id_patients end)
AS nb_patients_pub,
count(DISTINCT case when e.statut='priv' then id_patients end)
AS nb_patients_priv
FROM venues
GROUP BY id_dept
)
select id_dept, nb_patients_pub + nb_patients_priv as result
from test;

count query with group by but exclude users

I have made a sql query:
select Count (UserId)
from PlanData
where AirlineCode='cl'
and DateStart>'2019-07-01'
and DateEnd<'2019-07-31'
group by UserId
This gives me wrong results, because I would actually like to exclude from the query entirely any UserId that has GeneralEventCode 'code1' or 'code2' in any of its rows.
From your comment I think that you need a HAVING clause and not a WHERE clause:
select Count(UserId)
from PlanData
where AirlineCode='cl'
and DateStart>'2019-07-01'
and DateEnd<'2019-07-31'
group by UserId
having Count(case when GeneralEventCode in ('code1', 'code2') then 1 end) = 0
This conditional COUNT() in the HAVING clause will filter out any user with any occurrence of 'code1' or 'code2' in the column GeneralEventCode.
Also recheck your condition about the dates, maybe you need >= and <=.
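As an illustration of how the HAVING filter drops whole users, here is a minimal sketch assuming SQLite through Python's sqlite3 module and invented sample rows. UserId is added to the select list only to make the output readable; that is not part of the original query.

```python
import sqlite3

# Made-up rows (assumption): users 1 and 3 each have an excluded event code.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PlanData (
        UserId INTEGER, AirlineCode TEXT,
        DateStart TEXT, DateEnd TEXT, GeneralEventCode TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO PlanData VALUES (?, ?, ?, ?, ?)",
    [
        (1, "cl", "2019-07-05", "2019-07-10", "code1"),
        (1, "cl", "2019-07-06", "2019-07-11", "other"),
        (2, "cl", "2019-07-05", "2019-07-10", "other"),
        (2, "cl", "2019-07-07", "2019-07-12", "other"),
        (3, "cl", "2019-07-08", "2019-07-09", "code2"),
    ],
)

rows = conn.execute(
    """
    SELECT UserId, COUNT(UserId) AS n
    FROM PlanData
    WHERE AirlineCode = 'cl'
      AND DateStart > '2019-07-01'
      AND DateEnd < '2019-07-31'
    GROUP BY UserId
    -- keep a user only if none of its rows carries an excluded code
    HAVING COUNT(CASE WHEN GeneralEventCode IN ('code1', 'code2') THEN 1 END) = 0
    ORDER BY UserId
    """
).fetchall()

print(rows)
```

Only user 2 survives, with both of its rows counted. Note that HAVING sees only the rows that already passed the WHERE clause, so a 'code1' row outside the date range would not exclude its user.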

Case Statements with Any

I have some E_ids which are linked to a couple of d_ids and with o_count in any of (1,0,null).
So if any of the E_IDs have an O_count = 1, I have to club it into one row and write the O_count = 1 for that E_ID else 0.
But when I do the below, I get all the rows without the grouping, i.e, I get two rows of the same e_ids. Is there any other way to do the same?
SELECT DISTINCT E_ID, status
(CASE WHEN o_count = any(1) THEN 1
WHEN o_count = any(0) THEN 0
ELSE null END
) Ocount
FROM (SELECT e_id, status, o_count FROM A)
GROUP BY e_id, status, o_count
Yes, just wrap it with MAX():
SELECT E_ID, status,
MAX(case when o_count = 1 then 1 ELSE 0 END) as Ocount
FROM A
GROUP BY e_id,status
Also, the subquery was unnecessary; you are not doing any logic in there.
First of all you group by e_id, status, o_count. That means you aggregate your data such that you get one row for each such combination. Don't you rather want to get one result row per e_id alone or maybe e_id plus status?
Then you have that case construct not containing any aggregate function, but only the o_count which is part of your group by clause. So you are looking at one row where you want o_count = any(1) which is exactly the same as o_count = 1 of course, because there is only one value in the specified set. You can replace the complete case expression with a mere o_count.
Then you apply distinct. But there can be no duplicates, as you are grouping by all columns used. So distinct doesn't do anything here.
Selecting from a subquery without any where clause or aggregation is also superfluous and you can select from table a directly.
Your query can be re-written as
select distinct e_id, status, o_count
from a;
I suppose you want something like this instead:
select e_id, status, max(o_count)
from a
group by e_id, status;
Or this:
select e_id, max(status), max(o_count)
from a
group by e_id;
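The two suggestions, MAX(CASE ...) and a plain MAX(o_count), behave differently when an e_id has only NULL counts. Here is a minimal sketch of that difference, assuming SQLite through Python's sqlite3 module and invented sample rows.

```python
import sqlite3

# Made-up rows (assumption): e_id 3 has only a NULL o_count.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (e_id INTEGER, d_id INTEGER, status TEXT, o_count INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?, ?)",
    [
        (1, 100, "x", 1),
        (1, 101, "x", 0),
        (2, 200, "y", 0),
        (2, 201, "y", None),
        (3, 300, "z", None),
    ],
)

rows = conn.execute(
    """
    SELECT e_id, status,
           MAX(CASE WHEN o_count = 1 THEN 1 ELSE 0 END) AS flag,  -- never NULL
           MAX(o_count) AS max_count                              -- NULL if all inputs are NULL
    FROM a
    GROUP BY e_id, status
    ORDER BY e_id
    """
).fetchall()

for row in rows:
    print(row)
```

Both columns agree for e_id 1 and 2, but for e_id 3 the CASE version returns 0 while MAX(o_count) returns NULL, so the choice depends on whether "no data" should read as 0.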

Using sql, how can I simultaneously count the filtered and unfiltered occurrences of an attribute?

So this is probably a simple question, but here goes. I have some filtered data that I get via a query like this:
SELECT DISTINCT account_id, count(*) as filtered_count
FROM my_table
WHERE attribute LIKE '%filter%'
GROUP BY account_id
ORDER BY account_id
This gives me an output table with two columns.
I'd like to add a third column,
count(*) as total_count
that counts the total number of occurrences of each account_id in the entire table (ignoring the filter).
How can I write the query for this three column table?
You can put a case expression inside the count function, then remove your where clause:
SELECT account_id,
count(case when attribute LIKE '%filter%' then 1 end) as filtered_count,
count(*) as total_count
FROM my_table
GROUP BY account_id
ORDER BY account_id;
Using DISTINCT, although not actually harmful to your query, was redundant due to the grouping, so I have removed it.
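Here is a minimal sketch comparing the original WHERE-based query with the conditional count, assuming SQLite through Python's sqlite3 module and invented sample rows.

```python
import sqlite3

# Made-up rows (assumption): account 2 has no attribute matching the filter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (account_id INTEGER, attribute TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, "has filter here"),
        (1, "nothing"),
        (1, "filter"),
        (2, "nothing"),
    ],
)

# Original approach: WHERE removes non-matching rows before grouping.
where_rows = conn.execute(
    """
    SELECT account_id, COUNT(*) AS filtered_count
    FROM my_table
    WHERE attribute LIKE '%filter%'
    GROUP BY account_id
    ORDER BY account_id
    """
).fetchall()

# Conditional aggregation: every row is kept, only matching ones are counted.
case_rows = conn.execute(
    """
    SELECT account_id,
           COUNT(CASE WHEN attribute LIKE '%filter%' THEN 1 END) AS filtered_count,
           COUNT(*) AS total_count
    FROM my_table
    GROUP BY account_id
    ORDER BY account_id
    """
).fetchall()

print(where_rows)
print(case_rows)
```

Besides adding the total, the conditional version also keeps account 2 in the output with a filtered count of 0, whereas the WHERE version drops it entirely.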
You'll have to use a case statement for counting with your filter:
SELECT DISTINCT account_id,
count(case when attribute LIKE '%filter%' then 1 else null end) as filtered_count,
count (*)
FROM my_table
GROUP BY account_id
ORDER BY account_id

COUNT() doesn't work with GROUP BY?

SELECT COUNT(*) FROM table GROUP BY column
I get the total number of rows from table, not the number of rows after GROUP BY. Why?
Because that is how group by works. It returns one row for each identified group of rows in the source data. In this case, it will give the count for each of those groups.
To get what you want:
select count(distinct column)
from table;
EDIT:
As a slight note, if column can be NULL, then the real equivalent is:
select (count(distinct column) +
max(case when column is null then 1 else 0 end)
)
from table;
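To check the NULL adjustment against the actual number of groups, here is a minimal sketch assuming SQLite through Python's sqlite3 module. The names `t` and `col` are hypothetical stand-ins for the placeholder names table and column, which are reserved words, and the sample rows are invented.

```python
import sqlite3

# Made-up rows (assumption): two distinct values plus some NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("a",), ("a",), ("b",), (None,), (None,)],
)

# Number of groups GROUP BY produces: NULLs form one group of their own.
group_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT col FROM t GROUP BY col)"
).fetchone()[0]

# COUNT(DISTINCT ...) ignores NULL entirely.
distinct_count = conn.execute("SELECT COUNT(DISTINCT col) FROM t").fetchone()[0]

# Adding 1 when at least one NULL exists makes the two agree.
adjusted_count = conn.execute(
    """
    SELECT COUNT(DISTINCT col)
         + MAX(CASE WHEN col IS NULL THEN 1 ELSE 0 END)
    FROM t
    """
).fetchone()[0]

print(group_count, distinct_count, adjusted_count)
```

With these rows there are three groups ('a', 'b' and NULL), COUNT(DISTINCT col) reports 2, and the adjusted expression reports 3.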
Try this:
SELECT COUNT(*), column
FROM table
GROUP BY column