Case Statements with Any - sql

I have some E_IDs, each linked to a couple of d_ids, with o_count being one of (1, 0, null).
If any row for an E_ID has o_count = 1, I have to collapse that E_ID into one row with o_count = 1, else 0.
But when I run the query below, I get all the rows without the grouping, i.e., I get two rows for the same e_id. Is there another way to do this?
SELECT DISTINCT E_ID, status,
(CASE WHEN o_count = any(1) THEN 1
WHEN o_count = any(0) THEN 0
ELSE null END
) Ocount
FROM (SELECT e_id, status, o_count FROM A)
GROUP BY e_id, status, o_count

Yes, just wrap it with MAX():
SELECT e_id, status,
MAX(CASE WHEN o_count = 1 THEN 1 ELSE 0 END) AS Ocount
FROM A
GROUP BY e_id, status
Also, the subquery was unnecessary, since you are not doing any logic in there.

First of all you group by e_id, status, o_count. That means you aggregate your data such that you get one row for each such combination. Don't you rather want to get one result row per e_id alone or maybe e_id plus status?
Then you have that case construct not containing any aggregate function, but only the o_count which is part of your group by clause. So you are looking at one row where you want o_count = any(1) which is exactly the same as o_count = 1 of course, because there is only one value in the specified set. You can replace the complete case expression with a mere o_count.
Then you apply distinct. But there can be no duplicates, as you are grouping by all columns used. So distinct doesn't do anything here.
Selecting from a subquery without any where clause or aggregation is also superfluous and you can select from table a directly.
Your query can be re-written as
select distinct e_id, status, o_count
from a;
I suppose you want something like this instead:
select e_id, status, max(o_count)
from a
group by e_id, status;
Or this:
select e_id, max(status), max(o_count)
from a
group by e_id;
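The two suggestions above (MAX over a CASE expression and MAX over o_count directly) differ only in how they treat an e_id whose o_count values are all NULL. A minimal sketch, assuming made-up sample rows and an invented d_id/status layout for table a (the multi-row INSERT syntax is also an assumption about your DBMS):

```sql
-- Hypothetical sample data; column types and values are assumptions.
CREATE TABLE a (e_id INTEGER, d_id INTEGER, status VARCHAR(10), o_count INTEGER);

INSERT INTO a (e_id, d_id, status, o_count) VALUES
  (1, 10, 'open', 1),
  (1, 11, 'open', 0),
  (2, 20, 'open', 0),
  (2, 21, 'open', NULL),
  (3, 30, 'open', NULL);

-- One row per (e_id, status); both aggregates side by side.
SELECT e_id,
       status,
       MAX(CASE WHEN o_count = 1 THEN 1 ELSE 0 END) AS ocount_flag, -- never NULL
       MAX(o_count)                                 AS ocount_max   -- NULL if every o_count is NULL
FROM a
GROUP BY e_id, status
ORDER BY e_id;

-- e_id 1: ocount_flag = 1, ocount_max = 1
-- e_id 2: ocount_flag = 0, ocount_max = 0
-- e_id 3: ocount_flag = 0, ocount_max = NULL
```

If "else 0" in the question should also cover the all-NULL case, the CASE variant is probably the one to use.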

Related

Hive SQL nested query use similar column

I have a query that includes two subqueries that share a column, 'day'. I would like to show the values in the following way:
day cnt1 cnt_total
But my query does not treat the day column as shared between the two subqueries; instead it pairs every row of the first subquery with every row of the second.
Is there a way to match the rows on day?
The query looks as follows:
SELECT p1.day, p1.count AS cnt1, p2.count AS cnt_total
FROM
(
SELECT day, COUNT(DISTINCT id) AS count FROM table
WHERE 1=1
AND service="service"
AND action="action"
AND path LIKE "%search%"
AND year="2021"
GROUP BY day
) p1,
(
SELECT day, COUNT(DISTINCT id) AS count FROM table
WHERE 1=1
AND service="service"
AND action="action"
AND year="2021"
GROUP BY day
) p2;
You should be able to do this with conditional aggregation, so only one SELECT is needed:
SELECT day,
COUNT(DISTINCT CASE WHEN action = 'mousedown' AND data["path"] LIKE '%go-to-latest-search%' THEN gsid END) AS count,
COUNT(DISTINCT CASE WHEN action = 'impress' THEN gsid END) as cnt_total
FROM hit
WHERE service = 'sauto' AND
year = '2021' AND
month = '07'
GROUP BY day
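The row multiplication in the question comes from listing the two subqueries with a comma and no join condition, which yields a Cartesian product. Conditional aggregation avoids the join altogether. Here is a small sketch of the same idea using the column names from the question; the table name events and the sample rows are assumptions, since the real table was posted as "table":

```sql
-- Hypothetical table and rows, named after the columns in the question.
CREATE TABLE events (
  id      INTEGER,
  day     VARCHAR(2),
  service VARCHAR(20),
  action  VARCHAR(20),
  path    VARCHAR(100),
  year    VARCHAR(4)
);

INSERT INTO events (id, day, service, action, path, year) VALUES
  (1, '01', 'service', 'action', '/search/a', '2021'),
  (1, '01', 'service', 'action', '/home',     '2021'),
  (2, '01', 'service', 'action', '/home',     '2021'),
  (3, '02', 'service', 'action', '/search/b', '2021'),
  (4, '02', 'service', 'action', '/search/c', '2021'),
  (6, '02', 'service', 'action', '/home',     '2021'),
  (5, '02', 'service', 'other',  '/search/d', '2021');  -- excluded by the WHERE clause

-- Shared filters go in WHERE; the filter specific to cnt1 goes inside the CASE.
SELECT day,
       COUNT(DISTINCT CASE WHEN path LIKE '%search%' THEN id END) AS cnt1,
       COUNT(DISTINCT id)                                         AS cnt_total
FROM events
WHERE service = 'service'
  AND action  = 'action'
  AND year    = '2021'
GROUP BY day
ORDER BY day;

-- day 01: cnt1 = 1, cnt_total = 2
-- day 02: cnt1 = 2, cnt_total = 3
```

Keeping the two subqueries and joining them with ON p1.day = p2.day should give the same result, at the cost of scanning the table twice.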

How to take count of distinct rows which have a specific column with NULL values in all rows

I have a table CodeResult as follows:
Here we can notice that only Code 123 has a Code2 with a value in Result. I want to count the distinct Codes that have no value at all in Result, which in this example should give 2.
I do not want to use group by clause because it will slow down the query.
Below code gives wrong result:
Select count(distinct code) from CodeResult where Result is Null
One method is two levels of aggregation:
select count(*)
from (select code
from t
group by code
having max(result) is null
) c;
A more clever method doesn't use a subquery. It counts the number of distinct codes and then removes the ones that have a result:
select ( count(distinct code) -
count(distinct case when result is not null then code end )
)
from t;
You simply can't avoid a GROUP BY. In all DBMSs I know, the query plan you get from:
SELECT DISTINCT a,b,c FROM tab;
is the same as the one for:
SELECT a,b,c FROM tab GROUP BY a,b,c;
The following query will return each of the Code values for which there are no corresponding non-NULL values in Result:
select distinct Code
from CodeResult as CR
where not exists
( select 42 from CodeResult as iCR where iCR.Code = CR.Code and iCR.Result is not NULL );
Counting the rows is left as an exercise for the reader.
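For completeness, here is one way the counting could look, together with the subtraction approach from the earlier answer. The original table was not reproduced in the post, so the rows and column types below are assumptions chosen to match the description (only Code 123 has a Result value):

```sql
-- Hypothetical reconstruction of the CodeResult table; types and values are assumptions.
CREATE TABLE CodeResult (Code INTEGER, Code2 INTEGER, Result VARCHAR(10));

INSERT INTO CodeResult (Code, Code2, Result) VALUES
  (123, 1, NULL),
  (123, 2, 'ok'),
  (456, 1, NULL),
  (456, 2, NULL),
  (789, 1, NULL);

-- The query from the question counts 123 as well, because one of its rows has a NULL Result.
SELECT COUNT(DISTINCT Code) AS naive_count
FROM CodeResult
WHERE Result IS NULL;
-- naive_count = 3

-- NOT EXISTS: keep only codes that have no row with a non-NULL Result.
SELECT COUNT(DISTINCT CR.Code) AS codes_without_result
FROM CodeResult AS CR
WHERE NOT EXISTS (
  SELECT 1
  FROM CodeResult AS iCR
  WHERE iCR.Code = CR.Code
    AND iCR.Result IS NOT NULL
);
-- codes_without_result = 2

-- Subtraction: all distinct codes minus the codes that have some result.
SELECT COUNT(DISTINCT Code)
     - COUNT(DISTINCT CASE WHEN Result IS NOT NULL THEN Code END) AS codes_without_result
FROM CodeResult;
-- codes_without_result = 3 - 1 = 2
```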

Sum of multiple select count distinct with case function

I try to make a sum of multiple select count distinct with case function. For example:
SELECT id_dept,
count(DISTINCT case when e.statut='pub' then id_patients end) AS nb_patients_pub,
count(DISTINCT case when e.statut='priv' then id_patients end) AS nb_patients_priv
FROM venues
I would like to combine these two results into a single column.
Is it possible?
I think that you want IN:
SELECT
id_dept,
COUNT(DISTINCT CASE WHEN e.statut IN ('pub', 'priv') THEN id_patients END) AS nb_patients_pub_and_venues
FROM venues
GROUP BY id_dept
Note that I added a GROUP BY clause to the query, which was initially missing (this is a syntax error in almost all databases).
Depending on your data, this might not do exactly what you want: if a given id_patients value has both statuses, it will be counted only once, whereas your code counted it once in each count(distinct ...). If so, you can keep the two separate counts and sum them:
SELECT
id_dept,
COUNT(DISTINCT CASE WHEN e.statut = 'pub' THEN id_patients END)
+ COUNT(DISTINCT CASE WHEN e.statut = 'priv' THEN id_patients END)
AS nb_patients_pub_and_venues
FROM venues
GROUP BY id_dept
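The difference between the IN version and the summed version only shows up when one patient has both statuses. A small sketch with invented rows makes this visible; the layout of venues is an assumption, and the e. prefix from the question is dropped because no alias was shown there:

```sql
-- Hypothetical sample data; patient 1 in department 10 has both statuses.
CREATE TABLE venues (id_dept INTEGER, id_patients INTEGER, statut VARCHAR(10));

INSERT INTO venues (id_dept, id_patients, statut) VALUES
  (10, 1, 'pub'),
  (10, 1, 'priv'),
  (10, 2, 'pub'),
  (20, 3, 'priv'),
  (20, 4, 'other');

SELECT id_dept,
       -- each patient counted once, whatever mix of statuses they have
       COUNT(DISTINCT CASE WHEN statut IN ('pub', 'priv') THEN id_patients END) AS distinct_patients,
       -- a patient with both statuses is counted twice
       COUNT(DISTINCT CASE WHEN statut = 'pub'  THEN id_patients END)
     + COUNT(DISTINCT CASE WHEN statut = 'priv' THEN id_patients END)           AS sum_of_counts
FROM venues
GROUP BY id_dept
ORDER BY id_dept;

-- id_dept 10: distinct_patients = 2, sum_of_counts = 3
-- id_dept 20: distinct_patients = 1, sum_of_counts = 1
```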
If you're happy with your current code, then either sum those counts (using +), or use that query as a CTE (or an inline view) and add the two columns:
with test as
(SELECT id_dept,
count(DISTINCT case when e.statut='pub' then id_patients end)
AS nb_patients_pub,
count(DISTINCT case when e.statut='priv' then id_patients end)
AS nb_patients_priv
FROM venues
GROUP BY id_dept
)
select id_dept, nb_patients_pub + nb_patients_priv as result
from test;

What does THEN in this CASE statement do?

folks!
Could someone please explain this CASE statement to me? I'm puzzled about the THEN user_id. What does it do exactly?
SELECT modal_text,
COUNT(DISTINCT CASE
WHEN ab_group = 'control' THEN user_id
END) AS 'control_clicks'
FROM onboarding_modals
GROUP BY 1
ORDER BY 1;
Thanks in advance!
This is simple aggregation:
COUNT(DISTINCT user_id)
and it counts all the distinct non null user_ids.
But this is conditional aggregation:
COUNT(DISTINCT CASE WHEN ab_group = 'control' THEN user_id END)
and it counts the distinct non null user_ids only if in the same row the column ab_group contains the value 'control'.
For an A/B test, the SELECT statement is trying to find the count of distinct users in the control group.
So instead of counting all distinct users for each modal_text, the CASE counts a user only if it is in the control group, i.e. the column ab_group = 'control'.
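THEN introduces the value that the CASE expression returns when the WHEN condition is true.
To explain it more clearly:
if your ab_group column has the value 'control', the expression returns the user_id column; with no ELSE branch, it returns NULL otherwise.
It's similar to an if statement: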
if (ab_group = 'control')
{
user_id
}
See the link below to learn more:
https://www.w3schools.com/sql/sql_case.asp
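It can help to look at what the CASE expression produces for each row before COUNT(DISTINCT ...) is applied. The sketch below uses invented rows for onboarding_modals; the column types and values are assumptions:

```sql
-- Hypothetical sample data for illustration.
CREATE TABLE onboarding_modals (modal_text VARCHAR(20), user_id VARCHAR(10), ab_group VARCHAR(10));

INSERT INTO onboarding_modals (modal_text, user_id, ab_group) VALUES
  ('A', 'u1', 'control'),
  ('A', 'u1', 'control'),
  ('A', 'u2', 'variant'),
  ('A', 'u4', 'control'),
  ('B', 'u3', 'variant');

-- Step 1: the bare CASE expression, row by row.
-- It yields user_id for control rows and NULL for all other rows.
SELECT modal_text,
       ab_group,
       CASE WHEN ab_group = 'control' THEN user_id END AS case_value
FROM onboarding_modals;
-- case_value per row, in insert order: u1, u1, NULL, u4, NULL

-- Step 2: COUNT(DISTINCT ...) ignores the NULLs and collapses the duplicate u1.
SELECT modal_text,
       COUNT(DISTINCT CASE WHEN ab_group = 'control' THEN user_id END) AS control_clicks
FROM onboarding_modals
GROUP BY modal_text
ORDER BY modal_text;
-- A: control_clicks = 2
-- B: control_clicks = 0
```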

How to retrieve information from table in one statement when the result has different numbers of rows?

I want to retrieve different information in one statement from the same table and they have different number of rows.
The first select has five rows in the result and the second select has three rows because some prices have null value. I thought maybe if I can put zero instead of null so they will match the same number of rows but I don't know how to do that, or is there another solution?
select count(ID), Land
from Film_ha2911
group by Land
union
select count(ID)
from Film_ha2911
where Price is not null
group by Land;
UNION requires that both SELECTs return the same number of columns with compatible types,
so in your case you should use null for the column you do not select:
select count(ID), Land
from Film_ha2911
group by Land
union
select count(ID), null
from Film_ha2911
where Price is not null
group by Land;
But in this case it seems you need a left join between the two subqueries on land:
select t1.count1, t1.land , t2.count2
from (
select count(ID) count1, Land
from Film_ha2911
group by Land
) t1
left join (
select count(ID) count2, land
from Film_ha2911
where Price is not null
group by Land
) t2 on t1.land = t2.land
The desired result can be achieved with a single SELECT, without UNION.
An extra column, PriceNotNull, indicates whether Price is filled or not:
SELECT
Land,
CASE WHEN Price IS NOT NULL THEN 'True' ELSE 'False' END PriceNotNull,
COUNT(ID) AS Count_ID
FROM Film_ha2911
GROUP BY Land, CASE WHEN Price IS NOT NULL THEN 'True' ELSE 'False' END
You can just use count():
select Land, count(*) as total_rows,
count(price) as total_with_price
from Film_ha2911
group by Land;
count() counts the number of non-NULL values, so no special logic is needed to count non-NULL values. By count(id) I assume you want to count all the rows. count(*) is more explicit -- as would count(1) which some people prefer.
If you actually want this on separate rows, I would add an indicator for what the count means:
select Land, 'total rows' as which, count(*) as total_rows
from Film_ha2911
group by Land
union all
select Land, 'with price', count(price)
from Film_ha2911
group by Land;
However, I think the first version with two separate columns is more useful.
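To see why the two-column version sidesteps the "different number of rows" problem, here is a minimal sketch with invented rows for Film_ha2911; the column types and values are assumptions. A Land whose prices are all NULL still gets a row, with a count of 0, instead of disappearing as it does under WHERE Price IS NOT NULL:

```sql
-- Hypothetical sample data; 'FR' has no priced films at all.
CREATE TABLE Film_ha2911 (ID INTEGER, Land VARCHAR(2), Price DECIMAL(6,2));

INSERT INTO Film_ha2911 (ID, Land, Price) VALUES
  (1, 'DE', 9.99),
  (2, 'DE', NULL),
  (3, 'US', 4.50),
  (4, 'FR', NULL);

SELECT Land,
       COUNT(*)     AS total_rows,       -- counts every row in the group
       COUNT(Price) AS total_with_price  -- counts only rows where Price is not NULL
FROM Film_ha2911
GROUP BY Land
ORDER BY Land;

-- DE: total_rows = 2, total_with_price = 1
-- FR: total_rows = 1, total_with_price = 0
-- US: total_rows = 1, total_with_price = 1
```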