HiveQL - Grouping Count - hive

I'm new here and still running into a lot of problems with HiveQL, so I'd like to ask for advice. I have a table named Vote_table, and I'd like to count the "yes" votes for A, B, C, D (sorry, I wasn't able to post the image, so I've linked it instead).
Vote_table
However, I'd like to combine the counts of A1, A2, A3, A4 into a single count, while B and C are still counted individually. The output I'm expecting would be
Result_table
What I've tried is
select
type,
count(
case
when type = 'A1' and vote = 'yes' then 1
when type = 'A2' and vote = 'yes' then 1
when type = 'A3' and vote = 'yes' then 1
when type = 'A4' and vote = 'yes' then 1
else vote = 'yes' then 1
)
from vote_table
where …
group by type
I've also tried this way
if (type in ('A1', 'A2', 'A3', 'A4') and vote = 'yes' then count(*) else (if (vote = 'yes' then count(*)))) as cnt_yes
But neither works. So I'd like to ask the experts here: is there a better way of doing this? Thanks!

Group by a calculated type, case when type in ('A1', 'A2', 'A3', 'A4') then 'A' else type end, and use the same expression in the select list:
select
case when type in ('A1', 'A2', 'A3', 'A4') then 'A' else type end as type,
sum(case when vote = 'yes' then 1 else 0 end) as number_of_vote
from vote_table
where ...
group by case when type in ('A1', 'A2', 'A3', 'A4') then 'A' else type end
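To see the grouping expression in action, here is a minimal sketch that runs the same query shape against SQLite through Python's sqlite3 module (a stand-in for Hive, not Hive itself). The sample rows are made up, since the original Vote_table image is not available, and the alias grouped_type is only there to keep the output column distinct from the source column.

```python
import sqlite3

# Hypothetical sample rows; the real Vote_table contents are not known.
rows = [
    ("A1", "yes"), ("A2", "yes"), ("A3", "no"), ("A4", "yes"),
    ("B", "yes"), ("B", "no"),
    ("C", "yes"), ("C", "yes"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vote_table (type TEXT, vote TEXT)")
conn.executemany("INSERT INTO vote_table VALUES (?, ?)", rows)

# Collapse A1..A4 into 'A', keep every other type as-is, then count 'yes' votes.
query = """
SELECT CASE WHEN type IN ('A1', 'A2', 'A3', 'A4') THEN 'A' ELSE type END AS grouped_type,
       SUM(CASE WHEN vote = 'yes' THEN 1 ELSE 0 END) AS number_of_vote
FROM vote_table
GROUP BY CASE WHEN type IN ('A1', 'A2', 'A3', 'A4') THEN 'A' ELSE type END
ORDER BY grouped_type
"""
result = conn.execute(query).fetchall()
print(result)  # [('A', 3), ('B', 1), ('C', 2)]
```

The key point is that the CASE expression appears in both the select list and the GROUP BY, so the four A types land in one group before the votes are summed.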

Related

How to check if any value in group is equal to a specific value in BigQuery SQL?

I have a dataset like the following:
ID|Date_Val|Data
1|2022-01-01|A
1|2022-01-01|I
1|2022-01-01|H
2|2022-01-01|G
2|2022-01-01|G
2|2022-01-01|I
I would like to run a query like the following:
SELECT ID, Date_Val, IF(/logic here/, 'A', 'B')
GROUP BY 1,2
Output dataset
ID|Date_Val|Data
1|2022-01-01|A
2|2022-01-01|B
How would I write /logic here/ so that if any Data value in the grouping (ID, Date_Val) is = 'A' then 'A' else 'B'.
We can try:
SELECT ID, Date_Val,
CASE WHEN MAX(CASE WHEN Data = 'A' THEN 1 END) > 0 THEN 'A' ELSE 'B' END AS Data
FROM yourTable
GROUP BY 1, 2;
SELECT ID, Date_Val,
CASE WHEN SUM(CASE WHEN Data = 'A' THEN 1 ELSE 0 END) >= 1 THEN 'A' ELSE 'B' END as Data
FROM your_table
GROUP BY ID, Date_Val
In BigQuery, another option would be:
SELECT ID, Date_Val, IF('A' IN UNNEST(ARRAY_AGG(Data)), 'A', 'B') Data
FROM sample_table
GROUP BY 1, 2;
Below is the most compact version so far for you to consider:
select id, date_val,
if(logical_or(data = 'A'), 'A', 'B') as data
from your_table
group by id, date_val
If applied to the sample data in your question, this returns A for ID 1 and B for ID 2.
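The conditional-aggregation variants above can be checked locally with a small sketch. This one uses SQLite through Python's sqlite3 module as a stand-in for BigQuery (so logical_or and UNNEST are not available here) and loads the sample rows from the question; the table name your_table is taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (ID INTEGER, Date_Val TEXT, Data TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, "2022-01-01", "A"), (1, "2022-01-01", "I"), (1, "2022-01-01", "H"),
        (2, "2022-01-01", "G"), (2, "2022-01-01", "G"), (2, "2022-01-01", "I"),
    ],
)

# MAX over a 0/1 flag is 1 as soon as any row in the group has Data = 'A'.
query = """
SELECT ID, Date_Val,
       CASE WHEN MAX(CASE WHEN Data = 'A' THEN 1 ELSE 0 END) = 1 THEN 'A' ELSE 'B' END AS Data
FROM your_table
GROUP BY ID, Date_Val
ORDER BY ID
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, '2022-01-01', 'A'), (2, '2022-01-01', 'B')]
```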

Return single value when checking table rows values

I'm trying to return a single value from a table with a lot of rows if a condition is met.
For example, I have a table (ID (pk), CODE (pk), DESCRIPTION) which has a lot of rows. How can I return a single row when a condition like the following holds?
SELECT CASE
WHEN CODE IN ('1', '2') THEN '100'
WHEN CODE IN ('2', '3') THEN '200'
WHEN CODE IN ('5', '7') THEN '300'
END AS ASDASD
FROM TABLE
WHERE ID = 1;
The problem is that CODE must match both values in a list, not just one of them. As it stands, the query returns a result for each row separately, for example when that ID has the code '2':
ASDASD
NULL
'200'
I want it to return just '200', because that ID has both code '2' and code '3'.
Assuming codes are not duplicated for a particular id:
SELECT ID,
(CASE WHEN SUM(CASE WHEN CODE IN ('1', '2') THEN 1 ELSE 0 END) = 2
THEN '100'
WHEN SUM(CASE WHEN CODE IN ('2', '3') THEN 1 ELSE 0 END) = 2
THEN '200'
WHEN SUM(CASE WHEN CODE IN ('5', '7') THEN 1 ELSE 0 END) = 2
THEN '300'
END) AS ASDASD
FROM TABLE
WHERE ID = 1
GROUP BY ID;
I added ID to the SELECT, just because this might be useful for multiple ids.
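Here is a small sketch of that approach, run against SQLite through Python's sqlite3 module. The table name codes and the sample rows are hypothetical (the question only calls it TABLE), and the WHERE ID = 1 filter is dropped so several ids can be seen at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "codes" is a made-up table name; (ID, CODE) is the composite key from the question.
conn.execute(
    "CREATE TABLE codes (ID INTEGER, CODE TEXT, DESCRIPTION TEXT, PRIMARY KEY (ID, CODE))"
)
conn.executemany(
    "INSERT INTO codes VALUES (?, ?, ?)",
    [
        (1, "2", "two"), (1, "3", "three"),   # has both '2' and '3'
        (2, "5", "five"), (2, "7", "seven"),  # has both '5' and '7'
        (3, "1", "one"),                      # has only half of the first pair
    ],
)

# Each SUM counts how many codes of a pair the id has; 2 means both are present.
query = """
SELECT ID,
       CASE WHEN SUM(CASE WHEN CODE IN ('1', '2') THEN 1 ELSE 0 END) = 2 THEN '100'
            WHEN SUM(CASE WHEN CODE IN ('2', '3') THEN 1 ELSE 0 END) = 2 THEN '200'
            WHEN SUM(CASE WHEN CODE IN ('5', '7') THEN 1 ELSE 0 END) = 2 THEN '300'
       END AS ASDASD
FROM codes
GROUP BY ID
ORDER BY ID
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, '200'), (2, '300'), (3, None)]
```

The primary key on (ID, CODE) is what makes the "= 2" test safe here, since it guarantees a code cannot appear twice for the same id.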
You could try conditional aggregation, as follows:
SELECT CASE
WHEN MAX(DECODE(code, '1', 1)) = 1 AND MAX(DECODE(code, '2', 1)) = 1
THEN '100'
WHEN MAX(DECODE(code, '2', 1)) = 1 AND MAX(DECODE(code, '3', 1)) = 1
THEN '200'
WHEN MAX(DECODE(code, '5', 1)) = 1 AND MAX(DECODE(code, '7', 1)) = 1
THEN '300'
END AS asdasd
FROM TABLE
WHERE ID = 1;
DECODE() is a handy Oracle function that compares an expression (code) to a series of values and returns results accordingly. Basically, condition MAX(DECODE(code, '1', 1)) = 1 ensures that at least one row has code = '1'.
PS: are you really storing numbers as strings? If code has a numeric datatype, please remove the single quotes in the above query.
You need to check the number returned by a query like this:
SELECT COUNT(DISTINCT CODE) FROM TABLE WHERE ID = 1 AND CODE IN ('1', '2')
If this number is 2 then ID = 1 has both CODE values '1' and '2'.
SELECT
CASE
WHEN (SELECT COUNT(DISTINCT CODE) FROM TABLE WHERE ID = 1 AND CODE IN ('1', '2')) = 2 THEN '100'
WHEN (SELECT COUNT(DISTINCT CODE) FROM TABLE WHERE ID = 1 AND CODE IN ('2', '3')) = 2 THEN '200'
WHEN (SELECT COUNT(DISTINCT CODE) FROM TABLE WHERE ID = 1 AND CODE IN ('5', '7')) = 2 THEN '300'
END AS ASDASD
FROM TABLE

simplify multiple union in sql

I am trying to simplify the following union query. Basically, I am trying to get all the possible values from different columns of the same table: for each column whose value is not equal to 'no', I want to return a specific text that identifies that column.
select 'a' from Mytable where a!='no' and id='1'
union
select 'b' from Mytable where b!='no' and id='1'
union select 'c' from Mytable where c!='no' and id='1'
My table structure is:
id Acolumn BColumn Ccolumn
1 123a no 345v
So my expected result is:
a c
Please suggest a way to simplify this query. Thanks in advance.
According to what you've written, this might be OK.
select case when a <> 'no' and id = '1' then 'a'
when b <> 'no' and id = '1' then 'b'
when c <> 'no' and id = '1' then 'c'
end
from mytable
where (a <> 'no' and id = '1')
or (b <> 'no' and id = '1')
or (c <> 'no' and id = '1')
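One thing worth checking before swapping the UNION for a CASE: a CASE expression returns only the first matching branch for each row, while the UNION produces one row per matching column. The sketch below runs both queries on the sample row from the question, using SQLite through Python's sqlite3 module. The column names a, b, c follow the queries rather than the Acolumn/BColumn/Ccolumn headers, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names a, b, c are assumed from the queries in the question.
conn.execute("CREATE TABLE Mytable (id TEXT, a TEXT, b TEXT, c TEXT)")
conn.execute("INSERT INTO Mytable VALUES ('1', '123a', 'no', '345v')")

union_query = """
SELECT 'a' AS col FROM Mytable WHERE a <> 'no' AND id = '1'
UNION
SELECT 'b' FROM Mytable WHERE b <> 'no' AND id = '1'
UNION
SELECT 'c' FROM Mytable WHERE c <> 'no' AND id = '1'
"""

case_query = """
SELECT CASE WHEN a <> 'no' AND id = '1' THEN 'a'
            WHEN b <> 'no' AND id = '1' THEN 'b'
            WHEN c <> 'no' AND id = '1' THEN 'c'
       END AS col
FROM Mytable
WHERE (a <> 'no' AND id = '1')
   OR (b <> 'no' AND id = '1')
   OR (c <> 'no' AND id = '1')
"""

union_rows = sorted(r[0] for r in conn.execute(union_query))
case_rows = [r[0] for r in conn.execute(case_query)]
print(union_rows)  # ['a', 'c']
print(case_rows)   # ['a']
```

On this row the CASE version stops at 'a', so it only matches the UNION when at most one column per row differs from 'no'.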

How to use sum(case) with three conditions

I usually use sum(case) to count the rows where a column has a given value, e.g.
SUM(CASE WHEN over05 = 'OK' THEN 1 ELSE 0 END) AS OK_05
This is perfect when I have a column with two values, but what about a column with three values, e.g.
over05 = '1' or 'X' or '2'
How can I do a sum(case) then?
If you want all three values to return the same thing, you should use IN():
SUM(
CASE
WHEN over05 IN ('1', 'X', '2') THEN 1
ELSE 0 END
) AS OK_05
If you want each value to return something different, you should use multiple WHEN ... THEN branches:
SUM(
CASE
WHEN over05 = '1' THEN 1
WHEN over05 = 'X' THEN 2
WHEN over05 = '2' THEN 3
ELSE 0 END
) AS OK_05
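A third reading of the question is that each of the three values should get its own counter. A minimal sketch of that, alongside the IN() variant, using SQLite through Python's sqlite3 module; the table name matches and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "matches" is a hypothetical table name; only the over05 column comes from the question.
conn.execute("CREATE TABLE matches (over05 TEXT)")
conn.executemany(
    "INSERT INTO matches VALUES (?)",
    [("1",), ("X",), ("2",), ("1",), ("X",), ("1",)],
)

# One SUM(CASE ...) per value gives one counter per value;
# the IN() variant counts rows holding any of the three.
query = """
SELECT SUM(CASE WHEN over05 = '1' THEN 1 ELSE 0 END) AS cnt_1,
       SUM(CASE WHEN over05 = 'X' THEN 1 ELSE 0 END) AS cnt_x,
       SUM(CASE WHEN over05 = '2' THEN 1 ELSE 0 END) AS cnt_2,
       SUM(CASE WHEN over05 IN ('1', 'X', '2') THEN 1 ELSE 0 END) AS cnt_any
FROM matches
"""
result = conn.execute(query).fetchone()
print(result)  # (3, 2, 1, 6)
```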

Multiple Columns/Where From SQL Query

New to the site, and to SQL queries in general, so forgive the noobness. I'm looking to create a SQL query that returns 3 columns (from a single table):
Distinct "Region__C"
Count of "ID" Where "ACTIVE__C" is "Y"
Count of "ID" Where "ACTIVE__C" is "N"
Here's the query that would do #1 and #2 OR #3. Just not sure how to approach creating both column #2 and #3 in the same query:
SELECT DISTINCT SCHEMA.CONTACT.REGION__C AS "Region",COUNT(SCHEMA.CONTACT.ID) AS "Active Contacts"
FROM SCHEMA.CONTACT
WHERE SCHEMA.CONTACT.ACTIVE__C = 'Y' AND SCHEMA.CONTACT.REGION__C != 'Unknown'
GROUP BY SCHEMA.CONTACT.REGION__C
Thanks in advance for any help that anyone can provide!
SELECT SCHEMA.CONTACT.REGION__C ,
COUNT(CASE WHEN SCHEMA.CONTACT.ACTIVE__C = 'Y' THEN 1
END) AS Y ,
COUNT(CASE WHEN SCHEMA.CONTACT.ACTIVE__C = 'N' THEN 1
END) AS N
FROM SCHEMA.CONTACT
WHERE SCHEMA.CONTACT.ACTIVE__C IN ( 'N', 'Y' ) AND
SCHEMA.CONTACT.REGION__C != 'Unknown'
GROUP BY SCHEMA.CONTACT.REGION__C
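The COUNT(CASE ...) pattern works because COUNT ignores NULLs, and a CASE without an ELSE yields NULL for non-matching rows. A small sketch with SQLite through Python's sqlite3 module; the sample rows are made up, and the SCHEMA. prefix is dropped because the in-memory database has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CONTACT (ID INTEGER, REGION__C TEXT, ACTIVE__C TEXT)")
# Hypothetical sample rows for illustration only.
conn.executemany(
    "INSERT INTO CONTACT VALUES (?, ?, ?)",
    [
        (1, "East", "Y"), (2, "East", "N"), (3, "East", "Y"),
        (4, "West", "N"),
        (5, "Unknown", "Y"),  # filtered out by the WHERE clause
    ],
)

# CASE without ELSE returns NULL for non-matching rows, and COUNT skips NULLs.
query = """
SELECT REGION__C AS "Region",
       COUNT(CASE WHEN ACTIVE__C = 'Y' THEN 1 END) AS "Active Contacts",
       COUNT(CASE WHEN ACTIVE__C = 'N' THEN 1 END) AS "Inactive Contacts"
FROM CONTACT
WHERE REGION__C != 'Unknown'
GROUP BY REGION__C
ORDER BY REGION__C
"""
result = conn.execute(query).fetchall()
print(result)  # [('East', 2, 1), ('West', 0, 1)]
```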
I think this will work:
SELECT SCHEMA.CONTACT.REGION__C AS "Region",
sum(case ACTIVE__C when 'Y' then 1 else 0 end) as "CountActive",
sum(case ACTIVE__C when 'N' then 1 else 0 end) as "CountInactive",
COUNT(SCHEMA.CONTACT.ID) AS "Total Contacts"
FROM SCHEMA.CONTACT
WHERE SCHEMA.CONTACT.REGION__C != 'Unknown'
GROUP BY SCHEMA.CONTACT.REGION__C