How to count and group by combinations of multiple columns? - sql

For a table such as:
foo_table
id | str_col | bool_col
1 "1234" 0
2 "3215" 0
3 "8132" 1
4 NULL 1
5 "" 1
6 "" 0
I know how to query both of:
count(*) | bool_col
3 0
3 1
and
count(*) | isnull(str_col) or str_col = ""
3 0
3 1
but how could I get something like:
count(*) | bool_col | isnull(str_col) or str_col = ""
2 0 0
1 0 1
1 1 0
2 1 1
In the meantime, I'm just individually doing:
select count(*) from foo_table where bool_col and (isnull(str_col) or str_col = "");
select count(*) from foo_table where not bool_col and (isnull(str_col) or str_col = "");
select count(*) from foo_table where bool_col and not (isnull(str_col) or str_col = "");
select count(*) from foo_table where not bool_col and not (isnull(str_col) or str_col = "");

Try
SELECT COUNT(*),
bool_col,
CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END str_col
FROM foo_table
GROUP BY bool_col,
CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END
Output (MySql):
| COUNT(*) | BOOL_COL | STR_COL |
---------------------------------
| 2 | 0 | 0 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
| 2 | 1 | 1 |
SQLFiddle MySQL
SQLFiddle SQL Server
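To check the grouped query against the question's sample rows without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the `is_blank` alias are assumptions made for illustration; the SQL itself follows the answer above, with an added ORDER BY so the row order is deterministic.

```python
import sqlite3

# In-memory database holding the six sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo_table (id INTEGER, str_col TEXT, bool_col INTEGER)")
conn.executemany(
    "INSERT INTO foo_table VALUES (?, ?, ?)",
    [(1, "1234", 0), (2, "3215", 0), (3, "8132", 1),
     (4, None, 1), (5, "", 1), (6, "", 0)],
)

# Group by bool_col and by a derived 0/1 flag for "NULL or empty string".
# is_blank is a hypothetical alias, not a name from the original answer.
query = """
SELECT COUNT(*) AS cnt,
       bool_col,
       CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END AS is_blank
FROM foo_table
GROUP BY bool_col,
         CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END
ORDER BY bool_col, is_blank
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The four tuples printed correspond to the four (bool_col, blank flag) combinations in the output table above.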

SELECT COUNT(CASE
WHEN bool_col AND (isnull(str_col) or str_col = "") THEN 1
END) as c1,
COUNT(CASE
WHEN not bool_col and (isnull(str_col) or str_col = "") THEN 1
END) as c2,
COUNT(CASE
WHEN bool_col and not (isnull(str_col) or str_col = "") THEN 1
END) as c3,
COUNT(CASE
WHEN not bool_col and not (isnull(str_col) or str_col = "") THEN 1
END) as c4
FROM table1
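The query above returns the four counts as columns of a single row instead of four rows. A small sketch of the same idea on the question's data, run through Python's sqlite3 module (an assumption for illustration): SQLite has no one-argument `isnull()` function, so the sketch writes `IS NULL` and compares `bool_col` to 0 and 1 explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo_table (id INTEGER, str_col TEXT, bool_col INTEGER)")
conn.executemany(
    "INSERT INTO foo_table VALUES (?, ?, ?)",
    [(1, "1234", 0), (2, "3215", 0), (3, "8132", 1),
     (4, None, 1), (5, "", 1), (6, "", 0)],
)

# Each COUNT only counts rows where its CASE yields a non-NULL value,
# so the four combinations end up side by side in one result row.
query = """
SELECT COUNT(CASE WHEN bool_col = 1 AND (str_col IS NULL OR str_col = '') THEN 1 END) AS c1,
       COUNT(CASE WHEN bool_col = 0 AND (str_col IS NULL OR str_col = '') THEN 1 END) AS c2,
       COUNT(CASE WHEN bool_col = 1 AND NOT (str_col IS NULL OR str_col = '') THEN 1 END) AS c3,
       COUNT(CASE WHEN bool_col = 0 AND NOT (str_col IS NULL OR str_col = '') THEN 1 END) AS c4
FROM foo_table
"""
counts = conn.execute(query).fetchone()
print(counts)
```

This form suits a fixed, small set of combinations; the GROUP BY form scales better when the number of combinations grows.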

Oracle has a built-in GROUP BY extension for this, called CUBE:
select bool_col ,
case when str_col is null or str_col = '' then 1 else 0 end str_col ,
count(*)
from table1
group by cube (bool_col , case when str_col is null or str_col = '' then 1 else 0 end)
CUBE returns every combination of the grouping columns, including subtotal rows for each column and a grand total. There is also ROLLUP, which is a special case of CUBE.
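To see which rows a CUBE over these two expressions produces, the following sketch emulates it with UNION ALL. It runs in SQLite via Python's sqlite3 module, which does not support CUBE, so this is an emulation under that assumption and not the Oracle syntax itself. The `flagged` CTE and the `is_blank` alias are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo_table (id INTEGER, str_col TEXT, bool_col INTEGER)")
conn.executemany(
    "INSERT INTO foo_table VALUES (?, ?, ?)",
    [(1, "1234", 0), (2, "3215", 0), (3, "8132", 1),
     (4, None, 1), (5, "", 1), (6, "", 0)],
)

# One SELECT per grouping set: (bool_col, is_blank), (bool_col), (is_blank), ().
# NULL in a grouping column marks a subtotal or grand-total row.
query = """
WITH flagged AS (
    SELECT bool_col,
           CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END AS is_blank
    FROM foo_table
)
SELECT bool_col, is_blank, COUNT(*) FROM flagged GROUP BY bool_col, is_blank
UNION ALL
SELECT bool_col, NULL, COUNT(*) FROM flagged GROUP BY bool_col
UNION ALL
SELECT NULL, is_blank, COUNT(*) FROM flagged GROUP BY is_blank
UNION ALL
SELECT NULL, NULL, COUNT(*) FROM flagged
"""
cube_rows = conn.execute(query).fetchall()
for row in cube_rows:
    print(row)
```

Only the four rows with both columns non-NULL answer the original question; the other five are the extra subtotals that CUBE adds.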

Related

Multiple instances of the same ID populating in different columns because of the case statement

I'm getting the following results:
| Memb_i | p1 | p2 | p3 |
+--------+----+-----+-----+
| 2 | 0 | 1 | 0 |
| 2 | 1 | 0 | 0 |
when I run the below query:
SELECT DISTINCT
ME1.MEMB_ID
,CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P1' THEN 1 else 0 END AS P1_flag
,CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P2' THEN 1 else 0 END AS P2_flag
,CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P3' THEN 1 else 0 END AS P3_flag
,CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P4' THEN 1 else 0 END AS P4_flag
FROM MEMBER_ELIGIBILITY ME1
LEFT OUTER JOIN CLAIM CL
ON CL.MEMB_ID = ME1.MEMB_ID
what I'd like to see is the following
| Memb_i | p1 | p2 | p3 |
+--------+----+----+----+
| 2 | 1 | 1 | 0 |
Notice how member 2 has all of its flags on a single row, as opposed to having them spread across multiple rows.
Not sure what you need but here is one way to get it:
SELECT ME1.MEMB_ID
,MAX(CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P1' THEN 1 else 0 END) AS P1_flag
,MAX(CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P2' THEN 1 else 0 END) AS P2_flag
,MAX(CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P3' THEN 1 else 0 END) AS P3_flag
,MAX(CASE WHEN CL.CLAIM_ID >= 1 AND cl.PROC_CD = 'P4' THEN 1 else 0 END) AS P4_flag
FROM MEMBER_ELIGIBILITY ME1
LEFT OUTER JOIN CLAIM CL
ON CL.MEMB_ID = ME1.MEMB_ID
GROUP BY ME1.MEMB_ID
Here is the DEMO
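The reason MAX collapses the rows: each joined claim row yields a 0/1 flag per procedure code, and MAX over a member's rows is 1 as soon as any row has the flag set. A minimal sketch with Python's sqlite3 module, assuming simplified table layouts and invented sample rows (member 2 with P1 and P2 claims, member 3 with no claims):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schemas; the real tables surely have more columns.
conn.execute("CREATE TABLE MEMBER_ELIGIBILITY (MEMB_ID INTEGER)")
conn.execute("CREATE TABLE CLAIM (CLAIM_ID INTEGER, MEMB_ID INTEGER, PROC_CD TEXT)")
conn.executemany("INSERT INTO MEMBER_ELIGIBILITY VALUES (?)", [(2,), (3,)])
conn.executemany(
    "INSERT INTO CLAIM VALUES (?, ?, ?)",
    [(10, 2, "P1"), (11, 2, "P2")],
)

# MAX over the per-row 0/1 flags gives one row per member.
query = """
SELECT ME1.MEMB_ID,
       MAX(CASE WHEN CL.CLAIM_ID >= 1 AND CL.PROC_CD = 'P1' THEN 1 ELSE 0 END) AS P1_flag,
       MAX(CASE WHEN CL.CLAIM_ID >= 1 AND CL.PROC_CD = 'P2' THEN 1 ELSE 0 END) AS P2_flag,
       MAX(CASE WHEN CL.CLAIM_ID >= 1 AND CL.PROC_CD = 'P3' THEN 1 ELSE 0 END) AS P3_flag,
       MAX(CASE WHEN CL.CLAIM_ID >= 1 AND CL.PROC_CD = 'P4' THEN 1 ELSE 0 END) AS P4_flag
FROM MEMBER_ELIGIBILITY ME1
LEFT OUTER JOIN CLAIM CL ON CL.MEMB_ID = ME1.MEMB_ID
GROUP BY ME1.MEMB_ID
ORDER BY ME1.MEMB_ID
"""
flags = conn.execute(query).fetchall()
for row in flags:
    print(row)
```

Members without any claim still appear because of the LEFT JOIN; their CASE expressions fall through to ELSE 0.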
This should work:
Demo
select
m.memb_id,
case when c.P1_total > 0 then 1 else 0 end P1_Flag,
case when c.P2_total > 0 then 1 else 0 end P2_Flag,
case when c.P3_total > 0 then 1 else 0 end P3_Flag
from member_eligibility m
left join (
select
memb_id,
sum(case when proc_cd = 'P1' then 1 else 0 end) 'P1_total',
sum(case when proc_cd = 'P2' then 1 else 0 end) 'P2_total',
sum(case when proc_cd = 'P3' then 1 else 0 end) 'P3_total'
from claim
group by memb_id
) c on c.memb_id = m.memb_id

SQL Query for below scenario for multiple rows

I need the output to match the expected result column in the attached image.
The filter has to be evaluated across multiple rows of the same key.
The rule: the expected value is TRUE when a key has one row with (x.attr2 = '1' AND x.attr3 = '1') and another row with (x.attr2 = '2' AND x.attr3 = '2'); in all other cases it is FALSE.
This is MS SQL (SQL Server).
Key | Atr2 | Atr3 | expected result
111 | 1 | 1 | TRUE
111 | 2 | 2 |
112 | 1 | 4 | FALSE
113 | 1 | 4 | FALSE
113 | 2 | 2 |
114 | 1 | 1 | FALSE
Check the script below:
IF OBJECT_ID('[Sample]') IS NOT NULL
DROP TABLE [Sample]
CREATE TABLE [Sample]
(
[Key] INT NOT NULL,
Attr1 INT NOT NULL,
Attr2 INT NOT NULL,
Attr3 INT NOT NULL
)
GO
INSERT INTO [Sample] ([Key],Attr1,Attr2,Attr3)
VALUES (111,62,1,1),
(111,62,2,2),
(112,62,1,4),
(113,62,1,4),
(113,62,2,2),
(114,62,1,1)
--EXPECTED_RESULT:
SELECT S.*,CASE WHEN T.[KEY] IS NOT NULL THEN 'TRUE' ELSE 'FALSE' END AS Expected_Result
FROM [Sample] S LEFT JOIN
(
SELECT T.[KEY] FROM
(
SELECT x.*,
ROW_NUMBER() OVER( PARTITION BY x.[KEY],x.attr1 ORDER BY x.attr2,x.attr3) AS r_no
--,CASE WHEN (x.attr2 = 1 AND x.attr3 = 1) OR (x.attr2 = 2 AND x.attr3 = 2)
--then 'TRUE' else 'FALSE' end as expected_result
FROM [Sample] x WHERE x.attr2=x.attr3
) T WHERE T.r_no>1
) T ON S.[KEY]=T.[KEY]
This query:
select [key] from tablename
group by [key]
having sum(case when atr2 = '1' and atr3 = '1' then 1 else 0 end) > 0
and sum(case when atr2 = '2' and atr3 = '2' then 1 else 0 end) > 0
and count(*) = 2
uses conditional aggregation to find the keys for which the result should be true.
So join it to the table like this:
select t.*,
case when g.[key] is null then 'FALSE' else 'TRUE' end result
from tablename t left join (
select [key] from tablename
group by [key]
having sum(case when atr2 = '1' and atr3 = '1' then 1 else 0 end) > 0
and sum(case when atr2 = '2' and atr3 = '2' then 1 else 0 end) > 0
and count(*) = 2
) g on g.[key] = t.[key]
See the demo.
Results:
> Key | Atr2 | Atr3 | result
> --: | ---: | ---: | :-----
> 111 | 1 | 1 | TRUE
> 111 | 2 | 2 | TRUE
> 112 | 1 | 4 | FALSE
> 113 | 1 | 4 | FALSE
> 113 | 2 | 2 | FALSE
> 114 | 1 | 1 | FALSE
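The joined query can be reproduced on the sample rows with Python's sqlite3 module. This is a sketch under assumptions: the table is named `sample` here, the columns are stored as integers, and `"key"` is quoted with standard double quotes because SQLite is used in place of SQL Server's brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE sample ("key" INTEGER, atr2 INTEGER, atr3 INTEGER)')
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(111, 1, 1), (111, 2, 2), (112, 1, 4),
     (113, 1, 4), (113, 2, 2), (114, 1, 1)],
)

# The subquery keeps keys that have a (1,1) row, a (2,2) row and exactly
# two rows in total; the LEFT JOIN then labels every row of the table.
query = """
SELECT t."key", t.atr2, t.atr3,
       CASE WHEN g."key" IS NULL THEN 'FALSE' ELSE 'TRUE' END AS result
FROM sample t
LEFT JOIN (
    SELECT "key"
    FROM sample
    GROUP BY "key"
    HAVING SUM(CASE WHEN atr2 = 1 AND atr3 = 1 THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN atr2 = 2 AND atr3 = 2 THEN 1 ELSE 0 END) > 0
       AND COUNT(*) = 2
) g ON g."key" = t."key"
ORDER BY t."key", t.atr2
"""
labelled = conn.execute(query).fetchall()
for row in labelled:
    print(row)
```

Key 113 is FALSE because it lacks the (1,1) row, and key 114 is FALSE because it lacks the (2,2) row.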

Parse column B content into logical parts according to column A

I have a table like this
sessionId | hostname
------ | ------
a1 | domain1
a1 | domain2
a2 | domain1
a3 | domain1
a3 | domain2
a4 | domain2
What I want is to build a logical table containing the following
sessionId | only domain1 | only domain2 | domain1 OR domain2 | domain1 AND domain2
-----------|----------------|--------------|--------------------|--------------------
a1 | 1 | 1 | 1 | 1
a2 | 1 | 0 | 1 | 0
a3 | 1 | 1 | 1 | 1
a4 | 0 | 1 | 1 | 0
I guess there's a simple solution for this, but I can't get my head around it :(
You can use conditional aggregation:
select sessionid,
(case when sum(case when hostname = 'domain1' then 1 else 0 end) > 0
then 1 else 0
end) as domain1,
(case when sum(case when hostname = 'domain2' then 1 else 0 end) > 0
then 1 else 0
end) as domain2,
(case when sum(case when hostname = 'domain1' then 1 else 0 end) > 0 or
sum(case when hostname = 'domain2' then 1 else 0 end) > 0
then 1 else 0
end) as either,
(case when sum(case when hostname = 'domain1' then 1 else 0 end) > 0 and
sum(case when hostname = 'domain2' then 1 else 0 end) > 0
then 1 else 0
end) as both
from t
group by sessionid;
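A compact variant of the same conditional aggregation, sketched with Python's sqlite3 module on the question's rows. It first counts each hostname per session in a subquery and then derives the four flags from those counts, which avoids repeating the SUM expressions. The table name `t` comes from the answer; the aliases `n1`, `n2`, `counts` and `has_*` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sessionId TEXT, hostname TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a1", "domain1"), ("a1", "domain2"), ("a2", "domain1"),
     ("a3", "domain1"), ("a3", "domain2"), ("a4", "domain2")],
)

# Inner query: per-session row counts for each hostname.
# Outer query: turn the counts into 0/1 flags.
query = """
SELECT sessionId,
       CASE WHEN n1 > 0 THEN 1 ELSE 0 END            AS has_domain1,
       CASE WHEN n2 > 0 THEN 1 ELSE 0 END            AS has_domain2,
       CASE WHEN n1 > 0 OR n2 > 0 THEN 1 ELSE 0 END  AS has_either,
       CASE WHEN n1 > 0 AND n2 > 0 THEN 1 ELSE 0 END AS has_both
FROM (
    SELECT sessionId,
           SUM(CASE WHEN hostname = 'domain1' THEN 1 ELSE 0 END) AS n1,
           SUM(CASE WHEN hostname = 'domain2' THEN 1 ELSE 0 END) AS n2
    FROM t
    GROUP BY sessionId
) counts
ORDER BY sessionId
"""
session_flags = conn.execute(query).fetchall()
for row in session_flags:
    print(row)
```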
Try this :
Declare @Table as Table (sessionId varchar(100),hostname varchar(100))
Insert into @Table Values
('a1','domain1'),
('a1','domain2'),
('a2','domain1'),
('a3','domain1'),
('a3','domain2'),
('a4','domain2')
Select distinct T.sessionId,
case when s1.sessionid is null then 0 else 1 end [only domain1],
case when s2.sessionid is null then 0 else 1 end [only domain2],
case when
(
case when s1.sessionid is null then 0 else 1 end = 1 or
case when s2.sessionid is null then 0 else 1 end = 1
) then 1 else 0 end [domain1 OR domain2],
case when
(
case when s1.sessionid is null then 0 else 1 end = 1 and
case when s2.sessionid is null then 0 else 1 end = 1
) then 1 else 0 end [domain1 AND domain2]
from @Table T
Left Join
(
Select sessionId From @Table where hostname = 'domain1'
) s1 on s1.sessionId = T.sessionId
Left Join
(
Select sessionId From @Table where hostname = 'domain2'
) s2 on s2.sessionId = T.sessionId
For BigQuery Standard SQL
#standardSQL
SELECT
sessionId,
SIGN(COUNTIF(hostname='domain1')) only_domain1,
SIGN(COUNTIF(hostname='domain2')) only_domain2,
SIGN(COUNTIF(hostname='domain1')+COUNTIF(hostname='domain2')) domain1_or_domain2,
SIGN(COUNTIF(hostname='domain1')*COUNTIF(hostname='domain2')) domain1_and_domain2
FROM `yourproject.yourdataset.yourtable`
GROUP BY sessionId
You can test and play with it using the dummy data from your question:
#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
SELECT 'a1' sessionId, 'domain1' hostname UNION ALL
SELECT 'a1', 'domain2' UNION ALL
SELECT 'a2', 'domain1' UNION ALL
SELECT 'a3', 'domain1' UNION ALL
SELECT 'a3', 'domain2' UNION ALL
SELECT 'a4', 'domain2'
)
SELECT
sessionId,
SIGN(COUNTIF(hostname='domain1')) only_domain1,
SIGN(COUNTIF(hostname='domain2')) only_domain2,
SIGN(COUNTIF(hostname='domain1')+COUNTIF(hostname='domain2')) domain1_or_domain2,
SIGN(COUNTIF(hostname='domain1')*COUNTIF(hostname='domain2')) domain1_and_domain2
FROM `yourproject.yourdataset.yourtable`
GROUP BY sessionId
ORDER BY sessionId

SQL separate the count of one column

I have a SQL table that contains three columns:
userId
userName
item
and I created this SQL query, which counts all the item types for one user:
select
count(ItemID) as 'count of all items types',
userId,
userName
from
userTable
where
ItemID in (2, 3, 4)
and userId = 1
group by
userId, userName
The result will be like this:
+--------+----------+--------------------------+
| userId | userName | count of all items types |
+--------+----------+--------------------------+
| 1 | kim | 25 |
and I am looking for a way to split the count by item type, so the result should look like this:
+--------+----------+----------------+----------------+-----------------+
| userId | userName | count of item1 | count of item2 | count of item3 |
+--------+----------+----------------+----------------+-----------------+
| 1 | kim | 10 | 10 | 5 |
SELECT
userID,
userName,
SUM(CASE WHEN ItemID = 2 THEN 1 ELSE 0 END) AS count_of_item1,
SUM(CASE WHEN ItemID = 3 THEN 1 ELSE 0 END) AS count_of_item2,
SUM(CASE WHEN ItemID = 4 THEN 1 ELSE 0 END) AS count_of_item3
FROM
My_Table
GROUP BY
userID,
userName
This is called conditional aggregation. Use CASE for this.
With COUNT:
select
count(case when ItemID = 1 then 1 end) as count_item1,
count(case when ItemID = 2 then 1 end) as count_item2,
count(case when ItemID = 3 then 1 end) as count_item3
...
(then 1 could also be anything else except null, e.g. then 'count me'. This works because COUNT counts non-null values and when omitting the ELSE in CASE WHEN you get null. You could also explicitly add else null.)
Or with SUM:
select
sum(case when ItemID = 1 then 1 else 0 end) as count_item1,
sum(case when ItemID = 2 then 1 else 0 end) as count_item2,
sum(case when ItemID = 3 then 1 else 0 end) as count_item3
...
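The COUNT and SUM forms agree whenever at least one row is aggregated, but they differ when the filter matches no rows: COUNT returns 0, while SUM returns NULL. A small sketch with Python's sqlite3 module illustrates both cases. The sample rows for user 1 and the nonexistent user 99 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE userTable (userId INTEGER, userName TEXT, ItemID INTEGER)")
conn.executemany(
    "INSERT INTO userTable VALUES (?, ?, ?)",
    [(1, "kim", 2), (1, "kim", 2), (1, "kim", 3),
     (1, "kim", 4), (1, "kim", 4), (1, "kim", 4)],
)

# Same conditional count written both ways, with no GROUP BY so that
# exactly one row comes back even when the WHERE clause matches nothing.
pivot = """
SELECT COUNT(CASE WHEN ItemID = 2 THEN 1 END)      AS count_via_count,
       SUM(CASE WHEN ItemID = 2 THEN 1 ELSE 0 END) AS count_via_sum
FROM userTable
WHERE userId = ?
"""
with_rows = conn.execute(pivot, (1,)).fetchone()
no_rows = conn.execute(pivot, (99,)).fetchone()
print(with_rows)  # both forms give the same count
print(no_rows)    # COUNT gives 0, SUM gives NULL (None in Python)
```

If a 0 is needed in the empty case, the SUM can be wrapped in COALESCE(..., 0), or the COUNT form can be used.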
This is how you would do it :
select userId,
username,
SUM(CASE WHEN ItemID = '2' THEN 1 ELSE 0 END) AS Item2_Cnt,
SUM(CASE WHEN ItemID = '3' THEN 1 ELSE 0 END) AS Item3_Cnt,
SUM(CASE WHEN ItemID = '4' THEN 1 ELSE 0 END) AS Item4_Cnt
FROM userTable
GROUP BY userID, userName

SQL check if column contains specific values

I have a table like this:
id | Values
------------------
1 | a
1 | b
1 | c
1 | d
1 | e
2 | a
2 | a
2 | c
2 | c
2 | e
3 | a
3 | c
3 | b
3 | d
Now I want to know which id contains at least one of a, one of b and one of c.
This is the result I want:
id
--------
1
3
One method is aggregation with having:
select id
from t
where values in ('a', 'b', 'c')
group by id
having count(distinct values) = 3;
If you wanted more flexibility with the counts of each value:
having sum(case when values = 'a' then 1 else 0 end) >= 1 and
sum(case when values = 'b' then 1 else 0 end) >= 1 and
sum(case when values = 'c' then 1 else 0 end) >= 1
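Both variants can be tried on the question's rows with Python's sqlite3 module. In this sketch the column is named `val`, which is an assumption: VALUES is a reserved word in SQLite and in many other databases, so a column with that name would have to be quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (1, "d"), (1, "e"),
     (2, "a"), (2, "a"), (2, "c"), (2, "c"), (2, "e"),
     (3, "a"), (3, "c"), (3, "b"), (3, "d")],
)

# Variant 1: keep only a/b/c rows, then require three distinct values per id.
distinct_query = """
SELECT id FROM t
WHERE val IN ('a', 'b', 'c')
GROUP BY id
HAVING COUNT(DISTINCT val) = 3
ORDER BY id
"""

# Variant 2: one conditional sum per required value.
sum_query = """
SELECT id FROM t
GROUP BY id
HAVING SUM(CASE WHEN val = 'a' THEN 1 ELSE 0 END) >= 1
   AND SUM(CASE WHEN val = 'b' THEN 1 ELSE 0 END) >= 1
   AND SUM(CASE WHEN val = 'c' THEN 1 ELSE 0 END) >= 1
ORDER BY id
"""
ids_distinct = [r[0] for r in conn.execute(distinct_query)]
ids_sum = [r[0] for r in conn.execute(sum_query)]
print(ids_distinct, ids_sum)
```

Id 2 is excluded by both variants because it has no 'b' row, even though it has five rows in total.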
You can use grouping:
SELECT id
FROM your_table
GROUP BY id
HAVING SUM(CASE WHEN value = 'a' THEN 1 ELSE 0 END) >= 1
AND SUM(CASE WHEN value = 'b' THEN 1 ELSE 0 END) >= 1
AND SUM(CASE WHEN value = 'c' THEN 1 ELSE 0 END) >= 1;
or using COUNT:
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(CASE WHEN value = 'a' THEN 1 END) >= 1
AND COUNT(CASE WHEN value = 'b' THEN 1 END) >= 1
AND COUNT(CASE WHEN value = 'c' THEN 1 END) >= 1;