SQL to get count of distinct rows based on different rules

Say you have a table like:
| key | status |
| --- | ------ |
| 3 | A |
| 4 | A |
| 4 | C |
| 5 | B |
| 6 | B |
| 6 | C |
| 7 | A |
| 7 | B |
I want a query that returns, in a single row, the number of distinct keys that have a specific status, with some priority rules applied. The rule is different for each column, something like:
Column a_count = count of any distinct key that has a status of A
Column b_count = count of any distinct key that has a status of B, but where the same key does not also appear with a status of A
Column c_count = count of any distinct key that has a status of C, but where the same key does not also appear with a status of A or B
The point being that the total of all counts should equal the total number of distinct keys in the source table. In my sample data above, the results should be:
| a_count | b_count | c_count |
| ------- | ------- | ------- |
| 3 | 2 | 0 |

You should be able to do this pivot with CASE expressions and NOT EXISTS. Note that COUNT counts every non-NULL value, so the CASE must not return 0 in an ELSE branch. Returning the key and counting DISTINCT values also keeps a key from being counted twice.
SELECT Count (DISTINCT CASE
         WHEN a.status = 'A' THEN a.KEY
       END) AS a_count,
       Count (DISTINCT CASE
         WHEN a.status = 'B'
              AND NOT EXISTS (SELECT 1
                              FROM mytable b
                              WHERE a.KEY = b.KEY
                                AND b.status = 'A') THEN a.KEY
       END) AS b_count,
       Count (DISTINCT CASE
         WHEN a.status = 'C'
              AND NOT EXISTS (SELECT 1
                              FROM mytable c
                              WHERE a.KEY = c.KEY
                                AND c.status IN ( 'A', 'B' )) THEN a.KEY
       END) AS c_count
FROM mytable a
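To check the idea against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite is an assumption made only so the example runs anywhere; other engines may differ (some do not allow a subquery inside an aggregate). It counts distinct keys rather than rows, so the three columns add up to the number of distinct keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because KEY is a keyword in many SQL dialects.
conn.execute('CREATE TABLE mytable ("key" INTEGER, status TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(3, "A"), (4, "A"), (4, "C"), (5, "B"),
     (6, "B"), (6, "C"), (7, "A"), (7, "B")],
)

query = """
SELECT COUNT(DISTINCT CASE WHEN a.status = 'A' THEN a."key" END) AS a_count,
       COUNT(DISTINCT CASE WHEN a.status = 'B'
                            AND NOT EXISTS (SELECT 1 FROM mytable b
                                            WHERE b."key" = a."key"
                                              AND b.status = 'A')
                           THEN a."key" END) AS b_count,
       COUNT(DISTINCT CASE WHEN a.status = 'C'
                            AND NOT EXISTS (SELECT 1 FROM mytable c
                                            WHERE c."key" = a."key"
                                              AND c.status IN ('A', 'B'))
                           THEN a."key" END) AS c_count
FROM mytable a
"""
counts = conn.execute(query).fetchone()
distinct_keys = conn.execute(
    'SELECT COUNT(DISTINCT "key") FROM mytable'
).fetchone()[0]

print(counts)                        # (a_count, b_count, c_count)
print(sum(counts) == distinct_keys)  # the counts should cover every key once
```

A CASE without an ELSE branch yields NULL for non-matching rows, and COUNT skips NULLs, which is what makes this work.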

Related

Expanding information from one row to all similarly grouped rows in SQL

I am not sure of the logic required to accomplish this, but I want to take a table like this...
+----+------+
| Id | Type |
+----+------+
| 10 | A |
| 10 | B |
| 10 | C |
| 20 | A |
| 20 | C |
+----+------+
...and end up with a table like this...
+----+------+---+---+---+
| Id | Type | A | B | C |
+----+------+---+---+---+
| 10 | A | 1 | 1 | 1 |
| 10 | B | 1 | 1 | 1 |
| 10 | C | 1 | 1 | 1 |
| 20 | A | 1 | 0 | 1 |
| 20 | C | 1 | 0 | 1 |
+----+------+---+---+---+
...where each Id will have new columns created to consolidate information about Type into every row of that Id. Since 10 has a row of types A, B, and C, then all rows that have an ID of 10 should have a 1/true in the new columns A, B and C.
I know how to do this on a per-row basis, but can't wrap my head around how to consolidate the information from multiple rows into each row of the same ID.
Try the logic below, which uses one correlated subquery per new column:
SELECT *,
(SELECT COUNT(DISTINCT Type) FROM your_table B WHERE B.ID = A.Id and B.Type = 'A') A,
(SELECT COUNT(DISTINCT Type) FROM your_table C WHERE C.ID = A.Id and C.Type = 'B') B,
(SELECT COUNT(DISTINCT Type) FROM your_table D WHERE D.ID = A.Id and D.Type = 'C') C
FROM your_table A
Another option, using window functions:
SELECT *,
SUM(CASE WHEN Type= 'A' THEN 1 ELSE 0 END) OVER(PARTITION BY Id) A,
SUM(CASE WHEN Type= 'B' THEN 1 ELSE 0 END) OVER(PARTITION BY Id) B,
SUM(CASE WHEN Type= 'C' THEN 1 ELSE 0 END) OVER(PARTITION BY Id) C
FROM your_table
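The window-function variant can be tried on the sample data with a small sketch like the one below. It assumes Python's sqlite3 module linked against SQLite 3.25 or later (the first version with window functions). It uses MAX instead of SUM, which is my own substitution so that the flag stays 0/1 even if an Id has the same Type more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Id INTEGER, Type TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(10, "A"), (10, "B"), (10, "C"), (20, "A"), (20, "C")],
)

# Each window covers all rows sharing the same Id, so the flag computed
# from any one row of the group is repeated on every row of that group.
query = """
SELECT Id, Type,
       MAX(CASE WHEN Type = 'A' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS A,
       MAX(CASE WHEN Type = 'B' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS B,
       MAX(CASE WHEN Type = 'C' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS C
FROM your_table
ORDER BY Id, Type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike GROUP BY, the OVER (PARTITION BY ...) form keeps every input row, which is why the result has five rows rather than two.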

Count amount of duplicates of specific value in SQL

I have this table:
ID | mssg
--------
A | 1
B | 1
C | 1
A | 3
B | 2
C | 2
D | 5
...|...
I would like to count how many distinct messages of each type there are and group them for each correlated ID.
I currently have a working script that successfully counts the total amount of duplicate messages for each ID:
SELECT DISTINCT ID, COUNT(ID) AS mssg_count
FROM my_table
GROUP BY ID
ORDER BY ID
But I want to modify it to count for specific messages for when mssg = '1' or mssg = '2'.
I have tried something like this:
SELECT DISTINCT ID, (SELECT COUNT(mssg)
WHERE mssg = '1'
HAVING COUNT(mssg) > 1)
GROUP BY ID
FROM my_table
But I am inexperienced with SQL and get syntax errors.
My expected table would look something like this:
ID | Total number of messages | Amnt of messages where mssg = 1 | mssg = 2..
------------------------------------------------------------------------------
A | 2 | 1 | 0
B | 2 | 1 | 1
C | 2 | 1 | 1
D | 1 | 0 | 0
...|..........................|.................................|............
For an individual column, you can use conditional aggregation:
sum(CASE WHEN mssg = '1' THEN 1 ELSE 0 END)
Full SQL:
SELECT
ID,
sum(CASE WHEN mssg = '1' THEN 1 ELSE 0 END) AS mssg_1_count
FROM my_table
GROUP BY ID
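To get the whole expected table, the same pattern can be repeated per message value next to a plain COUNT(*). The sketch below runs this on the visible sample rows using Python's sqlite3 module; the column aliases (total, mssg_1, mssg_2) are names I made up, and mssg is stored as text only because the question compares it to '1' and '2'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID TEXT, mssg TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("A", "1"), ("B", "1"), ("C", "1"), ("A", "3"),
     ("B", "2"), ("C", "2"), ("D", "5")],
)

# One row per ID: total messages plus one conditional sum per value.
query = """
SELECT ID,
       COUNT(*)                                    AS total,
       SUM(CASE WHEN mssg = '1' THEN 1 ELSE 0 END) AS mssg_1,
       SUM(CASE WHEN mssg = '2' THEN 1 ELSE 0 END) AS mssg_2
FROM my_table
GROUP BY ID
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

GROUP BY already yields one row per ID, so the DISTINCT in the original query is not needed.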

SQL select distinct when one column in and another column greater than

Consider the following dataset:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 1 | a | 0.2 |
| 1 | b | 8 |
| 1 | c | 3.5 |
| 1 | d | 2.2 |
| 2 | b | 4 |
| 2 | c | 0.5 |
| 2 | d | 6 |
| 3 | a | 2 |
| 3 | b | 4 |
| 3 | c | 3.6 |
| 3 | d | 0.2 |
+---------------------+
I'm trying to develop a SQL select statement that returns the top or distinct ID where NAME 'a' and 'b' both exist and both of the corresponding VALUEs are >= 1. Thus, the desired output would be:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 3 | a | 2 |
+---------------------+
Appreciate any assistance anyone can provide.
You can use the MIN window function with a CASE expression to put the 'a' and 'b' values of each ID on every row, then filter on them:
SELECT * FROM (
SELECT *,
MIN(CASE WHEN NAME = 'a' THEN [value] end) OVER(PARTITION BY ID) aVal,
MIN(CASE WHEN NAME = 'b' THEN [value] end) OVER(PARTITION BY ID) bVal
FROM T
) t1
WHERE aVal >= 1 and bVal >= 1 and NAME = 'a'
This seems like a group by and having query:
select id
from t
where name in ('a', 'b')
group by id
having count(*) = 2 and
min(value) >= 1;
No subqueries or joins are necessary.
The where clause filters the data to only look at the "a" and "b" records. The count(*) = 2 checks that both exist. If you can have duplicates, then use count(distinct name) = 2.
Then, you want the minimum value to be at least 1, so that is the final condition.
I am not sure why your desired results have the "a" row, but if you really want it, you can change the select to:
select id, 'a' as name,
max(case when name = 'a' then value end) as value
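Here is a quick way to try the GROUP BY / HAVING approach on the sample dataset, sketched with Python's sqlite3 module (SQLite is only an assumption for the sake of a runnable example). It uses the count(distinct name) form so duplicate rows would not break the "both exist" check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 0.2), (1, "b", 8), (1, "c", 3.5), (1, "d", 2.2),
     (2, "b", 4), (2, "c", 0.5), (2, "d", 6),
     (3, "a", 2), (3, "b", 4), (3, "c", 3.6), (3, "d", 0.2)],
)

# Keep only the 'a'/'b' rows, then require both names and a minimum >= 1.
base = """
FROM t
WHERE name IN ('a', 'b')
GROUP BY id
HAVING COUNT(DISTINCT name) = 2 AND MIN(value) >= 1
"""
ids = conn.execute("SELECT id " + base + " ORDER BY id").fetchall()

# Variant that also returns the 'a' row's value, as in the desired output.
a_rows = conn.execute(
    "SELECT id, 'a' AS name, "
    "MAX(CASE WHEN name = 'a' THEN value END) AS value " + base
).fetchall()

print(ids)
print(a_rows)
```

ID 1 is rejected because its 'a' value is below 1, and ID 2 because it has no 'a' row at all.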
You can use IN with a subquery:
select top 1 * from t
where t.id in
(
select id from t
where name in ('a','b')
group by id
having sum(case when value >= 1 then 1 else 0 end) >= 2
)
order by id, name

SQL Grouping entries with a different value

Let's assume I have a report that displays an ID and VALUE from different tables
| # | ID | VALUE |
|---|----|-------|
| 1 | 1 | 1 |
| 2 | 1 | 0 |
| 3 | 1 | 1 |
| 4 | 2 | 0 |
| 5 | 2 | 0 |
My goal is to display this table grouped by ID, with a single VALUE per ID. My rule for combining VALUEs would be "If the VALUEs for an ID contain at least one '1' then display '1', otherwise display '0'".
My current SQL is (simplified)
SELECT
TABLE_A.ID,
CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END AS VALUE
FROM TABLE_A, TABLE_B, TABLE_C
GROUP BY
TABLE_A.ID,
(CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END)
The output is following
| # | ID | VALUE |
|---|----|-------|
| 1 | 1 | 1 |
| 2 | 1 | 0 |
| 3 | 2 | 0 |
Which is half way to the output I want
| # | ID | VALUE |
|---|----|-------|
| 1 | 1 | 1 |
| 2 | 2 | 0 |
So my Question is: How do I extend my current SQL (or change it completely) to get my desired output?
If the FOREIGN_VALUE column only ever contains 0 and 1, then the max() function, as mentioned by HoneyBadger in the comments, will do what you need:
SELECT
ID,
MAX(FOREIGN_VALUE) AS VALUE
FROM (SELECT
ID,
CASE WHEN FOREIGN_VALUE = 1
THEN 1
ELSE 0
END AS FOREIGN_VALUE
FROM TABLE,
FOREIGN_TABLE) sub
GROUP BY
ID;
Assuming value is always 0 or 1, you can do:
select id, max(value) as value
from t
group by id;
If value can take on other values:
select id,
max(case when value = 1 then 1 else 0 end) as value
from t
group by id;
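Both forms are easy to check on the sample report rows. This is a minimal sketch with Python's sqlite3 module, assuming the report has already been flattened into a single table t(id, value), which is a simplification of the multi-table setup in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (1, 0), (1, 1), (2, 0), (2, 0)],
)

# MAX over 0/1 flags acts like a logical OR within each group.
plain_max = conn.execute(
    "SELECT id, MAX(value) AS value FROM t GROUP BY id ORDER BY id"
).fetchall()

# The CASE form first maps every row to 0/1, so it also works when
# value can hold something other than 0 and 1.
case_max = conn.execute(
    "SELECT id, MAX(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS value "
    "FROM t GROUP BY id ORDER BY id"
).fetchall()

print(plain_max)
print(case_max)
```

The key point is that the CASE goes inside the aggregate and only id appears in GROUP BY. Grouping by the CASE result as well is what produced the extra row in the question.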

Get ID if table has one or more row exist for a condition

Suppose I have a table as below:
ID | Account| Status
---+--------+-------
1 | acct1 | A
1 | acct2 | S
1 | acct3 | C
2 | acct4 | C
2 | acct5 | C
3 | acct6 | A
3 | acct7 | C
4 | acct8 | C
4 | acct9 | C
4 | acct10 | C
Condition: return an ID only if none of its accounts has status 'A' or 'S'.
For this case, I only want ID '2' and '4' to be returned.
You could use HAVING and conditional SUM:
SELECT ID
FROM tab
GROUP BY ID
HAVING SUM(CASE WHEN Status IN ('A', 'S') THEN 1 ELSE 0 END) = 0
Alternatively, first select the IDs that have at least one 'A' or 'S' record, then return the distinct IDs that are not in that set:
Select distinct(ID) as ID
from table_name where id not in
(
select ID from table_name where status in('A', 'S')
)
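The two answers can be compared side by side on the sample table. This sketch uses Python's sqlite3 module and the table name tab from the first answer for both queries (the second answer's table_name is just a placeholder for the same table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID INTEGER, Account TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(1, "acct1", "A"), (1, "acct2", "S"), (1, "acct3", "C"),
     (2, "acct4", "C"), (2, "acct5", "C"),
     (3, "acct6", "A"), (3, "acct7", "C"),
     (4, "acct8", "C"), (4, "acct9", "C"), (4, "acct10", "C")],
)

# Approach 1: count the offending rows per ID and keep IDs with none.
having_ids = conn.execute("""
    SELECT ID FROM tab
    GROUP BY ID
    HAVING SUM(CASE WHEN Status IN ('A', 'S') THEN 1 ELSE 0 END) = 0
    ORDER BY ID
""").fetchall()

# Approach 2: exclude every ID that appears with an offending status.
not_in_ids = conn.execute("""
    SELECT DISTINCT ID FROM tab
    WHERE ID NOT IN (SELECT ID FROM tab WHERE Status IN ('A', 'S'))
    ORDER BY ID
""").fetchall()

print(having_ids)
print(not_in_ids)
```

One thing to keep in mind with the NOT IN form: if the subquery can return a NULL ID, NOT IN matches nothing, so the HAVING form (or NOT EXISTS) is the safer choice when ID is nullable.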