Count amount of duplicates of specific value in SQL - sql

I have this table:
ID | mssg
--------
A | 1
B | 1
C | 1
A | 3
B | 2
C | 2
D | 5
...|...
I would like to count how many distinct messages of each type there are and group them for each correlated ID.
I currently have a working script that successfully counts the total amount of duplicate messages for each ID:
SELECT DISTINCT ID, COUNT(ID) AS mssg_count
FROM my_table
GROUP BY ID
ORDER BY ID
But I want to modify it to count specific messages, for example where mssg = '1' or mssg = '2'.
I have tried something like this:
SELECT DISTINCT ID, (SELECT COUNT(mssg)
WHERE mssg = '1'
HAVING COUNT(mssg) > 1)
GROUP BY ID
FROM my_table
But I am inexperienced with SQL and get syntax errors.
My expected table would look something like this:
ID | Total number of messages | Amnt of messages where mssg = 1 | mssg = 2..
------------------------------------------------------------------------------
A | 2 | 1 | 0
B | 2 | 1 | 1
C | 2 | 1 | 1
D | 1 | 0 | 0
...|..........................|.................................|............

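For each individual column, you can use a conditional sum:
sum(CASE WHEN mssg = '1' THEN 1 ELSE 0 END)
Full SQL:
SELECT
ID,
COUNT(*) AS total_messages,
sum(CASE WHEN mssg = '1' THEN 1 ELSE 0 END) AS mssg_1_count,
sum(CASE WHEN mssg = '2' THEN 1 ELSE 0 END) AS mssg_2_count
FROM my_table
GROUP BY ID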
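To check the conditional-sum approach against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (an assumption, since the question does not name a DBMS). The column aliases are illustrative only.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID TEXT, mssg TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("A", "1"), ("B", "1"), ("C", "1"), ("A", "3"),
     ("B", "2"), ("C", "2"), ("D", "5")],
)

# One conditional SUM per message type; COUNT(*) gives the total per ID.
query = """
SELECT ID,
       COUNT(*) AS total_messages,
       SUM(CASE WHEN mssg = '1' THEN 1 ELSE 0 END) AS mssg_1_count,
       SUM(CASE WHEN mssg = '2' THEN 1 ELSE 0 END) AS mssg_2_count
FROM my_table
GROUP BY ID
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample rows this yields one tuple per ID, matching the expected table in the question (for instance A has 2 messages in total, 1 with mssg = 1 and 0 with mssg = 2).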

Related

SQL to get count of distinct rows based on different rules

Say you have a table like:
| key | status |
| --- | ------ |
| 3 | A |
| 4 | A |
| 4 | C |
| 5 | B |
| 6 | B |
| 6 | C |
| 7 | A |
| 7 | B |
I want a query that returns, in a single row, the count of the number of rows that contain a specific status, but applying some priority rules. The rules would be different for each row and something like:
Column a_count = count of any distinct key that has a status of A
Column b_count = count of any distinct key that has a status of B, but where the same key does not also appear with a status of A
Column c_count = count of any distinct key that has a status of C, but where the same key does not also appear with a status of A or B
The point being that the total of all counts should equal the total number of distinct keys in the source table. In my sample data above, the results should be:
| a_count | b_count | c_count |
| ------- | ------- | ------- |
| 3 | 2 | 0 |
You should be able to do your pivot with CASE expressions and NOT EXISTS. COUNT ignores only NULLs, so leave out the ELSE branch and count the distinct keys:
SELECT Count (DISTINCT CASE
WHEN status = 'A' THEN a.KEY
END) AS a_count,
Count (DISTINCT CASE
WHEN status = 'B'
AND NOT EXISTS (SELECT 1
FROM mytable b
WHERE a.KEY = b.KEY
AND b.status = 'A') THEN a.KEY
END) AS b_count,
Count (DISTINCT CASE
WHEN status = 'C'
AND NOT EXISTS (SELECT 1
FROM mytable c
WHERE a.KEY = c.KEY
AND c.status IN ( 'A', 'B' )) THEN a.KEY
END) AS c_count
FROM mytable a
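An alternative formulation, sketched under the assumption that SQLite (through Python's sqlite3) is an acceptable stand-in for your database: first collapse the table to one row per key with presence flags, then apply the priority rules to those flags. The names has_a, has_b, has_c and per_key are hypothetical. This is not the query above, just another way to reach the same counts without correlated subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because KEY is a keyword in several SQL dialects.
conn.execute('CREATE TABLE mytable ("key" INTEGER, status TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(3, "A"), (4, "A"), (4, "C"), (5, "B"),
     (6, "B"), (6, "C"), (7, "A"), (7, "B")],
)

# Inner query: one row per key with 0/1 flags for each status.
# Outer query: each key is counted in exactly one bucket, by priority A > B > C.
query = """
SELECT SUM(has_a) AS a_count,
       SUM(CASE WHEN has_a = 0 AND has_b = 1 THEN 1 ELSE 0 END) AS b_count,
       SUM(CASE WHEN has_a = 0 AND has_b = 0 AND has_c = 1 THEN 1 ELSE 0 END) AS c_count
FROM (
    SELECT "key",
           MAX(CASE WHEN status = 'A' THEN 1 ELSE 0 END) AS has_a,
           MAX(CASE WHEN status = 'B' THEN 1 ELSE 0 END) AS has_b,
           MAX(CASE WHEN status = 'C' THEN 1 ELSE 0 END) AS has_c
    FROM mytable
    GROUP BY "key"
) AS per_key
"""
counts = conn.execute(query).fetchone()
print(counts)  # (3, 2, 0)
```

Because every key lands in exactly one bucket, the three counts add up to the number of distinct keys (5 in the sample data).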

Suggest SQL query for given use case

Original Table
Id | Time | Status
------------------
1 | 5 | T
1 | 6 | F
2 | 3 | F
1 | 2 | F
2 | 4 | T
3 | 7 | F
2 | 3 | T
3 | 1 | F
4 | 7 | H
4 | 6 | S
4 | 5 | F
4 | 4 | T
5 | 5 | S
5 | 6 | F
Expected Table
Id | Time | Status
------------------
1 | 6 | F
3 | 7 | F
4 | 5 | F
I want all the distinct ids that have status F, with the maximum time. If for any id the status is T at the given maximum time, that id should not be picked. Also, only those ids that have at least one T should be picked. For example, 4 will not be picked as it doesn't have any 'T' as status.
Please help in writing the SQL query.
You can use EXISTS and NOT EXISTS in the WHERE clause:
select t.*
from tablename t
where t.status = 'F'
and exists (select 1 from tablename where id = t.id and status = 'T')
and not exists (
select 1
from tablename
where id = t.id and status in ('F', 'T') and time > t.time
)
See the demo.
Results:
| Id | Time | Status |
| --- | ---- | ------ |
| 1 | 6 | F |
| 4 | 5 | F |
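The EXISTS / NOT EXISTS query can be tried against the question's sample rows with a short script. This is a sketch that assumes SQLite (via Python's built-in sqlite3) behaves like the target database for this query; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, time INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(1, 5, "T"), (1, 6, "F"), (2, 3, "F"), (1, 2, "F"), (2, 4, "T"),
     (3, 7, "F"), (2, 3, "T"), (3, 1, "F"), (4, 7, "H"), (4, 6, "S"),
     (4, 5, "F"), (4, 4, "T"), (5, 5, "S"), (5, 6, "F")],
)

# Keep an F row only if the id has some T row and no later F or T row.
query = """
SELECT t.*
FROM tablename t
WHERE t.status = 'F'
  AND EXISTS (SELECT 1 FROM tablename WHERE id = t.id AND status = 'T')
  AND NOT EXISTS (
      SELECT 1 FROM tablename
      WHERE id = t.id AND status IN ('F', 'T') AND time > t.time
  )
ORDER BY t.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 6, 'F'), (4, 5, 'F')]
```

Rows with status H or S are ignored by the NOT EXISTS check, which is why id 4 is kept even though it has later H and S rows.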
Try the below way -
select * from tablename t
where time = (select max(time) from tablename t1 where t.id=t1.id and Status='F')
and Status='F'
The following should work:
select id,max(time) as time,status
from tablename
where status='F'
group by id,status
select id, max(time), status
from stuff s
where status = 'F'
and id not in (
select id
from stuff s2
where s2.id = s.id
and s2.time > s.time
and s2.status = 'T')
group by id, status;
You can see the Fiddle here.
As I understand it, you want to find the highest time for each ID (max(time)) where the status is F, but only if there isn't a later record where the status is 'T'. The sub query filters out records where there exists a later record where the status is T.
WITH MAX_TIME_ID AS (
SELECT
ID
,MAX(TIME) AS MAX_TIME
FROM
ORIGINAL_TABLE
GROUP BY
ID
)
SELECT
O.*
FROM
ORIGINAL_TABLE O
INNER JOIN
MAX_TIME_ID M
ON
O.ID = M.ID
AND O.TIME = M.MAX_TIME
WHERE
O.STATUS = 'F'
The CTE will find the max time for each ID and the inner join with the where clause on the status will select it only if the latest is 'F'.
I would just use window functions:
select t.*
from (select t.*,
row_number() over (partition by id order by time desc) as seqnum,
sum(case when status = 'T' then 1 else 0 end) over (partition by id) as num_t
from t
) t
where num_t > 0 and
seqnum = 1 and status = 'F';
There is another fun way to do this just with aggregation:
select id, max(time) as time, 'F' as status
from t
group by id
having sum(case when status = 'T' then 1 else 0 end) > 0 and
max(time) = max(case when status = 'F' then time end);
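The aggregation-only idea can be tried on the question's sample rows as follows. This is a sketch assuming SQLite (via Python's built-in sqlite3) as a stand-in database; the alias max_time is hypothetical and chosen to avoid clashing with the time column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, time INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 5, "T"), (1, 6, "F"), (2, 3, "F"), (1, 2, "F"), (2, 4, "T"),
     (3, 7, "F"), (2, 3, "T"), (3, 1, "F"), (4, 7, "H"), (4, 6, "S"),
     (4, 5, "F"), (4, 4, "T"), (5, 5, "S"), (5, 6, "F")],
)

# Keep an id only if it has at least one T row and its overall latest
# time equals the latest time among its F rows.
query = """
SELECT id, MAX(time) AS max_time, 'F' AS status
FROM t
GROUP BY id
HAVING SUM(CASE WHEN status = 'T' THEN 1 ELSE 0 END) > 0
   AND MAX(time) = MAX(CASE WHEN status = 'F' THEN time END)
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 6, 'F')]
```

On this data only id 1 qualifies: the latest row for id 4 has status H, so its overall maximum time is not an F row. A variant that ignores statuses other than F and T would treat id 4 differently, so it is worth deciding which reading of the requirement you need.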

SQL Grouping entries with a different value

Let's assume I have a report that displays an ID and VALUE from different tables
| ID | VALUE |
|----|-------|
| 1 | 1 |
| 1 | 0 |
| 1 | 1 |
| 2 | 0 |
| 2 | 0 |
My goal is to display this table with grouped IDs and VALUEs. My rule for grouping VALUEs would be "If VALUE contains at least one '1' then display '1', otherwise display '0'".
My current SQL is (simplified):
SELECT
TABLE_A.ID,
CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END AS VALUE
FROM TABLE_A, TABLE_B, TABLE_C
GROUP BY
TABLE_A.ID,
(CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END)
The output is the following:
| ID | VALUE |
|----|-------|
| 1 | 1 |
| 1 | 0 |
| 2 | 0 |
Which is half way to the output I want
| ID | VALUE |
|----|-------|
| 1 | 1 |
| 2 | 0 |
So my Question is: How do I extend my current SQL (or change it completely) to get my desired output?
If the FOREIGN_VALUE column contains only 0 and 1 as distinct values, then using the max() function, as mentioned by HoneyBadger in the comments, will fulfill your requirement.
SELECT
ID,
MAX(FOREIGN_VALUE) AS VALUE
FROM (SELECT
ID,
CASE WHEN FOREIGN_VALUE = 1
THEN 1
ELSE 0
END AS FOREIGN_VALUE
FROM TABLE,
FOREIGN_TABLE)
GROUP BY
ID;
Assuming value is always 0 or 1, you can do:
select id, max(value) as value
from t
group by id;
If value can take on other values:
select id,
max(case when value = 1 then 1 else 0 end) as value
from t
group by id;
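A quick way to see the MAX over CASE rule in action on the question's sample rows, sketched with Python's built-in sqlite3 as a stand-in database (an assumption; the table name t follows the answer and the alias grouped_value is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (1, 0), (1, 1), (2, 0), (2, 0)],
)

# The CASE maps each row to 1 or 0; MAX then yields 1 as soon as
# any row in the group is 1, and 0 otherwise.
query = """
SELECT id,
       MAX(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS grouped_value
FROM t
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 1), (2, 0)]
```

Grouping only by id, and not by the CASE expression as well, is what collapses each ID to a single row.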

Get ID if table has one or more row exist for a condition

Suppose I have a table as below:
ID | Account| Status
---+--------+-------
1 | acct1 | A
1 | acct2 | S
1 | acct3 | C
2 | acct4 | C
2 | acct5 | C
3 | acct6 | A
3 | acct7 | C
4 | acct8 | C
4 | acct9 | C
4 | acct10 | C
Condition: return ID if accounts do not have any 'A' and 'S' status.
For this case, I only want ID '2' and '4' to be returned.
You could use HAVING and conditional SUM:
SELECT ID
FROM tab
GROUP BY ID
HAVING SUM(CASE WHEN Status IN ('A', 'S') THEN 1 ELSE 0 END) = 0
First select the IDs that have any 'A' or 'S' record, then return the distinct IDs that are not in that set:
Select distinct(ID) as ID
from table_name where id not in
(
select ID from table_name where status in('A', 'S')
)
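The HAVING with conditional SUM approach can be checked on the sample table with a short script. This is a minimal sketch assuming SQLite (via Python's built-in sqlite3) as a stand-in for the real database; the table name tab follows the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID INTEGER, Account TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(1, "acct1", "A"), (1, "acct2", "S"), (1, "acct3", "C"),
     (2, "acct4", "C"), (2, "acct5", "C"),
     (3, "acct6", "A"), (3, "acct7", "C"),
     (4, "acct8", "C"), (4, "acct9", "C"), (4, "acct10", "C")],
)

# Count the 'A'/'S' rows per ID and keep only IDs where that count is zero.
query = """
SELECT ID
FROM tab
GROUP BY ID
HAVING SUM(CASE WHEN Status IN ('A', 'S') THEN 1 ELSE 0 END) = 0
ORDER BY ID
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)  # [2, 4]
```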

Finding records sets with GROUP BY and SUM

I'd like to do a query for every GroupID (which always come in pairs) in which both entries have a value of 1 for HasData.
|GroupID | HasData |
|--------|---------|
| 1 | 1 |
| 1 | 1 |
| 2 | 0 |
| 2 | 1 |
| 3 | 0 |
| 3 | 0 |
| 4 | 1 |
| 4 | 1 |
So the result would be:
1
4
Here's what I'm trying, but I can't seem to get it right. Whenever I do a GROUP BY on the GroupID, I only have access to that column in the SELECT list.
SELECT GroupID
FROM Table
GROUP BY GroupID, HasData
HAVING SUM(HasData) = 2
But I get the following error message because HasData is actually a bit:
Operand data type bit is invalid for sum operator.
Can I do a count of two where both records are true?
Just exclude those group IDs that have a record where HasData = 0.
select distinct a.groupID
from table1 a
where not exists(select * from table1 b where b.HasData = 0 and b.groupID = a.groupID)
You can use the having clause to check that all values are 1:
select GroupId
from table
group by GroupId
having sum(cast(HasData as int)) = 2
That is, simply remove the HasData column from the group by columns and then check on it.
One more option
SELECT GroupID
FROM table
WHERE HasData <> 0
GROUP BY GroupID
HAVING COUNT(*) > 1
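To compare the approaches on the sample data, here is a minimal sketch using Python's built-in sqlite3. Two assumptions to note: SQLite has no bit type, so an INTEGER column stands in for HasData (the CAST is kept because the original database rejects SUM on a bit), and the table name pairs is hypothetical. The MIN variant is an extra illustration that does not rely on every group having exactly two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (GroupID INTEGER, HasData INTEGER)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(1, 1), (1, 1), (2, 0), (2, 1), (3, 0), (3, 0), (4, 1), (4, 1)],
)

# Variant 1: both rows of the pair are 1, so the sum is 2.
sum_query = """
SELECT GroupID
FROM pairs
GROUP BY GroupID
HAVING SUM(CAST(HasData AS INT)) = 2
ORDER BY GroupID
"""

# Variant 2: the smallest value in the group is 1, so no row is 0.
min_query = """
SELECT GroupID
FROM pairs
GROUP BY GroupID
HAVING MIN(CAST(HasData AS INT)) = 1
ORDER BY GroupID
"""

by_sum = [row[0] for row in conn.execute(sum_query)]
by_min = [row[0] for row in conn.execute(min_query)]
print(by_sum, by_min)  # [1, 4] [1, 4]
```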