Finding record sets with GROUP BY and SUM - sql

I'd like to do a query for every GroupID (which always come in pairs) in which both entries have a value of 1 for HasData.
|GroupID | HasData |
|--------|---------|
| 1 | 1 |
| 1 | 1 |
| 2 | 0 |
| 2 | 1 |
| 3 | 0 |
| 3 | 0 |
| 4 | 1 |
| 4 | 1 |
So the result would be:
1
4
Here's what I'm trying, but I can't seem to get it right. Whenever I do a GROUP BY on the GroupID, I only have access to that column in the SELECT list.
SELECT GroupID
FROM Table
GROUP BY GroupID, HasData
HAVING SUM(HasData) = 2
But I get the following error message because HasData is actually a bit:
Operand data type bit is invalid for sum operator.
Can I do a count of two where both records are true?

Just exclude those group IDs that have a record where HasData = 0.
select distinct a.groupID
from table1 a
where not exists(select * from table1 b where b.HasData = 0 and b.groupID = a.groupID)

You can use the having clause to check that all values are 1:
select GroupId
from table
group by GroupId
having sum(cast(HasData as int)) = 2
That is, simply remove the HasData column from the group by columns and then check on it.

One more option
SELECT GroupID
FROM table
WHERE HasData <> 0
GROUP BY GroupID
HAVING COUNT(*) > 1
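To compare the three answers above side by side, here is a minimal sketch that runs them against the sample data with Python's built-in sqlite3 module. The table name GroupData is an assumption (Table is a reserved word), and SQLite has no bit type, so HasData is stored as an integer here; the cast is kept only to mirror the SQL Server answer.

```python
import sqlite3

# Sample rows from the question: (GroupID, HasData).
# Assumption: HasData is an INTEGER here because SQLite has no BIT type.
ROWS = [(1, 1), (1, 1), (2, 0), (2, 1), (3, 0), (3, 0), (4, 1), (4, 1)]

# GroupData is a hypothetical table name chosen for this sketch.
QUERIES = {
    "not_exists": """
        SELECT DISTINCT a.GroupID
        FROM GroupData a
        WHERE NOT EXISTS (SELECT 1 FROM GroupData b
                          WHERE b.HasData = 0 AND b.GroupID = a.GroupID)
        ORDER BY a.GroupID
    """,
    "sum_cast": """
        SELECT GroupID
        FROM GroupData
        GROUP BY GroupID
        HAVING SUM(CAST(HasData AS INTEGER)) = 2
        ORDER BY GroupID
    """,
    "where_count": """
        SELECT GroupID
        FROM GroupData
        WHERE HasData <> 0
        GROUP BY GroupID
        HAVING COUNT(*) > 1
        ORDER BY GroupID
    """,
}


def groups_with_all_data(query_name):
    """Run one of the three queries on a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE GroupData (GroupID INTEGER, HasData INTEGER)")
    conn.executemany("INSERT INTO GroupData VALUES (?, ?)", ROWS)
    result = [row[0] for row in conn.execute(QUERIES[query_name])]
    conn.close()
    return result


if __name__ == "__main__":
    for name in QUERIES:
        print(name, groups_with_all_data(name))
```

On this data all three return groups 1 and 4. They are not equivalent in general: the NOT EXISTS version does not check that a group has two rows, it only checks that none of its rows has HasData = 0.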

Related

PSQL select all rows with a non-unique column

The query is supposed to query the item table and:
filter out active=0 items
select id and groupId where there's at least one more item with that groupId
Example:
| id | groupId | active |
| --- | ------- | ------ |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 2 | 0 |
| 4 | 3 | 1 |
| 5 | 3 | 1 |
| 6 | 4 | 1 |
Desired Output:
| id | groupId |
| --- | ------- |
| 4 | 3 |
| 5 | 3 |
Explanation
groupID 1: invalid because has only 1 member
groupID 2: invalid because has two members, but one is inactive
groupID 3: valid
groupID 4: invalid because has only 1 member
What I tried
SELECT id, groupId
FROM items
WHERE id IN (
SELECT id
FROM items
WHERE active=1
GROUP BY groupId
HAVING COUNT(*) > 1
);
But I get the id must appear in the GROUP BY clause or be used in an aggregate function error.
I understand I can mess around with the sql_mode to get rid of that error, but I would rather avoid that.
Go for window functions:
select i.*
from (select i.*, count(*) over (partition by groupid) as cnt
from items i
where active = 1
) i
where cnt > 1
Window functions are the way to go.
But if you want to fix your query then this should do it:
select a.id, a.groupId from items a
where active = 1 and groupid in(
select groupId from items
where active = 1
group by groupId
having count(distinct id) > 1
)
This works because the subquery keeps only the groupIds that have more than one active id.
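Both approaches can be checked against the example table. This is a rough sketch using Python's sqlite3 module rather than PostgreSQL, so treat it as an illustration only; it assumes the bundled SQLite is 3.25 or newer (needed for window functions) and that the table is called items.

```python
import sqlite3

# Example rows from the question: (id, groupId, active).
ITEMS = [(1, 1, 1), (2, 2, 1), (3, 2, 0), (4, 3, 1), (5, 3, 1), (6, 4, 1)]

# Window-function version: count active rows per group, keep groups with > 1.
# Assumption: SQLite >= 3.25, which added window function support.
WINDOW_SQL = """
    SELECT id, groupId
    FROM (SELECT i.*, COUNT(*) OVER (PARTITION BY groupId) AS cnt
          FROM items i
          WHERE active = 1) AS counted
    WHERE cnt > 1
    ORDER BY id
"""

# Subquery version: the grouping happens on groupId only, so no ungrouped
# column is selected inside the aggregate query.
SUBQUERY_SQL = """
    SELECT a.id, a.groupId
    FROM items a
    WHERE a.active = 1
      AND a.groupId IN (SELECT groupId
                        FROM items
                        WHERE active = 1
                        GROUP BY groupId
                        HAVING COUNT(DISTINCT id) > 1)
    ORDER BY a.id
"""


def active_items_in_shared_groups(sql):
    """Return (id, groupId) pairs for active items whose group has another active item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, groupId INTEGER, active INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", ITEMS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(active_items_in_shared_groups(WINDOW_SQL))
    print(active_items_in_shared_groups(SUBQUERY_SQL))
```

Both queries filter on active = 1 before counting, which is why group 2 drops out even though it has two rows.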

Returning rows with the same ID but exclude some on second column

I've seen similar questions, but none quite hits the nail on the head for what I need. Let's say I have a table.
+-----+-------+
| ID | Value |
+-----+-------+
| 123 | 1 |
| 123 | 2 |
| 123 | 3 |
| 456 | 1 |
| 456 | 2 |
| 456 | 4 |
| 789 | 1 |
| 789 | 2 |
+-----+-------+
I want to return DISTINCT IDs but exclude those that have a certain value. For example, let's say I don't want any IDs that have a 3 as a value. My results should look like this.
+-----+
| ID |
+-----+
| 456 |
| 789 |
+-----+
I hope this makes sense. If more information is needed please ask and if this has been answered before please point me in the right direction. Thanks.
You can use group by and having:
select id
from t
group by id
having sum(case when value = 3 then 1 else 0 end) = 0;
The having clause counts the number of "3"s for each id. The = 0 returns only groups where the count is 0 (i.e. there are no "3"s).
You can use not exists :
select distinct t.id
from table t
where not exists (select 1 from table t1 where t1.id = t.id and t1.value = 3);
Try this:
select id from tablename
group by id
having sum(case when value=3 then 1 else 0 end)=0
You can also use EXCEPT to compare the following two data sets, which gives the desired result set:
select distinct Id from ValuesTbl
except
select Id from ValuesTbl where Value = 3
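As a quick check that the three techniques above agree, here is a small sketch that runs them on the sample data with Python's sqlite3 module. The table name ValuesTbl is borrowed from the EXCEPT answer; the helper name and the Python-side sorting are assumptions made for this example.

```python
import sqlite3

# Sample rows from the question: (Id, Value).
ROWS = [(123, 1), (123, 2), (123, 3),
        (456, 1), (456, 2), (456, 4),
        (789, 1), (789, 2)]

QUERIES = {
    # Conditional aggregation: count the 3s per Id and keep Ids with none.
    "group_having": """
        SELECT Id
        FROM ValuesTbl
        GROUP BY Id
        HAVING SUM(CASE WHEN Value = 3 THEN 1 ELSE 0 END) = 0
    """,
    # Anti-join: keep Ids for which no row with Value = 3 exists.
    "not_exists": """
        SELECT DISTINCT t.Id
        FROM ValuesTbl t
        WHERE NOT EXISTS (SELECT 1 FROM ValuesTbl t1
                          WHERE t1.Id = t.Id AND t1.Value = 3)
    """,
    # Set difference: all Ids minus the Ids that have a 3.
    "except": """
        SELECT DISTINCT Id FROM ValuesTbl
        EXCEPT
        SELECT Id FROM ValuesTbl WHERE Value = 3
    """,
}


def ids_without_value_3(query_name):
    """Run one query on a fresh in-memory table and return the Ids, sorted in Python."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ValuesTbl (Id INTEGER, Value INTEGER)")
    conn.executemany("INSERT INTO ValuesTbl VALUES (?, ?)", ROWS)
    result = sorted(row[0] for row in conn.execute(QUERIES[query_name]))
    conn.close()
    return result


if __name__ == "__main__":
    for name in QUERIES:
        print(name, ids_without_value_3(name))
```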

SQL Grouping entries with a different value

Let's assume I have a report that displays an ID and VALUE from different tables
| ID | VALUE |
|----|-------|
| 1 | 1 |
| 1 | 0 |
| 1 | 1 |
| 2 | 0 |
| 2 | 0 |
My goal is to display this table with grouped IDs and VALUEs. My rule for grouping VALUEs would be "If VALUE contains at least one '1' then display '1', otherwise display '0'".
My current SQL is (simplified)
SELECT
TABLE_A.ID,
CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END AS VALUE
FROM TABLE_A, TABLE_B, TABLE_C
GROUP BY
TABLE_A.ID,
(CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END)
The output is following
| ID | VALUE |
|----|-------|
| 1 | 1 |
| 1 | 0 |
| 2 | 0 |
Which is half way to the output I want
| ID | VALUE |
|----|-------|
| 1 | 1 |
| 2 | 0 |
So my Question is: How do I extend my current SQL (or change it completely) to get my desired output?
If you have only 0 and 1 as distinct values in the FOREIGN_VALUE column, then using the max() function, as mentioned by HoneyBadger in the comments, will fulfill your requirement.
SELECT
ID,
MAX(FOREIGN_VALUE) AS VALUE
FROM (SELECT
ID,
CASE WHEN FOREIGN_VALUE = 1
THEN 1
ELSE 0
END AS FOREIGN_VALUE
FROM TABLE,
FOREIGN_TABLE)
GROUP BY
ID;
Assuming value is always 0 or 1, you can do:
select id, max(value) as value
from t
group by id;
If value can take on other values:
select id,
max(case when value = 1 then 1 else 0 end) as value
from t
group by id;
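To see why the CASE wrapper matters, here is a minimal sketch using Python's sqlite3 module. The table name t follows the answer above; the row (3, 5) is not in the question and is added purely as a hypothetical to show a value outside 0/1.

```python
import sqlite3

# Rows from the question, plus one hypothetical row (3, 5) that is NOT in the
# original data, added to show what happens when value is not just 0 or 1.
ROWS = [(1, 1), (1, 0), (1, 1), (2, 0), (2, 0), (3, 5)]

# Plain MAX: only correct when value is always 0 or 1.
PLAIN_MAX_SQL = """
    SELECT id, MAX(value) AS value
    FROM t
    GROUP BY id
    ORDER BY id
"""

# MAX over a CASE: yields 1 if any row in the group has value = 1, else 0.
CASE_MAX_SQL = """
    SELECT id, MAX(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS value
    FROM t
    GROUP BY id
    ORDER BY id
"""


def grouped_flags(sql):
    """Run the given aggregate query on a fresh in-memory copy of the data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, value INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(grouped_flags(PLAIN_MAX_SQL))
    print(grouped_flags(CASE_MAX_SQL))
```

For ids 1 and 2 both queries agree; only the hypothetical id 3 differs, where plain MAX leaks the raw value 5 and the CASE version reports 0.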

Filter query with a GROUP BY based on column not in GROUP BY statement

Given the following table structure and sample data:
+------------+------+----------+
| EmployeeID | Name | WorkWeek |
+------------+------+----------+
| 1          | A    | 1        |
| 2          | B    | 1        |
| 2          | B    | 2        |
| 3          | C    | 1        |
| 3          | C    | 2        |
| 4          | D    | 2        |
+------------+------+----------+
I am looking to select all employees that only worked week 1 (so in this example, only employeeid = 1 would be returned). I am able to get the data with the following query:
SELECT EmployeeId, Name
FROM SomeTable
GROUP BY EmployeeId, Name
HAVING SUM ( WorkWeek ) = 1;
To me, the HAVING SUM( WorkWeek ) = 1 is a hack and this should be handled with some form of a GROUP BY and COUNT but I cannot wrap my head around how that query would be structured.
Any help would be useful and enlightening.
HAVING SUM( WorkWeek ) = 1 may work for week 1 or 2, but will fail for week 3 (since 1+2 = 3).
Use NOT EXISTS operator with a subquery instead:
SELECT EmployeeId, Name
FROM SomeTable t1
WHERE NOT EXISTS (
SELECT * FROM SomeTable t2
WHERE t1.EmployeeId = t2.EmployeeId
AND t2.WorkWeek <> 1
)
Actually, that's exactly what the HAVING clause is for: to filter records according to aggregated values.
From w3schools sql tutorial:
The HAVING clause was added to SQL because the WHERE keyword could not be used with aggregate functions.
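The point about SUM breaking for week 3 can be reproduced on the sample data. This is a small sketch with Python's sqlite3 module; the week is passed as a parameter, which is an assumption made for the demonstration and not part of either original query.

```python
import sqlite3

# Sample rows from the question: (EmployeeId, Name, WorkWeek).
ROWS = [(1, "A", 1), (2, "B", 1), (2, "B", 2),
        (3, "C", 1), (3, "C", 2), (4, "D", 2)]

# The asker's approach: relies on the sum of week numbers.
SUM_SQL = """
    SELECT EmployeeId, Name
    FROM SomeTable
    GROUP BY EmployeeId, Name
    HAVING SUM(WorkWeek) = ?
    ORDER BY EmployeeId
"""

# The NOT EXISTS approach: no row for this employee in any other week.
NOT_EXISTS_SQL = """
    SELECT EmployeeId, Name
    FROM SomeTable t1
    WHERE NOT EXISTS (SELECT 1 FROM SomeTable t2
                      WHERE t1.EmployeeId = t2.EmployeeId
                        AND t2.WorkWeek <> ?)
    ORDER BY t1.EmployeeId
"""


def only_worked_week(sql, week):
    """Return the employees the given query reports as having worked only `week`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SomeTable (EmployeeId INTEGER, Name TEXT, WorkWeek INTEGER)")
    conn.executemany("INSERT INTO SomeTable VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(sql, (week,)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for week in (1, 3):
        print(week, only_worked_week(SUM_SQL, week), only_worked_week(NOT_EXISTS_SQL, week))
```

For week 1 both queries return employee A. For week 3 the SUM version wrongly returns B and C (weeks 1 + 2), while NOT EXISTS returns nothing, since nobody worked week 3.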

SQL DELETE group of records based on opposite group being empty

In table T, I'm trying to delete all records in a group having the same value of A, but only if all members of this group have B set to 'x'.
Given the Table T:
+-------+--------+
| A | B |
+-------+--------+
| 2 | '' |
| 2 | 'x' |
| 2 | '' |
| 8 | 'x' |
| 8 | 'x' |
| 15 | '' |
| 15 | '' |
+-------+--------+
The two records with A == 8 have to be deleted, as both of them have B == 'x'. The group of A == 2 has mixed values of B, so it stays. And the group of A == 15 doesn't have all of its B equal to 'x', so it also stays.
Is this possible to do by one query?
If not, any other way that is fast enough for a table with a lot of records?
You can try this query (it assumes B holds numeric 0/1 values):
delete from T
where A in (
select A
from T
group by A
having sum(B) = count(*)
)
If column B can contain values other than 0/1, you can add additional conditions:
having sum(B) = count(*) and min(b)=1 and max(b)=1
If you can't use numeric values, you can just use min/max, like:
having min(b)='x' and max(b)='x'
Try this. Group by and Having with some aggregate should work
DELETE FROM tablename
WHERE a IN(SELECT a
FROM tablename
GROUP BY a
HAVING count(case when b='x' then 1 end) = Count(b))
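Both string-friendly variants above (min/max and the conditional count) can be tried on the sample table. This is a rough sketch using Python's sqlite3 module; other engines may restrict deleting from a table that the subquery also reads, so take it as an illustration under SQLite only.

```python
import sqlite3

# Sample rows from the question: (A, B), with '' standing for an empty B.
ROWS = [(2, ""), (2, "x"), (2, ""), (8, "x"), (8, "x"), (15, ""), (15, "")]

# Variant 1: every B in the group is 'x' exactly when both MIN and MAX are 'x'.
DELETE_MIN_MAX = """
    DELETE FROM T
    WHERE A IN (SELECT A
                FROM T
                GROUP BY A
                HAVING MIN(B) = 'x' AND MAX(B) = 'x')
"""

# Variant 2: the number of 'x' rows equals the number of non-NULL B rows.
DELETE_COUNT_CASE = """
    DELETE FROM T
    WHERE A IN (SELECT A
                FROM T
                GROUP BY A
                HAVING COUNT(CASE WHEN B = 'x' THEN 1 END) = COUNT(B))
"""


def remaining_groups_after(delete_sql):
    """Apply the DELETE to a fresh copy of T and return the A values left, in order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (A INTEGER, B TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?)", ROWS)
    conn.execute(delete_sql)
    remaining = [row[0] for row in conn.execute("SELECT A FROM T ORDER BY A")]
    conn.close()
    return remaining


if __name__ == "__main__":
    print(remaining_groups_after(DELETE_MIN_MAX))
    print(remaining_groups_after(DELETE_COUNT_CASE))
```

In both cases only the two rows with A = 8 are removed; the mixed group (A = 2) and the all-empty group (A = 15) stay.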