Find where two conditions are present in group - sql

I have a table:
ref | name
===========
123abc | received
123abc | pending
134b | pending
156c | received
I want to identify refs that have only a pending row and no received row. Note that there could be multiple received and pending rows for the same ref.
How can I output the refs that have only a pending and not a received?
So in my example, it would return:
134b | pending
I think it's something like:
SELECT ref, name FROM my_table
WHERE ref IS NOT NULL
GROUP BY ref, name
HAVING ref = 'pending' AND ref = 'received'
;

I would use aggregation:
select ref
from my_table
where name in ('pending', 'received')
group by ref
having min(name) = 'pending' and min(name) = max(name);
Comparing min and max ensures that every row in the group has the same value. Strictly speaking, max(name) = 'pending' alone would be enough, because 'pending' sorts before 'received'. But the min/max comparison eliminates the dependence on the alphabetical ordering of the values.
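For anyone who wants to check the min/max idea locally, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the layout from the question, where the status values live in the name column. The rows for ref 200x are hypothetical and only there to exercise duplicate pendings.

```python
import sqlite3

# In-memory table mirroring the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ref TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("123abc", "received"),
        ("123abc", "pending"),
        ("134b", "pending"),
        ("156c", "received"),
        # Hypothetical rows (not in the question): a ref with two pendings.
        ("200x", "pending"),
        ("200x", "pending"),
    ],
)

# MIN(name) = MAX(name) means every row in the group carries the same value,
# so a group qualifies only when that single value is 'pending'.
pending_only = [
    row[0]
    for row in conn.execute(
        """
        SELECT ref
        FROM my_table
        WHERE name IN ('pending', 'received')
        GROUP BY ref
        HAVING MIN(name) = 'pending' AND MIN(name) = MAX(name)
        ORDER BY ref
        """
    )
]
print(pending_only)  # ['134b', '200x']
```

123abc drops out because its min and max differ, and 156c drops out because its only value is 'received'.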

You can use not exists for what you need (btw, from your data, column "name" contains values like pending and received):
select distinct ref, name
from my_table t1
where t1.name = 'pending' and not exists (select * from my_table t2 where t1.ref=t2.ref and t2.name='received')
PS. You can validate here with your sample data and my query:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=6fd633fe52129ff3246d8dba55e5fc17

Another way of doing it is with a WITH clause (a common table expression). This way, there is no need for nested subqueries.
WITH ref_received_pending AS (
SELECT
ref,
sum(CASE WHEN name = 'received'
THEN 1
ELSE 0 END) as received_count,
sum(CASE WHEN name = 'pending'
THEN 1
ELSE 0 END) as pending_count
FROM test_table_2
GROUP BY ref
)
SELECT DISTINCT
ref,
'pending' as pending
FROM ref_received_pending
WHERE pending_count > 0 AND received_count = 0;

Related

Querying a subset

I want to write an SQL query that finds records containing particular values in a column and, from that subset, keeps only the records that do not contain some other value. How do you write a query for that?
cid id2 attribute
--------------------------------
1 100 delete
1 100 payment
1 100 void
2 100 delete
2 102 payment
2 102 void
3 102 delete
3 103 payment
In the above example, I want to list each cid for which the payment and delete attributes exist but the void attribute doesn't. So it should list 3 from the above example, because it doesn't have a void attribute.
Forgot to mention that there could be more attributes. However, I need to list out records for which delete and payment exist regardless of other attributes but void doesn’t.
I call this a "set-within-sets" query, because you are looking for particular sets of attributes within each cid.
I would express this with group by and conditions in the having:
select cid
from t
group by cid
having sum(case when attribute = 'payment' then 1 else 0 end) > 0 and
sum(case when attribute = 'delete' then 1 else 0 end) > 0 and
sum(case when attribute = 'void' then 1 else 0 end) = 0 ;
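A quick way to try the HAVING conditions above is a small sketch with Python's sqlite3 module. The first eight rows are the question's sample data. The rows for cid 4 and cid 5 are hypothetical, added to show that extra attributes are ignored and that a missing delete disqualifies a cid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cid INTEGER, id2 INTEGER, attribute TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 100, "delete"), (1, 100, "payment"), (1, 100, "void"),
        (2, 100, "delete"), (2, 102, "payment"), (2, 102, "void"),
        (3, 102, "delete"), (3, 103, "payment"),
        # Hypothetical rows: cid 4 has an unrelated extra attribute,
        # cid 5 has payment but no delete.
        (4, 104, "delete"), (4, 104, "payment"), (4, 104, "refund"),
        (5, 105, "payment"),
    ],
)

# Each SUM counts the rows of one attribute within the cid group:
# payment and delete must occur at least once, void must not occur at all.
matching_cids = [
    row[0]
    for row in conn.execute(
        """
        SELECT cid
        FROM t
        GROUP BY cid
        HAVING SUM(CASE WHEN attribute = 'payment' THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN attribute = 'delete' THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN attribute = 'void' THEN 1 ELSE 0 END) = 0
        ORDER BY cid
        """
    )
]
print(matching_cids)  # [3, 4]
```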
In some databases, you can simplify this with string aggregation -- assuming there are no duplicate attributes for cids. For instance, using the MySQL function:
select cid
from t
where attribute in ('payment', 'delete', 'void')
group by cid
having group_concat(attribute order by attribute) = 'delete,payment';
You can use conditional aggregation:
select cid
from tablename
where attribute in ('delete', 'payment', 'void')
group by cid
having
count(distinct attribute) = 2
and
sum(
case attribute
when 'void' then 1
else 0
end
) = 0
If there are not more attributes than these 3, then you can omit the WHERE clause.
Results:
| cid |
| --- |
| 3 |
I'm assuming that there are only three attributes, so the logic behind this query is:
First COUNT the number of attributes per cid with GROUP BY, and then LEFT JOIN the original table ON attribute = 'void'. Keep each cid that has exactly 2 attributes and no matching void row.
The original table is named temp:
SELECT
subq2.result_cid
FROM (
SELECT
*
FROM (
SELECT
T.cid AS result_cid,
COUNT(T.attribute) AS count
FROM
temp AS T
GROUP BY
T.cid
) AS subq
LEFT OUTER JOIN temp AS T2 ON subq.result_cid = T2.cid AND T2.attribute = 'void'
) AS subq2
WHERE subq2.count = 2 AND subq2.id2 IS NULL
Use a correlated subquery with NOT EXISTS:
select t1.* from tablename t1
where not exists( select 1 from tablename t2
where t1.cid=t2.cid and attribute='void'
)
and exists ( select 1 from tablename t2
where t1.cid=t2.cid
having count(distinct attribute)=2
)
and attribute in ('payment','delete')

Selecting a group with or without certain conditions across many rows in SQL

I have data like this:
ID SomeVar
123 0
123 1
123 2
234 1
234 2
234 3
456 3
567 0
567 1
I'm trying to group by my ID to return all of the IDs that do not have a record with the value 0. That is, my selection would look like this:
ID
234
456
Is there an easy way to do this without creating a subset table with all records not containing 0 then joining it back to the full data set where the tables don't match?
I generally try to avoid subqueries, but you could use one for this case. Do the same group by, and check that the id isn't in a subquery of ids that have 0 for SomeVar. In this case, distinct will do the same and more efficiently, so I'll do that first:
SELECT DISTINCT ID
FROM [table_name]
WHERE ID NOT IN (
SELECT ID FROM [table_name] WHERE SomeVar = 0
);
And if you want to get other information by using a GROUP BY:
SELECT ID, max(SomeVar), count(*), sum(SomeVar)
FROM [table_name]
WHERE ID NOT IN (
SELECT ID FROM [table_name] WHERE SomeVar = 0
)
GROUP BY ID;
You can use aggregation and having:
select id
from t
group by id
having min(somevar) > 0;
This assumes that somevar is never negative. If that is a possibility, then you can use the slightly more verbose:
select id
from t
group by id
having sum(case when somevar = 0 then 1 else 0 end) = 0;
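To see where the two HAVING variants differ, here is a small sketch using Python's sqlite3 module. The data is the question's sample plus two hypothetical rows for ID 789 (values -1 and 2), which are not in the original and exist only to show the negative-value case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, SomeVar INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (123, 0), (123, 1), (123, 2),
        (234, 1), (234, 2), (234, 3),
        (456, 3),
        (567, 0), (567, 1),
        # Hypothetical rows: an ID with a negative value but no 0.
        (789, -1), (789, 2),
    ],
)

def ids_for(having_clause):
    """Run the GROUP BY query with the given HAVING condition."""
    sql = f"SELECT ID FROM t GROUP BY ID HAVING {having_clause} ORDER BY ID"
    return [row[0] for row in conn.execute(sql)]

# Short form: correct only while SomeVar is never negative.
via_min = ids_for("MIN(SomeVar) > 0")

# Verbose form: counts the zero rows directly, so negatives do not matter.
via_count = ids_for("SUM(CASE WHEN SomeVar = 0 THEN 1 ELSE 0 END) = 0")

print(via_min)    # [234, 456]
print(via_count)  # [234, 456, 789]
```

On the question's own data both forms return 234 and 456. Only the hypothetical ID 789 separates them.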
Use a CASE expression with count or sum aggregation, and filter on the count in HAVING:
select ID
from
(
select ID, count(case when SomeVar=0 then 1 end) cnt
from mytable
group by ID having count(case when SomeVar=0 then 1 end) = 0
) s
;

identify rows with not null values in sql

How to retrieve all rows having value in a status column (not null) group by ID column.
Id Name Status
1394 Test 1 Y
1394 Test 2 null
1394 Test 3 null
1395 Test 4 Y
1395 Test 5 Y
I wrote select * from table where status = 'Y'. It returns 3 records; how do I add a condition to return only the last 2? ID 1394 has 2 other records whose status is null.
If you want to select groups where the status is only 'Y', you can do:
select t.*
from t
where not exists (select 1
from t t2
where t2.id = t.id and
(t2.Status <> 'Y' or t2.status is null)
);
If you only want the ids, I would use group by and having:
select id
from t
group by id
having min(status) = 'Y' and max(status) = 'Y' and count(*) = count(status);
The last condition checks for no NULL values.
You could also write:
having min(case when status = 'Y' then 1 else 0 end) = 1
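The NULL check relies on COUNT(*) counting every row while COUNT(status) skips NULLs. A minimal sketch with Python's sqlite3 module, using the question's sample data (the table name t follows the answer above), shows both counts and the resulting ids:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1394, "Test 1", "Y"),
        (1394, "Test 2", None),  # None is stored as SQL NULL
        (1394, "Test 3", None),
        (1395, "Test 4", "Y"),
        (1395, "Test 5", "Y"),
    ],
)

# COUNT(*) counts all rows, COUNT(status) only the non-NULL ones.
counts = conn.execute(
    "SELECT id, COUNT(*), COUNT(status) FROM t GROUP BY id ORDER BY id"
).fetchall()
print(counts)  # [(1394, 3, 1), (1395, 2, 2)]

# Groups where every status is 'Y' and none is NULL.
all_y = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM t
        GROUP BY id
        HAVING MIN(status) = 'Y' AND MAX(status) = 'Y'
           AND COUNT(*) = COUNT(status)
        """
    )
]

# CASE variant: a NULL status falls through to ELSE 0, pulling MIN down to 0.
all_y_case = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM t
        GROUP BY id
        HAVING MIN(CASE WHEN status = 'Y' THEN 1 ELSE 0 END) = 1
        """
    )
]
print(all_y, all_y_case)  # [1395] [1395]
```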
A simple way is:
select * from mytable
where status = 'Y'
and id not in (select id from mytable where status is null)
The existing query "where status = 'Y'" already excludes NULL values by definition.
If you are trying to get grouped results, a "GROUP BY id" clause will achieve this, which will also require putting id in the select explicitly instead of "*".
Example: SELECT id, COUNT(id) from table where status = 'Y' GROUP BY id
If I am reading this correctly, you want the IDs of groups that never have a NULL status value.
I would use a subquery with NOT IN. First, find the IDs that have a NULL status:
SELECT DISTINCT ID FROM mytable WHERE status IS NULL;
Then keep only the rows whose ID is not in that list:
SELECT * FROM mytable WHERE id NOT IN (SELECT DISTINCT ID FROM mytable WHERE status IS NULL);
Here are some possible solutions, because I am unclear on exactly what you want as output:
Select Id, Name, Status from table where status is not null;
results in 3 rows:
Id Name Status
1394 Test 1 Y
1395 Test 4 Y
1395 Test 5 Y
Select Id, count(*) as anAmt from table where status is not null group by Id;
/* only retrieves counts per Id */
results in 1 row for each Id:
Id anAmt
1394 1
1395 2

SQL Server : do not Select all if true

I have these columns
Id Status
----------
1 pass
1 fail
2 pass
3 pass
How do I select the Ids that only have a status of pass? If an Id has at least one fail, it should not be selected.
If the same id can have multiple passes:
SELECT id
from table
WHERE status = 'pass'
and id NOT IN (SELECT id FROM table WHERE status = 'fail')
You need to use GROUP BY & HAVING clause
SELECT Id
FROM yourtable
GROUP BY Id
HAVING Sum(case when status ='pass' then 1 else 0 end) = count(status)
HAVING clause can be changed to
HAVING Count(case when status ='pass' then 1 end) = count(status)
I just hate chatty case statement, so
SELECT Id
FROM table1
GROUP BY Id
HAVING COUNT(DISTINCT [Status]) = 1 AND MIN([Status]) = 'pass'
or
SELECT Id
FROM table1
GROUP BY Id
HAVING COUNT(NULLIF([Status], 'fail')) >= 1 AND COUNT(NULLIF([Status], 'pass')) = 0
The second query only works when status has two values 'pass' and 'fail'.
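The HAVING variants in these answers can be compared side by side with a short sketch in Python's sqlite3 module. The table name table1 follows the answer above. The duplicate pass row for Id 3 is hypothetical, added to check that repeated passes do not break the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Id INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1, "pass"),
        (1, "fail"),
        (2, "pass"),
        (3, "pass"),
        (3, "pass"),  # hypothetical duplicate, not in the question
    ],
)

def ids_for(having_clause):
    """Run the GROUP BY query with the given HAVING condition."""
    sql = f"SELECT Id FROM table1 GROUP BY Id HAVING {having_clause} ORDER BY Id"
    return [row[0] for row in conn.execute(sql)]

# Number of 'pass' rows equals the number of non-NULL status rows.
via_sum = ids_for(
    "SUM(CASE WHEN Status = 'pass' THEN 1 ELSE 0 END) = COUNT(Status)"
)

# Exactly one distinct status in the group, and that status is 'pass'.
via_distinct = ids_for("COUNT(DISTINCT Status) = 1 AND MIN(Status) = 'pass'")

print(via_sum, via_distinct)  # [2, 3] [2, 3]
```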

Replace NULL with values

Here is my challenge:
I have a log table which every time a record is changed adds a new record but puts a NULL value for each non-changed value in each record. In other words only the changed value is set, the rest unchanged fields in each row simply has a NULL value.
Now I would like to replace each NULL value with the value above it that is NOT a NULL value like below:
Source table: Task_log
ID Owner Status Flag
1 Bob Registrar T
2 Sue NULL NULL
3 NULL NULL F
4 Frank Admission T
5 NULL NULL F
6 NULL NULL T
Desired output table: Task_log
ID Owner Status Flag
1 Bob Registrar T
2 Sue Registrar T
3 Sue Registrar F
4 Frank Admission T
5 Frank Admission F
6 Frank Admission T
How do I write a query which will generate the desired output table?
One of the window functions introduced in SQL Server 2012 is FIRST_VALUE, whose name is fairly self-explanatory. It can be partitioned through the OVER clause. Before using it, each column has to be divided into blocks; a block for a column begins at a row where a value is found.
With Block As (
Select ID
, Owner
, OBlockID = SUM(Case When Owner Is Null Then 0 Else 1 End)
OVER (ORDER BY ID)
, Status
, SBlockID = SUM(Case When Status Is Null Then 0 Else 1 End)
OVER (ORDER BY ID)
, Flag
, FBlockID = SUM(Case When Flag Is Null Then 0 Else 1 End)
OVER (ORDER BY ID)
From Task_log
)
Select ID
, Owner = FIRST_VALUE(Owner) OVER (PARTITION BY OBlockID ORDER BY ID)
, Status = FIRST_VALUE(Status) OVER (PARTITION BY SBlockID ORDER BY ID)
, Flag = FIRST_VALUE(Flag) OVER (PARTITION BY FBlockID ORDER BY ID)
FROM Block
The UPDATE query is easily derived
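The block-numbering idea can also be tried outside SQL Server. The sketch below runs an equivalent query in SQLite through Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later) and uses standard AS aliases instead of the T-SQL alias = expression form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Task_log (ID INTEGER, Owner TEXT, Status TEXT, Flag TEXT)")
conn.executemany(
    "INSERT INTO Task_log VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "Registrar", "T"),
        (2, "Sue", None, None),
        (3, None, None, "F"),
        (4, "Frank", "Admission", "T"),
        (5, None, None, "F"),
        (6, None, None, "T"),
    ],
)

# The running SUM increases by 1 at every non-NULL value, so each non-NULL
# row and the NULL rows that follow it share one block id per column.
# FIRST_VALUE then picks the value that opened the block.
filled = conn.execute(
    """
    WITH Block AS (
        SELECT ID, Owner, Status, Flag,
               SUM(CASE WHEN Owner IS NULL THEN 0 ELSE 1 END)
                   OVER (ORDER BY ID) AS OBlockID,
               SUM(CASE WHEN Status IS NULL THEN 0 ELSE 1 END)
                   OVER (ORDER BY ID) AS SBlockID,
               SUM(CASE WHEN Flag IS NULL THEN 0 ELSE 1 END)
                   OVER (ORDER BY ID) AS FBlockID
        FROM Task_log
    )
    SELECT ID,
           FIRST_VALUE(Owner) OVER (PARTITION BY OBlockID ORDER BY ID) AS Owner,
           FIRST_VALUE(Status) OVER (PARTITION BY SBlockID ORDER BY ID) AS Status,
           FIRST_VALUE(Flag) OVER (PARTITION BY FBlockID ORDER BY ID) AS Flag
    FROM Block
    ORDER BY ID
    """
).fetchall()

for row in filled:
    print(row)
```

On the sample data the Owner block ids come out as 1, 2, 2, 3, 3, 3, which is why rows 2 and 3 both resolve to Sue.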
As I mentioned in my comment, I would try to fix the process that is creating the records rather than fixing the junk data. If that is not an option, the code below should get you pointed in the right direction.
UPDATE t1
set t1.owner = COALESCE(t1.owner, t2.owner),
t1.Status = COALESCE(t1.status, t2.status),
t1.Flag = COALESCE(t1.flag, t2.flag)
FROM Task_log as t1
INNER JOIN Task_log as t2
ON t1.id = (t2.id + 1)
where t1.owner is null
OR t1.status is null
OR t1.flag is null
I can think of several approaches.
You could use a combination of COALESCE with an array aggregate function. Unfortunately it doesn't look like SQL Server supports array_agg natively (although some nice people have developed some workarounds).
You could also use a subselect for each column.
SELECT id,
(SELECT TOP 1 owner FROM ... WHERE id <= outer_id AND owner IS NOT NULL ORDER BY id DESC) AS owner,
-- other columns
You could probably do something with window functions, too.
A vanilla solution would be:
select id
, owner
, coalesce(owner, ( select owner from t t2
where id = (select max(id) from t t3
where id < t1.id and owner is not null))
) as new_owner
, flag
, coalesce(flag, ( select flag from t t2
where id = (select max(id) from t t3
where id < t1.id and flag is not null))
) as new_flag
from t t1
Rather inefficient, but should work on most DBMS
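As a rough check of the correlated-subquery approach, here is a sketch that runs it in SQLite via Python's sqlite3 module against the question's Task_log data (loaded into a table named t, as in the answer). The fill_expr helper is hypothetical; it only builds the same COALESCE expression for each column, including status, which the answer leaves out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, owner TEXT, status TEXT, flag TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "Registrar", "T"),
        (2, "Sue", None, None),
        (3, None, None, "F"),
        (4, "Frank", "Admission", "T"),
        (5, None, None, "F"),
        (6, None, None, "T"),
    ],
)

def fill_expr(col):
    """Hypothetical helper: keep the row's own value, else take the value
    from the closest earlier row where the column is not NULL."""
    return (
        f"COALESCE(t1.{col}, (SELECT t2.{col} FROM t t2 "
        f"WHERE t2.id = (SELECT MAX(t3.id) FROM t t3 "
        f"WHERE t3.id < t1.id AND t3.{col} IS NOT NULL))) AS new_{col}"
    )

sql = (
    "SELECT t1.id, "
    + ", ".join(fill_expr(col) for col in ("owner", "status", "flag"))
    + " FROM t t1 ORDER BY t1.id"
)
filled_rows = conn.execute(sql).fetchall()

for row in filled_rows:
    print(row)
```

Each NULL triggers its own lookup over earlier rows, which matches the answer's remark that this is inefficient but portable.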