SQL Query for below scenario for multiple rows - sql

I need the output shown in the expected result column below.
The filtering has to look across multiple rows that share the same key.
The rule: if a key has one row with (x.attr2 = '1' AND x.attr3 = '1') and another row with (x.attr2 = '2' AND x.attr3 = '2'), the expected column value is TRUE; in all other cases it is FALSE.
The database is MS SQL Server.
Key | Atr2 | Atr3 | expected result
111 | 1 | 1 | TRUE
111 | 2 | 2 |
112 | 1 | 4 | FALSE
113 | 1 | 4 | FALSE
113 | 2 | 2 |
114 | 1 | 1 | FALSE

Check the below script-
IF OBJECT_ID('[Sample]') IS NOT NULL
DROP TABLE [Sample]
CREATE TABLE [Sample]
(
[Key] INT NOT NULL,
Attr1 INT NOT NULL,
Attr2 INT NOT NULL,
Attr3 INT NOT NULL
)
GO
INSERT INTO [Sample] ([Key],Attr1,Attr2,Attr3)
VALUES (111,62,1,1),
(111,62,2,2),
(112,62,1,4),
(113,62,1,4),
(113,62,2,2),
(114,62,1,1)
--EXPECTED_RESULT:
SELECT S.*,CASE WHEN T.[KEY] IS NOT NULL THEN 'TRUE' ELSE 'FALSE' END AS Expected_Result
FROM [Sample] S LEFT JOIN
(
SELECT T.[KEY] FROM
(
SELECT x.*,
ROW_NUMBER() OVER( PARTITION BY x.[KEY],x.attr1 ORDER BY x.attr2,x.attr3) AS r_no
--,CASE WHEN (x.attr2 = 1 AND x.attr3 = 1) OR (x.attr2 = 2 AND x.attr3 = 2)
--then 'TRUE' else 'FALSE' end as expected_result
FROM [Sample] x WHERE x.attr2=x.attr3
) T WHERE T.r_no>1
) T ON S.[KEY]=T.[KEY]

This query:
select [key] from tablename
group by [key]
having sum(case when atr2 = '1' and atr3 = '1' then 1 else 0 end) > 0
and sum(case when atr2 = '2' and atr3 = '2' then 1 else 0 end) > 0
and count(*) = 2
uses conditional aggregation to find the keys for which the result should be true.
So join it to the table like this:
select t.*,
case when g.[key] is null then 'FALSE' else 'TRUE' end result
from tablename t left join (
select [key] from tablename
group by [key]
having sum(case when atr2 = '1' and atr3 = '1' then 1 else 0 end) > 0
and sum(case when atr2 = '2' and atr3 = '2' then 1 else 0 end) > 0
and count(*) = 2
) g on g.[key] = t.[key]
See the demo.
Results:
> Key | Atr2 | Atr3 | result
> --: | ---: | ---: | :-----
> 111 | 1 | 1 | TRUE
> 111 | 2 | 2 | TRUE
> 112 | 1 | 4 | FALSE
> 113 | 1 | 4 | FALSE
> 113 | 2 | 2 | FALSE
> 114 | 1 | 1 | FALSE
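As a quick check, the conditional-aggregation join can be run against the sample rows. The sketch below is only a test harness: it assumes SQLite driven from Python's sqlite3 module so that it is self-contained, uses a hypothetical table name `sample`, and quotes `"key"` with double quotes where T-SQL would use `[key]`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample ("key" INTEGER, atr2 INTEGER, atr3 INTEGER);
INSERT INTO sample VALUES
  (111, 1, 1), (111, 2, 2), (112, 1, 4),
  (113, 1, 4), (113, 2, 2), (114, 1, 1);
""")

# The inner query finds keys that have exactly two rows: one (1,1) and one (2,2).
# The outer LEFT JOIN flags every row of such a key as TRUE.
query = """
SELECT t."key", t.atr2, t.atr3,
       CASE WHEN g."key" IS NULL THEN 'FALSE' ELSE 'TRUE' END AS result
FROM sample t
LEFT JOIN (
    SELECT "key"
    FROM sample
    GROUP BY "key"
    HAVING SUM(CASE WHEN atr2 = 1 AND atr3 = 1 THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN atr2 = 2 AND atr3 = 2 THEN 1 ELSE 0 END) > 0
       AND COUNT(*) = 2
) g ON g."key" = t."key"
ORDER BY t."key", t.atr2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only key 111 satisfies all three HAVING conditions, so both of its rows are flagged.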

Related

Select only the "most complete" record

I need to solve the following problem.
Let's suppose I have a table with 4 fields called a, b, c, d.
I have the following records:
-------------------------------------
a | b | c | d
-------------------------------------
1 | 2 | | row 1
1 | 2 | 3 | 4 row 2
1 | 2 | | 4 row 3
1 | 2 | 3 | row 4
As it's possible to observe, rows 1,3,4 are "sub-records" of row 2.
What I would like to do is, to extract only 2nd row.
Could you help me please?
Thanks in advance for the answer
EDIT: I need to be more specific.
I could have also the cases:
-------------------------------------
a | b | c | d
-------------------------------------
1 | 2 | | row 1
1 | 2 | | 4 row 2
1 | | | 4 row 3
where I need to extract the 2nd row,
-------------------------------------
a | b | c | d
-------------------------------------
1 | 2 | | row 1
1 | 2 | 3 | row 2
1 | | 3 | row 3
and again I need to extract the 2nd row.
Same for couples,
a | b | c | d
-------------------------------------
1 | | | row 1
1 | | 3 | row 2
| | 3 | row 3
and so on for the other examples.
(Of course, it's not always the 2nd row.)
Using a NOT EXISTS the records that have a better duplicate can be filtered out.
create table abcd (
a int,
b int,
c int,
d int
);
insert into abcd (a, b, c, d) values
(1, 2, null, null)
,(1, 2, 3, 4)
,(1, 2, null, 4)
,(1, 2, 3, null)
,(2, 3, null,null)
,(2, 3, null, 5)
,(2, null, null, 5)
,(3, null, null, null)
,(3, null, 5, null)
,(null, null, 5, null);
SELECT *
FROM abcd AS t
WHERE NOT EXISTS
(
select 1
from abcd as d
where (t.a is null or d.a = t.a)
and (t.b is null or d.b = t.b)
and (t.c is null or d.c = t.c)
and (t.d is null or d.d = t.d)
and (case when t.a is null then 0 else 1 end +
case when t.b is null then 0 else 1 end +
case when t.c is null then 0 else 1 end +
case when t.d is null then 0 else 1 end) <
(case when d.a is null then 0 else 1 end +
case when d.b is null then 0 else 1 end +
case when d.c is null then 0 else 1 end +
case when d.d is null then 0 else 1 end)
);
a | b | c | d
-: | ---: | ---: | ---:
1 | 2 | 3 | 4
2 | 3 | null | 5
3 | null | 5 | null
db<>fiddle here
You will need to compute a "completion index" for each row. In the example you provided, you might use something along the lines of:
(CASE WHEN a IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN b IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN c IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN d IS NULL THEN 0 ELSE 1 END) AS CompletionIndex
Then SELECT the top 1 ordered by CompletionIndex in descending order.
This is obviously not very scalable across a large number of columns. But if you have a large number of sparsely populated columns you might consider a row-based rather than column-based structure for your data. That design would make it much easier to count the number of non-NULL values for each entity.
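A rough sketch of the completion-index idea, assuming SQLite run through Python's sqlite3 module and a hypothetical table `abcd` holding the four rows of the question's first example. `ORDER BY ... DESC` followed by taking the first row stands in for `TOP 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE abcd (a INTEGER, b INTEGER, c INTEGER, d INTEGER);
INSERT INTO abcd VALUES
  (1, 2, NULL, NULL), (1, 2, 3, 4), (1, 2, NULL, 4), (1, 2, 3, NULL);
""")

# CompletionIndex counts the non-NULL columns of each row.
rows = conn.execute("""
SELECT a, b, c, d,
       (CASE WHEN a IS NULL THEN 0 ELSE 1 END) +
       (CASE WHEN b IS NULL THEN 0 ELSE 1 END) +
       (CASE WHEN c IS NULL THEN 0 ELSE 1 END) +
       (CASE WHEN d IS NULL THEN 0 ELSE 1 END) AS CompletionIndex
FROM abcd
ORDER BY CompletionIndex DESC
""").fetchall()

best = rows[0]  # the row with the most non-NULL columns
print(best)
```

Note that this only ranks rows by how many columns are filled; it does not verify that the other rows are actually sub-records of the winner, and a table with several groups would need the ranking done per group.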
The most complete rows, by your definition, are the ones with the fewest null columns:
SELECT * FROM tablename
WHERE (
(CASE WHEN a IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN b IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN c IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN d IS NULL THEN 0 ELSE 1 END)
) =
(SELECT MAX(
(CASE WHEN a IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN b IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN c IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN d IS NULL THEN 0 ELSE 1 END))
FROM tablename)
Hmmm . . . I think you can use not exists:
with t as (
select t.*, row_number() over (order by a) as id
from t
)
select t.*
from t
where not exists (select 1
from t t2
where ((t2.a is not distinct from t.a or t2.a is not null and t.a is null) and
(t2.b is not distinct from t.b or t2.b is not null and t.b is null) and
(t2.c is not distinct from t.c or t2.c is not null and t.c is null) and
(t2.d is not distinct from t.d or t2.d is not null and t.d is null)
) and
t2.id <> t.id
);
The logic is that no more specific row exists, where the values match
Here is a db<>fiddle.
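As Gordon Linoff mentioned, this needs something like NOT EXISTS. Using EXCEPT also helps.
This might work...
SELECT * from table1
EXCEPT
(
SELECT t1.*
FROM table1 t1
JOIN table1 t2
ON COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)
AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)
AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)
AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)
AND ( (t1.a IS NULL AND t2.a IS NOT NULL)
OR (t1.b IS NULL AND t2.b IS NOT NULL)
OR (t1.c IS NULL AND t2.c IS NOT NULL)
OR (t1.d IS NULL AND t2.d IS NOT NULL) )
)
Here, t1 is every subset row. The last condition requires t2 to fill at least one column that t1 leaves NULL; without it every row would match itself and EXCEPT would remove all rows.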
Note: We are assuming value -1 as sentinel value and it does not occur in any column.
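A sketch of the EXCEPT approach on the sample rows used earlier in this thread, with the join restricted to pairs where t2 fills at least one column that t1 leaves NULL (so a row is never paired with itself). It assumes SQLite via Python's sqlite3 module and a hypothetical table `abcd`; SQLite does not accept parentheses around the second SELECT of a compound query, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE abcd (a INTEGER, b INTEGER, c INTEGER, d INTEGER);
INSERT INTO abcd VALUES
  (1, 2, NULL, NULL), (1, 2, 3, 4), (1, 2, NULL, 4), (1, 2, 3, NULL),
  (2, 3, NULL, NULL), (2, 3, NULL, 5), (2, NULL, NULL, 5),
  (3, NULL, NULL, NULL), (3, NULL, 5, NULL), (NULL, NULL, 5, NULL);
""")

# Second SELECT: every row t1 that is a strict subset of some row t2.
# -1 is the sentinel for NULL and is assumed not to occur as real data.
rows = conn.execute("""
SELECT a, b, c, d FROM abcd
EXCEPT
SELECT t1.a, t1.b, t1.c, t1.d
FROM abcd t1
JOIN abcd t2
  ON COALESCE(t1.a, t2.a, -1) = COALESCE(t2.a, -1)
 AND COALESCE(t1.b, t2.b, -1) = COALESCE(t2.b, -1)
 AND COALESCE(t1.c, t2.c, -1) = COALESCE(t2.c, -1)
 AND COALESCE(t1.d, t2.d, -1) = COALESCE(t2.d, -1)
 AND (   (t1.a IS NULL AND t2.a IS NOT NULL)
      OR (t1.b IS NULL AND t2.b IS NOT NULL)
      OR (t1.c IS NULL AND t2.c IS NOT NULL)
      OR (t1.d IS NULL AND t2.d IS NOT NULL))
ORDER BY a
""").fetchall()
print(rows)
```

On this data the three surviving rows are the same ones the NOT EXISTS answer returns.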

Parse column B content into logical parts according to column A

I have a table like this
sessionId | hostname
------ | ------
a1 | domain1
a1 | domain2
a2 | domain1
a3 | domain1
a3 | domain2
a4 | domain2
What I want is to build a logical table containing the following:
sessionId | only domain1 | only domain2 | domain1 OR domain2 | domain1 AND domain2
-----------|----------------|--------------|--------------------|--------------------
a1 | 1 | 1 | 1 | 1
a2 | 1 | 0 | 1 | 0
a3 | 1 | 1 | 1 | 1
a4 | 0 | 1 | 1 | 0
I guess there's a simple solution for this, but I can't get my head over it :(
You can use conditional aggregation:
select sessionid,
(case when sum(case when hostname = 'domain1' then 1 else 0 end) > 0
then 1 else 0
end) as domain1,
(case when sum(case when hostname = 'domain2' then 1 else 0 end) > 0
then 1 else 0
end) as domain2,
(case when sum(case when hostname = 'domain1' then 1 else 0 end) > 0 or
sum(case when hostname = 'domain2' then 1 else 0 end) > 0
then 1 else 0
end) as either,
(case when sum(case when hostname = 'domain1' then 1 else 0 end) > 0 and
sum(case when hostname = 'domain2' then 1 else 0 end) > 0
then 1 else 0
end) as both
from t
group by sessionid;
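The same conditional-aggregation idea can be checked against the question's data. This is a sketch under a few assumptions: SQLite via Python's sqlite3 module, the table name `t` from the answer, `MAX(...)` swapped in as a shorter equivalent of `SUM(...) > 0` for 0/1 flags, and the alias `both_domains` chosen to avoid the keyword `both`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (sessionId TEXT, hostname TEXT);
INSERT INTO t VALUES
  ('a1', 'domain1'), ('a1', 'domain2'), ('a2', 'domain1'),
  ('a3', 'domain1'), ('a3', 'domain2'), ('a4', 'domain2');
""")

# MAX over a 0/1 flag is 1 when at least one row in the group matches.
rows = conn.execute("""
SELECT sessionId,
       MAX(CASE WHEN hostname = 'domain1' THEN 1 ELSE 0 END) AS domain1,
       MAX(CASE WHEN hostname = 'domain2' THEN 1 ELSE 0 END) AS domain2,
       MAX(CASE WHEN hostname IN ('domain1', 'domain2') THEN 1 ELSE 0 END) AS either,
       MAX(CASE WHEN hostname = 'domain1' THEN 1 ELSE 0 END) *
       MAX(CASE WHEN hostname = 'domain2' THEN 1 ELSE 0 END) AS both_domains
FROM t
GROUP BY sessionId
ORDER BY sessionId
""").fetchall()
for row in rows:
    print(row)
```

The output matches the table requested in the question, one row per session.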
Try this :
Declare @Table as Table (sessionId varchar(100),hostname varchar(100))
Insert into @Table Values
('a1','domain1'),
('a1','domain2'),
('a2','domain1'),
('a3','domain1'),
('a3','domain2'),
('a4','domain2')
Select distinct T.sessionId,
case when s1.sessionid is null then 0 else 1 end [only domain1],
case when s2.sessionid is null then 0 else 1 end [only domain2],
case when
(
case when s1.sessionid is null then 0 else 1 end = 1 or
case when s2.sessionid is null then 0 else 1 end = 1
) then 1 else 0 end [domain1 OR domain2],
case when
(
case when s1.sessionid is null then 0 else 1 end = 1 and
case when s2.sessionid is null then 0 else 1 end = 1
) then 1 else 0 end [domain1 AND domain2]
from @Table T
Left Join
(
Select sessionId From @Table where hostname = 'domain1'
) s1 on s1.sessionId = T.sessionId
Left Join
(
Select sessionId From @Table where hostname = 'domain2'
) s2 on s2.sessionId = T.sessionId
For BigQuery Standard SQL
#standardSQL
SELECT
sessionId,
SIGN(COUNTIF(hostname='domain1')) only_domain1,
SIGN(COUNTIF(hostname='domain2')) only_domain2,
SIGN(COUNTIF(hostname='domain1')+COUNTIF(hostname='domain2')) domain1_or_domain2,
SIGN(COUNTIF(hostname='domain1')*COUNTIF(hostname='domain2')) domain1_and_domain2
FROM `yourproject.yourdataset.yourtable`
GROUP BY sessionId
You can test / play with it using the dummy data from your question:
#standardSQL
WITH `yourproject.yourdataset.yourtable` AS (
SELECT 'a1' sessionId, 'domain1' hostname UNION ALL
SELECT 'a1', 'domain2' UNION ALL
SELECT 'a2', 'domain1' UNION ALL
SELECT 'a3', 'domain1' UNION ALL
SELECT 'a3', 'domain2' UNION ALL
SELECT 'a4', 'domain2'
)
SELECT
sessionId,
SIGN(COUNTIF(hostname='domain1')) only_domain1,
SIGN(COUNTIF(hostname='domain2')) only_domain2,
SIGN(COUNTIF(hostname='domain1')+COUNTIF(hostname='domain2')) domain1_or_domain2,
SIGN(COUNTIF(hostname='domain1')*COUNTIF(hostname='domain2')) domain1_and_domain2
FROM `yourproject.yourdataset.yourtable`
GROUP BY sessionId
ORDER BY sessionId

SQL check if column contains specific values

I have a table like this:
id | Values
------------------
1 | a
1 | b
1 | c
1 | d
1 | e
2 | a
2 | a
2 | c
2 | c
2 | e
3 | a
3 | c
3 | b
3 | d
Now I want to know which id contains at least one of a, one of b and one of c.
This is the result I want:
id
--------
1
3
One method is aggregation with having:
select id
from t
where values in ('a', 'b', 'c')
group by id
having count(distinct values) = 3;
If you wanted more flexibility with the counts of each value:
having sum(case when values = 'a' then 1 else 0 end) >= 1 and
sum(case when values = 'b' then 1 else 0 end) >= 1 and
sum(case when values = 'c' then 1 else 0 end) >= 1
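Both variants can be tried on the question's rows. The sketch below assumes SQLite via Python's sqlite3 module and renames the column to a hypothetical `val`, since `values` is a reserved word in most dialects and would otherwise need quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, val TEXT);
INSERT INTO t VALUES
  (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (1, 'e'),
  (2, 'a'), (2, 'a'), (2, 'c'), (2, 'c'), (2, 'e'),
  (3, 'a'), (3, 'c'), (3, 'b'), (3, 'd');
""")

# Variant 1: keep only a/b/c rows, then require three distinct values per id.
by_distinct = [r[0] for r in conn.execute("""
SELECT id FROM t
WHERE val IN ('a', 'b', 'c')
GROUP BY id
HAVING COUNT(DISTINCT val) = 3
ORDER BY id
""")]

# Variant 2: one conditional sum per required value.
by_sums = [r[0] for r in conn.execute("""
SELECT id FROM t
GROUP BY id
HAVING SUM(CASE WHEN val = 'a' THEN 1 ELSE 0 END) >= 1
   AND SUM(CASE WHEN val = 'b' THEN 1 ELSE 0 END) >= 1
   AND SUM(CASE WHEN val = 'c' THEN 1 ELSE 0 END) >= 1
ORDER BY id
""")]

print(by_distinct, by_sums)
```

Id 2 is excluded by both variants because it has no 'b', even though it has several 'a' and 'c' rows.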
You can use grouping:
SELECT id
FROM your_table
GROUP BY id
HAVING SUM(CASE WHEN value = 'a' THEN 1 ELSE 0 END) >= 1
AND SUM(CASE WHEN value = 'b' THEN 1 ELSE 0 END) >= 1
AND SUM(CASE WHEN value = 'c' THEN 1 ELSE 0 END) >= 1;
or using COUNT:
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(CASE WHEN value = 'a' THEN 1 END) >= 1
AND COUNT(CASE WHEN value = 'b' THEN 1 END) >= 1
AND COUNT(CASE WHEN value = 'c' THEN 1 END) >= 1;

SQL: Get multiple line entries linked to one item?

I have a table:
ID | ITEMID | STATUS | TYPE
1 | 123 | 5 | 1
2 | 123 | 4 | 2
3 | 123 | 5 | 3
4 | 125 | 3 | 1
5 | 125 | 5 | 3
Any item can have 0 to many entries in this table. I need a query that will tell me if an ITEM has all its entries in either a state of 5 or 4. For example, in the above example, I would like to end up with the result:
ITEMID | REQUIREMENTS_MET
123 | TRUE --> true because all statuses are either 5 or 4
125 | FALSE --> false because it has a status of 3 and a status of 5.
If the 3 was a 4 or 5, then this would be true
What would be even better is something like this:
ITEMID | MET_REQUIREMENTS | NOT_MET_REQUIREMENTS
123 | 3 | 0
125 | 1 | 1
Any idea how to write a query for that?
Fast, short, simple:
SELECT itemid
,count(status = 4 OR status = 5 OR NULL) AS met_requirements
,count(status < 4 OR status > 5 OR NULL) AS not_met_requirements
FROM tbl
GROUP BY itemid
ORDER BY itemid;
Assuming all columns to be integer NOT NULL.
Builds on basic boolean logic:
TRUE OR NULL yields TRUE
FALSE OR NULL yields NULL
And NULL is not counted by count().
->SQLfiddle demo.
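The `OR NULL` trick can be seen in isolation with a small sketch. It assumes SQLite via Python's sqlite3 module, where boolean results are the integers 1 and 0 rather than a dedicated boolean type, and it reuses the table name `tbl` from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INTEGER, itemid INTEGER, status INTEGER, type INTEGER);
INSERT INTO tbl VALUES
  (1, 123, 5, 1), (2, 123, 4, 2), (3, 123, 5, 3),
  (4, 125, 3, 1), (5, 125, 5, 3);
""")

# Three-valued logic: true OR NULL stays true, false OR NULL becomes NULL.
truth = conn.execute("SELECT 1 OR NULL, 0 OR NULL").fetchone()

# count() skips NULL, so only rows where the condition is true are counted.
rows = conn.execute("""
SELECT itemid,
       count(status = 4 OR status = 5 OR NULL) AS met_requirements,
       count(status < 4 OR status > 5 OR NULL) AS not_met_requirements
FROM tbl
GROUP BY itemid
ORDER BY itemid
""").fetchall()

print(truth)
print(rows)
```

Item 123 has three matching rows and item 125 has one matching and one non-matching row, which is the breakdown asked for in the question.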
SELECT a.ITEMID FROM (SELECT ITEMID, MIN(STATUS) AS MINSTATUS, MAX(STATUS) AS MAXSTATUS FROM TABLE_NAME GROUP BY ITEMID) AS a
WHERE a.MINSTATUS >= 4 AND a.MAXSTATUS <= 5
One way of doing this would be
SELECT t1.itemid, NOT EXISTS(SELECT 1
FROM mytable t2
WHERE itemid=t1.itemid
AND status NOT IN (4, 5)) AS requirements_met
FROM mytable t1
GROUP BY t1.itemid
UPDATE: for your updated requirement, you can have something like:
SELECT itemid,
sum(CASE WHEN status IN (4, 5) THEN 1 ELSE 0 END) as met_requirements,
sum(CASE WHEN status IN (4, 5) THEN 0 ELSE 1 END) as not_met_requirements
FROM mytable
GROUP BY itemid
simple one:
select
"ITEMID",
case
when min("STATUS") in (4, 5) and max("STATUS") in (4, 5) then 'True'
else 'False'
end as requirements_met
from table1
group by "ITEMID"
better one:
select
"ITEMID",
sum(case when "STATUS" in (4, 5) then 1 else 0 end) as MET_REQUIREMENTS,
sum(case when "STATUS" in (4, 5) then 0 else 1 end) as NOT_MET_REQUIREMENTS
from table1
group by "ITEMID";
sql fiddle demo
WITH dom AS (
SELECT DISTINCT item_id FROM items
)
, yes AS ( SELECT item_id, COUNT(*) AS good_count FROM items WHERE status IN (4,5) GROUP BY item_id
)
, no AS ( SELECT item_id, COUNT(*) AS bad_count FROM items WHERE status NOT IN (4,5) GROUP BY item_id
)
SELECT d.item_id
, COALESCE(y.good_count,0) AS good_count
, COALESCE(n.bad_count,0) AS bad_count
FROM dom d
LEFT JOIN yes y ON y.item_id = d.item_id
LEFT JOIN no n ON n.item_id = d.item_id
;
Can be done with an outer join, too:
WITH yes AS ( SELECT item_id, COUNT(*) AS good_count FROM items WHERE status IN (4,5) GROUP BY item_id)
, no AS ( SELECT item_id, COUNT(*) AS bad_count FROM items WHERE status NOT IN (4,5) GROUP BY item_id)
SELECT COALESCE(y.item_id, n.item_id) AS item_id
, COALESCE(y.good_count,0) AS good_count
, COALESCE(n.bad_count,0) AS bad_count
FROM yes y
FULL JOIN no n ON n.item_id = y.item_id
;
Nevermind, it was actually easy to do:
select ITEM_ID ,
sum (case when STATUS >= 3 then 1 else 0 end ) as met_requirements,
sum (case when STATUS < 3 then 1 else 0 end ) as not_met_requirements
from TABLE as d
group by ITEM_ID

How to count and group by combinations of multiple columns?

For a table such as:
foo_table
id | str_col | bool_col
1 "1234" 0
2 "3215" 0
3 "8132" 1
4 NULL 1
5 "" 1
6 "" 0
I know how to query both of:
count(*) | bool_col
3 0
3 1
and
count(*) | isnull(str_col) or str_col = ""
3 0
3 1
but how could I get something like:
count(*) | bool_col | isnull(str_col) or str_col = ""
2 0 0
1 0 1
1 1 0
2 1 1
In the meantime, I'm just individually doing:
select count(*) from foo_table where bool_col and (isnull(str_col) or str_col = "");
select count(*) from foo_table where not bool_col and (isnull(str_col) or str_col = "");
select count(*) from foo_table where bool_col and not (isnull(str_col) or str_col = "");
select count(*) from foo_table where not bool_col and not (isnull(str_col) or str_col = "");
Try
SELECT COUNT(*),
bool_col,
CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END str_col
FROM foo_table
GROUP BY bool_col,
CASE WHEN str_col IS NULL OR str_col = '' THEN 1 ELSE 0 END
Output (MySql):
| COUNT(*) | BOOL_COL | STR_COL |
---------------------------------
| 2 | 0 | 0 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
| 2 | 1 | 1 |
SQLFiddle MySQL
SQLFiddle SQL Server
SELECT COUNT(CASE
WHEN bool_col AND (isnull(str_col) or str_col = "") THEN 1
END) as c1,
COUNT(CASE
WHEN not bool_col and (isnull(str_col) or str_col = "") THEN 1
END) as c2,
COUNT(CASE
WHEN bool_col and not (isnull(str_col) or str_col = "") THEN 1
END) as c3,
COUNT(CASE
WHEN not bool_col and not (isnull(str_col) or str_col = "") THEN 1
END) as c4
FROM table1
In Oracle there is a built-in GROUP BY extension for that, called CUBE:
select bool_col ,
case when str_col is null or str_col = '' then 1 else 0 end str_col ,
count(*)
from table1
group by cube (bool_col , case when str_col is null or str_col = '' then 1 else 0 end)
CUBE gives you every combination of the grouped expressions, plus subtotal and grand-total rows. There is also ROLLUP, which is a special case of CUBE.