I have a dataset that looks similar to this:
UniqueRef Description Date
1 Y 14/04/2020
2 Y 02/04/2020
2 X 07/04/2020
2 Y 12/04/2020
3 X 16/04/2020
3 Y 24/04/2020
4 Y 24/04/2020
4 X 21/04/2020
4 X 14/05/2020
4 Y 23/03/2020
I want to check whether the description was ever equal to X within each unique ref. I would also like a separate column holding the date on which the description was X. If a unique ref has been X more than once, the date should be the most recent one.
This would give an output similar to:
UniqueRef Description Date Check CheckDate
1 Y 14/04/2020 No NA
2 Y 02/04/2020 Yes 07/04/2020
2 X 07/04/2020 Yes 07/04/2020
2 Y 12/04/2020 Yes 07/04/2020
3 X 16/04/2020 Yes 16/04/2020
3 Y 24/04/2020 Yes 16/04/2020
4 Y 24/04/2020 Yes 14/05/2020
4 X 21/04/2020 Yes 14/05/2020
4 X 14/05/2020 Yes 14/05/2020
4 Y 23/03/2020 Yes 14/05/2020
You can use window functions:
select t.*,
max(case when description = 'X' then 'Y' else 'N' end) over (partition by uniqueref) as ever_x,
max(case when description = 'X' then date end) over (partition by uniqueref) as x_date
from t;
Seems like a couple of windowed conditional aggregates get you what you're after here:
WITH YourTable AS(
SELECT V.UniqueRef,
V.[Description],
CONVERT(date,V.[Date],103) AS [Date]
FROM (VALUES(1,'Y','14/04/2020'),
(2,'Y','02/04/2020'),
(2,'X','07/04/2020'),
(2,'Y','12/04/2020'),
(3,'X','16/04/2020'),
(3,'Y','24/04/2020'),
(4,'Y','24/04/2020'),
(4,'X','21/04/2020'),
(4,'X','14/05/2020'),
(4,'Y','23/03/2020'))V(UniqueRef,[Description],[Date]))
SELECT YT.UniqueRef,
YT.[Description],
YT.[Date],
MAX(CASE YT.[Description] WHEN 'X' THEN 'Yes' ELSE 'No' END) OVER (PARTITION BY YT.UniqueRef) AS [Check], --CHECK is a reserved keyword; I suggest a different name
MAX(CASE YT.[Description] WHEN 'X' THEN YT.[Date] END) OVER (PARTITION BY YT.UniqueRef) AS CheckDate
FROM YourTable YT;
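The windowed conditional aggregates above can be tried without a database server. The following is a minimal sketch using Python's built-in sqlite3 module, assuming a bundled SQLite of 3.25 or later (needed for window functions). The table name t, the column alias CheckFlag and the ISO-formatted date strings are choices made for this illustration, not part of the original answers; ISO strings are used so that MAX() on text picks the latest date.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as yyyy-mm-dd so that
# string comparison orders them chronologically (an assumption of this sketch).
rows = [
    (1, "Y", "2020-04-14"), (2, "Y", "2020-04-02"), (2, "X", "2020-04-07"),
    (2, "Y", "2020-04-12"), (3, "X", "2020-04-16"), (3, "Y", "2020-04-24"),
    (4, "Y", "2020-04-24"), (4, "X", "2020-04-21"), (4, "X", "2020-05-14"),
    (4, "Y", "2020-03-23"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (UniqueRef INTEGER, Description TEXT, Date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

query = """
SELECT UniqueRef, Description, Date,
       -- 'Yes' sorts after 'No', so MAX() yields 'Yes' if any row is X
       MAX(CASE WHEN Description = 'X' THEN 'Yes' ELSE 'No' END)
           OVER (PARTITION BY UniqueRef) AS CheckFlag,
       -- non-X rows contribute NULL, which MAX() ignores
       MAX(CASE WHEN Description = 'X' THEN Date END)
           OVER (PARTITION BY UniqueRef) AS CheckDate
FROM t
ORDER BY UniqueRef, Date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Every input row is kept, and UniqueRef 1 gets a NULL (None in Python) CheckDate because it never had description X.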
I have the data set below with ID, Date and Value. I want to flag each ID where three consecutive days have value 0.
id date value
1 8/10/2021 1
1 8/11/2021 0
1 8/12/2021 0
1 8/13/2021 0
1 8/14/2021 5
2 8/10/2021 2
2 8/11/2021 3
2 8/12/2021 0
2 8/13/2021 0
2 8/14/2021 6
3 8/10/2021 3
3 8/11/2021 4
3 8/12/2021 0
3 8/13/2021 0
3 8/14/2021 0
Output:
id date value Flag
1 8/10/2021 1 Y
1 8/11/2021 0 Y
1 8/12/2021 0 Y
1 8/13/2021 0 Y
1 8/14/2021 5 Y
2 8/10/2021 2 N
2 8/11/2021 3 N
2 8/12/2021 0 N
2 8/13/2021 0 N
2 8/14/2021 6 N
3 8/10/2021 3 Y
3 8/11/2021 4 Y
3 8/12/2021 0 Y
3 8/13/2021 0 Y
3 8/14/2021 0 Y
Thank you.
Using the count() window function you can count the 0s in the frame [current row, 2 following] (ordered by date), that is, a frame of three consecutive rows calculated for each row:
count(case when value=0 then 1 else null end) over(partition by id order by date_ rows between current row and 2 following ) cnt.
If the count equals 3, then 3 consecutive 0s were found; the case expression produces Y for each row with cnt=3: case when cnt=3 then 'Y' else 'N' end. Note that the frame counts rows, not calendar days, so this assumes one row per id per day with no gaps.
To propagate the 'Y' flag to the whole id group, use max(...) over (partition by id).
Demo with your data example (tested on Hive):
with mydata as (--Data example, dates converted to sortable format yyyy-MM-dd
select 1 id,'2021-08-10' date_, 1 value union all
select 1,'2021-08-11',0 union all
select 1,'2021-08-12',0 union all
select 1,'2021-08-13',0 union all
select 1,'2021-08-14',5 union all
select 2,'2021-08-10',2 union all
select 2,'2021-08-11',3 union all
select 2,'2021-08-12',0 union all
select 2,'2021-08-13',0 union all
select 2,'2021-08-14',6 union all
select 3,'2021-08-10',3 union all
select 3,'2021-08-11',4 union all
select 3,'2021-08-12',0 union all
select 3,'2021-08-13',0 union all
select 3,'2021-08-14',0
) --End of data example, use your table instead of this CTE
select id, date_, value,
max(case when cnt=3 then 'Y' else 'N' end) over (partition by id) flag
from
(
select id, date_, value,
count(case when value=0 then 1 else null end) over(partition by id order by date_ rows between current row and 2 following ) cnt
from mydata
)s
order by id, date_ --remove ordering if not necessary
--added it to get result in the same order
Result:
id date_ value flag
1 2021-08-10 1 Y
1 2021-08-11 0 Y
1 2021-08-12 0 Y
1 2021-08-13 0 Y
1 2021-08-14 5 Y
2 2021-08-10 2 N
2 2021-08-11 3 N
2 2021-08-12 0 N
2 2021-08-13 0 N
2 2021-08-14 6 N
3 2021-08-10 3 Y
3 2021-08-11 4 Y
3 2021-08-12 0 Y
3 2021-08-13 0 Y
3 2021-08-14 0 Y
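The Hive demo above can also be reproduced locally. This is a rough sketch using Python's sqlite3 module, assuming SQLite 3.25 or later for window frames; the table name mydata and the date_ column follow the answer, while the Python variable names are only illustrative.

```python
import sqlite3

# Values for five consecutive days (2021-08-10 .. 2021-08-14) per id,
# taken from the question.
values_by_id = {1: [1, 0, 0, 0, 5], 2: [2, 3, 0, 0, 6], 3: [3, 4, 0, 0, 0]}
rows = [
    (id_, f"2021-08-{10 + day}", value)
    for id_, values in values_by_id.items()
    for day, value in enumerate(values)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (id INTEGER, date_ TEXT, value INTEGER)")
conn.executemany("INSERT INTO mydata VALUES (?, ?, ?)", rows)

query = """
SELECT id, date_, value,
       -- spread 'Y' to every row of an id that has a full window of zeros
       MAX(CASE WHEN cnt = 3 THEN 'Y' ELSE 'N' END) OVER (PARTITION BY id) AS flag
FROM (
    SELECT id, date_, value,
           -- number of zeros among this row and the two rows after it
           COUNT(CASE WHEN value = 0 THEN 1 END)
               OVER (PARTITION BY id ORDER BY date_
                     ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS cnt
    FROM mydata
) s
ORDER BY id, date_
"""
result = conn.execute(query).fetchall()
flags = {row[0]: row[3] for row in result}
print(flags)
```

Near the end of each partition the frame holds fewer than three rows, so cnt cannot reach 3 there; that is why a trailing pair of zeros is never flagged.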
You can identify the ids by comparing lag()s. Then spread the value across all rows. The following gets the flag on the third 0:
select t.*,
(case when value = 0 and prev_value_date_2 = prev_date_2
then 'Y' else 'N'
end) as flag_on_row
from (select t.*,
lag(date, 2) over (partition by value, id order by date) as prev_value_date_2,
lag(date, 2) over (partition by id order by date) as prev_date_2
from t
) t;
The above logic uses lag() so it is easy to extend to longer streaks of 0s. The "2" is looking two rows behind, so if the lagged values are the same, then there are three rows in a row with the same value.
And to spread the value:
select t.*, max(flag_on_row) over (partition by id) as flag
from (select t.*,
(case when value = 0 and prev_value_date_2 = prev_date_2
then 'Y' else 'N'
end) as flag_on_row
from (select t.*,
lag(date, 2) over (partition by value, id order by date) as prev_value_date_2,
lag(date, 2) over (partition by id order by date) as prev_date_2
from t
) t
) t;
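To illustrate the remark that the lag() logic extends to longer streaks, here is a small sketch in Python with sqlite3 (SQLite 3.25 or later assumed). The helper flag_streaks and its streak_len parameter are hypothetical names introduced for this example, and the date column is called date_ to avoid clashing with the type name.

```python
import sqlite3

# Same sample data as the question: five consecutive days per id.
values_by_id = {1: [1, 0, 0, 0, 5], 2: [2, 3, 0, 0, 6], 3: [3, 4, 0, 0, 0]}
rows = [
    (id_, f"2021-08-{10 + day}", value)
    for id_, values in values_by_id.items()
    for day, value in enumerate(values)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date_ TEXT, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)


def flag_streaks(conn, streak_len):
    """Return {id: 'Y' or 'N'}: does the id have streak_len zero rows in a row?"""
    offset = int(streak_len) - 1  # rows to look back; int() keeps the SQL safe
    query = f"""
    SELECT DISTINCT id, MAX(flag_on_row) OVER (PARTITION BY id) AS flag
    FROM (SELECT id,
                 -- the two lags agree only if no other value interrupted the run
                 CASE WHEN value = 0 AND prev_value_date = prev_date
                      THEN 'Y' ELSE 'N' END AS flag_on_row
          FROM (SELECT id, value,
                       LAG(date_, {offset}) OVER (PARTITION BY value, id ORDER BY date_) AS prev_value_date,
                       LAG(date_, {offset}) OVER (PARTITION BY id ORDER BY date_) AS prev_date
                FROM t))
    """
    return dict(conn.execute(query).fetchall())


print(flag_streaks(conn, 3))
print(flag_streaks(conn, 2))
```

Only the lag offset changes with the streak length, which is the property the answer points out.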
I have a table X which has three columns: SN, OI and FLAG.
Some sample values are:
SN OI FLAG
1 a Y
1 a N
2 x N
3 d N
3 d Null
4 z Y
4 z null
5 k Y
5 k Y
5 k Y
6 l N
6 l N
I want the result to follow the conditions below.
If there are multiple rows for the same SN: if FLAG has both Y and N, it should show Y; if FLAG has Null and N, it should show N; if FLAG has Y and Null, it should show Y. So for the above example this is what I should get:
SN FLAG
1 Y
2 N
3 N
4 Y
5 Y
6 N
You can group by sn and get the flag with conditional aggregation:
select sn,
case
when sum(case flag when 'Y' then 1 end) > 0 then 'Y'
when sum(case flag when 'N' then 1 end) > 0 then 'N'
end flag
from tablename
group by sn
order by sn
In your special case, this should also work:
select sn, max(flag) flag
from tablename
group by sn
order by sn
because 'Y' > 'N' and max() ignores nulls.
See the demo.
Results:
> SN | FLAG
> -: | :---
> 1 | Y
> 2 | N
> 3 | N
> 4 | Y
> 5 | Y
> 6 | N
For your given rules, you can just use MAX():
select sn, max(flag) as flag
from t
group by sn;
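As a quick check that MAX() and the explicit conditional aggregation agree on the sample data, here is a minimal sketch using Python's sqlite3 module. The table name tablename follows the first answer; the Python variable names are only illustrative. It relies on two facts: 'Y' sorts after 'N', and aggregate functions skip NULLs.

```python
import sqlite3

# Sample rows from the question; None becomes SQL NULL.
rows = [
    (1, "a", "Y"), (1, "a", "N"), (2, "x", "N"), (3, "d", "N"), (3, "d", None),
    (4, "z", "Y"), (4, "z", None), (5, "k", "Y"), (5, "k", "Y"), (5, "k", "Y"),
    (6, "l", "N"), (6, "l", "N"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (sn INTEGER, oi TEXT, flag TEXT)")
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", rows)

# Shortcut: MAX() skips NULLs and 'Y' > 'N'.
by_max = conn.execute(
    "SELECT sn, MAX(flag) AS flag FROM tablename GROUP BY sn ORDER BY sn"
).fetchall()

# Explicit rules: Y wins over N, N wins over NULL.
by_case = conn.execute("""
SELECT sn,
       CASE
           WHEN SUM(CASE flag WHEN 'Y' THEN 1 END) > 0 THEN 'Y'
           WHEN SUM(CASE flag WHEN 'N' THEN 1 END) > 0 THEN 'N'
       END AS flag
FROM tablename
GROUP BY sn
ORDER BY sn
""").fetchall()

print(by_max)
print(by_case == by_max)
```

The explicit CASE version is the safer choice if other flag values could appear later, since MAX() depends purely on sort order.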
This is a tough one. I've read about concatenating values from multiple rows in a table, but can't find anything on how to go about the task set before me.
I'm not an Oracle man, and until now I have only written simple select queries, so I'm at a loss here.
In a huge Oracle database table (several hundred million rows) containing laboratory results, I need to select information on specific requisitions that meet specific criteria.
Criteria: for the same ReqNo, analyses A, B and C must all be present with an answer; if they are, any answers to analyses X, Y or Z should be selected as well.
Table contents:
ReqNo Ana Answer
1 A 7
1 B 14
1 C 18
1 X 250
2 A 8
2 X 35
2 Y 125
3 A 8
3 B 16
3 C 20
3 Z 100
4 X 115
4 Y 355
5 A 6
5 B 15
5 C 22
5 X 300
5 Y 108
5 Z 88
Desired result:
ReqNo  A   B   C   X    Y    Z
1      7   14  18  250
3      8   16  20            100
5      6   15  22  300  108  88
leaving out ReqNo 2 and 4, since they don't meet the A/B/C criteria.
Is that even possible?
You may first filter the records that have all 3 (A,B and C) and then use PIVOT to convert them to columns for those which satisfy the criteria.
with req
AS
(
select reqno from t where ana IN ('A','B','C')
GROUP BY reqno HAVING
count(DISTINCT ana) = 3
)
select * FROM
(
select * from t where
exists ( select 1 from req r where t.reqno = r.reqno )
)
PIVOT(
min(answer) for ana in ('A' as A, 'B' as B, 'C' as C,
'X' as X, 'Y' as Y, 'Z' as Z)
) ORDER BY reqno;
Demo
I would just use conditional aggregation:
select reqno,
max(case when Ana = 'A' then Answer end) as a,
max(case when Ana = 'B' then Answer end) as b,
max(case when Ana = 'C' then Answer end) as c,
max(case when Ana = 'X' then Answer end) as x,
max(case when Ana = 'Y' then Answer end) as y,
max(case when Ana = 'Z' then Answer end) as z
from t
group by reqno
having sum(case when Ana = 'A' then 1 else 0 end) > 0 and
sum(case when Ana = 'B' then 1 else 0 end) > 0 and
sum(case when Ana = 'C' then 1 else 0 end) > 0 ;
Given that you don't seem to have duplicates, you can simplify the having to:
having sum(case when Ana in ('A', 'B', 'C') then 1 else 0 end) = 3
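The conditional-aggregation pivot can be checked against the sample data with a short sketch in Python's sqlite3 module. Two things here are assumptions of the sketch rather than part of the answers: the last ReqNo 5 row is entered as analysis Z so that it matches the desired result, and the COUNT(DISTINCT ...) variant of the HAVING clause is an alternative that stays correct if an analysis is repeated. The helper name pivot is hypothetical.

```python
import sqlite3

# Sample rows (ReqNo, Ana, Answer); the final row is entered as Z (assumption).
rows = [
    (1, "A", 7), (1, "B", 14), (1, "C", 18), (1, "X", 250),
    (2, "A", 8), (2, "X", 35), (2, "Y", 125),
    (3, "A", 8), (3, "B", 16), (3, "C", 20), (3, "Z", 100),
    (4, "X", 115), (4, "Y", 355),
    (5, "A", 6), (5, "B", 15), (5, "C", 22), (5, "X", 300), (5, "Y", 108), (5, "Z", 88),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ReqNo INTEGER, Ana TEXT, Answer INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)


def pivot(having):
    """Pivot answers into columns, keeping only groups that satisfy `having`."""
    query = f"""
    SELECT ReqNo,
           MAX(CASE WHEN Ana = 'A' THEN Answer END) AS a,
           MAX(CASE WHEN Ana = 'B' THEN Answer END) AS b,
           MAX(CASE WHEN Ana = 'C' THEN Answer END) AS c,
           MAX(CASE WHEN Ana = 'X' THEN Answer END) AS x,
           MAX(CASE WHEN Ana = 'Y' THEN Answer END) AS y,
           MAX(CASE WHEN Ana = 'Z' THEN Answer END) AS z
    FROM t
    GROUP BY ReqNo
    HAVING {having}
    ORDER BY ReqNo
    """
    return conn.execute(query).fetchall()


# HAVING clause from the answer: each of A, B and C occurs at least once.
three_sums = pivot("""
    SUM(CASE WHEN Ana = 'A' THEN 1 ELSE 0 END) > 0 AND
    SUM(CASE WHEN Ana = 'B' THEN 1 ELSE 0 END) > 0 AND
    SUM(CASE WHEN Ana = 'C' THEN 1 ELSE 0 END) > 0
""")

# Duplicate-safe shortcut (an alternative, not from the answers).
distinct_count = pivot(
    "COUNT(DISTINCT CASE WHEN Ana IN ('A', 'B', 'C') THEN Ana END) = 3"
)

for row in three_sums:
    print(row)
```

Missing analyses come back as NULL (None in Python), which corresponds to the blank cells in the desired result.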
Allow me to preface this by saying that I am fairly new to sql, and I'm sure there is an easy way to do this that I'm not understanding.
Let's say we have a table:
X | Y
2 | 2
3 | 1
3 | 3
3 | 2
I am trying to find values of y such that x contains both 2 and 3.
Basically, y = 2 is the only value that satisfies this.
EDIT: I know that in relational algebra this is trivial with division
Use a conditional SUM. If any row in a group of Y has X = 2, the first sum will be greater than 0; the same goes for 3:
SELECT Y
FROM YourTable
GROUP BY Y
HAVING SUM(CASE WHEN X = 2 THEN 1 ELSE 0 END) > 0
and SUM(CASE WHEN X = 3 THEN 1 ELSE 0 END) > 0
You could probably try this:
select y
from test
where x in (2,3)
group by y
having count(*) = 2;
EDIT: Note a good recommendation by Juan. In case your data contains duplicate rows (for example, X=2 and Y=2 more than once), a better way of writing the query would be this:
select y
from test
where x in (2,3)
group by y
having count(distinct x) = 2;
I'd use INTERSECT:
SELECT Y
FROM YourTable
WHERE X = 2
INTERSECT
SELECT Y
FROM YourTable
WHERE X = 3
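The difference between count(*) and count(distinct x) only shows up when rows are duplicated. The sketch below, in Python with sqlite3, adds one duplicate row (3, 1) to the sample data purely for illustration and runs the three set-based queries side by side; the table name test follows one of the answers and the helper ys is a hypothetical name.

```python
import sqlite3

# Sample rows (x, y) from the question, plus one duplicate (3, 1) added
# for illustration only.
rows = [(2, 2), (3, 1), (3, 3), (3, 2), (3, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", rows)


def ys(query):
    """Run a query returning a single y column and return the sorted values."""
    return sorted(row[0] for row in conn.execute(query))


# count(*) counts the duplicate row twice, so y = 1 slips through.
with_count_star = ys(
    "SELECT y FROM test WHERE x IN (2, 3) GROUP BY y HAVING COUNT(*) = 2"
)

# count(distinct x) requires two different x values per y.
with_count_distinct = ys(
    "SELECT y FROM test WHERE x IN (2, 3) GROUP BY y HAVING COUNT(DISTINCT x) = 2"
)

# INTERSECT works on sets, so duplicates do not matter.
with_intersect = ys(
    "SELECT y FROM test WHERE x = 2 INTERSECT SELECT y FROM test WHERE x = 3"
)

print(with_count_star, with_count_distinct, with_intersect)
```

With the duplicate present, only the count(distinct x) and INTERSECT versions return y = 2 alone.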
Using the analytic LAG() function.
SELECT y
FROM
( SELECT x,
y,
lag(x) OVER(PARTITION BY y ORDER BY x) x_lag FROM your_table WHERE x IN (2, 3)
)
WHERE x_lag = x - 1;
Working demo:
WITH data AS (
SELECT 2 X, 2 Y FROM dual UNION ALL
SELECT 3 X, 1 Y FROM dual UNION ALL
SELECT 3 X, 3 Y FROM dual UNION ALL
SELECT 3 X, 2 Y FROM dual
)
SELECT y
FROM
( SELECT x,
y,
lag(x) OVER(PARTITION BY y ORDER BY x) x_lag FROM data WHERE x IN (2, 3)
)
WHERE x_lag = x - 1;
Y
----------
2
This follows on from my earlier question, "T-SQL Query a matrix table for free position".
I'm now trying to handle my matrix table as a LIFO. Each (X, Z) pair represents a channel in which I can store an element. When I generate a location, I currently use the query provided in that question, shown below.
SELECT x, z, MAX(CASE WHEN disabled = 0 AND occupiedId IS NULL THEN Y ELSE 0 END) firstFreeY
FROM matrix
GROUP BY x, z
ORDER BY x, z;
This works, but it doesn't handle "holes": a Disabled flag may be removed from the table, or an element may be deleted manually.
In that case my Matrix table will look like this:
X Z Y Disabled OccupiedId
--------------------------------------------------
1 1 1 0 591
1 1 2 0 NULL
1 1 3 1 NULL
1 1 4 0 524
1 1 5 0 523
1 1 6 0 522
1 1 7 0 484
1 2 1 0 NULL
1 2 2 0 NULL
1 2 3 0 NULL
1 2 4 0 NULL
1 2 5 0 NULL
1 2 6 0 589
1 2 7 0 592
the result of the above query is:
X Z firstFreeY
------------------------
1 1 2
1 2 5
instead of:
X Z firstFreeY
------------------------
1 1 0
1 2 5
Any suggestions on how to achieve this?
This query looks for the largest Y that is smaller than all other occupied Y's:
select m1.X
, m1.Z
, max(
case
when m2.MinOccupiedY is null or m1.Y < m2.MinOccupiedY then m1.Y
else 0
end
) as FirstFreeY
from matrix m1
join (
select X
, Z
, min(
case
when disabled <> 0 or occupiedId is not null then Y
end
) as MinOccupiedY
from matrix
group by
X
, Z
) m2
on m1.X = m2.X
and m1.Z = m2.Z
group by
m1.X
, m1.Z
Live example at SQL Fiddle.
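The difference between the original query and the hole-aware join can be seen on the sample matrix with a short sketch in Python's sqlite3 module. The column names follow the question; the Python variable names are illustrative only, and SQLite stands in for SQL Server here since both queries use only standard aggregates.

```python
import sqlite3

# Sample rows (X, Z, Y, Disabled, OccupiedId) from the question.
rows = [
    (1, 1, 1, 0, 591), (1, 1, 2, 0, None), (1, 1, 3, 1, None), (1, 1, 4, 0, 524),
    (1, 1, 5, 0, 523), (1, 1, 6, 0, 522), (1, 1, 7, 0, 484),
    (1, 2, 1, 0, None), (1, 2, 2, 0, None), (1, 2, 3, 0, None), (1, 2, 4, 0, None),
    (1, 2, 5, 0, None), (1, 2, 6, 0, 589), (1, 2, 7, 0, 592),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matrix (X INTEGER, Z INTEGER, Y INTEGER, "
    "Disabled INTEGER, OccupiedId INTEGER)"
)
conn.executemany("INSERT INTO matrix VALUES (?, ?, ?, ?, ?)", rows)

# Original query: highest free Y, even if it sits behind an occupied slot.
naive = conn.execute("""
SELECT X, Z,
       MAX(CASE WHEN Disabled = 0 AND OccupiedId IS NULL THEN Y ELSE 0 END) AS firstFreeY
FROM matrix
GROUP BY X, Z
ORDER BY X, Z
""").fetchall()

# Hole-aware query: highest Y below the first disabled or occupied slot.
hole_aware = conn.execute("""
SELECT m1.X, m1.Z,
       MAX(CASE WHEN m2.MinOccupiedY IS NULL OR m1.Y < m2.MinOccupiedY
                THEN m1.Y ELSE 0 END) AS FirstFreeY
FROM matrix m1
JOIN (SELECT X, Z,
             MIN(CASE WHEN Disabled <> 0 OR OccupiedId IS NOT NULL THEN Y END) AS MinOccupiedY
      FROM matrix
      GROUP BY X, Z) m2
  ON m1.X = m2.X AND m1.Z = m2.Z
GROUP BY m1.X, m1.Z
ORDER BY m1.X, m1.Z
""").fetchall()

print(naive)
print(hole_aware)
```

For channel (1, 1) the slot at Y = 1 is occupied, so no free position is reachable and the hole-aware query returns 0 where the original query returns 2.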
Just to check whether I understood what you were asking: does this work too?
select distinct
m1.x,m1.z, o.y
from
matrix m1
cross apply
(
select top 1 (case when m2.Disabled = 0 then m2.y else 0 end)
from matrix m2
where
m1.x = m2.x
and m1.z = m2.z
and m2.OccupiedId is null
order by m2.y desc
) o (y);