SQL create flag based on earliest/latest date

I have a data set with the following attributes:
- IDs are not unique; an ID can appear in multiple rows
- Each ID has a different date called 'Start Date'
I am trying to add a flag (Y/N) that marks which row to use for each ID, based on the earliest date.
This is what I have so far:
SELECT *,
min(Start_Date) OVER (PARTITION BY ID) AS FirstEntryFlag
FROM `table`
Could someone please give me guidance on how I would achieve this? Thank you

Is this what you want?
select (case when start_date = min(Start_Date) OVER (PARTITION BY ID)
then 1 else 0
end) as FirstEntryFlag
from t;
If the start date has duplicates for an id and you want only one row flagged, use row_number():
select (case when 1 = row_number() over (partition by id order by Start_Date)
then 1 else 0
end) as FirstEntryFlag
from t;
Finally, some databases support boolean types, so the case is not necessary. Just the conditional expression can return a valid value.
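To see how the two flags differ when an ID has duplicate earliest dates, here is a minimal sketch that runs both expressions side by side. It assumes SQLite 3.25+ (for window functions) driven from Python; the table `t`, the sample rows, and the 'Y'/'N' output values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, start_date TEXT)")
# Hypothetical data: id 1 has two rows sharing the earliest date.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "2021-01-05"), (1, "2021-01-01"), (1, "2021-01-01"), (2, "2021-03-10")],
)

rows = conn.execute(
    """
    SELECT id, start_date,
           -- flags every row that carries the earliest date
           CASE WHEN start_date = MIN(start_date) OVER (PARTITION BY id)
                THEN 'Y' ELSE 'N' END AS flag_min,
           -- flags exactly one row per id, even when dates tie
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY id ORDER BY start_date) = 1
                THEN 'Y' ELSE 'N' END AS flag_rn
    FROM t
    ORDER BY id, start_date
    """
).fetchall()

for row in rows:
    print(row)
```

With this data, the min() version flags both tied rows for id 1, while the row_number() version flags only one of them; which of the tied rows it picks is arbitrary unless you add a tie-breaker to the ORDER BY.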

Related

How to filter to get one unique record using SQL

I have a table similar to this. If there is a confirmed record, I want to select the oldest record and if not, select the most recent one. In this case, I would want the 4_A record.
ID   Record  Type       Date
1_A  1       auto       4/7/2021
2_A  1       confirmed  4/1/2021
3_A  1       suggested  4/5/2021
4_A  1       confirmed  4/2/2021
5_A  1       suggested  4/5/2021
I've been able to use a window function and QUALIFY to filter the most recent one, but I'm not sure how to include the TYPE field in the mix.
SELECT * FROM TABLE QUALIFY ROW_NUMBER() OVER (PARTITION BY RECORD ORDER BY RECORD, DATE DESC) = 1;
Let me assume that you mean the oldest confirmed date if there is a confirmed record, and the most recent date otherwise:
SELECT *
FROM TABLE
QUALIFY ROW_NUMBER() OVER (PARTITION BY RECORD
ORDER BY (CASE WHEN Type = 'confirmed' THEN 1 ELSE 2 END),
(CASE WHEN Type = 'confirmed' THEN DATE END) ASC,
DATE DESC
) = 1;
If you really mean the oldest date if there is a confirmed, then:
SELECT *
FROM TABLE
QUALIFY (CASE WHEN COUNT_IF(Type = 'confirmed') OVER (PARTITION BY RECORD) > 0
THEN ROW_NUMBER() OVER (PARTITION BY RECORD ORDER BY DATE)
ELSE ROW_NUMBER() OVER (PARTITION BY RECORD ORDER BY DATE DESC)
END) = 1;
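For engines without QUALIFY, the first reading (oldest confirmed row if one exists, otherwise the most recent row) can be sketched with a subquery that filters on the row number. This is only an illustration under assumptions: it uses SQLite 3.25+ from Python, a made-up table name `recs`, a column `dt` holding ISO-formatted dates so that text ordering matches date ordering, and a hypothetical second record (rows 6_A and 7_A) that has no confirmed row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recs (id TEXT, record INTEGER, type TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO recs VALUES (?, ?, ?, ?)",
    [
        ("1_A", 1, "auto", "2021-04-07"),
        ("2_A", 1, "confirmed", "2021-04-01"),
        ("3_A", 1, "suggested", "2021-04-05"),
        ("4_A", 1, "confirmed", "2021-04-02"),
        ("5_A", 1, "suggested", "2021-04-05"),
        # Hypothetical extra record without any confirmed row.
        ("6_A", 2, "auto", "2021-04-03"),
        ("7_A", 2, "suggested", "2021-04-09"),
    ],
)

picked = conn.execute(
    """
    SELECT id, record
    FROM (
        SELECT r.*,
               ROW_NUMBER() OVER (
                   PARTITION BY record
                   ORDER BY CASE WHEN type = 'confirmed' THEN 1 ELSE 2 END,  -- confirmed rows first
                            CASE WHEN type = 'confirmed' THEN dt END ASC,    -- oldest confirmed
                            dt DESC                                          -- else most recent
               ) AS rn
        FROM recs r
    )
    WHERE rn = 1
    ORDER BY record
    """
).fetchall()

print(picked)
```

On this data the oldest confirmed row for record 1 is 2_A (4/1/2021), and record 2 falls back to its most recent row, 7_A.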

Update Flag Based On Change of Previous Value

I have the table below. I need SQL that sets FLAG to 1 if the INPUT value changed from the previous row, and to 0 otherwise.
INPUT START_DATE PERSON_ID FLAG
42707 2017-01-01 227317 0
40000 2018-01-01 227317 1
42400 2019-01-01 227317 1
42400 2019-01-02 227317 0
You can use lag() :
select t.*,
(case when lag(input, 1, input) over (partition by person_id order by start_date) = input
then 0 else 1
end) as FLAG
from table t;
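As a quick check of the lag() approach, the sketch below loads the four sample rows from the question and computes the flag. It assumes SQLite 3.25+ driven from Python and a table named `t`; the three-argument form `lag(input, 1, input)` falls back to the row's own value on the first row of each person, so that row compares equal to itself and gets 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (input INTEGER, start_date TEXT, person_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (42707, "2017-01-01", 227317),
        (40000, "2018-01-01", 227317),
        (42400, "2019-01-01", 227317),
        (42400, "2019-01-02", 227317),
    ],
)

flags = [
    row[0]
    for row in conn.execute(
        """
        SELECT CASE WHEN LAG(input, 1, input)
                             OVER (PARTITION BY person_id ORDER BY start_date) = input
                    THEN 0 ELSE 1 END AS flag
        FROM t
        ORDER BY person_id, start_date
        """
    )
]

print(flags)  # [0, 1, 1, 0], matching the FLAG column in the question
```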
If you want this in a query, then use row_number():
select t.*,
(case when row_number() over (partition by person_id order by start_date) = 1
then 0 else 1
end) as flag
from t;
If the input value could be the same on different rows, then use first_value():
select t.*,
(case when input = first_value(input) over (partition by person_id order by start_date)
then 0 else 1
end) as flag
from t;
Either form could be incorporated into an update using an updatable CTE if you want to update the table.
EDIT:
If you want to know if the value changes from one row to the "next", then use lag(). In an update, this looks like:
with toupdate as (
select t.*,
lag(input) over (partition by person_id order by start_date) as prev_input
from t
)
update toupdate
set flag = (case when prev_input <> input then 1 else 0 end);
That said, I would not advise you to store the data in the table. Instead, just put the logic in a select when you need it. Otherwise, the data could get out of date if a historical value is updated.

SQL - Window function to get values from previous row where value is not null

I am using Exasol. In other DBMSs it was possible to use analytic functions such as LAST_VALUE() and specify some condition after the ORDER BY clause within the OVER() clause, like:
select ...
LAST_VALUE(customer)
OVER (PARTITION BY ID ORDER BY date_x DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING ) as the_last
Unfortunately I get the following error:
ERROR: [0A000] Feature not supported: windowing clause (Session:
1606983630649130920)
The same does not happen if I use CURRENT ROW instead of 1 PRECEDING.
Basically, what I want is the last value, according to the ORDER BY, that is NOT the current row. In this example it would be the $customer of the previous row.
I know that I could use LAG(customer,1) OVER ( ...), but the problem is that I want the previous customer that is NOT null, so the offset is not always 1...
How can I do that?
Many thanks!
Does this work?
select lag(customer) over (partition by id
order by (case when customer is not null then 1 else 0 end),
date
)
You can do this with two steps:
select t.*,
max(customer) over (partition by id, max_date) as max_customer
from (select t.*,
max(case when customer is not null then date end) over (partition by id order by date) as max_date
from t
) t;
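To make the two-step idea concrete, here is a small sketch that runs it on made-up data. It assumes SQLite 3.25+ from Python, a table `t`, and a column named `dt` in place of `date`; the rows are hypothetical. The inner query records, for every row, the latest date so far that had a non-null customer; the outer query then picks the customer belonging to that date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT, customer TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2020-12-31", None),
        (1, "2021-01-01", "A"),
        (1, "2021-01-02", None),
        (1, "2021-01-03", "B"),
        (1, "2021-01-04", None),
    ],
)

rows = conn.execute(
    """
    SELECT dt, customer,
           MAX(customer) OVER (PARTITION BY id, max_date) AS max_customer
    FROM (SELECT t.*,
                 -- running maximum of the dates that have a non-null customer
                 MAX(CASE WHEN customer IS NOT NULL THEN dt END)
                     OVER (PARTITION BY id ORDER BY dt) AS max_date
          FROM t) AS s
    ORDER BY id, dt
    """
).fetchall()

for row in rows:
    print(row)
```

Note that a row whose own customer is non-null gets its own value back, so this fills nulls forward rather than strictly excluding the current row. If you need the previous non-null value on every row, one option is to apply lag() to the filled column afterwards.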

Organizing SQL data based on date

I am trying to organize my SQL data based off of the dates from which the orders were made.
My data:
SELECT DISTINCT ORDER_NO, ITEM, VERSION_NO,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY ORDER_NO ORDER BY NOT_BEFORE_DATE
ASC) = 1
THEN 'what-if'
ELSE 'wh'
END) AS VERSION_NEW
,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY ORDER_NO ORDER BY
NOT_BEFORE_DATE ASC) = 2
THEN 'initial'
ELSE 'other'
END) AS VERSION
FROM FDT_MAPTOOL
WHERE ITEM IN (1032711)
;
My results:
I want my data to be ordered by PO# and the date it was created.
As you can see in my picture, the first two lines have the same ITEM and the same PO (Order_No). I need the first two to say 'initial' because they are the first two based on the dates; they were created first. Everything after should say 'other'.
I am not sure if PL/SQL is needed for this?
Thank you!
Use a different analytic function so that more than one row can have the value of 1 e.g.
SELECT DISTINCT ORDER_NO, ITEM, VERSION_NO,
(CASE WHEN DENSE_RANK() OVER (PARTITION BY ORDER_NO ORDER BY NOT_BEFORE_DATE
ASC) = 1
THEN 'what-if'
ELSE 'wh'
END) AS VERSION_NEW
,
(CASE WHEN DENSE_RANK() OVER (PARTITION BY ORDER_NO ORDER BY
NOT_BEFORE_DATE ASC) = 1
THEN 'initial'
ELSE 'other'
END) AS VERSION
FROM FDT_MAPTOOL
WHERE ITEM IN (1032711)
;
Either rank() OR dense_rank() should work here instead of row_number()
nb: not sure if you really need "select distinct"
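The difference only shows up when dates tie, so here is a small sketch with two rows sharing the earliest NOT_BEFORE_DATE. It assumes SQLite 3.25+ from Python; the table and column names follow the question, but the rows themselves are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fdt_maptool (order_no INTEGER, version_no INTEGER, not_before_date TEXT)"
)
# Hypothetical rows: versions 1 and 2 share the earliest date.
conn.executemany(
    "INSERT INTO fdt_maptool VALUES (?, ?, ?)",
    [(500, 1, "2020-01-01"), (500, 2, "2020-01-01"), (500, 3, "2020-02-15")],
)

rows = conn.execute(
    """
    SELECT version_no,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY order_no ORDER BY not_before_date) = 1
                THEN 'initial' ELSE 'other' END AS by_row_number,
           CASE WHEN DENSE_RANK() OVER (PARTITION BY order_no ORDER BY not_before_date) = 1
                THEN 'initial' ELSE 'other' END AS by_dense_rank
    FROM fdt_maptool
    ORDER BY version_no
    """
).fetchall()

for row in rows:
    print(row)
```

ROW_NUMBER() labels only one of the two tied rows 'initial', while DENSE_RANK() labels both, which is the behaviour asked for in the question.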

SQL Find the minimum date based on consecutive values

I'm having trouble constructing a query that can find consecutive values meeting a condition. Example data below, note that Date is sorted DESC and is grouped by ID.
To be selected, for each ID, the most recent RESULT must be 'Fail', and what I need back is the earliest date in that run of 'Fails'. For ID==1, only the first two values are of interest (the last doesn't count due to the prior 'Complete'). ID==2 doesn't count at all, since it fails the first condition, and for ID==3, only the first value matters.
A result table might be:
The trick seems to be doing some type of run-length encoding, but even with several attempts manipulating ROW_NUM and an attempt at the tabibitosan method for grouping consecutive values, I've been unable to gain traction.
Any help would be appreciated.
If your database supports window functions, you can do
select id, case when result='Fail' then earliest_fail_date end earliest_fail_date
from (
select t.*
,row_number() over(partition by id order by dt desc) rn
,min(case when result = 'Fail' then dt end) over(partition by id) earliest_fail_date
from tablename t
) x
where rn=1
Use row_number to get the latest row in the table. min() over() to get the earliest fail date for each id. If the first row has status Fail, you select the earliest_fail_date or else it would be null.
It should be noted that the expected result for id=1 is wrong. It should be 2016-09-20 as it is the earliest fail date.
Edit: Having re-read the question, I think this is what you might be looking for: the minimum Fail date from the latest consecutive group of Fail rows.
with grps as (
select t.*,row_number() over(partition by id order by dt desc) rn
,row_number() over(partition by id order by dt)-row_number() over(partition by id,result order by dt) grp
from tablename t
)
,maxfailgrp as (
select g.*,
max(case when result = 'Fail' then grp end) over(partition by id) maxgrp
from grps g
)
select id,
case when result = 'Fail' then (select min(dt) from maxfailgrp where id = m.id and grp=m.maxgrp and result = 'Fail') end earliest_fail_date
from maxfailgrp m
where rn=1
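The question's sample data did not survive here, so the sketch below runs the row-number-difference technique on invented rows that follow its description: id 1 ends in two Fails preceded by a Complete and an older Fail, id 2 ends in a Complete, and id 3 ends in a single Fail. It assumes SQLite 3.25+ from Python and the table name `tablename`. The group numbers are only unique within one result value, so the inner lookup also filters on result = 'Fail'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, dt TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, "2016-09-25", "Fail"), (1, "2016-09-24", "Fail"),
        (1, "2016-09-23", "Complete"), (1, "2016-09-20", "Fail"),
        (2, "2016-09-25", "Complete"), (2, "2016-09-24", "Fail"),
        (3, "2016-09-25", "Fail"), (3, "2016-09-24", "Complete"),
    ],
)

result = conn.execute(
    """
    WITH grps AS (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt DESC) AS rn,
               -- consecutive rows with the same result share this difference
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt)
             - ROW_NUMBER() OVER (PARTITION BY id, result ORDER BY dt) AS grp
        FROM tablename t
    ),
    maxfailgrp AS (
        SELECT g.*,
               MAX(CASE WHEN result = 'Fail' THEN grp END)
                   OVER (PARTITION BY id) AS maxgrp
        FROM grps g
    )
    SELECT id,
           CASE WHEN result = 'Fail'
                THEN (SELECT MIN(dt) FROM maxfailgrp
                      WHERE id = m.id AND grp = m.maxgrp AND result = 'Fail')
           END AS earliest_fail_date
    FROM maxfailgrp m
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(result)
```

For these rows, id 1 returns 2016-09-24 (the start of its latest run of Fails), id 2 returns NULL because its most recent result is not a Fail, and id 3 returns 2016-09-25.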