check if the person have pervious record with certain date :db2 sql - sql

i have the following table :
id name start end
1 Asla 2021-01-01 2021-12-31
1 Asla 2022-01-01 2022-04-15
2 Tina 2021-05-16 2021-09-23
3 Layla 2021-01-01 2021-09-27
3 Layla 2022-01-01 2022-07-18
2 Sim 2020-05-12 2020-08-13
3 Anderas 2021-07-01 2021-09-13
3 Anderas 2021-10-01 2021-11-18
3 Anderas 2022-01-01 2029-11-18
4 Klara 2022-01-01 null
what i want to do get persons that have work (date) under 2021 and create a new column that show status (if the person continue having work under 2022 -- ok else not ok and if the person is new like 'Klara' get new ) and show last record for every person . maybe too End = null ??????
i tried this .
select w.id ,w.name ,w.start ,w.end, max_date.end
from Work_date w
left join (select * from Work_date where start>='2022-01-01')max_date on max_date.id=id
where w.start>='2021-01-01'
``` but the problem i get the result as this
<pre>
id name start end
1 Asla 2021-01-01 null
1 Asla 2022-01-01 2022-04-15
2 Tina 2021-05-16 null
3 Layla 2021-01-01 null
3 Layla 2022-01-01 2022-07-18
3 Anderas 2021-07-01 null
3 Anderas 2021-10-01 2021-11-18
3 Anderas 2022-01-01 null
4 Klara 2022-01-01 null
</pre>
men i want to get result as <pre>
id name start end status
1 Asla 2022-01-01 2022-04-15 ok
2 Tina 2021-05-16 2021-09-23 not ok
3 Layla 2022-01-01 2022-07-18 ok
3 Anderas 2022-01-01 2029-11-18 ok
4 Klara 2022-01-01 null ok

Looks like you can simply aggregate.
Then use a CASE WHEN for the status.
select
w.id
, w.name
, max(w.start) as start
, max(w.end) as end
, case
when year(max(end)) < 2022 then 'not ok'
else 'ok'
end as status
from Work_date w
where w.start >= '2021-01-01'
group by w.id, w.name
order by w.id, max(w.start), max(w.end);
ID
NAME
START
END
STATUS
1
Asla
2022-01-01
2022-04-15
ok
2
Tina
2021-05-16
2021-09-23
not ok
3
Layla
2022-01-01
2022-07-18
ok
3
Anderas
2022-01-01
2029-11-18
ok
4
Klara
2022-01-01
null
ok
Demo on db<>fiddle here

Related

How to Create table with Dates in range defined by table with start date inputs

I am trying to create a dates table in SQL based on a set of inputs, but I haven't been able to figure it out.
I am receiving in SQL inputs as below:
This table:
Date
Value
2022-01-01
5
2022-07-12
10
2022-11-15
3
A Start Date = 2022-01-01
A stop Date = 2022-12-01
I need to get a table as below starting from Start Date until Stop Date, assiging each correspondent number based on the initial table to each date in that period:
Date
Value
2022-01-01
5
2022-01-02
5
2022-01-03
5
2022-01-04
5
.
5
.
5
.
5
2022-07-09
5
2022-07-10
5
2022-07-11
5
2022-07-12
10
2022-07-13
10
2022-07-14
10
.
10
.
10
2022-11-13
10
2022-11-14
10
2022-11-15
3
2022-11-16
3
2022-11-17
3
2022-11-18
3
How can I do that?
Thanks.
Using the window function lead() over() in concert with an ad-hoc tally table
Example
Select Date = dateadd(DAY,N,A.Date)
,A.Value
From (
Select *
,nDays = datediff(DAY,Date,lead(Date,1,dateadd(day,1,'2022-12-01')) over (order by date))
From YourTable
) A
Join ( Select Top 1000 N=-1+Row_Number() Over (Order By (Select NULL)) From master..spt_values n1, master..spt_values n2 ) B
on N<NDays
Order by Date
Results
Date Value
2022-01-01 5
2022-01-02 5
2022-01-03 5
2022-01-04 5
2022-01-05 5
...
2022-07-10 5
2022-07-11 5
2022-07-12 10
2022-07-13 10
2022-07-14 10
...
2022-11-12 10
2022-11-13 10
2022-11-14 10
2022-11-15 3
2022-11-16 3
2022-11-17 3
...
2022-11-30 3
2022-12-01 3

count of rows based on multiple conditions/partitions on the same table

DBFiddle Link: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=2d7e9a4ddfdc8fb619a8dfc76d767950
Hi. I have one table called 'Model Versions' which has the fields and records as below.
intent_id
intent_name
version
version_created_at
client
sentence
1
a_intent
1
2021-01-01
es_client1
sentence_1
1
a_intent
1
2021-01-01
es_client1
sentence_2
1
a_intent
1
2021-01-01
es_client1
sentence_3
2
b_intent
2
2021-02-01
es_client1
sentence_1
2
b_intent
2
2021-02-01
es_client1
sentence_2
2
b_intent
2
2021-02-01
es_client1
sentence_3
3
c_intent
3
2021-03-01
es_client1
sentence_1
3
c_intent
3
2021-03-01
es_client1
sentence_2
4
d_intent
4
2021-04-01
es_client1
sentence_1
4
d_intent
4
2021-04-01
es_client1
sentence_2
5
e_intent
5
2021-05-01
es_client1
sentence_1
6
g_intent
1
2021-01-01
es_client2
sentence_1
6
g_intent
1
2021-01-01
es_client2
sentence_2
7
h_intent
2
2021-03-01
es_client2
sentence_1
7
h_intent
2
2021-03-01
es_client2
sentence_2
7
h_intent
2
2021-03-01
es_client2
sentence_3
8
i_intent
3
2021-04-01
es_client2
sentence_1
8
i_intent
3
2021-04-01
es_client2
sentence_2
9
j_intent
4
2021-05-01
es_client2
sentence_1
9
j_intent
4
2021-05-01
es_client2
sentence_2
10
k_intent
1
2021-01-01
es_client3
sentence_1
10
k_intent
1
2021-01-01
es_client3
sentence_2
11
k_intent
2
2021-06-01
es_client3
sentence_1
11
k_intent
2
2021-06-01
es_client3
sentence_2
12
k_intent
3
2021-07-01
es_client3
sentence_1
12
k_intent
3
2021-07-01
es_client3
sentence_2
13
k_intent
4
2021-08-01
es_client3
sentence_1
13
k_intent
4
2021-08-01
es_client3
sentence_2
14
k_intent
5
2021-10-01
es_client3
sentence_1
14
k_intent
5
2021-10-01
es_client3
sentence_2
Expected Output:
I wanted to get the top 3 versions of each client along with their respective sentence count. My expected output looks like below:
client
version
total_count_of_sentences_per_version
version_created_at
es_client1
5
1
2021-05-01
es_client1
4
2
2021-04-01
es_client1
3
2
2021-03-01
es_client2
4
2
2021-05-01
es_client2
3
2
2021-04-01
es_client2
2
3
2021-03-01
es_client3
5
2
2021-10-01
es_client3
4
2
2021-08-01
es_client3
3
2
2021-06-01
I tried writing a query with multiple CTEs and Partition By's. But none worked out. Seeking your help to achieve this.
DBFiddle Link: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=2d7e9a4ddfdc8fb619a8dfc76d767950
You did not specify which of the top 3 version you wish to fetch. I'll assume you want to retrieve the 3 latest versions, based on creation date.
My suggestion is to use a ROW_NUMBER() for each client in a windowed function, and to filter the top 3 rows.
For instance :
with cte as(
select
client,
version,
version_created_at,
count(Sentence) total_count_of_sentences_per_version,
row_number() over(partition by client order by version_created_at desc) version_row_number
from model_versions
group by
client,version,
version_created_at
)
select
client,
version,
total_count_of_sentences_per_version,
version_created_at
from cte
where version_row_number <=3
Try it online
You can try this:
WITH main_tab
AS (SELECT client,
version,
Count(*)
OVER (
partition BY client, version),
Min(version_created_at)
OVER (
partition BY client, version),
Dense_rank()
OVER (
partition BY client
ORDER BY version DESC) rn
FROM model_versions)
SELECT DISTINCT m.*
FROM main_tab m;

CASE in WHERE Clause in Snowflake

I am trying to do a case statement within the where clause in snowflake but I’m not quite sure how should I go about doing it.
What I’m trying to do is, if my current month is Jan, then the where clause for date is between start of previous year and today. If not, the where clause for date would be between start of current year and today.
WHERE
CASE MONTH(CURRENT_DATE()) = 1 THEN DATE BETWEEN DATE_TRUNC(‘YEAR’, DATEADD(YEAR, -1, CURRENT_DATE())) AND CURRENT_DATE()
CASE MONTH(CURRENT_DATE()) != 1 THEN DATE BETWEEN DATE_TRUNC(‘YEAR’, CURRENT_DATE()) AND CURRENT_DATE()
END
Appreciate any help on this!
Use a CASE expression that returns -1 if the current month is January or 0 for any other month, so that you can get with DATEADD() a date of the previous or the current year to use in DATE_TRUNC():
WHERE DATE BETWEEN
DATE_TRUNC('YEAR', DATEADD(YEAR, CASE WHEN MONTH(CURRENT_DATE()) = 1 THEN -1 ELSE 0 END, CURRENT_DATE()))
AND
CURRENT_DATE()
I suspect that you don't even need to use CASE here:
WHERE
(MONTH(CURRENT_DATE()) = 1 AND
DATE BETWEEN DATE_TRUNC(‘YEAR’, DATEADD(YEAR, -1, CURRENT_DATE())) AND
CURRENT_DATE()) OR
(MONTH(CURRENT_DATE()) != 1 AND
DATE BETWEEN DATE_TRUNC(‘YEAR’, CURRENT_DATE()) AND CURRENT_DATE())
So the other answers are quite good, but... the answer can be even simpler
Making a little table to brake down what is happening.
select
row_number() over (order by null) - 1 as rn,
dateadd('day', rn * 5, date_trunc('year',current_date())) as pretend_current_date,
DATEADD(YEAR, -1, pretend_current_date) as pcd_sub1,
month(pretend_current_date) as pcd_month,
DATE_TRUNC(year, iff(pcd_month = 1, pcd_sub1, pretend_current_date)) as _from,
pretend_current_date as _to
from table(generator(ROWCOUNT => 30))
order by rn;
this shows:
RN
PRETEND_CURRENT_DATE
PCD_SUB1
PCD_MONTH
_FROM
_TO
0
2022-01-01
2021-01-01
1
2021-01-01
2022-01-01
1
2022-01-06
2021-01-06
1
2021-01-01
2022-01-06
2
2022-01-11
2021-01-11
1
2021-01-01
2022-01-11
3
2022-01-16
2021-01-16
1
2021-01-01
2022-01-16
4
2022-01-21
2021-01-21
1
2021-01-01
2022-01-21
5
2022-01-26
2021-01-26
1
2021-01-01
2022-01-26
6
2022-01-31
2021-01-31
1
2021-01-01
2022-01-31
7
2022-02-05
2021-02-05
2
2022-01-01
2022-02-05
8
2022-02-10
2021-02-10
2
2022-01-01
2022-02-10
9
2022-02-15
2021-02-15
2
2022-01-01
2022-02-15
10
2022-02-20
2021-02-20
2
2022-01-01
2022-02-20
11
2022-02-25
2021-02-25
2
2022-01-01
2022-02-25
12
2022-03-02
2021-03-02
3
2022-01-01
2022-03-02
13
2022-03-07
2021-03-07
3
2022-01-01
2022-03-07
14
2022-03-12
2021-03-12
3
2022-01-01
2022-03-12
15
2022-03-17
2021-03-17
3
2022-01-01
2022-03-17
16
2022-03-22
2021-03-22
3
2022-01-01
2022-03-22
17
2022-03-27
2021-03-27
3
2022-01-01
2022-03-27
18
2022-04-01
2021-04-01
4
2022-01-01
2022-04-01
19
2022-04-06
2021-04-06
4
2022-01-01
2022-04-06
20
2022-04-11
2021-04-11
4
2022-01-01
2022-04-11
21
2022-04-16
2021-04-16
4
2022-01-01
2022-04-16
22
2022-04-21
2021-04-21
4
2022-01-01
2022-04-21
23
2022-04-26
2021-04-26
4
2022-01-01
2022-04-26
24
2022-05-01
2021-05-01
5
2022-01-01
2022-05-01
25
2022-05-06
2021-05-06
5
2022-01-01
2022-05-06
26
2022-05-11
2021-05-11
5
2022-01-01
2022-05-11
27
2022-05-16
2021-05-16
5
2022-01-01
2022-05-16
28
2022-05-21
2021-05-21
5
2022-01-01
2022-05-21
29
2022-05-26
2021-05-26
5
2022-01-01
2022-05-26
Your logic is asking "is the current date in the month of January", at which point take the prior year, and then date truncate to the year, otherwise take the current date and truncate to the year. As the start of a BETWEEN test.
This is the same as getting the current date subtracting one month, and truncating this to year.
Thus there is no need for any IFF or CASE
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE())) AND CURRENT_DATE()
and if you like to drop some paren's, CURRENT_DATE can be used if you leave it in upper case, thus it can even be smaller:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE)) AND CURRENT_DATE

How can I join two tables on an ID and a DATE RANGE in SQL

I have 2 query result tables containing records for different assessments. There are RAssessments and NAssessments which make up a complete review.
The aim is to eventually determine which reviews were completed. I would like to join the two tables on the ID, and on the date, HOWEVER the date each assessment is completed on may not be identical and may be several days apart, and some ID's may have more of an RAssessment than an NAssessment.
Therefore, I would like to join T1 on to T2 on ID & on T1Date(+ or - 7 days). There is no other way to match the two tables and to align the records other than using the date range, as this is a poorly designed database. I hope for some help with this as I am stumped.
Here is some sample data:
Table #1:
ID
RAssessmentDate
1
2020-01-03
1
2020-03-03
1
2020-05-03
2
2020-01-09
2
2020-04-09
3
2022-07-21
4
2020-06-30
4
2020-12-30
4
2021-06-30
4
2021-12-30
Table #2:
ID
NAssessmentDate
1
2020-01-07
1
2020-03-02
1
2020-05-03
2
2020-01-09
2
2020-07-06
2
2020-04-10
3
2022-07-21
4
2021-01-03
4
2021-06-28
4
2022-01-02
4
2022-06-26
I would like my end result table to look like this:
ID
RAssessmentDate
NAssessmentDate
1
2020-01-03
2020-01-07
1
2020-03-03
2020-03-02
1
2020-05-03
2020-05-03
2
2020-01-09
2020-01-09
2
2020-04-09
2020-04-10
2
NULL
2020-07-06
3
2022-07-21
2022-07-21
4
2020-06-30
NULL
4
2020-12-30
2021-01-03
4
2021-06-30
2021-06-28
4
2021-12-30
2022-01-02
4
NULL
2022-01-02
Try this:
SELECT
COALESCE(a.ID, b.ID) ID,
a.RAssessmentDate,
b.NAssessmentDate
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) RowId, *
FROM table1
) a
FULL OUTER JOIN (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) RowId, *
FROM table2
) b ON a.ID = b.ID AND a.RowId = b.RowId
WHERE (a.RAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
OR (b.NAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')

filter data based on month start and month end

Given a dataframe with date column in this format.
Date Group
2020-05-18 1
2020-06-22 1
2019-07-11 1
2018-03-01 1
2021-01-21 2
2021-05-05 2
2021-09-11 2
And two strings;
Start = 2020-05 (indicating month start)
End = 2021-09 (indicating month end)
I want to filter out the data so that only the dates that fall within the start and end date are available in the dataframe.
Expected output:
Date Group
2020-05-18 1
2020-06-22 1
2021-01-21 2
2021-05-05 2
2021-09-11 2
# Creating dummy data
d = {'dt':['2020-05-18',
'2020-06-22',
'2019-07-11',
'2018-03-01',
'2021-01-21',
'2021-05-05',
'2021-09-11'],
'group':[1,1,1,1,2,2,2]}
dt_df = pd.DataFrame(data=d)
dt_df
dt_df['dt'] = pd.to_datetime(dt_df['dt'])
dt_df
Inital Input:
0 2020-05-18
1 2020-06-22
2 2019-07-11
3 2018-03-01
4 2021-01-21
5 2021-05-05
6 2021-09-11
Name: dt, dtype: datetime64[ns]
Start = '2020-05'
End = '2021-09'
Start = pd.to_datetime(Start)
End = pd.to_datetime(End)
End = End+np.timedelta64(1, 'M')
Use loc to select only dates between Start and End timestamp.
dt_df.loc[(dt_df['dt'] - Start >= np.timedelta64(0,'D')) & (dt_df['dt'] - End <= np.timedelta64(0, 'D'))]
Output:
dt group
0 2020-05-18 1
1 2020-06-22 1
4 2021-01-21 2
5 2021-05-05 2
6 2021-09-11 2