Join based on ID and closest date

Join based on ID and closest date - sql

I have two tables:
Table 1 which contains phone calls (for every CustomerID there is at most one PhoneCall per day):
ActicityID CustomerID PhoneDate
1 A 2019-11-01
2 A 2019-12-01
3 A 2019-12-20
4 B 2019-11-01
5 B 2019-11-20
6 C 2019-11-03
7 D 2019-11-03
8 D 2019-12-01
9 E 2019-11-05
10 F 2019-11-01
Table 2 which contains Orders (OrdDate is the date when the order was placed and BillingDate is the date when the order was charged)
CustomerID OrdDate BillingDate
A 2019-12-03 2019-12-04
A 2019-12-21 2019-12-21
B 2019-11-03 2019-11-10
D 2019-12-02 2019-12-02
F 2019-11-02 2019-11-02
I want to join the tables. The joined table should have the same number of rows as Table 1.
So basically I want to know if there was order after a phone call. The problem is that if just join on CustomerID I get an OrdDat and a BillingDate for every customer who has ever made an order. For example Customer A made an order after the call on 2019-12-01 and after the call on the 2019-12-20 but not after the first call.
So my desired output would be
ActicityID CustomerID PhoneDate OrdDate BillingDate
1 A 2019-11-01 NULL NULL
2 A 2019-12-01 2019-12-03 2019-12-04
3 A 2019-12-20 2019-12-21 2019-12-21
4 B 2019-11-01 2019-11-03 2019-11-10
5 B 2019-11-20 NULL NULL
6 C 2019-11-03 NULL NULL
7 D 2019-11-03 NULL NULL
8 D 2019-12-01 2019-12-02 2019-12-02
9 E 2019-11-05 NULL NULL
10 F 2019-11-01 2019-11-02 2019-11-02
I think I need to join on CustomerID and the closest date between PhoneDate and OrdDate but my SQL knowledge is quite limited and I couldn't figure out how to do it.

I think you can do what you want by using lead() to get the next phone date and then just joining:
select a.*, b.orddate, b.billdate
from (select a.*,
lead(phonedate) over (partition by customerid order by phonedate) as next_pd
from a
) a left join
b
on b.customerid = a.customerid and
b.orddate >= a.phonedate and
(b.orddate < a.next_pd or a.next_pd is null);

You need to use a sub-query to limit the other table, referencing the TOP 1 associated date record...
SELECT
ActivityID,
CustomerID,
PhoneDate,
(SELECT TOP (1)
OrderDate
FROM
dbo.CustomerBilling AS b
WHERE
a.PhoneDate < OrderDate AND
a.CustomerID = CustomerID
ORDER BY OrderDate) AS BillingDate
FROM
dbo.Activity AS a

Related

SQL query to add missing values per id/date

I have two tables, a table with id, date, value and a table with all the dates of interest. I'd like to do a SQL query such that I get a new table exactly the same as my first table but not with NULL values per ID when a date is not present for a given ID.
Table 1.
id
date
value
1
2021-01-01
10
1
2021-02-01
8
1
2021-04-01
20
2
2021-02-01
5
2
2021-04-01
6
Table 2.
date
2020-12-01
2021-01-01
2021-02-01
2021-03-01
2021-04-01
2021-05-01
After I "merge" the two tables the result would be:
id
date
value
1
2020-12-01
NULL
1
2021-01-01
10
1
2021-02-01
8
1
2020-03-01
NULL
1
2021-04-01
20
1
2021-05-01
NULL
2
2020-12-01
NULL
2
2021-01-01
NULL
2
2021-02-01
5
2
2021-03-01
NULL
2
2021-04-01
6
2
2021-05-01
NULL
Which SQL query do I need to run to get such result?

SELECT
u.id,
d.date,
t.value
FROM
(
SELECT DISTINCT id FROM table1
)
u
CROSS JOIN
table2 d
LEFT JOIN
table1 t
ON t.id = u.id
AND t.date = d.date
Though, I'd refrain from using date and other potential keywords as column names.

T-SQL get values for specific group

I have a table EmployeeContract similar like this:
ContractId
EmployeeId
ValidFrom
ValidTo
Salary
12
5
2018-02-01
2019-06-31
x
25
8
2015-01-01
2099-12-31
x
50
5
2019-07-01
2021-05-31
x
52
6
2011-08-01
2021-12-31
x
72
8
2010-08-01
2014-12-31
x
52
6
2011-08-01
2021-12-31
x
Table includes history contracts in company for each employee. I need to get date when employees started work and last date of contract. Sometime records has duplicates.
For example, based on data from above:
EmployeeId
ValidFrom
ValidTo
5
2018-02-01
2021-05-31
8
2010-08-01
2099-12-31
6
2011-08-01
2021-12-31
Base on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied, i think it's not easy to read, and probably optimize is poor. How can I do it better?

I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid

Problems with complex query

There are two tables.
In the first I have columns:
id - a person
time - the time of receiving the bonus (timestamp)
money - size of bonus
And the second:
id
time - time of getting a rank (timestamp)
range - military rank (int)
The task is to withdraw the amount and number of bonuses received by people in the rank of captain (range = 7) with aggregation by day.
I have no ideas how to do a table with this data. I can summarize data by all days such as
SELECT DISTINCTROW Payment.user_id AS user_id, Sum(IIf(IsNull(Payment.money),0,Payment.money)) AS [Sum - money], Count(Payment.money) AS [Count - Payment], Format(Payment.time, "Short Date") as day
FROM Payment
GROUP BY Payment.user_id, Format (Payment.time, "Short Date")
Having ((Count(Payment.money) > 0));
Can you help me with second part and summarize them? thanks
For example: first table (Payment):
user_id time money
a 01.01.10 00:00:00 15,00
a 01.01.10 10:00:00 2,00
a 03.01.10 00:00:00 3,00
c 04.01.10 00:00:00 4,00
c 04.01.10 00:05:00 5,00
d 06.01.10 00:00:00 6,00
e 07.01.10 00:00:00 7,00
e 08.01.10 00:00:00 8,00
The second one:
user_id time range
a 01.01.10 00:00:00 6
a 01.01.10 09:00:00 7
a 04.01.10 00:00:00 8
b 04.01.10 00:00:00 4
c 04.01.10 00:05:00 7
d 06.01.10 00:00:00 5
e 07.01.10 00:00:00 6
f 08.01.10 00:00:00 6
g 08.01.10 00:00:00 7
I expected:
user_id time sum
a 01.01.10 2
a 03.01.10 3
c 04.01.10 5

Here is one possible method using joins:
select t1.user_id, datevalue(p.time) as [time], sum(p.money) as [sum]
from
(
(select t.user_id, t.time from rank t where t.range = 7) t1
inner join payment p on t1.user_id = p.user_id
)
left join
(select t.user_id, t.time from rank t where t.range > 7) t2 on p.user_id = t2.user_id
where
p.time >= t1.time and (t2.user_id is null or p.time < t2.time)
group by
t1.user_id, datevalue(p.time)
I have assumed that your second table is called rank (this was not stated in your question).
Here, the subquery t1 obtains the set of users with range = 7 (captain), and the subquery t2 obtains the set of users with range > 7. I then select all records with a payment date greater than or equal to the date of promotion to captain, but less than any subsequent promotion (if it exists).
This yields the following result:
+---------+------------+------+
| user_id | time | sum |
+---------+------------+------+
| a | 01/01/2010 | 2.00 |
| a | 03/01/2010 | 3.00 |
| c | 04/01/2010 | 5.00 |
+---------+------------+------+
Unless I have misunderstood, I would argue that your expected result is incorrect as the payment below occurs before user_id = c achieved the rank of captain:
c 04.01.10 00:00:00 4,00
c 04.01.10 00:05:00 7

SQL query to check if the next row value is same or different

I am joining two tables based on a common column date. However, the column I am trying to get from one the table (cmg) in this case, should get next row value only if it is different from its previous row's value
Table A
Date comp.no
-----------------------
2019-03-08 5
2019-02-26 5
2019-01-17 5
2019-01-10 5
2018-12-27 5
Table B
Date cmg
-----------------
2019-07-17 NULL
2019-04-20 NULL
2019-02-26 RHB
2019-01-19 NULL
2019-01-17 RHB
2019-01-10 RMB
2018-12-28 NULL
2018-12-27 RHB
2018-12-12 RUB
2018-11-28 RUB
2018-10-20 NULL
2018-07-21 NULL
2018-04-21 NULL
2018-01-20 NULL
2017-10-21 NULL
2017-07-29 NULL
2017-05-07 NULL
2017-02-13 NULL
2016-11-22 NULL
2016-08-29 NULL
2016-06-07 NULL
2016-04-06 RUB
2016-03-21 RUB
2016-03-07 RUB

You can use lag function to compare with previous value. And for the first row you'll need an isnull() check since the first row won't have a previous value.
;with cte as(
select case
when isnull(lag(t2.cmg)over (order by t2.cmg desc),'') <>t2.cmg then 1 else 0 end as isresult
,t2.date,t2.cmg
from TableA t1
inner join TableB t2
on t1.date=t2.date
)
select date,cmg from cte where isresult=1

Use lag():
select date, cmg
from (select b.date, b.cmg, lag(b.cmg) over (order by b.date) as prev_cmg
from a join
b
on a.date = b.date
) b
where prev_cmg is null or prev_cmg <> cmg
order by date;

CREATE TEMP TABLE BASED ON SELECT DISTINCT ON 3 COLUMNS BUT WITH 1 EXTRA COLUMN

I need to make a temporary file with in it:
Partcode, MutationDate, MovementType, Qty
Every partcode has multiple mutationdates per Movementtype (there are max 9 movementtypes possible)
I need to get the last mutationdate per movementtype per partcode and the quantity that goes with that.
An example with partcode 003307
003307 2018-05-31 1 -100
003307 2018-06-11 2 -33
003307 2018-04-25 3 +25
and so on for all 9 movementtypes.
What did I get so far:
create table #LMUT(
MutationDate T_Date
,PartCode T_Code_Part
,CumInvQty T_Quantum_Qty10_3
,MovementType T_Type_PMOverInvt
)
insert #LMUT(
MutationDate,
Partcode,
CumInvQty,
MovementType)
SELECT
cast (max(MOV.MutationDate) as date)
,MOV.PartCode
,INV.MutationQty
,INV.PMOverInvtType
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
WHERE
MOV.PartMovementType = 1
group by MOV.PartCode,INV.PMOverInvtType,INV.MutationQty,MOV.MutationDate
SELECT * FROM #LMUT where partcode='003007'
drop table #LMUT
results in:
2016-12-06 00:00:00.000 003007 -24.000 2
2016-09-29 00:00:00.000 003007 -24.000 2
2016-11-09 00:00:00.000 003007 -24.000 2
2016-11-22 00:00:00.000 003007 -24.000 2
2016-10-26 00:00:00.000 003007 -24.000 2
2016-09-12 00:00:00.000 003007 -42.000 2
2016-10-13 00:00:00.000 003007 -24.000 2
2016-12-03 00:00:00.000 003007 100.000 5
2017-01-12 00:00:00.000 003007 -48.000 2
2016-10-04 00:00:00.000 003007 306.000 7
Not what I need, still have 8 times type 2
What else have I tried:
SELECT distinct MOV.Partcode,INV.PMOverInvtType,mov.MutationDate
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
WHERE
mov.MutationDate = (SELECT MAX (c.MutationDate) FROM
dbo.T_PartMovementMain as c
inner join dbo.T_PartMovementOverInvt as d on D.PMMainCode=c.PMMainCode
WHERE
C.PartMovementType = 1 AND
C.PartCode=mov.PartCode AND
D.PMMainCode = C.PMMainCode AND
D.PMOverInvtType=inv.PMOverInvtType
)
and MOV.PartMovementType = 1 and mov.partcode='003007'
order by MOV.Partcode,INV.PMOverInvtType
Results in:
3007 2 2017-01-12 00:00:00.000
3007 5 2016-12-03 00:00:00.000
3007 7 2016-10-04 00:00:00.000
That is what I want but I need to get the Qty too.

use row_number() window function
with cte as
( SELECT MOV.*,INV.*,
row_number() over(partition by INV.PMOverInvtType order by MOV.MutationDate desc)rn
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
) select cte.* from cte where rn=1

Solved it like this:
create table #LMUT(
PartCode T_Code_Part
,MovementType T_Type_PMOverInvt
,MutationDate T_Date
,CumInvQty T_Quantum_Qty10_3
)
insert #LMUT(Partcode,MovementType,MutationDate,CumInvQty)
select Artikel,Type,Datum,Aant
from (
SELECT MOV.Partcode as Artikel,INV.PMOverInvtType as Type,mov.MutationDate as Datum,INV.MutationQty as Aant,
row_number() over(partition by MOV.Partcode,INV.PMOverInvtType order by MOV.Partcode,INV.PMOverInvtType,MOV.MutationDate desc) rn
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on INV.PMMainCode=MOV.PMMainCode) cse
where rn=1
select * from #LMUT order by Partcode
drop table #LMUT

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Join based on ID and closest date - sql

Related

SQL query to add missing values per id/date

T-SQL get values for specific group

Problems with complex query

SQL query to check if the next row value is same or different

CREATE TEMP TABLE BASED ON SELECT DISTINCT ON 3 COLUMNS BUT WITH 1 EXTRA COLUMN

Categories

Resources