Join based on ID and closest date - sql

I have two tables:
Table 1 which contains phone calls (for every CustomerID there is at most one PhoneCall per day):
ActivityID CustomerID PhoneDate
1 A 2019-11-01
2 A 2019-12-01
3 A 2019-12-20
4 B 2019-11-01
5 B 2019-11-20
6 C 2019-11-03
7 D 2019-11-03
8 D 2019-12-01
9 E 2019-11-05
10 F 2019-11-01
Table 2 which contains Orders (OrdDate is the date when the order was placed and BillingDate is the date when the order was charged)
CustomerID OrdDate BillingDate
A 2019-12-03 2019-12-04
A 2019-12-21 2019-12-21
B 2019-11-03 2019-11-10
D 2019-12-02 2019-12-02
F 2019-11-02 2019-11-02
I want to join the tables. The joined table should have the same number of rows as Table 1.
So basically I want to know whether there was an order after a phone call. The problem is that if I just join on CustomerID, I get an OrdDate and a BillingDate for every customer who has ever placed an order. For example, Customer A placed an order after the call on 2019-12-01 and after the call on 2019-12-20, but not after the first call.
So my desired output would be
ActivityID CustomerID PhoneDate OrdDate BillingDate
1 A 2019-11-01 NULL NULL
2 A 2019-12-01 2019-12-03 2019-12-04
3 A 2019-12-20 2019-12-21 2019-12-21
4 B 2019-11-01 2019-11-03 2019-11-10
5 B 2019-11-20 NULL NULL
6 C 2019-11-03 NULL NULL
7 D 2019-11-03 NULL NULL
8 D 2019-12-01 2019-12-02 2019-12-02
9 E 2019-11-05 NULL NULL
10 F 2019-11-01 2019-11-02 2019-11-02
I think I need to join on CustomerID and the closest date between PhoneDate and OrdDate but my SQL knowledge is quite limited and I couldn't figure out how to do it.

I think you can do what you want by using lead() to get the next phone date and then just joining:
select a.*, b.orddate, b.billingdate
from (select a.*,
             lead(phonedate) over (partition by customerid order by phonedate) as next_pd
      from a
     ) a left join
     b
     on b.customerid = a.customerid and
        b.orddate >= a.phonedate and
        (b.orddate < a.next_pd or a.next_pd is null);
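The lead() approach can be checked against the sample data from the question. This is a minimal sketch, assuming SQLite 3.25 or later (for window functions) driven through Python's sqlite3 module; the table names calls and orders are placeholders that do not appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calls  (ActivityID INTEGER, CustomerID TEXT, PhoneDate TEXT);
CREATE TABLE orders (CustomerID TEXT, OrdDate TEXT, BillingDate TEXT);
INSERT INTO calls VALUES
 (1,'A','2019-11-01'),(2,'A','2019-12-01'),(3,'A','2019-12-20'),
 (4,'B','2019-11-01'),(5,'B','2019-11-20'),(6,'C','2019-11-03'),
 (7,'D','2019-11-03'),(8,'D','2019-12-01'),(9,'E','2019-11-05'),
 (10,'F','2019-11-01');
INSERT INTO orders VALUES
 ('A','2019-12-03','2019-12-04'),('A','2019-12-21','2019-12-21'),
 ('B','2019-11-03','2019-11-10'),('D','2019-12-02','2019-12-02'),
 ('F','2019-11-02','2019-11-02');
""")

# Each call gets a window [PhoneDate, next PhoneDate of the same customer);
# an order is attached to the call whose window contains its OrdDate.
rows = conn.execute("""
SELECT c.ActivityID, c.CustomerID, c.PhoneDate, o.OrdDate, o.BillingDate
FROM (SELECT calls.*,
             LEAD(PhoneDate) OVER (PARTITION BY CustomerID
                                   ORDER BY PhoneDate) AS next_pd
      FROM calls) AS c
LEFT JOIN orders AS o
       ON o.CustomerID = c.CustomerID
      AND o.OrdDate >= c.PhoneDate
      AND (o.OrdDate < c.next_pd OR c.next_pd IS NULL)
ORDER BY c.ActivityID
""").fetchall()

for row in rows:
    print(row)
```

On this data the query returns one row per phone call. If a customer placed several orders between two calls, the call would appear once per order, so the "same number of rows as Table 1" requirement only holds when there is at most one order per window.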

You need to use a correlated sub-query that limits the other table to the TOP 1 matching record by date (the table names below are assumed, and the column is OrdDate as in the question):
SELECT
    a.ActivityID,
    a.CustomerID,
    a.PhoneDate,
    (SELECT TOP (1)
         b.OrdDate
     FROM
         dbo.CustomerBilling AS b
     WHERE
         a.PhoneDate < b.OrdDate AND
         a.CustomerID = b.CustomerID
     ORDER BY b.OrdDate) AS OrdDate
FROM
    dbo.Activity AS a
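It may be worth checking how the "first order after the call" sub-query behaves on the question's data. The sketch below assumes SQLite via Python's sqlite3, uses LIMIT 1 in place of T-SQL's TOP (1), and loads only customers A and D; the table names calls and orders are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calls  (ActivityID INTEGER, CustomerID TEXT, PhoneDate TEXT);
CREATE TABLE orders (CustomerID TEXT, OrdDate TEXT, BillingDate TEXT);
INSERT INTO calls VALUES
 (1,'A','2019-11-01'),(2,'A','2019-12-01'),(3,'A','2019-12-20'),
 (7,'D','2019-11-03'),(8,'D','2019-12-01');
INSERT INTO orders VALUES
 ('A','2019-12-03','2019-12-04'),('A','2019-12-21','2019-12-21'),
 ('D','2019-12-02','2019-12-02');
""")

# For every call, pick the earliest order placed strictly after it,
# without looking at whether another call happened in between.
first_order_after_call = conn.execute("""
SELECT c.ActivityID, c.CustomerID, c.PhoneDate,
       (SELECT o.OrdDate
        FROM orders AS o
        WHERE o.CustomerID = c.CustomerID
          AND o.OrdDate > c.PhoneDate
        ORDER BY o.OrdDate
        LIMIT 1) AS OrdDate
FROM calls AS c
ORDER BY c.ActivityID
""").fetchall()

for row in first_order_after_call:
    print(row)
```

On this subset, ActivityID 1 and 7 receive an order date (2019-12-03 and 2019-12-02) where the desired output in the question has NULL, because the sub-query does not stop at the customer's next call. Whether that matters depends on which behaviour is actually wanted.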

Related

SQL query to add missing values per id/date

I have two tables: a table with id, date, value, and a table with all the dates of interest. I'd like to write a SQL query that gives me a new table exactly like my first table, but now with a NULL value for each ID on every date that is not present for that ID.
Table 1.
id date value
1 2021-01-01 10
1 2021-02-01 8
1 2021-04-01 20
2 2021-02-01 5
2 2021-04-01 6
Table 2.
date
2020-12-01
2021-01-01
2021-02-01
2021-03-01
2021-04-01
2021-05-01
After I "merge" the two tables the result would be:
id date value
1 2020-12-01 NULL
1 2021-01-01 10
1 2021-02-01 8
1 2021-03-01 NULL
1 2021-04-01 20
1 2021-05-01 NULL
2 2020-12-01 NULL
2 2021-01-01 NULL
2 2021-02-01 5
2 2021-03-01 NULL
2 2021-04-01 6
2 2021-05-01 NULL
Which SQL query do I need to run to get such result?
SELECT
    u.id,
    d.date,
    t.value
FROM
    (SELECT DISTINCT id FROM table1) u
CROSS JOIN
    table2 d
LEFT JOIN
    table1 t
        ON  t.id = u.id
        AND t.date = d.date
Though, I'd refrain from using date and other potential keywords as column names.
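A quick way to try the CROSS JOIN plus LEFT JOIN pattern on the sample data is sketched below, assuming SQLite through Python's sqlite3. Following the naming advice above, the tables and date column are renamed to measurements, calendar and obs_date; these names are illustrative and not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE measurements (id INTEGER, obs_date TEXT, value INTEGER);
CREATE TABLE calendar (obs_date TEXT);
INSERT INTO measurements VALUES
 (1,'2021-01-01',10),(1,'2021-02-01',8),(1,'2021-04-01',20),
 (2,'2021-02-01',5),(2,'2021-04-01',6);
INSERT INTO calendar VALUES
 ('2020-12-01'),('2021-01-01'),('2021-02-01'),
 ('2021-03-01'),('2021-04-01'),('2021-05-01');
""")

# Build every (id, date) combination first, then attach a value where one exists.
filled = conn.execute("""
SELECT u.id, d.obs_date, m.value
FROM (SELECT DISTINCT id FROM measurements) AS u
CROSS JOIN calendar AS d
LEFT JOIN measurements AS m
       ON m.id = u.id
      AND m.obs_date = d.obs_date
ORDER BY u.id, d.obs_date
""").fetchall()

for row in filled:
    print(row)
```

With 2 distinct ids and 6 calendar dates, the result has 12 rows, and the dates without a measurement carry None (NULL) in the value column.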

T-SQL get values for specific group

I have a table EmployeeContract similar to this:
ContractId EmployeeId ValidFrom ValidTo Salary
12 5 2018-02-01 2019-06-31 x
25 8 2015-01-01 2099-12-31 x
50 5 2019-07-01 2021-05-31 x
52 6 2011-08-01 2021-12-31 x
72 8 2010-08-01 2014-12-31 x
52 6 2011-08-01 2021-12-31 x
The table holds the contract history for each employee in the company. I need to get the date when each employee started work and the end date of their last contract. Sometimes the records contain duplicates.
For example, based on data from above:
EmployeeId ValidFrom ValidTo
5 2018-02-01 2021-05-31
8 2010-08-01 2099-12-31
6 2011-08-01 2021-12-31
Based on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied: I think it's not easy to read, and it is probably poorly optimized. How can I do it better?
I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid
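To see that a plain GROUP BY is enough here, including for the duplicated row, here is a small sketch on the question's sample data. It assumes SQLite through Python's sqlite3 and stores the dates as text exactly as they appear in the question; the output aliases FirstValidFrom and LastValidTo are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeContract (
    ContractId INTEGER, EmployeeId INTEGER,
    ValidFrom TEXT, ValidTo TEXT, Salary TEXT);
-- Values copied as-is from the question, including the duplicate contract 52.
INSERT INTO EmployeeContract VALUES
 (12,5,'2018-02-01','2019-06-31','x'),
 (25,8,'2015-01-01','2099-12-31','x'),
 (50,5,'2019-07-01','2021-05-31','x'),
 (52,6,'2011-08-01','2021-12-31','x'),
 (72,8,'2010-08-01','2014-12-31','x'),
 (52,6,'2011-08-01','2021-12-31','x');
""")

# MIN/MAX are unaffected by duplicate rows, so no DISTINCT or self-join is needed.
spans = conn.execute("""
SELECT EmployeeId,
       MIN(ValidFrom) AS FirstValidFrom,
       MAX(ValidTo)   AS LastValidTo
FROM EmployeeContract
GROUP BY EmployeeId
ORDER BY EmployeeId
""").fetchall()

for row in spans:
    print(row)
```

The three rows match the expected result in the question, one per employee.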

Problems with complex query

There are two tables.
In the first I have columns:
id - a person
time - the time of receiving the bonus (timestamp)
money - size of bonus
And the second:
id
time - time of getting a rank (timestamp)
range - military rank (int)
The task is to output the total amount and the number of bonuses received by people while they held the rank of captain (range = 7), aggregated by day.
I have no idea how to build a table with this data. I can summarize the data over all days, like this:
SELECT DISTINCTROW Payment.user_id AS user_id, Sum(IIf(IsNull(Payment.money),0,Payment.money)) AS [Sum - money], Count(Payment.money) AS [Count - Payment], Format(Payment.time, "Short Date") as day
FROM Payment
GROUP BY Payment.user_id, Format (Payment.time, "Short Date")
Having ((Count(Payment.money) > 0));
Can you help me with the second part, restricting the summary to the captain rank? Thanks.
For example: first table (Payment):
user_id time money
a 01.01.10 00:00:00 15,00
a 01.01.10 10:00:00 2,00
a 03.01.10 00:00:00 3,00
c 04.01.10 00:00:00 4,00
c 04.01.10 00:05:00 5,00
d 06.01.10 00:00:00 6,00
e 07.01.10 00:00:00 7,00
e 08.01.10 00:00:00 8,00
The second one:
user_id time range
a 01.01.10 00:00:00 6
a 01.01.10 09:00:00 7
a 04.01.10 00:00:00 8
b 04.01.10 00:00:00 4
c 04.01.10 00:05:00 7
d 06.01.10 00:00:00 5
e 07.01.10 00:00:00 6
f 08.01.10 00:00:00 6
g 08.01.10 00:00:00 7
I expected:
user_id time sum
a 01.01.10 2
a 03.01.10 3
c 04.01.10 5
Here is one possible method using joins:
select t1.user_id, datevalue(p.time) as [time], sum(p.money) as [sum]
from
(
(select t.user_id, t.time from rank t where t.range = 7) t1
inner join payment p on t1.user_id = p.user_id
)
left join
(select t.user_id, t.time from rank t where t.range > 7) t2 on p.user_id = t2.user_id
where
p.time >= t1.time and (t2.user_id is null or p.time < t2.time)
group by
t1.user_id, datevalue(p.time)
I have assumed that your second table is called rank (this was not stated in your question).
Here, the subquery t1 obtains the set of users with range = 7 (captain), and the subquery t2 obtains the set of users with range > 7. I then select all records with a payment date greater than or equal to the date of promotion to captain, but less than any subsequent promotion (if it exists).
This yields the following result:
+---------+------------+------+
| user_id | time | sum |
+---------+------------+------+
| a | 01/01/2010 | 2.00 |
| a | 03/01/2010 | 3.00 |
| c | 04/01/2010 | 5.00 |
+---------+------------+------+
Unless I have misunderstood, I would argue that your expected result is incorrect as the payment below occurs before user_id = c achieved the rank of captain:
c 04.01.10 00:00:00 4,00
c 04.01.10 00:05:00 7
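The same join logic can be tried outside Access. The sketch below assumes SQLite through Python's sqlite3, with timestamps rewritten in ISO format so that string comparison orders them correctly. The names ranks, paid_at, ranked_at and level are substitutes chosen for the example (the original column names time and range are avoided here), and, like the query above, it assumes at most one promotion beyond captain per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (user_id TEXT, paid_at TEXT, money REAL);
CREATE TABLE ranks   (user_id TEXT, ranked_at TEXT, level INTEGER);
INSERT INTO payment VALUES
 ('a','2010-01-01 00:00:00',15),('a','2010-01-01 10:00:00',2),
 ('a','2010-01-03 00:00:00',3), ('c','2010-01-04 00:00:00',4),
 ('c','2010-01-04 00:05:00',5), ('d','2010-01-06 00:00:00',6),
 ('e','2010-01-07 00:00:00',7), ('e','2010-01-08 00:00:00',8);
INSERT INTO ranks VALUES
 ('a','2010-01-01 00:00:00',6),('a','2010-01-01 09:00:00',7),
 ('a','2010-01-04 00:00:00',8),('b','2010-01-04 00:00:00',4),
 ('c','2010-01-04 00:05:00',7),('d','2010-01-06 00:00:00',5),
 ('e','2010-01-07 00:00:00',6),('f','2010-01-08 00:00:00',6),
 ('g','2010-01-08 00:00:00',7);
""")

# cap = promotion to captain, higher = optional later promotion beyond captain.
# A payment counts if it falls between the two (or after cap when no higher rank exists).
captain_bonuses = conn.execute("""
SELECT cap.user_id,
       date(p.paid_at) AS day,
       SUM(p.money)    AS total,
       COUNT(*)        AS bonus_count
FROM ranks AS cap
JOIN payment AS p
  ON p.user_id = cap.user_id
LEFT JOIN ranks AS higher
  ON higher.user_id = cap.user_id AND higher.level > 7
WHERE cap.level = 7
  AND p.paid_at >= cap.ranked_at
  AND (higher.user_id IS NULL OR p.paid_at < higher.ranked_at)
GROUP BY cap.user_id, date(p.paid_at)
ORDER BY cap.user_id, day
""").fetchall()

for row in captain_bonuses:
    print(row)
```

The extra COUNT(*) column covers the "number of bonuses" part of the task, which the query above does not return.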

SQL query to check if the next row value is same or different

I am joining two tables on a common date column. However, the column I am trying to get from one of the tables (cmg in this case) should be returned for a row only if its value differs from the previous row's value.
Table A
Date comp.no
-----------------------
2019-03-08 5
2019-02-26 5
2019-01-17 5
2019-01-10 5
2018-12-27 5
Table B
Date cmg
-----------------
2019-07-17 NULL
2019-04-20 NULL
2019-02-26 RHB
2019-01-19 NULL
2019-01-17 RHB
2019-01-10 RMB
2018-12-28 NULL
2018-12-27 RHB
2018-12-12 RUB
2018-11-28 RUB
2018-10-20 NULL
2018-07-21 NULL
2018-04-21 NULL
2018-01-20 NULL
2017-10-21 NULL
2017-07-29 NULL
2017-05-07 NULL
2017-02-13 NULL
2016-11-22 NULL
2016-08-29 NULL
2016-06-07 NULL
2016-04-06 RUB
2016-03-21 RUB
2016-03-07 RUB
You can use the lag() function to compare with the previous value. For the first row you'll need an isnull() check, since the first row has no previous value. Note that the window must be ordered by date, not by cmg, for "previous row" to mean the previous date.
;with cte as (
    select case
               when isnull(lag(t2.cmg) over (order by t2.date), '') <> t2.cmg then 1 else 0
           end as isresult,
           t2.date, t2.cmg
    from TableA t1
    inner join TableB t2
        on t1.date = t2.date
)
select date, cmg from cte where isresult = 1
Use lag():
select date, cmg
from (select b.date, b.cmg, lag(b.cmg) over (order by b.date) as prev_cmg
from a join
b
on a.date = b.date
) b
where prev_cmg is null or prev_cmg <> cmg
order by date;
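The lag() filter can be exercised on the question's data as follows. This is a sketch assuming SQLite 3.25 or later through Python's sqlite3; the table names table_a and table_b and the column name comp_no are stand-ins, and only the most recent rows of Table B are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (date TEXT, comp_no INTEGER);
CREATE TABLE table_b (date TEXT, cmg TEXT);
INSERT INTO table_a VALUES
 ('2019-03-08',5),('2019-02-26',5),('2019-01-17',5),
 ('2019-01-10',5),('2018-12-27',5);
INSERT INTO table_b VALUES
 ('2019-07-17',NULL),('2019-04-20',NULL),('2019-02-26','RHB'),
 ('2019-01-19',NULL),('2019-01-17','RHB'),('2019-01-10','RMB'),
 ('2018-12-28',NULL),('2018-12-27','RHB'),('2018-12-12','RUB');
""")

# Keep a joined row only when cmg differs from the previous joined row (by date);
# the first row has no predecessor, so prev_cmg IS NULL lets it through.
changes = conn.execute("""
SELECT date, cmg
FROM (SELECT b.date, b.cmg,
             LAG(b.cmg) OVER (ORDER BY b.date) AS prev_cmg
      FROM table_a AS a
      JOIN table_b AS b ON a.date = b.date) AS joined
WHERE prev_cmg IS NULL OR prev_cmg <> cmg
ORDER BY date
""").fetchall()

for row in changes:
    print(row)
```

Four dates survive the join; the row for 2019-02-26 is then dropped because its cmg (RHB) repeats the value from 2019-01-17.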

CREATE TEMP TABLE BASED ON SELECT DISTINCT ON 3 COLUMNS BUT WITH 1 EXTRA COLUMN

I need to create a temporary table containing:
Partcode, MutationDate, MovementType, Qty
Every partcode has multiple mutationdates per Movementtype (there are max 9 movementtypes possible)
I need to get the last mutationdate per movementtype per partcode and the quantity that goes with that.
An example with partcode 003307
003307 2018-05-31 1 -100
003307 2018-06-11 2 -33
003307 2018-04-25 3 +25
and so on for all 9 movementtypes.
What did I get so far:
create table #LMUT(
MutationDate T_Date
,PartCode T_Code_Part
,CumInvQty T_Quantum_Qty10_3
,MovementType T_Type_PMOverInvt
)
insert #LMUT(
MutationDate,
Partcode,
CumInvQty,
MovementType)
SELECT
cast (max(MOV.MutationDate) as date)
,MOV.PartCode
,INV.MutationQty
,INV.PMOverInvtType
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
WHERE
MOV.PartMovementType = 1
group by MOV.PartCode,INV.PMOverInvtType,INV.MutationQty,MOV.MutationDate
SELECT * FROM #LMUT where partcode='003007'
drop table #LMUT
results in:
2016-12-06 00:00:00.000 003007 -24.000 2
2016-09-29 00:00:00.000 003007 -24.000 2
2016-11-09 00:00:00.000 003007 -24.000 2
2016-11-22 00:00:00.000 003007 -24.000 2
2016-10-26 00:00:00.000 003007 -24.000 2
2016-09-12 00:00:00.000 003007 -42.000 2
2016-10-13 00:00:00.000 003007 -24.000 2
2016-12-03 00:00:00.000 003007 100.000 5
2017-01-12 00:00:00.000 003007 -48.000 2
2016-10-04 00:00:00.000 003007 306.000 7
Not what I need, still have 8 times type 2
What else have I tried:
SELECT distinct MOV.Partcode,INV.PMOverInvtType,mov.MutationDate
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on
INV.PMMainCode=MOV.PMMainCode
WHERE
mov.MutationDate = (SELECT MAX (c.MutationDate) FROM
dbo.T_PartMovementMain as c
inner join dbo.T_PartMovementOverInvt as d on D.PMMainCode=c.PMMainCode
WHERE
C.PartMovementType = 1 AND
C.PartCode=mov.PartCode AND
D.PMMainCode = C.PMMainCode AND
D.PMOverInvtType=inv.PMOverInvtType
)
and MOV.PartMovementType = 1 and mov.partcode='003007'
order by MOV.Partcode,INV.PMOverInvtType
Results in:
3007 2 2017-01-12 00:00:00.000
3007 5 2016-12-03 00:00:00.000
3007 7 2016-10-04 00:00:00.000
That is what I want but I need to get the Qty too.
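Use the row_number() window function, partitioned by part code and movement type so that each combination gets its own latest row (explicit columns are selected because both tables contain PMMainCode):
with cte as
( SELECT MOV.PartCode, INV.PMOverInvtType, MOV.MutationDate, INV.MutationQty,
         row_number() over (partition by MOV.PartCode, INV.PMOverInvtType order by MOV.MutationDate desc) rn
  FROM dbo.T_PartMovementMain as MOV
  inner join dbo.T_PartMovementOverInvt as INV on
      INV.PMMainCode = MOV.PMMainCode
) select cte.* from cte where rn = 1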
Solved it like this:
create table #LMUT(
PartCode T_Code_Part
,MovementType T_Type_PMOverInvt
,MutationDate T_Date
,CumInvQty T_Quantum_Qty10_3
)
insert #LMUT(Partcode,MovementType,MutationDate,CumInvQty)
select Artikel,Type,Datum,Aant
from (
SELECT MOV.Partcode as Artikel,INV.PMOverInvtType as Type,mov.MutationDate as Datum,INV.MutationQty as Aant,
row_number() over(partition by MOV.Partcode,INV.PMOverInvtType order by MOV.Partcode,INV.PMOverInvtType,MOV.MutationDate desc) rn
FROM dbo.T_PartMovementMain as MOV
inner join dbo.T_PartMovementOverInvt as INV on INV.PMMainCode=MOV.PMMainCode) cse
where rn=1
select * from #LMUT order by Partcode
drop table #LMUT
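The row_number() pattern used in the answers above can be tried in isolation. The sketch below assumes SQLite 3.25 or later through Python's sqlite3 and flattens the two joined tables into a single hypothetical table named movements; the rows are a handful taken from the question's output and example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movements (
    PartCode TEXT, MovementType INTEGER, MutationDate TEXT, Qty REAL);
INSERT INTO movements VALUES
 ('003007',2,'2016-12-06',-24),('003007',2,'2016-09-12',-42),
 ('003007',2,'2017-01-12',-48),('003007',5,'2016-12-03',100),
 ('003007',7,'2016-10-04',306),
 ('003307',1,'2018-05-31',-100),('003307',2,'2018-06-11',-33),
 ('003307',3,'2018-04-25',25);
""")

# Number the rows of each (PartCode, MovementType) group from newest to oldest
# and keep only the newest one, which carries its own quantity along.
latest = conn.execute("""
SELECT PartCode, MovementType, MutationDate, Qty
FROM (SELECT m.*,
             ROW_NUMBER() OVER (PARTITION BY PartCode, MovementType
                                ORDER BY MutationDate DESC) AS rn
      FROM movements AS m) AS ranked
WHERE rn = 1
ORDER BY PartCode, MovementType
""").fetchall()

for row in latest:
    print(row)
```

Unlike MAX(MutationDate) with GROUP BY, this keeps the quantity that belongs to the latest date without adding it to the grouping columns. If two movements share the same latest date, row_number() picks one of them arbitrarily unless a tie-breaker is added to the ORDER BY.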