How to join one table to another based on two date columns? - sql

I have two tables which are
T1:
UserID Tier BeginDate EndDate
8278020 1 2019-03-02 18:33:04.893 2019-03-28 10:34:33.837
8278020 2 2019-03-28 10:34:33.837 2019-04-01 16:48:22.107
8278020 3 2019-04-01 16:48:22.107 2019-04-07 21:44:40.060
8278020 4 2019-04-07 21:44:40.060 2019-06-30 23:59:59.999
T2:
UserID GiftCardID UseDate OrderID IsUsed
8278020 165491838 2019-03-06 23057796 1
8278020 165491839 2019-03-10 23106429 1
8278020 165491840 2019-03-24 23277217 1
8278020 166418161 NULL NULL 0
8278020 166418162 NULL NULL 0
8278020 167026357 2019-04-22 23594414 1
8278020 167026358 2019-04-28 23668492 1
I want to join the two tables so that each gift card shows the tier the customer was in when he/she used it.
For example, when the user used the gift card with GiftCardID = '165491839', he was in tier 1.
For GiftCardID = '167026357', the tier is 4.
I couldn't figure out how to match the tables this way.
Any help would be appreciated.

Just use JOIN:
select t2.*, t1.tier
from table2 t2 left join
table1 t1
on t2.userid = t1.userid and
t2.usedate >= t1.begindate and
t2.usedate < t1.enddate;
This is a left join, so you won't lose rows if, for some reason, the dates don't match.
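As a minimal sketch of the range join above, the following runs the same query against the sample rows from the question using Python's built-in sqlite3 module. The table names table1 and table2 follow the answer; storing the dates as ISO-8601 text (which SQLite compares as strings) is an assumption made only for this illustration.

```python
import sqlite3

# In-memory database; dates are stored as ISO-8601 text (an assumption for
# this sketch), so plain string comparison orders them chronologically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (userid INTEGER, tier INTEGER, begindate TEXT, enddate TEXT);
CREATE TABLE table2 (userid INTEGER, giftcardid INTEGER, usedate TEXT);
""")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", [
    (8278020, 1, "2019-03-02 18:33:04.893", "2019-03-28 10:34:33.837"),
    (8278020, 2, "2019-03-28 10:34:33.837", "2019-04-01 16:48:22.107"),
    (8278020, 3, "2019-04-01 16:48:22.107", "2019-04-07 21:44:40.060"),
    (8278020, 4, "2019-04-07 21:44:40.060", "2019-06-30 23:59:59.999"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [
    (8278020, 165491838, "2019-03-06"),
    (8278020, 165491839, "2019-03-10"),
    (8278020, 165491840, "2019-03-24"),
    (8278020, 166418161, None),
    (8278020, 166418162, None),
    (8278020, 167026357, "2019-04-22"),
    (8278020, 167026358, "2019-04-28"),
])

# Half-open interval [begindate, enddate): a use date that falls exactly on a
# tier boundary matches only the tier that begins at that instant.
tiers = dict(conn.execute("""
    SELECT t2.giftcardid, t1.tier
    FROM table2 t2
    LEFT JOIN table1 t1
      ON t2.userid = t1.userid
     AND t2.usedate >= t1.begindate
     AND t2.usedate <  t1.enddate
    ORDER BY t2.giftcardid
""").fetchall())

for giftcardid, tier in tiers.items():
    print(giftcardid, tier)
```

Gift cards that were never used have a NULL usedate, so they match no tier row; the left join keeps them with a NULL tier.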

Related

SQL: Left join on calendar table (spark SQL)

I am trying to join data to a calendar table cross joined with user id, to get other columns corresponding to it. I have tried joining both with and without the date condition. I created a cross-joined master table to left join the other data onto. However, it seems I am missing something.
DATE_TBL looks like:
CAL_DT BUYER_ID
2019-03-31 1
2019-03-31 2
2019-03-31 3
2019-03-30 1
2019-03-30 2
2019-03-30 3
2019-03-29 1
2019-03-29 2
2019-03-29 3 ......
DATA2 looks like:
CREATED_DT BUYER_ID ITEM_PRICE
2019-03-31 1 10
2019-03-30 2 12
2019-03-29 3 45
2019-03-29 2 13 ........
Here is my code:
WITH DATE_TBL AS
(
SELECT CAL.CAL_DT, CK.BUYER_ID
FROM DATA1 CAL
CROSS JOIN DATA2 CK
WHERE cal.CAL_DT BETWEEN '2018-01-01' AND '2019-03-31'
AND CK.BYR_CNTRY_ID IN (1,2,3) AND CK.CREATED_DT BETWEEN '2019-03-01' AND '2019-03-31'
GROUP BY 1,2
)
,
REVENUE_CALC AS
(
SELECT CAL.CAL_DT
,CK.BYR_CNTRY_ID
,CK.BUYER_ID
,CK.CREATED_DT AS CREATED_DT
,SUM(CK.ITEM_PRICE) AS ITEM_PRICE
,SUM(CK.QUANTITY) AS QUANTITY
,MAX(COALESCE(I.CURNCY_PLAN_RATE, 1)) AS CURNCY_PLAN_RATE
,SUM(CK.ITEM_PRICE *CK.QUANTITY *I.CURNCY_PLAN_RATE) AS REVENUE
FROM DATE_TBL CAL
LEFT JOIN DATA2 CK
ON CAL.BUYER_ID = CK.BUYER_ID AND CAL.CAL_DT = CK.CREATED_DT
LEFT JOIN DATA3 I
ON I.CURNCY_ID = CK.LSTG_CURNCY_ID
GROUP BY 1,2,3,4
ORDER BY CAL.CAL_DT DESC, CK.BUYER_ID
)
SELECT *
FROM REVENUE_CALC
The desired result must look like:
CAL_DT BUYER_ID ITEM ITEM_PRICE
2019-03-31 1 10
2019-03-31 2 null
2019-03-31 3 null
2019-03-30 1 null
2019-03-30 2 12
2019-03-30 3 null
2019-03-29 1 null
2019-03-29 2 13
2019-03-29 3 45 ......
What I get is only the data for common dates. Could someone help me understand what I am doing wrong?

SQL query to check if the next row value is same or different

I am joining two tables on a common date column. However, the column I am trying to get from one of the tables (cmg in this case) should take the next row's value only if it differs from the previous row's value.
Table A
Date comp.no
-----------------------
2019-03-08 5
2019-02-26 5
2019-01-17 5
2019-01-10 5
2018-12-27 5
Table B
Date cmg
-----------------
2019-07-17 NULL
2019-04-20 NULL
2019-02-26 RHB
2019-01-19 NULL
2019-01-17 RHB
2019-01-10 RMB
2018-12-28 NULL
2018-12-27 RHB
2018-12-12 RUB
2018-11-28 RUB
2018-10-20 NULL
2018-07-21 NULL
2018-04-21 NULL
2018-01-20 NULL
2017-10-21 NULL
2017-07-29 NULL
2017-05-07 NULL
2017-02-13 NULL
2016-11-22 NULL
2016-08-29 NULL
2016-06-07 NULL
2016-04-06 RUB
2016-03-21 RUB
2016-03-07 RUB
You can use the lag() function to compare with the previous value. For the first row you'll need an isnull() check, since the first row has no previous value.
;with cte as(
select case
when isnull(lag(t2.cmg) over (order by t2.date), '') <> t2.cmg then 1 else 0 end as isresult
,t2.date,t2.cmg
from TableA t1
inner join TableB t2
on t1.date=t2.date
)
select date,cmg from cte where isresult=1
Use lag():
select date, cmg
from (select b.date, b.cmg, lag(b.cmg) over (order by b.date) as prev_cmg
from a join
b
on a.date = b.date
) b
where prev_cmg is null or prev_cmg <> cmg
order by date;
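To see how the lag() comparison behaves, here is a small sketch that runs the query above with Python's sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions), loads all of Table A but only a subset of the Table B rows from the question, and renames the comp.no column to comp_no.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (date TEXT, comp_no INTEGER);
CREATE TABLE b (date TEXT, cmg TEXT);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)", [
    ("2019-03-08", 5), ("2019-02-26", 5), ("2019-01-17", 5),
    ("2019-01-10", 5), ("2018-12-27", 5),
])
# Subset of Table B from the question (newest rows only).
conn.executemany("INSERT INTO b VALUES (?, ?)", [
    ("2019-07-17", None), ("2019-04-20", None), ("2019-02-26", "RHB"),
    ("2019-01-19", None), ("2019-01-17", "RHB"), ("2019-01-10", "RMB"),
    ("2018-12-28", None), ("2018-12-27", "RHB"), ("2018-12-12", "RUB"),
])

# Keep a joined row only when its cmg differs from the cmg of the previous
# joined row in date order; the first row has no predecessor (prev_cmg NULL).
changes = conn.execute("""
    SELECT date, cmg
    FROM (SELECT b.date AS date, b.cmg AS cmg,
                 LAG(b.cmg) OVER (ORDER BY b.date) AS prev_cmg
          FROM a JOIN b ON a.date = b.date) AS j
    WHERE prev_cmg IS NULL OR prev_cmg <> cmg
    ORDER BY date
""").fetchall()

for row in changes:
    print(row)
```

With this data the joined dates are 2018-12-27, 2019-01-10, 2019-01-17 and 2019-02-26; the last one repeats RHB from the row before it, so it is filtered out.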

Left outer Table Joins on multiple conditions.

I have two tables and want to left outer join.
First Table
Id RenewalTerm EffectiveDt RenewalDt
400001 -1 8/1/2012 8/1/2013
400001 0 8/1/2013 8/1/2014
400001 1 8/1/2014 8/1/2015
400001 2 8/1/2015 8/1/2016
400001 3 8/1/2016 8/1/2017
400001 4 8/1/2017 8/1/2018
SecondTable
Id RenewalTerm MaxSize AY DateTime EffectiveDt RenewalDt
400001 -1 2 2013 2/25/2013 8/1/2012 8/1/2013
400001 -1 1.75 2013 2/25/2013 8/1/2012 8/1/2013
400001 2 1.75 2016 5/1/2016 8/1/2015 8/1/2016
Expected Table
Result
Id RenewalTerm EffectiveDt RenewalDt DateTime AY MaxSize
400001 -1 8/1/2012 8/1/2013 *2/25/2013 2013 2*
*400001 -1 8/1/2012 8/1/2013 2/25/2013 2013 1.75*
400001 0 8/1/2013 8/1/2014 NULL NULL NULL
400001 1 8/1/2014 8/1/2015 NULL NULL NULL
*400001 2 8/1/2015 8/1/2016 5/1/2016 2016 1.75*
400001 3 8/1/2016 8/1/2017 NULL NULL NULL
400001 4 8/1/2017 8/1/2018 NULL NULL NULL
In the second table, renewal term -1 appears twice, while the first table has just one -1 row. So one of the -1 rows should get updated with MaxSize, AY and DateTime, and a new -1 row from the second table should be added to the first table.
In the second table, renewal term 2 appears just once. So the extra columns MaxSize, AY and DateTime from the second table should get added to the first.
I have been trying to solve this for a long time. Can somebody please help me with this? Thank you.
I have added italics/stars to show which data got updated/added.
See: basic LEFT JOIN theory and the COALESCE function.
SELECT a.ID, a.RenewalTerm, COALESCE( b.EffectiveDt, a.EffectiveDt ) AS EffectiveDt,
COALESCE( b.RenewalDt, a.RenewalDt ) AS RenewalDt,
MaxSize, AY, [DateTime]
FROM [FirstTable] AS a
LEFT JOIN [SecondTable] AS b ON a.ID = b.ID AND a.RenewalTerm = b.RenewalTerm
It looks like a simple left join plus coalesce resolves your problem.
Please check this query:
Select
t1.Id,
t1.RenewalTerm,
coalesce(t2.EffectiveDt, t1.EffectiveDt) EffectiveDt,
coalesce(t2.RenewalDt, t1.RenewalDt) RenewalDt,
t2.DateTime,
t2.AY,
t2.MaxSize
From
table1 t1
left join table2 t2 on t1.id = t2.id and t1.RenewalTerm = t2.RenewalTerm
I see this as:
select t1.*, t2.DateTime, t2.AY, t2.MaxSize
from table1 t1 left join
table2 t2
on t1.id = t2.id and t1.renewalterm = t2.renewalterm;
Perhaps I'm missing something, but I see no need for coalesce().
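A quick way to check that the plain left join is enough is to run it on the sample rows. The sketch below does so with Python's sqlite3 module; the table names table1 and table2 follow the answers, dates are kept as the text shown in the question, and only the columns needed for the join are loaded into table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, renewalterm INTEGER, effectivedt TEXT, renewaldt TEXT);
CREATE TABLE table2 (id INTEGER, renewalterm INTEGER, maxsize REAL, ay INTEGER, "datetime" TEXT);
""")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", [
    (400001, -1, "8/1/2012", "8/1/2013"),
    (400001, 0, "8/1/2013", "8/1/2014"),
    (400001, 1, "8/1/2014", "8/1/2015"),
    (400001, 2, "8/1/2015", "8/1/2016"),
    (400001, 3, "8/1/2016", "8/1/2017"),
    (400001, 4, "8/1/2017", "8/1/2018"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?, ?)", [
    (400001, -1, 2, 2013, "2/25/2013"),
    (400001, -1, 1.75, 2013, "2/25/2013"),
    (400001, 2, 1.75, 2016, "5/1/2016"),
])

# Every table1 row is kept; a term with two matches in table2 yields two
# output rows, and a term with no match gets NULLs in the table2 columns.
rows = conn.execute("""
    SELECT t1.renewalterm, t2."datetime", t2.ay, t2.maxsize
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.id = t2.id AND t1.renewalterm = t2.renewalterm
    ORDER BY t1.renewalterm, t2.maxsize DESC
""").fetchall()

for row in rows:
    print(row)
```

Because EffectiveDt and RenewalDt come from table1, which always has a row, they are never NULL here, which is consistent with the remark that coalesce() is not required.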

What is the best method to identify which pairs of rows have identical Products, Customers and Measures, and overlapping date ranges?

(Image: sample question in which I have to identify duplicate rows and then make the date ranges not overlap.)
Rows 1 and 2 overlap, like this:
20130101 |--------------------| 20130401
20130301 |----------------------| 20131231
You can use T-SQL in MS SQL Server:
select t1a.id , t1.id second_id,
t1.valid_from_day , t1.valid_to_day ,
t1a.valid_from_day second_valid_from_day ,
t1a.valid_to_day second_valid_to_day
from t1 t1a
cross apply
(
select * from t1
where t1.product = t1a.product
and t1.customer = t1a.customer
and t1.measure = t1a.measure
and t1.id <> t1a.id
and t1.valid_from_day >= t1a.valid_from_day -- overlap
and t1.valid_to_day >= t1a.valid_to_day
) t1
The result of the query is:
id second_id valid_from_day valid_to_day second_valid_from_day second_valid_to_day
1 2 2013-03-01 2013-12-31 2013-01-01 2013-04-01
4 5 2013-03-01 2014-04-01 2013-01-01 2013-04-01
9 10 2014-04-01 2015-01-01 2013-03-01 2013-12-31
So the identical pairs are:
pair 1,2
pair 4,5
pair 9,10
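Note that requiring both the start and the end of one row to be later than the other's, as the query above does, does not by itself guarantee that the ranges intersect: the listed pair 9,10 appears to run from 2014-04-01 to 2015-01-01 against 2013-03-01 to 2013-12-31. A commonly used overlap test is "each range starts on or before the other one ends". The sketch below applies it with Python's sqlite3 module. Rows 1 and 2 use the dates from the diagram in the question; row 3 and the product, customer and measure values are hypothetical, since the original data was only available as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE t1 (id INTEGER, product TEXT, customer TEXT, measure TEXT,
                 valid_from_day TEXT, valid_to_day TEXT)
""")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "P1", "C1", "M1", "2013-01-01", "2013-04-01"),
    (2, "P1", "C1", "M1", "2013-03-01", "2013-12-31"),
    (3, "P1", "C1", "M1", "2014-04-01", "2015-01-01"),  # hypothetical, no overlap
])

# Two ranges overlap when each one starts on or before the other one ends.
# a.id < b.id reports every pair once and never pairs a row with itself.
pairs = conn.execute("""
    SELECT a.id, b.id
    FROM t1 a
    JOIN t1 b
      ON a.product = b.product
     AND a.customer = b.customer
     AND a.measure = b.measure
     AND a.id < b.id
     AND a.valid_from_day <= b.valid_to_day
     AND b.valid_from_day <= a.valid_to_day
    ORDER BY a.id, b.id
""").fetchall()

print(pairs)
```

Whether ranges that merely touch at an endpoint should count as overlapping depends on the requirement; switching <= to < excludes them.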

SQL remove colliding part of datetimes in a table according to value of a column

I have a table like
MemberID MembershipStartDate MembershipEndDate type
=============================================================================
123 2010-01-01 10:00:00.000 2012-12-31 23:00:00.000 1
123 2011-01-01 21:00:00.000 2012-12-31 12:00:00.000 2
123 2013-05-01 9:00:00.000 2013-12-31 5:00:00.000 2
123 2014-01-01 14:00:00.000 2014-12-31 2:00:00.000 1
123 2014-01-01 11:00:00.000 2015-03-31 1:00:00.000 2
For a given member and type, the time ranges do not collide: within type 1, no row's start-to-finish range overlaps another row's. So if I have member 123, type 1, starting 2010-01-01 10:00:00.000 and finishing 2012-12-31 23:00:00.000, I cannot also have member 123, type 1, starting 2010-02-01 10:00:00.000 and finishing 2013-12-31 23:00:00.000, since the ranges collide (I could, however, have this for type 2). This is my current table.
What I want to do is remove the collisions between different types for the same MemberID. For example, for MemberID 123, suppose a type 2 row starts at 2013-05-01 9:00:00.000 and finishes at 2013-12-31 5:00:00.000, and a type 1 row starts at 2013-10-01 9:00:00.000 and finishes at 2014-12-31 5:00:00.000. The row that starts later is the one trimmed, so since the type 2 row started first, the type 1 row would be trimmed to 2013-12-31 5:00:00.000 through 2014-12-31 5:00:00.000. As you can see, the new start date of the type 1 row is the finish date of the type 2 row.
In the end, the table should look like this:
MemberID MembershipStartDate MembershipEndDate type
=============================================================================
123 2010-01-01 10:00:00.000 2012-12-31 23:00:00.000 1
123 2012-12-31 23:00:00.000 2012-12-31 12:00:00.000 2
123 2013-05-01 9:00:00.000 2013-12-31 5:00:00.000 2
123 2014-01-01 14:00:00.000 2014-12-31 2:00:00.000 1
123 2014-12-31 2:00:00.000 2015-03-31 1:00:00.000 2
The times are not necessarily in order.
First, I would recommend adding an auto-incrementing id field to the table, so that referencing each row is easier.
Second, use a self-referential query to find the offending records (and, as I often prefer, generate the UPDATE SQL).
SELECT CONCAT("UPDATE <table> SET enddate = ", QUOTE(t2.startdate), " WHERE id = ", t1.id, ";") AS stmt
#, t1.*, t2.* # uncomment this line to see the raw data.
FROM <table> AS t1
JOIN <table> AS t2 ON t1.member_id = t2.member_id
AND t1.type = t2.type
AND t1.id != t2.id # this makes sure that you don't connect a record to itself. If you didn't have an auto-incrementing key, you would need a nasty OR chain to accomplish this
WHERE t1.enddate > t2.startdate
AND t1.startdate < t2.startdate;
If you choose not to use an auto-incrementing PK, then:
AND t1.id != t2.id
#becomes something like:
AND NOT (t1.enddate = t2.enddate AND t1.startdate = t2.startdate)
depending on what the natural key actually is (excluding the parts of it that you are actually joining on).
See the accepted answer and its comments for the main idea; this is what I changed from it:
SELECT t2.id, t2.code, MAX(case when t1.enddate > t2.startdate and t1.startdate < t2.startdate then t1.enddate else t2.startdate end), MAX(t2.enddate)
FROM #temporaryTable2 AS t2
LEFT JOIN #temporaryTable2 AS t1 ON t1.member_id = t2.member_id
AND t1.Code != t2.Code
AND t1.id != t2.id
GROUP BY t2.id, t2.code
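The trimming query above can be tried on the two-row example from the question. This is a sketch under assumptions: the table name memberships, the use of a type column in place of Code, the added id column, and the zero-padded ISO-8601 text timestamps are all choices made for this illustration, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE memberships (id INTEGER PRIMARY KEY, member_id INTEGER,
                          type INTEGER, startdate TEXT, enddate TEXT)
""")
conn.executemany("INSERT INTO memberships VALUES (?, ?, ?, ?, ?)", [
    (1, 123, 2, "2013-05-01 09:00:00", "2013-12-31 05:00:00"),
    (2, 123, 1, "2013-10-01 09:00:00", "2014-12-31 05:00:00"),
])

# For each row t2, look at rows t1 of a different type for the same member.
# If t1 started earlier and is still running when t2 starts, t2's start is
# pushed to t1's end; otherwise t2 keeps its own start.
trimmed = conn.execute("""
    SELECT t2.id, t2.type,
           MAX(CASE WHEN t1.enddate > t2.startdate
                     AND t1.startdate < t2.startdate
                    THEN t1.enddate
                    ELSE t2.startdate END) AS new_start,
           t2.enddate
    FROM memberships AS t2
    LEFT JOIN memberships AS t1
      ON t1.member_id = t2.member_id
     AND t1.type <> t2.type
     AND t1.id <> t2.id
    GROUP BY t2.id, t2.type, t2.enddate
    ORDER BY t2.id
""").fetchall()

for row in trimmed:
    print(row)
```

The type 2 row started first and is left unchanged, while the type 1 row's start moves to 2013-12-31 05:00:00, matching the example described in the question.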