SQL left join same column and table - sql

I have customer order data and would like to analyze customer retention after a price change.
The order table is as follows:
customer_id order_number order_delivered_date
14156 R980193622 2/6/2020 14:51
1926396 R130222714 22/5/2020 11:02
1085123 R313065343 22/5/2020 14:50
699858 R693959049 8/6/2020 17:03
1609769 R195969327 3/6/2020 16:14
14156 R997103187 27/6/2020 14:01
1926396 R403942827 11/6/2020 14:42
1926396 R895013611 8/7/2020 17:04
So, I would like to pull the orders from the period before the new price. Assume the new price takes effect on 10/6/2020. I would like to left join these to the orders placed after the new price, on customer_id.
Before is a set of data dated 10/5/2020 00:00:00 to 9/6/2020 23:59:59 while After is a set of data dated 10/6/2020 00:00:00 to 9/7/2020 23:59:59.
The desired table:
Before After
14156 14156
1926396 1926396
1085123 Null
699858 Null
1609769 Null
If a customer_id appears on both sides, the customer was retained. It should be simple, but I have been stuck.
EDIT:
Here is some of the code I have tried:
First try:
select ol2.customer_id as before, ol.customer_id as after
from master.order_level ol,
left join master.order_level ol2
on ol2.customer_id = ol.customer_id
where order_delivered_date between '2020-05-10 00:00:00' and '2020-07-09 23:59:59' and country_id = 2
Second try:
SELECT ol.customer_id as before, ol2.customer_id as after
FROM master.order_level ol,master.order_level ol2
left join master.order_level
ON ol.customer_id = ol2.customer_id
WHERE ol.order_delivered_date between '2020-05-10 00:00:00' and '2020-06-09 23:59:59' and ol.country_id =2 and ol2.order_delivered_date between '2020-06-10 00:00:00' and '2020-07-09 23:59:59' and ol2.country_id =2

There is no need for a join; a simple GROUP BY with CASE expressions and aggregate functions is enough. I also made a fiddle showing it in action.
SELECT customer_id,
CASE
WHEN MIN(order_delivered_date) < '3-15-2019' THEN customer_id
ELSE NULL END customer_before,
CASE
WHEN MAX(order_delivered_date) >= '3-15-2019' THEN customer_id
ELSE NULL END customer_after
FROM my_table
GROUP BY customer_id
This query will give you results like this:
customer_id customer_before customer_after
4 4 (null)
1 1 1
3 3 (null)
2 2 2
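As a rough check of the conditional-aggregation idea against the question's own sample rows, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table name order_level comes from the question; the ISO date strings, the cutoff of 2020-06-10 and the helper name retention are assumptions made for this illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings so that
# plain string comparison orders them chronologically in SQLite.
ORDERS = [
    (14156, "R980193622", "2020-06-02 14:51"),
    (1926396, "R130222714", "2020-05-22 11:02"),
    (1085123, "R313065343", "2020-05-22 14:50"),
    (699858, "R693959049", "2020-06-08 17:03"),
    (1609769, "R195969327", "2020-06-03 16:14"),
    (14156, "R997103187", "2020-06-27 14:01"),
    (1926396, "R403942827", "2020-06-11 14:42"),
    (1926396, "R895013611", "2020-07-08 17:04"),
]

# One row per customer: the id is echoed in customer_before when the earliest
# order predates the cutoff, and in customer_after when the latest order is
# on or after it.
QUERY = """
SELECT customer_id,
       CASE WHEN MIN(order_delivered_date) <  :cutoff THEN customer_id END AS customer_before,
       CASE WHEN MAX(order_delivered_date) >= :cutoff THEN customer_id END AS customer_after
FROM order_level
GROUP BY customer_id
ORDER BY customer_id
"""


def retention(cutoff="2020-06-10"):
    """Hypothetical helper: load the sample rows and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_level ("
        "customer_id INTEGER, order_number TEXT, order_delivered_date TEXT)"
    )
    conn.executemany("INSERT INTO order_level VALUES (?, ?, ?)", ORDERS)
    rows = conn.execute(QUERY, {"cutoff": cutoff}).fetchall()
    conn.close()
    return rows


rows = retention()
for row in rows:
    print(row)
```

With this sample, only customers 14156 and 1926396 end up with a non-null customer_after.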

with before (customer_id) as
( select distinct customer_id from orders where order_delivered_date >= '2020-05-10' and order_delivered_date < '2020-06-10'
),
after (customer_id) as
(select distinct customer_id from orders where order_delivered_date >= '2020-06-10' and order_delivered_date < '2020-07-10')
select
before.customer_id,
after.customer_id
from before left outer join after on before.customer_id = after.customer_id
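The same two-set left join can be tried end to end on the question's sample data. The sketch below uses Python's sqlite3 module as a stand-in database; the CTE names before_set and after_set, the ISO date strings and the half-open date windows are assumptions chosen for the illustration, not part of the original answer.

```python
import sqlite3

# (customer_id, order_delivered_date) pairs from the question, as ISO strings.
ORDERS = [
    (14156, "2020-06-02 14:51"),
    (1926396, "2020-05-22 11:02"),
    (1085123, "2020-05-22 14:50"),
    (699858, "2020-06-08 17:03"),
    (1609769, "2020-06-03 16:14"),
    (14156, "2020-06-27 14:01"),
    (1926396, "2020-06-11 14:42"),
    (1926396, "2020-07-08 17:04"),
]

# Distinct customers in each window, then a left join so that customers who
# did not come back show NULL on the "after" side.
QUERY = """
WITH before_set AS (
    SELECT DISTINCT customer_id FROM orders
    WHERE order_delivered_date >= '2020-05-10'
      AND order_delivered_date <  '2020-06-10'
),
after_set AS (
    SELECT DISTINCT customer_id FROM orders
    WHERE order_delivered_date >= '2020-06-10'
      AND order_delivered_date <  '2020-07-10'
)
SELECT b.customer_id AS before_id, a.customer_id AS after_id
FROM before_set AS b
LEFT JOIN after_set AS a ON a.customer_id = b.customer_id
ORDER BY b.customer_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_delivered_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)
pairs = conn.execute(QUERY).fetchall()
conn.close()

# Customers with a match on the right-hand side were retained.
retained = [before_id for before_id, after_id in pairs if after_id is not None]
print(pairs)
print(retained)
```

Using a half-open window (>= start, < next day after the end) avoids having to spell out 23:59:59 and still covers the whole last day.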

You can use UNION:
select customer_id as before, null as after
from #order
where order_delivered_date <'2020-06-10'
union
select null as before, customer_id as after
from #order
where order_delivered_date >='2020-06-10'

Related

add missing month in sales

I have a sales table with below values.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-07-01,1234,9
2020-03-01,3241,8
2020-07-01,3241,4
As you can see, the first purchase for CustomerID = 1234 was in Jan 2020 and for CustomerID = 3241 in Mar 2020.
I want an output in which all the missing months are filled in with a purchase quantity of 0.
That means if there is no sale in a month between the first and last purchase, the output should be as below.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-02-01,1234,0
2020-03-01,1234,0
2020-04-01,1234,0
2020-05-01,1234,0
2020-06-01,1234,0
2020-07-01,1234,9
2020-03-01,3241,8
2020-04-01,3241,0
2020-05-01,3241,0
2020-06-01,3241,0
2020-07-01,3241,4
You can use a recursive query to create the missing dates per customer.
with recursive dates (customerid, transactiondate, max_transactiondate) as
(
select customerid, min(transactiondate), max(transactiondate)
from sales
group by customerid
union all
select customerid, dateadd(month, 1, transactiondate), max_transactiondate
from dates
where transactiondate < max_transactiondate
)
select
d.customerid,
d.transactiondate,
coalesce(s.quantity, 0) as quantity
from dates d
left join sales s on s.customerid = d.customerid and s.transactiondate = d.transactiondate
order by d.customerid, d.transactiondate;
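To see the recursive approach produce the requested rows, here is a small sketch that runs an equivalent query in SQLite through Python's sqlite3 module. SQLite has no dateadd, so date(x, '+1 month') is used in its place; that substitution and the in-memory sales table are assumptions of this example.

```python
import sqlite3

# Sample rows from the question: (TransactionDate, CustomerID, Quantity).
SALES = [
    ("2020-01-01", 1234, 5),
    ("2020-07-01", 1234, 9),
    ("2020-03-01", 3241, 8),
    ("2020-07-01", 3241, 4),
]

# The anchor member yields each customer's first month; the recursive member
# keeps adding one month until the customer's last month is reached.
QUERY = """
WITH RECURSIVE dates (customerid, transactiondate, max_transactiondate) AS (
    SELECT customerid, MIN(transactiondate), MAX(transactiondate)
    FROM sales
    GROUP BY customerid
    UNION ALL
    SELECT customerid, date(transactiondate, '+1 month'), max_transactiondate
    FROM dates
    WHERE transactiondate < max_transactiondate
)
SELECT d.transactiondate, d.customerid, COALESCE(s.quantity, 0) AS quantity
FROM dates AS d
LEFT JOIN sales AS s
       ON s.customerid = d.customerid
      AND s.transactiondate = d.transactiondate
ORDER BY d.customerid, d.transactiondate
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (transactiondate TEXT, customerid INTEGER, quantity INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)
filled = conn.execute(QUERY).fetchall()
conn.close()

for row in filled:
    print(row)
```

Like the original, this relies on every date falling on the first of the month, so the generated dates line up exactly with the stored ones in the join.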
This is a convenient place to use a recursive CTE. Assuming all your dates are on the first of the month:
with cr as (
select customerid, min(transactiondate) as mindate, max(transactiondate) as maxdate
from t
group by customerid
union all
select customerid, dateadd(month, 1, mindate), maxdate
from cr
where mindate < maxdate
)
select cr.customerid, cr.mindate as transactiondate, coalesce(t.quantity, 0) as quantity
from cr left join
t
on cr.customerid = t.customerid and
cr.mindate = t.transactiondate;
Here is a db<>fiddle.
Note that if you have more than 100 months to fill in, then you will need option (maxrecursion 0).
Also, this can easily be adapted if the dates are not all on the first of the month. But you would need to explain what the result set should look like in that case.
[EDIT] Based on what others posted, I updated the code.
;with
min_date_cte(MinTransactionDate, MaxTransactionDate) as (
select min(TransactionDate), max(TransactionDate) from tsales),
unq_yrs_cte(year_int) as (
select distinct year(TransactionDate) from tsales),
unq_cust_cte(CustomerID) as (
select distinct CustomerID from tsales)
select datefromparts(uyc.year_int, v.month_int, 1) TransactionDate,
ucc.CustomerID,
isnull(t.Quantity, 0) Quantity
from min_date_cte mdc
cross join unq_yrs_cte uyc
cross join unq_cust_cte ucc
cross join (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(month_int)
left join tsales t on datefromparts(uyc.year_int, v.month_int, 1)=t.TransactionDate
and ucc.CustomerID=t.CustomerId
where
datefromparts(uyc.year_int, v.month_int, 1)>=mdc.MinTransactionDate
and datefromparts(uyc.year_int, v.month_int, 1)<=mdc.MaxTransactionDate;
Results
TransactionDate CustomerID Quantity
2020-01-01 1234 5
2020-01-01 3241 0
2020-02-01 1234 0
2020-02-01 3241 0
2020-03-01 1234 0
2020-03-01 3241 8
2020-04-01 1234 0
2020-04-01 3241 0
2020-05-01 1234 0
2020-05-01 3241 0
2020-06-01 1234 0
2020-06-01 3241 0
2020-07-01 1234 9
2020-07-01 3241 4
You can make use of recursive query:
WITH cte1 as
(
select customerid, min([TransactionDate]) as Monthly_date, max([TransactionDate]) as end_date from calender_table
group by customerid
union all
select customerid, dateadd(month, 1, Monthly_date), end_date from cte1
where Monthly_date < end_date
)
select a.Monthly_date, a.customerid,coalesce(b.quantity, 0) from cte1 a left outer join calender_table b
on (a.Monthly_date = b.[TransactionDate] and a.customerid = b.customerid)
order by a.customerid, a.Monthly_date;

Selecting a single row in the same table/view if a query returns no results

I have the following view in my SQL database, which selects data from a Transaction table and a Customer table:
+-------+-----------+---------------------+--------+
| RowNo | Name | Date | Amount |
+-------+-----------+---------------------+--------+
| 1 | Customer1 | 2018-11-10 01:00:00 | 55.49 |
| 2 | Customer2 | 2018-11-10 02:00:00 | 58.15 |
| 3 | Customer3 | 2018-11-10 03:00:00 | 79.15 |
| 4 | Customer1 | 2018-11-11 04:00:00 | 41.89 |
| 5 | Customer2 | 2018-11-11 05:00:00 | 5.15 |
| 6 | Customer3 | 2018-11-11 06:00:00 | 35.17 |
| 7 | Customer1 | 2018-11-12 07:00:00 | 43.78 |
| 8 | Customer1 | 2018-11-12 08:00:00 | 93.78 |
| 9 | Customer2 | 2018-11-12 09:00:00 | 80.74 |
+-------+-----------+---------------------+--------+
I need an SQL query that will return all a customer's transactions for a given day (easy enough), but then if a customer had no transactions on the given day, the query must return the customer's most recent transaction.
Edit:
The view is as follows:
Create view vwReport as
Select c.Name, t.Date, t.Amount
from Transaction t
inner join Customer c on c.Id = t.CustomerId
And then to get the data I just do a select from the view:
Select * from
vwReport r
where r.Date between '2018-11-10 00:00:00' and '2018-11-11 00:00:00'
So, to clarify, I need one query that returns all the customer transactions for a day, and included in that result set is the last transaction of any customer who doesn't have a transaction on that day. So, in the table above, running the query for 2018-11-12 should return rows 7, 8 and 9, as well as row 6 for Customer3, who did not have a transaction on the 12th.
Take your existing query and UNION ALL it with a "most recent transaction query" for everyone who doesn't have a transaction in that range.
with found as
(
select c.Id, c.Name, t.Date, t.Amount
from Transaction t
inner join Customer c on c.Id = t.CustomerId
where Date between '2018-11-10 00:00:00' and '2018-11-11 00:00:00'
),
unfound as
(
select c.Id, c.Name, t.Date, t.Amount, RANK() OVER (PARTITION BY Name ORDER BY CAST(Date AS DATE) DESC) AS row
from Transaction t
inner join Customer c on c.Id = t.CustomerId
WHERE Date < '2018-11-10 00:00:00'
)
select Name, Date, Amount
from found
union all
select Name, Date, Amount
from unfound
where Id not in ( select Id from found ) and row = 1
If you're interested in selecting multiple rows with ties, you could use the RANK() function to rank each customer's rows by date descending and keep the top-ranked ones:
SELECT * FROM (
SELECT *, RANK() OVER (PARTITION BY Name ORDER BY CAST(Date AS DATE) DESC) AS rn
FROM txntbl
WHERE CAST(Date AS DATE) <= '2018-11-12'
) AS x
WHERE rn = 1
Demo on DB Fiddle
You can use a correlated subquery:
select t.*
from transactions t
where t.date = (select max(t2.date)
from transactions t2
where t2.name = t.name and
t2.date <= @date
);
Note: This only returns customers who had a transaction on or before the date in question.
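A variation on the correlated subquery, sketched here with Python's sqlite3 module, compares calendar days rather than full timestamps so that every transaction on a customer's latest day is kept. The table standing in for vwReport, the RowNo column taken from the question's listing and the helper name report are assumptions for this illustration. If a customer had no transaction on the requested day, all rows from their most recent earlier day are returned, which may be more than one.

```python
import sqlite3

# Rows from the question's view listing: (RowNo, Name, Date, Amount).
ROWS = [
    (1, "Customer1", "2018-11-10 01:00:00", 55.49),
    (2, "Customer2", "2018-11-10 02:00:00", 58.15),
    (3, "Customer3", "2018-11-10 03:00:00", 79.15),
    (4, "Customer1", "2018-11-11 04:00:00", 41.89),
    (5, "Customer2", "2018-11-11 05:00:00", 5.15),
    (6, "Customer3", "2018-11-11 06:00:00", 35.17),
    (7, "Customer1", "2018-11-12 07:00:00", 43.78),
    (8, "Customer1", "2018-11-12 08:00:00", 93.78),
    (9, "Customer2", "2018-11-12 09:00:00", 80.74),
]

# For each customer, find the latest calendar day on or before :day and
# return every row that falls on that day.
QUERY = """
SELECT t.RowNo, t.Name, t.Date, t.Amount
FROM vwReport AS t
WHERE date(t.Date) = (SELECT MAX(date(t2.Date))
                      FROM vwReport AS t2
                      WHERE t2.Name = t.Name
                        AND date(t2.Date) <= :day)
ORDER BY t.RowNo
"""


def report(day):
    """Hypothetical helper: run the query for one ISO day such as '2018-11-12'."""
    conn = sqlite3.connect(":memory:")
    # A plain table stands in for the view here.
    conn.execute("CREATE TABLE vwReport (RowNo INTEGER, Name TEXT, Date TEXT, Amount REAL)")
    conn.executemany("INSERT INTO vwReport VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY, {"day": day}).fetchall()
    conn.close()
    return result


for row in report("2018-11-12"):
    print(row)
```

For 2018-11-12 this yields rows 6, 7, 8 and 9, which is the set the question asks for.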
With the limited information available from the question, the following presents a solution using a join as opposed to a correlated subquery:
select t1.*
from
vwReport t1 inner join
(
select t2.name, max(t2.date) as mdate
from vwReport t2
where t2.date <= @date
group by t2.name
) t3
on t1.name = t3.name and t1.date = t3.mdate
Use UNION for the last date transactions only if there are no transactions for the given dates (BETWEEN '2018-11-10 00:00:00' AND '2018-11-11 00:00:00'):
SELECT * FROM vwReport r
WHERE (r.Date BETWEEN '2018-11-10 00:00:00' AND '2018-11-11 00:00:00')
AND (r.Name = @name)
UNION
SELECT * FROM vwReport r
WHERE (r.Date = (SELECT MAX(r.Date) FROM vwReport r WHERE r.Name = @name))
AND (r.Name = @name)
AND ((SELECT COUNT(*) FROM vwReport r
WHERE (r.Date BETWEEN '2018-11-10 00:00:00' AND '2018-11-11 00:00:00')
AND (r.Name = @name)) = 0)

Need to count certain rows based on date criteria in SQL Server

I'm using SQL Server 2012 and have a table with these 2 columns. I need to count an ORG_ID once ONLY IF the EndDate for every row of that ORG_ID falls within the timeframe of '1-1-2018' to '1-31-2018' (or before, but NOT after). An ORG with an EndDate of NULL should also NOT be in my results.
ORG_ID EndDate
99968042 1/31/2018
99968042 2/14/2018
99968042 2/14/2018
99900699 1/10/2018
99900699 1/10/2018
99900699 1/10/2018
99900699 1/10/2018
99899776 1/20/2018
99843366 12/17/2017
99843366 1/4/2018
99841000 2/1/2016
99651255 NULL
99651255 1/15/2018
The rows that should output are:
99900699
99899776
99843366
I haven't tried anything, because I can't think how to approach it.
So now I've tried this:
select distinct ORG_ID
from ##PLCMT p1
where not exists (
select *
from ##PLCMT p2
where p1.ORG_ID = p2.ORG_ID and
(p1.EndDate <= '2018-01-01' or p1.enddate >= '2018-01-31' or p1.EndDate is NULL)
)
and it still returns an org that has a NULL EndDate; I can't figure out why. ORG_ID 3098376 is in my results, but if I look at all the rows for that ORG_ID, they look like this:
select *
from ##PLCMT
where org_id = '3098376'
results:
ORG_ID EndDate
3098376 2017-09-11
3098376 NULL
3098376 NULL
Use group by and having and NOT IN (to eliminate those who have any NULL value):
SELECT org_id
FROM t
WHERE org_id NOT IN (SELECT org_id FROM t WHERE enddate IS NULL)
GROUP BY org_id
HAVING MAX(enddate) <= '2018-01-31';
You are probably safer with this logic:
having max(enddate) < '2018-02-01'
This works even if enddate has a time component.
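The expected output in the question omits 99841000, whose only EndDate is in 2016, so one possible reading is that an org's latest EndDate must also fall inside January 2018. Under that assumption (it is an interpretation, not something the question states outright), a single GROUP BY with HAVING can cover the NULL check and both bounds. The sketch below runs it in SQLite through Python's sqlite3 module, with the table name plcmt and ISO date strings chosen for the example.

```python
import sqlite3

# Sample rows from the question as (ORG_ID, EndDate), dates in ISO form.
ROWS = [
    (99968042, "2018-01-31"), (99968042, "2018-02-14"), (99968042, "2018-02-14"),
    (99900699, "2018-01-10"), (99900699, "2018-01-10"),
    (99900699, "2018-01-10"), (99900699, "2018-01-10"),
    (99899776, "2018-01-20"),
    (99843366, "2017-12-17"), (99843366, "2018-01-04"),
    (99841000, "2016-02-01"),
    (99651255, None), (99651255, "2018-01-15"),
]

# COUNT(*) counts all rows while COUNT(enddate) skips NULLs, so the two are
# equal only when the org has no NULL EndDate. The MAX bounds require the
# latest EndDate to land in January 2018 (assumed reading of the question).
QUERY = """
SELECT org_id
FROM plcmt
GROUP BY org_id
HAVING COUNT(*) = COUNT(enddate)
   AND MAX(enddate) >= '2018-01-01'
   AND MAX(enddate) <  '2018-02-01'
ORDER BY org_id DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plcmt (org_id INTEGER, enddate TEXT)")
conn.executemany("INSERT INTO plcmt VALUES (?, ?)", ROWS)
org_ids = [row[0] for row in conn.execute(QUERY)]
conn.close()

print(org_ids)
```

Dropping the lower bound on MAX(enddate) gives the looser rule (everything on or before 2018-01-31), which would also return 99841000.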
Another way is with NOT EXISTS()
SELECT DISTINCT org_id
from t
WHERE NOT EXISTS(
SELECT * FROM t t1
WHERE t1.org_id=t.org_id
AND (t1.enddate > '20180131' OR t1.enddate IS NULL)
)

SQL - Find if column dates include at least partially a date range

I need to create a report and I am struggling with the SQL script.
The table I want to query is a company_status_history table which has entries like the following (the ones that I can't figure out)
Table company_status_history
Columns:
| id | company_id | status_id | effective_date |
Data:
| 1 | 10 | 1 | 2016-12-30 00:00:00.000 |
| 2 | 10 | 5 | 2017-02-04 00:00:00.000 |
| 3 | 11 | 5 | 2017-06-05 00:00:00.000 |
| 4 | 11 | 1 | 2018-04-30 00:00:00.000 |
I want to answer the question "Get all companies that have been, at least at some point, in status 1 inside the time period 01/01/2017 - 31/12/2017".
Above are the cases that I don't know how to handle, since I need to add some logic like:
"If this row is status 1 and its date is before the date range check the next row if it has a date inside the date range."
"If this row is status 1 and its date is after the date range check the row before if it has a date inside the date range."
I think this can be handled as a gaps and islands problem. Consider the following input data: (same as sample data of OP plus two additional rows)
id company_id status_id effective_date
-------------------------------------------
1 10 1 2016-12-15
2 10 1 2016-12-30
3 10 5 2017-02-04
4 10 4 2017-02-08
5 11 5 2017-06-05
6 11 1 2018-04-30
You can use the following query:
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
ORDER BY company_id, effective_date
to get:
id company_id status_id effective_date cnt
-----------------------------------------------
1 10 1 2016-12-15 0
2 10 1 2016-12-30 1
3 10 5 2017-02-04 2
4 10 4 2017-02-08 2
5 11 5 2017-06-05 0
6 11 1 2018-04-30 0
Now you can identify status = 1 islands using:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
)
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
Output:
id company_id status_id effective_date grp
-----------------------------------------------
1 10 1 2016-12-15 1
2 10 1 2016-12-30 1
3 10 5 2017-02-04 1
4 10 4 2017-02-08 2
5 11 5 2017-06-05 1
6 11 1 2018-04-30 2
Calculated field grp will help us identify those islands:
;WITH CTE AS
(
SELECT t.id, t.company_id, t.status_id, t.effective_date, x.cnt
FROM company_status_history AS t
OUTER APPLY
(
SELECT COUNT(*) AS cnt
FROM company_status_history AS c
WHERE c.status_id = 1
AND c.company_id = t.company_id
AND c.effective_date < t.effective_date
) AS x
), CTE2 AS
(
SELECT id, company_id, status_id, effective_date,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) -
cnt AS grp
FROM CTE
)
SELECT company_id,
MIN(effective_date) AS start_date,
CASE
WHEN COUNT(*) > 1 THEN DATEADD(DAY, -1, MAX(effective_date))
ELSE MIN(effective_date)
END AS end_date
FROM CTE2
GROUP BY company_id, grp
HAVING COUNT(CASE WHEN status_id = 1 THEN 1 END) > 0
Output:
company_id start_date end_date
-----------------------------------
10 2016-12-15 2017-02-03
11 2018-04-30 2018-04-30
All you need now is those records from above that overlap with the specified interval.
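That last overlap step is not spelled out, so here is a minimal sketch of it. Two closed intervals overlap when each one starts no later than the other ends. The example loads the two islands from the output above into a SQLite table through Python's sqlite3 module; the table name islands and the helper name companies_in_status_1 are hypothetical.

```python
import sqlite3

# The status-1 islands from the output above: (company_id, start_date, end_date).
ISLANDS = [
    (10, "2016-12-15", "2017-02-03"),
    (11, "2018-04-30", "2018-04-30"),
]

# An island overlaps the requested range when it starts on or before the
# range end and ends on or after the range start.
QUERY = """
SELECT DISTINCT company_id
FROM islands
WHERE start_date <= :range_end
  AND end_date   >= :range_start
ORDER BY company_id
"""


def companies_in_status_1(range_start, range_end):
    """Hypothetical helper: companies with a status-1 island touching the range."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE islands (company_id INTEGER, start_date TEXT, end_date TEXT)")
    conn.executemany("INSERT INTO islands VALUES (?, ?, ?)", ISLANDS)
    ids = [row[0] for row in conn.execute(
        QUERY, {"range_start": range_start, "range_end": range_end})]
    conn.close()
    return ids


print(companies_in_status_1("2017-01-01", "2017-12-31"))
```

For the 2017 range in the question, only company 10 qualifies, because its island runs from 2016-12-15 into early February 2017.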
Demo here with somewhat more complicated use case.
Maybe this is what you are looking for? For this kind of question, you need to join two instances of your table. In this case I am just joining with the next record by id, which is probably not entirely correct. To do it better, you can create a new id using a window function such as row_number, ordering the table by your required criteria.
If this row is status 1 and its date is before the date range check
the next row if it has a date inside the date range
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
-- T-SQL has no true/false literals, so 1 and 0 are used as flags
when csh2.effective_date between @range_st and @range_en then 1
else 0
end
else NULL
end
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next row by id
Implementing second criteria:
"If this row is status 1 and its date is after the date range check
the row before if it has a date inside the date range."
declare @range_st date = '2017-01-01'
declare @range_en date = '2017-12-31'
select
case
when csh1.status_id=1 and csh1.effective_date<@range_st
then
case
-- T-SQL has no true/false literals, so 1 and 0 are used as flags
when csh2.effective_date between @range_st and @range_en then 1
else 0
end
when csh1.status_id=1 and csh1.effective_date>@range_en
then
case
when csh3.effective_date between @range_st and @range_en then 1
else 0
end
else null -- ¿?
end
from company_status_history csh1
left join company_status_history csh2
on csh2.id=csh1.id+1 -- csh2 is the next row by id
left join company_status_history csh3
on csh3.id=csh1.id-1 -- csh3 is the previous row by id
I would suggest using a CTE and the window function ROW_NUMBER. With this you can find the desired records. An example:
DECLARE @t TABLE(
id INT
,company_id INT
,status_id INT
,effective_date DATETIME
)
INSERT INTO @t VALUES
(1, 10, 1, '2016-12-30 00:00:00.000')
,(2, 10, 5, '2017-02-04 00:00:00.000')
,(3, 11, 5, '2017-06-05 00:00:00.000')
,(4, 11, 1, '2018-04-30 00:00:00.000')
DECLARE @StartDate DATETIME = '2017-01-01';
DECLARE @EndDate DATETIME = '2017-12-31';
WITH cte AS(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY effective_date) AS rn
FROM @t
),
cteLeadLag AS(
SELECT c.*, ISNULL(c2.effective_date, c.effective_date) LagEffective, ISNULL(c3.effective_date, c.effective_date)LeadEffective
FROM cte c
LEFT JOIN cte c2 ON c2.company_id = c.company_id AND c2.rn = c.rn-1
LEFT JOIN cte c3 ON c3.company_id = c.company_id AND c3.rn = c.rn+1
)
SELECT 'Included' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Following' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
AND effective_date > @EndDate
AND LagEffective BETWEEN @StartDate AND @EndDate
UNION ALL
SELECT 'Trailing' AS RangeStatus, *
FROM cteLeadLag
WHERE status_id = 1
-- rows dated before the range; rows inside it are already 'Included'
AND effective_date < @StartDate
AND LeadEffective BETWEEN @StartDate AND @EndDate
I first select all records with their leading and lagging Dates and then I perform your checks on the inclusion in the desired timespan.
Try this; it is self-explanatory. It responds to this part of your question:
I want to answer to the question "Get all companies that have been at
least for some point in status 1 inside the time period 01/01/2017 -
31/12/2017"
Case where you want the rows in the requested period for companies that have been in status 1 at any moment:
SELECT *
FROM company_status_history
WHERE company_id IN
( SELECT company_id
FROM company_status_history
WHERE status_id=1 )
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'
Case where you want the rows that are in status 1 and inside the period:
SELECT *
FROM company_status_history
WHERE status_id=1
AND effective_date BETWEEN '2017-01-01' AND '2017-12-31'

Combine Two Rows into One with Similar fields (DateTime) and NULL Vales in SQL

Could anyone help me with the request below?
I have one row for the login DateTime and another row for the logout DateTime. The rest of the fields are the same. I need to combine both rows into one with Login (DateTime) and Logout (DateTime).
Sample Data
ID Code DateTime User Status
35 100 1/1/2014 14:50 a IN
35 100 1/1/2014 15:45 a OUT
35 100 1/1/2014 18:20 a IN
35 100 1/1/2014 19:10 a OUT
Result should look like below
ID Code Datetime1 Datetime2 User
35 100 2014-01-01 14:50 2014-01-01 15:45 a
35 100 2014-01-01 18:20 2014-01-01 19:10 a
Thank you.
Use the ROW_NUMBER() windowing function to determine the closest 'OUT' status for each 'IN' iteration:
SELECT * FROM (
SELECT t1.ID, t1.Code, t1.[Datetime] as Datetime1, tNext.[Datetime] as Datetime2, t1.[User],
ROW_NUMBER() OVER (PARTITION BY t1.ID, t1.Code, t1.[User], t1.[Datetime] ORDER BY tNext.[Datetime]) rowNum
FROM myTable t1
JOIN myTable tNext ON
t1.ID = tNext.ID AND
t1.Code = tNext.Code AND
t1.[User] = tNext.[User] AND
tNext.Status = 'OUT' AND
t1.[Datetime] < tNext.[Datetime]
WHERE t1.Status = 'IN' ) t
WHERE rowNum = 1
ORDER BY ID, Code, [User], Datetime1
SQLFiddle here
This finds the next date/time with an 'OUT' after each 'IN' :
(simplified to match small data sample, extra code required)
With YourData as (
SELECT 35 as ID, 100 as Code, '1/1/2014 14:50' as yDatetime,
'a' as yUser, 'IN' AS status UNION ALL
SELECT 35,100, '1/1/2014 15:45', 'a', 'OUT' UNION ALL
SELECT 35,100, '1/1/2014 18:20', 'a', 'IN' UNION ALL
SELECT 35,100, '1/1/2014 19:10', 'a', 'OUT'
)
SELECT
ID,
Code,
yDatetime AS When_IN,
(SELECT Min(yDatetime) FROM YourData yd2
WHERE (yd2.yDatetime>YourData.yDatetime)
AND Status='OUT'
-- extra matching needed here
-- for ID, CODE, User fields in use
) AS When_OUT,
yUser as _User
FROM YourData WHERE Status='IN'
Results :
35 100 1/1/2014 14:50 1/1/2014 15:45 a
35 100 1/1/2014 18:20 1/1/2014 19:10 a
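The "extra matching" mentioned in the comments can be sketched as follows: the subquery is correlated on ID, Code and user as well as on time, so each IN row only sees OUT rows for the same session key. The example runs in SQLite through Python's sqlite3 module; the table name log_events, the column names EventTime and UserName, and the ISO timestamps are assumptions made to avoid reserved words and string-ordering problems.

```python
import sqlite3

# Sample rows from the question: (ID, Code, EventTime, UserName, Status).
EVENTS = [
    (35, 100, "2014-01-01 14:50", "a", "IN"),
    (35, 100, "2014-01-01 15:45", "a", "OUT"),
    (35, 100, "2014-01-01 18:20", "a", "IN"),
    (35, 100, "2014-01-01 19:10", "a", "OUT"),
]

# For every IN row, pick the earliest later OUT row with the same ID, Code
# and user. An IN without a matching OUT yields NULL in Datetime2.
QUERY = """
SELECT i.ID, i.Code, i.EventTime AS Datetime1,
       (SELECT MIN(o.EventTime)
        FROM log_events AS o
        WHERE o.Status = 'OUT'
          AND o.ID = i.ID
          AND o.Code = i.Code
          AND o.UserName = i.UserName
          AND o.EventTime > i.EventTime) AS Datetime2,
       i.UserName
FROM log_events AS i
WHERE i.Status = 'IN'
ORDER BY i.ID, i.Code, i.UserName, i.EventTime
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log_events ("
    "ID INTEGER, Code INTEGER, EventTime TEXT, UserName TEXT, Status TEXT)"
)
conn.executemany("INSERT INTO log_events VALUES (?, ?, ?, ?, ?)", EVENTS)
sessions = conn.execute(QUERY).fetchall()
conn.close()

for session in sessions:
    print(session)
```

Storing the timestamps in a sortable form matters here: strings such as '1/1/2014 14:50' compare character by character and would not order correctly across different days or months.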
Try
select
a.id,
a.code,
a.datetime as datetime1,
b.datetime as datetime2,
a.user
from
(select
id,
code,
datetime,
user
from
table
where
status='IN') a
inner join
(select
id,
code,
datetime,
user
from
table
where
status='OUT') b
on
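(a.user=b.user and a.id=b.id and a.code=b.code
-- pair each IN only with the first OUT that follows it
and b.datetime = (select min(c.datetime) from table c
where c.status='OUT' and c.user=a.user and c.id=a.id
and c.code=a.code and c.datetime > a.datetime))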
Try this:
SELECT lin.ID, lin.CODE, lin.USER, lin.DateTime as LoginDate,
(select top 1 lout.DateTime from TABLE lout
where lout.DateTime > lin.DateTime and lin.id=lout.id
and lin.user = lout.user and lin.code = lout.code and lout.status = 'OUT'
order by lout.dateTime
) as LogOutDate
FROM TABLE lin
where lin.status='IN'