Find repeat customers after making their transaction in a month - sql

I have some sample data (data1):
Date User Orderid
12-02-2020 A 50274
13-02-2020 B 34704
18-02-2020 A 12079
01-03-2020 C 69711
13-03-2020 B 36813
01-04-2020 D 57321
Customer A made the first transaction in Feb and another transaction in same month.
Customer B made the first transaction in Feb and made a transaction again in March.
How can I identify customer acquisition per month and those customers' orders in the following months?
month | customers_acquired | made_transcation_in_month+1 | made_transaction_in_month+2
2 2 1 0
3 1 0 0
4 1 0 0
In the above result, in month 2, two customers made their first transactions and one of them made another in the next month.
In month 3, one new customer made a transaction and never made another. The same goes for month 4.

select year(date) as "year"
,month(Date) as "month"
,count(new_customer_cnt) as customers_acquired
,count("repeat_customers+1") as "made_transcation_in_month+1"
,count("repeat_customers+2") as "made_transcation_in_month+2"
from (
select *
,case when "User" <> lag("User") over(order by "User", Date) or lag("User") over(order by "User", Date) is null then 1 end as new_customer_cnt
,case when "User" = lead("User") over(partition by "User" order by Date) and month(dateadd(month, 1, date)) = lead(month(date)) over(partition by "User" order by Date) then 1 end as "repeat_customers+1"
,case when "User" = lead("User") over(partition by "User" order by Date) and month(dateadd(month, 2, date)) = lead(month(date)) over(partition by "User" order by Date) then 1 end as "repeat_customers+2"
from t
) t
group by year(date), month(Date)
year | month | customers_acquired | made_transcation_in_month+1 | made_transcation_in_month+2
2020 2 2 1 0
2020 3 1 0 0
2020 4 1 0 0
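As a cross-check on the expected result, the same cohort table can be derived by fixing each customer's first month and then counting distinct customers who ordered again one or two months later. This is only a sketch under assumptions: it runs SQLite through Python's sqlite3 module rather than the dialect used above, and the table name orders, its column names, and the ISO date strings are invented for the example.

```python
import sqlite3

# Hypothetical table and column names; dates are stored as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, user_id TEXT, order_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2020-02-12", "A", 50274),
        ("2020-02-13", "B", 34704),
        ("2020-02-18", "A", 12079),
        ("2020-03-01", "C", 69711),
        ("2020-03-13", "B", 36813),
        ("2020-04-01", "D", 57321),
    ],
)

COHORT_SQL = """
WITH monthly AS (
    -- one row per customer and calendar month, as a running month index
    SELECT DISTINCT user_id,
           CAST(strftime('%Y', order_date) AS INTEGER) * 12
         + CAST(strftime('%m', order_date) AS INTEGER) AS ym
    FROM orders
),
firsts AS (
    -- the month in which each customer was acquired
    SELECT user_id, MIN(ym) AS first_ym
    FROM monthly
    GROUP BY user_id
)
SELECT (f.first_ym - 1) % 12 + 1 AS month,
       COUNT(DISTINCT f.user_id) AS customers_acquired,
       COUNT(DISTINCT CASE WHEN m.ym = f.first_ym + 1 THEN m.user_id END) AS plus_1,
       COUNT(DISTINCT CASE WHEN m.ym = f.first_ym + 2 THEN m.user_id END) AS plus_2
FROM firsts f
JOIN monthly m ON m.user_id = f.user_id
GROUP BY f.first_ym
ORDER BY f.first_ym
"""

cohort_rows = conn.execute(COHORT_SQL).fetchall()
for row in cohort_rows:
    print(row)
```

Because the month index keeps counting across year boundaries, a December acquisition followed by a January order would still count as month+1.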

Related

Find records 1 year older/newer than dates given in the same column, for each ID#

I need to find if a customer has a subscription the previous year and the following year, and how many subscriptions were new or were canceled the following year.
Sample data:
ID | Subscription year
1 2010
1 2011
1 2019
2 2011
2 2012
3 2010
Thinking of this approach: subtracting and adding 1 to the subscription year and seeing if the customer ID has another row that corresponds (ex. if no rows for year+1, customer had canceled the next year). Hoping for something like this:
ID | Subscription year | SubscribedPreviousYear | SubscribedNextYear
1 2010 F T
1 2011 T F
1 2019 F F
2 2011 F T
2 2012 T F
3 2010 F F
Then counting the F's in SubscribedPreviousYear as new subscriptions (they are counted as new if customer did not have one the immediate previous year, even for existing customers) and F's in SubscribedNextYear as canceled subscriptions, to get something like this:
Year | New (# F's in SubscribedPreviousYear) | Canceled (# F's in SubscribedNextYear)
2010 2 1
2011 1 1
2012 0 1
2019 1 1
I had tried this code, modified from a similar MySQL question, but got 'F' for all rows.
select
t1.Id, cast(t1.year as date),
IIF((select count(*) from table t2
where t1.Id=t2.Id and
datediff(y, t2.year, t1.year)=1) <1, 'T','F')
as SubscribedPreviousYear
from table t;
I would use LEAD() and LAG() here:
SELECT id, year,
CASE WHEN LAG(year) OVER (PARTITION BY id ORDER BY year) = year - 1
THEN 'T' ELSE 'F' END AS SubscribedPreviousYear,
CASE WHEN LEAD(year) OVER (PARTITION BY id ORDER BY year) = year + 1
THEN 'T' ELSE 'F' END AS SubscribedNextYear
FROM yourTable
ORDER BY id, year;
To get the final result, we can aggregate by year:
WITH cte AS (
SELECT *,
LAG(year) OVER (PARTITION BY id ORDER BY year) AS year_lag,
LEAD(year) OVER (PARTITION BY id ORDER BY year) AS year_lead
FROM yourTable
)
SELECT year,
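-- LAG()/LEAD() are NULL on a customer's first/last row, so test for NULL explicitly
COUNT(CASE WHEN year_lag IS NULL OR year != year_lag + 1 THEN 1 END) AS [New],
COUNT(CASE WHEN year_lead IS NULL OR year != year_lead - 1 THEN 1 END) AS Cancelled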
FROM cte
GROUP BY year
ORDER BY year;
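One detail worth checking when aggregating LAG()/LEAD() results: on a customer's first row LAG() is NULL and on the last row LEAD() is NULL, and a comparison with NULL is neither true nor false, so those rows need an explicit IS NULL test to be counted. Here is a minimal sketch on the sample data, assuming SQLite (via Python's sqlite3) instead of SQL Server; the table name subs and the column aliases are hypothetical.

```python
import sqlite3

# Hypothetical table name; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subs (id INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO subs VALUES (?, ?)",
    [(1, 2010), (1, 2011), (1, 2019), (2, 2011), (2, 2012), (3, 2010)],
)

YEARLY_SQL = """
WITH cte AS (
    SELECT id, year,
           LAG(year)  OVER (PARTITION BY id ORDER BY year) AS year_lag,
           LEAD(year) OVER (PARTITION BY id ORDER BY year) AS year_lead
    FROM subs
)
SELECT year,
       -- new: no row for the immediately preceding year (including no earlier row at all)
       COUNT(CASE WHEN year_lag IS NULL OR year_lag <> year - 1 THEN 1 END) AS new_subs,
       -- canceled: no row for the immediately following year (including no later row at all)
       COUNT(CASE WHEN year_lead IS NULL OR year_lead <> year + 1 THEN 1 END) AS canceled
FROM cte
GROUP BY year
ORDER BY year
"""

yearly_counts = conn.execute(YEARLY_SQL).fetchall()
for row in yearly_counts:
    print(row)
```

The rows returned match the New/Canceled table the question asks for.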

Rank customer Transactions per segments in SQL Server

I have the table below, which holds customers' transaction details.
Transaction date | CustomerID
1/27/2022 1
1/29/2022 1
2/27/2022 1
3/27/2022 1
3/29/2022 1
3/31/2022 1
4/2/2022 1
4/4/2022 1
4/6/2022 1
In this table, consecutive transactions that occur two days apart are considered a segment.
For example, the transactions between Jan 27th and Jan 29th are segment 1, and the transactions between Mar 29th and Apr 6th are segment 2. I need to rank the transactions within each segment in date order. If a transaction does not fall into any segment, its rank defaults to 1. The expected output is below.
Segment Rank | Transaction date | CustomerID
1 1/27/2022 1
2 1/29/2022 1
1 2/27/2022 1
1 3/27/2022 1
2 3/29/2022 1
3 3/31/2022 1
4 4/2/2022 1
5 4/4/2022 1
6 4/6/2022 1
Can somebody guide me on how to achieve this in T-SQL?
Use lag() to check whether each TransDate is within 2 days of the previous one, and group such rows together (as a segment). After that, use row_number() to generate the required sequence:
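with
cte as
(
    -- g = 1 marks the start of a new segment, 0 continues the current one
    select *,
           g = case when datediff(day,
                                  lag(t.TransDate) over (order by t.TransDate),
                                  t.TransDate
                                 ) <= 2
                    then 0
                    else 1
               end
    from tbl t
),
cte2 as
(
    -- running total of segment starts gives each segment its own number
    select *, grp = sum(g) over (order by TransDate)
    from cte
)
select *, SegmentRank = row_number() over (partition by grp order by TransDate)
from cte2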
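The flag, running-sum, row_number steps above can be tried out on the sample data. The sketch below is an illustration only, assuming SQLite through Python's sqlite3 (so julianday() stands in for datediff), ISO-formatted dates, and the hypothetical column names trans_date and customer_id; it also partitions by customer, which the single-customer sample does not strictly need.

```python
import sqlite3

# Hypothetical schema; dates from the question rewritten as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (trans_date TEXT, customer_id INTEGER)")
sample_dates = [
    "2022-01-27", "2022-01-29", "2022-02-27", "2022-03-27", "2022-03-29",
    "2022-03-31", "2022-04-02", "2022-04-04", "2022-04-06",
]
conn.executemany("INSERT INTO tbl VALUES (?, 1)", [(d,) for d in sample_dates])

SEGMENT_SQL = """
WITH flagged AS (
    -- is_start = 1 when the gap to the previous transaction is more than 2 days
    -- (or there is no previous transaction, where the comparison yields NULL)
    SELECT customer_id, trans_date,
           CASE WHEN julianday(trans_date)
                   - julianday(LAG(trans_date) OVER (PARTITION BY customer_id
                                                     ORDER BY trans_date)) <= 2
                THEN 0 ELSE 1 END AS is_start
    FROM tbl
),
grouped AS (
    -- running count of segment starts = segment number
    SELECT *, SUM(is_start) OVER (PARTITION BY customer_id
                                  ORDER BY trans_date) AS grp
    FROM flagged
)
SELECT ROW_NUMBER() OVER (PARTITION BY customer_id, grp
                          ORDER BY trans_date) AS segment_rank,
       trans_date, customer_id
FROM grouped
ORDER BY customer_id, trans_date
"""

segment_rows = conn.execute(SEGMENT_SQL).fetchall()
segment_ranks = [r[0] for r in segment_rows]
print(segment_ranks)
```

On this data the ranks come out in the same order as the expected output in the question.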

SQL question - how to output using iterative date logic in SQL Server

I have the following sample table (provided with single ID for simplicity - need to perform the same logic across all IDs)
ID Visit_date
-----------------
ABC 8/7/2019
ABC 9/10/2019
ABC 9/12/2019
ABC 10/1/2019
ABC 10/1/2019
ABC 10/8/2019
ABC 10/15/2019
ABC 10/17/2019
ABC 10/24/2019
Here is what I need to get the sample output
Mark the first visit as 1 in the "new_visit" column
Compare the subsequent dates with the 1st date until it exceeds 21 days condition. Example Sep 10 is compared to Aug 7 and it doesn’t fall within 21 days of Aug 7, therefore this is considered as another new_visit, so mark new_visit as 1
Then we compare Sep 10 with the subsequent dates with 21 days criteria and mark all of them as follow_up of Sep 10 visit. Eg. Sep 12, Oct 1 are within 21 days of Sep 10; hence they are considered as follow up visits, so mark "follow_up" as 1
When the subsequent date exceeds 21 days criteria of the previous new visit (e.g. Oct 8 compared to Sep 10) then Oct 8 will be considered a new visit & mark "New_visit" as 1 and the subsequent dates will be compared against Oct 8
Sample Output :
Dates        New_Visit  Follow_up
-----------------------------------
8/7/2019     1
9/10/2019    1
9/12/2019               1
10/1/2019               1
10/1/2019               1
10/8/2019    1
10/15/2019              1
10/17/2019              1
10/24/2019              1
You need a recursive query for this.
You would enumerate the rows, then walk through the dataset by ascending date, while keeping track of the first visit date of each group; when the interval since the last first visit exceeds 21 days, the date of the first visit resets, and a new group starts.
with
data as (
    -- number each patient's visits in date order
    select t.*, row_number() over(partition by id order by visit_date) rn
    from mytable t
),
cte as (
    select id, visit_date, rn, visit_date first_visit_date
    from data
    where rn = 1
    union all
    -- carry the group's first visit date forward, resetting it after 21 days
    select d.id, d.visit_date, d.rn,
        case when d.visit_date > dateadd(day, 21, c.first_visit_date) then d.visit_date else c.first_visit_date end
    from cte c
    inner join data d on d.id = c.id and d.rn = c.rn + 1
)
select
    id,
    visit_date,
    case when visit_date = first_visit_date then 1 else 0 end as is_new,
    case when visit_date = first_visit_date then 0 else 1 end as is_follow_up
from cte
If a patient may have more than 100 visits, then you need to add option (maxrecursion 0) at the very end of the query.
You need a recursive CTE to handle this. This is the idea, although the exact syntax might vary by database:
with recursive t as (
    select id, date,
           row_number() over (partition by id order by date) as seqnum
    from yourtable
),
cte as (
    select id, date, seqnum, date as visit_start, 1 as is_new_visit
    from t
    where seqnum = 1
    union all
    select cte.id, t.date, t.seqnum,
           (case when t.date <= cte.visit_start + interval '21 day'
                 then cte.visit_start else t.date
            end) as visit_start,
           (case when t.date <= cte.visit_start + interval '21 day'
                 then 0 else 1
            end) as is_new_visit
    from cte join
         t
         on t.id = cte.id and t.seqnum = cte.seqnum + 1
)
select *
from cte
where is_new_visit = 1;
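The walk described in the answers can be checked against the sample visits. The following is a rough sketch, assuming SQLite via Python's sqlite3 (julianday() differences replace dateadd/interval arithmetic), ISO date strings, and hypothetical names (visits, anchor_date, is_new). It carries an explicit is_new flag rather than comparing dates, so a duplicate date on a new-visit day would not be flagged twice.

```python
import sqlite3

# Hypothetical table name; visit dates from the question as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id TEXT, visit_date TEXT)")
sample_visits = [
    "2019-08-07", "2019-09-10", "2019-09-12", "2019-10-01", "2019-10-01",
    "2019-10-08", "2019-10-15", "2019-10-17", "2019-10-24",
]
conn.executemany("INSERT INTO visits VALUES ('ABC', ?)", [(d,) for d in sample_visits])

WALK_SQL = """
WITH RECURSIVE
data AS (
    SELECT id, visit_date,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY visit_date) AS rn
    FROM visits
),
walk AS (
    -- the first visit of each id starts the first group
    SELECT id, visit_date, rn, visit_date AS anchor_date, 1 AS is_new
    FROM data
    WHERE rn = 1
    UNION ALL
    -- step to the next visit; restart the group when it is more than
    -- 21 days after the current group's first visit
    SELECT d.id, d.visit_date, d.rn,
           CASE WHEN julianday(d.visit_date) - julianday(w.anchor_date) > 21
                THEN d.visit_date ELSE w.anchor_date END,
           CASE WHEN julianday(d.visit_date) - julianday(w.anchor_date) > 21
                THEN 1 ELSE 0 END
    FROM walk w
    JOIN data d ON d.id = w.id AND d.rn = w.rn + 1
)
SELECT visit_date, is_new, 1 - is_new AS is_follow_up
FROM walk
ORDER BY id, rn
"""

visit_rows = conn.execute(WALK_SQL).fetchall()
new_flags = [r[1] for r in visit_rows]
print(new_flags)
```

Note that Oct 1 is exactly 21 days after Sep 10 and is treated as a follow-up here, matching the sample output in the question.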

db2 compare year and month side by side

I need to compare the companies' values side by side: current year vs. last year, and current month vs. the same month of the previous year.
I use this query to get the values
SELECT STORE, SUM(TOTAL) as VAL, DATE FROM MYTABLE
WHERE DATE=CURRENT_DATE GROUP BY STORE ORDER BY STORE
Below are the results:
STORE | VAL | DATE
1 10 CURRENT_DATE (2018-27-03)
1 20 2018-26-03
1 30 2018-25-03
2 20 CURRENT_DATE (2018-27-03)
2 20 2018-26-02
and I need this:
STORE | VALUE CURRENT YEAR | VALUE LAST YEAR
1 60 30 (CALCULATED)
2 40 50 (CALCULATED)
STORE | VALUE CURRENT MONTH | VALUE SAME MONTH OF LAST YEAR
1 60 30 (CALCULATED)
2 20 50 (CALCULATED)
Thank you
You could just join two sub-selects together.
E.g. with this DDL and data:
CREATE TABLE MYTABLE (STORE int, VAL int, D DATE);
INSERT INTO MYTABLE VALUES
( 1, 10, '2018-03-27')
,( 1, 20, '2018-03-26')
,( 1, 10, '2018-02-25')
,( 1, 35, '2017-03-25')
,( 2, 20, '2018-03-27')
,( 2, 15, '2017-03-26');
This will get you the current-month and same-month-last-year values:
SELECT C.*, LY.VAL_CURR_MONTH_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_MONTH
FROM MYTABLE WHERE INT(D)/100=INT(CURRENT_DATE)/100
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE INT(D)/100 = INT(CURRENT_DATE)/100 -100
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
Then this for years
SELECT C.*, LY.VAL_LY
FROM (
SELECT STORE, SUM(VAL) as VAL_CURR_YEAR
FROM MYTABLE WHERE INT(D)/10000=INT(CURRENT_DATE)/10000
GROUP BY STORE ) AS C
LEFT JOIN
(SELECT STORE
, SUM(VAL) AS VAL_LY
FROM MYTABLE
WHERE INT(D)/10000 = INT(CURRENT_DATE)/10000 -1
GROUP BY STORE ) LY
ON
C.STORE = LY.STORE
P.S. there are many other ways to manipulate dates, but casting to INT is maybe one of the easier ways
Also, here is a more flexible way to get the "Same Month of Last Year" value. A similar method can get "last Year" values.
SELECT T.*
, AVG(VAL) OVER(
PARTITION BY STORE
ORDER BY YEAR_MONTH
RANGE BETWEEN 100 PRECEDING AND 100 PRECEDING
) AS SAME_MONTH_PREV_YEAR
FROM
( SELECT STORE
, INTEGER(D)/100 AS YEAR_MONTH
, SUM(VAL) AS VAL
FROM
MYTABLE T
GROUP BY
STORE
, INTEGER(D)/100
) AS T
;
Gives
STORE YEAR_MONTH VAL SAME_MONTH_PREV_YEAR
----- ---------- --- --------------------
1 201703 35 NULL
1 201802 10 NULL
1 201803 30 35
2 201703 15 NULL
2 201803 20 15
It is better to avoid applying functions to table columns in WHERE clauses. See the following queries, which are based on P. Vernon's sample table.
Note: These SQLs are for DB2 LUW 11.1
For month:
SELECT STORE,
SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CURR_MONTH,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_CURR_MONTH_LY
FROM MYTABLE
WHERE D between first_day(current date) and last_day(current date)
or D between first_day(current date - 1 year) and last_day(current date - 1 year)
GROUP BY STORE
ORDER BY STORE
For year:
SELECT STORE, SUM(CASE WHEN YEAR(D) = year(current date) THEN val
ELSE 0 END) as VAL_CY,
SUM(CASE WHEN YEAR(D) = year(current date) - 1 THEN vaL
ELSE 0 END) as VAL_LY
FROM MYTABLE
WHERE D between first_day(current date - (month(current date) - 1) months)
and last_day(current date + (12 - month(current date)) months)
or D between first_day(current date - (month(current date) - 1) months - 1 year)
and last_day(current date + (12 - month(current date)) months - 1 year)
GROUP BY STORE
ORDER BY STORE
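The idea of filtering on plain date ranges and splitting the sums with CASE can be illustrated on the sample table. This sketch is only an approximation of the Db2 queries: it assumes SQLite through Python's sqlite3, uses SQLite's date() modifiers in place of first_day/last_day, and passes a fixed reference date (a hypothetical :today parameter set to 2018-03-27) instead of the current date so the result is reproducible.

```python
import sqlite3

# Sample rows from the DDL above; column D stored as an ISO string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (store INTEGER, val INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 10, "2018-03-27"), (1, 20, "2018-03-26"), (1, 10, "2018-02-25"),
        (1, 35, "2017-03-25"), (2, 20, "2018-03-27"), (2, 15, "2017-03-26"),
    ],
)

MONTH_SQL = """
SELECT store,
       -- rows at or after the start of the reference month belong to this year
       SUM(CASE WHEN d >= date(:today, 'start of month') THEN val ELSE 0 END)
           AS val_curr_month,
       -- the remaining rows can only be the same month of last year
       SUM(CASE WHEN d <  date(:today, 'start of month') THEN val ELSE 0 END)
           AS val_curr_month_ly
FROM mytable
WHERE (d >= date(:today, 'start of month')
       AND d < date(:today, 'start of month', '+1 month'))
   OR (d >= date(:today, 'start of month', '-1 year')
       AND d < date(:today, 'start of month', '+1 month', '-1 year'))
GROUP BY store
ORDER BY store
"""

month_rows = conn.execute(MONTH_SQL, {"today": "2018-03-27"}).fetchall()
for row in month_rows:
    print(row)
```

The WHERE clause compares the bare column against computed bounds, which is what leaves an index on the date column usable.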

Selecting a row based on a field value on the 1st of the month in SQL

In the below table I want to select a row where "Days" = 1 but the account should have Days = 0 on the 1st of the month.
Account| Date | Days
-------|------|-----
A | 1/3/2015 | 0
A | 5/3/2015 | 1
A | 9/3/2015 | 10
B | 1/3/2015 | 30
B | 3/3/2015 | 1
B | 6/3/2015 | 12
The result should be 2nd row
A 5/3/2015 1
On the 1st A has 0 days but B has 30 days hence I want only account A
This code is for Oracle, but take a look. The idea is:
SELECT *
FROM ACCOUNTS
WHERE DAYS = 1
AND ACCOUNT IN (SELECT ACCOUNT
FROM ACCOUNTS
WHERE ACC_DATE = TRUNC (ACC_DATE, 'MONTH') AND DAYS = 0)
Try to convert to MSSQL and run.
Try this: use the window function max with case to flag whether the account has a row on the 1st of the month with Days = 0; if it does, return the second row for that account in that month, using row_number in order of increasing date.
select *
from (
select
t.*,
max(case when day(date) = 1 and Days = 0 then 1 end)
over (partition by Account, month(Date), year(Date)) flag,
row_number() over (
partition by Account, month(Date), year(Date)
order by Date
) rn
from your_table t
) t where flag = 1 and rn = 2
You could try this:
SELECT account, date, days
FROM table_name t
WHERE days = 1
AND EXISTS (SELECT 1
FROM table_name
WHERE account = t.account
AND date = DATEADD(month, DATEDIFF(month, 0, t.date), 0)
--cast(date As Date) = DATEADD(month, DATEDIFF(month, 0, t.date), 0)
AND days = 0);
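The EXISTS approach can be checked against the sample rows. The sketch below makes several assumptions: SQLite via Python's sqlite3 (date(x, 'start of month') stands in for the DATEADD/DATEDIFF expression), a hypothetical table accounts with columns account, acc_date, and days, and the question's dates read as day/month/year, so 1/3/2015 becomes 2015-03-01.

```python
import sqlite3

# Hypothetical schema; dates interpreted as d/m/yyyy and stored as ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account TEXT, acc_date TEXT, days INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        ("A", "2015-03-01", 0), ("A", "2015-03-05", 1), ("A", "2015-03-09", 10),
        ("B", "2015-03-01", 30), ("B", "2015-03-03", 1), ("B", "2015-03-06", 12),
    ],
)

EXISTS_SQL = """
SELECT a.account, a.acc_date, a.days
FROM accounts a
WHERE a.days = 1
  AND EXISTS (
        -- same account, first day of the same month, with days = 0
        SELECT 1
        FROM accounts f
        WHERE f.account = a.account
          AND f.acc_date = date(a.acc_date, 'start of month')
          AND f.days = 0
      )
ORDER BY a.account, a.acc_date
"""

matching_rows = conn.execute(EXISTS_SQL).fetchall()
print(matching_rows)
```

Only account A qualifies, because B has 30 days on the 1st of the month.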