SQL - New customer retention - MoM

I am trying to identify the retention period of the new customers that we acquire every month.
I have identified the new-customer logic from the transactions, but I have no lead on how to compute M1 to M10.
I need to get something like below. To explain the table: in January we acquired 2,500 customers; of those 2,500 new customers only 1,600 transacted in M1 (Feb); of those 1,600 only 1,200 transacted in M2 (Mar); and so on.
Similarly, in February we acquired 2,000 customers; of those only 1,100 transacted in M1 (here M1 refers to Mar), and of those 1,100 only 800 transacted in M2 (here M2 refers to Apr).
M2 is a subset of M1, M3 is a subset of M2, and so on.
I am using SQL Server 2012, and I want to avoid preprocessing the data because of limitations on my role and access.
Any leads on the SQL logic will help.

Based on Gordon's answer, I propose the solution: http://sqlfiddle.com/#!18/f6785/3
select
year(first_yyyymm),
month(first_yyyymm),
count(distinct customer_id) as new_customers,
sum(case when seqnum = 1 then 1 else 0 end) as m1,
sum(case when seqnum = 2 then 1 else 0 end) as m2,
sum(case when seqnum = 3 then 1 else 0 end) as m3,
sum(case when seqnum = 4 then 1 else 0 end) as m4,
sum(case when seqnum = 5 then 1 else 0 end) as m5,
sum(case when seqnum = 6 then 1 else 0 end) as m6,
sum(case when seqnum = 7 then 1 else 0 end) as m7,
sum(case when seqnum = 8 then 1 else 0 end) as m8,
sum(case when seqnum = 9 then 1 else 0 end) as m9,
sum(case when seqnum = 10 then 1 else 0 end) as m10
from
(
select
customer_id,
first_yyyymm, yyyymm,
datediff(month, first_yyyymm, yyyymm) as seqnum
from
(
select
customer_id,
eomonth(created_at) as yyyymm,
min(eomonth(created_at))
over (partition by customer_id) as first_yyyymm
from transactions t
group by customer_id, eomonth(created_at)
) t
) t
group by year(first_yyyymm), month(first_yyyymm)
order by year(first_yyyymm), month(first_yyyymm);
For the data:
The result shall be:
Edit
Here's another solution, counting only those customers with transactions in every consecutive month.
http://sqlfiddle.com/#!18/ad3803/2

I would suggest the following:
Summarize by customer and month.
Get the earliest month a customer appears, using window functions.
Use row_number() to find the last month before the first gap in a customer's consecutive months.
Aggregate.
In SQL, this looks like:
select year(first_yyyymm), month(first_yyyymm),
-- seqnum = 1 is the acquisition month itself, so this counts each new customer once
sum(case when seqnum = 1 then 1 else 0 end) as new_customers,
-- seqnum = k + 1 is the k-th consecutive month after acquisition
sum(case when seqnum = 2 then 1 else 0 end) as m1,
sum(case when seqnum = 3 then 1 else 0 end) as m2,
sum(case when seqnum = 4 then 1 else 0 end) as m3,
sum(case when seqnum = 5 then 1 else 0 end) as m4,
sum(case when seqnum = 6 then 1 else 0 end) as m5,
sum(case when seqnum = 7 then 1 else 0 end) as m6,
sum(case when seqnum = 8 then 1 else 0 end) as m7,
sum(case when seqnum = 9 then 1 else 0 end) as m8,
sum(case when seqnum = 10 then 1 else 0 end) as m9,
sum(case when seqnum = 11 then 1 else 0 end) as m10
from (select customer, eomonth(date) as yyyymm,
min(eomonth(date)) over (partition by customer) as first_yyyymm,
row_number() over (partition by customer order by eomonth(date)) as seqnum
from transactions t
group by customer, eomonth(date)
) t
-- keep only the unbroken run of months that starts at the first month
where datediff(month, first_yyyymm, yyyymm) = seqnum - 1
group by year(first_yyyymm), month(first_yyyymm)
order by min(first_yyyymm);
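To see the consecutive-month filter at work, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is not the SQL Server query itself: SQLite has no eomonth() or datediff(), so the sketch assumes a month index (year * 12 + month) in their place, and the sample rows are made up. In this sketch seqnum = 1 is the acquisition month, so M1 is read from seqnum = 2. It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical toy data (customer, transaction date); not from the original post.
rows = [
    ("a", "2019-01-05"), ("a", "2019-02-10"), ("a", "2019-03-02"),
    ("b", "2019-01-20"), ("b", "2019-03-15"),  # b skips February
    ("c", "2019-02-01"), ("c", "2019-03-09"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table transactions (customer text, date text)")
conn.executemany("insert into transactions values (?, ?)", rows)

query = """
with monthly as (      -- one row per customer and month, as a month index
    select distinct customer,
           cast(strftime('%Y', date) as integer) * 12
             + cast(strftime('%m', date) as integer) as ym
    from transactions
),
numbered as (
    select customer, ym,
           min(ym) over (partition by customer) as first_ym,
           row_number() over (partition by customer order by ym) as seqnum
    from monthly
)
select (first_ym - 1) / 12     as yr,
       (first_ym - 1) % 12 + 1 as mon,
       sum(case when seqnum = 1 then 1 else 0 end) as new_customers,
       sum(case when seqnum = 2 then 1 else 0 end) as m1,
       sum(case when seqnum = 3 then 1 else 0 end) as m2
from numbered
where ym - first_ym = seqnum - 1   -- keep only the unbroken run of months
group by first_ym
order by first_ym
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Customer b transacts in January and March but not February, so b counts as a new January customer and drops out of both M1 and M2, which matches the "M2 is a subset of M1" requirement.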

Related

How could I adapt this query to work over multiple years?

This query pulls data from a VistaDB and produces info on the number of courses started in each month of the year from people in different countries.
Select c.CountryName As Country,
Count (case When Month( ch.CourseStarted ) = 1 Then 1 End) As Jan19,
Count (case when Month(ch.CourseStarted ) = 2 Then 1 End) as Feb19,
Count (case When Month(ch.CourseStarted ) = 3 Then 1 End) as Mar19,
Count (case When Month(ch.CourseStarted ) = 4 Then 1 End) as Apr19,
Count (case When Month(ch.CourseStarted ) = 5 Then 1 End) as May19,
Count (case When Month(ch.CourseStarted ) = 6 Then 1 End) as Jun19,
Count (case When Month(ch.CourseStarted ) = 7 Then 1 End) as Jul19,
Count (case When Month(ch.CourseStarted ) = 8 Then 1 End) as Aug19,
Count (case When Month(ch.CourseStarted ) = 9 Then 1 End) as Sep19,
Count (case When Month(ch.CourseStarted ) = 10 Then 1 End) as Oct19,
Count (case When Month(ch.CourseStarted ) = 11 Then 1 End)as Nov19,
Count (case When Month(ch.CourseStarted ) = 12 Then 1 End) as Dec19
From Country As c
Inner Join CourseHistory As ch On c.Oid = ch.Country
Where (ch.CourseStarted >= '2019-01-01' And
ch.CourseStarted <= '2019-12-31')
Group By c.CountryName
Order by c.CountryName;
My question is: would it be possible to make this semi-dynamic, so that if I changed the final date in the WHERE clause to '2022-12-31' I would get a raft of columns, one for each month of each year?

How to calculate a Cumulative total using SQL

I have a Tickets table in my database; each ticket has a status_id (1, 2, 3):
1: Ticket in progress
2: Ticket out of time
3: Ticket closed
I want to use SQL to calculate the number of tickets for each status.
I also want the cumulative total for each status as of a specific date. I already have a column affectation_Date that contains the date on which the ticket's status was changed.
Use conditional aggregation as
SELECT TicketID,
AffectationDate,
SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) InProgress,
SUM(CASE WHEN StatusID = 2 THEN 1 ELSE 0 END) OutOfTime,
SUM(CASE WHEN StatusID = 3 THEN 1 ELSE 0 END) Closed,
COUNT(1) Total
FROM Tickets
GROUP BY TicketID,
AffectationDate
ORDER BY TicketID,
AffectationDate;
Or if you want to GROUP BY AffectationDate only
SELECT AffectationDate,
SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) TotalInProgress,
SUM(CASE WHEN StatusID = 2 THEN 1 ELSE 0 END) TotalOutOfTime,
SUM(CASE WHEN StatusID = 3 THEN 1 ELSE 0 END) TotalClosed,
COUNT(1) TotalStatusThisDate
FROM Tickets
GROUP BY AffectationDate
ORDER BY AffectationDate;
Using conditional counts.
SELECT affectation_Date,
COUNT(CASE WHEN status_id = 1 THEN 1 END) AS TotalInProgress,
COUNT(CASE WHEN status_id = 2 THEN 1 END) AS TotalOutOfTime,
COUNT(CASE WHEN status_id = 3 THEN 1 END) AS TotalClosed
FROM Tickets t
GROUP BY affectation_Date
ORDER BY affectation_Date
You can add whatever filter condition you need for the date criteria:
SELECT COUNT(1), STATUS
FROM tickets
WHERE affectation_Date >= 'someDate'
group by status
Regards
You just need to group by status and count the number of tickets in each group:
select status, count(*) as number
from Tickets
where dt >= '2019-01-01 00:00:00' and dt < '2019-01-02 00:00:00'
group by status
having status >= 1 and status <= 3
This adds the Cumulative Sum to the existing answers:
SELECT AffectationDate,
Sum(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) AS TotalInProgress,
Sum(CASE WHEN StatusID = 2 THEN 1 ELSE 0 END) AS TotalOutOfTime,
Sum(CASE WHEN StatusID = 3 THEN 1 ELSE 0 END) AS TotalClosed,
Count(*) as TotalStatusThisDate,
Sum(Sum(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END)) Over (ORDER BY AffectationDate) AS cumTotalInProgress,
Sum(Sum(CASE WHEN StatusID = 2 THEN 1 ELSE 0 END)) Over (ORDER BY AffectationDate) AS cumTotalOutOfTime,
Sum(Sum(CASE WHEN StatusID = 3 THEN 1 ELSE 0 END)) Over (ORDER BY AffectationDate) AS cumTotalClosed,
Sum(Count(*)) Over (ORDER BY AffectationDate) AS cumTotalStatusThisDate
FROM Tickets
GROUP BY AffectationDate
ORDER BY AffectationDate;
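The running totals can be checked on a small data set. The sketch below is an illustration only, run on SQLite through Python rather than on the original database. It assumes the column names StatusID and AffectationDate used in the answers, invents six sample tickets, and splits the nested Sum(Sum(...)) Over (...) into two steps (aggregate per date first, then apply the window), which should give the same running totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Tickets (TicketID integer, StatusID integer, AffectationDate text)"
)
# Hypothetical sample tickets; not from the original post.
conn.executemany("insert into Tickets values (?, ?, ?)", [
    (1, 1, "2019-01-01"), (2, 1, "2019-01-01"), (3, 3, "2019-01-01"),
    (4, 2, "2019-01-02"), (5, 3, "2019-01-02"),
    (6, 1, "2019-01-03"),
])

query = """
with daily as (        -- step 1: conditional aggregation per date
    select AffectationDate,
           sum(case when StatusID = 1 then 1 else 0 end) as InProgress,
           count(*) as Total
    from Tickets
    group by AffectationDate
)
select AffectationDate,  -- step 2: running sums over the daily counts
       InProgress,
       sum(InProgress) over (order by AffectationDate) as cumInProgress,
       sum(Total)      over (order by AffectationDate) as cumTotal
from daily
order by AffectationDate
"""
cumulative = conn.execute(query).fetchall()
for row in cumulative:
    print(row)
```

With ORDER BY in the window and no explicit frame, the default frame runs from the first row up to the current row, which is what makes each sum cumulative.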

SQL Server Query with join and merge two row into single row of record

I have sample tables like these
I would like the final result of my query to look like this
I have no clue how to write a SQL Server query to achieve the result described above. Would you mind guiding me on how to make it work?
Regards,
Assuming you have at most two rows, you can use row_number() to enumerate the values and conditional aggregation (or pivot, if you prefer):
select m.movementid, m.arrflt, m.depflt,
-- max() rather than sum() for des, so this also works if des is a text column
max(case when seqnum = 1 then des end) as des_1,
sum(case when seqnum = 1 then cargo else 0 end) as cargo_1,
sum(case when seqnum = 1 then mail else 0 end) as mail_1,
sum(case when seqnum = 1 then luggage else 0 end) as luggage_1,
max(case when seqnum = 2 then des end) as des_2,
sum(case when seqnum = 2 then cargo else 0 end) as cargo_2,
sum(case when seqnum = 2 then mail else 0 end) as mail_2,
sum(case when seqnum = 2 then luggage else 0 end) as luggage_2
from movement m join
(select md.*,
row_number() over (partition by movementid order by movementid) as seqnum
from movementdetail md
) md
on md.movementid = m.movementid
group by m.movementid, m.arrflt, m.depflt;

How to make multiple rows into columns

I've tried MAX CASE WHEN and CTE but for some reason can't exactly figure this out.
My data looks like this:
SELECT RC, isMHy, eligible
FROM test
RC isMHY eligible
190B05 0 1
190K00 1 0
There can be up to 4 rows in the table. I want the results to look like this (12 columns when there are 4 rows):
RC1 isMHY1 eligible1 RC2 isMHY2 eligible2
190B05 0 1 190K00 1 0
Any suggestions would be appreciated
You can use conditional aggregation with ROW_NUMBER() :
SELECT MAX(CASE WHEN s.rnk = 1 THEN s.rc END) as rc1,
MAX(CASE WHEN s.rnk = 1 THEN s.ismhy END) as ismhy1,
MAX(CASE WHEN s.rnk = 1 THEN s.eligible END) as eligible1,
MAX(CASE WHEN s.rnk = 2 THEN s.rc END) as rc2,
MAX(CASE WHEN s.rnk = 2 THEN s.ismhy END) as ismhy2,
MAX(CASE WHEN s.rnk = 2 THEN s.eligible END) as eligible2,
..........
FROM(
SELECT t.*,
ROW_NUMBER() OVER(ORDER BY (SELECT 1)) as rnk
FROM test t) s
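As a quick check of the ROW_NUMBER() plus MAX(CASE ...) pattern, here is a small sketch on SQLite through Python, using the two sample rows from the question. One assumption differs from the answer: the rows are numbered by RC instead of an arbitrary order, so the output is deterministic. Only the first two row slots are spelled out; slots 3 and 4 would follow the same shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table test (RC text, isMHY integer, eligible integer)")
conn.executemany(
    "insert into test values (?, ?, ?)",
    [("190B05", 0, 1), ("190K00", 1, 0)],  # the two sample rows from the question
)

query = """
select max(case when rnk = 1 then RC end)       as rc1,
       max(case when rnk = 1 then isMHY end)    as ismhy1,
       max(case when rnk = 1 then eligible end) as eligible1,
       max(case when rnk = 2 then RC end)       as rc2,
       max(case when rnk = 2 then isMHY end)    as ismhy2,
       max(case when rnk = 2 then eligible end) as eligible2
from (
    -- number the rows; ordering by RC is an assumption made for determinism
    select t.*, row_number() over (order by RC) as rnk
    from test t
) s
"""
pivoted = conn.execute(query).fetchall()
print(pivoted)
```

Each CASE returns a value for exactly one row and NULL for the rest, so MAX() simply picks out that one value; a slot with no matching row comes back as NULL.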

sql subquery that collects from 3 rows

I have a huge database with over 4 million rows that look like this:
Customer ID Shop
1 Asda
1 Sainsbury
1 Tesco
2 TEsco
2 Tesco
I need to count the customers who, within the last 4 weeks, have shopped in all 3 shops: Tesco, Sainsbury and Asda. Can you please advise whether it's possible to do this with subqueries?
This is an example of a "set-within-sets" subquery. You can solve it with aggregation:
select customer_id
from Yourtable t
where <shopping date within last four weeks>
group by customer_id
having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
sum(case when shop = 'Sainsbury' then 1 else 0 end) > 0 and
sum(case when shop = 'Tesco' then 1 else 0 end) > 0;
This structure is quite flexible. So if you wanted Asda and Tesco but not Sainsbury, then you would do:
select customer_id
from Yourtable t
where <shopping date within last four weeks>
group by customer_id
having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
sum(case when shop = 'Sainsbury' then 1 else 0 end) = 0 and
sum(case when shop = 'Tesco' then 1 else 0 end) > 0;
EDIT:
If you want a count, then use this as a subquery and count the results:
select count(*)
from (select customer_id
from Yourtable t
where <shopping date within last four weeks>
group by customer_id
having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
sum(case when shop = 'Sainsbury' then 1 else 0 end) > 0 and
sum(case when shop = 'Tesco' then 1 else 0 end) > 0
) t
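The counting version can be tried end to end with a small sketch on SQLite through Python. Everything about the date handling is an assumption, since the original leaves the date condition as a placeholder: the sketch invents a shop_date column, sample dates, and a fixed reference date of 2019-03-31 in place of "today", so that the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Yourtable (customer_id integer, shop text, shop_date text)"
)
# Hypothetical rows; shop_date is an assumed column, not shown in the question.
conn.executemany("insert into Yourtable values (?, ?, ?)", [
    (1, "Asda", "2019-03-10"), (1, "Sainsbury", "2019-03-12"), (1, "Tesco", "2019-03-20"),
    (2, "Tesco", "2019-03-11"), (2, "Tesco", "2019-03-18"),
    (3, "Asda", "2019-01-05"),  # outside the four-week window
    (3, "Sainsbury", "2019-03-15"), (3, "Tesco", "2019-03-16"),
])

as_of = "2019-03-31"  # fixed reference date, assumed for reproducibility

query = """
select count(*)
from (select customer_id
      from Yourtable t
      where shop_date >= date(?, '-28 days')
      group by customer_id
      having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
             sum(case when shop = 'Sainsbury' then 1 else 0 end) > 0 and
             sum(case when shop = 'Tesco' then 1 else 0 end) > 0
     ) t
"""
customers_in_all_three = conn.execute(query, (as_of,)).fetchone()[0]
print(customers_in_all_three)
```

Customer 3 has visited all three shops overall, but the Asda visit falls before the window starts, so only customer 1 is counted. This shows why the date filter belongs in WHERE, before the grouping, rather than in HAVING.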