Display # of customers per month that have previous Sale date > 3 months and # of these customers that have a "Sale Date" in the given month - sql

Basically, my requirement is: for a given month, how many customers had their previous "Sale date" more than 3 months before the given month, and of these customers, how many have a "Sale date" in the given month?
I tried using the LAG function, but my "Reactivated_Guests" column always returns null.
SELECT datepart(month,["sale date"]) "Sale_Month",
count(distinct ["user id"]) "Lost_Guests",
lag("Guests",4) OVER (ORDER BY "Sale_Month")+
lag("Guests",5) OVER (ORDER BY "Sale_Month")+
lag("Guests",6) OVER (ORDER BY "Sale_Month")+
lag("Guests",7) OVER (ORDER BY "Sale_Month")+
lag("Guests",8) OVER (ORDER BY "Sale_Month")+
lag("Guests",9) OVER (ORDER BY "Sale_Month")+
lag("Guests",10) OVER (ORDER BY "Sale_Month")+
lag("Guests",11) OVER (ORDER BY "Sale_Month")+
lag("Guests",12) OVER (ORDER BY "Sale_Month") "Reactivated_Guests"
group by "Sale_Month"
order by "Sale_Month"
My expected output is month-wise # of guests that have their previous "Sale date" greater than 3 months before the given month (Lost_Guests) and of these customers how many have a "Sale date" in the given month (Reactivated_Guests)
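Expected Result:
Sale_Month | Lost_Guests | Reactivated_Guests
-----------|-------------|-------------------
June       | 1,200       | 110
July       | 1,800       | 130
Aug        | 1,900       | 140
(Lost_Guests: previous Sale date > 3 months. Reactivated_Guests: previous Sale date > 3 months and a Sale date in the given month.)
Actual Result:
Sale_Month | Lost_Guests | Reactivated_Guests
-----------|-------------|-------------------
June       | 1,200       | null
July       | 1,800       | null
Aug        | 1,900       | null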
Sample Data:
Customer | Sale Date
---------|-----------
AAAAA    | 11/15/2018
BBBBB    | 11/16/2018
CCCCC    | 9/23/2018
CCCCC    | 1/25/2019
AAAAA    | 3/16/2019
CCCCC    | 3/18/2019
For the given month of March, AAAAA is to be counted in "Lost_Guests" because AAAAA's previous sale date (11/15/2018) is more than 3 months before the given month (March 2019), and in "Reactivated_Guests" because AAAAA has a sale date in the given month (March 2019).
For the given month of March, CCCCC is not to be counted in "Lost_Guests" because its previous sale date (1/25/2019) is less than 3 months before the given month (March 2019), and hence it does not appear in "Reactivated_Guests" either.

This addresses the original version of the question.
You seem to want something like this:
select sale_month, count(distinct user_id) as guests,
count(distinct case when min_sale_date < sale_date - interval '3 month' then user_id end) as old_guests
from (select t.*,
min(sale_date) over (partition by user_id) as min_sale_date
from t
) t
group by sale_month
order by sale_month;
Note that date functions are very database dependent, so the exact syntax might vary depending on your database.
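For the revised wording of the question (previous sale more than 3 months before the month, then a sale in the month), here is a minimal sketch of one possible reading. It uses Python's built-in sqlite3 as a stand-in database, so the date functions differ from SQL Server or PostgreSQL. The table name `sales`, the column names, and the list of report months are assumptions made for illustration; the rows are the sample data from the question, rewritten as ISO dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("AAAAA", "2018-11-15"),
        ("BBBBB", "2018-11-16"),
        ("CCCCC", "2018-09-23"),
        ("CCCCC", "2019-01-25"),
        ("AAAAA", "2019-03-16"),
        ("CCCCC", "2019-03-18"),
    ],
)

QUERY = """
WITH months(month_start) AS (
    -- hypothetical list of report months
    VALUES ('2019-01-01'), ('2019-02-01'), ('2019-03-01')
),
last_before AS (
    -- latest sale of each customer before the report month starts
    SELECT m.month_start, s.customer, MAX(s.sale_date) AS prev_sale
    FROM months m
    JOIN sales s ON s.sale_date < m.month_start
    GROUP BY m.month_start, s.customer
),
lost AS (
    -- "lost": that latest sale is more than 3 months before the month
    SELECT month_start, customer
    FROM last_before
    WHERE prev_sale < date(month_start, '-3 months')
)
SELECT l.month_start,
       COUNT(DISTINCT l.customer) AS lost_guests,
       COUNT(DISTINCT s.customer) AS reactivated_guests
FROM lost l
LEFT JOIN sales s
       ON s.customer = l.customer
      AND s.sale_date >= l.month_start
      AND s.sale_date < date(l.month_start, '+1 month')
GROUP BY l.month_start
ORDER BY l.month_start
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On the sample data, March 2019 counts AAAAA and BBBBB as lost and only AAAAA as reactivated, while CCCCC is excluded because of the 1/25/2019 sale. Months with no lost guests (February here) produce no row under this reading.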

Related

POSTGRESQL - SUM of all orders with the same customer for a given month

I'm trying to produce a report with the total invoice amount for each customer in the month of December :
date       | customer | invoice amount
-----------|----------|---------------
01/12/2021 | AB1      | 40
02/11/2021 | AB2      | 60
12/12/2021 | CE6      | 1000
31/12/2021 | RF9      | 0.5
Could I get any pointers? I'm still fairly new to postgresql.
You should use GROUP BY for your purposes.
SELECT customer, SUM(invoice_amount) as total_invoice_amount
FROM your_table
WHERE EXTRACT(MONTH FROM date) = 12
GROUP BY customer
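Note that filtering on the month number alone matches December of every year in the table. A minimal sketch of the same GROUP BY idea restricted to one December via a half-open date range is shown below. It runs on Python's built-in sqlite3 rather than PostgreSQL, and the table name `invoices` and column names are assumptions. The rows are the question's sample data, read as DD/MM/YYYY and stored as ISO dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_date TEXT, customer TEXT, invoice_amount REAL)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [
        ("2021-12-01", "AB1", 40),
        ("2021-11-02", "AB2", 60),   # November, so excluded below
        ("2021-12-12", "CE6", 1000),
        ("2021-12-31", "RF9", 0.5),
    ],
)

# Half-open range: from 1 December 2021 up to, but not including, 1 January 2022.
rows = conn.execute(
    """
    SELECT customer, SUM(invoice_amount) AS total_invoice_amount
    FROM invoices
    WHERE invoice_date >= '2021-12-01' AND invoice_date < '2022-01-01'
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()
for row in rows:
    print(row)
```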

Count distinct customers who bought in previous period and not in next period Bigquery

I have a dataset in bigquery which contains order_date: DATE and customer_id.
order_date | CustomerID
2019-01-01 | 111
2019-02-01 | 112
2020-01-01 | 111
2020-02-01 | 113
2021-01-01 | 115
2021-02-01 | 119
I am trying to count distinct customer_id values between a month of the previous year and the same month of the current year (for example, from 2019-01-01 to 2020-01-01, then from 2019-02-01 to 2020-02-01), and then count which of those customers did not buy in the same period of the following year (2020-01-01 to 2021-01-01, then 2020-02-01 to 2021-02-01).
The output I expect:
order_date| count distinct CustomerID|who not buy in the next period
2020-01-01| 5191 |250
2020-02-01| 4859 |500
2020-03-01| 3567 |349
..........| .... |......
The next periods should not include the previous ones.
I tried the code below, but it does not behave this way:
with customers as (
select distinct date_trunc(date(order_date),month) as dates,
CUSTOMER_WID
from t
where date(order_date) between '2018-01-01' and current_date()-1
)
select
dates,
customers_previous,
customers_next_period
from
(
select dates,
count(CUSTOMER_WID) as customers_previous,
count(case when customer_wid_next is null then 1 end) as customers_next_period,
from (
select prev.dates,
prev.CUSTOMER_WID,
next.dates as next_dates,
next.CUSTOMER_WID as customer_wid_next
from customers as prev
left join customers
as next on next.dates=date_add(prev.dates,interval 1 year)
and prev.CUSTOMER_WID=next.CUSTOMER_WID
) as t2
group by dates
)
order by 1,2
Thanks in advance.
If I understand correctly, you are trying to count values over a window of time, and for that I recommend using window functions (docs here, and here is a great article explaining how they work).
That said, my recommendation would be:
SELECT DISTINCT
periods,
COUNT(DISTINCT customer_id) OVER last_12_mos AS count_customers_last_12_mos
FROM (
SELECT
order_date,
FORMAT_DATE('%Y%m', order_date) AS periods,
customer_id
FROM dataset
)
WINDOW last_12_mos AS ( # window of last 12 months without current month
PARTITION BY periods ORDER BY periods DESC
ROWS BETWEEN 12 PRECEDING AND 1 PRECEDING
)
I believe from this you can build some customizations to improve the aggregations you want.
You can generate the periods using unnest(generate_date_array()). Then use joins to bring in the customers from the previous 12 months and the next 12 months. Finally, aggregate and count the customers:
select period,
count(distinct c_prev.customer_wid),
count(distinct c_next.customer_wid)
from unnest(generate_date_array(date '2020-01-01', date '2021-01-01', interval 1 month)) period join
customers c_prev
on c_prev.order_date <= period and
c_prev.order_date > date_add(period, interval -12 month) left join
customers c_next
on c_next.customer_wid = c_prev.customer_wid and
c_next.order_date > period and
c_next.order_date <= date_add(period, interval 12 month)
group by period;
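The join approach above can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 instead of BigQuery, so the period list is a hard-coded CTE rather than generate_date_array, and the date arithmetic uses SQLite's date() function. The table name `orders` is an assumption, and one hypothetical row (customer 111 on 2020-06-01) is added to the question's sample so that a repeat purchase exists. Because the question asks for customers who did not buy in the next period, the second count is taken as a difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2019-01-01", 111),
        ("2019-02-01", 112),
        ("2020-01-01", 111),
        ("2020-02-01", 113),
        ("2020-06-01", 111),  # hypothetical extra row, not in the question
        ("2021-01-01", 115),
        ("2021-02-01", 119),
    ],
)

QUERY = """
WITH periods(period) AS (
    VALUES ('2020-01-01'), ('2020-02-01')
)
SELECT p.period,
       COUNT(DISTINCT c_prev.customer_id) AS customers_prev,
       COUNT(DISTINCT c_prev.customer_id)
         - COUNT(DISTINCT c_next.customer_id) AS not_in_next_period
FROM periods p
JOIN orders c_prev
  ON c_prev.order_date <= p.period
 AND c_prev.order_date > date(p.period, '-12 months')
LEFT JOIN orders c_next
  ON c_next.customer_id = c_prev.customer_id
 AND c_next.order_date > p.period
 AND c_next.order_date <= date(p.period, '+12 months')
GROUP BY p.period
ORDER BY p.period
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

For 2020-01-01 the previous window holds customers 111 and 112; only 111 buys again within the following 12 months, so one customer is counted as not returning.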

Extract list of Sellers with TTM > 100k for the last 3 months in SQL (PostgreSQL)

Quick question:
I want to extract a list of sellers with TTM sales > 100k for the last 3 months. I need a snapshot of the data for the last 3-4 months:
Sellers which have TTM sales > 100k in January
Sellers which have TTM sales > 100k in Febr
Sellers which have TTM sales > 100k in March
I want to build it dynamically, so that if I want to extract the data for 6 months I only need to change > Jan to > Oct, and then store the data in a table.
Eg. Jan - 100 sellers, Febr. 200 sellers, March - 75 Sellers
Table used:
Seller list (id, marketplace)
Sales (id, marketplace, sale_day, sale_amount)
Output 1 - TTM:
Output 2 - TTM (only sellers > 100k):
This last output has to be dynamic. I want to get the last 3 months of data based on a "RUN_DATE", per eligible seller.
TTM = Trailing Twelve Months (sales per seller on the last 12 months)
eg. Sales in Jan (Jan 2020 - Dec 2020)
Sales in Febr (Febr 2020 - Jan 2021)
This has to be filtered by 100k.
My current logic is to take a snapshot per month (Jan, Febr, March) and union them.
I suspect you want aggregation:
select id, date_trunc('month', sale_day) as yyyymm,
sum(sale_amount) as total_sales
from sales
group by yyyymm, id;
Then for the last 3 months and filtering, you would do:
select id, date_trunc('month', sale_day) as yyyymm,
sum(sale_amount) as total_sales
from sales s
where s.sale_day >= date_trunc('month', now()) - interval '3' month
group by yyyymm, id
group by yyyymm, id
having sum(sale_amount) > 100000;
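The queries above total sales per calendar month. For a trailing-twelve-month figure per snapshot month, one possible approach is to join each snapshot month to the 12 months of sales before it. The sketch below is an illustration only: it runs on Python's built-in sqlite3 rather than PostgreSQL, the snapshot list and all sales rows are made-up values, and the marketplace column is left out for brevity. The window follows the question's definition (the January snapshot covers Jan 2020 to Dec 2020).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, sale_day TEXT, sale_amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        # hypothetical rows for illustration
        (1, "2020-03-15", 60000),
        (1, "2020-11-10", 50000),
        (2, "2020-01-20", 150000),
        (3, "2020-12-05", 30000),
    ],
)

QUERY = """
WITH snapshots(snapshot_month) AS (
    -- hypothetical snapshot months; extend this list to cover more months
    VALUES ('2021-01-01'), ('2021-02-01'), ('2021-03-01')
)
SELECT m.snapshot_month, s.id, SUM(s.sale_amount) AS ttm_sales
FROM snapshots m
JOIN sales s
  ON s.sale_day >= date(m.snapshot_month, '-12 months')
 AND s.sale_day < m.snapshot_month
GROUP BY m.snapshot_month, s.id
HAVING SUM(s.sale_amount) > 100000
ORDER BY m.snapshot_month, s.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Seller 2 qualifies only for the January snapshot, because its single sale on 2020-01-20 falls out of the trailing window from February onward.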

Calculating sales on daily basis comparing the previous sales

How can I calculate sales on the basis of date comparing the previous, current, and upcoming dates?
order date | total qty
------------------------------
02/01/2021 | 5
02/04/2021 | 10
02/06/2021 | 7
02/08/2021 | 10
02/10/2021 | 2
Your bucket column could be given by:
CONCAT(
DATE_PART('day', AGE('2021-02-01', orderdate))*7+1,
'-',
(DATE_PART('day', AGE('2021-02-01', orderdate))+1)*7,
' days'
)
Your cumulative total by:
SUM(total) OVER(PARTITION BY DATE_PART('day', AGE('2021-02-01', orderdate)) ORDER BY orderdate)
A sum has an implied "range between unbounded preceding and current row" frame if it has an order by.
I presume you're starting your report somewhere (e.g. your front end does where orderdate > x), so it can supply the min date for the functions too. If it doesn't, then you might benefit from a CTE that computes the min orderdate.
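To illustrate the implied frame, here is a minimal sketch of a running total over the question's sample rows. It uses Python's built-in sqlite3 (window functions need SQLite 3.25 or later) rather than PostgreSQL. The table name `daily_sales` and the column names are assumptions, the dates are assumed to be MM/DD/YYYY, and the bucket partitioning from the expressions above is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (order_date TEXT, total_qty INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [
        ("2021-02-01", 5),
        ("2021-02-04", 10),
        ("2021-02-06", 7),
        ("2021-02-08", 10),
        ("2021-02-10", 2),
    ],
)

# With ORDER BY and no explicit frame, SUM() accumulates from the first row
# of the partition up to the current row.
rows = conn.execute(
    """
    SELECT order_date,
           total_qty,
           SUM(total_qty) OVER (ORDER BY order_date) AS running_qty
    FROM daily_sales
    ORDER BY order_date
    """
).fetchall()
for row in rows:
    print(row)
```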

record for last two month and their difference in oracle

I need the variance for the last two months, and I am using the query below:
with Positions as
(
select
COUNT(DISTINCT A_SALE||B_SALE) As SALES,
TO_CHAR(DATE,'YYYY-MON') As Period
from ORDERS
where DATE between date '2020-02-01' and date '2020-02-29'
group by TO_CHAR(DATE,'YYYY-MON')
union all
select
COUNT(DISTINCT A_SALE||B_SALE) As SALES,
TO_CHAR(DATE,'YYYY-MON') As Period
from ORDERS
where DATE between date '2020-03-01' and date '2020-03-31'
group by TO_CHAR(DATE,'YYYY-MON')
)
select
SALES,
period,
case when to_char(round((SALES-lag(SALES,1, SALES) over (order by period desc))/ SALES*100,2), 'FM999999990D9999') <0
then to_char(round(abs( SALES-lag(SALES,1, SALES) over (order by period desc))/ SALES*100,2),'FM999999990D9999')||'%'||' (Increase) '
when to_char(round((SALES-lag(SALES,1,SALES) over (order by period desc))/SALES*100,2),'FM999999990D9999')>0
then to_char(round(abs(SALES-lag(SALES,1, SALES) over (order by period desc ))/SALES*100,2),'FM999999990D9999')||'%'||' (Decrease) '
END as variances
from Positions
order by Period;
I am getting output like this:
SALES | Period | variances
---------|------------------|--------------------
100 | 2020-FEB | 100%(Increase)
200 | 2020-MAR | NULL
I want the records to look like this, with the variance shown next to March instead of February, since we are looking at the variance for the latest month:
SALES | Period | variances
---------|------------------|--------------------
200 | 2020-MAR | 100%(Increase)
100 | 2020-FEB | NULL
I did not analyze the query in too much detail but you have one obvious flaw.
You change your period from a date to char.
That means when you apply your window functions your ordering will not work as expected.
A date ordered desc will look like this (chronological ordering):
MAR - 2020
FEB - 2020
JAN - 2020
Text ordered desc will look like (based on alphabetical ordering)
MAR - 2020
JAN - 2020
FEB - 2020
That being said, you are comparing a 'good' case (FEB + MAR) where both the text ordering and date ordering will work the same way.
The implied ordering is ASCENDING. So at the end when you do
order by Period;
it will display February first and then March. If you do
order by Period DESC;
you will get March displayed first.
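To get the variance next to the latest month, one option is to order the window by a real date in ascending order, so that LAG looks back at the previous month, and to sort only the final output in descending order. The sketch below is an illustration on Python's built-in sqlite3 (window functions need SQLite 3.25 or later), not Oracle. The table `monthly_sales` and its columns are assumptions, and the percentage is computed relative to the previous month, which is a simplification of the formula in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (period_start TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [("2020-02-01", 100), ("2020-03-01", 200)],
)

# The window is ordered by the date ascending, so LAG() returns the prior month.
# Only the outer ORDER BY is descending, which puts the latest month first.
rows = conn.execute(
    """
    SELECT period_start,
           sales,
           100.0 * (sales - LAG(sales) OVER (ORDER BY period_start))
                 / LAG(sales) OVER (ORDER BY period_start) AS pct_change
    FROM monthly_sales
    ORDER BY period_start DESC
    """
).fetchall()
for row in rows:
    print(row)
```

The earliest month has no previous row, so its change is NULL, which matches the layout requested in the question.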