Extract list of Sellers with TTM > 100k for the last 3 months in SQL (PostgreSQL)

Quick question:
I want to extract a list of sellers with TTM sales > 100k for each of the last 3 months. I need a snapshot of the data for the last 3-4 months:
Sellers which have TTM sales > 100k in January
Sellers which have TTM sales > 100k in February
Sellers which have TTM sales > 100k in March
I want to build this dynamically, so that if I want to extract the data for 6 months I only need to change "> Jan" to "> Oct", and then store the data in a table.
E.g. January - 100 sellers, February - 200 sellers, March - 75 sellers
Table used:
Seller list (id, marketplace)
Sales (id, marketplace, sale_day, sale_amount)
Output 1 - TTM:
Output 2 - TTM (only sellers > 100k):
This last output has to be dynamic: I want to get the last 3 months of data per eligible seller, based on a "RUN_DATE".
TTM = Trailing Twelve Months (sales per seller over the preceding 12 months)
E.g. Sales in January (Jan 2020 - Dec 2020)
Sales in February (Feb 2020 - Jan 2021)
These have to be filtered by 100k.
My current logic is to take a snapshot per month (January, February, March) and union them.

I suspect you want aggregation:
select id, date_trunc('month', sale_day) as yyyymm,
       sum(sale_amount) as total_sales
from sales
group by yyyymm, id;
Then for the last 3 months and filtering, you would do:
select id, date_trunc('month', sale_day) as yyyymm,
       sum(sale_amount) as total_sales
from sales s
where s.sale_day >= date_trunc('month', now()) - interval '3' month
group by yyyymm, id
having sum(sale_amount) > 100000;
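The queries above aggregate per calendar month. If the trailing-twelve-month window from the question is needed, one possible shape is sketched below. It is only an illustration under stated assumptions: it runs against an in-memory SQLite database through Python so it can be executed as-is, SQLite's date(..., 'start of month') stands in for PostgreSQL's date_trunc, and the function name ttm_snapshots, its parameters and the sample rows are made up for the example. Each snapshot month sums the 12 full months before it, matching "Sales in January (Jan 2020 - Dec 2020)".

```python
import sqlite3

def ttm_snapshots(conn, run_date, months_back=3, threshold=100000):
    """Return (snapshot_month, seller_id, ttm_sales) for sellers above the threshold."""
    sql = """
    WITH RECURSIVE snap(n, month_start) AS (
        -- latest snapshot is the month containing run_date
        SELECT 0, date(:run_date, 'start of month')
        UNION ALL
        -- step back one month at a time until months_back snapshots exist
        SELECT n + 1, date(month_start, '-1 month')
        FROM snap
        WHERE n + 1 < :months_back
    )
    SELECT snap.month_start, s.id, SUM(s.sale_amount) AS ttm_sales
    FROM snap
    JOIN sales AS s
      ON s.sale_day >= date(snap.month_start, '-12 months')  -- 12 full months before the snapshot
     AND s.sale_day <  snap.month_start
    GROUP BY snap.month_start, s.id
    HAVING SUM(s.sale_amount) > :threshold
    ORDER BY snap.month_start, s.id
    """
    params = {"run_date": run_date, "months_back": months_back, "threshold": threshold}
    return conn.execute(sql, params).fetchall()

# Hypothetical sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER, marketplace TEXT, sale_day TEXT, sale_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "US", "2020-01-10", 60000),
        (1, "US", "2020-06-10", 50000),
        (2, "US", "2020-02-20", 120000),
        (3, "US", "2021-01-05", 150000),
    ],
)

rows = ttm_snapshots(conn, "2021-03-15")
for row in rows:
    print(row)
```

Extending the range to 6 months is then a matter of passing months_back=6. In PostgreSQL the month list could probably be produced with generate_series instead of the recursive CTE, and the result stored with INSERT INTO ... SELECT.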

Related

Is there a way to count distinct from first record to last day of each month? BigQuery

I am trying to compute the total customer base from 2018-01-01 to the last day of each month this year, to get a month-on-month view. For instance, for January 2022 it would be the count of distinct customers from 2018-01-01 to 2022-01-31; for February 2022, the count of distinct customers from 2018-01-01 to 2022-02-28. Could someone enlighten me?
select count(distinct customername) from table
where billingdate between "2018-01-01" and "2022-01-31";
Currently, I only get the result for the first month.
I think you are expecting a cumulative, month-wise customer count.
Example: in Jan 2018 the customer count is 10 and in Feb 2018 the count is 20:
jan 2018 - 10
feb 2018 - 20
What you need is:
jan 2018 - 10
feb 2018 - 30 <--
In this case, group by month and use an OVER clause to get the cumulative count:
select year_month_date,
       sum(customer_count) over (order by year_month_date
                                 rows between unbounded preceding and current row) as cumulative_customer_count_from_jan_2018
from (
  select year_month_date, count(distinct customername) as customer_count
  from (
    select date(extract(year from billingdate), extract(month from billingdate), 1) as year_month_date,
           customername
    from table
  ) as t
  group by year_month_date
)
where year_month_date >= date(2018, 1, 1)
order by year_month_date
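One caveat with a running sum of per-month distinct counts: a customer who is billed in several months is counted once in each of them. If the goal is the number of distinct customers seen so far, a possible variation is to count each customer only in the month of their first billing and then take the running sum. The sketch below is an assumption-laden illustration, not BigQuery code: it uses SQLite through Python so it can be run directly, and the table name billing and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billing (customername TEXT, billingdate TEXT)")
conn.executemany(
    "INSERT INTO billing VALUES (?, ?)",
    [
        ("ann", "2018-01-05"),
        ("bob", "2018-01-20"),
        ("ann", "2018-02-03"),  # repeat customer, should not be counted again
        ("cat", "2018-02-11"),
        ("dan", "2018-04-02"),
    ],
)

# Window functions need SQLite 3.25 or newer.
sql = """
WITH first_seen AS (
    -- one row per customer, holding the date they first appear
    SELECT customername, MIN(billingdate) AS first_date
    FROM billing
    WHERE billingdate >= '2018-01-01'
    GROUP BY customername
),
monthly AS (
    -- customers that are new in each month
    SELECT strftime('%Y-%m', first_date) AS year_month, COUNT(*) AS new_customers
    FROM first_seen
    GROUP BY year_month
)
SELECT year_month,
       SUM(new_customers) OVER (
           ORDER BY year_month
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS cumulative_customers
FROM monthly
ORDER BY year_month
"""
cumulative = conn.execute(sql).fetchall()
for row in cumulative:
    print(row)
```

Months with no new customers (2018-03 in this sample) produce no row here; joining against a calendar of months would be needed to show them.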

How to append more rows to a table depending on a condition in BigQuery

I have two tables one with volume data and other with price data. Table with price data at times might not match the period covered in the table with volume data. For example volumes are available for the period starting from Jan 1 2022 to Dec 31 2023, but the price data is available only up until Dec 31 2022. However, eventually it will get populated at some point. Thus, I want to copy Q4 2022 prices from the price table and append it 4 times to the main table labeled as Q1 2023, Q2 2023, Q3 2023 and Q4 2023. But I only want to do it when a condition is met, and that condition is if the period in two tables match or not.
So essentially this:
SELECT year, quarter, SKU, price FROM prices_table
UNION ALL
SELECT 2023 as year, 1 as quarter, SKU, price FROM prices_table WHERE year = 2022 and quarter = 4
UNION ALL
SELECT 2023 as year, 2 as quarter, SKU, price FROM prices_table WHERE year = 2022 and quarter = 4
and so on until I have full 4 quarters of 2023 prices.
I want the script to first check whether prices_table already has that data. Basically, if year * 10 + quarter from volume_table > year * 10 + quarter from prices_table, then I want the script to handle it as described above. But if year * 10 + quarter from volume_table = year * 10 + quarter from prices_table, I don't want the script to do anything, since all the prices I need are already available in prices_table.
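One way to read the condition described above is "for every period that exists in volume_table but lies after the latest period in prices_table, reuse the latest available prices". Under that reading the check does not need a separate IF: the set of missing periods is simply empty when the two tables already match. The sketch below illustrates the idea in SQLite through Python so it can be run directly. It is not BigQuery syntax, the function name filled_prices and the sample rows are hypothetical, and the volume column is assumed.

```python
import sqlite3

FILL_SQL = """
WITH last_price AS (
    -- latest period that has prices, encoded as year * 10 + quarter
    SELECT MAX(year * 10 + quarter) AS period FROM prices_table
),
missing AS (
    -- periods present in volume_table but later than the latest priced period
    SELECT DISTINCT v.year, v.quarter
    FROM volume_table AS v
    CROSS JOIN last_price AS lp
    WHERE v.year * 10 + v.quarter > lp.period
)
SELECT m.year, m.quarter, p.SKU, p.price
FROM missing AS m
CROSS JOIN last_price AS lp
JOIN prices_table AS p ON p.year * 10 + p.quarter = lp.period
UNION ALL
SELECT year, quarter, SKU, price FROM prices_table
ORDER BY 1, 2, 3
"""

def filled_prices(conn):
    """Return all prices, with the latest prices copied into any missing periods."""
    return conn.execute(FILL_SQL).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE volume_table (year INTEGER, quarter INTEGER, SKU TEXT, volume INTEGER);
CREATE TABLE prices_table (year INTEGER, quarter INTEGER, SKU TEXT, price REAL);
INSERT INTO volume_table VALUES
    (2022, 4, 'A', 10), (2023, 1, 'A', 11), (2023, 2, 'A', 12),
    (2023, 3, 'A', 13), (2023, 4, 'A', 14);
INSERT INTO prices_table VALUES (2022, 3, 'A', 4.5), (2022, 4, 'A', 5.0);
""")

filled = filled_prices(conn)
for row in filled:
    print(row)
```

Once prices_table catches up with volume_table, the missing CTE returns no rows and the query yields prices_table unchanged.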

Count total without duplicate records

I have a table that contains the following columns: TrackingStatus, Year, Month, Order, Notes
I need to calculate the total number of tracking status for each year and month.
For example, if the table contains the following orders:
TrackingStatus  Year  Month  Order  Notes
F               2020  1      33
F               2020  1      33     DFF
E               2020  2      36     xxx
A               2021  3      34     X1
A               2021  3      34     DD
A               2021  3      88
A               2021  2      45
The result should be:
• Tracking F , year 2020, month 1 the total will be one (because it's the same year, month, and order).
• Tracking A , year 2021, month 2 the total will be one. (because there is only one record with the same year, month, and order).
• Tracking A , year 2021, month 3 the total will be two. (because there are two orders within the same year and month).
So the expected SELECT output will be like that:
TrackingStatus  Year  Month  Total
F               2020  1      1
E               2020  2      1
A               2021  2      1
A               2021  3      2
I was trying to use group by but then it will count the number of records which in my scenario is wrong.
How can I get the total orders for each month without counting “duplicate” records?
Thank you
You can use the COUNT(DISTINCT ...) aggregate: COUNT counts the values, and DISTINCT makes each value count only once per group.
SELECT TrackingStatus,
       Year,
       Month,
       COUNT(DISTINCT `Order`) AS Total
FROM tab
GROUP BY TrackingStatus,
         Year,
         Month
ORDER BY Year,
         Month
This was tested in MySQL, though it should work in many DBMSs. Note that Order is a reserved word, so it has to be quoted in whatever style your database uses.
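To see the difference between counting records and counting distinct orders, here is a small runnable check using the sample rows from the question. It is only a sketch: it uses SQLite through Python rather than MySQL, so the reserved column name Order is quoted with double quotes, and the extra RecordCount column is added purely for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tab (TrackingStatus TEXT, Year INTEGER, Month INTEGER, '
    '"Order" INTEGER, Notes TEXT)'
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?)",
    [
        ("F", 2020, 1, 33, None),
        ("F", 2020, 1, 33, "DFF"),
        ("E", 2020, 2, 36, "xxx"),
        ("A", 2021, 3, 34, "X1"),
        ("A", 2021, 3, 34, "DD"),
        ("A", 2021, 3, 88, None),
        ("A", 2021, 2, 45, None),
    ],
)

totals = conn.execute(
    """
    SELECT TrackingStatus,
           Year,
           Month,
           COUNT(DISTINCT "Order") AS Total,       -- each order counted once
           COUNT(*)                AS RecordCount  -- plain row count, for comparison
    FROM tab
    GROUP BY TrackingStatus, Year, Month
    ORDER BY Year, Month
    """
).fetchall()
for row in totals:
    print(row)
```

The Total column matches the expected output in the question, while RecordCount shows what a plain GROUP BY count would have returned.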

Display # of customers per month that have previous Sale date > 3 months and # of these customers that have a "Sale Date" in the given month

Basically, my requirement is - for a given month, how many customers had their "previous Sale date" 3 months before the given month and of these customers how many of them have a "Sale date" in the given month.
I tried using Lag function, but my column "Reactivated_Guests" is giving me null value always.
SELECT datepart(month,["sale date"]) "Sale_Month",count(distinct
["user id"]) "Lost_Guests",
lag("Guests",4) OVER (ORDER BY "Sale_Month")+
lag("Guests",5) OVER (ORDER BY "Sale_Month")+
lag("Guests",6) OVER (ORDER BY "Sale_Month")+
lag("Guests",7) OVER (ORDER BY "Sale_Month")+
lag("Guests",8) OVER (ORDER BY "Sale_Month")+
lag("Guests",9) OVER (ORDER BY "Sale_Month")+
lag("Guests",10) OVER (ORDER BY "Sale_Month")+
lag("Guests",11) OVER (ORDER BY "Sale_Month")+
lag("Guests",12) OVER (ORDER BY "Sale_Month") "Reactivated_Guests"
group by "Sale_Month"
order by "Sale_Month"
My expected output is month-wise # of guests that have their previous "Sale date" greater than 3 months before the given month (Lost_Guests) and of these customers how many have a "Sale date" in the given month (Reactivated_Guests)
Expected Result :
Sale_Month  Lost_Guests                  Reactivated_Guests
            (prev Sale date > 3 months)  (prev Sale date > 3 months and
                                          have a Sale date in given month)
June        1,200                        110
July        1,800                        130
Aug         1,900                        140
Actual Result :
Sale_Month  Lost_Guests  Reactivated_Guests
June        1,200        null
July        1,800        null
Aug         1,900        null
Sample Data :
Customer  Sale Date
AAAAA     11/15/2018
BBBBB     11/16/2018
CCCCC     9/23/2018
CCCCC     1/25/2019
AAAAA     3/16/2019
CCCCC     3/18/2019
----> For the given month of March, AAAAA is counted in "Lost_Guests" because AAAAA's previous sale date (11/15/2018) is more than 3 months before the given month (March 2019), and AAAAA is counted in "Reactivated_Guests" because AAAAA has a sale date in the given month (March 2019).
----> For the given month of March, CCCCC is not counted in "Lost_Guests" because the previous sale date (1/25/2019) is less than 3 months before the given month (March 2019), and hence CCCCC does not appear in "Reactivated_Guests" either.
This addresses the original version of the question.
You seem to want something like this:
select sale_month, count(distinct user_id) as guests,
       count(distinct case when min_sale_date < sale_date - interval '3 month' then user_id end) as old_guests
from (select t.*,
             min(sale_date) over (partition by user_id) as min_sale_date
      from t
     ) t
group by sale_month
order by sale_month;
Note that date functions are very database dependent, so the exact syntax might vary depending on your database.
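Since the question mentions LAG, here is a hedged sketch of how the "Reactivated_Guests" part could be expressed by looking at each customer's previous sale date. It is not the query above and not SQL Server syntax: it runs on SQLite through Python with the sample data from the question, and it assumes "more than 3 months before the given month" means the previous sale falls before the first day of the month three months earlier. Counting "Lost_Guests" for months in which a customer has no sale would additionally need a calendar of months, which is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (user_id TEXT, sale_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("AAAAA", "2018-11-15"),
        ("BBBBB", "2018-11-16"),
        ("CCCCC", "2018-09-23"),
        ("CCCCC", "2019-01-25"),
        ("AAAAA", "2019-03-16"),
        ("CCCCC", "2019-03-18"),
    ],
)

# Window functions need SQLite 3.25 or newer.
sql = """
WITH with_prev AS (
    -- attach each customer's previous sale date to every sale
    SELECT user_id,
           sale_date,
           LAG(sale_date) OVER (PARTITION BY user_id ORDER BY sale_date) AS prev_sale_date
    FROM t
)
SELECT strftime('%Y-%m', sale_date) AS sale_month,
       COUNT(DISTINCT user_id) AS guests,
       -- assumed cutoff: previous sale before the start of the month three months earlier
       COUNT(DISTINCT CASE
                 WHEN prev_sale_date < date(sale_date, 'start of month', '-3 months')
                 THEN user_id
             END) AS reactivated_guests
FROM with_prev
GROUP BY sale_month
ORDER BY sale_month
"""
reactivated = conn.execute(sql).fetchall()
for row in reactivated:
    print(row)
```

For March 2019 this counts AAAAA (previous sale 2018-11-15) but not CCCCC (previous sale 2019-01-25), which matches the explanation given with the sample data.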

SQL Difference Between Sum of Two Months

I'm trying to find the difference between a month in the previous year and the same month in the current year. An example would be the SUM of sales for January 2014 minus the SUM of sales for January 2013. This is being done to see how much more we made than in the previous year. I have a GROUP BY that shows the total sales by month and year, but I'm having trouble working out how to find the difference between the two months. Thank you for your help. It's greatly appreciated.
Table
Date Sales
1/1/2013 100
1/12/2013 150
1/21/2013 90
1/4/2014 200
1/17/2014 50
1/20/2014 100
Result of Group By
Jan 2013
340
Jan 2014
350
Difference
Jan 2014 - Jan 2013
350 - 340 = 10
The best way to do this depends on the database. The first thing you need to do is to aggregate the data. Then a simple join will get the data you need. Here is one method:
with ym as (
      select year(date) as yr, month(date) as mon,
             sum(sales) as sales
      from table t
      group by year(date), month(date)
     )
select ym.yr, ym.mon, ym.sales, ymprev.sales as prev_sales,
       (ym.sales - ymprev.sales) as diff
from ym join
     ym ymprev
     on ymprev.yr = ym.yr - 1 and ymprev.mon = ym.mon;
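As a quick check of the aggregate-then-self-join approach against the sample data in the question, the sketch below runs the same idea on SQLite through Python. The assumptions: strftime replaces year() and month(), which SQLite does not have, the table is given the hypothetical name daily_sales, and the dates are stored in ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [
        ("2013-01-01", 100),
        ("2013-01-12", 150),
        ("2013-01-21", 90),
        ("2014-01-04", 200),
        ("2014-01-17", 50),
        ("2014-01-20", 100),
    ],
)

sql = """
WITH ym AS (
    -- total sales per year and month
    SELECT CAST(strftime('%Y', date) AS INTEGER) AS yr,
           CAST(strftime('%m', date) AS INTEGER) AS mon,
           SUM(sales) AS total_sales
    FROM daily_sales
    GROUP BY yr, mon
)
SELECT ym.yr, ym.mon, ym.total_sales,
       ymprev.total_sales AS prev_sales,
       ym.total_sales - ymprev.total_sales AS diff
FROM ym
JOIN ym AS ymprev
  ON ymprev.yr = ym.yr - 1 AND ymprev.mon = ym.mon  -- same month, one year earlier
ORDER BY ym.yr, ym.mon
"""
yoy = conn.execute(sql).fetchall()
for row in yoy:
    print(row)
```

Because this is an inner join, months without a matching month in the previous year (January 2013 here) drop out of the result; a LEFT JOIN would keep them with a null difference.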