SQL Query to recursively track month of purchase - sql

I have a table with customer id and month of purchase. For each customer, I first need to segment them on their first month of purchase, i.e., if a customer did their first purchase on 10 June 2017, then they belong to bucket June 2017. See below sample data table.
Then, for each subsequent purchase of that customer (say from the June 2017 segment), we need to track the month. For instance, suppose the June 2017 customer made their second purchase on 25 June 2017 and their third purchase on 11 Aug 2017. The second purchase is counted in the 1st month (within 30 days of the first transaction) and the third purchase is counted in the 3rd month, because the difference between 11 Aug 2017 and 10 June 2017 is 62 days, which lies between 61 and 90 days.
See below sample output table, although I need it in percentage form (% of customer who did in first month, second month, etc.). In the table, we are showing all the customers who did their first transaction say in Jan 2017 and then how many of them did transactions in subsequent months.
This tracking needs to be done for each customer. I believe I am comfortable with the first part, segmenting each customer, which I can do with first or partition.
I am not sure how to do this recursively for subsequent transactions.
Thanks in advance for help!

You simply use window functions to define the original month and then conditional aggregation.
You don't mention the database, but this is the idea:
select to_char(first_purchase_date, 'YYYY-MM') as yyyymm,
       sum(case when months_between(first_purchase_date, purchase_date) = 1 then 1 else 0 end) as purchases_1,
       sum(case when months_between(first_purchase_date, purchase_date) = 2 then 1 else 0 end) as purchases_2,
       . . .
from (select t.*,
             -- earliest purchase per customer defines the cohort
             min(purchase_date) over (partition by customer_id) as first_purchase_date
      from t
     ) t
group by to_char(first_purchase_date, 'YYYY-MM');
I invented the months_between() and to_char() functions, but you should get the idea.
The above tracks purchases. To get customers, you can use:
(count(distinct case when months_between(first_purchase_date, purchase_date) = 1 then customer_id end) /
 count(distinct customer_id)
) as month_1_ratio
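Since months_between() is invented, here is a rough sketch of the same idea with the 30-day buckets from the question, run against SQLite from Python. The purchases table, its columns, and the sample rows are made up for illustration; the bucket is computed from whole days since the first purchase (1-30 days is month 1, 31-60 is month 2, and so on), and the first purchase itself falls in bucket 0.

```python
import sqlite3

# Hypothetical schema: purchases(customer_id, purchase_date as ISO-8601 text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer_id INTEGER, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        (1, "2017-06-10"), (1, "2017-06-25"), (1, "2017-08-11"),
        (2, "2017-06-03"), (2, "2017-07-20"),
        (3, "2017-07-01"),
    ],
)

COHORT_SQL = """
WITH firsts AS (
    SELECT customer_id, MIN(purchase_date) AS first_purchase_date
    FROM purchases
    GROUP BY customer_id
),
bucketed AS (
    SELECT p.customer_id,
           strftime('%Y-%m', f.first_purchase_date) AS cohort,
           -- whole days since the first purchase, in 30-day buckets:
           -- 0 days -> 0, 1-30 -> 1, 31-60 -> 2, 61-90 -> 3, ...
           (CAST(julianday(p.purchase_date)
                 - julianday(f.first_purchase_date) AS INTEGER) + 29) / 30 AS month_no
    FROM purchases p
    JOIN firsts f ON f.customer_id = p.customer_id
)
SELECT cohort,
       COUNT(DISTINCT customer_id) AS customers,
       100.0 * COUNT(DISTINCT CASE WHEN month_no = 1 THEN customer_id END)
             / COUNT(DISTINCT customer_id) AS pct_month_1,
       100.0 * COUNT(DISTINCT CASE WHEN month_no = 2 THEN customer_id END)
             / COUNT(DISTINCT customer_id) AS pct_month_2,
       100.0 * COUNT(DISTINCT CASE WHEN month_no = 3 THEN customer_id END)
             / COUNT(DISTINCT customer_id) AS pct_month_3
FROM bucketed
GROUP BY cohort
ORDER BY cohort
"""

cohort_rows = conn.execute(COHORT_SQL).fetchall()
for row in cohort_rows:
    print(row)
```

Multiplying by 100.0 forces floating-point division, which matters on databases where dividing two integer counts truncates to 0.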

You can use the lag function to create a "previous purchase" column.
lag(purchasemonth, 1) over (partition by customerid order by purchasemonth) as [PreviousPurchaseDate]
Then simply do a datediff and bucket as you wish.

Related

getting sum for each month for several months in a year in sql

I have the following table
image of database in use
I want to get the following kind of results
jan 12500
feb 16500
mar 4500
apr 6500
The query should return a total for each of the desired months.
I know how to do this:
$sql = "SELECT SUM(cost) AS january FROM earnings WHERE month = 1 AND year = 2022";
to get the sum for a given month, but I can't find anything on how to get multiple months at once.
I am still new to this.
SELECT
SUM(cost) as cost,
month
FROM earnings
WHERE year = :year
GROUP BY month
Sum all entries of cost, per month (GROUP BY) found in year (:year)
Each ROW will have a column cost and month.
If you want to filter the months further, you can add another AND clause to the WHERE:
AND month >= 1 AND month <= 6 for January to June
Useful Source:
https://www.mysqltutorial.org/mysql-group-by.aspx
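For a quick way to try the GROUP BY approach, here is a small sketch run against an in-memory SQLite database from Python. The column names follow the query in the question; the rows are made up (the original table image is not available) and are chosen only so that the totals match the desired output listed above. The :first_month and :last_month parameters are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE earnings (cost INTEGER, month INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO earnings VALUES (?, ?, ?)",
    [
        (10000, 1, 2022), (2500, 1, 2022),   # two January rows
        (16500, 2, 2022),
        (4500, 3, 2022),
        (6500, 4, 2022),
        (700, 7, 2022),                      # outside the requested months
        (999, 1, 2021),                      # outside the requested year
    ],
)

MONTHLY_SQL = """
SELECT month, SUM(cost) AS cost
FROM earnings
WHERE year = :year
  AND month BETWEEN :first_month AND :last_month
GROUP BY month
ORDER BY month
"""

monthly_totals = conn.execute(
    MONTHLY_SQL, {"year": 2022, "first_month": 1, "last_month": 6}
).fetchall()
for month, cost in monthly_totals:
    print(month, cost)
```

Months with no rows at all (May and June here) simply do not appear in the result.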

Add results to a row regarding the last 12 months rows- SQL Server

At my last meeting someone asked me if it was possible to hide people who have been ill for a year from a dashboard. So I'm looking for the best way to actually KNOW who has been ill for 12 months.
I am working with a table with the number of days you've been absent for every kind of absence you could have, the number of days you should have been working that month, with a row per person, department and profession each month.
So it looks something like this :
PersonID  YEAR  MONTH  DEPARTMENT  PROFESSION  Absence1  Absence2  Absence3  WORKING DAYS OF THE MONTH...
11111     2021  07     HR          ASSISTANT   0         2         0         22
11111     2021  08     HR          ASSISTANT   0         0         0         22
==> So if I'm on a row of July 2021 I need to check the lines from June 2020 to June 2021.
My guess is that I need to add a column to this table that will say (with some kind of loop, maybe): if, for the last 12 months (rows), the total number of days of absence equals the number of working days of the last 12 months, then "ILL FOR A YEAR OR MORE". This has to be done for each person, knowing that a person can work in more than one department or profession, so they can have more than one row per month.
But I really have no idea how to actually write this in a script, as I usually do very basic things. I'm using SQL SERVER and have 429 207 rows in the table. I'm thinking about doing it on the whole table, not only on this month's rows, because the dashboard shows the history.
Your table is heavily denormalized. If you want to represent all this information in the database, I would have expected the following tables, instead of just one:
Person
Department
Illness (list of illnesses)
IllnessAbsence (join table between Person and Illness)
Either way, you can get the information you need with something like this:
I've assumed you want the whole table, so you need a window function
We need to flip the logic on its head: exclude all rows which have no non-absence in the last 12 months
SELECT
    PersonID,
    YEAR,
    MONTH,
    DEPARTMENT,
    PROFESSION,
    Absence1,
    Absence2,
    Absence3,
    [WORKING DAYS OF THE MONTH]
FROM (
    SELECT *,
        -- counts this person's rows from the last 12 months that have no absence at all
        NotIllLast12Months = COUNT(CASE WHEN DATEFROMPARTS(YEAR + 1, MONTH, 1) >= GETDATE()
                                         AND Absence1 + Absence2 + Absence3 = 0 THEN 1 END)
                             OVER (PARTITION BY PersonID)
    FROM HETP_ABS
) abs
WHERE NotIllLast12Months > 0;
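If the flag is needed on every historical row rather than relative to today, a rolling window is another option. This is not the query above but a different technique, sketched under several assumptions: it runs on SQLite (3.25 or later, for window functions) from Python rather than on SQL Server, WorkingDays is a stand-in name for the working-days column, every person has a row for every month, and the window covers the 12 months ending with the current row. Rows are first summed per person and month, because a person can have several rows in one month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HETP_ABS (PersonID INTEGER, YEAR INTEGER, MONTH INTEGER, "
    "Absence1 INTEGER, Absence2 INTEGER, Absence3 INTEGER, WorkingDays INTEGER)"
)

# Made-up data: 13 consecutive months, 2020-07 through 2021-07.
rows = []
for i in range(13):
    idx = 2020 * 12 + 6 + i
    year, month = idx // 12, idx % 12 + 1
    rows.append((11111, year, month, 22, 0, 0, 22))                   # absent every working day
    rows.append((22222, year, month, 0, 2 if i == 0 else 0, 0, 22))   # almost always present
conn.executemany("INSERT INTO HETP_ABS VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

ROLLING_SQL = """
WITH monthly AS (
    -- one row per person and month, whatever the department or profession
    SELECT PersonID,
           YEAR * 12 + MONTH AS month_index,
           SUM(Absence1 + Absence2 + Absence3) AS absent_days,
           SUM(WorkingDays) AS working_days
    FROM HETP_ABS
    GROUP BY PersonID, YEAR, MONTH
),
rolling AS (
    SELECT PersonID,
           month_index,
           COUNT(*) OVER w AS months_seen,
           SUM(absent_days) OVER w AS absent_12m,
           SUM(working_days) OVER w AS working_12m
    FROM monthly
    WINDOW w AS (PARTITION BY PersonID ORDER BY month_index
                 ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
)
SELECT PersonID,
       (month_index - 1) / 12 AS flagged_year,
       (month_index - 1) % 12 + 1 AS flagged_month
FROM rolling
WHERE months_seen = 12 AND absent_12m >= working_12m
ORDER BY PersonID, month_index
"""

ill_rows = conn.execute(ROLLING_SQL).fetchall()
for row in ill_rows:
    print(row)
```

To look only at the 12 months before the current row, the frame could be changed to ROWS BETWEEN 12 PRECEDING AND 1 PRECEDING. The same window syntax exists in SQL Server, but this sketch was not written against it.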

I want to get a specific answer by comparing two columns in postgresql

I have a query like this :
with base_data as
( Select
receipt_date,
receipt_value,
receipt_customer_id
From table1 )
Select
count(distinct (receipt_customer_id) , sum(receipt_value)
From
base_data
where
(receipt_date:timestamp <= current_date - interval '1 month' and
receipt_date: timestamp >= current_date - interval '2 month)
This basically gives me the number of distinct clients and their sum of receipt values for July and August considering the current month as September
I want to reduce this further and just want data for
distinct clients and sum of their receipt values
for whom there was no receipt in July i.e. they never transacted with us in July but came back in August basically they skipped a month and then transacted again.
I am unable to write this clause which I am putting in English below as a problem statement :
Give me the data for a distinct count of clients and their total sum of receipts who transacted with us in August but had no receipt value in July
I hope I am able to explain it. I have been racking my brain on this for a while but am unable to figure out a solution. Please help.
The current result looks like this
Count: 120
Sum: 207689
I want it reduced to (assumption)
Count: 12
Sum: 7000
The first issue I can see is with "sum of receipt values for July and August": the result of your current query depends on when it is run (and will not cover calendar months). Let's put that aside and simplify/fix your query (as stated, it does not run) into one that lists all transactions in August (I think it's simpler to understand using hard-coded dates for now):
Select
receipt_customer_id, sum(receipt_value)
From
table1
where
-- Transacted in August
receipt_date >= '2020-08-01'::timestamp and
receipt_date < '2020-09-01'::timestamp
group by receipt_customer_id;
We can now add another clause to the where to keep only customers whose July transactions total $0/NULL (so a total of $0, or no transactions at all, in July):
Select
receipt_customer_id, sum(receipt_value)
From
table1 t
where
-- Transacted in August
t.receipt_date >= '2020-08-01'::timestamp and
t.receipt_date < '2020-09-01'::timestamp
and (
select coalesce(sum(receipt_value), 0)
from table1
where
receipt_customer_id = t.receipt_customer_id and
-- Transacted in July
receipt_date >= '2020-07-01'::timestamp and
receipt_date < '2020-08-01'::timestamp
) = 0
group by receipt_customer_id;
or if you just want the count of customers and sum of receipt_value:
Select
count(distinct receipt_customer_id), sum(receipt_value)
From
table1 t
where
-- Transacted in August
t.receipt_date >= '2020-08-01'::timestamp and
t.receipt_date < '2020-09-01'::timestamp
and (
select coalesce(sum(receipt_value), 0)
from table1
where
receipt_customer_id = t.receipt_customer_id and
-- Transacted in July
receipt_date >= '2020-07-01'::timestamp and
receipt_date < '2020-08-01'::timestamp
) = 0
See this db fiddle for a test of this (feel free to use it if you want to ask follow-up questions). Note that you can reintroduce current_date if you want to, but you probably want to calculate the start of the month; date_trunc can help with this.
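As a rough sketch of reintroducing a "current date", the month boundaries can also be computed outside the query and passed in as parameters. The example below runs on SQLite from Python rather than PostgreSQL, uses made-up rows, and swaps the sum(...) = 0 subquery for NOT EXISTS. The two are not identical: NOT EXISTS also excludes a customer whose July receipts happen to total 0. month_start is a hypothetical helper, and the reference date is fixed so the result is reproducible.

```python
import sqlite3
from datetime import date

def month_start(d, offset=0):
    """First day of the month that is `offset` months away from d's month."""
    idx = d.year * 12 + (d.month - 1) + offset
    return date(idx // 12, idx % 12 + 1, 1)

today = date(2020, 9, 15)  # stand-in for current_date
params = {
    "jul_start": month_start(today, -2).isoformat(),
    "aug_start": month_start(today, -1).isoformat(),
    "sep_start": month_start(today, 0).isoformat(),
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (receipt_date TEXT, receipt_value INTEGER, "
    "receipt_customer_id INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("2020-07-05", 100, 1), ("2020-08-10", 200, 1),   # July and August: excluded
        ("2020-06-20", 50, 2), ("2020-08-02", 300, 2),
        ("2020-08-28", 150, 2),                           # skipped July: included
        ("2020-08-15", 400, 3),                           # first seen in August: included
        ("2020-07-30", 75, 4),                            # July only: excluded
    ],
)

SKIPPED_JULY_SQL = """
SELECT COUNT(DISTINCT t.receipt_customer_id), SUM(t.receipt_value)
FROM table1 t
WHERE t.receipt_date >= :aug_start
  AND t.receipt_date < :sep_start
  AND NOT EXISTS (
      SELECT 1
      FROM table1 j
      WHERE j.receipt_customer_id = t.receipt_customer_id
        AND j.receipt_date >= :jul_start
        AND j.receipt_date < :aug_start
  )
"""

skipped_july = conn.execute(SKIPPED_JULY_SQL, params).fetchone()
print(skipped_july)
```

Note that the stated condition (receipts in August, none in July) also matches customers who appear for the first time in August; requiring an earlier receipt would need one more EXISTS clause.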

How to count the number of occurrences per month?

I have a program and want to generate reports from it. The program is for a grocery store that does deliveries. A customer places an order and the program captures the various items that the customer wishes to purchase, e.g. Order 21 and the program lists the various items relating to that specific order.
I would like to generate a SQL query that counts the number of orders that customers place each month and want it to look like this
No of orders Month
10 Jan
20 Feb
30 March
The SQL that I had which is
SELECT COUNT(OrderID) AS "Number Of Orders", datepart(month, order_date) AS "Month"
FROM "ORDER"
Group by datepart(month, order_date);
Displays
Number of Orders Month
16 9
However this is the count of all the orders for the various months and is only displayed in month 9 (September.)
Hope this will help:
select COUNT(OrderID) as "Number Of Orders", DATENAME(mm, order_date) as "Month"
from "ORDER"
group by DATENAME(mm, order_date)
order by 2
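A small sketch of the same count, run against SQLite from Python so it can be tried without a SQL Server instance. strftime stands in for DATENAME/DATEPART here, and the year is kept in the grouping key so that January 2023 and January 2024 are not merged into one row. The rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "ORDER" (OrderID INTEGER PRIMARY KEY, order_date TEXT)')
conn.executemany(
    'INSERT INTO "ORDER" (order_date) VALUES (?)',
    [
        ("2023-01-03",), ("2023-01-17",), ("2023-01-29",),
        ("2023-02-08",), ("2023-02-21",),
        ("2024-01-05",),
    ],
)

ORDERS_PER_MONTH_SQL = """
SELECT strftime('%Y-%m', order_date) AS order_month,
       COUNT(OrderID) AS number_of_orders
FROM "ORDER"
GROUP BY order_month
ORDER BY order_month
"""

orders_per_month = conn.execute(ORDERS_PER_MONTH_SQL).fetchall()
for order_month, number_of_orders in orders_per_month:
    print(order_month, number_of_orders)
```

If a query like this returns a single row for month 9, the likely explanation is that every order_date in the table really does fall in September, so it is worth checking the stored dates.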

Determine monthly values of timestamped records

I have a SQL table with the following schema:
fruit_id INT
price FLOAT
date DATETIME
This table contains many records where the price of a given fruit is recorded at a given time. There may be multiple records in a single day, there may be
I would like to be able to fetch a list of prices for a single fruit over the last 12 months inclusive of the current month. So given a fruit_id of 2 and datetime of now(), what would the price values be for December, January, February, ... October, November?
Given the above requirements, what strategy would you use to get this data? Pure sql, fetch all prices and process in code?
Thanks for your time.
Are you talking about min price, max price, average price, or something else?
Here's a quick query to get you started, which includes min, max, and average price for each month for fruit_id 2:
select left(date,7) as the_month, min(price),max(price),avg(price)
from fruit_price
where fruit_id = 2
and date >= concat(left(date_sub(curdate(), interval 11 month),7),'-01')
group by the_month;
If I understand it correctly from -
I would like to be able to fetch a list of prices for a single fruit over the last 12 months inclusive of the current month. So given a fruit_id of 2 and datetime of now(), what would the price values be for December, January, February, ... October, November?
You want the total price for every month of a single year, based on the date and fruit_id you pass in.
Note that this won't give all months of a year, only the months which had a price. If you want all months, you would need to create a dimdate table holding all the dates and then join with it.
-- T-SQL syntax assumed for the variables
declare @passeddate datetime = getdate() -- date to be calculated
declare @fruit_id int = 2 -- fruit id to be calculated
select
    fruit_id as FruitId,
    sum(price) as MonthPrice,
    month(date) as FruitMonth
from SQL_Table
where fruit_id = @fruit_id and
    year(date) = year(@passeddate)
group by fruit_id, month(date)
select distinct month(date) as "Month", price as "Unique Price" from fruit_price where fruit_id = 2;
I'd try to state as much as possible in SQL that does not require unindexed access to data because it's usually fast(er) than processing it with the application.
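One way to combine both strategies, as a sketch only: let SQL do the per-month aggregation and let the application fill in the months that have no rows, which avoids needing a calendar table. The example uses SQLite from Python, the fruit_price table name from the first answer, an average as the monthly value (the question does not say which aggregate is wanted), made-up rows, and a fixed "now" so the result is reproducible. last_12_months is a hypothetical helper.

```python
import sqlite3
from datetime import date

def last_12_months(today):
    """'YYYY-MM' keys for the 12 months ending with today's month."""
    end = today.year * 12 + (today.month - 1)
    return [f"{i // 12:04d}-{i % 12 + 1:02d}" for i in range(end - 11, end + 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit_price (fruit_id INTEGER, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO fruit_price VALUES (?, ?, ?)",
    [
        (2, 1.0, "2023-12-05 10:00:00"),
        (2, 2.0, "2023-12-20 09:30:00"),
        (2, 3.0, "2024-03-01 08:00:00"),
        (2, 9.0, "2022-11-30 12:00:00"),   # older than 12 months: ignored
        (1, 5.0, "2024-03-02 12:00:00"),   # different fruit: ignored
    ],
)

today = date(2024, 11, 15)          # stand-in for now()
months = last_12_months(today)      # "2023-12" through "2024-11"

MONTHLY_AVG_SQL = """
SELECT strftime('%Y-%m', date) AS ym, AVG(price) AS avg_price
FROM fruit_price
WHERE fruit_id = :fruit_id
  AND date >= :start
GROUP BY ym
"""

found = dict(
    conn.execute(MONTHLY_AVG_SQL, {"fruit_id": 2, "start": months[0] + "-01"})
)
# Months without any recorded price get None instead of being dropped.
monthly_avg = [(m, found.get(m)) for m in months]
for month, avg_price in monthly_avg:
    print(month, avg_price)
```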