SQL - Calculate the difference in number of orders by month - sql

I am working on the orders table provided by this site, which has its own editor where you can test your SQL statements.
The orders table looks like this:
order_id  customer_id  order_date
1         7000         2016/04/18
2         5000         2016/04/18
3         8000         2016/04/19
4         4000         2016/04/20
5         NULL         2016/05/01
I want to get the difference in the number of orders for subsequent months.
To elaborate, the number of orders each month would be like this
SQL Statement
SELECT
MONTH(order_date) AS Month,
COUNT(MONTH(order_date)) AS Total_Orders
FROM
orders
GROUP BY
MONTH(order_date)
Result:
Month  Total_Orders
4      4
5      1
Now my goal is to get the difference in subsequent months which would be
Month  Total_Orders  Total_Orders_Diff
4      4             4 - Null = Null
5      1             1 - 4 = -3
My strategy was to self-join following this answer
This was my attempt
SELECT
MONTH(a.order_date),
COUNT(MONTH(a.order_date)),
COUNT(MONTH(b.order_date)) - COUNT(MONTH(a.order_date)) AS prev,
MONTH(b.order_date)
FROM
orders a
LEFT JOIN
orders b ON MONTH(a.order_date) = MONTH(b.order_date) - 1
GROUP BY
MONTH(a.order_date)
However, the result was just zeros (as shown below), which suggests that I am subtracting each count from itself rather than from the previous month's count (or subtracting from a null value).
MONTH(a.order_date)  COUNT(MONTH(a.order_date))  prev  MONTH(b.order_date)
4                    4                           0     NULL
5                    1                           0     NULL
Do you have any suggestions as to what I am doing wrong?

You can use the LAG window function in your SELECT statement.
LAG provides access to a row at a given physical offset that comes before the current row.
So, this is what you need:
SELECT
MONTH(order_date) as Month,
COUNT(MONTH(order_date)) as Total_Orders,
COUNT(MONTH(order_date)) - LAG(COUNT(MONTH(order_date))) OVER (ORDER BY MONTH(order_date)) AS Total_Orders_Diff -- order by month so "previous row" means "previous month"
FROM orders
GROUP BY MONTH(order_date);
Here is an example on SQL Fiddle: http://sqlfiddle.com/#!18/5ed75/1
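The LAG approach can be tried locally with a minimal sketch like the one below. It is an illustration only, not the answerer's fiddle: it assumes Python's built-in sqlite3 module with SQLite 3.25 or newer (for window functions), dates stored as ISO strings (2016-04-18 rather than 2016/04/18), and strftime('%m', ...) as a stand-in for MONTH(). The function name monthly_order_diff is made up for this example.

```python
import sqlite3


def monthly_order_diff(orders):
    """Return (month, total_orders, diff_vs_previous_row) tuples.

    `orders` is a list of (order_id, customer_id, order_date) tuples,
    with order_date as an ISO 'YYYY-MM-DD' string (an assumption of this sketch).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        """
        SELECT order_month,
               total_orders,
               -- LAG reads total_orders from the previous row in month order;
               -- for the first month it is NULL, so the difference is NULL too.
               total_orders - LAG(total_orders) OVER (ORDER BY order_month)
                   AS total_orders_diff
        FROM (
            SELECT CAST(strftime('%m', order_date) AS INTEGER) AS order_month,
                   COUNT(*) AS total_orders
            FROM orders
            GROUP BY CAST(strftime('%m', order_date) AS INTEGER)
        ) AS monthly
        ORDER BY order_month
        """
    ).fetchall()
    conn.close()
    return rows


# The five sample orders from the question, with dates rewritten in ISO form.
sample_orders = [
    (1, 7000, "2016-04-18"),
    (2, 5000, "2016-04-18"),
    (3, 8000, "2016-04-19"),
    (4, 4000, "2016-04-20"),
    (5, None, "2016-05-01"),
]

for row in monthly_order_diff(sample_orders):
    print(row)
```

Aggregating first and applying LAG to the aggregated rows keeps the window function working on one row per month, which is what the difference is meant to compare.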
Solution without using LAG window function:
WITH InitCTE AS
(
SELECT MONTH(order_date) AS Month,
COUNT(MONTH(order_date)) AS Total_Orders
FROM orders
GROUP BY MONTH(order_date)
)
SELECT InitCTE.Month, InitCTE.Total_Orders, R.Total_Orders_Diff
FROM InitCTE
OUTER APPLY (SELECT TOP 1 InitCTE.Total_Orders - CompareCTE.Total_Orders AS Total_Orders_Diff
FROM InitCTE AS CompareCTE
WHERE CompareCTE.Month < InitCTE.Month
ORDER BY CompareCTE.Month DESC) R; -- TOP 1 needs an ORDER BY to pick the closest earlier month
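For comparison, the self-join idea from the question can also be made to work if the join is done on the already aggregated months instead of on the raw order rows. The sketch below is a different technique from the OUTER APPLY query above, and it runs on SQLite through Python's sqlite3 module, which is an assumption made here for runnability (ISO date strings, strftime in place of MONTH). The name monthly_diff_self_join is hypothetical.

```python
import sqlite3


def monthly_diff_self_join(orders):
    """Month-over-month difference using a CTE joined to itself (no LAG).

    `orders` is a list of (order_id, customer_id, order_date) tuples with
    ISO 'YYYY-MM-DD' dates (an assumption of this sketch).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        """
        WITH monthly AS (
            -- One row per month, so the self-join below cannot multiply rows.
            SELECT CAST(strftime('%m', order_date) AS INTEGER) AS order_month,
                   COUNT(*) AS total_orders
            FROM orders
            GROUP BY CAST(strftime('%m', order_date) AS INTEGER)
        )
        SELECT cur.order_month,
               cur.total_orders,
               cur.total_orders - prev.total_orders AS total_orders_diff
        FROM monthly AS cur
        LEFT JOIN monthly AS prev
               ON prev.order_month = cur.order_month - 1
        ORDER BY cur.order_month
        """
    ).fetchall()
    conn.close()
    return rows


sample_orders = [
    (1, 7000, "2016-04-18"),
    (2, 5000, "2016-04-18"),
    (3, 8000, "2016-04-19"),
    (4, 4000, "2016-04-20"),
    (5, None, "2016-05-01"),
]

for row in monthly_diff_self_join(sample_orders):
    print(row)
```

One design difference worth noting: this join only matches the calendar month immediately before, so a month with no orders leaves the following month's difference as NULL, whereas the OUTER APPLY version compares against the closest earlier month that has data.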

Something like the following should give you what you want - disclaimer, untested!
select *, Total_Orders - lag(Total_orders,1) over(order by Month) as Total_Orders_Diff
from (
select Month(order_date) as Month, Count(*) as Total_Orders
From orders
Group by Month(order_date)
)o

Related

SQL Divide previous row balance by current row balance and insert that value into current rows column "Growth"

I have a table like this.
Year  ProcessDate  Month  Balance   RowNum  Calculation
2022  20220430     4      22855547  1
2022  20220330     3      22644455  2
2022  20220230     2      22588666  3
2022  20220130     1      33545444  4
2022  20221230     12     22466666  5
For each row, I need to take the Balance of the following row (RowNum + 1, i.e. the prior month) and divide it by the current row's Balance.
Ex: Row 1 calculation should = Row 2 Balance / Row 1 Balance (22644455 / 22855547 ≈ 0.99, i.e. about 99%)
Row 2 calculation should = Row 3 Balance / Row 2 Balance, and so on.
Table is just a Temporary table I created titled #MonthlyLoanBalance2.
Now I just need to take it a step further.
Let me know what and how you would go about doing this.
Thank you in advance!
Insert into #MonthlytLoanBalance2 (
Year
,ProcessDate
,Month
,Balance
,RowNum
)
select
--CloseYearMonth,
left(ProcessDate,4) as 'Year',
ProcessDate,
--x.LOANTypeKey,
SUBSTRING(CAST(x.ProcessDate as varchar(38)),5,2) as 'Month',
sum(x.currentBalance) as Balance
,ROW_NUMBER()over (order by ProcessDate desc) as RowNum
from
(
select
distinct LoanServiceKey,
LoanTypeKey,
AccountNumber,
CurrentBalance,
OpenDateKey,
CloseDateKey,
ProcessDate
from
cu.LAFactLoanSnapShot
where LoanStatus = 'Open'
and LoanTypeKey = 0
and ProcessDate in (select DateKey from dimDate
where IsLastDayOfMonth = 'Y'
and DateKey > convert(varchar, getdate()-4000, 112)
)
) x
group by ProcessDate
order by ProcessDate desc;
I am assuming your data is already prepared as shown in the table. You can use the LEAD() function to read the Balance of the next row (by RowNum) and divide it by the current Balance. FORMAT() with 'N2' is used here only to show the result with two decimal places.
SELECT *,
FORMAT((ISNULL(LEAD(Balance,1) OVER (ORDER BY RowNum), 1)/Balance),'N2') Calculation
FROM #MonthlytLoanBalance2
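As a rough check of the LEAD idea, here is a small sketch that runs on SQLite through Python's sqlite3 module (an assumption made for runnability, since the thread is about SQL Server). The table name monthly_balance and the function growth_ratios are hypothetical. It also differs from the query above in two labelled ways: it multiplies by 1.0 to avoid integer division, and it leaves the last row as NULL instead of substituting 1 for the missing next balance.

```python
import sqlite3


def growth_ratios(balances):
    """Return (row_num, balance, next_balance / balance rounded to 2 places).

    `balances` is a list of (row_num, balance) tuples where row_num 1 is the
    most recent month, as in the question's table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE monthly_balance (row_num INTEGER, balance INTEGER)")
    conn.executemany("INSERT INTO monthly_balance VALUES (?, ?)", balances)
    rows = conn.execute(
        """
        SELECT row_num,
               balance,
               -- LEAD reads the balance from the following row (row_num + 1);
               -- 1.0 * forces floating-point division on integer balances.
               ROUND(1.0 * LEAD(balance) OVER (ORDER BY row_num) / balance, 2)
                   AS calculation
        FROM monthly_balance
        ORDER BY row_num
        """
    ).fetchall()
    conn.close()
    return rows


# Balances copied from the question's table.
sample_balances = [
    (1, 22855547),
    (2, 22644455),
    (3, 22588666),
    (4, 33545444),
    (5, 22466666),
]

for row in growth_ratios(sample_balances):
    print(row)
```

The last row has no following row, so its calculation stays NULL (None in Python) rather than showing a misleading ratio.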

I have 3 rows per user, need to have one row (with 3 columns) per user instead

I'm creating a table with the earliest 3 purchases by customer along with the total count of purchases by said customer, using a CTE. I did this successfully with the query below, but it shows 3 rows for each user with a row for the first purchase date, 2nd purchase date, and 3rd purchase date as separate rows. I'm trying to show the 3 purchase dates as columns, with one row for each user, instead.
This table has hundreds of rows so I can't write the needed user IDs in the code. Any ideas? Is there a way to merge 3 CTEs or write code to spit out the earliest payment date, 2nd earliest, 3rd earliest, and total amount for the user as columns. Current code is below:
WITH cte_2
AS (SELECT customer_id,
payment_date,
Row_number()
OVER (
partition BY customer_id
ORDER BY payment_date ASC) AS purchase_number
FROM payment)
SELECT cte_2.customer_id,
cte_2.payment_date,
cte_2.purchase_number,
Count(payment_id) AS total_payments
FROM payment
INNER JOIN cte_2
ON payment.customer_id = cte_2.customer_id
WHERE purchase_number <= 3
GROUP BY cte_2.customer_id,
cte_2.payment_date,
purchase_number
ORDER BY customer_id ASC
Current Output with above code:
Preferred Output:
Using pandas you can use pivot:
df = df.set_index('customer_id')
pivot_df = df.pivot(columns='purchase_number', values='payment_dates')
# To improve readability of your columns you can add a prefix:
pivot_df = pivot_df.add_prefix('payment_')
pivot_df.merge(df['total_payments'], left_index=True, right_index=True).drop_duplicates()
When using:
df = pd.DataFrame({
'customer_id':[1,1,1,2,2,2,3],
'payment_dates':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-01'],
'purchase_number':[1,2,3,1,2,3,1],
'total_payments':[4,4,4,26,26,26,1]})
Our result is:
              payment_1   payment_2   payment_3  total_payments
customer_id
1            2021-01-01  2021-01-02  2021-01-03               4
2            2021-01-04  2021-01-05  2021-01-06              26
3            2021-01-01         NaN         NaN               1
If your SQL product supports CASE WHEN, you can do it with conditional aggregation:
WITH
cte_2
AS (SELECT payment_id,
payment_date,
Row_number()
OVER (
partition BY customer_id
ORDER BY payment_date ASC) AS purchase_number
FROM payment)
SELECT pmt.customer_id,
-- Max(case ...) returns the payment date for that purchase number, or NULL if there is none
Max(case when cte_2.purchase_number=1 then cte_2.payment_date end) as [First Payment],
Max(case when cte_2.purchase_number=2 then cte_2.payment_date end) as [2nd Payment],
Max(case when cte_2.purchase_number=3 then cte_2.payment_date end) as [3rd Payment],
Count(pmt.payment_id) AS total_payments
FROM payment pmt
LEFT JOIN cte_2
ON pmt.payment_id=cte_2.payment_id
and cte_2.purchase_number <= 3
GROUP BY pmt.customer_id
ORDER BY pmt.customer_id ASC
The CTE simply assigns a payment number to each payment. We then join the payment table to that CTE by payment id using a left join, because an inner join would remove payments with a payment number > 3, and we still want to count them.
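The row-numbering plus CASE WHEN pattern can be tried on a small data set with the sketch below. It is only an illustration under stated assumptions: it runs on SQLite (3.25 or newer) through Python's sqlite3 module rather than on the asker's database, the sample payments are invented, and the function name first_three_payments is hypothetical. It also groups the numbered CTE directly instead of joining it back to the payment table.

```python
import sqlite3


def first_three_payments(payments):
    """One row per customer: first three payment dates plus total count.

    `payments` is a list of (payment_id, customer_id, payment_date) tuples
    with ISO date strings (an assumption of this sketch).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payment (payment_id INTEGER, customer_id INTEGER, payment_date TEXT)"
    )
    conn.executemany("INSERT INTO payment VALUES (?, ?, ?)", payments)
    rows = conn.execute(
        """
        WITH numbered AS (
            SELECT customer_id,
                   payment_date,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id ORDER BY payment_date
                   ) AS purchase_number
            FROM payment
        )
        SELECT customer_id,
               -- Each CASE keeps the date for one purchase number and NULL
               -- otherwise; MAX collapses those to a single value per customer.
               MAX(CASE WHEN purchase_number = 1 THEN payment_date END) AS first_payment,
               MAX(CASE WHEN purchase_number = 2 THEN payment_date END) AS second_payment,
               MAX(CASE WHEN purchase_number = 3 THEN payment_date END) AS third_payment,
               COUNT(*) AS total_payments
        FROM numbered
        GROUP BY customer_id
        ORDER BY customer_id
        """
    ).fetchall()
    conn.close()
    return rows


# Invented sample data: customer 1 has four payments, customer 3 has one.
sample_payments = [
    (10, 1, "2021-01-01"),
    (11, 1, "2021-01-02"),
    (12, 1, "2021-01-03"),
    (13, 1, "2021-01-04"),
    (14, 3, "2021-01-01"),
]

for row in first_three_payments(sample_payments):
    print(row)
```

Customers with fewer than three payments simply get NULL in the unused date columns, which matches the NaN cells in the pandas output shown earlier.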

What is wrong with my SQL query SUM group-by

Hello, I have a SQL query that does not calculate one of my columns, called Spend, correctly; you can see it in the fiddle. What is wrong with my code?
I just need a basic table:
Month ID GOT SPEND
1 1 100 50
2 1 500 200
1 2 200 50
I have created the fiddle http://sqlfiddle.com/#!9/3623b1/2
Could you please help me?
Here is the query:
select
keliones_lapas.Vairuot_Id,
MONTH(keliones_lapas.Data_darbo),
sum(keliones_lapas.uzdarbis) as Got,
coalesce(Suma, 0) AS Spend,
(sum(keliones_lapas.uzdarbis) - coalesce(Suma, 0)) Total
from keliones_lapas
left join (
select Vairuotas,
MONTH(Data_islaidu) as Data_islaidu,
sum(Suma) as Suma
from islaidos
group by Vairuotas, MONTH(Data_islaidu)) islaidos
on keliones_lapas.Vairuot_Id = islaidos.Vairuotas
and MONTH(keliones_lapas.Data_darbo) = MONTH(islaidos.Data_islaidu)
group by keliones_lapas.Vairuot_Id, MONTH(keliones_lapas.Data_darbo), Suma
order by keliones_lapas.Vairuot_Id, MONTH(keliones_lapas.Data_darbo);
Try this: your subquery already reduces Data_islaidu to a month number, and the join then applies MONTH() to that number a second time. That returns NULL, so the join never matches any month of keliones_lapas and Spend falls back to 0.
SELECT
keliones_lapas.Vairuot_Id,
MONTH(keliones_lapas.Data_darbo),
SUM(keliones_lapas.uzdarbis) AS Got,
COALESCE(Suma, 0) AS Spend,
(SUM(keliones_lapas.uzdarbis) - COALESCE(Suma, 0)) Total
FROM keliones_lapas
LEFT JOIN (
SELECT Vairuotas,
MONTH(Data_islaidu) AS Data_islaidu, -- already a month number from here on
SUM(Suma) AS Suma
FROM islaidos
GROUP BY Vairuotas, MONTH(Data_islaidu)) islaidos
ON keliones_lapas.Vairuot_Id = islaidos.Vairuotas
AND MONTH(keliones_lapas.Data_darbo) = islaidos.Data_islaidu -- compare month to month; no second MONTH() here
GROUP BY keliones_lapas.Vairuot_Id, MONTH(keliones_lapas.Data_darbo), Suma
ORDER BY keliones_lapas.Vairuot_Id, MONTH(keliones_lapas.Data_darbo)
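The fix above can be sketched end to end as follows. This is an illustration, not the asker's fiddle: it runs on SQLite through Python's sqlite3 module, the rows are invented so that the totals match the table the asker wanted, and strftime('%m', ...) stands in for MONTH(). Unlike the query above, it aggregates both tables to one row per driver and month before joining; that is a choice made for this sketch, and the name got_vs_spend is hypothetical.

```python
import sqlite3


def got_vs_spend(trips, expenses):
    """Return (driver_id, month, got, spend, total) per driver and month.

    `trips` rows are (Vairuot_Id, Data_darbo, uzdarbis) and `expenses` rows
    are (Vairuotas, Data_islaidu, Suma); dates are ISO strings (assumed).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE keliones_lapas (Vairuot_Id INTEGER, Data_darbo TEXT, uzdarbis INTEGER)"
    )
    conn.execute(
        "CREATE TABLE islaidos (Vairuotas INTEGER, Data_islaidu TEXT, Suma INTEGER)"
    )
    conn.executemany("INSERT INTO keliones_lapas VALUES (?, ?, ?)", trips)
    conn.executemany("INSERT INTO islaidos VALUES (?, ?, ?)", expenses)
    rows = conn.execute(
        """
        SELECT k.Vairuot_Id,
               k.month_no,
               k.got,
               COALESCE(i.spend, 0) AS spend,
               k.got - COALESCE(i.spend, 0) AS total
        FROM (
            SELECT Vairuot_Id,
                   CAST(strftime('%m', Data_darbo) AS INTEGER) AS month_no,
                   SUM(uzdarbis) AS got
            FROM keliones_lapas
            GROUP BY Vairuot_Id, CAST(strftime('%m', Data_darbo) AS INTEGER)
        ) AS k
        LEFT JOIN (
            SELECT Vairuotas,
                   CAST(strftime('%m', Data_islaidu) AS INTEGER) AS month_no,
                   SUM(Suma) AS spend
            FROM islaidos
            GROUP BY Vairuotas, CAST(strftime('%m', Data_islaidu) AS INTEGER)
        ) AS i
               -- Both sides are already month numbers, so compare them directly.
               ON i.Vairuotas = k.Vairuot_Id AND i.month_no = k.month_no
        ORDER BY k.Vairuot_Id, k.month_no
        """
    ).fetchall()
    conn.close()
    return rows


# Invented rows chosen to reproduce the totals in the asker's desired table.
sample_trips = [
    (1, "2017-01-05", 60),
    (1, "2017-01-20", 40),
    (1, "2017-02-03", 500),
    (2, "2017-01-10", 200),
]
sample_expenses = [
    (1, "2017-01-06", 20),
    (1, "2017-01-25", 30),
    (1, "2017-02-04", 200),
    (2, "2017-01-11", 50),
]

for row in got_vs_spend(sample_trips, sample_expenses):
    print(row)
```

Because each side has at most one row per driver and month at join time, neither SUM can be inflated by the join.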

Query to get top product gainers by sales over previous week

I have a database table with three columns.
WeekNumber, ProductName, SalesCount
Sample data is shown in below table. I want top 10 gainers(by %) for week 26 over previous week i.e. week 25. The only condition is that the product should have sales count greater than 0 in both the weeks.
In the sample data B,C,D are the common products and C has the highest % gain.
Similarly, I will need top 10 losers also.
What I have tried so far is an inner join to get the products common to both weeks. However, I am not able to work out the top-gainers logic.
The output should be like
Product PercentGain
C 400%
D 12.5%
B 10%
This will give you a generic answer, not just for any particular week:
select top 10 product , gain [gain%]
from
(
SELECT curr.product, 100.0 * (curr.salescount - prev.salescount) / prev.salescount gain -- 100.0 avoids integer division
from
(select weeknumber, product, salescount from tbl) prev
JOIN
(select weeknumber, product, salescount from tbl) curr
on prev.weeknumber = curr.weeknumber - 1
AND prev.product = curr.product
where prev.salescount > 0 and curr.salescount > 0
)A
order by gain desc
If you are interested in weeks 25 and 26, then just add the condition below in the WHERE clause:
and prev.weeknumber = 25
If you are using SQL Server 2012 (or newer), you could use the LAG function to match this week's sales with the previous week's. From there on, it's just some math:
SELECT TOP 10 product, 1.0 * sales / prev_sales - 1 AS gain
FROM (SELECT product,
weeknumber,
sales,
LAG(sales) OVER (PARTITION BY product
ORDER BY weeknumber) AS prev_sales
FROM mytable) t
WHERE weeknumber = 26 AND
sales > 0 AND
prev_sales > 0 AND
sales > prev_sales
ORDER BY gain DESC
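A runnable sketch of the LAG-based ranking follows. It assumes SQLite 3.25 or newer via Python's sqlite3 module instead of SQL Server (so LIMIT replaces TOP), and the table product_sales, the function top_gainers, and the sample rows are all hypothetical; the rows were chosen only so that B, C, and D reproduce the percentages in the question's expected output. The extra prev_week check is an addition of this sketch, guarding against LAG picking up an older week when the immediately preceding week is missing for a product.

```python
import sqlite3


def top_gainers(sales, week, limit=10):
    """Return (product, percent_gain) for `week` versus the week before it.

    `sales` is a list of (week_number, product_name, sales_count) tuples.
    Only products with positive sales in both weeks are considered.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product_sales (week_number INTEGER, product_name TEXT, sales_count INTEGER)"
    )
    conn.executemany("INSERT INTO product_sales VALUES (?, ?, ?)", sales)
    rows = conn.execute(
        """
        SELECT product_name,
               100.0 * (sales_count - prev_sales) / prev_sales AS percent_gain
        FROM (
            SELECT week_number,
                   product_name,
                   sales_count,
                   LAG(sales_count) OVER (
                       PARTITION BY product_name ORDER BY week_number
                   ) AS prev_sales,
                   LAG(week_number) OVER (
                       PARTITION BY product_name ORDER BY week_number
                   ) AS prev_week
            FROM product_sales
        ) AS t
        WHERE week_number = ?
          AND prev_week = week_number - 1   -- previous row must be the prior week
          AND sales_count > 0
          AND prev_sales > 0
        ORDER BY percent_gain DESC
        LIMIT ?
        """,
        (week, limit),
    ).fetchall()
    conn.close()
    return rows


# Hypothetical data: A only sells in week 25, E only in week 26.
sample_sales = [
    (25, "A", 5), (25, "B", 10), (25, "C", 1), (25, "D", 8),
    (26, "B", 11), (26, "C", 5), (26, "D", 9), (26, "E", 3),
]

for row in top_gainers(sample_sales, week=26):
    print(row)
```

Sorting the same result in ascending order instead would give the top losers the question also asks about.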
This is the query:
select top 10 product , gain [gain%]
from
(
SELECT curr.Product, 100.0 * (curr.Sales - prev.Sales) / prev.Sales gain -- 100.0 avoids integer division
from
(select weeknumber, product, sales from ProductInfo where weeknumber = 25 ) prev
JOIN
(select weeknumber, product, sales from ProductInfo where weeknumber = 26 ) curr
on prev.product = curr.product
where prev.Sales > 0 and curr.Sales > 0
)A
order by gain desc

SQL query to identify seasonal sales items

I need a SQL query that will identify seasonal sales items.
My table has the following structure -
ProdId WeekEnd Sales
234 23/04/09 543.23
234 30/04/09 12.43
432 23/04/09 0.00
etc
I need a SQL query that will return all ProdId's that have 26 weeks consecutive 0 sales. I am running SQL server 2005. Many thanks!
Update: A colleague has suggested a solution using rank() - I'm looking at it now...
Here's my version:
DECLARE @NumWeeks int
SET @NumWeeks = 26
SELECT s1.ProdID, s1.WeekEnd, COUNT(*) AS ZeroCount
FROM Sales s1
INNER JOIN Sales s2
ON s2.ProdID = s1.ProdID
AND s2.WeekEnd >= s1.WeekEnd
AND s2.WeekEnd <= DATEADD(WEEK, @NumWeeks + 1, s1.WeekEnd)
AND s2.Sales = 0 -- count only the zero-sales weeks inside the window
WHERE s1.Sales > 0
GROUP BY s1.ProdID, s1.WeekEnd
HAVING COUNT(*) >= @NumWeeks
Now, this is making a critical assumption, namely that there are no duplicate entries (only 1 per product per week) and that new data is actually entered every week. With these assumptions taken into account, if we look at the 27 weeks after a non-zero sales week and find that there were 26 total weeks with zero sales, then we can deduce logically that they had to be 26 consecutive weeks.
Note that this will ignore products that had zero sales from the start; there has to be a non-zero week to anchor it. If you want to include products that had no sales since the beginning, then add the following line after `WHERE s1.Sales > 0`:
OR s1.WeekEnd = (SELECT MIN(WeekEnd) FROM Sales WHERE ProdID = s1.ProdID)
This will slow the query down a lot but guarantees that the first week of "recorded" sales will always be taken into account.
SELECT DISTINCT
s1.ProdId
FROM (
SELECT
ProdId,
ROW_NUMBER() OVER (PARTITION BY ProdId ORDER BY WeekEnd) AS rownum,
WeekEnd
FROM Sales
WHERE Sales <> 0
) s1
INNER JOIN (
SELECT
ProdId,
ROW_NUMBER() OVER (PARTITION BY ProdId ORDER BY WeekEnd) AS rownum,
WeekEnd
FROM Sales
WHERE Sales <> 0
) s2
ON s1.ProdId = s2.ProdId
AND s1.rownum + 1 = s2.rownum
AND s2.WeekEnd >= DateAdd(WEEK, 27, s1.WeekEnd); -- a gap of 27+ weeks between non-zero weeks leaves at least 26 zero weeks in between
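Another way to look for runs of consecutive zero-sales weeks is the "gaps and islands" trick sketched below. It is neither of the two answers above; it is a swapped-in technique shown for comparison. The sketch assumes SQLite 3.25 or newer through Python's sqlite3 module, ISO date strings, exactly one row per product per week, and zero-sales weeks that are actually recorded as rows. The table weekly_sales, its column names, and the function products_with_zero_run are hypothetical, and the example uses a threshold of 3 weeks instead of 26 to keep the data small.

```python
import sqlite3


def products_with_zero_run(sales, min_weeks):
    """Return product ids that have at least `min_weeks` consecutive zero weeks.

    `sales` is a list of (prod_id, week_end, sales_amount) tuples with
    week_end as an ISO date string, one row per product per week (assumed).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE weekly_sales (prod_id INTEGER, week_end TEXT, sales_amount REAL)"
    )
    conn.executemany("INSERT INTO weekly_sales VALUES (?, ?, ?)", sales)
    rows = conn.execute(
        """
        WITH zero_weeks AS (
            SELECT prod_id,
                   -- Within an unbroken run of zero weeks the date advances by
                   -- 7 days while the row number advances by 1, so this
                   -- difference stays constant and identifies the run.
                   julianday(week_end)
                     - 7 * ROW_NUMBER() OVER (
                           PARTITION BY prod_id ORDER BY week_end
                       ) AS run_key
            FROM weekly_sales
            WHERE sales_amount = 0
        )
        SELECT DISTINCT prod_id
        FROM zero_weeks
        GROUP BY prod_id, run_key
        HAVING COUNT(*) >= ?
        ORDER BY prod_id
        """,
        (min_weeks,),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


weeks = ["2009-04-23", "2009-04-30", "2009-05-07", "2009-05-14", "2009-05-21"]
# Product 234 has three zero weeks in a row; product 432 has two runs of two.
sample_sales = (
    [(234, w, s) for w, s in zip(weeks, [543.23, 0, 0, 0, 12.43])]
    + [(432, w, s) for w, s in zip(weeks, [0, 0, 5.0, 0, 0])]
)

print(products_with_zero_run(sample_sales, min_weeks=3))
```

Unlike an approach anchored on non-zero weeks, this grouping also picks up runs at the very start or end of a product's history, since it looks only at the zero rows themselves.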