SQL Server running balance with Partition by month - sql

I have the following scenario where a user has some allowance taken every month up to a yearly capping.
I have successfully implemented this as shown here
However, I have run into a problem: if the user gets promoted during the year, the yearly capping needs to be prorated accordingly.
The following query gave these results. (Using sql server 2012)
SELECT *,
RemainingBalance = AnnualCapping - Sum(amount)
OVER (
partition BY userid, year, annualcapping
ORDER BY userid, year, month)
FROM exampleTx
WHERE userid = 1
AND year = 2015
data
userId year month monthname name surname annualCapping amount RemainingBalance
1 2015 1 January Joe Black 500,00 40,00 460,00
1 2015 2 February Joe Black 500,00 40,00 420,00
1 2015 3 March Joe Black 500,00 40,00 380,00
1 2015 4 April Joe Black 500,00 40,00 340,00
1 2015 5 May Joe Black 500,00 40,00 300,00
1 2015 6 June Joe Black 500,00 40,00 260,00
1 2015 7 July Joe Black 500,00 40,00 220,00
1 2015 8 August Joe Black 500,00 40,00 180,00
1 2015 9 September Joe Black 1000,00 40,00 **960,00**
1 2015 10 October Joe Black 1000,00 40,00 **920,00**
1 2015 11 November Joe Black 1000,00 40,00 **880,00**
1 2015 12 December Joe Black 1000,00 40,00 **840,00**
In September, the new capping should have been prorated to the remainder of the year:
4 months = 1000 * 4/12 = 333.33
giving remaining balances of 293.33, 253.33, 213.33, and 173.33.
Can I achieve this without modifying the annual capping field? It would have been simpler if the annual capping were reduced to 333.33, but this is the data I have.
A change in capping from the previous month indicates that a promotion has taken place. It can occur in any month, so the new capping should be prorated accordingly.

You could use the following query:
Select *,
RemainingBalance = AnnualCapping - SUM(amount) OVER (
partition by userid ,year ORDER BY userid, year,month)
from
exampleTx
where userid = 1 and year = 2015
Remove the Annual Capping from the Partition by clause.
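Removing annualcapping from the partition keeps the running sum going across the promotion, but it does not prorate the new capping by itself (September would come out as 1000 - 360 = 640). A minimal sketch of one way to get the figures the question asks for is to keep the partition on annualcapping and scale each capping by the months left in the year from the first month in which it applies. It runs against an in-memory SQLite database from Python purely so it can be executed; the table layout is reduced to the columns needed, and it assumes a capping value never returns to an earlier value within the same year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exampleTx "
    "(userId INT, year INT, month INT, annualCapping REAL, amount REAL)"
)
# Sample data from the question: capping 500 until August, 1000 from September.
rows = [(1, 2015, m, 500.0 if m <= 8 else 1000.0, 40.0) for m in range(1, 13)]
conn.executemany("INSERT INTO exampleTx VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT month,
       -- capping scaled by the months remaining from its first month ...
       annualCapping
         * (13 - MIN(month) OVER (PARTITION BY userId, year, annualCapping))
         / 12.0
       -- ... minus what has been taken since that capping started
       - SUM(amount) OVER (PARTITION BY userId, year, annualCapping
                           ORDER BY month) AS RemainingBalance
FROM exampleTx
WHERE userId = 1 AND year = 2015
ORDER BY month
"""
balances = [(month, round(balance, 2)) for month, balance in conn.execute(query)]
for month, balance in balances:
    print(month, balance)
```

For January to August the factor is 12/12, so the balances stay at 460 down to 180; from September the capping becomes 1000 * 4/12 and the balances are 293.33, 253.33, 213.33 and 173.33. Both window expressions use only MIN and SUM with OVER, so the same idea should carry over to SQL Server 2012, though that has not been checked here.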

Related

Produce weekly and quarterly stats from a monthly figure

I have a sample of a table as below:
Customer Ref
Bear Rate
Distance
Month
Revenue
ABA-IFNL-001
1000
01/01/2022
-135
ABA-IFNL-001
1000
01/02/2022
-135
ABA-IFNL-001
1000
01/03/2022
-135
ABA-IFNL-001
1000
01/04/2022
-135
ABA-IFNL-001
1000
01/05/2022
-135
ABA-IFNL-001
1000
01/06/2022
-135
I also have a sample of a calendar table as below:
Date        Year  Week  Quarter  WeekDay  Qtr Start   Qtr End     Week Day
04/11/2022  2022  45    4        Fri      30/09/2022  29/12/2022  1
05/11/2022  2022  45    4        Sat      30/09/2022  29/12/2022  2
06/11/2022  2022  45    4        Sun      30/09/2022  29/12/2022  3
07/11/2022  2022  45    4        Mon      30/09/2022  29/12/2022  4
08/11/2022  2022  45    4        Tue      30/09/2022  29/12/2022  5
09/11/2022  2022  45    4        Wed      30/09/2022  29/12/2022  6
10/11/2022  2022  45    4        Thu      30/09/2022  29/12/2022  7
11/11/2022  2022  46    4        Fri      30/09/2022  29/12/2022  1
12/11/2022  2022  46    4        Sat      30/09/2022  29/12/2022  2
13/11/2022  2022  46    4        Sun      30/09/2022  29/12/2022  3
14/11/2022  2022  46    4        Mon      30/09/2022  29/12/2022  4
15/11/2022  2022  46    4        Tue      30/09/2022  29/12/2022  5
16/11/2022  2022  46    4        Wed      30/09/2022  29/12/2022  6
17/11/2022  2022  46    4        Thu      30/09/2022  29/12/2022  7
How can I join/link the tables to report on revenue over weekly and quarterly periods using the calendar table? I can put into two tables if needed as an output eg:
Quarter Starting  31/12/2021  01/04/2022  01/07/2022  30/09/2022
Quarter           1           2           3           4
Revenue           500         400         540         540
Week Date Start   31/12/2021  07/01/2022  14/01/2022  21/01/2022
Week              41          42          43          44
Revenue           33.75       33.75       33.75       33.75
I am using Alteryx for this, but wouldn't mind an explanation of possible logic in SQL that I could apply in the system.
Thanks
Before I get into the answer: you're going to have a data integrity issue. All the revenue data is aggregated at a monthly level, while your quarters start and end on some day within a month.
For example, Q4 starts September 30th (Friday) and ends December 29th (Thursday). You may have a day or two that bleeds from another month into the quarter, which might throw off the data a bit (especially if there's a large amount of revenue during the days that bleed into a quarter).
Additionally, because your revenue is aggregated at a monthly level, a weekly calculation doesn't make much sense unless you have more granular data (weekly, or ideally daily), since you'll probably just be dividing the monthly revenue by 4.
That being said, you'll want to use a cross tab feature in Alteryx to get the data how you want it. But before you do that, aggregate your data at a quarterly level.
You can do this with an if statement or some other data cleansing tool (sorry, it's been a while since I used Alteryx). Something like:
# Pseudo code - this won't actually work!
# For determining quarter
if (month) between (30/09/2022,29/12/2022) then 4
where you can derive the logic from your calendar table. Then once you have the quarter, you can join in the Quarter Start date based on your quarter calculation.
Now you have a nice clean table that might look something like this:
Month       Revenue  Quarter  Quarter Start Date
01/01/2022  -135     4        30/09/2022
01/01/2022  -135     4        30/09/2022
Aggregate on your quarter to get a cleaner table:
Quarter Start Date  Quarter  Revenue
30/09/2022          4        300
Then use cross tab, where you pivot on the Quarter Start Date.
For SQL, you'd be pivoting the data: essentially, taking the values from rows of data and converting them into columns. It will look a bit janky because the data is so customized, but here's a good question that goes over pivoting - Simple way to transpose columns and rows in SQL?
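As a rough illustration of the join logic in SQL, one option is to spread each monthly figure evenly over the days of its month, join those days to the calendar table, and then sum per week or per quarter. The sketch below runs the SQL against an in-memory SQLite database from Python so it can be executed. The even daily split is an assumption, not something the data supports; the table and column names (revenue, calendar, month_start, day) are made up for the example; and the calendar uses ISO weeks and ordinary calendar quarters as a stand-in for the custom calendar in the question.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (month_start TEXT, revenue REAL)")
conn.execute("CREATE TABLE calendar (day TEXT, week INT, quarter INT)")

# Monthly figures from the question, stored as ISO dates.
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?)",
    [(f"2022-{m:02d}-01", -135.0) for m in range(1, 7)],
)

# Stand-in calendar (assumption): ISO week numbers and calendar quarters.
d = date(2022, 1, 1)
while d <= date(2022, 6, 30):
    conn.execute(
        "INSERT INTO calendar VALUES (?, ?, ?)",
        (d.isoformat(), d.isocalendar()[1], (d.month - 1) // 3 + 1),
    )
    d += timedelta(days=1)

# One row per calendar day, carrying an equal share of its month's revenue.
daily = """
SELECT c.day, c.week, c.quarter,
       r.revenue / COUNT(*) OVER (PARTITION BY r.month_start) AS daily_revenue
FROM calendar AS c
JOIN revenue AS r
  ON strftime('%Y-%m', c.day) = strftime('%Y-%m', r.month_start)
"""

by_quarter = conn.execute(
    f"SELECT quarter, SUM(daily_revenue) FROM ({daily}) "
    "GROUP BY quarter ORDER BY quarter"
).fetchall()
by_week = conn.execute(
    f"SELECT week, SUM(daily_revenue) FROM ({daily}) "
    "GROUP BY week ORDER BY week"
).fetchall()
print(by_quarter)
print(by_week[:3])
```

With a real calendar table, the week, quarter and quarter start columns would come straight from that table, so quarters that begin on the 30th of a month pick up the matching daily shares. strftime is SQLite-specific; other databases have their own date functions for matching a day to its month.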

sql - How To Remove All Rows After 4th Occurence of Column Combination in postgresql

I have a sql query that results in a table similar to the following after grouping by name, quarter, year and ordering by year DESC, quarter DESC:
name    count  quarter  year
orange  22     4        2022
apple   1      4        2022
banana  123    3        2022
pie     93     2        2022
apple   12     2        2022
orange  0      1        2022
apple   900    4        2021
...     ...    ...      ...
I want to remove any rows that come after the 4th unique combination of quarter and year is reached (for the table above this would be any rows after the last combination of quarter 1, year 2022), like so:
name    count  quarter  year
orange  22     4        2022
apple   1      4        2022
banana  123    3        2022
pie     93     2        2022
apple   12     2        2022
orange  0      1        2022
I am using Postgres 6.10.
If the next year were reached, it would still need to work with the quarter at the top being 1 and the year 2023.
select name
,count
,quarter
,year
from
(
select *
,dense_rank() over(order by year desc, quarter desc) as dns_rnk
from t
) t
where dns_rnk <= 4
name    count  quarter  year
orange  22     4        2022
apple   1      4        2022
banana  123    3        2022
pie     93     2        2022
apple   12     2        2022
orange  0      1        2022
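The query works because DENSE_RANK gives every row with the same (year, quarter) pair the same rank and does not skip numbers after ties, so the rank counts distinct quarters from the newest one down. A small sketch that prints the ranks for the sample rows, run against an in-memory SQLite database from Python rather than Postgres (an assumption made only so the example can be executed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (name TEXT, "count" INT, quarter INT, year INT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("orange", 22, 4, 2022), ("apple", 1, 4, 2022), ("banana", 123, 3, 2022),
    ("pie", 93, 2, 2022), ("apple", 12, 2, 2022), ("orange", 0, 1, 2022),
    ("apple", 900, 4, 2021),
])

ranked = conn.execute("""
    SELECT name, quarter, year,
           DENSE_RANK() OVER (ORDER BY year DESC, quarter DESC) AS dns_rnk
    FROM t
    ORDER BY dns_rnk
""").fetchall()

for row in ranked:
    print(row)

# Keeping ranks 1 to 4 keeps the four most recent quarters, however many rows each has.
kept = [row for row in ranked if row[3] <= 4]
```

The 2021 row gets rank 5 and is dropped. Once rows for quarter 1 of 2023 exist, they take rank 1 and everything else shifts down by one, which matches the requirement about the next year.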

Filter Data by Date

I have a question. I'm currently developing a payslip application, but I'm stuck on one part of the process. I've managed to display the salary, but I need to filter by date to control when a salary is displayed.
For example, for this September, the company has already keyed in the salary on 26th Sep, but the user should only be able to see it from 28th Sep onwards. So, basically, the program can show previous months' payslips, but not September's until the user views it on or after 28th Sep.
Current Output :
EMPLOYEEID MONTH YEAR SALARY
E001 7 2017 2000
E001 8 2017 2000
E001 9 2017 2000
E002 7 2017 2100
E002 8 2017 2100
E002 9 2017 2100
Expected output:
EMPLOYEEID MONTH YEAR SALARY
E001 7 2017 2000
E001 8 2017 2000
E002 7 2017 2100
E002 8 2017 2100
Current Query Progress :
SELECT EMPLOYEEID, MONTH, YEAR, SALARY
FROM DBO.Salary
WHERE day(getdate())>=28
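Today is 2017-09-29, so September should appear. Add an OR for when it's before the 28th, and compare the year as well so that earlier years stay visible:
select *
from dbo.Salary s1
where s1.Year < year(getdate())             -- earlier years are always visible
or (s1.Year = year(getdate())
    and (s1.Month < month(getdate())        -- earlier months of this year
         or (s1.Month = month(getdate())
             and day(getdate()) >= 28)))    -- current month only from the 28th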
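To check the visibility rule against a few dates without waiting for the calendar, the current date can be passed in as a parameter instead of calling getdate(). The sketch below does that against an in-memory SQLite database from Python; the function name visible_payslips is made up for the example, and the rule it encodes (earlier years and months always visible, the current month only from the 28th) is one reading of the requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Salary (EmployeeId TEXT, Month INT, Year INT, Salary INT)"
)
conn.executemany(
    "INSERT INTO Salary VALUES (?, ?, ?, ?)",
    [(e, m, 2017, s)
     for e, s in (("E001", 2000), ("E002", 2100))
     for m in (7, 8, 9)],
)

def visible_payslips(today):
    """Return (EmployeeId, Month, Year) rows visible on `today` ('YYYY-MM-DD')."""
    y, m, d = (int(part) for part in today.split("-"))
    return conn.execute(
        """
        SELECT EmployeeId, Month, Year
        FROM Salary
        WHERE Year < :y
           OR (Year = :y AND (Month < :m OR (Month = :m AND :d >= 28)))
        ORDER BY EmployeeId, Year, Month
        """,
        {"y": y, "m": m, "d": d},
    ).fetchall()

print(visible_payslips("2017-09-27"))  # September still hidden
print(visible_payslips("2017-09-29"))  # September visible
print(visible_payslips("2018-01-05"))  # all of 2017 visible
```

Comparing the year first matters in January: a check on the month number alone would hide the previous December until the 28th.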

How can I calculate daily snapshots of my total sales on SQL?

I have a table (let's call it DiodeSales) that tells me the total number of diode sales I made, grouped by date, diode color, and country. This is a sample of this schema:
Date Color Country Sales
June, 20 2016 00:00:00 Green US 1
June, 20 2016 00:00:00 Red Japan 1
June, 20 2016 00:00:00 Red US 1
June, 21 2016 00:00:00 Red US 1
June, 22 2016 00:00:00 Green US 1
June, 22 2016 00:00:00 Red US 1
June, 23 2016 00:00:00 Green US 1
June, 23 2016 00:00:00 Red Japan 1
June, 23 2016 00:00:00 Red US 1
June, 24 2016 00:00:00 Green US 1
June, 24 2016 00:00:00 Red US 1
I want to have an additional column that tells me how many diodes we've sold up until that point. So, for example, using the above data, the {June 23, Red, 1, US} row would have a total sales value of 4, because we've sold 4 red diodes in the US at that point.
I initially thought a cumulative sum would do the trick. So I wrote this: (sqlfiddle here)
SELECT
t1.Date,
t1.Color,
t1.Country,
t1.Sales,
SUM(t2.Sales) AS CumulativeSales
FROM DiodeSales AS t1
INNER JOIN DiodeSales AS t2
ON t1.Date >= t2.Date
AND t1.Color = t2.Color
AND t1.Country = t2.Country
GROUP BY
t1.Date,
t1.Color,
t1.Country
This gives me the cumulative sum, as expected, but it does not give me the total sales for a given color in a given country on a given day. In particular, because some specific days may have 0 sales in some country, they will not have a cumulative value associated to it. For example, consider the results of the previous table:
Date Color Country Sales CumulativeSales
June, 20 2016 00:00:00 Green US 1 1
June, 20 2016 00:00:00 Red Japan 1 1
June, 20 2016 00:00:00 Red US 1 1
June, 21 2016 00:00:00 Red US 1 2
June, 22 2016 00:00:00 Green US 1 2
June, 22 2016 00:00:00 Red US 1 3
June, 23 2016 00:00:00 Green US 1 3
June, 23 2016 00:00:00 Red Japan 1 2
June, 23 2016 00:00:00 Red US 1 4
June, 24 2016 00:00:00 Green US 1 4
June, 24 2016 00:00:00 Red US 1 5
If I were to look for the row corresponding to Japan on June 24, I'd find nothing (because there was no Japan sale that day, so there is no Japan row for that day). I don't think there's a way to do this in SQL, but is it possible to populate this resulting table with values on days in which some countries had no sales? The starting table will always have at least one row for each day for some country.
I am aware I could just write a simple
SELECT SUM(Sales) FROM DiodeSales
WHERE Date <= #someDate AND Color = #someColor AND Country = #someCountry
to get this information, but this is for a table that has to be formatted in that way for it to be used by another piece of already-made software.
EDIT: Someone mentioned this as a potential duplicate of Calculate a Running Total in SQL Server, but that post only addresses efficiency while calculating a running sum. I already have various ways of calculating this sum, but I'm looking for a way to fix the issue of missing day/country combinations for days when there were no sales in that country. For the above example, the fixed query would return this:
Date Color Country Sales CumulativeSales
June, 20 2016 00:00:00 Green US 1 1
June, 20 2016 00:00:00 Red Japan 1 1
June, 20 2016 00:00:00 Red US 1 1
June, 21 2016 00:00:00 Green US 0 1
June, 21 2016 00:00:00 Red Japan 0 1
June, 21 2016 00:00:00 Red US 1 2
June, 22 2016 00:00:00 Green US 1 2
June, 22 2016 00:00:00 Red Japan 0 1
June, 22 2016 00:00:00 Red US 1 3
June, 23 2016 00:00:00 Green US 1 3
June, 23 2016 00:00:00 Red Japan 1 2
June, 23 2016 00:00:00 Red US 1 4
June, 24 2016 00:00:00 Green US 1 4
June, 24 2016 00:00:00 Red Japan 0 2
June, 24 2016 00:00:00 Red US 1 5
Try this:
SELECT [Date], Color, Country, Sales,
SUM(Sales) OVER(PARTITION BY Color, Country ORDER BY [Date] rows unbounded preceding) as RunningTotal
FROM YourTable
ORDER BY [Date], Color
It produces the output as expected.
[EDIT]
If you're looking for a solution for missing dates, countries and colors, try this (replace #tmp with the name of your table):
SELECT A.[Date], A.Color, A.Country, COALESCE(B.Sales, 0) AS Sales
, SUM(COALESCE(B.Sales, 0)) OVER(PARTITION BY A.Color, A.Country ORDER BY A.[Date] rows unbounded preceding) as RunningTotal
FROM (
SELECT [Date], Color, Country
FROM (SELECT DISTINCT [Date] FROM #tmp) AS q1 CROSS JOIN
(SELECT DISTINCT Color FROM #tmp) AS q2 CROSS JOIN
(SELECT DISTINCT Country FROM #tmp) AS q3
) AS A
LEFT JOIN #tmp AS B ON A.[Date] = B.[Date] AND A.Color= B.Color AND A.Country = B.Country
ORDER BY A.[Date], A.Color
Above query produces:
Date Color Country Sales RunningTotal
2016-06-20 Green Japan 0 0
2016-06-20 Green US 1 1
2016-06-20 Red Japan 1 1
2016-06-20 Red US 1 1
2016-06-21 Green US 0 1
2016-06-21 Green Japan 0 0
2016-06-21 Red US 1 2
2016-06-21 Red Japan 0 1
2016-06-22 Green Japan 0 0
2016-06-22 Green US 1 2
2016-06-22 Red Japan 0 1
2016-06-22 Red US 1 3
2016-06-23 Green US 1 3
2016-06-23 Green Japan 0 0
2016-06-23 Red US 1 4
2016-06-23 Red Japan 1 2
2016-06-24 Green Japan 0 0
2016-06-24 Green US 1 4
2016-06-24 Red Japan 0 2
2016-06-24 Red US 1 5
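The three-way cross join also produces Green/Japan rows, a combination that never occurs in the data. If only the color and country pairs that actually exist should appear, as in the expected output in the question, a possible variant is to cross join the distinct dates with the distinct (Color, Country) pairs instead. A minimal sketch, run against an in-memory SQLite database from Python so it can be executed, and assuming at most one row per date, color and country as in the sample:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DiodeSales (Date TEXT, Color TEXT, Country TEXT, Sales INT)"
)
# Sample data from the question; every row has Sales = 1.
sample = {
    "2016-06-20": [("Green", "US"), ("Red", "Japan"), ("Red", "US")],
    "2016-06-21": [("Red", "US")],
    "2016-06-22": [("Green", "US"), ("Red", "US")],
    "2016-06-23": [("Green", "US"), ("Red", "Japan"), ("Red", "US")],
    "2016-06-24": [("Green", "US"), ("Red", "US")],
}
conn.executemany(
    "INSERT INTO DiodeSales VALUES (?, ?, ?, 1)",
    [(day, color, country)
     for day, pairs in sample.items()
     for color, country in pairs],
)

rows = conn.execute("""
    SELECT d.Date, p.Color, p.Country,
           COALESCE(s.Sales, 0) AS Sales,
           SUM(COALESCE(s.Sales, 0)) OVER (
               PARTITION BY p.Color, p.Country
               ORDER BY d.Date
               ROWS UNBOUNDED PRECEDING) AS CumulativeSales
    FROM (SELECT DISTINCT Date FROM DiodeSales) AS d
    CROSS JOIN (SELECT DISTINCT Color, Country FROM DiodeSales) AS p
    LEFT JOIN DiodeSales AS s
           ON s.Date = d.Date AND s.Color = p.Color AND s.Country = p.Country
    ORDER BY d.Date, p.Color, p.Country
""").fetchall()

for row in rows:
    print(row)
```

This yields 15 rows (5 dates times 3 pairs), including June 24 / Red / Japan with 0 sales and a cumulative total of 2.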
I think you should use a left join instead of an inner join:
SELECT
t.Date,
t.Color,
t.Country,
c.CumulativeSales
FROM DiodeSales t
LEFT JOIN
(SELECT
t1.Date,
t1.Color,
t1.Country,
SUM(t2.Sales) AS CumulativeSales
FROM DiodeSales AS t1
INNER JOIN DiodeSales AS t2 -- self-join that the running sum needs
ON t1.Date >= t2.Date
AND t1.Color = t2.Color
AND t1.Country = t2.Country
GROUP BY
t1.Date,
t1.Color,
t1.Country) c
ON t.Date = c.Date
AND t.Color = c.Color
AND t.Country = c.Country
Try this:
SELECT DISTINCT Date INTO SalesDate FROM DiodeSales;
SELECT S.Date, t.Color, t.Country, c.CumulativeSales
FROM DiodeSales t
JOIN SalesDate S -- joined first so S can be used in the ON clause below
ON t.Date = S.Date
LEFT JOIN
(SELECT t1.Date, t1.Color, t1.Country,
SUM(t2.Sales) AS CumulativeSales
FROM DiodeSales AS t1
INNER JOIN DiodeSales AS t2
ON t1.Date >= t2.Date
AND t1.Color = t2.Color
AND t1.Country = t2.Country
GROUP BY
t1.Date,
t1.Color,
t1.Country) c
ON S.Date = c.Date
AND t.Color = c.Color
AND t.Country = c.Country

MS Access selecting by year intervals

I have a table where every row has its own date (year of purchase). I need to select the purchases grouped into year intervals.
Example:
Zetor 1993
Zetor 1993
JOHN DEERE 2001
JOHN DEERE 2001
JOHN DEERE 2001
This means I have 2 Zetor purchases in 1993 and 3 John Deere purchases in 2001. I need to select the count of the purchases grouped into these year intervals:
<=1959
1960-1969
1970-1979
1980-1989
1990-1994
1995-1999
2000-2004
2004-2009
2010-2013
I have no idea how should I do this.
The result should look like this on the example above:
<=1959 0
1960-1969 0
1970-1979 0
1980-1989 0
1990-1994 2
1995-1999 0
2000-2004 3
2004-2009 0
2010-2013 0
Create a table of intervals:
tblRanges([RangeName],[Begins],[Ends])
Populate it with your intervals.
Use GROUP BY with your table tblPurchases([Item],[YearOfDeal]):
SELECT tblRanges.RangeName, Count(tblPurchases.YearOfDeal)
FROM tblRanges INNER JOIN tblPurchases ON (tblRanges.Begins <= tblPurchases.YearOfDeal) AND (tblRanges.Ends >= tblPurchases.YearOfDeal)
GROUP BY tblRanges.RangeName;
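With an INNER JOIN, ranges that contain no purchases drop out of the result, whereas the expected output in the question lists them with a count of 0. A possible variant uses a LEFT JOIN from the ranges table so that every range is kept. The sketch below runs against an in-memory SQLite database from Python so it can be executed; it has not been verified against Access, whose SQL dialect differs, and it assumes the 2004-2009 interval in the question was meant to be 2005-2009 so that the ranges do not overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblRanges (RangeName TEXT, Begins INT, Ends INT)")
conn.execute("CREATE TABLE tblPurchases (Item TEXT, YearOfDeal INT)")
conn.executemany("INSERT INTO tblRanges VALUES (?, ?, ?)", [
    ("<=1959", 0, 1959), ("1960-1969", 1960, 1969), ("1970-1979", 1970, 1979),
    ("1980-1989", 1980, 1989), ("1990-1994", 1990, 1994),
    ("1995-1999", 1995, 1999), ("2000-2004", 2000, 2004),
    ("2005-2009", 2005, 2009),  # assumption: the question lists 2004-2009
    ("2010-2013", 2010, 2013),
])
conn.executemany(
    "INSERT INTO tblPurchases VALUES (?, ?)",
    [("Zetor", 1993)] * 2 + [("JOHN DEERE", 2001)] * 3,
)

counts = conn.execute("""
    SELECT r.RangeName, COUNT(p.YearOfDeal) AS Purchases
    FROM tblRanges AS r
    LEFT JOIN tblPurchases AS p
           ON p.YearOfDeal BETWEEN r.Begins AND r.Ends
    GROUP BY r.RangeName, r.Begins
    ORDER BY r.Begins
""").fetchall()

for name, purchases in counts:
    print(name, purchases)
```

COUNT(p.YearOfDeal) counts only matched rows, so a range without purchases shows 0 instead of disappearing.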
You may wish to consider Partition for future use:
SELECT Partition([Year],1960,2014,10) AS [Group], Count(Stock.Year) AS CountOfYear
FROM Stock
GROUP BY Partition([Year],1960,2014,10)
Input:
Tractor Year
Zetor 1993
Zetor 1993
JOHN DEERE 2001
JOHN DEERE 2001
JOHN DEERE 2001
Pre 59 1945
1960 1960
Result:
Group CountOfYear
:1959 1
1960:1969 1
1990:1999 2
2000:2009 3
Reference: http://office.microsoft.com/en-ie/access-help/partition-function-HA001228892.aspx