Running Total for Current & Previous Year on weekly basis - sql

This query returns year-to-date numbers for both Price and Square Feet for the current year and for the previous year up to the same point. In effect it is a running total for the current year and the previous year by week, from week 1 through week 7 and so on (week 7 of 2017 ended on 02/19/2017; week 7 of 2016 ended on 02/22/2016). I am using subqueries because that is the only way I know to handle this. If you think there is a shorter, viable alternative to this query, please advise.
Actual_Sale_Date holds data for all seven days of the week, but we cut off on Sunday; that is why the dates are 2/22/2016 (the Sunday ending week 7 of 2016) and 2/19/2017 (the Sunday ending week 7 of 2017).
I tried "Actual_Sale_Date" = date_trunc('week', now())::date - 1, but this only returns the previous week's data, ending on the past Sunday. I also looked at interval, since dateadd does not exist in PostgreSQL, but could not work out how to use it.
My query:
select (money(Sum("Price") / COUNT("Price"))) as "Avg_Value YTD",
Round(Avg("Price"/"Sq_Ft"),+2) as "Avg_PPSF YTD",
(select
(money(Sum("Price") / COUNT("Price"))) from allsalesdata
where "Actual_Sale_Date" >= '01/01/2016' AND "Actual_Sale_Date" < '02/22/2016'
and "Work_ID" = 'SO') AS "Last Year at this time Avg_Value",
(select Round(Avg("Price"/"Sq_Ft"),+2)
from allsalesdata
where "Actual_Sale_Date" >= '01/01/2016' AND "Actual_Sale_Date" < '02/22/2016'
and "Work_ID" = 'SO') AS "Last Year at this time Avg_PPSF"
from allsalesdata
where "Actual_Sale_Date" >= '01/01/2017' AND "Actual_Sale_Date" <'02/20/2017'
and "Work_ID" = 'SO'
Sample Data:
Price Sq_Ft Actual_Sale_Date Work_ID
45871 3583 01/15/2016 SO
55874 4457 02/05/2016 SO
88745 4788 02/20/2016 SO
58745 1459 01/10/2016 SO
88749 2145 01/25/2017 SO
74856 1478 01/25/2017 SO
74586 4587 01/31/2017 ABC
74745 1142 02/10/2017 SO
74589 2214 02/19/2017 SO

This should be what you need (assuming you have a recent version of PG):
SELECT DISTINCT wk AS "Week",
-- the cast must wrap the whole window expression; NULLIF avoids division by zero
(sum("Price") FILTER (WHERE yr = 2017) OVER w)::money /
NULLIF(count("Price") FILTER (WHERE yr = 2017) OVER w, 0) AS "Avg_Value YTD",
(sum("Price") FILTER (WHERE yr = 2017) OVER w)::money /
sum("Sq_Ft") FILTER (WHERE yr = 2017) OVER w AS "Avg_PPSF YTD",
(sum("Price") FILTER (WHERE yr = 2016) OVER w)::money /
NULLIF(count("Price") FILTER (WHERE yr = 2016) OVER w, 0) AS "Last Year this time Avg_Value",
(sum("Price") FILTER (WHERE yr = 2016) OVER w)::money /
sum("Sq_Ft") FILTER (WHERE yr = 2016) OVER w AS "Last Year this time Avg_PPSF"
FROM (
SELECT extract(isoyear from "Actual_Sale_Date")::integer AS yr,
extract(week from "Actual_Sale_Date")::integer AS wk,
"Price", "Sq_Ft"
FROM allsalesdata
WHERE "Work_ID" = 'SO') sub
-- optional, show only completed weeks in this year:
WHERE wk <= extract(week from CURRENT_DATE)::integer - 1
WINDOW w AS (ORDER BY wk)
ORDER BY wk;
In the inner query the year and week of the sale date are extracted for every sale. The week starts on Monday, as per your requirement.
In the main query these rows are processed as a single partition frame, i.e. from the start of the partition (= first row) to the last peer of the current row. Since the window definition orders the rows by wk, all rows from the start (week = 1) to the current week are included in the summarization. This will give you the running total. The sum() and count() functions filter by the year in question and the DISTINCT clause ensures that you get only a single row per week.
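The running-total logic described above can also be checked outside the database. The following Python is only a minimal sketch, assuming the question's sample rows and ISO weeks (Monday start); the function name running_avg_by_week is made up for this illustration and it is not a translation of the query itself.

```python
from datetime import date

# Sample rows from the question: (Price, Sq_Ft, Actual_Sale_Date, Work_ID)
sales = [
    (45871, 3583, date(2016, 1, 15), "SO"),
    (55874, 4457, date(2016, 2, 5), "SO"),
    (88745, 4788, date(2016, 2, 20), "SO"),
    (58745, 1459, date(2016, 1, 10), "SO"),
    (88749, 2145, date(2017, 1, 25), "SO"),
    (74856, 1478, date(2017, 1, 25), "SO"),
    (74586, 4587, date(2017, 1, 31), "ABC"),
    (74745, 1142, date(2017, 2, 10), "SO"),
    (74589, 2214, date(2017, 2, 19), "SO"),
]


def running_avg_by_week(rows, year, work_id="SO"):
    """Map each ISO week to the average Price of all sales in weeks 1..week."""
    per_week = {}
    for price, _sq_ft, sold, wid in rows:
        iso_year, iso_week, _ = sold.isocalendar()
        if wid == work_id and iso_year == year:
            total, n = per_week.get(iso_week, (0, 0))
            per_week[iso_week] = (total + price, n + 1)

    result = {}
    total = n = 0
    for week in range(1, 54):
        week_total, week_n = per_week.get(week, (0, 0))
        total += week_total
        n += week_n
        # None plays the role of SQL NULL when no sale has happened yet
        result[week] = total / n if n else None
    return result


this_year = running_avg_by_week(sales, 2017)
last_year = running_avg_by_week(sales, 2016)
for week in range(1, 8):
    print(week, this_year[week], last_year[week])
```

Each printed row corresponds to one week of the query's output: the first value is the current-year average to date, the second is the previous-year average up to the same week.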

Related

Get consecutive months and days difference from date range?

So let's say I have a table like this:
subscriber_id package_id package_start_date package_end_date package_price_per_day
1081 231 2014-01-13 2014-12-31 $3
1084 231 2014-03-21 2014-06-05 $3
1086 235 2014-06-21 2014-09-09 $4
Now I want the top 3 packages by total revenue for each month of 2014.
Note: for package 231, for example, revenue should be calculated as 18 days of Jan * $3 + 28 days of Feb * $3 + ... and so on.
For the second row the calculation would be the same as for the first row (9 days of March * $3 + 30 days of April * $3 ...).
The result should be grouped by month and show each package's rank by total revenue.
Sample result:
Month Package_id Revenue Rank
Jan 231 69499 1
Jan 235 34345 2
Jan 238 23455 3
Feb 231 89274 1
I wrote a query to filter the dates so that I get the subscribers active during 2014 (initially there were values from different years). It produces the first table in the question, but I am not sure how to break the ranges into months and days afterwards.
select subscriber_id, package_id, package_start_date, package_end_date, package_price_per_day
from (
select subscriber_id, package_id
-- clamp the start date to the first day of 2014
, case when year(package_start_date) < 2014 then cast('2014-01-01' as date) else package_start_date end as package_start_date
-- clamp the end date to the last day of 2014
, case when year(package_end_date) > 2014 then cast('2014-12-31' as date) else package_end_date end as package_end_date
, package_price_per_day
from subscription
) a
where year(package_start_date) = 2014 and year(package_end_date) = 2014
Please do not focus on the syntax; I am just trying to understand the logical approach in SQL.
Suppose you have a table that is a list of unique dates in a column called d, and the table is called d
It is then relatively trivial to do
SELECT *
FROM t
INNER JOIN d on d.d >= t.package_start_date AND d.d < t.package_end_date
Assuming you class a start date of jan 1 and an end date of jan 2 as 1 day. If you class as two, use <=
This will cause your package rows to multiply into the number of days, so start and end days of jan 1 and jan 11 would mean that row repeats 10 times. The d.d date is different on every row and you can extract the month from d.d and then group on it to give you totals for each month per package
Suppose you've CTEd that query above as x, it's like
SELECT DATEPART(month, x.d) AS month_no, -- the d.d date
package_id,
SUM(x.package_price_per_day) AS revenue -- one joined row per active day
FROM x
GROUP BY DATEPART(month, x.d), package_id
Because the rows from T are repeated by Cartesian explosion when joined to d, you can safely group them or aggregate them to get them back to single values per month per package. If you have packages that stay with you more than a year you should also group on datepart year, to avoid mixing up the months from packages that stay from eg jan 2020 to feb 2021(they stay for two jans and two febs)
Then all you need to do is add the ranking of the revenue in, which looks like it would go in at the first step with something like
RANK() OVER (ORDER BY DATEDIFF(DAY, package_start_date, package_end_date) * package_price_per_day DESC)
I think I understand it correctly that you rank packages on total revenue over the entire period rather than per month.. look up the difference between rank and dense rank too as you may want dense instead
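To make the "explode into days, then group" idea concrete, here is a small Python sketch. It assumes the three sample rows from the question, treats the end date as exclusive (the `<` variant of the join), and ranks per month, which is what the question's sample result shows; the function names are invented for this illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# (subscriber_id, package_id, start, end, price_per_day) from the question
subscriptions = [
    (1081, 231, date(2014, 1, 13), date(2014, 12, 31), 3),
    (1084, 231, date(2014, 3, 21), date(2014, 6, 5), 3),
    (1086, 235, date(2014, 6, 21), date(2014, 9, 9), 4),
]


def monthly_revenue(rows):
    """Sum price_per_day once for every active day, keyed by (year, month, package)."""
    revenue = defaultdict(int)
    for _subscriber, package_id, start, end, price in rows:
        day = start
        while day < end:  # end date exclusive, like d.d < package_end_date
            revenue[(day.year, day.month, package_id)] += price
            day += timedelta(days=1)
    return dict(revenue)


def rank_per_month(revenue):
    """Rank packages within each month by revenue, ties sharing a rank."""
    by_month = defaultdict(list)
    for (year, month, package_id), amount in revenue.items():
        by_month[(year, month)].append((amount, package_id))

    ranked = {}
    for key, entries in by_month.items():
        entries.sort(reverse=True)
        rows = []
        for i, (amount, package_id) in enumerate(entries):
            if i > 0 and amount == entries[i - 1][0]:
                rank = rows[-1][2]  # same revenue, same rank
            else:
                rank = i + 1
            rows.append((package_id, amount, rank))
        ranked[key] = rows
    return ranked


revenue = monthly_revenue(subscriptions)
ranked = rank_per_month(revenue)
for (year, month), rows in sorted(ranked.items()):
    print(year, month, rows)
```

The loop over days stands in for the join against the date table d, and the dictionary keyed by (year, month, package) stands in for the GROUP BY.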

Using Date to find the inequality for sales than 500

I'm trying to find the daily average sales for December 1998, keeping only the case where that average is not greater than 100, using a WHERE clause. The table holds the date of each sale (something like 1 December 1998, across different days, months and years) and the amount due. First I'm going to define a particular month.
DEFINE a = TO_DATE('1-Dec-1998', 'DD-Month-YYYY')
SELECT SUBSTR(Sales_Date, 4,6), (SUM(Amount_Due)/EXTRACT(DAY FROM LAST_DAY(Sales_Date))
FROM ......
WHERE SUM(AMOUNT_DUE)/EXTRACT(DAY FROM LAST_DAY(&a)) < 100
I'm stuck on how to get the sum of the amount due for December 1998 into the WHERE clause.
How can I achieve the objective?
To me, it looks like this:
select to_char(sales_date, 'mm.yyyy') month,
avg(amount_due) avg_value
from your_table
where sales_date >= trunc(date '1998-12-01', 'mm')
and sales_date < add_months(trunc(date '1998-12-01', 'mm'), 1)
group by to_char(sales_date, 'mm.yyyy')
having avg(amount_due) < 100;
The WHERE clause can be simplified; it is written this way to show how to fetch a certain period:
trunc to mm returns the first day of that month
add_months applied to that value (the first day of the month) returns the first day of the next month
the bottom line: give me all rows whose sales_date is >= the first day of this month and < the first day of the next month; basically, the whole month
Finally, the where clause you used should actually be the having clause.
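The same half-open month range can be illustrated in Python. This is only a sketch of the logic, not Oracle code: month_bounds mirrors trunc(..., 'mm') plus add_months(..., 1), the final comparison mirrors the HAVING clause, and both the function names and the sample rows are hypothetical.

```python
from datetime import date


def month_bounds(d):
    """Return (first day of d's month, first day of the following month)."""
    first = d.replace(day=1)
    if first.month == 12:
        next_first = date(first.year + 1, 1, 1)
    else:
        next_first = date(first.year, first.month + 1, 1)
    return first, next_first


def monthly_average_below(rows, month, limit):
    """Average amount_due for the month, or None if there are no rows or it is not below limit."""
    start, end = month_bounds(month)
    # WHERE sales_date >= start AND sales_date < end
    amounts = [amount for sold, amount in rows if start <= sold < end]
    if not amounts:
        return None
    average = sum(amounts) / len(amounts)
    # HAVING avg(amount_due) < limit
    return average if average < limit else None


# Hypothetical rows: (sales_date, amount_due)
sample_rows = [
    (date(1998, 12, 1), 50),
    (date(1998, 12, 31), 120),
    (date(1999, 1, 1), 500),
]
print(monthly_average_below(sample_rows, date(1998, 12, 1), 100))
```

Because the upper bound is exclusive, a sale on 31 December is included whatever its time of day, while one on 1 January is not.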
As long as the amount_due column contains only numbers, you can use the sum function.
Either of the SQL queries below should satisfy your requirement.
select sum(amount_due) from sales where sales_date >= date '1998-12-01' and sales_date < date '1999-01-01'
OR
select sum(amount_due) from sales where to_char(sales_date, 'mm-yyyy') = '12-1998'

SQL Server / SSRS: Calculating monthly average based on grouping and historical values

I need to calculate an average based on historical data for a graph in SSRS:
Current Month
Previous Month
2 Months ago
6 Months ago
This query returns the average for each month:
SELECT
avg_val1, month, year
FROM
(SELECT
(sum_val1 / count) as avg_val1, month, year
FROM
(SELECT
SUM(val1) AS sum_val1, SUM(count) AS count, month, year
FROM
(SELECT
COUNT(val1) AS count, SUM(val1) AS val1,
MONTH([SnapshotDate]) AS month,
YEAR([SnapshotDate]) AS year
FROM
[DC].[dbo].[KPI_Values]
WHERE
[SnapshotKey] = 'Some text here'
AND No = '001'
AND Channel = '999'
GROUP BY
[SnapshotDate]) AS sub3
GROUP BY
month, year, count) AS sub2
GROUP BY sum_val1, count, month, year) AS sub1
ORDER BY
year, month ASC
When I add the following WHERE clause I get the average for March (2 months ago):
WHERE month = MONTH(GETDATE())-2
AND year = YEAR(GETDATE())
Now the problem is when I want to retrieve data from 6 months ago; MONTH(GETDATE()) - 6 will output -1 instead of 12. I also have an issue with the fact that the year changes to 2016 and I am a bit unsure of how to implement the logic in my query.
I think I might be going about this wrong... Any suggestions?
Subtract the months from the date using the DATEADD function before you do your comparison. Ex:
WHERE SnapshotDate BETWEEN DATEADD(month, -6, GETDATE()) AND GETDATE()
MONTH(GETDATE()) returns an int, so subtracting from it can give 0 or a negative value. You need a scalar user-defined function to handle this, adding 12 (and moving back one year) when the result is <= 0.
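The wrap-around arithmetic such a function needs is small. Here is a minimal Python sketch of it, assuming months are numbered 1 to 12; the name months_ago is made up for this illustration.

```python
def months_ago(year, month, n):
    """Return the (year, month) that lies n months before the given year and month."""
    # Count months from year 0 so that the subtraction cannot go negative
    # for realistic inputs, then split back into a year and a 1-based month.
    index = year * 12 + (month - 1) - n
    new_year, zero_based_month = divmod(index, 12)
    return new_year, zero_based_month + 1


print(months_ago(2017, 5, 2))
print(months_ago(2017, 5, 6))
```

Comparing on both the returned year and month avoids the problem in the question, where only the month was adjusted and the year stayed fixed.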

Modification to date dimension in SQL Server

I need a suggestion about one of the columns I'm creating in the Date dimension in SQL Server: basically, rolling weeks.
I have a table dimDate in my data warehouse.
I want to create a column in the dimDate table that holds the week number within the year, and each week should have 7 days.
For example, in 2015 there are 53 weeks, but the 53rd week has only 5 days (because the week starts on Sunday in SQL Server, I guess).
I want to include 2 more days from 2016 (1st and 2nd Jan 2016) to complete the 53rd week with 7 days, and the 1st week of 2016 should then start on 3rd Jan 2016, and so on.
Any suggestions to get started would be great.
Assuming that you already have weeks populated (but not extended into the next year), and making some assumptions about column names:
This query finds the last week in each year (which would almost always be 53, but don't count on it) and the date that it ends on:
SELECT YearNo, MAX(Week) As Week, MAX(DateKey) As DateKey
FROM dimDate
GROUP BY YearNo
This query finds all weeks that are shorter than 7 days, and how many extra days are required to make them 7 days.
SELECT
YearNo,
Week,
7-COUNT(DISTINCT DateKey) As ExtraDaysRequired
FROM dimDate
GROUP BY YearNo, Week
HAVING COUNT(DISTINCT DateKey) < 7
This might always be the last week of the year, but let's not make assumptions.
Let's combine these to find all final weeks that have fewer than 7 days, and add the number of days required:
SELECT
Under7Days.YearNo, Under7Days.Week, Under7Days.ExtraDaysRequired,
FinalWeeks.DateKey StartDate,
DATEADD(d,Under7Days.ExtraDaysRequired,FinalWeeks.DateKey) EndDate
FROM
(
SELECT YearNo, MAX(Week) As Week, MAX(DateKey) As DateKey
FROM dimDate
GROUP BY YearNo
) As FinalWeeks
INNER JOIN
(
SELECT YearNo, Week, 7-COUNT(DISTINCT DateKey) As ExtraDaysRequired
FROM dimDate
GROUP BY YearNo, Week
HAVING COUNT(DISTINCT DateKey) < 7
) As Under7Days
ON FinalWeeks.Week = Under7Days.Week
AND FinalWeeks.YearNo = Under7Days.YearNo
So we have a query that identifies the start date and end date and week number that it needs to be updated to. So now we run an update:
UPDATE TGT
SET Week = SRC.Week
FROM dimDate TGT
INNER JOIN
(
SELECT
Under7Days.YearNo, Under7Days.Week, Under7Days.ExtraDaysRequired,
FinalWeeks.DateKey StartDate,
DATEADD(d,Under7Days.ExtraDaysRequired,FinalWeeks.DateKey) EndDate
FROM
(
SELECT YearNo, MAX(Week) As Week, MAX(DateKey) As DateKey
FROM dimDate
GROUP BY YearNo
) As FinalWeeks
INNER JOIN
(
SELECT YearNo, Week, 7-COUNT(DISTINCT DateKey) As ExtraDaysRequired
FROM dimDate
GROUP BY YearNo, Week
HAVING COUNT(DISTINCT DateKey) < 7
) As Under7Days
ON FinalWeeks.Week = Under7Days.Week
AND FinalWeeks.YearNo = Under7Days.YearNo
) SRC
ON TGT.DateKey BETWEEN SRC.StartDate AND SRC.EndDate
Looks complicated? There's half a dozen ways to write the same thing but this approach is step-by-step. You could probably write a windowing function to do the same thing but I leave that as an exercise for someone else.
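For anyone who wants to sanity-check the expected result before running the update, the steps above can be sketched in Python. This is a rough illustration only: sunday_week is an assumed stand-in for a week number where week 1 contains 1 January and a new week starts every Sunday, and final_week_extension is a made-up name for "find the short final week and the days it must borrow".

```python
from collections import Counter
from datetime import date, timedelta


def sunday_week(d):
    """Week number within d's year: week 1 holds Jan 1, new week each Sunday."""
    jan1 = date(d.year, 1, 1)
    # Days elapsed since the Sunday on or before Jan 1 (Sunday -> 0).
    offset = (jan1.weekday() + 1) % 7
    return ((d - jan1).days + offset) // 7 + 1


def final_week_extension(year):
    """Return (last week number, extra days required, dates borrowed from next year)."""
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    days = [date(year, 1, 1) + timedelta(days=i) for i in range(days_in_year)]
    counts = Counter(sunday_week(d) for d in days)
    last_week = max(counts)
    extra = 7 - counts[last_week]
    borrowed = [days[-1] + timedelta(days=i) for i in range(1, extra + 1)]
    return last_week, extra, borrowed


print(final_week_extension(2015))
```

For 2015 this reports week 53 as two days short and names 1 and 2 January 2016 as the dates to relabel, which matches the example in the question.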

Last three months average for each month in PostgreSQL query

I'm trying to build a query in Postgresql that will be used for a budget.
I currently have a list of data that is grouped by month.
For each month of the year I need to retrieve the average monthly sales from the previous three months. For example, in January I would need the average monthly sales from October through December of the previous year. So the result will be something like:
1 12345.67
2 54321.56
3 242412.45
This is grouped by month number.
Here is a snippet of code from my query that will get me the current month's sales:
LEFT JOIN (SELECT SUM((sti.cost + sti.freight) * sti.case_qty * sti.release_qty)
AS trsf_cost,
DATE_PART('month', st.invoice_dt) as month
FROM stransitem sti,
stocktrans st
WHERE sti.invoice_no = st.invoice_no
AND st.invoice_dt >= date_trunc('year', current_date)
AND st.location_cd = 'SLC'
AND st.order_st != 'DEL'
GROUP BY month) as trsf_cogs ON trsf_cogs.month = totals.month
I need another join that will get me the same thing, only averaged from the previous 3 months, but I'm not sure how.
This will ALWAYS be a January-December (1-12) list, starting with January and ending with December.
This is a classic problem for a window function. Here is how to solve this:
SELECT month_nr
,(COALESCE(m1, 0)
+ COALESCE(m2, 0)
+ COALESCE(m3, 0))
/
NULLIF ( CASE WHEN m1 IS NULL THEN 0 ELSE 1 END
+ CASE WHEN m2 IS NULL THEN 0 ELSE 1 END
+ CASE WHEN m3 IS NULL THEN 0 ELSE 1 END, 0) AS avg_prev_3_months
-- or divide by 3 if 3 previous months are guaranteed or you don't care
FROM (
SELECT date_part('month', month) as month_nr
,lag(trsf_cost, 1) OVER w AS m1
,lag(trsf_cost, 2) OVER w AS m2
,lag(trsf_cost, 3) OVER w AS m3
FROM (
SELECT date_part( 'month', month) as trsf_cost -- some dummy nr. for demo
,month
FROM generate_series('2010-01-01 0:0'::timestamp
,'2012-01-01 0:0'::timestamp, '1 month') month
) x
WINDOW w AS (ORDER BY month)
) y;
This requires that no month is ever missing! Otherwise, have a look at this related answer:
How to compare the current row with next and previous row in PostgreSQL?
The query calculates the correct average for every month. If there are only two previous months, it divides by 2, and so on. If there are no previous months, the result is NULL.
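The averaging rule can be written out in a few lines of Python for comparison. This is a sketch under the same assumption as the query (one value per month, in order, with no month missing); avg_prev_3 is a hypothetical name and None stands in for NULL.

```python
def avg_prev_3(values):
    """For each position, average the up-to-three preceding non-None values."""
    result = []
    for i in range(len(values)):
        # Equivalent of lag(..., 1), lag(..., 2), lag(..., 3) with NULLs dropped
        previous = [v for v in values[max(0, i - 3):i] if v is not None]
        result.append(sum(previous) / len(previous) if previous else None)
    return result


print(avg_prev_3([10, 20, 30, 40, 50]))
```

The first element has no predecessors and yields None, the second divides by 1, the third by 2, and from the fourth onward the divisor is 3.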
In your subquery, use
date_trunc('month', st.invoice_dt)::date AS month
instead of
DATE_PART('month', st.invoice_dt) as month
so you can sort months over the years easily!
More info
Window function lag()
date_trunc()