Query help : Running Avg for last 15 days - sql

Please help me with a query to find a running average for every 15 days. I have used the query below, but I am not sure how to display only the 15-day average.
SELECT Date,
       AVG(Qty) OVER (ORDER BY Date ROWS BETWEEN 15 PRECEDING AND CURRENT ROW) AS RunningAvg
FROM Sample
Sample table (contains Qty for each day):
Date        Qty
2014-10-01  4
2014-10-02  5
..
..
2014-12-31  4
Expected result:
Date        RunningAvg
2014-10-01  4
2014-10-15  XX
2014-11-01  XX
2014-11-15  XX
2014-12-01  XX
.
.
.

I'm a bit baffled by the question. Your results seem to suggest that you want the values on the 1st and 15th of the month -- and that has nothing to do with 15-day moving averages. For such filtering you can use:
select t.*
from t
where day(date) in (1, 15);
As you know, some months have 28, 29, or 31 days, so "15 days" has nothing to do with the day of the month. And the number of days between the 1st and the 15th is 14, not 15.
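If the intent really is a trailing 15-day average that is only displayed on the 1st and the 15th, one way to combine the two (a sketch, assuming one row per day and a dialect that has DAY(); note that 14 PRECEDING plus the current row gives a 15-row window) is to compute the window average first and filter afterwards:
SELECT Date, RunningAvg
FROM (
    SELECT Date,
           AVG(Qty) OVER (ORDER BY Date
                          ROWS BETWEEN 14 PRECEDING AND CURRENT ROW) AS RunningAvg
    FROM Sample
) t
WHERE DAY(Date) IN (1, 15);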

Related

SQL - POSTGRES - DATE_PART why is sql resulting in week 53 when it should be week 2

TABLE
INSERT INTO runners
("runner_id", "registration_date")
VALUES
(1, '2021-01-01'),
(2, '2021-01-03'),
(3, '2021-01-08'),
(4, '2021-01-15');
SQL Query
SELECT DATE_PART('WEEK', R.registration_date) AS week_num,
       COUNT(runner_id)
FROM pizza_runner.runners R
GROUP BY week_num
ORDER BY week_num ASC;
I was expecting the query to return weeks 1 and 2 only but for some reason I am getting 53
The documentation does a good job explaining the ISO rules for weeks - which Postgres follows:
The number of the ISO 8601 week-numbering week of the year. By definition, ISO weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
Using your dataset:
SELECT r.*,
extract(week from registration_date) AS week_num,
extract(isodow from registration_date) as day_of_week
FROM runners r
ORDER BY registration_date;
runner_id  registration_date  week_num  day_of_week
1          2021-01-01         53        5
2          2021-01-03         53        7
3          2021-01-08         1         5
4          2021-01-15         2         5
It turns out that January 3rd, 2021 was a Sunday (day of week 7). January 4th, 2021 was a Monday, and according to the ISO rules this is when the first week of that year began. Earlier dates (January 3rd, 2nd, 1st, and so on) belong to the last week of 2020 (week 53), even though those dates fall in calendar year 2021.
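If the goal is simply to number weeks in 7-day buckets counted from January 1 (January 1-7 is week 1, January 8-14 is week 2, and so on) rather than follow the ISO rules, one option (a sketch, not part of the original answer) is to derive the bucket from the day of the year:
SELECT (EXTRACT(doy FROM registration_date)::int - 1) / 7 + 1 AS week_num,
       COUNT(runner_id)
FROM runners
GROUP BY week_num
ORDER BY week_num;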

Time series: date/time averaging and anomaly detection

I"m dealing with a dataset with 4 week sales data (data will be refreshed every hour) and need to observer for abnormality
I think I'll go with a basic approach, to compare with average numbers and I'm trying to figure out how to best break this down so I can answer some questions below
On average, how many orders received at 9:00 , 15:00 or 16:00 past 30 days
On average, how many orders received at 9:00 every Wednesday (past 4 Wednesdays), at 15:00 every Thursday (past 4 Thursdays),
Not sure how do we go about this (after breaking date/time down to Hour and Weekday columns)
date                    order ID  order hour  order weekday
10/07/2022 10:26:12 PM  1111      22          6
10/07/2022 10:27:12 PM  2222      22          6
....                    ....      ....        ....
19/07/2022 11:34:19 AM  9998      11          1
19/07/2022 11:34:35 AM  9999      11          1
I would love to get your advice please
Thanks
I've ended up going with a tedious approach.
import datetime

# get the current hour & weekday
now = datetime.datetime.now()
today = datetime.date.today()
current_hour = now.hour
current_weekday = today.weekday()

# create a DataFrame with orders from the same hour & weekday window
same_hour_weekday_df = order_df[(order_df.order_hour == current_hour) &
                                (order_df.order_weekday == current_weekday)]

# calculate the average number of orders generated in past weeks
# within the same hour and weekday time frame
orders_same_hour_weekday = same_hour_weekday_df['order_created_at'].count()
same_hour_weekday_periods = same_hour_weekday_df['order_week'].nunique()
avg_orders_same_hour_weekday = orders_same_hour_weekday / same_hour_weekday_periods
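For reference, if the order data also sits in a database, the same kind of average can be computed directly in SQL. This is only a sketch under that assumption: the orders table, the order_created_at column, and the Postgres-style DATE_TRUNC/EXTRACT calls are mine, not from the original post.
-- Average number of orders per week at a given hour on a given weekday,
-- e.g. 9:00 on Wednesdays (ISODOW 3 = Wednesday).
SELECT AVG(weekly_orders) AS avg_orders_same_hour_weekday
FROM (
    SELECT DATE_TRUNC('week', order_created_at) AS order_week,
           COUNT(*)                             AS weekly_orders
    FROM orders
    WHERE EXTRACT(HOUR FROM order_created_at) = 9
      AND EXTRACT(ISODOW FROM order_created_at) = 3
    GROUP BY DATE_TRUNC('week', order_created_at)
) w;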

How to get column value comparison in sql?

I have a table as below. The table holds the price of a product for each day of the year. I would like to get the price change for each day, by year.
Product  Year  1Jan  2Jan  ....  31Dec
A        2018  10    20    ....  120
A        2019  130   150   ....  200
B        2018  15    23    ....  90
B        2019  113   130   ....  220
I would like to compare the columns sequentially, with year overlap, and get the output below.
• For the year 2018, subtracting 1 Jan from 2 Jan (2 Jan - 1 Jan) gives the new value of 2 Jan.
• For the year 2018, subtracting 2 Jan from 3 Jan (3 Jan - 2 Jan) gives the new value of 3 Jan.
• For the year 2018, subtracting 30 Dec from 31 Dec (31 Dec - 30 Dec) gives the new value of 31 Dec.
• Now, for the year 2019, subtracting 31 Dec (of 2018) from 1 Jan (of 2019) gives the new value of 1 Jan 2019.
So, in a nutshell, the value of each column is the difference between its value and the previous day's value.
Product  Year  1Jan  2Jan  ....  31Dec
A        2018  10    10    ....  15    (just assume the value of the 30Dec column is 105)
A        2019  10    20    ....  10    (just assume the value of the 30Dec column is 190)
B        2018  15    8     ....  8     (just assume the value of the 30Dec column is 82)
B        2019  23    17    ....  10    (just assume the value of the 30Dec column is 210)
Let me know if things are not clear.
Though there is nothing logically complicated in this query, you still have to work hard to write it out -
SELECT Product
,Year
,1Jan
,2Jan - 1Jan 2Jan
,3Jan - 2Jan 3Jan
.
.
.
,31Dec - 30Dec 31Dec
FROM YOUR_TAB
ORDER BY Product
,Year;
First of all, I think the design of the table could be better, but that's a topic for another time. For now, the code below should work -
SELECT Product, Year,
       1Jan AS '1st Jan',
       2Jan - 1Jan AS '2nd Jan',
       3Jan - 2Jan AS '3rd Jan',
       4Jan - 3Jan AS '4th Jan',
       .
       .
       .
       31Dec - 30Dec AS '31st Dec'
FROM [table name];
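Neither query handles the year overlap the question mentions (1 Jan of a year minus 31 Dec of the previous year). One way to cover it (a sketch, assuming a dialect that supports LAG() and double-quoted identifiers; swap in backticks or brackets as needed) is to pull the previous row's 31Dec with a window function, falling back to 0 for the first year so that its 1 Jan value stays unchanged:
SELECT Product,
       Year,
       "1Jan" - COALESCE(LAG("31Dec") OVER (PARTITION BY Product ORDER BY Year), 0) AS "1Jan",
       "2Jan" - "1Jan" AS "2Jan",
       -- ... the remaining days follow the same pattern as in the queries above ...
       "31Dec" - "30Dec" AS "31Dec"
FROM YOUR_TAB
ORDER BY Product, Year;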

How can I calculate a fiscal week

I would like a column displaying the fiscal week. Our fiscal year begins in April.
So far I have the below, using datename(ww,DateAndTime) as Week
DateAndTime       Week
2015-04-01 22:45  14
2015-06-14 13:22  25
2015-12-02 09:15  49
2016-01-01 07:35  1
I would like the output to show:
DateAndTime       Week  Fiscal Week
2015-04-01 22:45  14    1
2015-06-14 13:22  25    12
2015-12-02 09:15  49    36
2016-01-01 07:35  1     41
While I don't understand the logic behind the fiscal week (the difference between 1 and 41 is 40, but between 14 and 1 it's 39), maybe I'm missing something or you made a typo.
However, in general you'd do something like this (assuming the difference is 40 weeks):
SELECT week, (week+40)%52 AS fw FROM ...
If the fiscal year starts in a different week each year (say, the 13th or 14th week depending on the year), you can use the date and time functions, but they may vary between SQL dialects. In MySQL you have YEAR(), MONTH(), WEEK(), etc.
For example:
SELECT week, (week+(52-WEEK(CONCAT_WS('-', YEAR(NOW()), '04-01'))))%52 FROM ...
But it might be overkill.
Note: It is possible to count the other way: if you subtract the diff from the week instead of adding, you will need to add 52 if the number is negative. You can do that by adding 52 and then doing modulo (%) 52.
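Since the question itself uses DATENAME, here is a sketch in SQL Server syntax (an assumption on my part; YourTable is a placeholder name) that counts week boundaries since the most recent April 1st:
SELECT DateAndTime,
       DATENAME(ww, DateAndTime) AS [Week],
       DATEDIFF(WEEK,
                -- fiscal year start: April 1st of this or the previous calendar year
                DATEFROMPARTS(YEAR(DateAndTime)
                              - CASE WHEN MONTH(DateAndTime) < 4 THEN 1 ELSE 0 END,
                              4, 1),
                DateAndTime) + 1 AS FiscalWeek
FROM YourTable;
For the sample rows this should yield 1, 12, 36, and 40; the 40 rather than 41 for 2016-01-01 is consistent with the remark above about a possible typo.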

GROUP BY several hours

I have a table where our product records its activity log. The product starts working at 23:00 every day and usually runs for one or two hours, which means that a batch started at 23:00 finishes at about 1:00 am the next day.
Now I need statistics on how many posts are registered per batch, but I cannot figure out a script that would let me achieve this. So far I have the following SQL code:
SELECT COUNT(*), DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
ORDER BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
which results in the following:
count  day  hour
....
1189   9    23
8611   10   0
2754   10   23
6462   11   0
1885   11   23
I.e. I want the count for the 9th at 23:00 grouped with the count for the 10th at 00:00, the count for the 10th at 23:00 with the count for the 11th at 00:00, and so on. How can I do this?
You can do it very easily: use DATEADD to add an hour to the original registrationtime. If you do so, the 23:00 registrations of a batch move onto the same day as the following 00:00-01:00 registrations, and you can simply group by the date part.
You could also do it in a more complicated way using CASE WHEN, but that is overkill in view of this easy solution.
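In full, that could look something like this (a sketch; the WHERE filter is copied from the question):
SELECT CAST(DATEADD(HOUR, 1, registrationtime) AS date) AS batch_day,
       COUNT(*) AS posts
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY CAST(DATEADD(HOUR, 1, registrationtime) AS date)
ORDER BY batch_day;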
I had to do something similar a few days ago: I had fixed time spans for work shifts to group by, where one of them could start at 10 pm on one day and end at 6 am the next morning.
What I did was:
Define a "shift date", which is simply the date (with a zero timestamp) on which the shift started, for every entry in the table. I was able to do this by checking whether the timestamp of the entry was between 0 am and 6 am; in that case I took only the date part of DATEADD(dd, -1, entryDate), which returns the previous day for all entries between 0 am and 6 am.
I also added an ID for the shift: 0 for the first one (6 am to 2 pm), 1 for the second one (2 pm to 10 pm), and 2 for the last one (10 pm to 6 am).
I was then able to group by the shift date and the shift ID.
Example:
Consider the following source entries:
Timestamp            SomeData
=============================
2014-09-01 06:01:00  5
2014-09-01 14:01:00  6
2014-09-02 02:00:00  7
Step one extended the table as follows:
Timestamp            SomeData  ShiftDay
=======================================
2014-09-01 06:01:00  5         2014-09-01 00:00:00
2014-09-01 14:01:00  6         2014-09-01 00:00:00
2014-09-02 02:00:00  7         2014-09-01 00:00:00
Step two extended the table as follows:
Timestamp            SomeData  ShiftDay             ShiftID
===========================================================
2014-09-01 06:01:00  5         2014-09-01 00:00:00  0
2014-09-01 14:01:00  6         2014-09-01 00:00:00  1
2014-09-02 02:00:00  7         2014-09-01 00:00:00  2
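Expressed as a query, the shift-date / shift-ID idea could look roughly like this (a sketch; ShiftLog and the Timestamp column are illustrative names, not from the original post):
SELECT ShiftDay, ShiftID, COUNT(*) AS Entries
FROM (
    SELECT CAST(DATEADD(dd,
                        CASE WHEN DATEPART(HOUR, [Timestamp]) < 6 THEN -1 ELSE 0 END,
                        [Timestamp]) AS date) AS ShiftDay,                 -- entries before 6 am count towards the previous day
           CASE WHEN DATEPART(HOUR, [Timestamp]) BETWEEN 6  AND 13 THEN 0  -- 6 am to 2 pm
                WHEN DATEPART(HOUR, [Timestamp]) BETWEEN 14 AND 21 THEN 1  -- 2 pm to 10 pm
                ELSE 2                                                     -- 10 pm to 6 am
           END AS ShiftID
    FROM ShiftLog
) AS s
GROUP BY ShiftDay, ShiftID;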
If you add one hour to registrationtime, you will be able to group by the date part:
GROUP BY
CAST(DATEADD(HOUR, 1, registrationtime) AS date)
If the starting hour must be reflected accurately in the output (as day 9, hour 23 and day 10, hour 23 rather than day 10, hour 0 and day 11, hour 0), you could obtain it as MIN(registrationtime) in the SELECT clause:
SELECT
count = COUNT(*),
day = DATEPART(DAY, MIN(registrationtime)),
hour = DATEPART(HOUR, MIN(registrationtime))
Finally, in case you are not aware, you can reference columns by their aliases in ORDER BY:
ORDER BY
day,
hour
just so that you do not have to repeat the expressions.
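Put together, the pieces above form something like this (a sketch; the WHERE filter is taken from the question):
SELECT
    count = COUNT(*),
    day   = DATEPART(DAY, MIN(registrationtime)),
    hour  = DATEPART(HOUR, MIN(registrationtime))
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY
    CAST(DATEADD(HOUR, 1, registrationtime) AS date)
ORDER BY
    day,
    hour;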
The query below will give you what you are expecting:
;WITH CTE AS
(
    SELECT COUNT(*) AS [Count],
           DATEPART(DAY, registrationtime) AS [Day],
           DATEPART(HOUR, registrationtime) AS [Hour],
           RANK() OVER (PARTITION BY DATEPART(HOUR, registrationtime)
                        ORDER BY DATEPART(DAY, registrationtime),
                                 DATEPART(HOUR, registrationtime)) AS Batch_ID
    FROM RegistrationMessageLogEntry
    WHERE registrationtime > '2014-09-01 20:00'
    GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR, registrationtime)
)
SELECT SUM([Count]) AS [Count], Batch_ID
FROM CTE
GROUP BY Batch_ID
ORDER BY Batch_ID;
You can also write CASE expressions (e.g. in the GROUP BY) as below:
GROUP BY
    CASE WHEN DATEPART(HOUR, registrationtime) = 23
         THEN DATEPART(DAY, registrationtime) + 1    -- count 23:00 towards the next day
         ELSE DATEPART(DAY, registrationtime)
    END,
    CASE WHEN DATEPART(HOUR, registrationtime) = 23
         THEN 0                                      -- and treat it as hour 0 of that day
         ELSE DATEPART(HOUR, registrationtime)
    END