Query to find a weekly average - sql

I have an SQLite database with the following fields for example:
date (yyyymmdd fomrat)
total (0.00 format)
There is typically 2 months of records in the database. Does anyone know a SQL query to find a weekly average?
I could easily just execute:
SELECT COUNT(1) as total_records, SUM(total) as total FROM stats_adsense
Then just divide total by 7 but unless there is exactly x days that are divisible by 7 in the db I don't think it will be very accurate, especially if there is less than 7 days of records.
To get a daily summary it's obviously just total / total_records.
Can anyone help me out with this?

You could try something like this:
SELECT strftime('%W', thedate) theweek, avg(total) theaverage
FROM table GROUP BY strftime('%W', thedate)

I'm not sure how the syntax would work in SQLite, but one way would be to parse out the date parts of each [date] field, and then specifying which WEEK and DAY boundaries in your WHERE clause and then GROUP by the week. This will give you a true average regardless of whether there are rows or not.
Something like this (using T-SQL):
SELECT DATEPART(w, theDate), Avg(theAmount) as Average
FROM Table
GROUP BY DATEPART(w, theDate)
This will return a row for every week. You could filter it in your WHERE clause to restrict it to a given date range.
Hope this helps.

Your weekly average is
daily * 7
Obviously this doesn't take in to account specific weeks, but you can get that by narrowing the result set in a date range.

You'll have to omit those records in the addition which don't belong to a full week. So, prior to summing up, you'll have to find the min and max of the dates, manipulate them such that they form "whole" weeks, and then run your original query with a WHERE that limits the date values according to the new range. Maybe you can even put all this into one query. I'll leave that up to you. ;-)
Those values which are "truncated" are not used then, obviously. If there's not enough values for a week at all, there's no result at all. But there's no solution to that, apparently.

Related

SQL - Returning max count, after breaking down a day into hourly rows

I need to write a SQL query that helps return the highest count in a given hourly range. The problem is that in my table, it just logs orders as they come and doesn’t have a unique identifier that separates hours from hours.
So basically, I need to find the highest number of orders (on any given hour), from 7/08/2022, - 7/15/2022, have a table that does not distinguish distinct hour sets, and logs orders as they come.
I have tried to use a query that combines MAX(), COUNT(), and DATETIME(), but to no avail.
Can I please receive some help?
I've had to tackle this kind of measurement in the past..
Here's what I did for 15 minute intervals:
My datetime column is named datreg in my database log area.
cast(round(floor(cast(datreg as float(53))*24*4)/(24*4),5) as smalldatetime
I times by 4 in this formula, to get 4 intervals inside my 24 hour period.. For you it would look like this to get just hourly intervals:
cast(round(floor(cast(datreg as float(53))*24)/(24),5) as smalldatetime
This is a little piece of magic when it comes to dashboards and reports.

Calculating the AVG value per GROUP in the GROUP BY Clause

I'm working on a query in SQL Server 2005 that looks at a table of recorded phone calls, groups them by the hour of the day, and computes the average wait time for each hour in the day.
I have a query that I think works, but I'm having trouble convincing myself it's right.
SELECT
DATEPART(HOUR, CallTime) AS Hour,
(AVG(calls.WaitDuration) / 60) AS WaitingTimesInMinutes
FROM (
SELECT
CallTime,
WaitDuration
FROM Calls
WHERE DATEADD(day, DATEDIFF(Day, 0, CallTime), 0) = DATEADD(day, DATEDIFF(Day, 0, GETDATE()), 0)
AND DATEPART(HOUR, CallTime) BETWEEN 6 AND 18
) AS calls
GROUP BY DATEPART(HOUR, CallTime)
ORDER BY DATEPART(HOUR, CallTime);
To clarify what I think is happening, this query looks at all calls made on the same day as today, and where the hour of the call is between 6 and 18 -- the times are recorded and SELECTed in 24-hour time, so this between hours is to get calls between 6am and 6pm.
Then, the outer query computes the average of the WaitDuration column (and converts seconds to minutes) and then groups each average by the hour.
What I'm uncertain of is this: Are the reported BY HOUR averages only for the calls made in that hour's timeframe? Or does it compute each reported average using all the calls made on the day and between the hours? I know the AVG function has a optional OVER/PARTITION clause, and it's been a while since I used the AVG group function. What I would like is that each result grouped by an hour shows ONLY the average wait time for that specific hour of the day.
Thanks for your time in this.
The grouping happens on the values that get spit out of datepart(hour, ...). You're already filtering on that value so you know they're going to range between 6 and 18. That's all that the grouping is going to see.
Now of course the datepart() function does what you're looking for in that it looks at the clock and gives the hour component of the time. If you want your group to coincide with HH:00:00 to HH:59:59.997 then you're in luck.
I've already noted in comments that you probably meant to filter your range from 6 to 17 and that your query will probably perform better if you change that and compare your raw CallTime value against a static range instead. Your reasoning looks correct to me. And because your reasoning is correct, you don't need the inner query (derived table) at all.
Also if WaitDuration is an integer then you're going to be doing decimal division in your output. You'd need to cast to decimal in that case or change the divisor a decimal value like 60.00.
Yes if you use the AVG function with a GROUP BY only the items in that group are averaged. Just like if you use the COUNT function with a GROUP BY only the items in that group are counted.
You can use windowing functions (OVER/PARTITION) to conceptually perform GROUP BYs on different criteria for a single function.
eg
AVG(zed) OVER (PARTITION BY DATEPART(YEAR, CallTime)) as YEAR_AVG
Are the reported BY HOUR averages only for the calls made in that hour's timeframe?
Yes. The WHERE clause is applied before the grouping and aggregation, so the aggregation will apply to all records that fit the WHERE clause and within each group.

Aggregating 15-minute data into weekly values

I'm currently working on a project in which I want to aggregate data (resolution = 15 minutes) to weekly values.
I have 4 weeks and the view should include a value for each week AND every station.
My dataset includes more than 50 station.
What I have is this:
select name, avg(parameter1), avg(parameter2)
from data
where week in ('29','30','31','32')
group by name
order by name
But it only displays the avg value of all weeks. What I need is avg values for each week and each station.
Thanks for your help!
The problem is that when you do a 'GROUP BY' on just name you then flatten the weeks and you can only perform aggregate functions on them.
Your best option is to do a GROUP BY on both name and week so something like:
select name, week, avg(parameter1), avg(parameter2)
from data
where week in ('29','30','31','32')
group by name, week
order by name
PS - It' not entirely clear whether you're suggesting that you need one set of results for stations and one for weeks, or whether you need a set of results for every week at every station (which this answer provides the solution for). If you require the former then separate queries are the way to go.

Querying SQLITE DB for Data from One Column Based On Another Column

I hope the title of this post makes sense.
The db in question has two columns that are related to my issue, a date column that follows the format xx/xx/xxxx and price a column. What I want to do is get a sum of the prices in the price column based on the month and year in which they occurred, but that data is in the other aforementioned column. Doing so will allow me to determine the total for a given month of a given year. The problem is I have no idea how to construct a query that would do what I need. I have done some reading on the web, but I'm not really sure how to go about this. Can anyone provide some advice/tips?
Thanks for your time!
Mike
I was able to find a solution using a LIKE clause:
SELECT sum(price) FROM purchases WHERE date LIKE '11%1234%'
The "11" could be any 2-digit month and the "1234" is any 4 digit year. The % sign acts as a wildcard. This query, for example, returns the sum of any prices that were from month 11 of year 1234 in the db.
Thanks for your input!
You cannot use the built-in date functions on these date values because you have stored them formatted for displaing instead of in one of the supported date formats.
If the month and day fields always have two digits, you can use substr:
SELECT substr(MyDate, 7, 4) AS Year,
substr(MyDate, 1, 2) AS Month,
sum(Price)
FROM Purchases
GROUP BY Year,
Month
So, the goal is to get an aggregate grouping by the month?
select strftime('%m', mydate), sum(price)
from mytable
group by strftime('%m', mydate)
Look into group by

PostgreSQL - GROUP BY timestamp values?

I've got a table with purchase orders stored in it. Each row has a timestamp indicating when the order was placed. I'd like to be able to create a report indicating the number of purchases each day, month, or year. I figured I would do a simple SELECT COUNT(xxx) FROM tbl_orders GROUP BY tbl_orders.purchase_time and get the value, but it turns out I can't GROUP BY a timestamp column.
Is there another way to accomplish this? I'd ideally like a flexible solution so I could use whatever timeframe I needed (hourly, monthly, weekly, etc.) Thanks for any suggestions you can give!
This does the trick without the date_trunc function (easier to read).
// 2014
select created_on::DATE from users group by created_on::DATE
// updated September 2018 (thanks to #wegry)
select created_on::DATE as co from users group by co
What we're doing here is casting the original value into a DATE rendering the time data in this value inconsequential.
Grouping by a timestamp column works fine for me here, keeping in mind that even a 1-microsecond difference will prevent two rows from being grouped together.
To group by larger time periods, group by an expression on the timestamp column that returns an appropriately truncated value. date_trunc can be useful here, as can to_char.