SQLite: Get monthly, weekly or daily average of all entries

I have a SQLite database of records in this format:
date location temperature
1568463916 room 1 20.0
1568463916 room 2 25.0
1568463916 room 3 30.0
...
1568460316 room 1 15.5
1568460316 room 2 20.5
1568460316 room 3 21.3
Every hour three new records get inserted, one for every room.
For a monthly average this output is desired:
month avg_temperature location
01 21.333 room 1
01 24.5 room 2
01 19.0 room 3
...
12 20.4 room 1
12 31.31 room 2
12 13.37 room 3
The same query might be reused to get weekly averages (day 00-07) and daily averages (hour 00-23).
To get a monthly average, I'm assuming I will select:
All records with date between now and now - 1 year
Month of every record with strftime("%m", date, "unixepoch") as month
For every location, then for every month get avg(temperature)
The result is rooms*12 rows of average temperature of each room for each month
When I'm using the GROUP BY statement however, I'm only getting the last row of every month. What's the correct way to construct this kind of query?
This is the query I've tried:
SELECT strftime("%m", date, "unixepoch") month,
avg(temperature) avg_temperature,
location
FROM table
WHERE date > date("now", "unixepoch", "-1 year")
AND date < date("now", "unixepoch")
GROUP BY location, month
ORDER BY month

This should do what you want:
select date(datetime(date, 'unixepoch'), 'start of month') as month,
       location,
       avg(temperature) as avg_temperature
from t
group by date(datetime(date, 'unixepoch'), 'start of month'),
         location
order by month, location;
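As a runnable illustration of the same grouping, here is a minimal sketch using Python's built-in sqlite3 module. The table name readings, the helper averages, and the explicit since/until bounds are assumptions made for the example and are not part of the question. The bounds are compared as integers against the integer date column; comparing that column with the text returned by date() is unlikely to filter as intended. Changing the strftime format switches between month, weekday and hour buckets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; columns follow the question's layout.
conn.execute(
    "CREATE TABLE readings (date INTEGER, location TEXT, temperature REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1568463916, "room 1", 20.0),
        (1568463916, "room 2", 25.0),
        (1568463916, "room 3", 30.0),
        (1568460316, "room 1", 15.5),
        (1568460316, "room 2", 20.5),
        (1568460316, "room 3", 21.3),
    ],
)


def averages(conn, period_format, since, until):
    """Average temperature per period and location for since <= date < until."""
    sql = """
        SELECT strftime(?, date, 'unixepoch') AS period,
               location,
               avg(temperature) AS avg_temperature
        FROM readings
        WHERE date >= ? AND date < ?
        GROUP BY period, location
        ORDER BY period, location
    """
    return conn.execute(sql, (period_format, since, until)).fetchall()


# Fixed bounds keep the example reproducible; a live query could use
# strftime('%s', 'now', '-1 year') and strftime('%s', 'now') instead.
until = 1568463916 + 1
since = until - 365 * 24 * 3600

monthly = averages(conn, "%m", since, until)  # month 01-12
weekly = averages(conn, "%w", since, until)   # weekday 0-6, Sunday = 0
daily = averages(conn, "%H", since, until)    # hour 00-23 (UTC)
```

Each result row is (period, location, avg_temperature), so the monthly call returns one row per room per month that has data.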

Related

Calculate the monthly average including the date where data is missing

I want to calculate the monthly average of some data using SQL query where the data resides in redshift DB.
The data is present in the following format in the table.
s_date | sales
------------+-------
2020-08-04 | 10
2020-08-05 | 20
---- | --
---- | --
Data may not be present for every date in a month. If there is no data for a day, that day should count as 0.
The following query uses AVG() grouped by month, so it averages only over the dates that have data:
select trunc(date_trunc('MONTH', s_date)::timestamp) as month, avg(sales) from sales group by month;
However it does not consider the data for missing dates as 0. What should be the right query to calculate the monthly average as expected?
One more expectation is that, for the current month, the average should be calculated based on the data till today. So it should not consider entire month (like 30 or 31 days).
Regards,
Paul

Using a calendar table might be the easiest way to go here:
WITH dates AS (
SELECT date_trunc('day', t)::date AS dt
FROM generate_series('2020-01-01'::timestamp, '2020-12-31'::timestamp, '1 day'::interval) t
),
cte AS (
SELECT t.dt, COALESCE(SUM(s.sales), 0) AS sales
FROM dates t
LEFT JOIN sales s ON t.dt = s.s_date
GROUP BY t.dt
)
SELECT
LEFT(dt::text, 7) AS ym,
AVG(sales) AS avg_sales
FROM cte
GROUP BY
LEFT(dt::text, 7);
The logic here is to first generate an intermediate table, in the second CTE, that has one record for each date in the calendar range, along with the total sales for that date (0 where no sales rows exist). Then we aggregate by year and month and report the average sales.
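The calendar-table idea can be tried out locally with the sketch below. It uses SQLite through Python's sqlite3 module purely so that it runs anywhere, so the syntax is not Redshift's: a recursive CTE stands in for generate_series, and the helper name monthly_average and the start/end parameters are assumptions for the example. Passing today's date as the end bound would make the current month average only over the days elapsed so far.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (s_date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2020-08-04", 10), ("2020-08-05", 20)],
)

SQL = """
WITH RECURSIVE dates(dt) AS (
    -- Calendar: one row per day from :start to :end inclusive.
    SELECT :start
    UNION ALL
    SELECT date(dt, '+1 day') FROM dates WHERE dt < :end
),
daily AS (
    -- Days without any sales row get a total of 0.
    SELECT d.dt, COALESCE(SUM(s.sales), 0) AS total
    FROM dates d
    LEFT JOIN sales s ON s.s_date = d.dt
    GROUP BY d.dt
)
SELECT substr(dt, 1, 7) AS ym, AVG(total) AS avg_sales
FROM daily
GROUP BY ym
ORDER BY ym
"""


def monthly_average(conn, start, end):
    """Monthly average of daily sales, counting missing days as 0."""
    return conn.execute(SQL, {"start": start, "end": end}).fetchall()


# 30 units sold over the 10 calendar days from Aug 1 to Aug 10.
result = monthly_average(conn, "2020-08-01", "2020-08-10")
```

With the two sample rows, the range above covers ten days and 30 units in total, so the August average is 3.0 rather than the 15.0 that a plain AVG over existing rows would report.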

Counting readmissions in postgresql

I have a table containing data for a prison facility, of the following format:
Prisoner_id admission date discharge date
---------------------------------------------------
1325 06/13/2014 09/13/2014
1266 05/01/2014 07/02/2014
1325 02/21/2015 07/23/2015
1471 02/26/2014 04/20/2014
1266 10/19/2014 12/22/2014
1325 10/09/2015 11/10/2015
I need to count the number of readmissions of each prisoner; that is, how many times each prisoner has been admitted again to the facility, such that the difference between his admission date (date he entered) the last time he entered the facility and his discharge date (date he was let go) the time before the last is less than 60 days.
This means that if the same prisoner has been admitted 2 times, we count this as 1 readmission if the difference between his admission date of the second time and his discharge date of the first time is less than 60 days.
Moreover, if a prisoner has been admitted 3 times, we count this as 2 readmissions if the difference between his admission date the third time and his discharge date the second time AND the difference between his admission date the second time and his discharge date the first time are both less than 60 days. If one of them is less than 60 days but the other is not, count as 1 readmission. If neither of them is less than 60 days, count as zero readmissions.
How can I do this in SQL or PostgreSQL? Your help is really appreciated.

I think you just want lag() and some query logic. The following counts the readmissions for each prisoner; the order by inside the window makes lag() return the discharge date of the previous stay:
select t.prisoner_id,
       sum( (prev_dd > admission_date - interval '60 day')::int ) as num_readmissions
from (select t.*,
             lag(discharge_date) over (partition by prisoner_id order by admission_date) as prev_dd
      from t
     ) t
group by prisoner_id;
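To check the lag() logic against the sample rows, here is a small sketch using SQLite (3.25 or later, for window functions) through Python's sqlite3 module. Dates are stored as ISO text, julianday() replaces PostgreSQL's interval arithmetic, and prisoner 9001 is a hypothetical row added only so that one gap falls under 60 days; none of these details come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (prisoner_id INTEGER, admission_date TEXT, discharge_date TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1325, "2014-06-13", "2014-09-13"),
        (1266, "2014-05-01", "2014-07-02"),
        (1325, "2015-02-21", "2015-07-23"),
        (1471, "2014-02-26", "2014-04-20"),
        (1266, "2014-10-19", "2014-12-22"),
        (1325, "2015-10-09", "2015-11-10"),
        # Hypothetical prisoner: readmitted 28 days after discharge.
        (9001, "2014-01-01", "2014-02-01"),
        (9001, "2014-03-01", "2014-04-01"),
    ],
)

SQL = """
SELECT prisoner_id,
       -- The comparison yields 1, 0 or NULL (first stay); SUM skips NULLs.
       COALESCE(SUM(julianday(admission_date) - julianday(prev_dd) < 60), 0)
           AS num_readmissions
FROM (
    SELECT prisoner_id,
           admission_date,
           LAG(discharge_date) OVER (
               PARTITION BY prisoner_id ORDER BY admission_date
           ) AS prev_dd
    FROM t
) AS stays
GROUP BY prisoner_id
ORDER BY prisoner_id
"""

readmissions = dict(conn.execute(SQL).fetchall())
```

In the question's sample data every gap between a discharge and the next admission is longer than 60 days, so those three prisoners get 0 and only the hypothetical one gets 1.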

YTD (Year to Date) Calendar Flag in QlikView

ordertype   actual   actual YTD
A           900      1500
B           500      2000
C           200      2200
D           300      2500
The Actual column should sum the values from the start of the current month up to the present date.
Actual YTD should sum the values from the start of the year up to the present date.
Here the year should start on April 1 (financial year).

In your script, use InYearToDate and InMonthToDate.
Use-case example:
Load distinct
    Date,
    if(InYearToDate(Date, today(), 0, 4), 1, 0) as YTD, // the "4" means that the year starts at the 4th month
    if(InMonthToDate(Date, today(), 0), 1, 0) as MTD
resident DataTable;
Now your Calendar table has the YTD and MTD fields, and you can use them like this:
sum({<YTD={1}>} Actual)
For more information about these functions, see the QlikView help.
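To make the flag logic concrete, here is a small Python sketch of what the two flags are assumed to test when the period offset is 0 and the year starts in April. The function names are hypothetical stand-ins written for this illustration, not Qlik's implementation.

```python
from datetime import date


def fiscal_year_start(base, first_month=4):
    """First day of the fiscal year that contains base."""
    year = base.year if base.month >= first_month else base.year - 1
    return date(year, first_month, 1)


def in_year_to_date(d, base, first_month=4):
    """True if d lies between the fiscal year start and base, inclusive."""
    return fiscal_year_start(base, first_month) <= d <= base


def in_month_to_date(d, base):
    """True if d lies between the first of base's month and base, inclusive."""
    return date(base.year, base.month, 1) <= d <= base


today = date(2024, 2, 15)
ytd_flag = int(in_year_to_date(date(2023, 4, 1), today))   # inside FY 2023/24
old_flag = int(in_year_to_date(date(2023, 3, 31), today))  # previous fiscal year
mtd_flag = int(in_month_to_date(date(2024, 2, 1), today))  # current month
```

A date in the previous fiscal year, such as 31 March 2023, is flagged 0 even though it falls within twelve months of the base date.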

SQLite - Determine average sales made for each day of week

I am trying to produce a query in SQLite where I can determine the average sales made each weekday in the year.
As an example, I'd like to say
"The average sales for Monday are $400.50 in 2017"
I have a sales table - each row represents a sale you made. You can have multiple sales for the same day. Columns that would be of interest here:
Id, SalesTotal, DayCreated, MonthCreated, YearCreated, CreationDate, PeriodOfTheDay
DayCreated, MonthCreated and YearCreated are integers that represent the day, month and year. CreationDate is a Unix timestamp that represents the date and time the record was created (and so agrees with the day/month/year columns).
PeriodOfTheDay is 0, or 1 (day, or night). You can have multiple records for a given day (typically you can have at most 2 but some people like to add all of their sales in individually, so you could have 5 or more for a day).
Where I am stuck
Because you can have two records on the same day (i.e. a day sales, and a night sales, or multiple of each) I can't just group by day of the week (i.e. group all records by Saturday).
This is because the number of sales records does not equal the number of days you worked. For example, I could have worked 10 Saturdays but recorded 30 sales, so grouping by Saturday would average over 30 records rather than 10 days, since some records share the same day.
Furthermore, if I group by daycreated,monthcreated,yearcreated it works in the sense it produces x rows (where x is the number of days you worked) however that now means I need to return this resultset to the back end and do a row count. I'd rather do this in the query so I can take the sales and divide it by the number of days you worked.
Would anyone be able to assist?
Thanks!
UPDATE
I think I got it - I would love someone to tell me if I'm right:
SELECT COUNT(DISTINCT CAST(( julianday((datetime(CreationDate / 1000, 'unixepoch', 'localtime'))) ) / 7 AS INT))
FROM Sales
WHERE strftime('%w', datetime(CreationDate / 1000, 'unixepoch'), 'localtime') = '6'
AND YearCreated = 2017
This would produce the number for saturday, and then I'd just put this in as an inner query, dividing the sale total by this number of days.

Buddy,
You can group your query by the day of the week and the week number of the creation date.
In MSSQL:
DATEPART(WEEK, '2017-08-14')    -- gives week 33
DATEPART(WEEKDAY, '2017-08-14') -- gives day 2
In MySQL:
WEEK('2017-08-14')      -- gives week 33
DAYOFWEEK('2017-08-14') -- gives day 2
For reference:
Day of week
1-Sunday, 2-Monday, 3-Tuesday, 4-Wednesday, 5-Thursday, 6-Friday, 7-Saturday
Week number
1-53 (weeks in a year)
Grouping on both keys keeps each Saturday separate, one per week.
Hope this helps in building your query.
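Since the question is about SQLite, here is a hedged sketch of one way to do the division inside the query itself: total sales per weekday divided by the number of distinct calendar dates worked on that weekday. It runs through Python's sqlite3 module. The sample rows are invented for illustration, CreationDate is assumed to be in milliseconds (as the division by 1000 in the question suggests), and the 'localtime' modifier is left out so the result does not depend on the machine's time zone.

```python
import sqlite3
from datetime import datetime, timezone


def ms(year, month, day, hour):
    """Milliseconds since the Unix epoch for a UTC date and hour."""
    moment = datetime(year, month, day, hour, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (Id INTEGER PRIMARY KEY, SalesTotal REAL, CreationDate INTEGER)"
)
# Invented sample rows: two Saturdays (three records) and one Monday.
conn.executemany(
    "INSERT INTO Sales (SalesTotal, CreationDate) VALUES (?, ?)",
    [
        (100.0, ms(2017, 8, 5, 10)),   # Saturday, day sale
        (50.0, ms(2017, 8, 5, 21)),    # same Saturday, night sale
        (150.0, ms(2017, 8, 12, 10)),  # next Saturday
        (400.5, ms(2017, 8, 14, 10)),  # Monday
    ],
)

SQL = """
SELECT strftime('%w', CreationDate / 1000, 'unixepoch') AS weekday,
       -- Divide by days worked, not by the number of sales records.
       SUM(SalesTotal)
           / COUNT(DISTINCT date(CreationDate / 1000, 'unixepoch')) AS avg_per_day
FROM Sales
WHERE strftime('%Y', CreationDate / 1000, 'unixepoch') = '2017'
GROUP BY weekday
ORDER BY weekday
"""

weekday_averages = conn.execute(SQL).fetchall()
```

In strftime('%w') the weekdays run from 0 (Sunday) to 6 (Saturday), so the Saturday row here averages 300 over two days worked even though three records exist.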

Number of specific one-hour periods between two date/times

I have a table of table records, call it "game"
It has an id and timestamp.
What I need to know is unrelated to the table specifically. In order to know the average number of games played per hour, I need to know:
Total games played for each hour over the date range
Number of hourly periods within the date range
Finding the first is a matter of extracting the hour from the timestamp and grouping by it.
For the second, if the date range was rounded to the nearest day, finding this value would be easy (totalgames/numdays).
Unfortunately I can't assume this. What I need help with is finding the number of specific hour periods existing within a time range.
Example:
If the range is 5 PM today to 8 PM tomorrow, there is one "00" hour (midnight to 1 AM), but two 17, 18, 19 hours (5-6, 6-7, 7-8)
Thanks for the help
Edit: for clarity, consider the following query:
I have table game:
id, daytime
select EXTRACT(hour from daytime) as hour_period, count (*)
from game
where daytime > dateFrom and daytime < dayTo
group by hour_period
This will give me the number of games played broken down into hourly chunks for the time period.
In order to find the average games played per hour, I need to know exactly how many specific hour durations are between two timestamps. Simply dividing by the number of days is not accurate.
Edit: The ideal output will look something like this:
00 275
01 300
02 255
...
Consider the following: How many times does midnight occur between date 1 and date 2 ? If you have 1.5 days, that doesn't guarantee that midnight will occur twice. 6 AM today to 6 PM tomorrow night, for example, has 1 midnight, but 9PM tonight to 9 AM two days from now has 2 midnights.
What I'm trying to find is how many of the EXACT HOUR occurs between two timestamps, so I can use it to average the number of games played at THAT HOUR over a time period.

EDIT:
The following query gets the days, hours, and # of games, giving an output as below:
29 23 100
29 00 130
30 22 140
30 23 150
Then, the outer query adds up the number of games for each distinct hour and divides by the number of days on which that hour appears, as follows:
22 140
23 125
00 130
The modified query is below:
SELECT
hour_period,
sum(hourly_no_of_games) / count(hour_period)
FROM
(
SELECT
EXTRACT(DAY from daytime) as day_period,
EXTRACT(HOUR from daytime) as hour_period,
count (*) hourly_no_of_games
from game
where daytime > dateFrom and daytime < dayTo
group by EXTRACT(DAY from daytime), EXTRACT(HOUR from daytime)
) hourly_data
GROUP BY hour_period
ORDER BY hour_period;

If you need something to GROUP BY, you can truncate the timestamp to the level of hour, as in the following:
DECLARE @Date DATETIME
SET @Date = GETDATE()
SELECT @Date, DATEADD(Hour, DATEDIFF(Hour, 0, @Date), 0) AS RoundedDate
If you just need to find the total hours, you can just select the DATEDIFF in hours, such as with
SELECT DATEDIFF(Hour, '5/29/2014 20:01:32.999', GETDATE())

Extract not only the hour of the day but the day of the year (1-366). Then group on those. If there is the possibility the interval could span a year, then add the year itself and group by all three.
year dy hr games
2013 365 23 115
2014 1 00 103
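For the denominator the question asks about, one option is to count the hour slots directly instead of deriving them from the number of days. The Python sketch below is a minimal illustration under stated assumptions: the function name hour_slot_counts is hypothetical, timestamps are naive (no time zone or daylight-saving handling), and only whole clock hours that lie completely inside the range are counted.

```python
from collections import Counter
from datetime import datetime, timedelta


def hour_slot_counts(start, end):
    """Count, per hour of day, the whole clock hours inside [start, end)."""
    one_hour = timedelta(hours=1)
    # Round start up to the next full hour if it is not already on one.
    slot = start.replace(minute=0, second=0, microsecond=0)
    if slot < start:
        slot += one_hour
    counts = Counter()
    # A slot counts only if it ends at or before the end of the range.
    while slot + one_hour <= end:
        counts[slot.hour] += 1
        slot += one_hour
    return counts


# The question's example: 5 PM one day to 8 PM the next.
slots = hour_slot_counts(datetime(2014, 5, 29, 17), datetime(2014, 5, 30, 20))
```

Dividing each hour's total from the grouped query by slots[hour] then gives the average games for that hour, and hours in which no games were played still count toward the denominator.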
