Access query to partition data and sum each partition?

Access query to partition data and sum each partition? - sql

I have a query with the fields date hour and value.
It looks something like this
date hour value
xx/xx/xx 15 100
xx/xx/xx 30 122
xx/xx/xx 45 50
... 100 100
... 115 23
... ... ...
... ... ...
... 2400 400
... 15 23
Basically, date is the date, hour is the hour, and value is the value for that particular 15 minute interval. What I have been trying to figure out is a way to take each hour (so 15, 30, 45, and 100) or (1015, 1030, 1045, 1100) [As you can see hours are military-esque 1:00pm is 1300 and midnight 2400], and sum their values together. So i am looking to return something like this:
xx/xx/xx 100 372
xx/xx/xx 200 23 + (130 data) + (145 data) + (200 data)
And so on...
The table has on average around 100 days and they all start from 15 to 2400 incrementing by 15 with varying numbers for the value column.
I have thought about using a partition, group by, etc. with no real ideas how to tackle it. Essentially I have to take 4 rows (an hour), sum their values, spit out the date, hour, and summed value then repeat for every day. I am not asking for code, just some help with what i should be using since this seems like a simple problem minus the key to solving it.
Any help is greatly appreciated, Thank you!

Grouping by Hour/100 will almost get you there - subtract 1 from the hour will make 1 AM fall to 99, and get included in the grouping. This will give a query that looks like this:
SELECT Table1.Dte, Int(([tme]-1)/100) AS Hr, Sum(Table1.Val) AS TotVal
FROM Table1
GROUP BY Table1.Dte, Int(([tme]-1)/100);

I may have misremembered how you cast to int in Access, but this might work:
Select
[date],
100 * (1 + Cint(([Hour] - 1) / 100)),
Sum(Value)
From
Query
Group By
[date],
100 * (1 + Cint(([Hour] - 1) / 100))
Order By
1, 2

SELECT
DateCol,
Int(HourCol \ -100) * -100 AS Hr,
Sum(Value) AS Value
FROM
YourTable
GROUP BY
DateCol,
Int(HourCol \ -100) * -100
Or you can use ((Hr + 99) \ 100) * 100.

Related

Query using group by with steps/range over large data

I have a table that stores a sensor temperature readings every few seconds
Sample data looks like this
nId nOperationId strDeviceIp nIfIndex nValue nTimestamp
97 2 192.168.99.252 1 26502328 1593828551
158 2 192.168.99.252 1 26501704 1593828667
256 2 192.168.99.252 1 26501860 1593828788
354 2 192.168.99.250 1 26501704 1593828908
452 2 192.168.99.250 1 26501692 1593829029
I want to have the average temperature per device so I ran the following query
select strDeviceIp, AVG(CAST(nValue as bigint)) as val1
from myTable
where nOperationId = 2 and nTimestamp >= 1593828600 and nTimestamp <= 1593838600
group by strSwitchIp;
Where I can pass the time range I want.
My issue is that this gives me total average but I want steps or range
I want to achieve that instead of one line I'll get all the values in a range/step of say 5 minutes as a row.
P.S. I'm trying to show a graph.
Running the following query I get
strSwitchIp average
192.168.99.252 26501731
But I would like to get
strSwitchIp average timestamp
192.168.99.252 26201731 1593828600
192.168.99.252 26532731 1593828900
192.168.99.252 24501721 1593829200
192.168.99.252 26506531 1593829500
In this example I would like to get a row every 300 seconds, per device.

Since your nTimestamp is number of seconds, you can simply add it to the GROUP BY. Division by 300 gives you 300 second (5 minute) intervals. In SQL Server / is integer division, which discards the fractional part.
select
strSwitchIp
,AVG(CAST(nValue as bigint)) as val1
,(nTimestamp / 300) * 300 AS Timestamp
from myTable
where
nOperationId = 2 and nTimestamp >= 1593828600 and nTimestamp <= 1593838600
group by
strSwitchIp
,nTimestamp / 300
;
nTimestamp / 300 gives an integer, a number of 5-minute intervals since 1970. / discards here the fractional part.
When this number is multiplied back by 300, it becomes again the number of seconds since 1970, but "rounded" to the nearest 5-minute interval. Just as you showed in the question in the expected result.
For example:
1593828667 / 300 = 5312762.2233333333333333333333333
discard fractional part
1593828667 / 300 = 5312762
5312762 * 300 = 1593828600
So, all timestamps between 1593828600 and 1593828899 become 1593828600 and all values for these timestamps are grouped into one row and averaged.

you ca use partition like this:
select strDeviceIp, AVG(CAST(nValue as bigint)) as val1,
ROW_NUMBER() over(partition by nTimestamp order by nTimestamp desc) as ROW_NO from AmyTable) Q where q.ROW_NO%5=0
....

Postgresql Trying to calculate percentage of total using over(). Never used over() before but I've read this is the proper approach

Here is my simple table structure with a couple results:
The ages range from 18 - 100
I'm try to calculate the percentage of an age range that has a good job, such as 18 - 24, 24-30 etc. I need to sum the 'good_jobs' because this is survey data, and many did not respond, so many null values.
I'm trying combine what I can do in multiple queries into a single one:
query1:
select sum(good_job) as "18_24_GoodJobs"
from jf_q2
where
age >= 18 and age <= 24
and
working=1;
result 61
query2:
select sum(good_job) as "18_24_GoodJobs"
from jf_q2
where
age >= 18 and age <= 100
and
working=1;
result 2571
with a single query doing something like this:
select sum(good_job) as "18to24",
(sum(good_job)/ (select sum(good_job over(partition by good_job))) as Percentage)
from jf_q2
where
age >= 18 and age <= 24
and
working=1;
result = some_fractional number
I'm hoping for something like
18_24_GoodJobs| all_with_good_jobs
61 | 16%
Ultimately this is a flask app and I'll have to deal with this later, but I'm trying to get this query down to draw a graphic.
Thank you and Happy Sunday

You can do this with conditional aggregation:
select
sum(good_job) filter(where age between 18 and 24) 18_24_GoodJobs,
sum(good_job) filter(where age between 18 and 24)
/ sum(good_job) 18_24_GoodJobs_Part
from jf_q2
where working = 1
This gives you the count of good jobs for age 18-24 and the proportion of age 18-24 amongst good jobs (as a decimal value).

Percentiles in Oracle

I have the following query, in which I'm attempting to work out percentiles for the days between a letter being sent and today's date:
SELECT PERCENTILE_DISC(0.1) WITHIN GROUP (ORDER BY SUM(TRUNC(SYSDATE) -
( TO_DATE( SUBSTR(M.LETTER_SENT, 1, 11), 'YYYY-MON-DD') )) ASC) AS PERCENTILE_10,
PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY SUM(TRUNC(SYSDATE) -
( TO_DATE( SUBSTR(M.LETTER_SENT, 1, 11), 'YYYY-MON-DD') )) ASC) AS PERCENTILE_90
FROM MV_TABLE M8
WHERE M8.LETTER_SENT != 'N'
GROUP BY M8.LETTER_SENT;
I am perhaps wrong in thinking that, it should return the 10th and 90th percentile for the result set?
M.LETTER_SENT is in the format YYYY-MON-DD: USER_ID. So my query uses SYSDATE - TO_DATE(SUBSTR(M.LETTER_SENT,1, 11), 'YYYY-MON-DD') to work out the number of days between.
So the M.LETTER_SENT actual value for the result set I list below of 4 days 2015-Feb-27: rstone
That query returns the following result set:
242
4
4
4
39
11
18
361
My understanding of percentiles is that if you want the 90th percentile the following occurs.
number of records * percentile = percentile => (round up) = index value
So in my situation it's:
8 * 0.1 = 0.8 => (round up) = 1
8 * 0.9 = 7.2 => (round up) = 8
The 1st value on the ordered result set is: 4
The 8th value on the ordered result set is: 361
Oracle for me returns: 11 as the 10th percentile though?
When I do 0.2 + 0.8 percentiles I get 12, 242 respectively. I always understood there to be a few different ways to calculate percentiles. So how does Oracle calculate these results am I wrong in my thoughts of what the percentiles should be?

Group DateTime into 5,15,30 and 60 minute intervals

I am trying to group some records into 5-, 15-, 30- and 60-minute intervals:
SELECT AVG(value) as "AvgValue",
sample_date/(5*60) as "TimeFive"
FROM DATA
WHERE id = 123 AND sample_date >= 3/21/2012
i want to run several queries, each would group my average values into the desired time increments. So the 5-min query would return results like this:
AvgValue TimeFive
6.90 1995-01-01 00:05:00
7.15 1995-01-01 00:10:00
8.25 1995-01-01 00:15:00
The 30-min query would result in this:
AvgValue TimeThirty
6.95 1995-01-01 00:30:00
7.40 1995-01-01 01:00:00
The datetime column is in yyyy-mm-dd hh:mm:ss format
I am getting implicit conversion errors of my datetime column. Any help is much appreciated!

Using
datediff(minute, '1990-01-01T00:00:00', yourDatetime)
will give you the number of minutes since 1990-1-1 (you can use the desired base date).
Then you can divide by 5, 15, 30 or 60, and group by the result of this division.
I've cheked it will be evaluated as an integer division, so you'll get an integer number you can use to group by.
i.e.
group by datediff(minute, '1990-01-01T00:00:00', yourDatetime) /5
UPDATE As the original question was edited to require the data to be shown in date-time format after the grouping, I've added this simple query that will do what the OP wants:
-- This convert the period to date-time format
SELECT
-- note the 5, the "minute", and the starting point to convert the
-- period back to original time
DATEADD(minute, AP.FiveMinutesPeriod * 5, '2010-01-01T00:00:00') AS Period,
AP.AvgValue
FROM
-- this groups by the period and gets the average
(SELECT
P.FiveMinutesPeriod,
AVG(P.Value) AS AvgValue
FROM
-- This calculates the period (five minutes in this instance)
(SELECT
-- note the division by 5 and the "minute" to build the 5 minute periods
-- the '2010-01-01T00:00:00' is the starting point for the periods
datediff(minute, '2010-01-01T00:00:00', T.Time)/5 AS FiveMinutesPeriod,
T.Value
FROM Test T) AS P
GROUP BY P.FiveMinutesPeriod) AP
NOTE: I've divided this in 3 subqueries for clarity. You should read it from inside out. It could, of course, be written as a single, compact query
NOTE: if you change the period and the starting date-time you can get any interval you need, like weeks starting from a given day, or whatever you can need
If you want to generate test data for this query use this:
CREATE TABLE Test
( Id INT IDENTITY PRIMARY KEY,
Time DATETIME,
Value FLOAT)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:00:22', 10)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:03:22', 10)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:04:45', 10)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:07:21', 20)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:10:25', 30)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:11:22', 30)
INSERT INTO Test(Time, Value) VALUES('2012-03-22T00:14:47', 30)
The result of executing the query is this:
Period AvgValue
2012-03-22 00:00:00.000 10
2012-03-22 00:05:00.000 20
2012-03-22 00:10:00.000 30

Building on #JotaBe's answer (to which I cannot comment on--otherwise I would), you could also try something like this which would not require a subquery.
SELECT
AVG(value) AS 'AvgValue',
-- Add the rounded seconds back onto epoch to get rounded time
DATEADD(
MINUTE,
(DATEDIFF(MINUTE, '1990-01-01T00:00:00', your_date) / 30) * 30,
'1990-01-01T00:00:00'
) AS 'TimeThirty'
FROM YourTable
-- WHERE your_date > some max lookback period
GROUP BY
(DATEDIFF(MINUTE, '1990-01-01T00:00:00', your_date) / 30)
This change removes temp tables and subqueries. It uses the same core logic for grouping by 30 minute intervals but, when presenting the data back as part of the result I'm just reversing the interval calculation to get the rounded date & time.

So, in case you googled this, but you need to do it in mysql, which was my case:
In MySQL you can do
GROUP BY
CONCAT(
DATE_FORMAT(`timestamp`,'%m-%d-%Y %H:'),
FLOOR(DATE_FORMAT(`timestamp`,'%i')/5)*5
)

In the new SQL Server 2022, you can use DATE_BUCKET, this rounds it down to the nearest interval specified.
SELECT
DATE_BUCKET(minute, 5, d.sample_date) AS TimeFive,
AVG(d.value) AS AvgValue
FROM DATA d
WHERE d.id = 123
AND d.sample_date >= '20121203'
GROUP BY
DATE_BUCKET(minute, 5, d.sample_date);

You can use the following statement, this removed the second component and calculates the number of minutes away from the five minute mark and uses this to round down to the time block. This is ideal if you want to change your window, you can simply change the mod value.
select dateadd(minute, - datepart(minute, [YOURDATE]) % 5, dateadd(minute, datediff(minute, 0, [YOURDATE]), 0)) as [TimeBlock]

This will help exactly what you want
replace dt - your datetime c - call field astro_transit1 - your table 300 refer 5 min so add 300 each time for time gap increase
SELECT FROM_UNIXTIME( 300 * ROUND( UNIX_TIMESTAMP( r.dt ) /300 ) ) AS 5datetime, ( SELECT r.c FROM astro_transit1 ra WHERE ra.dt = r.dt ORDER BY ra.dt DESC LIMIT 1 ) AS first_val FROM astro_transit1 r GROUP BY UNIX_TIMESTAMP( r.dt ) DIV 300 LIMIT 0 , 30

group-by/aggregation over date ranges

Got table ABC with one column. A date column "created". So sample values are like;
created
2009-06-18 13:56:00
2009-06-18 12:56:00
2009-06-17 14:02:00
2009-06-17 13:12:23
2009-06-16 10:02:10
I want to write a query so that the results are:
count created
2 2009-06-18
2 2009-06-17
1 2009-06-16
Basically count how many entries belong to each date, but ignoring time.
This is in PL-SQL with Oracle.
Any ideas?

The TRUNC function returns the DATE of the DATETIME.
select trunc(created),count(*) from ABC group by trunc(created)

select count(*), to_char('YYYY-MM-DD', created) from ABC group by to_char('YYYY-MM-DD', created)

Just for completeness, I'll add a generalized variant that works for arbitrarily defined bucket sizes:
SELECT trunc( (created - to_date('2000.01.01', 'YYYY.MM.DD'))
* 24
) / 24
+ to_date('2000.01.01', 'YYYY.MM.DD') theDate
, count(*)
FROM Layer
GROUP BY trunc( (created - to_date('2000.01.01', 'YYYY.MM.DD'))
* 24
) / 24
+ to_date('2000.01.01', 'YYYY.MM.DD')
ORDER BY 1;
The query above will give you the counts by hour; use something smaller than 24 to get larger intervals, or bigger to create smaller intervals. By reversing the position of the * and /, you can make your buckets be in increments of days (e.g. doing " / 7 " instead of " * 24 " would give you buckets of one week each.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Access query to partition data and sum each partition? - sql

I may have misremembered how you cast to int in Access, but this might work: Select [date], 100 * (1 + Cint(([Hour] - 1) / 100)), Sum(Value) From Query Group By [date], 100 * (1 + Cint(([Hour] - 1) / 100)) Order By 1, 2

SELECT DateCol, Int(HourCol \ -100) * -100 AS Hr, Sum(Value) AS Value FROM YourTable GROUP BY DateCol, Int(HourCol \ -100) * -100 Or you can use ((Hr + 99) \ 100) * 100.

Related

Query using group by with steps/range over large data

Postgresql Trying to calculate percentage of total using over(). Never used over() before but I've read this is the proper approach

Percentiles in Oracle

Group DateTime into 5,15,30 and 60 minute intervals

group-by/aggregation over date ranges

Categories

Resources