SQL select one row for every n minutes - sql

I'm using Microsoft SQL Server 2008 and have a data set with entries every few minutes over a long period of time. I am using a program to graph the data, so I need to return about 20 values per hour. On some days the data arrives every minute, on others every five minutes, and sometimes every 8 or 9 minutes, so selecting every nth row won't give an even spread over time.
For example, a sample from 2012 looks like this:
DateTime
2012-01-01 08:00:10.000
2012-01-01 08:08:35.000
2012-01-01 08:17:01.000
2012-01-01 08:25:26.000
A sample from the next year looks like this:
DateTime
2013-07-20 08:00:00.000
2013-07-20 08:01:00.000
2013-07-20 08:02:00.000
2013-07-20 08:03:00.000
2013-07-20 08:04:00.000
At the moment I am using a statement like this:
SELECT * FROM [Master]
WHERE (((CAST(DATEPART(hour, DateTime)as varchar(2)))*60)
+CAST(DATEPART(minute, DateTime)as varchar(2))) % '5' = '0'
ORDER BY DateTime
This works fine for July 2013, but I miss most points in 2012, as it returns this:
DateTime
2012-01-01 08:00:10.000
2012-01-01 08:25:26.000
2012-01-01 08:50:43.000
2012-01-01 09:15:59.000
2012-01-01 10:40:14.000
2012-01-01 11:05:30.000
What better way is there to do this?
EDIT: The table has a DateTime column, and a pressure column, and I need to output both and graph pressure against date and time.

Since they can be random for the hours, this should work for what you need:
Declare @NumberPerHour Int = 20
;With Cte As
(
Select DateTime, Row_Number() Over (Partition By DateDiff(Hour, 0, DateTime) Order By NewId()) RN
From Master
)
Select DateTime
From Cte
Where RN <= @NumberPerHour
Order By DateTime Asc
This groups the rows by hour, assigns each row a random Row_Number within its hour, and returns only the rows whose Row_Number is at most the number you want per hour.
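If an even spread over time matters more than a random sample, one alternative (not the method above) is to split time into fixed buckets and keep the first row in each bucket. This is a minimal sketch, assuming the pressure column is named Pressure (the question does not give its name) and that 3-minute buckets (60 / 20) are acceptable:

```sql
-- Assumption: 3-minute buckets give at most 20 rows per hour.
DECLARE @BucketMinutes int = 3;

WITH Bucketed AS
(
    SELECT [DateTime],
           Pressure,  -- assumed column name
           ROW_NUMBER() OVER (
               -- Integer division puts each row into a 3-minute bucket.
               PARTITION BY DATEDIFF(minute, 0, [DateTime]) / @BucketMinutes
               ORDER BY [DateTime]) AS RN
    FROM [Master]
)
SELECT [DateTime], Pressure
FROM Bucketed
WHERE RN = 1          -- earliest row in each bucket
ORDER BY [DateTime];
```

On days where rows are at least 3 minutes apart, no two rows share a bucket, so every row is returned. On days with one row per minute, roughly every third row is kept.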

Related

How can I handle a Julian Date rollover in SQL?

I have a list of julian dates that I need to keep in order ex. 362, 363, 364, 365, 001, 002, 003. My query starts with getting the last julian date processed and each date after that. Right now it will max my lowest date out at 365 and I can't get the records that follow it. The same set of data also has a date field with the year attached but it doesn't seem to be helpful since those records won't be gathered until the rollover is corrected. Here is my simplified query:
select JulianDate, RecordDate
from table
where JulianField > @LowestJulianDate
and RecordDate between GetDate() and DateAdd(day, 6, GetDate())
Sample data:
JulianDate  RecordDate
362         2020-12-28
363         2020-12-29
364         2020-12-30
365         2020-12-31
001         2021-01-01
002         2021-01-02
003         2021-01-03
Desired output:
JulianDate
362
363
364
365
001
002
003
So if you imagine we start on day 362, our @LowestJulianDate is 362, and our record date range is today plus the next 6 days, which completes that list of Julian dates.
How can I get the dates to stay in order across the rollover?
You cannot by just using the "JulianDate" which is actually the DayOfYear value. You would need to also store the year that it refers to either separately or as part of the "JulianDate" value. For example, instead of "362" you need "2021362".
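Since the sample rows already carry a RecordDate, one way to sketch the combined year-plus-day key is to build it from the year of RecordDate and the stored day number. The table variable and the YearJulian alias are hypothetical names used only for illustration:

```sql
-- Hypothetical sample table mirroring the question's data.
DECLARE @Records TABLE (JulianDate char(3), RecordDate date);

INSERT INTO @Records (JulianDate, RecordDate) VALUES
    ('362', '20201228'), ('363', '20201229'), ('364', '20201230'),
    ('365', '20201231'), ('001', '20210101'), ('002', '20210102'),
    ('003', '20210103');

SELECT JulianDate,
       -- e.g. year 2020 and day 362 become 2020362
       YEAR(RecordDate) * 1000 + CAST(JulianDate AS int) AS YearJulian
FROM @Records
ORDER BY YearJulian;
```

Because the year occupies the high digits, day 001 of the new year sorts after day 365 of the old year.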
Why not sort by a year column and the Julian date column?
select JulianDate, RecordDate
from table
order by yearcolumn,JulianDate
When we have no year and need to sort a list across the year rollover for a 7-day rolling window, we look at the leftmost digit of the Julian day. If it is less than 3, the day has rolled over into the new year. We sort into 2 baskets (old year and new year), order each one, then recombine them with the new year's data being the "greatest" in the list.
We look at the leftmost digit because in our application the last day of data we get may be 357 and the rollover may be 003, for example.
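The two-basket idea can be sketched as a single ORDER BY. This assumes the day is stored as a zero-padded 3-character string and that the window only ever spans the end of one year and the start of the next. The table variable is hypothetical:

```sql
-- Hypothetical sample: a rolling window that crosses the year boundary.
DECLARE @Days TABLE (JulianDate char(3));

INSERT INTO @Days (JulianDate) VALUES
    ('003'), ('365'), ('357'), ('001'), ('362');

SELECT JulianDate
FROM @Days
ORDER BY
    -- Basket 0 = old year, basket 1 = new year (leading digit below 3).
    CASE WHEN LEFT(JulianDate, 1) < '3' THEN 1 ELSE 0 END,
    JulianDate;
```

This only holds near a rollover. In the middle of a year, days such as 150 would land in the "new year" basket, so the approach is not a general replacement for storing the year.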

How to retrieve data within a 6 day time period

I want to retrieve the data between a 6 day time period.
The output I want is:
Date
--------
2019-05-01
2019-05-04
2019-06-01
2019-06-06
2019-07-01
This is my query so far:
select date from data d
where CAST(d.createdate as Date) between CAST('2019-05-01' as Date)
AND DATEADD(CAST(dd,6,'2016-07-01') as Date)
Why is this not retrieving the results I want?
You have several problems with your query.
The first is with your DATEADD statement which is all mixed up. You are not nesting the casted date into the statement properly. This is the corrected version:
DATEADD(dd, 6, CAST('2016-07-01' as Date))
The second is that your select projection refers to the column date which does not exist. Instead, you probably want your createdate column.
The third is that your between clause is back to front. You are saying between 2019-05-01 and 2016-07-01 but the smaller date must come first.
In fact, your given example is inconsistent. In your question, you say you want to retrieve the data within a 6 day period. So why would you start with a date in 2019 and then add 6 days to a date in 2016? If you want to use the DATEADD approach, you need to use the same date in both positions.
So here is your corrected query:
select d.createdate from data d
where CAST(d.createdate as Date) between CAST('2019-05-01' as Date)
AND DATEADD(dd, 6, CAST('2019-05-01' as Date))
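Note that BETWEEN is inclusive at both ends, so the query above covers 2019-05-01 through 2019-05-07. As a hedged alternative, the same range can be written as a half-open interval without casting the column, which may let an index on createdate be used. The @from variable is introduced here only for illustration, and the date type assumes SQL Server 2008 or later:

```sql
DECLARE @from date = '20190501';   -- illustrative start date

SELECT d.createdate
FROM data AS d
WHERE d.createdate >= @from
  AND d.createdate <  DATEADD(day, 7, @from);  -- up to, not including, 2019-05-08
```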

How to split time into hourly slot using SQL (can use view,or stored proc or function)

Example I have data in table which has start date, end date and duration. I want to show hourly time slot.
logic:
Condition 1. If start date =9:00 and end date = 11:00 then show the date as
09:00-10:00
10:00-11:00
It should repeat 2 times and all related column data will also repeat 2 times.
this will continue if time slot is suppose 11:00- 14:00 then
11:00-12:00
12:00-13:00
13:00-14:00
It should repeat 3 times.
Condition 2: If start date is 9:30 and end date is 10:30 then
time should round up. i.e. start date should be 9:00 and end date should be 11:00
How can I achieve this in Sql Server?
I assume that your issue is getting multiple rows from one, rather than formatting the date/time values as a string.
For this, you can use a recursive CTE:
with cte as (
select startdate as thetime, t.*
from t
union all
select dateadd(hour, 1, cte.thetime), . . . -- rest of columns here
from cte
where dateadd(hour, 1, cte.thetime) < cte.enddate
)
select cte.*
from cte;
You can then format thetime however you like, including the hyphenated version in your question.
SQL Server has a default limit of 100 recursion levels, which roughly caps the number of rows generated from a single source row. Your example only uses times, so this can't exceed 24 and is not an issue. However, it could be an issue in other circumstances, in which case option (maxrecursion 0) can be added to the query.
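Putting the pieces together, here is a self-contained sketch that also handles Condition 2 by rounding the start down and the end up to the hour. The table variable, its columns, and the slot_from / slot_until names are assumptions for illustration, not the asker's schema:

```sql
-- Hypothetical sample rows.
DECLARE @t TABLE (id int, startdate datetime, enddate datetime);

INSERT INTO @t (id, startdate, enddate) VALUES
    (1, '2020-01-01T09:30:00', '2020-01-01T10:30:00'),
    (2, '2020-01-01T11:00:00', '2020-01-01T14:00:00');

WITH rounded AS (
    SELECT id,
           -- Round the start down to the hour.
           DATEADD(hour, DATEDIFF(hour, 0, startdate), 0) AS slot_from,
           -- Round the end up to the hour unless it is already on the hour.
           CASE WHEN enddate = DATEADD(hour, DATEDIFF(hour, 0, enddate), 0)
                THEN enddate
                ELSE DATEADD(hour, DATEDIFF(hour, 0, enddate) + 1, 0)
           END AS slot_until
    FROM @t
),
slots AS (
    SELECT id, slot_from, slot_until
    FROM rounded
    UNION ALL
    -- Advance one hour at a time until the next slot would start at the end.
    SELECT id, DATEADD(hour, 1, slot_from), slot_until
    FROM slots
    WHERE DATEADD(hour, 1, slot_from) < slot_until
)
SELECT id,
       CONVERT(char(5), slot_from, 108) + '-' +
       CONVERT(char(5), DATEADD(hour, 1, slot_from), 108) AS slot
FROM slots
ORDER BY id, slot_from;
```

With these sample rows, id 1 yields the slots 09:00-10:00 and 10:00-11:00, and id 2 yields three slots from 11:00 to 14:00.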

Find the minute difference between 2 date time

I need to get the difference in minutes between two date-time values. The last difference for each date should be calculated against 6 PM of that date.
Sample data (the last column is the result I need):
User_Name  Date              Time difference in minutes
User 1     1/1/06 12:00 PM   30
user 2     1/1/06 12:30 PM   315
user 3     1/1/06 5:45 PM    15
All rows fall on the same date, and the last user's difference is calculated against the default value of 6 PM. Assume that no user's date-time goes past 6 PM.
Please suggest how to write a query for this.
You could use the lead window function.
I assume your table is called mytable and the date column is mydate (it is a bad idea to call a column Date as it is a reserved word).
select user_name,
round((lead(mydate, 1, trunc(mydate)+18/24)
over (partition by trunc(mydate) order by mydate)
- mydate) *24*60) as difference
from mytable
I found a solution. If it's not correct, let me know:
SELECT User_name,created_date,
trunc(to_number((cast(nvl(lead (created_date,1) OVER (ORDER BY created_date),TRUNC(SYSDATE) + (19/24)) as date) - cast(created_date as date)))*24*60) as difference
FROM users;

Reduce/Summarize and Replace Timestamped Records

I have a SQL table that has timestamped records for server performance data. This data is polled and stored every minute for multiple servers. I want to keep data for a long period of time but reduce the number of records for data older than six months.
For example, I have some old records like so:
    Timestamp   Server  CPU  App1  App2
1   ... 00:01   Host1   5    1     10
2   ... 00:01   Host2   10   5     20
3   ... 00:02   Host1   6    0     11
4   ... 00:02   Host2   11   5     20
5   ... 00:03   Host1   4    1     9
6   ... 00:04   Host2   9    6     19
I want to be able to reduce this data from every minute to every 10 minutes or possibly every hour for older data.
My initial assumption is that I'd average the values for times within a 10 minute time period and create a new timestamped record after deleting the old records. Could I create a sql query that generates the insert statements for the new summarized records? What would that query look like?
Or is there a better way to accomplish this summarization job?
You might also want to consider moving the summarized information into a different table so you don't end up in a situation where you're wondering if you're looking at "raw" or summarized data. Other benefits would be that you could include MAX, MIN, STDDEV and other values along with the AVG.
The tricky part is chunking out the times. The best way I could think of was to start with the output from the CONVERT(blah, Timestamp, 120) function:
-- Result: 2015-07-08 20:50:55
SELECT CONVERT(VARCHAR(19), CURRENT_TIMESTAMP, 120)
By cutting it off after the hour or after the 10-minute point you can truncate the times:
-- Hour; result is 2015-07-08 20
SELECT CONVERT(VARCHAR(13), CURRENT_TIMESTAMP, 120)
-- 10-minute point; result is 2015-07-08 20:5
SELECT CONVERT(VARCHAR(15), CURRENT_TIMESTAMP, 120)
With a little more massaging you can fill out the minutes for either one and CAST it back to a DATETIME or DATETIME2:
-- Hour increment
CAST(CONVERT(VARCHAR(13), CURRENT_TIMESTAMP, 120) + ':00' AS DATETIME)
-- 10-minute increment
CAST(CONVERT(VARCHAR(15), CURRENT_TIMESTAMP, 120) + '0' AS DATETIME)
Using the logic above, all times are truncated. In other words, the hour formula will convert Timestamp where 11:00 <= Timestamp < 12:00 to 11:00. The minute formula will convert Timestamp where 11:20 <= Timestamp < 11:30 to 11:20.
So the better part of the query looks like this (I've left out getting rid of the rows you've just summarized):
-- The hour-increment version
INSERT INTO myTableOrOtherTable
SELECT
CAST(CONVERT(VARCHAR(13), [Timestamp], 120) + ':00' AS DATETIME),
AVG(CPU),
AVG(App1),
AVG(App2)
FROM myTable
GROUP BY
CAST(CONVERT(VARCHAR(13), [Timestamp], 120) + ':00' AS DATETIME)
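The hour version above averages across all servers. The same truncation idea can be sketched for 10-minute buckets with one summary row per server. The table variable, the sample date, and the BucketStart alias are assumptions for illustration. CONVERT with style 120 is used on the way back to DATETIME so the result does not depend on the session's date format, and the values are cast to decimal because AVG over an int column returns an int:

```sql
-- Hypothetical sample rows modelled on the question's data.
DECLARE @perf TABLE ([Timestamp] datetime, [Server] varchar(20),
                     CPU int, App1 int, App2 int);

INSERT INTO @perf ([Timestamp], [Server], CPU, App1, App2) VALUES
    ('2015-07-08T00:01:00', 'Host1',  5, 1, 10),
    ('2015-07-08T00:01:00', 'Host2', 10, 5, 20),
    ('2015-07-08T00:02:00', 'Host1',  6, 0, 11),
    ('2015-07-08T00:02:00', 'Host2', 11, 5, 20),
    ('2015-07-08T00:03:00', 'Host1',  4, 1,  9),
    ('2015-07-08T00:04:00', 'Host2',  9, 6, 19);

SELECT
    -- Keep 'yyyy-mm-dd hh:m', append '0' to get the start of the 10-minute bucket.
    CONVERT(DATETIME, CONVERT(VARCHAR(15), [Timestamp], 120) + '0', 120) AS BucketStart,
    [Server],
    AVG(CAST(CPU  AS decimal(9, 2))) AS AvgCPU,
    AVG(CAST(App1 AS decimal(9, 2))) AS AvgApp1,
    AVG(CAST(App2 AS decimal(9, 2))) AS AvgApp2
FROM @perf
GROUP BY
    CONVERT(DATETIME, CONVERT(VARCHAR(15), [Timestamp], 120) + '0', 120),
    [Server]
ORDER BY BucketStart, [Server];
```

All six sample rows fall in the 00:00 bucket, so the result has one row for Host1 and one for Host2.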
Assuming you have a record for every minute, this is how you can group your records by 10 minutes:
SELECT
[Timestamp] = MIN([Timestamp]),
[Server],
CPU = AVG(CPU),
App1 = AVG(App1),
App2 = AVG(App2)
FROM (
SELECT *,
RN = (ROW_NUMBER() OVER(PARTITION BY [Server] ORDER BY [Timestamp]) - 1) / 10
FROM temp
)t
GROUP BY [Server], RN