SQL query to select the records from the past 60 seconds, compare their temperatures, and ignore any record with an outlying value

I am trying to eliminate anomalies in the data I receive from Event Hub and send only the selected data to Azure Functions through Azure Stream Analytics. I am writing a SQL query for that and need some help.
Requirement: I need to collect the past 60 seconds of data, group it by Id, and compare the records received in that window. If any record's value is far from the other values, that record should be ignored. Example 1: the data is 40 40 40 40 5, so we should drop the 5. Example 2: the data is 20 20 20 500, so we should drop the 500.
My SQL table will look something like this:
id   Temp  date        datetime
123  30    2023-01-01  2023-01-01 12:00:00
124  35    2023-01-01  2023-01-01 12:00:00
123  31    2023-01-01  2023-01-01 12:00:00
123  33    2023-01-01  2023-01-01 12:00:00
123  60    2023-01-01  2023-01-01 12:00:00
124  36    2023-01-01  2023-01-01 12:00:00
124  36    2023-01-01  2023-01-01 12:00:00
124  8     2023-01-01  2023-01-01 12:00:00
124  36    2023-01-01  2023-01-01 12:00:00
I need to eliminate the records that are out of range compared with the other records.

I'll leave the details of the comparison up to you, but you can use a CROSS APPLY to gather the data for comparison.
Something like:
SELECT *
FROM TemperatureData T
CROSS APPLY (
SELECT AVG(T2.Temp * 1.0) AS PriorAvgTemp, COUNT(*) As PriorCount
FROM TemperatureData T2
WHERE T2.id = T.id
AND T2.datetime >= DATEADD(second, -60, T.datetime)
AND T2.datetime < T.datetime
) P
WHERE T.Temp BETWEEN P.PriorAvgTemp - 10 AND P.PriorAvgTemp + 10
--OR P.PriorCount < 3 -- Should we allow if there is insufficient prior data
--AND P.PriorCount >= 3 -- Should we omit if there is insufficient prior data
Be sure you have an index on TemperatureData(id, datetime).
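As a rough illustration of what the CROSS APPLY query computes, here is a minimal Python sketch of the same comparison. The function name, the tuple layout, and the timestamps spaced 10 seconds apart are assumptions made for the example; the ±10 tolerance and the optional PriorCount rule follow the query above.

```python
from datetime import datetime, timedelta

def filter_by_prior_average(rows, window_seconds=60, tolerance=10.0,
                            allow_if_prior_below=0):
    """Keep rows whose temp is within `tolerance` of the average temp of the
    same id's rows in the preceding `window_seconds` (current row excluded).

    rows: list of (id, temp, timestamp) tuples (hypothetical layout).
    allow_if_prior_below: mirrors the commented-out "OR PriorCount < 3" option.
    The default 0 never triggers, so a row with no prior data is dropped,
    just as a NULL average fails the BETWEEN test in SQL.
    """
    kept = []
    for row_id, temp, ts in rows:
        window_start = ts - timedelta(seconds=window_seconds)
        prior = [t for i, t, s in rows
                 if i == row_id and window_start <= s < ts]
        if len(prior) < allow_if_prior_below:
            kept.append((row_id, temp, ts))  # too little history: let it through
            continue
        if not prior:
            continue  # no history and no leniency: drop
        avg = sum(prior) / len(prior)
        if avg - tolerance <= temp <= avg + tolerance:
            kept.append((row_id, temp, ts))
    return kept

# Assumed sample: the id 123 readings from the question, 10 seconds apart.
base = datetime(2023, 1, 1, 12, 0, 0)
sample = [(123, temp, base + timedelta(seconds=10 * n))
          for n, temp in enumerate([30, 31, 33, 60])]
strict = filter_by_prior_average(sample)
lenient = filter_by_prior_average(sample, allow_if_prior_below=3)
```

With this sample, `strict` keeps the readings 31 and 33 (the first row has no history, and 60 is too far from the prior average), while `lenient` also keeps the first reading of 30.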
If you are willing to accept the last N values instead of a time range, a windowed aggregate calculation may be more efficient:
SELECT *
FROM (
SELECT *,
AVG(T.Temp * 1.0)
OVER(PARTITION BY id ORDER BY datetime
ROWS BETWEEN 60 PRECEDING AND 1 PRECEDING)
AS PriorAvgTemp,
COUNT(*)
OVER(PARTITION BY id ORDER BY datetime
ROWS BETWEEN 60 PRECEDING AND 1 PRECEDING)
AS PriorCount
FROM TemperatureData T
) TT
WHERE TT.Temp BETWEEN TT.PriorAvgTemp - 10 AND TT.PriorAvgTemp + 10
--OR TT.PriorCount < 3 -- Should we allow if there is insufficient prior data
--AND TT.PriorCount >= 3 -- Should we omit if there is insufficient prior data
Please note: The above is untested code, which may need some syntax fixes and debugging. If you discover errors, please comment and I will correct the post.
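The last-N variant can be sketched in Python in the same spirit. This is only an illustration for a single id whose readings are already ordered by time; the function name and the sample lists are assumptions, and the window size and tolerance are parameters rather than fixed values.

```python
from collections import deque

def filter_by_last_n(temps, n=60, tolerance=10.0):
    """Compare each reading with the average of up to `n` preceding readings,
    like ROWS BETWEEN n PRECEDING AND 1 PRECEDING in the query above.

    temps: readings for one id, ordered by time (assumed input shape).
    """
    window = deque(maxlen=n)  # holds the most recent n readings
    kept = []
    for temp in temps:
        if window:  # an empty window means a NULL average, so the row is dropped
            avg = sum(window) / len(window)
            if avg - tolerance <= temp <= avg + tolerance:
                kept.append(temp)
        # Every reading enters the window, kept or not, as in the SQL.
        window.append(temp)
    return kept

result = filter_by_last_n([20, 20, 20, 500, 20])
```

In this sketch, as in the query, a rejected reading still enters the window: the 500 raises the next average to 140, so the following 20 is rejected as well and `result` is `[20, 20]`. Comparing against a median, or leaving rejected rows out of later windows, might avoid that at the cost of a more involved query.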

Related

How to build in product expiration in SQL?

I have a table that looks like the following and from it I want to get days remaining of total doses:
USER|PURCHASE_DATE|DOSES
1111|2017-07-27|15
2222|2020-07-17|3
3333|2021-02-01|5
If the doses do not have an expiration and each can be used for 90 days then the SQL I use is:
SUM(DOSES)*90-DATEDIFF(DAY,MIN(DATE),GETDATE())
USER|DAYS_REMAINING
1111|0
2222|6
3333|385
But what if I want to impose an expiration of each dose at a year? What can I do to modify my SQL to get the following desired answer:
USER|DAYS_REMAINING
1111|-985
2222|6
3333|300
It probably involves taking the MIN between when doses expire and how long they would last but I don't know how to aggregate in the expiry logic.
MIN is an aggregate function; you want LEAST to pick the smaller of the two values:
WITH data(user,purchase_date, doses) AS (
SELECT * FROM VALUES
(1111,'2017-07-27',15),
(2222,'2020-07-17',3),
(3333,'2021-02-01',5)
)
SELECT
d.*,
d.doses * 90 AS doses_duration,
365::number AS year_duration,
least(doses_duration, year_duration) as max_duration,
DATEADD('day', max_duration, d.purchase_date)::date as last_dose_day,
DATEDIFF('day', current_date, last_dose_day) as day_remaining
FROM data AS d
ORDER BY 1;
gives:
USER  PURCHASE_DATE  DOSES  DOSES_DURATION  YEAR_DURATION  MAX_DURATION  LAST_DOSE_DAY  DAY_REMAINING
1111  2017-07-27     15     1350            365            365           2018-07-27     -986
2222  2020-07-17     3      270             365            270           2021-04-13     5
3333  2021-02-01     5      450             365            365           2022-02-01     299
which can all be rolled together, with a tiny fix on the DATEDIFF (adding 1 so the result matches the desired output), as:
WITH data(user,purchase_date, doses) AS (
SELECT * FROM VALUES
(1111,'2017-07-27',15),
(2222,'2020-07-17',3),
(3333,'2021-02-01',5)
)
SELECT
d.user,
DATEDIFF('day', current_date, DATEADD('day', least(d.doses * 90, 365::number), d.purchase_date)::date)+1 as day_remaining
FROM data AS d
ORDER BY 1;
giving:
USER DAY_REMAINING
1111 -985
2222 6
3333 300
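The same arithmetic can be checked outside the database. The sketch below is a Python restatement of the rolled-up query, not part of the original answer; the function name is made up, and the reference date 2021-04-08 is an assumption inferred from the outputs shown (it is the date for which the intermediate DAY_REMAINING values come out as -986, 5 and 299).

```python
from datetime import date, timedelta

def days_remaining(purchase_date, doses, today, days_per_dose=90, expiry_days=365):
    # LEAST(doses * 90, 365): the supply ends when it runs out or when it expires
    duration = min(doses * days_per_dose, expiry_days)
    last_dose_day = purchase_date + timedelta(days=duration)
    # The +1 mirrors the adjustment made in the rolled-up query
    return (last_dose_day - today).days + 1

today = date(2021, 4, 8)  # assumed "current_date", inferred from the sample output
purchases = {
    1111: (date(2017, 7, 27), 15),
    2222: (date(2020, 7, 17), 3),
    3333: (date(2021, 2, 1), 5),
}
remaining = {user: days_remaining(bought, doses, today)
             for user, (bought, doses) in purchases.items()}
```

With that assumed date, `remaining` is `{1111: -985, 2222: 6, 3333: 300}`, which matches the desired answer in the question.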

How to calculate a running total that is a distinct sum of values

Consider this dataset:
id site_id type_id value date
------- ------- ------- ------- -------------------
1 1 1 50 2017-08-09 06:49:47
2 1 2 48 2017-08-10 08:19:49
3 1 1 52 2017-08-11 06:15:00
4 1 1 45 2017-08-12 10:39:47
5 1 2 40 2017-08-14 10:33:00
6 2 1 30 2017-08-09 07:25:32
7 2 2 32 2017-08-12 04:11:05
8 3 1 80 2017-08-09 19:55:12
9 3 2 75 2017-08-13 02:54:47
10 2 1 25 2017-08-15 10:00:05
I would like to construct a query that returns a running total for each date by type. I can get close with a window function, but I only want the latest value for each site to be summed for the running total (a simple window function will not work because it sums all values up to a date--not just the last values for each site). So I guess it could be better described as a running distinct total?
The result I'm looking for would be like this:
type_id date sum
------- ------------------- -------
1 2017-08-09 06:49:47 50
1 2017-08-09 07:25:32 80
1 2017-08-09 19:55:12 160
1 2017-08-11 06:15:00 162
1 2017-08-12 10:39:47 155
1 2017-08-15 10:00:05 150
2 2017-08-10 08:19:49 48
2 2017-08-12 04:11:05 80
2 2017-08-13 02:54:47 155
2 2017-08-14 10:33:00 147
The key here is that the sum is not a running sum. It should only be the sum of the most recent values for each site, by type, at each date. I think I can help explain it by walking through the result set I've provided above. For my explanation, I'll walk through the original data chronologically and try to explain the expected result.
The first row of the result starts us off, at 2017-08-09 06:49:47, where chronologically, there is only one record of type 1 and it is 50, so that is our sum for 2017-08-09 06:49:47.
The second row of the result is at 2017-08-09 07:25:32, at this point in time we have 2 unique sites with values for type_id = 1. They have values of 50 and 30, so the sum is 80.
The third row of the result occurs at 2017-08-09 19:55:12, where now we have 3 sites with values for type_id = 1. 50 + 30 + 80 = 160.
The fourth row is where it gets interesting. At 2017-08-11 06:15:00 there are 4 records with a type_id = 1, but 2 of them are for the same site. I'm only interested in the most recent value for each site so the values I'd like to sum are: 30 + 80 + 52 resulting in 162.
The 5th row is similar to the 4th since the value for site_id:1, type_id:1 has changed again and is now 45. The latest values for type_id:1 at 2017-08-12 10:39:47 are therefore: 30 + 80 + 45 = 155.
Reviewing the 6th row is also interesting when we consider that at 2017-08-15 10:00:05, site 2 has a new value for type_id 1, which gives us: 80 + 45 + 25 = 150 for 2017-08-15 10:00:05.
You can get a cumulative total (running total) by including an ORDER BY clause in your window frame.
select
type_id,
date,
sum(value) over (partition by type_id order by date) as sum
from your_table;
The ORDER BY works because, when it is present, the default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
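-- Sum only the change in each site's value, so that a new reading
-- replaces that site's previous reading in the running total.
SELECT type_id,
       date,
       SUM(delta) OVER (PARTITION BY type_id ORDER BY date) AS sum
FROM (
    SELECT type_id,
           date,
           value - COALESCE(LAG(value) OVER (PARTITION BY type_id, site_id ORDER BY date), 0) AS delta
    FROM your_table
) AS t
ORDER BY type_id,
         date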
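The requirement in the question (sum only the most recent value of each site, per type, at each date) can also be sketched procedurally. The Python below is an illustration using the question's data; the function name and tuple layout are assumptions. It keeps the latest value per (type, site) and adjusts the type's total by the difference whenever a site reports again.

```python
def running_latest_totals(rows):
    """rows: (site_id, type_id, value, date) tuples.
    Returns (type_id, date, total) tuples, where total is the sum of the
    latest value of every site seen so far for that type."""
    latest = {}  # (type_id, site_id) -> most recent value
    totals = {}  # type_id -> current sum of latest values
    out = []
    for site_id, type_id, value, when in sorted(rows, key=lambda r: r[3]):
        previous = latest.get((type_id, site_id), 0)
        # Add the new value and take back the one it replaces.
        totals[type_id] = totals.get(type_id, 0) + value - previous
        latest[(type_id, site_id)] = value
        out.append((type_id, when, totals[type_id]))
    return sorted(out)  # ISO date strings sort chronologically

data = [
    (1, 1, 50, "2017-08-09 06:49:47"), (1, 2, 48, "2017-08-10 08:19:49"),
    (1, 1, 52, "2017-08-11 06:15:00"), (1, 1, 45, "2017-08-12 10:39:47"),
    (1, 2, 40, "2017-08-14 10:33:00"), (2, 1, 30, "2017-08-09 07:25:32"),
    (2, 2, 32, "2017-08-12 04:11:05"), (3, 1, 80, "2017-08-09 19:55:12"),
    (3, 2, 75, "2017-08-13 02:54:47"), (2, 1, 25, "2017-08-15 10:00:05"),
]
result = running_latest_totals(data)
```

For type_id 1 this yields the totals 50, 80, 160, 162, 155, 150, and for type_id 2 it yields 48, 80, 155, 147, as in the expected result.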

GROUP BY several hours

I have a table where our product records its activity log. The product starts working at 23:00 every day and usually works for one or two hours. This means that once a batch has started at 23:00, it finishes at about 1:00 am the next day.
Now, I need to take statistics on how many posts are registered per batch but cannot figure out a script that would allow me to achieve this. So far I have the following SQL code:
SELECT COUNT(*), DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
ORDER BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
which results in the following:
count day hour
....
1189 9 23
8611 10 0
2754 10 23
6462 11 0
1885 11 23
I.e. I want the number for 9th 23:00 grouped with the number for 10th 00:00, 10th 23:00 with 11th 00:00 and so on. How could I do it?
You can do it very easily: use DATEADD to add an hour to the original registrationtime. All the registration times of a batch are then moved to the same day, and you can simply group by the date part.
You could also do it in a more complicated way using CASE WHEN, but that is overkill given this easy solution.
I had to do something similar a few days ago. I had fixed timespans for work shifts to group by where one of them could start on one day at 10pm and end the next morning at 6am.
What I did was:
Define a "shift date": for every entry in the table, the day on which its shift started, with the time set to zero. I did so by checking whether the entry's timestamp was between 0am and 6am. In that case I took only the date part of DATEADD(dd, -1, entryDate), which returned the previous day for all entries between 0am and 6am.
I also added an ID for the shift: 0 for the first one (6am to 2pm), 1 for the second one (2pm to 10pm) and 2 for the last one (10pm to 6am).
I was then able to group over the shift date and shift IDs.
Example:
Consider the following source entries:
Timestamp SomeData
=============================
2014-09-01 06:01:00 5
2014-09-01 14:01:00 6
2014-09-02 02:00:00 7
Step one extended the table as follows:
Timestamp SomeData ShiftDay
====================================================
2014-09-01 06:01:00 5 2014-09-01 00:00:00
2014-09-01 14:01:00 6 2014-09-01 00:00:00
2014-09-02 02:00:00 7 2014-09-01 00:00:00
Step two extended the table as follows:
Timestamp SomeData ShiftDay ShiftID
==============================================================
2014-09-01 06:01:00 5 2014-09-01 00:00:00 0
2014-09-01 14:01:00 6 2014-09-01 00:00:00 1
2014-09-02 02:00:00 7 2014-09-01 00:00:00 2
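The two steps can be sketched as a single function. This Python version is only an illustration of the idea, not the code I actually used; the function name is made up, and it uses ID 2 for the night shift, as in the example table.

```python
from datetime import datetime, timedelta

def shift_key(ts):
    """Return (shift_day, shift_id) for a timestamp.
    Shifts: 0 = 6am to 2pm, 1 = 2pm to 10pm, 2 = 10pm to 6am.
    A night-shift entry after midnight belongs to the previous day."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if ts.hour < 6:
        return day - timedelta(days=1), 2  # tail of last night's shift
    if ts.hour < 14:
        return day, 0
    if ts.hour < 22:
        return day, 1
    return day, 2  # start of tonight's shift

keys = [shift_key(datetime(2014, 9, 1, 6, 1)),
        shift_key(datetime(2014, 9, 1, 14, 1)),
        shift_key(datetime(2014, 9, 2, 2, 0))]
```

All three sample entries get the shift day 2014-09-01, with shift IDs 0, 1 and 2, so grouping by the returned pair gives one group per shift.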
If you add one hour to registrationtime, you will be able to group by the date part:
GROUP BY
CAST(DATEADD(HOUR, 1, registrationtime) AS date)
If the starting hour must be reflected accurately in the output (as 9, 23, 10, 23 rather than as 10, 0, 11, 0), you could obtain it as MIN(registrationtime) in the SELECT clause:
SELECT
count = COUNT(*),
day = DATEPART(DAY, MIN(registrationtime)),
hour = DATEPART(HOUR, MIN(registrationtime))
Finally, in case you are not aware, you can reference columns by their aliases in ORDER BY:
ORDER BY
day,
hour
just so that you do not have to repeat the expressions.
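To see why shifting by one hour is enough, here is a small Python sketch of the same grouping. The function name and the sample timestamps are invented for the illustration; the only idea taken from above is grouping by the date of registrationtime plus one hour.

```python
from collections import Counter
from datetime import date, datetime, timedelta

def count_per_batch(timestamps):
    """Count entries per batch, keyed by the date of (timestamp + 1 hour),
    so that 23:xx entries land on the same date as the following 00:xx ones."""
    return Counter((ts + timedelta(hours=1)).date() for ts in timestamps)

# Hypothetical log: two batches, each starting at 23:00 and running past midnight.
log = [
    datetime(2014, 9, 9, 23, 10), datetime(2014, 9, 9, 23, 50),
    datetime(2014, 9, 10, 0, 30),
    datetime(2014, 9, 10, 23, 5), datetime(2014, 9, 11, 0, 15),
]
batches = count_per_batch(log)
```

Here `batches` has two keys, 2014-09-10 with 3 entries and 2014-09-11 with 2 entries. This only holds while a batch never runs past 23:00 of the following day, which matches the one-to-two-hour runs described in the question.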
The query below will give you what you are expecting.
;WITH CTE AS
(
SELECT COUNT(*) Count, DATEPART(DAY,registrationtime) Day,DATEPART(HOUR,registrationtime) Hour,
RANK() over (partition by DATEPART(HOUR,registrationtime) order by DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)) Batch_ID
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
)
SELECT SUM(COUNT) Count,Batch_ID
FROM CTE
GROUP BY Batch_ID
ORDER BY Batch_ID
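You can write CASE expressions as below (the ELSE branches keep the original day and hour for all other rows, which would otherwise become NULL):
CASE WHEN DATEPART(HOUR,registrationtime) = 23
THEN DATEPART(DAY,registrationtime)+1
ELSE DATEPART(DAY,registrationtime)
END,
CASE WHEN DATEPART(HOUR,registrationtime) = 23
THEN 0
ELSE DATEPART(HOUR,registrationtime)
END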

Calculate wrong amount in query

I have a table with some records and want to repeat its content with some logic. Each record has a start date and a termination date, and rows are generated from the start date to the termination date. That part works fine, but the problem is calculating the amount.
The amount calculation formula is:
basesalary / 12 * ( SUTARate / 100 ) * ( x.num+1)
If this amount is less than SUTAMaximumAmount, the amount is used, else 0. One more thing: if some amount remains when the year is complete, the calculation restarts from the next year. x.num comes from a temporary table that holds 90 numbers, from 1 to 90.
Table
BaseSalary| S_Date | T_Date | SUTARate| SUTAMaximumAmount |A_S_Percent
48000 | 7-1-2013 | 3-15-2015 | 1.1 | 300 | 5
My result is
Date amount
2013-07-01 00:00:00.000 44
2013-08-01 00:00:00.000 44
2013-09-01 00:00:00.000 44
2013-10-01 00:00:00.000 44
2013-11-01 00:00:00.000 44
2013-12-01 00:00:00.000 44
2014-01-01 00:00:00.000 36
2014-02-01 00:00:00.000 -8
2014-03-01 00:00:00.000 -52
2014-04-01 00:00:00.000 -96
2014-05-01 00:00:00.000 -140
2014-06-01 00:00:00.000 -184
2014-07-01 00:00:00.000 -228
2014-08-01 00:00:00.000 -272
2014-09-01 00:00:00.000 -316
2014-10-01 00:00:00.000 -360
2014-11-01 00:00:00.000 -404
2014-12-01 00:00:00.000 -448
2015-01-01 00:00:00.000 -492
2015-02-01 00:00:00.000 -536
2015-03-01 00:00:00.000 -580
and I want a result like this:
Date | Amount
7-1-2013 44
8-1-2013 44
9-1-2013 44
10-1-2013 44
11-1-2013 44
12-1-2013 44
1-1-2014 44
2-1-2014 44
3-1-2014 44
4-1-2014 44
5-1-2014 44
6-1-2014 44
7-1-2014 36
1-1-2015 44
2-1-2015 44
3-1-2015 44
Query
SELECT dateadd(M, (x.num),d.StartDate) AS TheDate,
Round( case when ((convert(float,d.SUTARate)/100* convert(integer,d.BaseSalary) / 12)*(x.num+1)) <=CONVERT(money,d.SUTAMaximumAmount)
then (convert(float,d.SUTARate)/100* convert(integer,d.BaseSalary) / 12)
else (CONVERT(money,d.SUTAMaximumAmount)-((convert(float,d.SUTARate)/100* (convert(integer,d.BaseSalary) / 12)*x.num)))*Power((1+convert(float,d.AnnualSalaryIncreasePercent)/100),Convert(int,x.num/12)) end, 2) AS Amount
FROM #Table AS x, myTbl AS d
WHERE (x.num >= 0) AND (x.num <= (DateDiff(M, d.StartDate, d.TerminationDate)) )
temporary table
create TABLE #Table (
num int NOT NULL,
);
;WITH Nbrs ( n ) AS (
SELECT 0 UNION ALL
SELECT 1 + n FROM Nbrs WHERE n < 99 )
INSERT #Table(num)
SELECT n FROM Nbrs
OPTION ( MAXRECURSION 99 )
This table is used as x in the query above.
I created this SQLFiddle.
-- Numbers table is probably a good idea
WITH Nbrs ( num ) AS
(
SELECT 0 UNION ALL
SELECT 1 + num FROM Nbrs WHERE num < 99
)
-- All columns, except for 'num' come from myTbl
SELECT dateadd(M, (num),S_Date) AS TheDate,
Round(
CASE
WHEN (SUTARate / 100) * (BaseSalary / 12) <= SUTAMaximumAmount
THEN (SUTARate / 100) * (BaseSalary / 12)
ELSE 0
END
, 2) As Amount
-- This may be the number you were trying to multiply
,DatePart(Month, dateadd(M, (num),S_Date)) As PotentialMultiplier
FROM Nbrs AS x, myTbl AS d
WHERE (num >= 0)
AND (num <= (DateDiff(M, S_Date, T_Date)) )
I am not entirely sure what your goal is, but you are probably on the right track with a numbers table. Because the result you are going for does not change much over time (i.e., nearly every month has an amount of $44), it is difficult to determine the correct code for the query. So, I recommend you provide a different set of data for better result-checking.
If you fiddle with the SQL in the provided link, you can re-post with better code, and then we can better solve your issue.
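For checking results by hand, the numbers-table step and the unconditional monthly amount can be reproduced in a few lines of Python. This is only a sketch of that one step under the sample row from the question; it does not attempt the yearly cap or restart logic, which remains unclear. The function name is made up, and it assumes the start day exists in every month (true for the 1st).

```python
from datetime import date

def month_starts(start, end):
    """Yield `start` shifted by 0, 1, 2, ... months, up to the number of
    month boundaries between start and end (like DATEDIFF(M, start, end))."""
    month_count = (end.year - start.year) * 12 + (end.month - start.month)
    for num in range(month_count + 1):
        total = start.month - 1 + num  # zero-based month index from the start year
        yield date(start.year + total // 12, total % 12 + 1, start.day)

# Sample row from the question: BaseSalary 48000, SUTARate 1.1
months = list(month_starts(date(2013, 7, 1), date(2015, 3, 15)))
monthly_amount = round(48000 / 12 * (1.1 / 100), 2)
```

This gives 21 dates from 2013-07-01 to 2015-03-01 and a monthly amount of 44.0, the value that appears in both result listings in the question.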

How to convert seconds into the hh:mm:ss format (datetime style 108) in SQL Server without writing a function

select v1.*, datediff(ss,v1.dateofchange,v2.dateofchange) as acutaltime
from vActualTime v1 left join vActualTime v2
on v1.rowno=v2.rowno-1
FK_PatientId FK_Status_PatientId DateofChange rowno acutaltime
------------ ------------------- ----------------------- -------------------- -----------
3 16 2010-08-02 15:43:46.000 1 757
3 24 2010-08-02 15:56:23.000 2 96
3 26 2010-08-02 15:57:59.000 3 NULL
I am using Sql server 2005
When I write this
select v1.*, datediff(mi,v1.dateofchange,v2.dateofchange) as acutaltime,
convert(datetime,datediff(mi,v1.dateofchange,v2.dateofchange),108) as [date]
from vActualTime v1 left join vActualTime v2
on v1.rowno=v2.rowno-1
I get this
FK_PatientId FK_Status_PatientId DateofChange rowno acutaltime date
------------ ------------------- ----------------------- -------------------- ----------- -----------------------
3 16 2010-08-02 15:43:46.000 1 13 1900-01-14 00:00:00.000
3 24 2010-08-02 15:56:23.000 2 1 1900-01-02 00:00:00.000
3 26 2010-08-02 15:57:59.000 3 NULL NULL
This should have been given as 00-00-000 00:13:00:0000
From what I understand you need to take your calculated minutes (the datediff you're doing) and display that in a time format 108.
This should convert the minutes to a datetime format of 108, i.e. hh:mm:ss
select convert(varchar
,dateadd(minute
, datediff(mi,v1.dateofchange,v2.dateofchange), '00:00:00')
, 108
)
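For comparison, the target formatting itself is simple arithmetic. The Python sketch below is not SQL Server code; it only shows the hh:mm:ss breakdown for the second counts in the question's first result set, and the function name is made up.

```python
def seconds_to_hhmmss(total_seconds):
    """Format a non-negative number of whole seconds as hh:mm:ss."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

formatted = [seconds_to_hhmmss(s) for s in (757, 96)]
```

`formatted` is `["00:12:37", "00:01:36"]`. Unlike a datetime rendered with style 108, this sketch does not wrap at 24 hours, so 90000 seconds comes out as `25:00:00`.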
Short answer: You can't.
Long answer:
First, the CONVERT function interprets the calculated number of minutes as a number of days. You'd need to divide by (24.0 * 60) to turn it into a fraction of a day and actually add minutes (the decimal literal avoids integer division).
DATETIME only covers the years 1753 to 9999 (MSDN). Note the statement
Microsoft® SQL Server™ rejects all values it cannot recognize as dates between 1753 and 9999.
Even if you use DATETIME2 (requiring SQL Server 2008?), your date interval only starts with the year 1. (MSDN)
SQL Server 2008 adds a new data type called TIME, which might solve your problem.
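The first point of the long answer can be illustrated with ordinary date arithmetic. The Python sketch below is an analogy, not SQL Server code: it takes 1900-01-01 as the zero date, which is consistent with the output in the question, and adds 13 once as days and once as minutes.

```python
from datetime import datetime, timedelta

BASE = datetime(1900, 1, 1)  # zero date, consistent with the question's output

# Interpreting the integer 13 as days, as the question's CONVERT call does
as_days = BASE + timedelta(days=13)

# Interpreting 13 as minutes, either directly or as a fraction of a day
as_minutes = BASE + timedelta(minutes=13)
as_day_fraction = BASE + timedelta(days=13 / (24.0 * 60))
```

`as_days` is 1900-01-14 00:00:00, the value shown in the question, while `as_minutes` and `as_day_fraction` both fall at 00:13:00 on 1900-01-01.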