Return rows depends on timestamp between them - sql

The problem is that I want to get every n'th record from table but based on datetime.
I have table where I add record every 30 minutes with current state for each of my Sub objects, something like below:
Id
SubId
Color
Timestamp
1
7EB43D1D-7274-41C4-35DA-08D727A424E6
orange
2022-06-27 08:00:17.9843893
2
A8FDBB08-3747-4B93-BC66-08D7382060CE
purple
2022-06-27 08:00:17.9843893
3
7EB43D1D-7274-41C4-35DA-08D727A424E6
red
2022-06-27 08:30:15.7043893
4
A8FDBB08-3747-4B93-BC66-08D7382060CE
blue
2022-06-27 08:30:15.7043893
5
7EB43D1D-7274-41C4-35DA-08D727A424E6
yellow
2022-06-27 09:00:18.2841893
6
A8FDBB08-3747-4B93-BC66-08D7382060CE
orange
2022-06-27 09:00:18.2841893
And now I need to get points for one Sub object in certain period. But I dont want to get all entires cause I can end with too many points, I just want to get sometimes 1 per hour or 1 per day (it may change)
I already tried with ROW_NUMBER as I know that I'm adding point every 30 minutes but cause I need add where clausure for SubId then I might end with incorrect result (cause I'm adding or removing those Subobject in meanwhile)
SELECT * FROM (
SELECT [Id]
,[SubId]
,[Color]
,[Timestamp]
, ROW_NUMBER() OVER (ORDER BY OccupancyHistoryId) as rownum
FROM [dbo].[Table]) AS t
WHERE t.SubId = '7EB43D1D-7274-41C4-35DA-08D727A424E6' AND t.rownum % 2 = 0
Am I miss something obviouse? Or maybe my approach is wrong?
Expected result: For e.g records from 2022-06-27 to 2022-06-28 but only 1 per each 2 hours.
Id
SubId
Color
Timestamp
1
7EB43D1D-7274-41C4-35DA-08D727A424E6
orange
2022-06-27 08:00:17.9843893
5
7EB43D1D-7274-41C4-35DA-08D727A424E6
yellow
2022-06-27 10:00:18.2841893
10
7EB43D1D-7274-41C4-35DA-08D727A424E6
orange
2022-06-27 12:00:11.2821893

Thanks to #ourmandave's comments, I was able to resolve the problem. I didn't notice that I can use DATEDIFF with %.
So, to get entries only one per two hours, I simply write the query like that. So, obviously:
SELECT *
FROM [dbo].[Table] WHERE DateDiff(Minute, 0, TimestampUtc) % 120 = 0

Related

Generating columns for daily stats in SQL

I have a table that currently looks like this (simplified to illustate my issue):
Thing| Date
1 2022-12-12
2 2022-11-05
3 2022-11-18
4 2022-12-01
1 2022-11-02
2 2022-11-21
5 2022-12-03
5 2022-12-08
2 2022-11-18
1 2022-11-20
I would like to generate the following:
Thing| 2022-11 | 2022-12
1 2 1
2 3 0
3 1 0
4 0 1
5 0 2
I'm new to SQL and can't quite figure this out - would I use some sort of FOR loop equivalent in my SELECT clause? I'm happy to figure out the exact syntax myself, I just need someone to point me in the right direction.
Thank you!
You may use conditional aggregation as the following:
Select Thing,
Count(Case When Date Between '2022-11-01' And '2022-11-30' Then 1 End) As '2022-11',
Count(Case When Date Between '2022-12-01' And '2022-12-31' Then 1 End) As '2022-12'
From table_name
Group By Thing
Order By Thing
See a demo.
The count function counts only the not null values, so for each row not matching the condition inside the count function a null value is returned, hence not counted.

increase rank based on particular value in column

I would appreciate some help for below issue. I have below table
id
items
1
Product
2
Tea
3
Coffee
4
Sugar
5
Product
6
Rice
7
Wheat
8
Product
9
Beans
10
Oil
I want output like below. Basically I want to increase the rank when item is 'Product'. May I know how can I do that? For data privacy and compliance purposes I have modified the data and column names
id
items
ranks
1
Product
1
2
Tea
1
3
Coffee
1
4
Sugar
1
5
Product
2
6
Rice
2
7
Wheat
2
8
Product
3
9
Beans
3
10
Oil
3
I have tried Lag and lead functions but unable to get expected output
Here is solution using a derived value of 1 or 0 to denote data boundaries SUM'ed up with the ROWS UNBOUNDED PRECEDING option, which is key here.
SELECT
id,
items,
SUM(CASE WHEN items='Product' THEN 1 ELSE 0 END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) as ranks
FROM

Select maximum value where another column is used for for the Grouping

I'm trying to join several tables, where one of the tables is acting as a
key-value store, and then after the joins find the maximum value in a
column less than another column. As a simplified example, I have the following three tables:
Documents:
DocumentID
Filename
LatestRevision
1
D1001.SLDDRW
18
2
P5002.SLDPRT
10
Variables:
VariableID
VariableName
1
DateReleased
2
Change
3
Description
VariableValues:
DocumentID
VariableID
Revision
Value
1
2
1
Created
1
3
1
Drawing
1
2
3
Changed Dimension
1
1
4
2021-02-01
1
2
11
Corrected typos
1
1
16
2021-02-25
2
3
1
Generic part
2
3
5
Screw
2
2
4
2021-02-24
I can use the LEFT JOIN/IS NULL thing to get the latest version of
variables relatively easily (see http://sqlfiddle.com/#!7/5982d/3/0).
What I want is the latest version of variables that are less than or equal
to a revision which has a DateReleased, for example:
DocumentID
Filename
Variable
Value
VariableRev
DateReleased
ReleasedRev
1
D1001.SLDDRW
Change
Changed Dimension
3
2021-02-01
4
1
D1001.SLDDRW
Description
Drawing
1
2021-02-01
4
1
D1001.SLDDRW
Description
Drawing
1
2021-02-25
16
1
D1001.SLDDRW
Change
Corrected Typos
11
2021-02-25
16
2
P5002.SLDPRT
Description
Generic Part
1
2021-02-24
4
How do I do this?
I figured this out. Add another JOIN at the start to add in another version of the VariableValues table selecting only the DateReleased variables, then make sure that all the VariableValues Revisions selected are less than this date released. I think the LEFT JOIN has to be added after this table.
The example at http://sqlfiddle.com/#!9/bd6068/3/0 shows this better.

Creating min and max values and comparing them to timestamp values sql

I have a PostgreSQL database and I have a table that I am looking to query to determine which presses have been updated between the first cycle created_timestamp and the most recent cycle created_timestamp. Here is an example of the table, which is called event_log_summary.
press_id cycle_number created_timestamp
1 1 2020-02-07 16:07:52
1 2 2020-02-07 16:07:53
1 3 2020-02-07 16:07:54
1 4 2020-04-01 13:23:10
2 1 2020-01-13 8:33:23
2 2 2020-01-13 8:33:24
2 3 2020-01-13 8:33:25
3 1 2020-02-21 18:45:44
3 2 2020-02-21 18:45:45
3 3 2020-02-26 14:22:12
This is the query that I used to get me a three column output of press_id, mincycle, max_cycle, but then I want to compare the maxcycle created_timestamp to the mincycle created_timestamp and see if there is at least x amount of time between the two, say at least 1 day, I am unsure about how to implement that.
SELECT
press_id,
MIN(cycle_number) AS minCycle,
MAX(cycle_number) AS maxCycle
FROM
event_log_detail
GROUP BY
press_id
I have tried different things like using WHERE (MAX(cycle_number) - MIN(cycle_number > 1), but I am pretty new to SQL and don't quite fully know how to implement this. The output I am looking for, would have a difference of at least one day would be the following:
press_id
1
3
Presses 1 and 3 have their maximum cycle created_timestamp at least 1-day difference than their minimum cycle created_timestamp. I am just looking for the press_ids whose first cycle and the last cycle have a difference of at least 1 day, I don't need any other information on the output, just one column with the press_ids. Any help would be appreciated. Thanks.
You can use a HAVING clause:
select press_id,
max(created_timestamp) - min(created_timestamp) as diff
from event_log_detail
group by press_id
having max(created_timestamp) > min(created_timestamp) + interval '1 day';

how to get a moving average in sql

If I have data from week 1 to week 52 data and I want 4 week Moving Average with 1 week how can I make a SQL query for this? For example, for week 5 I want week1-week4 average, week6 I want week5-week8 average and so on.
I have the columns week and target_value in table A.
Sample data is like this:
Week target_value
1 20
2 10
3 10
4 20
5 60
6 20
So the output I want will start from week 5 as only week 1-week4 is available not before that.
Output data will look like:
Week Output
5 15 (20+10+10+20)/4=15 Moving Average week1-week4
6 25 (10+10+20+60)/4=25 Moving Average week2-week5
The data is in hive but I can move it to oracle if it is simpler to do this there.
SELECT
Week,
(SELECT ISNULL(AVG(B.target_value), A.target_value)
FROM tblA B
WHERE (B.Week < A.Week)
AND B.Week >= (A.Week - 4)
) AS Moving_Average
FROM tblA A
The ISNULL keeps you from getting a null for your first week since there is no week 0. If you want it to be null, then just leave the ISNULL function out.
If you want it to start at week 5 only, then add the following line to the end of the SQL that I wrote:
WHERE A.Week > 4
Results:
Week Moving_Average
1 20
2 20
3 15
4 13
5 15
6 25