I have a table with duplicate values as shown below.
I would like to find the latest start time among the events. Expected output is
I used the below query but it seems to get the latest start time in the entire table.
SELECT ID,
EVENT,
START_TIME,
LAST_VALUE(START_TIME) OVER (ORDER BY ID,EVENT RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS latest_start_time
FROM
(Select * from EVENTS)
order by ID,EVENT;
I know I am missing something, probably a group by. Can you please help me out. I am using ORACLE.
You just need the MAX analytic function and partition on id and event.
SELECT ID,
EVENT,
START_TIME,
MAX(START_TIME) OVER (PARTITION BY id, event) AS latest_start_time
FROM EVENTS
order by ID,EVENT;
Try
SELECT DISTINCT
ID,
EVENT,
START_TIME,
MAX(START_TIME) OVER (PARTITION BY ID, EVENT) AS latest_start_time
FROM EVENTS
order by ID,EVENT;
Related
I have data like below with two columns, I need an output with new column shown below
Input -
Name,Date,Value
Test1,20200901,55
Test1,20200901,100
Test1,20200901,150
Test1,20200805,25
Test1,20200805,30
Row number is based on data from column - Name and Date
Output,
Name,Date,Value, row_number
Test1,20200901,55,1
Test1,20200901,100,1
Test1,20200901,150,1
Test1,20200805,25,2
Test1,20200805,30,2
The query using Partition didn't help
select *, row_number() over (partition by Date) as Rank from Table
Can someone please help here
Thank you very much
You want dense_rank():
select *,
dense_rank() over (order by Date) as Rank
from Table;
There is something suspicious when you are using partition by without order by (even if the underlying database supports that).
Use dense_rank() - and an order by clause:
select t.*, dense_rank() over (order by Date) as rn from mytable t
This gives you a sequential number that starts at 1 on the earliest date value increments without gaps everytime date changes.
I need help with a query that will return a single record per partition in the below dataset. I used the DENSE_RANK to get the order and first/last position within each partition, but the problem is that I need to get a single record for each EMPLOYEE ITEM_ID combination which contains:
MIN(START) which is date type with time
SUM(DURATION) which is a number type signifying seconds of activity
MIN ranked value from INIT_STATUS
MAX ranked value from FIN_STATUS
Here is the initial data table, the same data table ordered with rank, and the desired result at the end (see image below):
Also, here is the code used to get the ordered table with rank values:
SELECT T.*,
DENSE_RANK() OVER (PARTITION BY T.EMPLOYEE, T.ITEM_ID ORDER BY T.START) AS D_RANK
FROM TEST_DATA T
ORDER BY T.EMPLOYEE, T.ITEM_ID, T.START;
Use first/last option to find statuses. The rest is classic aggregation:
select employee, min(start_), sum(duration),
max(init_status) keep (dense_rank first order by start_),
max(fin_status) keep (dense_rank last order by start_)
from test_data t
group by employee, item_id
order by employee, item_id;
start is a reserved word, so I used start_ for my test.
I have a survey form of certain questions for a certain facility.
the facility can be monitored(data entry) more than once in a month.
now i need the latest data(values) against the questions
but if there is no latest data against any question i will traverse through prior records(previous dates) of the same month.
i can get the latest record but i don't know how to get previous record of the same month id there is no latest data.
i am using PostgreSQL 10.
Table Structure is
Desired output is
You can try to use ROW_NUMBER window function to make it.
SELECT to_char(date, 'MON') month,
facility,
idquestion,
value
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY facility,idquestion ORDER BY DATE DESC) rn
FROM T
) t1
where rn = 1
demo:db<>fiddle
SELECT DISTINCT
to_char(qdate, 'MON'),
facility,
idquestion,
first_value(value) OVER (PARTITION BY facility, idquestion ORDER BY qdate DESC) as value
FROM questions
ORDER BY facility, idquestion
Using window functions:
first_value(value) OVER ... gives you the first value of a window frame. The frame is a group of facility and idquestion. Within this group the rows are ordered by date DESC. So the very last value is first no matter which date it is
DISTINCT filtered the tied values (e.g. there are two values for facility == 1 and idquestion == 7)
Please notice:
"date" is a reserved word in Postgres. I strongly recommend to rename your column to avoid certain trouble. Furthermore in Postgres lower case is used and is recommended.
I have been looking around for 2 days and have not been able to figure out this one. Using dataset below and SQL server 2016 I would like to get the row number of each row by 'id' and 'cat' ordered by 'date' in asc order but would like to see a reset of the sequence if a different value in the 'cat' column for the same 'id' is found(see rows in green). Any help would be appreciated.
This is a gaps and islands problem. The simplest solution in this case is probably a difference of row numbers:
select t.*,
row_number() over (partition by id, cat, seqnum - seqnum_c order by date) as row_num
from (select t.*,
row_number() over (partition by id order by date) as seqnum,
row_number() over (partition by id, cat order by date) as seqnum_c
from t
) t;
Why this works is a bit tricky to explain. But, if you look at the sequence numbers in the subquery, you'll see that the difference defines the groups you want to define.
Note: This assumes that the date column provides a stable sort. You seem to have duplicates in the column. If there really are duplicates and you have no secondary column for sorting, then try rank() or dense_rank() instead of row_number().
I have the following table structure, with daily-hourly data:
time_of_ocurrence(timestamp); particles(numeric)
"2012-11-01 00:30:00";191.3
"2012-11-01 01:30:00";46
...
"2013-01-01 02:30:00";319.6
How do i select the DAILY max and THE HOUR in which this max occur?
I've tried
SELECT date_trunc('hour', time_of_ocurrence) as hora,
MAX(particles)
from my_table WHERE time_of_ocurrence > '2013-09-01'
GROUP BY hora ORDER BY hora
But it doesn't work:
"2013-09-01 00:00:00";34.35
"2013-09-01 01:00:00";33.13
"2013-09-01 02:00:00";33.09
"2013-09-01 03:00:00";28.08
My result would be in this format instead (one max per day, showing the hour)
"2013-09-01 05:00:00";100.35
"2013-09-02 03:30:00";80.13
How can i do that? Thanks!
This type of question has come up on StackOverflow frequently, and these questions are categorized with the greatest-n-per-group tag, if you want to see other solutions.
edit: I changed the following code to group by day instead of by hour.
Here's one solution:
SELECT t.*
FROM (
SELECT date_trunc('day', time_of_ocurrence) as hora, MAX(particles) AS particles
FROM my_table
GROUP BY hora
) AS _max
INNER JOIN my_table AS t
ON _max.hora = date_trunc('day', t.time_of_ocurrence)
AND _max.particles = t.particles
WHERE time_of_ocurrence > '2013-09-01'
ORDER BY time_of_ocurrence;
This might also show more than one result per day, if more than one row has the max value.
Another solution using window functions that does not show such duplicates:
SELECT * FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY date_trunc('day', time_of_ocurrence)
ORDER BY particles DESC) AS _rn
FROM my_table
) AS _max
WHERE _rn = 1
ORDER BY time_of_ocurrence;
If multiple rows have the same max, one row with nevertheless be numbered row 1. If you need specific control over which row is numbered 1, you need to use ORDER BY in the partitioning clause using a unique column to break such ties.
Use window functions:
select distinct
date_trunc('day',time_of_ocurrence) as day,
max(particles) over (partition by date_trunc('day',time_of_ocurrence)) as particles_max_of_day,
first_value(date_trunc('hour',time_of_ocurrence)) over (partition by date_trunc('day',time_of_ocurrence) order by particles desc)
from my_table
order by 1
One edge case here is if the same MAX number of particles show up in the same day, but in different hours. This version would randomly pick one of them. If you prefer one over the other (always the earlier one for example) you can add that to the order by clause:
first_value(date_trunc('hour',time_of_ocurrence)) over (partition by date_trunc('day',time_of_ocurrence) order by particles desc, time_of_ocurrence)