I have data like the sample below, and I need an output with a new column, as shown.
Input -
Name,Date,Value
Test1,20200901,55
Test1,20200901,100
Test1,20200901,150
Test1,20200805,25
Test1,20200805,30
The row number is based on the Name and Date columns.
Output -
Name,Date,Value,row_number
Test1,20200901,55,1
Test1,20200901,100,1
Test1,20200901,150,1
Test1,20200805,25,2
Test1,20200805,30,2
The query using Partition didn't help
select *, row_number() over (partition by Date) as Rank from Table
Can someone please help here
Thank you very much
You want dense_rank():
select *,
dense_rank() over (order by Date) as Rank
from Table;
There is something suspicious when you are using partition by without order by (even if the underlying database supports that).
Use dense_rank() - and an order by clause:
select t.*, dense_rank() over (order by Date) as rn from mytable t
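This gives you a sequential number that starts at 1 on the earliest date value and increments without gaps every time the date changes. To start at 1 on the latest date instead, as in the sample output, use order by Date desc.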
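To check the dense_rank() approach against the sample data in the question, here is a minimal sketch that runs the query in an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) and the table name mytable are assumptions made for illustration; the question does not say which database is used. The sketch adds partition by Name and orders by Date descending so that the numbers match the expected output.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's (unnamed) database.
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (Name text, Date text, Value integer)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [
        ("Test1", "20200901", 55),
        ("Test1", "20200901", 100),
        ("Test1", "20200901", 150),
        ("Test1", "20200805", 25),
        ("Test1", "20200805", 30),
    ],
)

# dense_rank() gives every row with the same Date the same number,
# and the number grows by exactly 1 when the Date changes.
query = """
select t.*,
       dense_rank() over (partition by Name order by Date desc) as rn
from mytable t
order by Date desc, Value
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

All three 20200901 rows get 1 and both 20200805 rows get 2. Dropping desc would reverse the numbering.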
I need to count each occurrence of an ID in a column as new column.
It should look like this:
I tried it with ROW_NUMBER() OVER (ORDER BY [Id]) as rownum
but it did not work.
Can you guys please help me out?
In order to reset the numbering for each change of ID, you need a PARTITION BY clause.
There appears to be no specific ordering within the partitions, so you can use ORDER BY (SELECT 1). If you have another column you want to order the numbering by, use that instead.
ROW_NUMBER() OVER (PARTITION BY [Id] ORDER BY (SELECT 1)) as rownum
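As a rough illustration of how PARTITION BY restarts the numbering for each Id, here is a small sketch run against in-memory SQLite from Python. The table name items and the ordering column seq are hypothetical; the question shows neither. The sketch orders by seq rather than (SELECT 1) so that the result is deterministic.

```python
import sqlite3

# Assumption: a hypothetical table "items" with an ordering column "seq".
conn = sqlite3.connect(":memory:")
conn.execute("create table items (Id integer, seq integer)")
conn.executemany(
    "insert into items values (?, ?)",
    [(10, 1), (10, 2), (20, 3), (10, 4), (20, 5)],
)

# The counter restarts at 1 for each distinct Id and counts its occurrences.
query = """
select Id, seq,
       row_number() over (partition by Id order by seq) as rownum
from items
order by seq
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Here Id 10 is numbered 1, 2, 3 across its three rows, and Id 20 is numbered 1, 2, regardless of how the rows interleave.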
In SQL Server, I am attempting to pull the second latest NOTE_ENTRY_DT_TIME (items highlighted in screenshot). With the query written below it still pulls the latest date (I believe it's because of the grouping but the grouping is required to join later). What is the best method to achieve this?
SELECT
hop.ACCOUNT_ID,
MAX(hop.NOTE_ENTRY_DT_TIME) AS latest_noteid
FROM
NOTES hop
WHERE
hop.GEN_YN IS NULL
AND hop.NOTE_ENTRY_DT_TIME < (SELECT MAX(hope.NOTE_ENTRY_DT_TIME)
FROM NOTES hope
WHERE hop.GEN_YN IS NULL)
GROUP BY
hop.ACCOUNT_ID
Data sample in the table:
One of the "easier" ways to get the Nth row in a group is to use a CTE and ROW_NUMBER:
WITH CTE AS(
SELECT Account_ID,
Note_Entry_Dt_Time,
ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Note_Entry_Dt_Time DESC) AS RN
FROM dbo.YourTable)
SELECT Account_ID,
Note_Entry_Dt_Time
FROM CTE
WHERE RN = 2;
Of course, if an ACCOUNT_ID only has 1 row, then it will not be returned in the result set.
The OP's statement "The row will not always be 2." from the comments conflicts with their statement "I am attempting to pull the second latest NOTE_ENTRY_DT_TIME" in the question. At a best guess, this means that the OP has rows with the same date that could be the "latest" date. If so, then they would simply need to replace ROW_NUMBER with DENSE_RANK. Their sample data, however, doesn't suggest this is the case.
You can use window functions:
select *
from (
select
n.*,
row_number() over(partition by account_id order by note_entry_dt_time desc) rn
from notes n
) t
where rn = 2
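The difference between ROW_NUMBER and DENSE_RANK for "second latest" only shows up when the latest timestamp is duplicated. The sketch below runs both variants on made-up data in in-memory SQLite from Python; the sample rows and the helper second_latest are hypothetical, and the thread itself targets SQL Server.

```python
import sqlite3

# Assumption: made-up rows. Account B has its latest timestamp twice,
# and account C has only one note.
conn = sqlite3.connect(":memory:")
conn.execute("create table notes (account_id text, note_entry_dt_time text)")
conn.executemany(
    "insert into notes values (?, ?)",
    [
        ("A", "2020-01-01"), ("A", "2020-01-02"), ("A", "2020-01-03"),
        ("B", "2020-02-01"), ("B", "2020-02-05"), ("B", "2020-02-05"),
        ("C", "2020-03-01"),
    ],
)

def second_latest(rank_fn):
    # rank_fn is a fixed function name ("row_number" or "dense_rank"),
    # never user input, so formatting it into the SQL text is safe here.
    query = f"""
        select distinct account_id, note_entry_dt_time
        from (
            select n.*,
                   {rank_fn}() over (partition by account_id
                                     order by note_entry_dt_time desc) as rn
            from notes n
        ) t
        where rn = 2
        order by account_id
    """
    return conn.execute(query).fetchall()

by_row_number = second_latest("row_number")
by_dense_rank = second_latest("dense_rank")
print(by_row_number)
print(by_dense_rank)
```

With row_number, account B returns the duplicated latest timestamp, because the tie occupies positions 1 and 2. With dense_rank, B returns the second distinct timestamp. Account C has a single row and is absent from both results.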
I have a table with a datetime field ("time") and an int field ("index")
Please see the query and the picture below. I want ROW_NUMBER to restart from 1 whenever the index changes, even if that index value already appeared in earlier rows. The red text indicates the output that I want to get from the query. How can I modify the query to give me the expected results?
The query:
select rv.[time], rv.[index], ROW_NUMBER() OVER(PARTITION BY rv.[index] ORDER BY rv.[time], rv.[index] ASC) AS Row#
from
tbl
This is a gaps-and-islands problem. You need to identify groups of adjacent rows. In this case, I think the simplest method is the difference of row numbers:
select rv.*,
row_number() over (partition by index, (seqnum - seqnum_2) order by time) as row_num
from (select t.*,
row_number() over (order by time) as seqnum,
row_number() over (partition by index order by time) as seqnum_2
from tbl t
) rv;
Why this works is a little tricky to explain. If you look at the results of the subquery, you will see how the difference between the two row number values identifies adjacent values that are the same.
Also, you should not use names like time and index for columns, because these are keywords in SQL. I have not escaped the names in the above query. I encourage you to give your columns and tables names that do not need to be escaped.
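To make the difference-of-row-numbers idea concrete, the sketch below exposes the two sequence numbers and their difference next to the final row number. It runs in in-memory SQLite from Python; the columns are renamed to ts and idx (hypothetical names, chosen so that nothing needs escaping) and the data is made up.

```python
import sqlite3

# Assumption: made-up data; ts and idx stand in for the time and index columns.
conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (ts integer, idx integer)")
conn.executemany(
    "insert into tbl values (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 1), (6, 1), (7, 1)],
)

# seqnum counts all rows; seqnum_2 counts rows within each idx value.
# Inside one run of adjacent equal idx values both advance together,
# so their difference stays constant and labels the run.
query = """
select rv.ts, rv.idx, rv.seqnum, rv.seqnum_2,
       rv.seqnum - rv.seqnum_2 as grp,
       row_number() over (partition by rv.idx, rv.seqnum - rv.seqnum_2
                          order by rv.ts) as row_num
from (select t.*,
             row_number() over (order by ts) as seqnum,
             row_number() over (partition by idx order by ts) as seqnum_2
      from tbl t
     ) rv
order by rv.ts
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first run of idx 1 has difference 0, while the later run of idx 1 has difference 2, so the pair (idx, difference) separates the two runs and the numbering restarts.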
I would like help with one of my queries.
Here's the requirement:-
I have to report the records where the difference between the current review date and the last review date is between 365 and 455 days. However, the catch here is that my customer table has just one column for the annual review date. So I have to check the historical table to find the current annual review date, which in the example below is 30/04/2019, and the last review date, which is 30/04/2018.
How do I get just 1 line item for each record?
Below is what my table looks like. The RNK column is a calculated column that determines the rank for each record; the rest of the columns are from the table. Please help! I use Oracle 12c.
You may use row_number() analytical function for your rnk column as in the following select statement :
select row_number() over (partition by annual_review_date order by update_date) as rnk,
t.*
from tab t;
If I understand correctly, you can use dense_rank():
select t.id, max(annual_review_dt) as latest_ard,
min(annual_review_dt) as prev_ard
from (select t.*,
dense_rank() over (partition by id order by annual_review_dt desc) as seqnum
from t
) t
where seqnum in (1, 2)
group by t.id;
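The following is a sketch of the dense_rank() approach for collapsing the history to one line per id. It assumes the ranking is by annual_review_dt in descending order and that the outer query groups by id, and it runs on made-up rows in in-memory SQLite from Python rather than Oracle 12c.

```python
import sqlite3

# Assumption: made-up history rows. id 1 has a repeated review date,
# and id 2 has only a single review date.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, annual_review_dt text)")
conn.executemany(
    "insert into t values (?, ?)",
    [
        (1, "2017-04-30"), (1, "2018-04-30"), (1, "2018-04-30"),
        (1, "2019-04-30"), (2, "2019-01-15"),
    ],
)

# Keep the two most recent distinct dates per id, then fold them
# into one row: max() is the latest date, min() the one before it.
query = """
select t.id, max(annual_review_dt) as latest_ard,
       min(annual_review_dt) as prev_ard
from (select t.*,
             dense_rank() over (partition by id
                                order by annual_review_dt desc) as seqnum
      from t
     ) t
where seqnum in (1, 2)
group by t.id
order by t.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

An id with only one review date comes back with the same value in both columns, so a filter on the day difference would need to account for that case.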
I have been looking around for 2 days and have not been able to figure this one out. Using the dataset below and SQL Server 2016, I would like to get the row number of each row by 'id' and 'cat', ordered by 'date' ascending, but with the sequence reset whenever a different value appears in the 'cat' column for the same 'id' (see rows in green). Any help would be appreciated.
This is a gaps and islands problem. The simplest solution in this case is probably a difference of row numbers:
select t.*,
row_number() over (partition by id, cat, seqnum - seqnum_c order by date) as row_num
from (select t.*,
row_number() over (partition by id order by date) as seqnum,
row_number() over (partition by id, cat order by date) as seqnum_c
from t
) t;
Why this works is a bit tricky to explain. But, if you look at the sequence numbers in the subquery, you'll see that the difference defines the groups you want to define.
Note: This assumes that the date column provides a stable sort. You seem to have duplicates in the column. If there really are duplicates and you have no secondary column for sorting, then try rank() or dense_rank() instead of row_number().
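The same difference-of-row-numbers pattern can be traced on a small example with an id and a category. The sketch below uses made-up rows with unique dates in in-memory SQLite from Python; the date column is renamed to dt (a hypothetical name), and the thread itself targets SQL Server 2016.

```python
import sqlite3

# Assumption: made-up rows with unique dates per id; dt stands in for date.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, cat text, dt text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        (1, "A", "2020-01-01"), (1, "A", "2020-01-02"),
        (1, "B", "2020-01-03"), (1, "A", "2020-01-04"),
        (2, "B", "2020-01-01"), (2, "B", "2020-01-02"),
    ],
)

# seqnum counts rows per id; seqnum_c counts rows per (id, cat).
# Their difference is constant within each run of adjacent equal cat values.
query = """
select t.id, t.cat, t.dt,
       t.seqnum - t.seqnum_c as grp,
       row_number() over (partition by t.id, t.cat, t.seqnum - t.seqnum_c
                          order by t.dt) as row_num
from (select t.*,
             row_number() over (partition by id order by dt) as seqnum,
             row_number() over (partition by id, cat order by dt) as seqnum_c
      from t
     ) t
order by t.id, t.dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For id 1, category A appears in two separate runs whose differences are 0 and 1, so the second run starts again at 1 instead of continuing at 3.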