Alternative to using ROW_NUMBER for better performance

I have a small query below that outputs a row number in the RowNumber column, partitioning by the 'LegKey' column and ordering by UpdateID desc. This way the latest updated row (highest UpdateID) per LegKey is always number 1.
SELECT *
, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber
FROM Data.Crew
Output:
UpdateID  LegKey  OriginalSourceTableID  UpdateReceived           RowNumber
7359      6641    11                     2016-08-22 16:35:27.487  1
7121      6641    11                     2016-08-15 00:00:47.220  2
8175      6642    11                     2016-08-22 16:35:27.487  1
7122      6642    11                     2016-08-15 00:00:47.220  2
8613      6643    11                     2016-08-22 16:35:27.487  1
7123      6643    11                     2016-08-15 00:00:47.220  2
The problem with this method is that performance is slow, which I assume is caused by the ORDER BY.
Is there an alternative way to produce a similar result that runs faster? I think MAX() may work, but when I tried it I didn't get the same output as before. Maybe I wrote the MAX() statement incorrectly, so if it is a good alternative, could somebody show how they would write it for this example?
Thank you

Presumably this is the query you want to optimize:
SELECT c.*
FROM (SELECT c.*,
ROW_NUMBER() OVER (PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber
FROM Data.Crew c
) c
WHERE RowNumber = 1;
Try an index on Crew(LegKey, UpdateId).
This index will also be used if you do:
SELECT c.*
FROM Data.Crew c
WHERE c.UpdateId = (SELECT MAX(c2.UpdateId)
FROM Data.Crew c2
WHERE c2.LegKey = c.LegKey
);
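A minimal sketch of the suggested index, assuming SQL Server syntax. The index name IX_Crew_LegKey_UpdateID is made up for illustration, and the DESC on UpdateID is an optional choice that matches the ORDER BY in the query:

```sql
-- Hypothetical index name; key columns follow the suggestion above.
-- LegKey first so each partition is contiguous, UpdateID second so the
-- latest row per LegKey can be located without a separate sort.
CREATE INDEX IX_Crew_LegKey_UpdateID
    ON Data.Crew (LegKey, UpdateID DESC);
```

Whether the optimizer actually uses it depends on the table size and on how many columns SELECT * has to fetch, so compare the execution plans before and after.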

You can try one of the following:
declare @Table table(UpdateID int, LegKey int, OriginalSourceTableID int, UpdateReceived datetime)
Here using the MAX date in a correlated subquery:
select *
from @Table as a
where a.UpdateReceived = (select MAX(b.UpdateReceived)
                          from @Table as b
                          where b.LegKey = a.LegKey)
Here you can use it in a CTE with GROUP BY:
;with MaxDate as (select LegKey, MAX(UpdateReceived) as MaxDate
                  from @Table
                  group by LegKey)
select *
from MaxDate as a
inner join @Table as b
    on b.LegKey = a.LegKey
   and b.UpdateReceived = a.MaxDate
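For comparison, here is a self-contained sketch of the same idea keyed on UpdateID, which is the column the original ROW_NUMBER() query ordered by. The table variable @Crew is a stand-in for Data.Crew and is loaded with the rows shown in the question. It assumes SQL Server 2008 or later (multi-row VALUES) and that UpdateID is unique:

```sql
-- @Crew is a hypothetical stand-in for Data.Crew, filled with the question's sample rows.
DECLARE @Crew TABLE (
    UpdateID              int PRIMARY KEY,
    LegKey                int,
    OriginalSourceTableID int,
    UpdateReceived        datetime
);

INSERT INTO @Crew (UpdateID, LegKey, OriginalSourceTableID, UpdateReceived)
VALUES (7359, 6641, 11, '2016-08-22T16:35:27.487'),
       (7121, 6641, 11, '2016-08-15T00:00:47.220'),
       (8175, 6642, 11, '2016-08-22T16:35:27.487'),
       (7122, 6642, 11, '2016-08-15T00:00:47.220'),
       (8613, 6643, 11, '2016-08-22T16:35:27.487'),
       (7123, 6643, 11, '2016-08-15T00:00:47.220');

-- Find the highest UpdateID per LegKey, then join back to fetch the full row.
SELECT c.*
FROM @Crew AS c
JOIN (SELECT LegKey, MAX(UpdateID) AS MaxUpdateID
      FROM @Crew
      GROUP BY LegKey) AS m
    ON m.LegKey = c.LegKey
   AND m.MaxUpdateID = c.UpdateID
ORDER BY c.LegKey;
```

With the sample data this keeps the rows with UpdateID 7359, 8175 and 8613, the same rows that had RowNumber = 1 in the question.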

Related

How to get second row value by player in SQL Server and insert it in other table

How can I get the second row value per player in SQL Server and insert it into another table?
For example, I have a table like this:
PlayerId VIpLevel StartDate
1 1 2000-01-01 00:10
1 4 2001-01-01 00:10
1 5 2001-01-11 00:10
2 1 2000-01-01 00:10
2 3 2000-01-02 00:10
2 7 2000-05-02 00:10
So I want to get, for player 1 and player 2, their second VipLevel when ordered by StartDate DESC.
So far I have found this, but it does not work for me:
UPDATE #Results
SET [PreviousVIPLevel] = (
SELECT
ROW_NUMBER() OVER(ORDER BY StartDate DESC) AS RowNum,
PlayerId,
VIPLevelId
FROM #table
) foo
WHERE RowNum =2

I assume you have a playerid in your #Results table so you can update each player's record with their 2nd highest level. Then you need to use partition by in your row_number function and join accordingly:
UPDATE A
SET PreviousVIPLevel= B.VIPLevelId
FROM #Results A
JOIN (SELECT
ROW_NUMBER() OVER(PARTITION BY PlayerId ORDER BY StartDate DESC) AS RowNum,
PlayerId,
VIPLevelId
FROM #table
) B ON A.PLayerId = B.PlayerId AND B.RowNum = 2
Your current query cannot work for multiple reasons. First, you cannot update a field selecting multiple columns. That's why I used a join instead. Second, if you were able to get yours to work, it would update all records to the same value since you are missing the partition by clause.

There is actually no update involved. It's an insert statement. Assuming the destination table is created:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY PlayerId ORDER BY StartDate DESC) AS RowNum,
PlayerId,
VIPLevelId
FROM #table
)
INSERT INTO Destination
SELECT * FROM CTE WHERE RowNum = 2
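Note that SELECT * also carries the RowNum column into the insert. A variant with an explicit column list is sketched below. The Destination column names (PlayerId, PreviousVIPLevel) are assumptions, since the destination table's layout is not shown in the question:

```sql
-- Assumes Destination has the (hypothetical) columns PlayerId and PreviousVIPLevel.
-- The leading semicolon terminates any previous statement in the batch, which WITH requires.
;WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY PlayerId ORDER BY StartDate DESC) AS RowNum,
           PlayerId,
           VIPLevelId
    FROM #table
)
INSERT INTO Destination (PlayerId, PreviousVIPLevel)
SELECT PlayerId, VIPLevelId
FROM CTE
WHERE RowNum = 2;  -- second most recent level per player
```

Players with only one row have no RowNum = 2 and are simply not inserted.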

Oracle partition using 2 columns

I have a table like the one below:
Id  RC_CLASS   RC_DATE    RC_TYPE
14  FI-321619  22-Jan-16  S
14  FI-399481  29-Jan-16  D
14  FI-321619  20-Jan-17  S
Here is what I tried:
SELECT *
FROM (SELECT rc.*,
RANK() OVER (PARTITION BY ID,RC_CLASS order by rc__date) AS LATEST_VERSION
FROM table
)
WHERE LATEST_VERSION = 1
ORDER BY rc_vendorid;
Expected output
Id  RC_CLASS   RC_DATE    RC_TYPE
14  FI-399481  29-Jan-16  D
14  FI-321619  20-Jan-17  S
I want to group by ID and RC_CLASS and return the top row of each group when sorted by RC_DATE. What I am getting is always the top one based on date; the partition does not seem to work. What is missing?

I think you are very close. Basically, you just need a descending sort to get the latest version:
SELECT rc.*
FROM (SELECT rc.*,
RANK() OVER (PARTITION BY ID, RC_CLASS ORDER BY rc_date DESC) AS LATEST_VERSION
FROM table rc
) rc
WHERE LATEST_VERSION = 1
ORDER BY rc_vendorid;
I note that you use RANK() for this. This can return duplicates, if you have two rows on the same date. If that is not desirable, you can use ROW_NUMBER() which would arbitrarily choose one (if all the other keys are the same).
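A sketch of the ROW_NUMBER() variant in Oracle syntax. The table name rc_history is a placeholder (the question only calls it "table", which is a reserved word), and using rc_type as a tie-breaker is an assumption made purely to keep the choice deterministic:

```sql
-- rc_history is a hypothetical table name; rc_type as tie-breaker is an assumption.
SELECT id, rc_class, rc_date, rc_type
FROM (SELECT rc.*,
             ROW_NUMBER() OVER (PARTITION BY id, rc_class
                                ORDER BY rc_date DESC, rc_type) AS latest_version
      FROM rc_history rc
     )
WHERE latest_version = 1;
```

Unlike RANK(), this returns exactly one row per (id, rc_class) pair even when two rows share the same rc_date.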

SQL Server query distinct

I'm trying to do a query in SQL Server 2008. This is the structure of my table
Recno ClientNo ServiceNo
---------------------------------
1234 17 27
2345 19 34
3456 20 33
4567 17 34
I'm trying to select RecNo while filtering by distinct ClientNo. Some clients, such as client no 17, have more than one entry, and I want to count each such client only once. So, looking at this table, I should see only 3 RecNo's, since there are only 3 distinct clients. Please help.
Select RecNo, Count(ClientNo)
from TblA
where Count(clientNo)<2
Something like this?
EDIT:
The value of RecNo is not relevant, I only need to have an accurate number of records. In this case, I'd like to have 3 records.

Okay, you are getting some crazy answers, probably because your desired result is not clear. If some of these are not what you need, I suggest you clarify your desired result.
If you want the answer 3, I can only assume you want a count of DISTINCT ClientNo's. If so, it is simply an aggregation (without a GROUP BY, which would return one row per client instead of a single total):
SELECT COUNT(DISTINCT ClientNo) AS ClientNoDistinctCount
FROM TblA

Ok, this will give you the count that you want:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY ClientNo ORDER BY Recno)
FROM TblA
)
SELECT COUNT(DISTINCT Recno) N
FROM CTE
WHERE RN = 1;

Try this..
;with cte1
As(SELECT Recno,clientno
,row_number() over(partition by clientno order by Recno )RNO FROM TblA)
Select Recno,clientno
From cte1 where RNO=1

Choose, for each ClientNo, only the row with the max Recno (or replace < with > to choose the min one):
Select *
from TblA t1
where not exists(select 1
from TblA t2
where t1.ClientNo = t2.ClientNo and t1.Recno < t2.Recno )
BTW, the other solution already mentioned, using row_number(), needs no CTE in this case:
SELECT TOP(1) WITH TIES *
FROM TblA
ORDER BY ROW_NUMBER() OVER(PARTITION BY ClientNo ORDER BY Recno)

Getting all fields from table filtered by MAX(Column1)

I have table with some data, for example
ID Specified TIN Value
----------------------
1 0 tin1 45
2 1 tin1 34
3 0 tin2 23
4 3 tin2 47
5 3 tin2 12
For each TIN, I need the row (with all its fields) that has the MAX(Specified) value. If several rows share that maximum (in the example, ID 4 and 5), I must take the last one (ID 5).
The final result must be:
ID Specified TIN Value
-----------------------
2 1 tin1 34
5 3 tin2 12

This will give the desired result using a window function:
;with cte as(select *, row_number() over(partition by tin order by specified desc, id desc) as rn
from tablename)
select * from cte where rn = 1

Edit: Updated query after question edit.
Here is the fiddle
http://sqlfiddle.com/#!9/20e1b/1/0
SELECT * FROM TBL WHERE ID IN (
SELECT max(id) FROM
TBL WHERE SPECIFIED IN
(SELECT MAX(SPECIFIED) FROM TBL
GROUP BY TIN)
group by specified)
This works for the sample data. If you only need the single row with the overall highest Specified (ignoring TIN), it can be simplified to:
select * from tbl where id =(
SELECT MAX(ID) FROM
tbl where specified =(SELECT MAX(SPECIFIED) FROM tbl))

One method is to use window functions, row_number():
select t.*
from (select t.*, row_number() over (partition by tin
order by specified desc, id desc
) as seqnum
from t
) t
where seqnum = 1;
However, if you have one index on (tin, specified, id) and another on id, the most efficient method is:
select t.*
from t
where t.id = (select top 1 t2.id
from t t2
where t2.tin = t.tin
order by t2.specified desc, t2.id desc
);
The reason this is better is that the (tin, specified, id) index will be used for the subquery, and the index on id will then be used for the outer query as well. This is highly efficient. An index can also be used for the window function, but the resulting execution plan probably still requires scanning the entire table.
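A minimal sketch of the composite index this relies on, in SQL Server syntax. The index name is made up for illustration, and it assumes id is already indexed (for example as the primary key), so only the composite index is created here:

```sql
-- Hypothetical index name. Key order matches the subquery:
-- equality on tin, then ORDER BY specified DESC, id DESC.
CREATE INDEX IX_t_tin_specified_id
    ON t (tin, specified DESC, id DESC);
```

With this index the TOP 1 subquery can read the first entry for each tin directly instead of sorting that tin's rows.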

sql row difference without cursor

I'm trying to get the date gap (in days) between rows.
For example, my data is ordered by saleDate and looks like the below:
ID | saleDate ID | gapInDays
10 | 1/1/2014 10 | 4 -- (5/1/2014 - 1/1/2014).Days
20 | 5/1/2014 20 | 2
30 | 7/1/2014 ====>>> 30 | 3
40 | 10/1/2014 40 | 7
50 | 17/1/2014 50 | 1 -- last row will always be 1
Doing it in code is not a big deal, but because the number of rows is huge (a few million) I'm trying to do it at the stored procedure level. I assume I could use a cursor, but I understand that is very slow.
Any solution will be highly appreciated.
Pini.

If you are using SQL Server 2012, Oracle, Postgres or DB2, then you have the LEAD() and LAG() functions. In SQL Server syntax:
select ID, saleDate, LEAD(saleDate) over (order by saleDate) DateOfNextRow
, Isnull(Datediff(dd, saleDate, LEAD(saleDate) over (order by saleDate)), 1) as gapInDays
from [Order]
For SQL SERVER 2005/2008, you can use Window Functions like ROW_NUMBER().
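To see the LEAD() approach end to end, here is a self-contained sketch that assumes SQL Server 2012 or later. The table variable @Sales is a stand-in for the real table and holds the five sample rows from the question, with dates read as day/month/year:

```sql
-- @Sales is a hypothetical stand-in for the real table, loaded with the question's sample rows.
DECLARE @Sales TABLE (ID int PRIMARY KEY, saleDate date);

INSERT INTO @Sales (ID, saleDate)
VALUES (10, '20140101'),
       (20, '20140105'),
       (30, '20140107'),
       (40, '20140110'),
       (50, '20140117');

-- LEAD() reads the next row's saleDate; the last row has no successor,
-- so ISNULL substitutes the default gap of 1 the question asks for.
SELECT ID,
       saleDate,
       ISNULL(DATEDIFF(day, saleDate, LEAD(saleDate) OVER (ORDER BY saleDate)), 1) AS gapInDays
FROM @Sales
ORDER BY saleDate;
```

For the sample rows the gaps come out as 4, 2, 3, 7 and 1, matching the expected output in the question.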

If you are using MS SQL Server 2012 (or another database that supports the same or similar functions), you can use the LAG() function to access previous rows (or LEAD() to access subsequent rows).
Apparently you want this to work on SQL Azure, which at the time lacked the LAG and LEAD windowing functions.
One solution that should work is to use the ROW_NUMBER ranking function applied over the date column. Azure supports ROW_NUMBER, so this code should work:
select t1.id, isnull(datediff(day, t1.saledate, t2.saledate), 1) as gapInDays
from
(select id, saledate, rn = row_number() over (order by saledate, id) from gaps) t1
left join
(select id, saledate, rn = row_number() over (order by saledate, id) from gaps) t2
on t1.rn = t2.rn-1
If you want it slightly more compact (and if Azure supports ctes which I believe it does) you can do it as a common table expression:
;with c as (
select id, saledate, r = row_number() over (order by saledate, id) from gaps
)
select c.id, isnull(datediff(day, c.saledate, c2.saledate), 1) as gapInDays
from c left join c c2 on c.r = c2.r-1
In these queries I ordered the rows by saledate, then id. If that is incorrect, you might have to change it to order by id, saledate instead, if it is the id that determines the order.

If your ids are strictly sequential you could do something like this
select
a.id, b.saleDate - a.saleDate
from
yourTable as a, yourTable as b
where
a.id = b.id-1

If the database is SQL Server, then the following query should work:
WITH Sales AS
(
SELECT
*, ROW_NUMBER() OVER (ORDER BY SaleDate) AS RowNumber
FROM
TableName
)
SELECT
DATEDIFF(DAY, T1.SaleDate, T2.SaleDate)
FROM
Sales AS T1 INNER JOIN Sales AS T2
ON T1.RowNumber = T2.RowNumber - 1;
