use Row_number after applying distinct - sql

I am creating an SP which gives some result by applying distinct on it, now I want to implement sever side paging, so I tried using Row_number on distinct result like:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY tblA.TeamName DESC)
as Row,tblA.TeamId,tblA.TeamName,tblA.CompId,tblA.CompName,tblA.Title,tblA.Thumbnail,tblA.Rank,tblA.CountryId,tblA.CountryName
FROM
(
--The table query starts with SELECT
)tblA
)
SELECT CTE.* FROM CTE
WHERE CTE.Row BETWEEN #StartRowIndex AND #StartRowIndex+#NumRows-1
ORDER BY CTE.CountryName
but rows are first assigned RowNumber then distinct get applied that is why I am getting duplicate values, how to get distinct rows first then get row numbers for the same.
Any solution on this? Am I missing something?
need answer ASAP.
thanks in advance!

Don't you need to add "partition by" to your ROW_NUMBER statement?
ROW_NUMBER() OVER(Partition by ___, ___, ORDER BY tblA.TeamName DESC)
In the blank spaces, place the column names you would like to create a new row number for. Duplicates will receive a number that is NOT 1 so you might not need the distinct.
To gather the unique values you could write a subquery where the stored procedure only grabs the rows with a 1 in them.
select * from
(
your code
) where row = 1
Hope that helps.
I'm not sure why you're doing this:
WHERE CTE.Row BETWEEN #StartRowIndex AND #StartRowIndex+#NumRows-1

Related

Getting the second row in a Table in SQL

I have a view. In this view, I got the row_num based on the productcontracted, InvoiceDate.
Can you please let me know how I can get the second row in each group?
As you've already got the row_number() setup how you want it, then all you need to do is filter on that row_number in the where statement. This often means turning your query into a subquery, something like the below.
select * from
(
<<your main query here>>
)
where row_number = 2
Looking at your data it looks like you have already applied row_number() over in your query.
To get the second row is therefore: row_number = 2 .
According to the highlighted rows you also want the row before last for each partition. To do this you can reverse the order by and then get the second row in each direction.
Your query will be something like the following
with cte as
( InvoiceDate,
ProductContractId,
row_number() over (
partition by ProductContractId
order by InvoiceDate asc) rn_forwards
row_number() over (
partition by ProductContractId
order by InvoiceDate desc) rn_backwards )
select
InvocieDate,
ProductContractId,
rn_forwards,
rn_backwards
from cte
where
rn_forwards = 2
or rn_backwards = 2;

Find the second largest value with Groupings

In SQL Server, I am attempting to pull the second latest NOTE_ENTRY_DT_TIME (items highlighted in screenshot). With the query written below it still pulls the latest date (I believe it's because of the grouping but the grouping is required to join later). What is the best method to achieve this?
SELECT
hop.ACCOUNT_ID,
MAX(hop.NOTE_ENTRY_DT_TIME) AS latest_noteid
FROM
NOTES hop
WHERE
hop.GEN_YN IS NULL
AND hop.NOTE_ENTRY_DT_TIME < (SELECT MAX(hope.NOTE_ENTRY_DT_TIME)
FROM NOTES hope
WHERE hop.GEN_YN IS NULL)
GROUP BY
hop.ACCOUNT_ID
Data sample in the table:
One of the "easier" ways to get the Nth row in a group is to use a CTE and ROW_NUMBER:
WITH CTE AS(
SELECT Account_ID,
Note_Entry_Dt_Time,
ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY Note_Entry_Dt_Time DESC) AS RN
FROM dbo.YourTable)
SELECT Account_ID,
Note_Entry_Dt_Time
FROM CTE
WHERE RN = 2;
Of course, if an ACCOUNT_ID only has 1 row, then it will not be returned in the result set.
The OP's statement "The row will not always be 2." from the comments conflicts with their statement "I am attempting to pull the second latest NOTE_ENTRY_DT_TIME" in the question. At a best guess, this means that the OP has rows with the same date, that could be the "latest" date. If so, then would simply need to replace ROW_NUMBER with DENSE_RANK. Their sampple data, however, doesn't suggest this is the case.
You can use window functions:
select *
from (
select
n.*,
row_number() over(partition by account_id order by note_entry_dt_time desc) rn
from notes n
) t
where rn = 2

Ambiguous column name using row_number() without alias

I'm trying to implement pagination in a query that is built using information from a view, and I need to use the row_number() function over a column when I don't know which table it is from.
SELECT * FROM (
SELECT class.ID as ID, user.ID as USERID, row_number() over (ORDER BY
ID desc) as row_number FROM class, user
) out_q WHERE row_number > #startrow ORDER BY row_number
The problem is that I only have the result column name (ID or USERID) that came from a previous query. If I execute this query, it will raise the error 'Ambiguous column name "ID"'. Is there a way to specify that I'm referencing the column ID that is being selected and not from a different table?
Is it possible to specify an alias to the query result itself?
I have already tried the following,
SELECT TOP 30 * FROM (
SELECT *, row_number() over (ORDER BY ID desc) as row_number FROM(
SELECT class.ID as ID, user.ID as USERID FROM class, user
) in_q
) out_q WHERE row_number > #startrow ORDER BY row_number
It works, but the SGBD gets confused on which query plan it has to use, because of the small row goal present in the outer query and the big set of results returned by the inner query, when #startrow is a small number, the query executes in less than one second, when it is a big number the query takes minutes to execute.
Your problem is the id in the row_number itself. If you want a stable sort, then include both ids:
SELECT *
FROM (SELECT class.ID as ID, user.ID as USERID,
row_number() over (ORDER BY class.ID desc, user.id) as row_number
FROM class CROSS JOIN user
) out_q
WHERE row_number > #startrow
ORDER BY row_number;
I assume the cartesian product is intentional. Sometimes, this indicates an error in the query. In general, I would advise you to avoid using commas in the from clause. If you do want a cartesian product, then be explicit by using CROSS JOIN.
You could try using the option you already tried, then use the OPTIMIZE FOR hint.
OPTION ( OPTIMIZE FOR (#startrow = 100000) );
See a description of the hint in MSDN docs here: https://msdn.microsoft.com/en-us/library/ms181714.aspx.

How can I resolve the distinct issue in SQL Server 2005?

I am trying to get distinct values for my query. I tried like below, but I am not getting proper result, will any one suggest me how to do resolve the issue.
Here the I want to distinct part_id.
http://tinypic.com/view.php?pic=9scx21&s=8#.UupFqT2SzyQ
Thanks in advance.
Why do you think the result is not correct, the rows returned are distinct.
DISTINCT is applied to all the columns, there's nothing like give me a DISTINCT(p.part_id) and don't care about other columns.
What you probably want is a single row for each part.id
If you don't have any rules which row you want to be returned you can go with a ROW_NUMBER:
select *
from
(
select all your columns
, row_number() over (partition by p.partid order by p.part_id) as rn
from ....
where ...
) as dt
where rn = 1
If there are some rules to determine which row should be returned (oldest/newest/whatever) you simply ORDER BY this column DESC instead of ORDER BY p.part
order by part_id;
Change SELECT DISTINCT P.PART_ID FROM.. at begining and add GROUP BY p.part_id at end.
Distinct must be applied for all columns which values are the same so you can add columns but remenber to add thet to GROUP BY also

SQL ROW_NUMBER and sorting issue

In SQL 2005/2008 database we have table BatchMaster. Columns:
RecordId bigint - autoincremental id, BatchNumber bigint - unique non-clustered index, BatchDate). We have sproc that returns paginated data from this table. That sproc works fine for most of the clients, but at one SQL server instance we have problem with records order.
In general, at sproc we do
select * from
(
select row_number() over (order by bm.BatchDate desc, bm.BatchNumber desc) as Row,
*
from dbo.BatchMaster bm with (nolock)
)
where Row between #StartingRow and #EndgingRow
So, as you can notice from the script above we want return records sorted by BatchDate and BatchNumber. That's not gonna happen for one of our client:
Records are in wrong order. Also, notice first column (Row), it is not in ascending order.
Can someone explain why so?
Assuming you want the lowest BatchNumber for a given BatchDate with the smallest Row number and that you want orderer by the Row, try this:
select * from
(
select row_number() over (order by bm.BatchDate desc, bm.BatchNumber asc) as Row,
*
from dbo.BatchMaster bm with (nolock)
)
where Row between #StartingRow and #EndgingRow
order by Row
Your code doesn't actually sort the results, it only sets 'Row' based on the order of BatchDate and Batchnumber and appears to be doing that correctly. You need to add ORDER BY Row to your statement.
Change your query to include a sort in the outermost query
select * from
(
select row_number() over (order by bm.BatchDate desc, bm.BatchNumber desc) as Row,
*
from dbo.BatchMaster bm with (nolock)
)
where Row between #StartingRow and #EndgingRow
order by Row
The ORDER BY clause in your ROW_NUMBER ranking function only applies to calculating the value of that ranking function, it does not actually order the results.
If you would like the records returned in a certain order you will need specify that in your query: ORDER BY [Row]