SQL ROW_NUMBER and sorting issue - sql

In a SQL Server 2005/2008 database we have a table BatchMaster with these columns:
RecordId bigint (auto-incrementing id), BatchNumber bigint (unique non-clustered index), BatchDate. We have a sproc that returns paginated data from this table. The sproc works fine for most clients, but on one SQL Server instance we have a problem with the record order.
In general, the sproc does this:
select * from
(
select row_number() over (order by bm.BatchDate desc, bm.BatchNumber desc) as Row,
*
from dbo.BatchMaster bm with (nolock)
) t
where Row between @StartingRow and @EndingRow
As you can see from the script above, we want to return records sorted by BatchDate and BatchNumber. That does not happen for one of our clients:
The records come back in the wrong order. Also, notice that the first column (Row) is not in ascending order.
Can someone explain why?

Assuming you want the lowest BatchNumber for a given BatchDate to get the smallest Row number, and that you want the results ordered by Row, try this:
select * from
(
select row_number() over (order by bm.BatchDate desc, bm.BatchNumber asc) as Row,
*
from dbo.BatchMaster bm with (nolock)
) t
where Row between @StartingRow and @EndingRow
order by Row

Your code doesn't actually sort the results. It only sets 'Row' based on the order of BatchDate and BatchNumber, and it appears to be doing that correctly. You need to add ORDER BY Row to your statement.

Change your query to include a sort in the outermost query
select * from
(
select row_number() over (order by bm.BatchDate desc, bm.BatchNumber desc) as Row,
*
from dbo.BatchMaster bm with (nolock)
) t
where Row between @StartingRow and @EndingRow
order by Row

The ORDER BY clause in your ROW_NUMBER ranking function applies only to calculating the value of that ranking function. It does not order the results.
If you want the records returned in a certain order, you need to specify that in your query: ORDER BY [Row]
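The point made in the answers above can be tried out in a small, self-contained sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions) rather than SQL Server, the sample rows are invented, the fetch_page helper is hypothetical, and the alias RowNum is used in place of Row.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BatchMaster ("
    "RecordId INTEGER PRIMARY KEY, BatchNumber INTEGER UNIQUE, BatchDate TEXT)"
)
# Invented sample rows, inserted in an order unrelated to the desired sort.
conn.executemany(
    "INSERT INTO BatchMaster (BatchNumber, BatchDate) VALUES (?, ?)",
    [
        (101, "2011-03-01"),
        (102, "2011-03-01"),
        (103, "2011-03-02"),
        (104, "2011-02-15"),
        (105, "2011-03-02"),
    ],
)


def fetch_page(conn, starting_row, ending_row):
    """Hypothetical helper: return one page, explicitly sorted by RowNum."""
    sql = """
        SELECT numbered.RowNum, numbered.BatchNumber, numbered.BatchDate
        FROM (
            SELECT ROW_NUMBER() OVER (
                       ORDER BY bm.BatchDate DESC, bm.BatchNumber DESC
                   ) AS RowNum,
                   bm.*
            FROM BatchMaster bm
        ) AS numbered
        WHERE numbered.RowNum BETWEEN ? AND ?
        ORDER BY numbered.RowNum  -- without this, output order is not guaranteed
    """
    return conn.execute(sql, (starting_row, ending_row)).fetchall()


page = fetch_page(conn, 2, 4)
for row in page:
    print(row)
```

The window's ORDER BY decides which number each row gets; the outer ORDER BY is what fixes the order in which the page is returned.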

Related

Getting the second row in a Table in SQL

I have a view. In this view, I got the row_num based on the productcontracted, InvoiceDate.
Can you please let me know how I can get the second row in each group?
As you've already got the row_number() set up how you want it, all you need to do is filter on that row number in the WHERE clause. This often means turning your query into a subquery, something like the below.
select * from
(
<<your main query here>>
)
where row_number = 2
Looking at your data, it seems you have already applied row_number() over in your query.
Getting the second row is therefore a matter of filtering on row_number = 2.
According to the highlighted rows, you also want the second-to-last row for each partition. To do this you can reverse the order by and then get the second row in each direction.
Your query will be something like the following:
with cte as
(
select InvoiceDate,
ProductContractId,
row_number() over (
partition by ProductContractId
order by InvoiceDate asc) as rn_forwards,
row_number() over (
partition by ProductContractId
order by InvoiceDate desc) as rn_backwards
from <<your view here>> )
select
InvoiceDate,
ProductContractId,
rn_forwards,
rn_backwards
from cte
where
rn_forwards = 2
or rn_backwards = 2;
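Here is a runnable sketch of the forwards/backwards numbering idea, in case it helps to see it on data. It assumes SQLite (3.25+) through Python's sqlite3 module instead of the asker's database, and the table name Invoices and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the asker's view.
conn.execute("CREATE TABLE Invoices (ProductContractId INTEGER, InvoiceDate TEXT)")
conn.executemany(
    "INSERT INTO Invoices VALUES (?, ?)",
    [
        (1, "2020-01-01"), (1, "2020-02-01"), (1, "2020-03-01"), (1, "2020-04-01"),
        (2, "2020-01-15"), (2, "2020-02-15"), (2, "2020-03-15"),
    ],
)

sql = """
    WITH cte AS (
        SELECT InvoiceDate,
               ProductContractId,
               ROW_NUMBER() OVER (PARTITION BY ProductContractId
                                  ORDER BY InvoiceDate ASC)  AS rn_forwards,
               ROW_NUMBER() OVER (PARTITION BY ProductContractId
                                  ORDER BY InvoiceDate DESC) AS rn_backwards
        FROM Invoices
    )
    SELECT ProductContractId, InvoiceDate, rn_forwards, rn_backwards
    FROM cte
    WHERE rn_forwards = 2 OR rn_backwards = 2
    ORDER BY ProductContractId, InvoiceDate
"""
second_rows = conn.execute(sql).fetchall()
for row in second_rows:
    print(row)
```

Note that a group with exactly three rows yields a single row here, because its second row is also its second-to-last row.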

How to pick first record from the duplicates, With only duplicate column values

Here is the situation where I have a table in bigquery like following.
In the table, records 1 and 3 have the same id but a different first_name (say the person with id 1 changed their first_name). All other fields are the same in both records. I need to select just one of those two records. How can I do that? I tried a self join, but that discards both records. GROUP BY will not work because the records are not full duplicates (only the id is duplicated), and the same goes for DISTINCT.
Thanks!!!!
The query I am using right now is
select * from table t group by 1,2,3,4,5;
You can use the ROW_NUMBER function to assign row numbers to the records in the table.
select *
from(
select *, ROW_NUMBER() OVER(PARTITION BY t.id) rn
from t)
Where rn = 1
ROW_NUMBER does not require the ORDER BY clause. Returns the sequential row ordinal (1-based) of each row for each ordered partition. If the ORDER BY clause is unspecified then the result is non-deterministic.
If you have a record-created or modified date, you can use it in the ORDER BY clause to always pick the latest record.
SQL tables represent unordered sets. There is no first row unless you have a column that specifies the ordering. Let me assume you have such a column.
If you want a particular row, you can use aggregation with an order by:
select array_agg(t order by ? asc limit 1)[ordinal(1)].*
from t
group by id;
? is the column that specifies the ordering.
You can also leave out the order by:
select array_agg(t limit 1)[ordinal(1)].*
from t
group by id;
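As a rough sketch of the ROW_NUMBER approach with a deterministic ordering: the example below runs on SQLite (3.25+) via Python's sqlite3 module rather than BigQuery, and the table name people, the updated_at column, and the rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; updated_at is the assumed column that defines "latest".
conn.execute("CREATE TABLE people (id INTEGER, first_name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "Jon", "2021-01-01"),
        (2, "Ann", "2021-01-05"),
        (1, "John", "2021-02-01"),  # same id, newer first_name
    ],
)

sql = """
    SELECT id, first_name
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY p.id
                                  ORDER BY p.updated_at DESC) AS rn
        FROM people p
    ) AS ranked
    WHERE rn = 1   -- keep only the most recent row per id
    ORDER BY id
"""
latest = conn.execute(sql).fetchall()
print(latest)
```

Without the ORDER BY inside OVER, either of the two rows for id 1 could be the one numbered 1.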

MSSQL: Why won't ROW_NUMBER give me expected results?

I have a table with a datetime field ("time") and an int field ("index")
Please see the query and the picture below. I want ROW_NUMBER to count from 1 whenever the index changes, even if the index value already exists in previous rows. The red text indicates the output that I want to get from the query. How can I modify the query to give me the expected results?
The query:
select rv.[time], rv.[index], ROW_NUMBER() OVER(PARTITION BY rv.[index] ORDER BY rv.[time], rv.[index] ASC) AS Row#
from
tbl rv
This is a gaps-and-islands problem. You need to identify groups of adjacent rows. In this case, I think the simplest method is the difference of row numbers:
select rv.*,
row_number() over (partition by index, (seqnum - seqnum_2) order by time) as row_num
from (select t.*,
row_number() over (order by time) as seqnum,
row_number() over (partition by index order by time) as seqnum_2
from tbl t
) rv;
Why this works is a little tricky to explain. If you look at the results of the subquery, you will see how the difference between the two row number values identifies adjacent values that are the same.
Also, you should not use names like time and index for columns, because these are keywords in SQL. I have not escaped the names in the above query. I encourage you to give your columns and tables names that do not need to be escaped.
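To see how the difference of row numbers picks out the islands, here is a small sketch on invented data. It assumes SQLite (3.25+) through Python's sqlite3 module instead of SQL Server, and it double-quotes the column names, since index is a keyword there too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "time" and "index" are quoted because index is a keyword in SQLite as well.
conn.execute('CREATE TABLE tbl ("time" TEXT, "index" INTEGER)')
# Invented sample: index 10 appears in two separate runs.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("08:00", 10), ("08:01", 10),
        ("08:02", 20), ("08:03", 20), ("08:04", 20),
        ("08:05", 10), ("08:06", 10),
    ],
)

sql = """
    SELECT rv."time",
           rv."index",
           ROW_NUMBER() OVER (
               PARTITION BY rv."index", (rv.seqnum - rv.seqnum_2)
               ORDER BY rv."time"
           ) AS row_num
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (ORDER BY t."time") AS seqnum,
               ROW_NUMBER() OVER (PARTITION BY t."index"
                                  ORDER BY t."time") AS seqnum_2
        FROM tbl t
    ) AS rv
    ORDER BY rv."time"
"""
rows = conn.execute(sql).fetchall()
row_nums = [r[2] for r in rows]
print(row_nums)
```

In this sample the difference seqnum - seqnum_2 is 0 for the first run of 10, 2 for the run of 20, and 3 for the second run of 10, so each run becomes its own partition and the numbering restarts at 1.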

use Row_number after applying distinct

I am creating an SP that returns a result with DISTINCT applied to it. Now I want to implement server-side paging, so I tried using ROW_NUMBER on the distinct result like this:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY tblA.TeamName DESC)
as Row,tblA.TeamId,tblA.TeamName,tblA.CompId,tblA.CompName,tblA.Title,tblA.Thumbnail,tblA.Rank,tblA.CountryId,tblA.CountryName
FROM
(
--The table query starts with SELECT
)tblA
)
SELECT CTE.* FROM CTE
WHERE CTE.Row BETWEEN @StartRowIndex AND @StartRowIndex+@NumRows-1
ORDER BY CTE.CountryName
But the rows are assigned a row number first and DISTINCT is applied afterwards, which is why I am getting duplicate values. How can I get the distinct rows first and then assign row numbers to them?
Is there a solution for this? Am I missing something?
Thanks in advance!
Don't you need to add "partition by" to your ROW_NUMBER statement?
ROW_NUMBER() OVER(Partition by ___, ___ ORDER BY tblA.TeamName DESC)
In the blank spaces, place the column names you would like to create a new row number for. Duplicates will receive a number that is NOT 1 so you might not need the distinct.
To gather the unique values you could write a subquery where the stored procedure only grabs the rows with a 1 in them.
select * from
(
your code
) t where row = 1
Hope that helps.
I'm not sure why you're doing this:
WHERE CTE.Row BETWEEN @StartRowIndex AND @StartRowIndex+@NumRows-1
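One way to get "distinct first, then row numbers", which is what the question asks for, is to apply DISTINCT in the inner query and ROW_NUMBER in the outer one. The sketch below is only an illustration: it runs on SQLite (3.25+) through Python's sqlite3 module, the table TeamRows and its rows are invented, and the alias RowNum replaces Row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source rows containing duplicates.
conn.execute("CREATE TABLE TeamRows (TeamId INTEGER, TeamName TEXT)")
conn.executemany(
    "INSERT INTO TeamRows VALUES (?, ?)",
    [(1, "Alpha"), (1, "Alpha"), (2, "Bravo"), (3, "Charlie"), (3, "Charlie")],
)

sql = """
    SELECT ROW_NUMBER() OVER (ORDER BY d.TeamName DESC) AS RowNum,
           d.TeamId,
           d.TeamName
    FROM (
        -- DISTINCT runs here, before any row numbers exist
        SELECT DISTINCT TeamId, TeamName
        FROM TeamRows
    ) AS d
    ORDER BY RowNum
"""
numbered = conn.execute(sql).fetchall()
for row in numbered:
    print(row)
```

Putting DISTINCT and ROW_NUMBER in the same SELECT does not remove anything, because the row number makes every row unique before DISTINCT is evaluated.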

How do I use ROW_NUMBER()?

I want to use ROW_NUMBER() to do two things:
1. Get the max(ROW_NUMBER()), which I guess would also be the count of all rows.
I tried doing:
SELECT max(ROW_NUMBER() OVER(ORDER BY UserId)) FROM Users
but it didn't seem to work...
2. Get the ROW_NUMBER() from a given piece of information, i.e. if I have a name, I want to know which row the name came from.
I assume it would be something similar to what I tried for #1:
SELECT ROW_NUMBER() OVER(ORDER BY UserId) From Users WHERE UserName='Joe'
but this didn't work either...
Any Ideas?
For the first question, why not just use?
SELECT COUNT(*) FROM myTable
to get the count.
And for the second question, the primary key of the row is what should be used to identify a particular row. Don't try and use the row number for that.
If you returned Row_Number() in your main query,
SELECT ROW_NUMBER() OVER (Order by Id) AS RowNumber, Field1, Field2, Field3
FROM User
Then, when you want to go 5 rows back, you can take the current row number and use the following query to find the row at CurrentRow - 5:
SELECT us.Id
FROM (SELECT ROW_NUMBER() OVER (ORDER BY id) AS Row, Id
FROM User ) us
WHERE Row = CurrentRow - 5
Though I agree with others that you could use count() to get the total number of rows, here is how you can use row_number():
To get the total number of rows:
with temp as (
select row_number() over (order by id) as rownum
from table_name
)
select max(rownum) from temp
To get the row numbers where name is Matt:
with temp as (
select name, row_number() over (order by id) as rownum
from table_name
)
select rownum from temp where name like 'Matt'
You can further use min(rownum) or max(rownum) to get the first or last row for Matt respectively.
These were very simple implementations of row_number(). You can use it for more complex grouping. Check out my response on Advanced grouping without using a sub query
If you need to return the table's total row count, you can use an alternative way to the SELECT COUNT(*) statement.
Because SELECT COUNT(*) scans the whole table to return the row count, it can take a very long time for a large table. In that case you can use the sysindexes system table instead. It has a ROWS column that contains the total row count for each table in your database. You can use the following select statement:
SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2
This will drastically reduce the time your query takes.
You can use this to get the first record that matches the WHERE clause:
SELECT TOP(1) * , ROW_NUMBER() OVER(ORDER BY UserId) AS rownum
FROM Users
WHERE UserName = 'Joe'
ORDER BY rownum ASC
ROW_NUMBER() returns a unique number for each row starting with 1. You can easily use this by simply writing:
ROW_NUMBER() OVER (ORDER BY Column_Name DESC) as ROW_NUMBER
May not be related to the question here. But I found it could be useful when using ROW_NUMBER -
SELECT *,
ROW_NUMBER() OVER (ORDER BY (SELECT 100)) AS Any_ID
FROM #Any_Table
select
Ml.Hid,
ml.blockid,
row_number() over (partition by ml.blockid order by Ml.Hid desc) as rownumber,
H.HNAME
from MIT_LeadBechmarkHamletwise ML
join [MT.HAMLE] h on ML.Hid=h.HID
SELECT num, UserName FROM
(SELECT UserName, ROW_NUMBER() OVER(ORDER BY UserId) AS num
From Users) AS numbered
WHERE UserName='Joe'
You can use ROW_NUMBER to limit the query result to one page.
Example:
SELECT * FROM (
select row_number() OVER (order by createtime desc) AS ROWINDEX,*
from TABLENAME ) TB
WHERE TB.ROWINDEX between 1 and 10
With the above query, I get page 1 of the results from TABLENAME.
If you absolutely want to use ROW_NUMBER for this (instead of count(*)) you can always use:
SELECT TOP 1 ROW_NUMBER() OVER (ORDER BY Id)
FROM USERS
ORDER BY ROW_NUMBER() OVER (ORDER BY Id) DESC
You need to create a virtual table (a common table expression) using WITH ... AS, as in the query below.
Using this virtual table, you can perform CRUD operations with respect to the row number.
QUERY:
WITH numbered AS
(SELECT row_number() OVER(ORDER BY UserId) rn, * FROM Users)
SELECT * FROM numbered WHERE UserName='Joe'
You can use INSERT, UPDATE or DELETE in the last statement instead of SELECT.
The SQL ROW_NUMBER() function assigns a sequential number to the rows of a record set according to a sort order. It is used to number rows, for example to identify the 10 rows with the highest order amount, or to identify each customer's order with the highest amount.
If you want to sort the dataset and number the rows separately within each category, use ROW_NUMBER() with a PARTITION BY clause. For example, this numbers each customer's orders within that customer, where the dataset contains all orders:
SELECT
SalesOrderNumber,
CustomerId,
SubTotal,
ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY SubTotal DESC) rn
FROM Sales.SalesOrderHeader
But as I understand it, you want to count the rows grouped by a column. For instance, if you want to see the count of all orders of the related customer as a separate column beside the order info, you can use the COUNT() aggregate function with a PARTITION BY clause.
For example,
SELECT
SalesOrderNumber,
CustomerId,
COUNT(*) OVER (PARTITION BY CustomerId) CustomerOrderCount
FROM Sales.SalesOrderHeader
This query:
SELECT ROW_NUMBER() OVER(ORDER BY UserId) From Users WHERE UserName='Joe'
will return a row for every record where the UserName is 'Joe', unless no record has UserName='Joe'.
Row numbers are assigned in UserId order, starting at 1 and incrementing for each row that contains UserName='Joe'.
If it does not work for you, then either your WHERE clause has an issue or there is no UserId column in the table. Check the spelling of both fields, UserId and UserName.
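The difference between filtering in the same query and filtering outside a subquery is easy to check. The sketch below assumes SQLite (3.25+) through Python's sqlite3 module rather than SQL Server, and the four users are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserId INTEGER PRIMARY KEY, UserName TEXT)")
# Invented sample: Joe is the third user by UserId.
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Joe"), (4, "Zed")],
)

# WHERE is applied before the window function, so only Joe's row gets numbered.
filtered_first = conn.execute(
    "SELECT ROW_NUMBER() OVER (ORDER BY UserId) FROM Users WHERE UserName = 'Joe'"
).fetchall()

# Numbering every row in a subquery and filtering outside keeps Joe's position.
numbered_first = conn.execute(
    """
    SELECT num
    FROM (
        SELECT UserName, ROW_NUMBER() OVER (ORDER BY UserId) AS num
        FROM Users
    ) AS numbered
    WHERE UserName = 'Joe'
    """
).fetchall()

print(filtered_first, numbered_first)
```

So the single-query form always answers 1 for a unique name, while the subquery form returns the row's position within the whole table.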