SQL Server 2008 paging methods? - sql

I have to work with a potentially large list of records, and I've been Googling for ways to avoid selecting the whole list. Instead, I want to let users select a page (say, from 1 to 10) and display the records accordingly.
Say, for 1000 records I will have 100 pages of 10 records each, and the most recent 10 records will be displayed first; then, if the user clicks on page 5, it will show records 41 to 50.
Is it a good idea to add a row number to each record then query based on row number? Is there a better way of achieving the paging result without too much overhead?
So far those methods as described here look the most promising:
http://developer.berlios.de/docman/display_doc.php?docid=739&group_id=2899
http://www.codeproject.com/KB/aspnet/PagingLarge.aspx

The following T-SQL stored procedure is a very efficient implementation of paging. The SQL optimiser can find the first ID very fast. Combine this with the use of ROWCOUNT, and you have an approach that is both CPU-efficient and read-efficient. For a table with a large number of rows, it certainly beats any approach that I've seen using a temporary table or table variable.
NB: I'm using a sequential identity column in this example, but the code works on any column suitable for page sorting. Also, sequence breaks in the column being used don't affect the result as the code selects a number of rows rather than a column value.
EDIT: If you're sorting on a column with potentially non-unique values (eg LastName), then add a second column to the Order By clause to make the sort values unique again.
CREATE PROCEDURE dbo.PagingTest
(
@PageNumber int,
@PageSize int
)
AS
DECLARE @FirstId int, @FirstRow int
SET @FirstRow = ( (@PageNumber - 1) * @PageSize ) + 1
SET ROWCOUNT @FirstRow
-- Add check here to ensure that @FirstRow is not
-- greater than the number of rows in the table.
SELECT @FirstId = [Id]
FROM dbo.TestTable
ORDER BY [Id]
SET ROWCOUNT @PageSize
SELECT *
FROM dbo.TestTable
WHERE [Id] >= @FirstId
ORDER BY [Id]
SET ROWCOUNT 0
GO

If you use a CTE with two row_number() columns, one sorted ascending and one descending, you get row numbers for paging as well as the total record count: for any row, the two row numbers add up to the total plus one. This assumes sort_column is unique; otherwise ties may be numbered inconsistently between the two orderings.
create procedure get_pages(@page_number int, @page_length int)
as
set nocount on;
with cte as
(
select
Row_Number() over (order by sort_column desc) as row_num
,Row_Number() over (order by sort_column) as inverse_row_num
,id as cte_id
From my_table
)
Select
row_num+inverse_row_num-1 as total_rows
,*
from CTE inner join my_table
on cte_id=my_table.id
-- @page_number is zero-based here; BETWEEN is inclusive on both ends
where row_num between
(@page_number)*@page_length+1
and (@page_number+1)*@page_length
order by row_num
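The arithmetic behind the two-row-number trick can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function name two_way_row_numbers and the sample keys are made up for illustration, and the keys are assumed to be unique.

```python
def two_way_row_numbers(keys):
    """Number unique keys ascending and descending, like two ROW_NUMBER() columns."""
    ascending = {k: n for n, k in enumerate(sorted(keys), start=1)}
    descending = {k: n for n, k in enumerate(sorted(keys, reverse=True), start=1)}
    return [(k, ascending[k], descending[k]) for k in sorted(keys)]


# Hypothetical sort_column values (unique on purpose).
rows = two_way_row_numbers([30, 10, 50, 20, 40])

# For every row, ascending + descending - 1 equals the total row count.
totals = {asc + desc - 1 for _, asc, desc in rows}
```

Every row yields the same total, which is why the count can be read from any single row of a page. Note the "- 1": simply adding the two numbers overshoots by one.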

Using OFFSET
Others have explained how the ROW_NUMBER() OVER() ranking function can be used to perform pages. It's worth mentioning that SQL Server 2012 finally included support for the SQL standard OFFSET .. FETCH clause:
SELECT first_name, last_name, score
FROM players
ORDER BY score DESC
OFFSET 40 ROWS FETCH NEXT 10 ROWS ONLY
If you're using SQL Server 2012 and backwards-compatibility is not an issue, you should probably prefer this clause as it will be executed more optimally by SQL Server in corner cases.
Using the SEEK Method
There is an entirely different, much faster way to perform paging in SQL. This is often called the "seek method" as described in this blog post here.
SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score < @previousScore)
OR (score = @previousScore AND player_id < @previousPlayerId)
ORDER BY score DESC, player_id DESC
The @previousScore and @previousPlayerId values are the respective values of the last record from the previous page. This allows you to fetch the "next" page. If the ORDER BY direction is ASC, simply use > instead.
With the above method, you cannot immediately jump to page 4 without having first fetched the previous 40 records. But often, you do not want to jump that far anyway. Instead, you get a much faster query that might be able to fetch data in constant time, depending on your indexing. Plus, your pages remain "stable", no matter if the underlying data changes (e.g. on page 1, while you're on page 4).
This is the best way to implement paging when lazy loading more data in web applications, for instance.
Note, the "seek method" is also called keyset paging.
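A minimal runnable sketch of the seek method, using Python's built-in sqlite3 module as a stand-in for SQL Server (so LIMIT replaces TOP, and ? placeholders replace the T-SQL variables). The helper name fetch_page and the sample data are assumptions made for illustration only.

```python
import sqlite3


def fetch_page(conn, page_size, last_seen=None):
    """Return the next page ordered by score DESC, player_id DESC.

    last_seen is the (player_id, score) tuple of the last row of the
    previous page, or None to fetch the first page.
    """
    if last_seen is None:
        return conn.execute(
            "SELECT player_id, score FROM players "
            "ORDER BY score DESC, player_id DESC LIMIT ?",
            (page_size,),
        ).fetchall()
    prev_id, prev_score = last_seen
    return conn.execute(
        "SELECT player_id, score FROM players "
        "WHERE score < ? OR (score = ? AND player_id < ?) "
        "ORDER BY score DESC, player_id DESC LIMIT ?",
        (prev_score, prev_score, prev_id, page_size),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (player_id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [(1, 90), (2, 80), (3, 80), (4, 70), (5, 90), (6, 60)],
)

page1 = fetch_page(conn, 2)
page2 = fetch_page(conn, 2, last_seen=page1[-1])
page3 = fetch_page(conn, 2, last_seen=page2[-1])
```

Each call only needs the last row of the page before it, so no rows are counted or skipped. The tie-breaker on player_id is what keeps rows with equal scores from being repeated or lost across pages.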

Try something like this:
declare @page int = 2
declare @size int = 10
declare @lower int = (@page - 1) * @size + 1 -- BETWEEN is inclusive, so start one past the previous page
declare @upper int = @page * @size
select * from (
select
ROW_NUMBER() over (order by some_column) lfd,
* from your_table
) as t
where lfd between @lower and @upper
order by some_column
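Because BETWEEN includes both ends, the lower bound has to start one row past the previous page, or consecutive pages share a row. A small Python sketch of that arithmetic, assuming 1-based page numbers (the function name page_bounds is hypothetical):

```python
def page_bounds(page, size):
    """Inclusive 1-based row-number bounds for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    lower = (page - 1) * size + 1
    upper = page * size
    return lower, upper


second_page = page_bounds(2, 10)
fifth_page = page_bounds(5, 10)
```

With a page size of 10, page 5 covers rows 41 to 50, which matches the example in the question.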

Here's an updated version of @RoadWarrior's code, using TOP. Performance is identical, and extremely fast. Make sure you have an index on TestTable.ID.
CREATE PROC dbo.PagingTest
@SkipRows int,
@GetRows int
AS
DECLARE @FirstId int
-- @FirstId ends up as the Id of row number @SkipRows, so the page starts at that row
SELECT TOP (@SkipRows)
@FirstId = [Id]
FROM dbo.TestTable
ORDER BY [Id]
SELECT TOP (@GetRows) *
FROM dbo.TestTable
WHERE [Id] >= @FirstId
ORDER BY [Id]
GO

Try this
Declare @RowStart int, @RowEnd int;
SET @RowStart = 4;
SET @RowEnd = 7;
With MessageEntities As
(
Select ROW_NUMBER() Over (Order By [MESSAGE_ID]) As Row, [MESSAGE_ID]
From [TBL_NAFETHAH_MESSAGES]
)
Select m0.MESSAGE_ID, m0.MESSAGE_SENDER_NAME,
m0.MESSAGE_SUBJECT, m0.MESSAGE_TEXT
From MessageEntities M
Inner Join [TBL_NAFETHAH_MESSAGES] m0 on M.MESSAGE_ID = m0.MESSAGE_ID
Where M.Row Between @RowStart AND @RowEnd
Order By M.Row Asc
GO

Why not use the recommended solution (this is Entity SQL, not T-SQL):
SELECT VALUE product FROM
AdventureWorksEntities.Products AS product
order by product.ListPrice SKIP @skip LIMIT @limit

Related

SQL Server recent rows

I'm sure this is easy but I have googled a lot and searched.
Ok, I have a table WITHOUT dates etc with 100000000000000000 records.
I want to see the latest entries, i.e.
Select top 200 *
from table
BUT I want to see the latest entries. Is there a rowidentifier that I could use in a table?
ie
select top 200 *
from table
order by rowidentifer Desc
Thanks
Is there a row.identifier that i could use in a table ie set top 200 * from table order by row.identifer Desc
As already stated in the comments, there is not. The best way is having an identity, timestamp or some other form of identifying the record. Here is an alternative way using EXCEPT to get what you need, but the execution plan isn't the best... Play around with it and change as needed.
--testing purposes...
DECLARE @tbl TABLE(FirstName VARCHAR(50))
DECLARE @count INT = 0
WHILE (@count <= 12000)
BEGIN
INSERT INTO @tbl(FirstName)
SELECT
'RuPaul ' + CAST(@count AS VARCHAR(5))
SET @count += 1
END
--adjust how many records you would like, example is 200
SELECT *
FROM @tbl
EXCEPT(SELECT TOP (SELECT COUNT(*) - 200 FROM @tbl) * FROM @tbl)
--faster than above
DECLARE @tblCount AS INT = (SELECT COUNT(*) FROM @tbl) - 200
SELECT *
FROM @tbl
EXCEPT(SELECT TOP (@tblCount) * FROM @tbl)
On another note, you could create another Table Variable that has an ID and other columns, then you could insert the records you would need. Then you can perform other operations against the table, for example OrderBy etc...
What you could do
ALTER TABLE TABLENAME
ADD ID INT IDENTITY
This will add another column to the table "ID" and automatically give it an ID. Then you have an identifier you can use...
Nope. In short, there is none if you don't have a column dedicated as one (i.e. an IDENTITY, a SEQUENCE, or something similar). If you did, then you could get an ordered result back.

SQL stored procedure SET output param using COUNT(*) ON a CTE

I'm using a stored procedure with a CTE and doing some paging. I also want to return an output parameter with the total count of the returned query before my paging.
My problem is that I get an error that "OrderedSet" is not a valid object name.
@ft INT,
@page INT,
@pagesize INT,
@count INT OUTPUT
AS
BEGIN
DECLARE @offset INT
SET @offset = @page * @pagesize
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
WITH OrderedSet AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Id DESC) AS 'Index'
FROM tbl_BulkUploadFiles buf
WHERE
buf.FileType = @ft )
SELECT * FROM OrderedSet WHERE [Index] BETWEEN @offset AND (@offset + @pagesize)
SET @count = (SELECT COUNT(*) FROM OrderedSet)
END
So my issue is on the last line, error is that last OrderedSet is not a valid object name.
Thanks in advance for any help!
Here are 2 approaches that avoid copying and pasting all the CTEs multiple times.
Return total rows as column of result set
Benefit here is that you can calculate total rows without multiple queries and temp tables, but you have to add logic to your front end to get the total row count from the first row of the result set before iterating over it to display the paged set. Another consideration is that you must account for no rows being returned, so set your total row count to 0 if no rows returned.
;WITH OrderedSet AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY Id DESC) AS Seq,
ROW_NUMBER() OVER (ORDER BY Id) AS SeqRev
FROM tbl_BulkUploadFiles buf
WHERE buf.FileType = @ft
)
SELECT *, Seq + SeqRev - 1 AS [TotalCount]
FROM OrderedSet
WHERE Seq BETWEEN @offset AND (@offset + @pagesize)
Utilize a temp table
While there is a cost of a temp table, if your database instance follows best practices for tempdb (multiple files for multi-cores, reasonable initial size, etc), 200k rows may not be a big deal since the context is lost after the stored proc completes, so the 200k rows don't exist for too long. However, it does present challenges if these stored procs are called quite often concurrently - doesn't scale too well. However, you are not keeping the entire table - just the paged rows, so hopefully your page sizes are much smaller than 200k rows.
The approach below tries to minimize the tempdb cost: because of the ASC and DESC ROW_NUMBERs, the total row count can be calculated from just the first row of the paged set.
;WITH OrderedSet AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY Id DESC) AS Seq,
ROW_NUMBER() OVER (ORDER BY Id) AS SeqRev
FROM #buf buf --tbl_BulkUploadFiles buf
WHERE buf.FileType = @ft
)
SELECT * INTO #T
FROM OrderedSet
WHERE Seq BETWEEN @offset AND (@offset + @pagesize)
SET @count = COALESCE((SELECT TOP 1 SeqRev + Seq - 1 FROM #T), 0)
SELECT * FROM #T
Note: The method used above for calculating row counts was adapted from How to reference one CTE twice? and http://www.sqlservercentral.com/articles/T-SQL/66030/.
You can't use the CTE in more than one select statement. From the MSDN docs (talking about the CTE itself).
This is derived from a simple query and defined within the execution
scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
You either need to run the CTE twice (probably a bad idea) or select the results of the CTE into a temp table and then select the paged data from that along with the total count.
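The scoping rule and the temp-table workaround can be reproduced in a few lines. This is a sketch using Python's built-in sqlite3 module rather than SQL Server, so the error text and the LIMIT syntax differ from T-SQL, but the CTE is likewise only visible to the single statement that defines it. The table name files, the temp table name Filtered, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (Id INTEGER PRIMARY KEY, FileType INTEGER)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)", [(i, i % 2) for i in range(1, 21)]
)

# Inside the statement that defines it, the CTE can be queried normally.
count_inside = conn.execute(
    "WITH OrderedSet AS (SELECT Id FROM files WHERE FileType = 1) "
    "SELECT COUNT(*) FROM OrderedSet"
).fetchone()[0]

# A later statement no longer sees the CTE.
try:
    conn.execute("SELECT COUNT(*) FROM OrderedSet")
    cte_visible_later = True
except sqlite3.OperationalError:
    cte_visible_later = False

# Workaround: materialize the filtered rows once, then count and page from that.
conn.execute(
    "CREATE TEMP TABLE Filtered AS SELECT Id FROM files WHERE FileType = 1"
)
total = conn.execute("SELECT COUNT(*) FROM Filtered").fetchone()[0]
page = [
    row[0]
    for row in conn.execute("SELECT Id FROM Filtered ORDER BY Id DESC LIMIT 3")
]
```

The filter runs only once, at the cost of writing the intermediate rows to temporary storage.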
Only alternative I see is repeating the query as inline view
select @count = numrows FROM
(
SELECT count(*) OVER () as numrows, -- windowed count, so no GROUP BY is needed
ROW_NUMBER() OVER (ORDER BY Id DESC) AS 'Index'
FROM tbl_BulkUploadFiles buf
WHERE
buf.FileType = @ft
) XXX WHERE [Index] BETWEEN @offset AND (@offset + @pagesize)

SQL stored procedure passing parameter into "order by"

Using Microsoft SQL server manager 2008.
Making a stored procedure that will "eventually" select the top 10 on the Pareto list. But I also would like to run this again to find the bottom 10.
Now, instead of replicating the query all over again, I'm trying to see if there's a way to pass a parameter into the query that will change the order by from asc to desc.
Is there any way to do this that will save me from replicating code?
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by Pareto @orderby
Only by being slightly silly:
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto
ORDER by CASE WHEN @orderby='ASC' THEN Pareto END,
CASE WHEN @orderby='DESC' THEN Pareto END DESC
You don't strictly need to put the second sort condition in a CASE expression at all(*), and if Pareto is numeric, you may decide to just do CASE WHEN @orderby='ASC' THEN 1 ELSE -1 END * Pareto
(*) The second sort condition only has an effect when the first sort condition considers two rows to be equal. This is either when both rows have the same Pareto value (so the reverse sort would also consider them equal), or because the first CASE expression is returning NULLs (so @orderby isn't 'ASC', and we want to perform the DESC sort).
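To see the two CASE sort keys at work, here is a minimal sketch using Python's built-in sqlite3 module in place of SQL Server (named :dir placeholders stand in for the procedure parameter). The helper name pareto_sorted and the sample values are made up for illustration.

```python
import sqlite3

QUERY = """
SELECT Pareto FROM Peroid1
GROUP BY Pareto
ORDER BY CASE WHEN :dir = 'ASC'  THEN Pareto END ASC,
         CASE WHEN :dir = 'DESC' THEN Pareto END DESC
"""


def pareto_sorted(conn, direction):
    """Return the distinct Pareto values sorted in the requested direction."""
    return [row[0] for row in conn.execute(QUERY, {"dir": direction})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Peroid1 (Pareto INTEGER)")
conn.executemany("INSERT INTO Peroid1 VALUES (?)", [(3,), (1,), (2,), (3,)])

ascending = pareto_sorted(conn, "ASC")
descending = pareto_sorted(conn, "DESC")
```

Whichever CASE does not match yields NULL for every row, so that sort key treats all rows as equal and the other key decides the order. If the parameter is neither 'ASC' nor 'DESC', both keys are NULL and the order is unspecified.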
You might also want to consider retrieving both result sets in one go, rather than doing two calls:
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
SELECT * FROM (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY Pareto) as rn1,
ROW_NUMBER() OVER (ORDER BY Pareto DESC) as rn2
FROM (
SELECT Peroid1.Pareto
FROM dbo.Peroid1
GROUP by Pareto
) t
) t2
WHERE rn1 between 1 and 10 or rn2 between 1 and 10
ORDER BY rn1
This will give you the top 10 and the bottom 10, in order from top to bottom. But if there are less than 20 results in total, you won't get duplicates, unlike your current plan.
try:
CREATE PROCEDURE [dbo].[TopVRM]
(@orderby varchar(255))
AS
IF @orderby='asc'
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by Pareto asc
ELSE
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by Pareto desc
I know it's pretty old, but just wanted to share our solution here, hoping to help someone :)
After some performance tests on several candidate solutions (some of them posted in this thread), we realized you must be really careful with your implementation: your SP performance can be hugely impacted, especially when you combine it with pagination.
The best solution we found was to save the raw results, i.e. with only the filters applied, in a temporary table (#RawResult in the example), and then add the ORDER BY and OFFSET clauses for pagination afterwards. Maybe it's not the prettiest solution (as you are forced to copy and paste a clause twice for each column you want to sort by), but we were unable to find a better one in terms of performance.
Here it goes:
CREATE PROCEDURE [dbo].[MySP]
-- Here goes your procedure arguments to filter results
@Page INT = 1, -- Resulting page for pagination, starting in 1
@Limit INT = 100, -- Result page size
@OrderBy NVARCHAR(MAX) = NULL, -- OrderBy column
@OrderByAsc BIT = 1 -- OrderBy direction (ASC/DESC)
AS
-- Here goes your SP logic (if any)
SELECT
* -- Here goes your resulting columns
INTO
#RawResult
FROM
...
-- Here goes your query data source([FROM], [WHERE], [GROUP BY], etc)
-- NO [ORDER BY] / [TOP] / [FETCH HERE]!!!!
--From here, ORDER BY columns must be copy&pasted twice: ASC and DESC orders for each column
IF (@OrderByAsc = 1 AND @OrderBy = 'Column1')
SELECT * FROM #RawResult ORDER BY Column1 ASC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
IF (@OrderByAsc = 0 AND @OrderBy = 'Column1')
SELECT * FROM #RawResult ORDER BY Column1 DESC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
IF (@OrderByAsc = 1 AND @OrderBy = 'Column2')
SELECT * FROM #RawResult ORDER BY Column2 ASC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
IF (@OrderByAsc = 0 AND @OrderBy = 'Column2')
SELECT * FROM #RawResult ORDER BY Column2 DESC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
...
ELSE --Default order, first column ASC
SELECT * FROM #RawResult ORDER BY 1 ASC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
This gives you more options
CREATE PROCEDURE [dbo].[TopVRM] @orderby varchar(255) = 'Pareto asc'
AS
BEGIN
-- The DECLARE has to come after AS, inside the procedure body
DECLARE @SendIt NVARCHAR(MAX)
SET @SendIt = 'SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by '+ @orderby
EXEC sp_executesql @SendIt
END
GO
EXEC dbo.TopVRM 'Pareto DESC'
GO
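Since the dynamic SQL version concatenates the caller's text straight into the statement, it is worth validating that text before it reaches the procedure. The following is a minimal sketch of an allow-list check written in Python for the calling application; the function name build_order_by and the allowed sets are assumptions, not part of the original answer.

```python
ALLOWED_COLUMNS = {"Pareto"}          # hypothetical list of sortable columns
ALLOWED_DIRECTIONS = {"ASC", "DESC"}


def build_order_by(order_by):
    """Validate 'column [direction]' against allow-lists and return an ORDER BY clause."""
    parts = order_by.split()
    if len(parts) == 1:
        parts.append("ASC")  # default direction when none is given
    if len(parts) != 2:
        raise ValueError("expected 'column [ASC|DESC]'")
    column, direction = parts[0], parts[1].upper()
    if column not in ALLOWED_COLUMNS or direction not in ALLOWED_DIRECTIONS:
        raise ValueError("unsupported sort column or direction")
    return f"ORDER BY {column} {direction}"


clause = build_order_by("Pareto desc")

try:
    build_order_by("Pareto; DROP TABLE Peroid1")
    injection_rejected = False
except ValueError:
    injection_rejected = True
```

Only text that matches a known column and direction is ever turned into SQL; anything else raises an error instead of being executed.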

How to select the next top rows including the new rows that User added between the selection.

Assume I have a table ordered by Name column.
At the first time I'd like to select the top 500 rows.
User can add new rows to the table.
Based on user requirements.
I'd like to retrieve the next 500 rows without retrieving the first 500 rows again.
Assume that the table is ordered by name and the user added new rows that might belong in the top 500.
The question is: how can I select the next 500 rows, including the new rows that I couldn't get the first time because they are new?
What you're describing is called Paging
Here's a nice article that describes it. Server Side Paging using SQL Server 2005
Which includes this sample
DECLARE @PageSize INT,
@PageNumber INT,
@FirstRow INT,
@LastRow INT
SELECT @PageSize = 20,
@PageNumber = 3
SELECT @FirstRow = ( @PageNumber - 1) * @PageSize + 1,
@LastRow = (@PageNumber - 1) * @PageSize + @PageSize ;
WITH Members AS
(
SELECT M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY,
ROW_NUMBER() OVER (ORDER BY M_POSTS DESC) AS RowNumber,
ROW_NUMBER() OVER (ORDER BY M_NAME DESC) AS RowNumber2
FROM dbo.FORUM_MEMBERS
)
SELECT RowNumber, M_NAME, M_POSTS, M_LASTPOSTDATE, M_LASTHEREDATE, M_DATE, M_COUNTRY
FROM Members
WHERE RowNumber BETWEEN @FirstRow AND @LastRow
ORDER BY RowNumber ASC;
Note the pagesize and pagenumber variables. These could be parameters to a stored procedure instead.
I'm assuming that you have a column in the users table called isnewuser, which is set to true for every new user added and is not shown in the list.
If other users have been added while you are viewing the next 500 records, I would suggest showing them in a separate list below the original one, labelled "new users" or similar.
It makes no sense to show newly added users, who could have been on the first page, on page 2 of the main list.

ROW_NUMBER() OVER Not Fast Enough With Large Result Set, any good solution?

I use ROW_NUMBER() to do paging with my website content, and when you hit the last page it times out because SQL Server takes too long to complete the search.
There's already an article concerning this problem but seems no perfect solution yet.
http://weblogs.asp.net/eporter/archive/2006/10/17/ROW5F00NUMBER28002900-OVER-Not-Fast-Enough-With-Large-Result-Set.aspx
When I click the last page on StackOverflow it takes less than a second to return a page, which is really fast. I'm wondering whether they have really fast database servers or whether they have a solution for the ROW_NUMBER() problem.
Any idea?
Years back, while working with Sql Server 2000, which did not have this function, we had the same issue.
We found this method, which at first look seems like it would perform badly, but it blew us out of the water.
Try this out
DECLARE @Table TABLE(
ID INT PRIMARY KEY
)
--insert some values, as many as required.
DECLARE @I INT
SET @I = 0
WHILE @I < 100000
BEGIN
INSERT INTO @Table SELECT @I
SET @I = @I + 1
END
DECLARE @Start INT,
@Count INT
SELECT @Start = 10001,
@Count = 50
SELECT *
FROM (
SELECT TOP (@Count)
*
FROM (
SELECT TOP (@Start + @Count)
*
FROM @Table
ORDER BY ID ASC
) TopAsc
ORDER BY ID DESC
) TopDesc
ORDER BY ID
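The nested query takes the first "start + count" rows in ascending order, keeps the last "count" of those by reversing the order, and then flips the result back. A minimal sketch of the same shape, using Python's built-in sqlite3 module as a stand-in for SQL Server (LIMIT replaces TOP); the table name t and the helper name page_by_double_top are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

SQL = """
SELECT id FROM (
    SELECT id FROM (
        SELECT id FROM t ORDER BY id ASC LIMIT :take
    ) ORDER BY id DESC LIMIT :count
) ORDER BY id ASC
"""


def page_by_double_top(conn, start, count):
    """Skip `start` rows and return the next `count` ids in ascending order."""
    params = {"take": start + count, "count": count}
    return [row[0] for row in conn.execute(SQL, params)]


middle_page = page_by_double_top(conn, 100, 5)
# Near the end of the table the inner query returns fewer rows than requested,
# so the method still returns `count` rows and overlaps the previous page.
last_page = page_by_double_top(conn, 998, 5)
```

In this sketch the start value behaves as the number of rows to skip. The second call illustrates a caveat: when fewer than "start + count" rows exist, the result is simply the last "count" rows of the table, so the final page may need separate handling.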
The base logic of this method relies on the SET ROWCOUNT expression to both skip the unwanted rows and fetch the desired ones:
DECLARE @Sort /* the type of the sorting column */
SET ROWCOUNT @StartRow
SELECT @Sort = SortColumn FROM Table ORDER BY SortColumn
SET ROWCOUNT @PageSize
SELECT ... FROM Table WHERE SortColumn >= @Sort ORDER BY SortColumn
The issue is well covered in this CodeProject article, including scalability graphs.
TOP is supported on SQL Server 2000, but only with static values. E.g. no "TOP (@Var)", only "TOP 200".