I'm using Microsoft SQL Server Management Studio 2008.
I'm writing a stored procedure that will eventually select the top 10 on the Pareto list, but I would also like to run it again to find the bottom 10.
Rather than duplicating the whole query, I'm trying to see if there's a way to pass in a parameter that changes the ORDER BY from ASC to DESC.
Is there any way to do this that will save me from duplicating code?
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by Pareto @orderby
Only by being slightly silly:
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto
ORDER by CASE WHEN @orderby='ASC' THEN Pareto END,
CASE WHEN @orderby='DESC' THEN Pareto END DESC
You don't strictly need to put the second sort condition in a CASE expression at all(*), and if Pareto is numeric, you may decide to just do CASE WHEN @orderby='ASC' THEN 1 ELSE -1 END * Pareto
(*) The second sort condition only has an effect when the first sort condition considers two rows to be equal. This is either when both rows have the same Pareto value (so the reverse sort would also consider them equal), or because the first CASE expression is returning NULLs (so @orderby isn't 'ASC', and we want to perform the DESC sort).
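The CASE-based ordering above can be tried outside SQL Server. The following is only a sketch in Python using the standard sqlite3 module (an assumption for illustration, not the T-SQL procedure itself); the helper name pareto_sorted and the sample rows are made up, and the :dir placeholder plays the role of @orderby.

```python
import sqlite3

def pareto_sorted(conn, direction):
    """Return distinct Pareto values; direction is 'ASC' or 'DESC'."""
    sql = """
        SELECT Pareto FROM Peroid1
        GROUP BY Pareto
        ORDER BY CASE WHEN :dir = 'ASC'  THEN Pareto END,
                 CASE WHEN :dir = 'DESC' THEN Pareto END DESC
    """
    # Exactly one CASE yields real values; the other yields NULL for every
    # row, so it treats all rows as equal and has no effect on the order.
    return [row[0] for row in conn.execute(sql, {"dir": direction})]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Peroid1 (Pareto INTEGER)")
conn.executemany("INSERT INTO Peroid1 VALUES (?)", [(3,), (1,), (2,), (3,)])

asc = pareto_sorted(conn, "ASC")
desc = pareto_sorted(conn, "DESC")
print(asc, desc)
```

The same query text serves both directions, which is the point of the trick: only the parameter changes.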
You might also want to consider retrieving both result sets in one go, rather than doing two calls:
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
SELECT * FROM (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY Pareto) as rn1,
ROW_NUMBER() OVER (ORDER BY Pareto DESC) as rn2
FROM (
SELECT Peroid1.Pareto
FROM dbo.Peroid1
GROUP by Pareto
) t
) t2
WHERE rn1 between 1 and 10 or rn2 between 1 and 10
ORDER BY rn1
This will give you the top 10 and the bottom 10, in order from top to bottom. And if there are fewer than 20 results in total, you won't get duplicates, unlike with your current plan.
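To see the "no duplicates" behaviour of the two ROW_NUMBER() columns, here is a small sketch in Python with sqlite3 (assumed here purely as a runnable stand-in for SQL Server; it needs SQLite 3.25 or later for window functions). The function name top_and_bottom, the parameter n and the sample data are hypothetical.

```python
import sqlite3

def top_and_bottom(conn, n):
    """Return the n smallest and n largest distinct Pareto values, ascending."""
    sql = """
        SELECT Pareto FROM (
            SELECT Pareto,
                   ROW_NUMBER() OVER (ORDER BY Pareto)      AS rn1,
                   ROW_NUMBER() OVER (ORDER BY Pareto DESC) AS rn2
            FROM (SELECT Pareto FROM Peroid1 GROUP BY Pareto) AS t
        ) AS t2
        WHERE rn1 <= :n OR rn2 <= :n
        ORDER BY rn1
    """
    return [row[0] for row in conn.execute(sql, {"n": n})]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Peroid1 (Pareto INTEGER)")
conn.executemany("INSERT INTO Peroid1 VALUES (?)",
                 [(5,), (1,), (3,), (2,), (4,), (3,)])

both_ends = top_and_bottom(conn, 2)   # two from each end of five values
overlap = top_and_bottom(conn, 4)     # ranges overlap, rows appear once
print(both_ends, overlap)
```

When the two ranges overlap, each row still appears only once, because the OR filter selects rows rather than concatenating two result sets.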
Try:
CREATE PROCEDURE [dbo].[TopVRM]
@orderby varchar(255)
AS
IF @orderby='asc'
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by Pareto asc
ELSE
SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by Pareto desc
I know it's pretty old, but just wanted to share our solution here, hoping to help someone :)
After performance-testing several candidate solutions (some of them posted in this thread), we realized you must be really careful with your implementation: your SP's performance can be hugely impacted, especially when you combine dynamic sorting with pagination.
The best solution we found was to save the raw results, i.e. with only the filters applied, in a temporary table (#RawResult in the example), and to add the ORDER BY and OFFSET clauses for pagination afterwards. Maybe it's not the prettiest solution (as you are forced to copy and paste a clause twice for each column you want to sort by), but we were unable to find a better one in terms of performance.
Here it goes:
CREATE PROCEDURE [dbo].[MySP]
-- Here goes your procedure arguments to filter results
@Page INT = 1, -- Resulting page for pagination, starting in 1
@Limit INT = 100, -- Result page size
@OrderBy NVARCHAR(MAX) = NULL, -- OrderBy column
@OrderByAsc BIT = 1 -- OrderBy direction (ASC/DESC)
AS
-- Here goes your SP logic (if any)
SELECT
* -- Here goes your resulting columns
INTO
#RawResult
FROM
...
-- Here goes your query data source([FROM], [WHERE], [GROUP BY], etc)
-- NO [ORDER BY] / [TOP] / [FETCH] HERE!!!!
--From here, ORDER BY columns must be copy&pasted twice: ASC and DESC orders for each column
IF (@OrderByAsc = 1 AND @OrderBy = 'Column1')
SELECT * FROM #RawResult ORDER BY Column1 ASC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
IF (@OrderByAsc = 0 AND @OrderBy = 'Column1')
SELECT * FROM #RawResult ORDER BY Column1 DESC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
IF (@OrderByAsc = 1 AND @OrderBy = 'Column2')
SELECT * FROM #RawResult ORDER BY Column2 ASC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
IF (@OrderByAsc = 0 AND @OrderBy = 'Column2')
SELECT * FROM #RawResult ORDER BY Column2 DESC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
ELSE
...
ELSE --Default order, first column ASC
SELECT * FROM #RawResult ORDER BY 1 ASC OFFSET @Limit * (@Page - 1) ROWS FETCH NEXT @Limit ROWS ONLY
This gives you more options
CREATE PROCEDURE [dbo].[TopVRM] @orderby varchar(255) = 'Pareto asc'
AS
BEGIN
DECLARE @SendIt NVARCHAR(MAX)
-- @orderby is concatenated into the statement as-is, so only pass trusted values
SET @SendIt = 'SELECT Peroid1.Pareto FROM dbo.Peroid1
GROUP by Pareto ORDER by '+ @orderby
EXEC sp_executesql @SendIt
END
GO
EXEC dbo.TopVRM 'Pareto DESC'
GO
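Because the dynamic SQL above splices the caller's text straight into the statement, it is usually worth validating the sort column and direction against a fixed list first. The sketch below shows that idea in Python; the names ALLOWED_COLUMNS, ALLOWED_DIRECTIONS and build_query are hypothetical, and the whitelist check is an addition for illustration, not part of the procedure above.

```python
# Hypothetical whitelist: only these fragments may reach the ORDER BY clause.
ALLOWED_COLUMNS = {"Pareto"}
ALLOWED_DIRECTIONS = {"ASC", "DESC"}

def build_query(column, direction):
    """Build the statement text, rejecting anything not on the whitelist."""
    direction = direction.upper()
    if column not in ALLOWED_COLUMNS or direction not in ALLOWED_DIRECTIONS:
        raise ValueError("unsupported sort: %r %r" % (column, direction))
    return ("SELECT Peroid1.Pareto FROM dbo.Peroid1 "
            "GROUP BY Pareto ORDER BY {} {}".format(column, direction))

query = build_query("Pareto", "desc")
print(query)

try:
    build_query("Pareto; DROP TABLE dbo.Peroid1 --", "asc")
    rejected = False
except ValueError:
    rejected = True
```

In T-SQL the same check could be an IF on the parameter values before building the string; the important part is that free text never reaches the concatenation.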
Related
I've created a stored procedure that filters and paginates for a DataTable.
Problem: I need to set an OUTPUT variable, @TotalRecord, to the number of records found before the OFFSET is applied; otherwise it ends up set to @RecordPerPage.
I've messed around with CTEs and also simply trying this:
SELECT *, @TotalRecord = COUNT(1)
FROM dbo
But that doesn't work either.
Here is my stored procedure, with most of the stuff pulled out:
ALTER PROCEDURE [dbo].[SearchErrorReports]
@FundNumber varchar(50) = null,
@ProfitSelected bit = 0,
@SortColumnName varchar(30) = null,
@SortDirection varchar(10) = null,
@StartIndex int = 0,
@RecordPerPage int = null,
@TotalRecord INT = 0 OUTPUT --NEED TO SET THIS BEFORE OFFSET!
AS
BEGIN
SET NOCOUNT ON;
SELECT *
FROM
(SELECT *
FROM dbo.View
WHERE (@ProfitSelected = 1 AND Profit = 1)) AS ERP
WHERE
((@FundNumber IS NULL OR @FundNumber = '')
OR (ERP.FundNumber LIKE '%' + @FundNumber + '%'))
ORDER BY
CASE
WHEN @SortColumnName = 'FundNumber' AND @SortDirection = 'asc'
THEN ERP.FundNumber
END ASC,
CASE
WHEN @SortColumnName = 'FundNumber' AND @SortDirection = 'desc'
THEN ERP.FundNumber
END DESC
OFFSET @StartIndex ROWS
FETCH NEXT @RecordPerPage ROWS ONLY
END
Thank you in advance!
You could try something like this:
create a CTE that gets the data you want to return
include a COUNT(*) OVER() in there to get the total count of rows
return just a subset (based on your OFFSET .. FETCH NEXT) from the CTE
So your code would look something along those lines:
-- CTE definition - call it whatever you like
WITH BaseData AS
(
SELECT
-- select all the relevant columns you need
p.ProductID,
p.ProductName,
-- using COUNT(*) OVER() returns the total count over all rows
TotalCount = COUNT(*) OVER()
FROM
dbo.Products p
)
-- now select from the CTE - using OFFSET/FETCH NEXT, get only those rows you
-- want - but the "TotalCount" column still contains the total count - before
-- the OFFSET/FETCH
SELECT *
FROM BaseData
ORDER BY ProductID
OFFSET 20 ROWS FETCH NEXT 15 ROWS ONLY
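To check that the window count really reflects the rows before paging, here is a sketch in Python with sqlite3 (an assumption made so the example is runnable; SQLite spells the paging clause LIMIT/OFFSET instead of OFFSET/FETCH and needs version 3.25 or later for window functions). The function name fetch_page and the 50 sample products are made up.

```python
import sqlite3

def fetch_page(conn, offset, page_size):
    """Return (total_count, page_rows) using COUNT(*) OVER () inside a CTE."""
    sql = """
        WITH BaseData AS (
            SELECT ProductID, ProductName,
                   COUNT(*) OVER () AS TotalCount
            FROM Products
        )
        SELECT ProductID, ProductName, TotalCount
        FROM BaseData
        ORDER BY ProductID
        LIMIT :size OFFSET :offset
    """
    rows = conn.execute(sql, {"size": page_size, "offset": offset}).fetchall()
    # The total travels on every row; an empty page carries no total at all.
    total = rows[0][2] if rows else 0
    return total, [(r[0], r[1]) for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT)")
conn.executemany("INSERT INTO Products VALUES (?, ?)",
                 [(i, "P%d" % i) for i in range(1, 51)])

total, page = fetch_page(conn, 20, 15)
print(total, page[0], page[-1])
```

One consequence of this design: if the requested offset lies past the last row, no rows come back, so the caller has to treat "no rows" as a separate case rather than reading a count of zero.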
As a habit, I prefer non-null entries before possible null. I did not reference those in my response below, and limited a working example to just the two inputs you are most concerned with.
I believe there could be cleaner ways to apply your local variables to filter the query results without having to perform an offset. You could write the results to a temp table, or to a permanent working table that cleans itself up, and use IDs that aren't returned as a way to set pages. Smoother, with less fuss.
However, I understand that isn't always feasible, and I become frustrated myself with those attempting to solve your use case for you without attempting to answer the question. Quite often there are multiple ways to tackle any issue. Your job is to decide which one is best in your scenario. Our job is to help you figure out the script.
With that said, here's a potential solution using dynamic SQL.
I'm a huge believer in dynamic SQL, and use it extensively for user based table control and ease of ETL mapping control.
use TestCatalog;
set nocount on;
--Builds a temp table, just for test purposes
drop table if exists ##TestOffset;
create table ##TestOffset
(
Id int identity(1,1)
, RandomNumber decimal (10,7)
);
--Inserts 1000 random numbers between 0 and 100
while (select count(*) from ##TestOffset) < 1000
begin
insert into ##TestOffset
(RandomNumber)
values
(RAND()*100)
end;
set nocount off;
go
create procedure dbo.TestOffsetProc
@StartIndex int = null --I'll reference this like a page number below
, @RecordsPerPage int = null
as
begin
declare @MaxRows int = 30; --your front end will probably manage this, but don't trust it. I personally would store this on a table against each display so it can also be returned dynamically with less manual intrusion to this procedure.
declare @FirstRow int;
--Quick entry to ensure your record count returned doesn't exceed max allowed.
if @RecordsPerPage is null or @RecordsPerPage > @MaxRows
begin
set @RecordsPerPage = @MaxRows
end;
--Same here, making sure not to return NULL to your dynamic statement. If null is returned from any variable, the entire statement will become null.
if @StartIndex is null
begin
set @StartIndex = 0
end;
set @FirstRow = @StartIndex * @RecordsPerPage
declare @Sql nvarchar(2000) = 'select
tos.*
from ##TestOffset as tos
order by tos.RandomNumber desc
offset ' + convert(nvarchar,@FirstRow) + ' rows
fetch next ' + convert(nvarchar,@RecordsPerPage) + ' rows only'
exec (@Sql);
end
go
exec dbo.TestOffsetProc;
drop table ##TestOffset;
drop procedure dbo.TestOffsetProc;
We have a web application which helps organize biological experiments (users describe an experiment and upload experiment data). On the main page, we show the first 10 experiments and then, below them, Previous Next 1 2 3 .. 30.
It bugs me how to do the total count and pagination efficiently. Currently:
select count(id) from experiments; // not very efficient in large datasets
But how does this scale when dealing with a large number of records (> 200,000)? I tried importing random experiments into the table, and it still performs quite OK (0.6 s for 300,000 experiments).
The other alternative I thought about is to add an additional table, statistics (columns tableName and recordsCount). After each insert into the experiments table I would increment recordsCount in statistics (this means inserting into one table and updating another, in a SQL transaction of course). The reverse goes for a delete statement (recordsCount--).
For pagination, the most efficient way is to do where id > last_id, since SQL can use the index. Is there any better way?
In case results are to be filtered, e.g. select * from experiment where name like 'name%', the statistics-table option fails. We need to get the total count as: select count(id) from experiment where name like 'name%'.
Application was developed using Laravel 3 in case it makes any difference.
I would like to develop pagination that always performs the same. Records count must not affect pagination nor total count of records.
Try a query like the one below:
CREATE PROCEDURE [GetUsers]
(
@Inactive Bit = NULL,
@Name Nvarchar(500),
@Culture VarChar(5) = NULL,
@SortExpression VarChar(50),
@StartRowIndex Int,
@MaxRowIndex Int,
@Count INT OUTPUT
)
AS
BEGIN
SELECT ROW_NUMBER()
OVER
(
ORDER BY
CASE WHEN @SortExpression = 'Name' THEN [User].[Name] END,
CASE WHEN @SortExpression = 'Name DESC' THEN [User].[Name] END DESC
) AS RowIndex, [User].*
INTO #tmpTable
FROM [User] WITH (NOLOCK)
WHERE (@Inactive IS NULL OR [User].[Inactive] = @Inactive)
AND (@Culture IS NULL OR [User].[DefaultCulture] = @Culture)
AND [User].Name LIKE '%' + @Name + '%'
SELECT *
FROM #tmpTable WITH (NOLOCK)
WHERE #tmpTable.RowIndex > @StartRowIndex
AND #tmpTable.RowIndex < (@StartRowIndex + @MaxRowIndex + 1)
SELECT @Count = COUNT(*) FROM #tmpTable
IF OBJECT_ID('tempdb..#tmpTable') IS NOT NULL DROP TABLE #tmpTable;
END
I'm using a stored procedure with a CTE and doing some paging. I also want to return an output parameter with the total count of the returned query before my paging.
My problem is that I get an error that "OrderedSet" is not a valid object name.
@ft INT,
@page INT,
@pagesize INT,
@count INT OUTPUT
AS
BEGIN
DECLARE @offset INT
SET @offset = @page * @pagesize
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
WITH OrderedSet AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY Id DESC) AS 'Index'
FROM tbl_BulkUploadFiles buf
WHERE
buf.FileType = @ft )
SELECT * FROM OrderedSet WHERE [Index] BETWEEN @offset AND (@offset + @pagesize)
SET @count = (SELECT COUNT(*) FROM OrderedSet)
END
So my issue is on the last line, error is that last OrderedSet is not a valid object name.
Thanks in advance for any help!
Here are 2 approaches that avoid copying and pasting all the CTEs multiple times.
Return total rows as column of result set
Benefit here is that you can calculate total rows without multiple queries and temp tables, but you have to add logic to your front end to get the total row count from the first row of the result set before iterating over it to display the paged set. Another consideration is that you must account for no rows being returned, so set your total row count to 0 if no rows returned.
;WITH OrderedSet AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY Id DESC) AS Seq,
ROW_NUMBER() OVER (ORDER BY Id) AS SeqRev
FROM tbl_BulkUploadFiles buf
WHERE buf.FileType = @ft
)
SELECT *, Seq + SeqRev - 1 AS [TotalCount]
FROM OrderedSet
WHERE Seq BETWEEN @offset AND (@offset + @pagesize)
Utilize a temp table
While there is a cost of a temp table, if your database instance follows best practices for tempdb (multiple files for multi-cores, reasonable initial size, etc), 200k rows may not be a big deal since the context is lost after the stored proc completes, so the 200k rows don't exist for too long. However, it does present challenges if these stored procs are called quite often concurrently - doesn't scale too well. However, you are not keeping the entire table - just the paged rows, so hopefully your page sizes are much smaller than 200k rows.
The approach below tries to minimize the tempdb cost: because of the ascending and descending ROW_NUMBERs, the total row count can be calculated from just the first row of the paged set.
;WITH OrderedSet AS (
SELECT
*,
ROW_NUMBER() OVER (ORDER BY Id DESC) AS Seq,
ROW_NUMBER() OVER (ORDER BY Id) AS SeqRev
FROM #buf buf --tbl_BulkUploadFiles buf
WHERE buf.FileType = @ft
)
SELECT * INTO #T
FROM OrderedSet
WHERE Seq BETWEEN @offset AND (@offset + @pagesize)
SET @count = COALESCE((SELECT TOP 1 SeqRev + Seq - 1 FROM #T), 0)
SELECT * FROM #T
Note: The method used above for calculating row counts was adapted from How to reference one CTE twice? and http://www.sqlservercentral.com/articles/T-SQL/66030/.
You can't use the CTE in more than one select statement. From the MSDN docs (talking about the CTE itself).
This is derived from a simple query and defined within the execution
scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
You either need to run the CTE twice (probably a bad idea) or select the results of the CTE into a temp table and then select the paged data from that along with the total count.
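The temp-table route described above might look roughly like the following. This is a sketch in Python with sqlite3 (assumed as a runnable stand-in; SQLite 3.25 or later for ROW_NUMBER), not the asker's procedure: the function page_with_count, the temp table name ordered_set, the 1-based page number and the sample rows are all hypothetical.

```python
import sqlite3

def page_with_count(conn, file_type, page, page_size):
    """Materialise the numbered rows once, then read both the page and the total."""
    conn.execute("DROP TABLE IF EXISTS temp.ordered_set")
    conn.execute("CREATE TEMP TABLE ordered_set (idx INTEGER, Id INTEGER)")
    conn.execute(
        """
        INSERT INTO ordered_set (idx, Id)
        SELECT ROW_NUMBER() OVER (ORDER BY Id DESC), Id
        FROM tbl_BulkUploadFiles
        WHERE FileType = ?
        """,
        (file_type,),
    )
    first = (page - 1) * page_size + 1   # 1-based page number (assumption)
    last = page * page_size
    ids = [r[0] for r in conn.execute(
        "SELECT Id FROM ordered_set WHERE idx BETWEEN ? AND ? ORDER BY idx",
        (first, last))]
    # Unlike a CTE, the temp table is still there for a second statement.
    total = conn.execute("SELECT COUNT(*) FROM ordered_set").fetchone()[0]
    return total, ids

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_BulkUploadFiles (Id INTEGER PRIMARY KEY, FileType INTEGER)")
conn.executemany("INSERT INTO tbl_BulkUploadFiles VALUES (?, ?)",
                 [(i, 1 if i <= 7 else 2) for i in range(1, 11)])

total, ids = page_with_count(conn, 1, 2, 3)
print(total, ids)
```

The filtered query runs only once; both the page and the count are then read from the materialised copy.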
The only alternative I see is repeating the query as an inline view:
select @count = numrows FROM
(
SELECT count(*) OVER () as numrows, -- window count, since a plain count(*) cannot be mixed with ROW_NUMBER() here
ROW_NUMBER() OVER (ORDER BY Id DESC) AS 'Index'
FROM tbl_BulkUploadFiles buf
WHERE
buf.FileType = @ft
) XXX WHERE [Index] BETWEEN @offset AND (@offset + @pagesize)
I found this nice example of doing paging in SQL Server, however, I need to do some dynamic ordering. That is, the user passes in an integer, which then gets used to do the ordering, like this:
ORDER BY
CASE WHEN @orderBy = 1 THEN DateDiff(ss, getdate(), received_date) --oldest
WHEN @orderBy = 2 THEN DateDiff(ss, received_date, getdate()) --newest
WHEN @orderBy = 3 THEN message_id --messageid
WHEN @orderBy = 4 THEN LEFT(person_reference, LEN(person_reference)-1) --personid
END
Is it possible to do paging, with this form of dynamic ordering?
What you do instead is move the ORDER BY code into the ROW_NUMBER window function.
Like this example
SELECT * -- your columns
FROM
(
SELECT *, ROWNUM = ROW_NUMBER() OVER (
ORDER BY
CASE WHEN @orderBy = 1 THEN DateDiff(ss, getdate(), received_date) --oldest
WHEN @orderBy = 2 THEN DateDiff(ss, received_date, getdate()) --newest
WHEN @orderBy = 3 THEN message_id --messageid
WHEN @orderBy = 4 THEN LEFT(person_reference, LEN(person_reference)-1) --personid
END
)
FROM TBL
) R
where ROWNUM between ((@pageNumber-1)*@PageSize +1) and (@pageNumber*@PageSize)
The main problem with the complex ORDER BY and the windowing function is that you end up fully materializing the rownum against all rows before returning just one page.
I have to work with a potentially large list of records and I've been Googling for ways to avoid selecting the whole list, instead I want to let users select a page (like from 1 to 10) and display the records accordingly.
Say, for 1000 records I will have 100 pages of 10 records each and the most recent 10 records will be displayed first then if the user click on page 5, it will show records from 41 to 50.
Is it a good idea to add a row number to each record then query based on row number? Is there a better way of achieving the paging result without too much overhead?
So far those methods as described here look the most promising:
http://developer.berlios.de/docman/display_doc.php?docid=739&group_id=2899
http://www.codeproject.com/KB/aspnet/PagingLarge.aspx
The following T-SQL stored procedure is a very efficient implementation of paging. The SQL optimiser can find the first ID very fast. Combine this with the use of ROWCOUNT, and you have an approach that is both CPU-efficient and read-efficient. For a table with a large number of rows, it certainly beats any approach that I've seen using a temporary table or table variable.
NB: I'm using a sequential identity column in this example, but the code works on any column suitable for page sorting. Also, sequence breaks in the column being used don't affect the result as the code selects a number of rows rather than a column value.
EDIT: If you're sorting on a column with potentially non-unique values (eg LastName), then add a second column to the Order By clause to make the sort values unique again.
CREATE PROCEDURE dbo.PagingTest
(
@PageNumber int,
@PageSize int
)
AS
DECLARE @FirstId int, @FirstRow int
SET @FirstRow = ( (@PageNumber - 1) * @PageSize ) + 1
SET ROWCOUNT @FirstRow
-- Add check here to ensure that @FirstRow is not
-- greater than the number of rows in the table.
SELECT @FirstId = [Id]
FROM dbo.TestTable
ORDER BY [Id]
SET ROWCOUNT @PageSize
SELECT *
FROM dbo.TestTable
WHERE [Id] >= @FirstId
ORDER BY [Id]
SET ROWCOUNT 0
GO
If you use a CTE with two row_number() columns, one sorted asc and one desc, you get row numbers for paging as well as the total number of records: add the two row_number columns together and subtract one.
create procedure get_pages(@page_number int, @page_length int)
as
set nocount on;
with cte as
(
select
Row_Number() over (order by sort_column desc) as row_num
,Row_Number() over (order by sort_column) as inverse_row_num
,id as cte_id
From my_table
)
Select
row_num+inverse_row_num-1 as total_rows
,*
from CTE inner join my_table
on cte_id=my_table.id
-- @page_number is treated as zero-based here
where row_num between
(@page_number)*@page_length+1
and (@page_number+1)*@page_length
order by row_num
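The arithmetic behind the two row numbers is easy to verify: for a row at ascending position k out of N, the descending position is N - k + 1, so the two add up to N + 1. The sketch below checks this in Python with sqlite3 (assumed as a runnable stand-in, SQLite 3.25 or later); the sample table is made up, and the check relies on sort_column being unique, since with ties the two numberings are not guaranteed to mirror each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, sort_column INTEGER)")
# Six rows with a unique sort key (assumption needed for the sum to be constant).
conn.executemany("INSERT INTO my_table VALUES (?, ?)",
                 [(i, i * 10) for i in range(1, 7)])

rows = conn.execute("""
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY sort_column DESC) AS row_num,
           ROW_NUMBER() OVER (ORDER BY sort_column)      AS inverse_row_num
    FROM my_table
""").fetchall()

# Every row should report the same total: row_num + inverse_row_num - 1.
totals = {row_num + inverse_row_num - 1 for _, row_num, inverse_row_num in rows}
print(totals)
```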
Using OFFSET
Others have explained how the ROW_NUMBER() OVER() ranking function can be used to perform paging. It's worth mentioning that SQL Server 2012 finally included support for the SQL standard OFFSET .. FETCH clause:
SELECT first_name, last_name, score
FROM players
ORDER BY score DESC
OFFSET 40 ROWS FETCH NEXT 10 ROWS ONLY
If you're using SQL Server 2012 and backwards-compatibility is not an issue, you should probably prefer this clause as it will be executed more optimally by SQL Server in corner cases.
Using the SEEK Method
There is an entirely different, much faster way to perform paging in SQL. This is often called the "seek method" as described in this blog post here.
SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score < @previousScore)
OR (score = @previousScore AND player_id < @previousPlayerId)
ORDER BY score DESC, player_id DESC
The @previousScore and @previousPlayerId values are the respective values of the last record from the previous page. This allows you to fetch the "next" page. If the ORDER BY direction is ASC, simply use > instead.
With the above method, you cannot immediately jump to page 4 without having first fetched the previous 40 records. But often, you do not want to jump that far anyway. Instead, you get a much faster query that might be able to fetch data in constant time, depending on your indexing. Plus, your pages remain "stable", no matter if the underlying data changes (e.g. on page 1, while you're on page 4).
This is the best way to implement paging when lazy loading more data in web applications, for instance.
Note, the "seek method" is also called keyset paging.
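A small end-to-end sketch of the seek method follows, in Python with sqlite3 (assumed as a runnable stand-in; SQLite uses LIMIT where T-SQL uses TOP). The function next_page, its parameters and the five sample players are hypothetical.

```python
import sqlite3

def next_page(conn, prev_score=None, prev_player_id=None, size=10):
    """Fetch the page that follows the given (score, player_id) key."""
    if prev_score is None:
        # First page: no key to seek past yet.
        sql = """SELECT player_id, score FROM players
                 ORDER BY score DESC, player_id DESC LIMIT ?"""
        params = (size,)
    else:
        sql = """SELECT player_id, score FROM players
                 WHERE score < ? OR (score = ? AND player_id < ?)
                 ORDER BY score DESC, player_id DESC LIMIT ?"""
        params = (prev_score, prev_score, prev_player_id, size)
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (player_id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany("INSERT INTO players VALUES (?, ?)",
                 [(1, 50), (2, 50), (3, 40), (4, 30), (5, 30)])

page1 = next_page(conn, size=2)
last_id, last_score = page1[-1]
page2 = next_page(conn, last_score, last_id, size=2)
last_id, last_score = page2[-1]
page3 = next_page(conn, last_score, last_id, size=2)
print(page1, page2, page3)
```

The player_id tie-breaker is what keeps rows with equal scores from being skipped or repeated across pages.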
Try something like this:
declare @page int = 2
declare @size int = 10
-- row numbers start at 1 and BETWEEN is inclusive, so page 2 is rows 11..20
declare @lower int = (@page - 1) * @size + 1
declare @upper int = (@page ) * @size
select * from (
select
ROW_NUMBER() over (order by some_column) as lfd,
* from your_table
) as t
where lfd between @lower and @upper
order by some_column
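When paging with ROW_NUMBER() and an inclusive BETWEEN, the bounds are easy to get off by one. With 1-based pages and 1-based row numbers, page p of size s covers rows (p - 1) * s + 1 through p * s. A tiny Python helper (the name page_bounds is made up for this sketch) makes that explicit:

```python
def page_bounds(page, size):
    """Return the inclusive (lower, upper) row numbers for a 1-based page."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    lower = (page - 1) * size + 1
    upper = page * size
    return lower, upper

second_page = page_bounds(2, 10)
fifth_page = page_bounds(5, 10)
print(second_page, fifth_page)
```

Consecutive pages then never share a row: each page ends at p * s and the next one starts at p * s + 1.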
Here's an updated version of @RoadWarrior's code, using TOP. Performance is identical, and extremely fast. Make sure you have an index on TestTable.ID
CREATE PROC dbo.PagingTest
@SkipRows int,
@GetRows int
AS
DECLARE @FirstId int
SELECT TOP (@SkipRows)
@FirstId = [Id]
FROM dbo.TestTable
ORDER BY [Id]
SELECT TOP (@GetRows) *
FROM dbo.TestTable
WHERE [Id] >= @FirstId
ORDER BY [Id]
GO
Try this
Declare @RowStart int, @RowEnd int;
SET @RowStart = 4;
SET @RowEnd = 7;
With MessageEntities As
(
Select ROW_NUMBER() Over (Order By [MESSAGE_ID]) As Row, [MESSAGE_ID]
From [TBL_NAFETHAH_MESSAGES]
)
Select m0.MESSAGE_ID, m0.MESSAGE_SENDER_NAME,
m0.MESSAGE_SUBJECT, m0.MESSAGE_TEXT
From MessageEntities M
Inner Join [TBL_NAFETHAH_MESSAGES] m0 on M.MESSAGE_ID = m0.MESSAGE_ID
Where M.Row Between @RowStart AND @RowEnd
Order By M.Row Asc
GO
Why not use the recommended solution:
SELECT VALUE product FROM
AdventureWorksEntities.Products AS product
order by product.ListPrice SKIP @skip LIMIT @limit