Dealing with value based pagination given a number of a page - sql

When using value based pagination
select * from articles
where id > #start_value
limit #page_size
how can I calculate #start_value given only page number?
Namely: say, I had a website and html page with a list of articles that I needed to paginate. But even to render the very 1st page, I'd need to calculate #start_value somehow. The input from a user would be a number of a page which he clicked; for the very first page it'd be 1 - by default.
Given that 1, how would I calculate #start_value?
Or, given any arbitrary page, how would I calculate #start_value?
Note that the values of the column id of a table aren't necessarily sequential, even if id is autoincremented.

First off, pagination without any sorting is not ideal. You can't guarantee the order in which SQL returns your results without including an order by clause.
You will also need to know the page size. Given #page_num (starting at 1, as in your example) and #page_size, the number of rows to skip is #start_value = (#page_num - 1) * #page_size. Note that this is a row offset, not an id value, so gaps in id don't matter.
Here it is without the where clause and with limit/offset instead:
select *
from articles
order by id
limit #page_size
offset (#page_size * (#page_num - 1))

You don't need the "where id > ..." part. The right way of achieving this is the limit #page_size offset #offset construct. This way you don't have to worry about the gaps. To calculate the offset from the page number, multiply the page size by the page number (minus one, if pages are numbered from 1). Another important point is that you must order your rows if you want the same result every time. If you don't trust the IDs, you can order by date or another field. So:
select * from articles
order by date
limit #page_size
offset (#page_size * (#page_num-1))
Note: I used (#page_num-1) to start with a 0 offset for page 1.
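
As a minimal sketch of the offset arithmetic above, the following runs the same limit/offset query against an in-memory SQLite database through Python's sqlite3 module. The articles table contents, the title column, and the helper names page_offset and fetch_page are assumptions made for illustration; the ids are deliberately non-sequential to show that gaps do not matter.

```python
import sqlite3


def page_offset(page_num, page_size):
    """Rows to skip for a 1-based page number."""
    if page_num < 1 or page_size < 1:
        raise ValueError("page_num and page_size must be >= 1")
    return (page_num - 1) * page_size


def fetch_page(conn, page_num, page_size):
    """Return one page of articles, ordered by id so pages are stable."""
    return conn.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page_offset(page_num, page_size)),
    ).fetchall()


# Hypothetical sample data: ids with gaps, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(i, f"article {i}") for i in (1, 2, 5, 8, 9, 13, 21)],
)

first_page = fetch_page(conn, 1, 3)
second_page = fetch_page(conn, 2, 3)
```

Page 1 maps to offset 0, so the default page needs no special case.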

Related

Acquiring offset of some row in SQL

TL;DR: Is there a possibility to get OFFSET position of a particular, known row in SQL, considering some ORDER BY is applied?
So consider a schema like this (simplified):
CREATE TABLE "public"."painting" (
"uuid" uuid NOT NULL DEFAULT uuid_generate_v4(),
"name" varchar NOT NULL,
"score" int4 NOT NULL,
"approvedAt" timestamp,
PRIMARY KEY ("uuid")
);
Like
abc1,test1,10,10:00
abc2,test2,9,11:00
abc3,test3,8,8:00
abc4,test4,8,12:00
abc5,test5,6,7:00
I want to make a request sorted by score and limited with 3 items, and I should emphasize that multiple entities might have the same score.
Because of a dynamic nature of that table, while traversing through those items, sorted by score, some new item might appear somewhere in the list.
If I use SQL OFFSET statement, that means this new entity will shift all entities below to one row, so that the new selection will have an item, that was last on previous 3 items selection.
abc1,test1,10,10:00
abc2,test2,9,11:00
abc6,test6,8,15:00 (new item)
CURRENT OFFSET = 3
abc3,test3,8,8:00 (was in previous select)
abc4,test4,8,12:00
abc5,test5,6,7:00
To avoid that, instead of using OFFSET, I can remember the UUID of the item I fetched last, which would be abc3. On the next request, I can use its score to add an extra WHERE SCORE < 8 condition, but this will skip abc4, because it also has a score of 8.
If I use WHERE SCORE <= 8, this will again return abc3, which has already been traversed. I can't use another field in the WHERE clause, because this will affect the results. An additional ORDER BY won't help either.
It seems to me that it is a very common problem in database selection, yet I can't find one comprehensive answer.
So, my question then, if it's possible to do some kind of request like following:
SELECT * FROM "painting" WHERE "score" <= :score ORDER BY "score" DESC OFFSET %position of `abc3`% LIMIT 3
Or alternatively
SELECT OFFSET OF (`abc3`) FROM "painting" WHERE SCORE <= :score ORDER BY "score" DESC LIMIT 3
That will return 2 (because it's the second row with such score), then do
SELECT * FROM "painting" WHERE "score" <= :score ORDER BY "score" DESC OFFSET :offset LIMIT 3
where :score is the score of last received item and :offset is the result of SELECT OFFSET - 1
My own assumption is that we have to SELECT WHERE "score" = :score, and get offset position outside the SQL (or make a very complex SQL query). Though, if we have a lot of items with similar ORDER BY attribute, this helper request might end up being heavier than the data fetch itself.
Yet, I feel like that there's a much more clever SQL way of doing what I'm trying to do.
Good question. Accurate Backend Pagination requires the underlying data to use an ordering criteria with a set of columns that represent a UNIQUE key.
In your case your ordering criteria can be made unique by adding the column uuid to it. With that in mind you can increase the page size by 1 behind the scenes to 4. That 4th row won't be displayed but only used to retrieve the next page.
For example, you can get:
select *
from painting
order by -score, "approvedAt", uuid
limit 4
Now you would display the first three rows:
abc1,test1,10,10:00
abc2,test2,9,11:00
abc3,test3,8,8:00
The client app (most likely the UI) will remember -- not display -- the 4th row (the "key") to retrieve the next page:
abc4,test4,8,12:00
Then, to get the next page the query will add a WHERE clause with the "key" and take the form:
select *
from painting
where (-score, "approvedAt", uuid) >= (-8, '12:00', 'abc4')
order by -score, "approvedAt", uuid
limit 4
Because this query starts from the remembered key, rows inserted earlier in the sort order cannot shift the page: it begins with the original 4th row.
To get blazing fast data retrieval you could create the index:
create index ix1 on painting ((-score), "approvedAt", uuid);
See example at DB Fiddle.
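
The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the answerer's fiddle: it uses SQLite (via Python's sqlite3) instead of PostgreSQL, stores approvedAt as zero-padded text so that string comparison matches time order, and the helper name fetch_page is hypothetical. It relies on SQLite's row-value comparison, which needs SQLite 3.15 or later.

```python
import sqlite3

PAGE_SIZE = 3


def fetch_page(conn, key=None, page_size=PAGE_SIZE):
    """Return (rows_to_display, next_key); next_key is None on the last page.

    One extra row is fetched; it is not displayed, only remembered as the
    key (-score, approvedAt, uuid) at which the next page starts.
    """
    sql = "SELECT uuid, name, score, approvedAt FROM painting "
    params = []
    if key is not None:
        sql += "WHERE (-score, approvedAt, uuid) >= (?, ?, ?) "
        params.extend(key)
    sql += "ORDER BY -score, approvedAt, uuid LIMIT ?"
    params.append(page_size + 1)
    rows = conn.execute(sql, params).fetchall()
    extra = rows[page_size] if len(rows) > page_size else None
    next_key = (-extra[2], extra[3], extra[0]) if extra else None
    return rows[:page_size], next_key


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE painting (uuid TEXT PRIMARY KEY, name TEXT NOT NULL, "
    "score INTEGER NOT NULL, approvedAt TEXT)"
)
conn.executemany(
    "INSERT INTO painting VALUES (?, ?, ?, ?)",
    [
        ("abc1", "test1", 10, "10:00"),
        ("abc2", "test2", 9, "11:00"),
        ("abc3", "test3", 8, "08:00"),
        ("abc4", "test4", 8, "12:00"),
        ("abc5", "test5", 6, "07:00"),
    ],
)

page1, key = fetch_page(conn)
# A new row with score 8 arrives between the two requests.
conn.execute("INSERT INTO painting VALUES ('abc6', 'test6', 8, '15:00')")
page2, last_key = fetch_page(conn, key)
```

In this run the second page starts at abc4 and does not repeat abc3, even though a row with the same score was inserted in between.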

Wrapping a range of data

How would I select a rolling/wrapping* set of rows from a table?
I am trying to select a number of records (per type, 2 or 3) for each day, wrapping when I 'run out'.
Eg.
2018-03-15: YyBiz, ZzCo, AaPlace
2018-03-16: BbLocation, CcStreet, DdInc
These are rendered within a SSRS report for Dynamics CRM, so I can do light post-query operations.
Currently I get to:
2018-03-15: YyBiz, ZzCo
2018-03-16: AaPlace, BbLocation, CcStreet
First, getting a number for each record with:
SELECT name, ROW_NUMBER() OVER (PARTITION BY type ORDER BY name) as RN
FROM table
Within SSRS, I then adjust RN to reflect the number of each type I need:
OnPageNum = FLOOR((RN+num_of_type-1)/num_of_type)-1
--Shift RN to be 0-indexed.
Resulting in AaPlace, BbLocation and CcStreet having a PageNum of 0, DdInc of 1, ... YyBiz and ZzCo of 8.
Then using an SSRS Table/Matrix linked to the dataset, I set the row filter to something like:
RowFilter = MOD(DateNum, NumPages(type)) == OnPageNum
Where DateNum is essentially days since epoch, and each page has a separate table and day passed in.
At this point, it shows only N records of a type per page, but if the total number of records of a type isn't a multiple of the number of records per page for that type, there will be pages with fewer records than required.
Is there an easier way to approach this/what's the next step?
*Wrapping such as Wraparound found in videogames, seamless resetting to 0.
To achieve this effect, I found that offsetting the RowNumber by -DateNum*num_of_type (negative for positive ordering), then modulo COUNT(type) would provide the correct "wrap around" effect.
In order to achieve the desired pagination, it then just had to be divided by num_of_type and floor'd, as below:
RowFilter: FLOOR(((RN-DateNum*num_of_type) % count(type))/num_of_type) == 0
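
To see why this filter wraps, here is a small sketch of the same arithmetic outside SSRS. It assumes RN is 0-indexed and uses Python's % operator, which always returns a non-negative result for a positive divisor; the Mod operator in an SSRS expression may treat negative operands differently, so the expression there might need adjusting. The function name rows_for_day and the sample names are made up for illustration.

```python
def rows_for_day(names, date_num, per_page):
    """Pick per_page names for a given day, wrapping past the end of the list.

    names must already be in ROW_NUMBER order; rn is the 0-indexed position.
    """
    count = len(names)
    shifted = [
        ((rn - date_num * per_page) % count, name)
        for rn, name in enumerate(names)
    ]
    # Keep the rows whose shifted position falls on "page 0",
    # in shifted order so the wrap reads naturally.
    return [name for pos, name in sorted(shifted) if pos // per_page == 0]


names = ["Aa", "Bb", "Cc", "Dd", "Ee"]
day0 = rows_for_day(names, 0, 3)
day1 = rows_for_day(names, 1, 3)
day2 = rows_for_day(names, 2, 3)
```

Every day yields exactly per_page names as long as the list holds at least that many, which is the property the original PageNum approach lacked.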

SQLite: How to achieve RANDOM ORDER and pagination at the same time?

I have a table of movies, I want to be able to query the database and get a randomized list of movies, but also I don't want it to return all movies available so I'm using LIMIT and OFFSET. The problem is when I'm doing something like this:
SELECT * FROM Movie ORDER BY RANDOM() LIMIT 50 OFFSET 0
and then when querying for the next page with LIMIT 50 OFFSET 50 the RANDOM seed changes and so it's possible for rows from the first page to be included in the second page, which is not the desired behavior.
How can I achieve a random order and preserve it across pages? As far as I know, SQLite doesn't support a custom seed for its RANDOM function.
Thank you!
You can't preserve the order that RANDOM() produces from one query to the next. Instead, add another column to your table to store the random order:
ALTER TABLE Movie ADD COLUMN randomOrder INTEGER;
UPDATE Movie
SET randomOrder = RANDOM();
Then you can retrieve the pages:
SELECT *
FROM Movie
ORDER BY randomOrder
LIMIT 50 OFFSET 0
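
A minimal sketch of this approach, run against an in-memory SQLite database from Python. The Movie table layout, the 120 sample rows, and the helper name random_page are assumptions for illustration. The id column is added to the ORDER BY as a tie-breaker in the unlikely case that two rows receive the same random value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movie (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO Movie (title) VALUES (?)",
    [(f"movie {i}",) for i in range(120)],
)

# Store one random value per row; it stays fixed until the next UPDATE.
conn.execute("ALTER TABLE Movie ADD COLUMN randomOrder INTEGER")
conn.execute("UPDATE Movie SET randomOrder = RANDOM()")


def random_page(conn, page_num, page_size=50):
    """Return the ids on a 1-based page of the stored random order."""
    offset = (page_num - 1) * page_size
    rows = conn.execute(
        "SELECT id FROM Movie ORDER BY randomOrder, id LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return [r[0] for r in rows]


pages = [random_page(conn, n) for n in (1, 2, 3)]
```

Re-running the UPDATE reshuffles the whole table, so it should be done between browsing sessions rather than between pages.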

Using the Seek Method in SQL and jumping to non-adjacent pages

I'm implementing pagination using the SQL strategy called Seek method in a PostgreSql RDBMS.
All the examples that I see on the Internet explain how to get the next page (e.g. see this article). But I'm wondering how to implement the method to move from one page to another that is not adjacent (e.g. from page 1 to page 5) without using any offset.
Any example?
A standard seek method on a table with a serial identifier can be written as:
SELECT id FROM table_with_serial_id WHERE id > prev_page_last_id ORDER BY id ASC LIMIT page_size;
Starting with prev_page_last_id set to 0, we can progressively advance through the table by always using the last id from the previous page.
Therefore, to jump ahead by n pages rather than one, you could add n * page_size to prev_page_last_id.
Note that this only works if there are no gaps in the id column, because only then does a difference in id equal a row offset.
Unfortunately, when there are gaps, there is no way to predict where a distant page starts without going through each page in between, unless you accept the compromise of having pages with fewer than page_size rows.
Hope this helps!
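
The limitation described above can be demonstrated with a short sketch. It runs the seek query against two in-memory SQLite tables from Python, one with dense ids and one with gaps; the sample ids and the helper names make_table, seek_page, and walk_to_page are assumptions for illustration.

```python
import sqlite3


def make_table(ids):
    """Create an in-memory table holding only the given ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_with_serial_id (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO table_with_serial_id VALUES (?)", [(i,) for i in ids]
    )
    return conn


def seek_page(conn, prev_page_last_id, page_size):
    """The standard seek query: rows strictly after the last id seen."""
    rows = conn.execute(
        "SELECT id FROM table_with_serial_id WHERE id > ? "
        "ORDER BY id ASC LIMIT ?",
        (prev_page_last_id, page_size),
    )
    return [r[0] for r in rows]


def walk_to_page(conn, page_num, page_size):
    """Reach a 1-based page by seeking through every page before it."""
    last_id, page = 0, []
    for _ in range(page_num):
        page = seek_page(conn, last_id, page_size)
        if not page:
            break
        last_id = page[-1]
    return page


PAGE_SIZE = 3
dense = make_table(range(1, 11))
gappy = make_table([1, 2, 4, 7, 8, 9, 12, 13, 15, 16])

# Jump from page 1 to page 3 by skipping one page's worth of ids.
dense_jump = seek_page(dense, 3 + 1 * PAGE_SIZE, PAGE_SIZE)
gappy_jump = seek_page(gappy, 4 + 1 * PAGE_SIZE, PAGE_SIZE)
```

With dense ids the jump lands exactly on page 3; with gaps it lands on a window that straddles pages 2 and 3, which is why walking page by page (or falling back to an offset) is needed there.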
T-SQL:
DECLARE @row_per_page INT = 100
DECLARE @page_number INT = 2
SELECT * FROM
(SELECT ROW_NUMBER() OVER (ORDER BY [ID]) AS [RowNumber],*
FROM table_name) AS T
WHERE T.[RowNumber] > (@page_number-1)*@row_per_page AND T.[RowNumber] < @page_number*@row_per_page+1

Select Average of Top 25% of Values in SQL

I'm currently writing a stored procedure for my client to populate some tables that will be used to generate SSRS reports later on. Some of the data is based on specific stock formulas that are run on each of their clients' quarterly data (sent to them by their clients). The other part of the data is generated by comparing those results against those from other, similar sized clients. One of the things that they want tracked in their reports is the average of the top 25% of formula results for that particular comparison group.
To give a better picture of it, imagine the following fields that I have in a temp table:
FormulaID int
Value decimal (18,6)
I want to do the following: Given a specific FormulaID return the average of the top 25% of Value.
I know how to take an average in SQL, but I don't know how to do it against only the top 25% of a specific group.
How would I write this query?
I guess you can do something like this...
SELECT AVG(Q.ColA) Avg25Prec
FROM (
SELECT TOP 25 Percent ColA
FROM Table_Name
ORDER BY ColA DESC
) Q
Here's what I did, given the table shown above:
select AVG(t.Value)
from (select top 25 percent Value
from #TempGroupTable
where FormulaID = @PassedInFormulaID
order by Value desc) as t
The desc must be there, because top 25 percent does not compare values itself. It simply takes the first x rows of the ordered result, where x is 25% of the number of rows being queried. With order by Value desc, those rows are the ones with the highest Value, and they are then passed on to be averaged.
As a side note to all of this, this also means that if you wanted to grab the bottom 25% instead, or if your formula results are like a golf score (i.e. lowest is the best), all you would need to do is remove the desc part and you would be good to go.
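
For checking the stored procedure's output by hand, the same calculation can be sketched outside the database. This is an illustration only: the function name top_percent_average and the sample values are made up, and the row count is rounded up on the assumption that this matches how SQL Server's TOP ... PERCENT treats fractional row counts.

```python
import math


def top_percent_average(values, percent=25, highest_first=True):
    """Average of the top `percent` of values, or None for an empty list.

    highest_first=True mirrors ORDER BY Value DESC; False mirrors ASC
    (the "golf score" case, where the lowest values are the best).
    """
    if not values:
        return None
    ordered = sorted(values, reverse=highest_first)
    # Assumption: fractional row counts are rounded up.
    count = math.ceil(len(ordered) * percent / 100)
    top = ordered[:count]
    return sum(top) / len(top)


values = [10, 40, 20, 30, 50, 60, 70, 80]
best_quarter = top_percent_average(values)
lowest_quarter = top_percent_average(values, highest_first=False)
```

With eight values, 25% is two rows, so the result is the mean of the two highest (or two lowest) values.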