SQL Server Pagination with different line number per page - sql

I have a table in a SQL Server database containing:
an int value (column name: Value)
a datetime value (column name: Date)
a bit value (column name: LastLineOfPage)
I would like to write a pagination query over this table. The pagination logic is the following:
The query must return the lines of a given page (parameter @PageNumber), after sorting the lines by the Date column
The query must also return the SUM of all the lines on the previous pages
The number of lines per page is not fixed: by default it is 14 lines per page, but if the bit LastLineOfPage is true, the page ends at that line
Here is the data in text:
ID DATE VALUE LASTLINEOFPAGE
1 07/10/2006 10 0
2 14/10/2006 12 0
3 21/10/2006 4 1
4 28/10/2006 6 0
5 04/11/2006 8 1
6 25/11/2006 125 0
7 02/12/2006 1 0
8 09/12/2006 5 0
9 16/12/2006 45 0
10 30/12/2006 1 1
So the query receives @PageNumber and also @DefaultLineNumberPerPage (which will be equal to 14, but that may change one day).
Could you help me design this query or SQL function?
Thanks !

Sample data
I added a few rows to illustrate how it works when a page has more rows than @DefaultLineNumberPerPage. In this example I use @DefaultLineNumberPerPage = 5, and you'll see how extra pages are generated.
DECLARE @T TABLE (ID int, dt date, VALUE int, LASTLINEOFPAGE bit);
INSERT INTO @T(ID, dt, VALUE, LASTLINEOFPAGE) VALUES
(1 , '2006-10-07', 10 , 0),
(2 , '2006-10-14', 12 , 0),
(3 , '2006-10-21', 4 , 1),
(4 , '2006-10-28', 6 , 0),
(5 , '2006-11-04', 8 , 1),
(6 , '2006-11-25', 125, 0),
(7 , '2006-12-02', 1 , 0),
(8 , '2006-12-09', 5 , 0),
(9 , '2006-12-16', 45 , 0),
(10, '2006-12-30', 1 , 1),
(16, '2007-01-25', 125, 0),
(17, '2007-02-02', 1 , 0),
(18, '2007-02-09', 5 , 0),
(19, '2007-02-16', 45 , 0),
(20, '2007-02-20', 1 , 0),
(26, '2007-02-25', 125, 0),
(27, '2007-03-02', 1 , 0),
(28, '2007-03-09', 5 , 0),
(29, '2007-03-10', 5 , 0),
(30, '2007-03-11', 5 , 0),
(31, '2007-03-12', 5 , 0),
(32, '2007-03-13', 5 , 1),
(41, '2007-10-07', 10 , 0),
(42, '2007-10-14', 12 , 0),
(43, '2007-10-21', 4 , 1);
Query
Run it step-by-step, CTE-by-CTE and examine intermediate results to understand what it does.
CTE_FirstLines sets the FirstLineOfPage flag to 1 for the first line of the page instead of the last.
CTE_SimplePages uses a cumulative SUM to calculate the simple page numbers based on FirstLineOfPage page breaks.
CTE_ExtraPages uses ROW_NUMBER divided by @DefaultLineNumberPerPage to calculate extra page numbers if there is a page that has more than @DefaultLineNumberPerPage rows.
CTE_CompositePages combines simple page numbers with extra page numbers to make a single composite page "Number". It assumes that ExtraPageNumber stays below 1000, which holds whenever there are fewer than 1000 rows between original LASTLINEOFPAGE flags. If longer sequences of rows are possible, increase the 1000 constant and consider using the bigint type for the CompositePageNumber column.
CTE_FinalPages uses DENSE_RANK to assign sequential numbers without gaps for each final page.
DECLARE @DefaultLineNumberPerPage int = 5;
DECLARE @PageNumber int = 3;
WITH
CTE_FirstLines
AS
(
    SELECT
        ID, dt, VALUE, LASTLINEOFPAGE
        -- a row starts a page when the previous row (by dt) ended one;
        -- the very first row has no previous row, so ISNULL makes it a first line
        ,CAST(ISNULL(LAG(LASTLINEOFPAGE)
            OVER (ORDER BY dt), 1) AS int) AS FirstLineOfPage
    FROM @T
)
,CTE_SimplePages
AS
(
    SELECT
        ID, dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage
        -- running count of page starts = page number from the explicit breaks only
        ,SUM(FirstLineOfPage) OVER(ORDER BY dt
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SimplePageNumber
    FROM CTE_FirstLines
)
,CTE_ExtraPages
AS
(
    SELECT
        ID, dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber
        -- integer division splits an oversized page into chunks of the default size
        ,(ROW_NUMBER() OVER(PARTITION BY SimplePageNumber ORDER BY dt) - 1)
            / @DefaultLineNumberPerPage AS ExtraPageNumber
    FROM CTE_SimplePages
)
,CTE_CompositePages
AS
(
    SELECT
        ID, dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
        ,SimplePageNumber * 1000 + ExtraPageNumber AS CompositePageNumber
    FROM CTE_ExtraPages
)
,CTE_FinalPages
AS
(
    SELECT
        ID, dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
        ,CompositePageNumber
        -- renumber the composite values as 1, 2, 3, ... without gaps
        ,DENSE_RANK() OVER(ORDER BY CompositePageNumber) AS FinalPageNumber
    FROM CTE_CompositePages
)
,CTE_Sum
AS
(
    SELECT
        ID, dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
        ,CompositePageNumber
        ,FinalPageNumber
        -- running total of VALUE across all pages
        ,SUM(VALUE) OVER(ORDER BY FinalPageNumber, dt
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SumCumulative
    FROM CTE_FinalPages
)
SELECT
    ID, dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
    ,CompositePageNumber
    ,FinalPageNumber
    ,SumCumulative
FROM CTE_Sum
-- WHERE FinalPageNumber = @PageNumber
ORDER BY dt
;
Result with the final WHERE filter commented out
Here is the full result with all intermediate columns to illustrate how the query works.
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| ID | dt | VALUE | Lst | Fst | Simple | Extra | Composite | Final | TotalValue |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| 1 | 2006-10-07 | 10 | 0 | 1 | 1 | 0 | 1000 | 1 | 10 |
| 2 | 2006-10-14 | 12 | 0 | 0 | 1 | 0 | 1000 | 1 | 22 |
| 3 | 2006-10-21 | 4 | 1 | 0 | 1 | 0 | 1000 | 1 | 26 |
| 4 | 2006-10-28 | 6 | 0 | 1 | 2 | 0 | 2000 | 2 | 32 |
| 5 | 2006-11-04 | 8 | 1 | 0 | 2 | 0 | 2000 | 2 | 40 |
| 6 | 2006-11-25 | 125 | 0 | 1 | 3 | 0 | 3000 | 3 | 165 |
| 7 | 2006-12-02 | 1 | 0 | 0 | 3 | 0 | 3000 | 3 | 166 |
| 8 | 2006-12-09 | 5 | 0 | 0 | 3 | 0 | 3000 | 3 | 171 |
| 9 | 2006-12-16 | 45 | 0 | 0 | 3 | 0 | 3000 | 3 | 216 |
| 10 | 2006-12-30 | 1 | 1 | 0 | 3 | 0 | 3000 | 3 | 217 |
| 16 | 2007-01-25 | 125 | 0 | 1 | 4 | 0 | 4000 | 4 | 342 |
| 17 | 2007-02-02 | 1 | 0 | 0 | 4 | 0 | 4000 | 4 | 343 |
| 18 | 2007-02-09 | 5 | 0 | 0 | 4 | 0 | 4000 | 4 | 348 |
| 19 | 2007-02-16 | 45 | 0 | 0 | 4 | 0 | 4000 | 4 | 393 |
| 20 | 2007-02-20 | 1 | 0 | 0 | 4 | 0 | 4000 | 4 | 394 |
| 26 | 2007-02-25 | 125 | 0 | 0 | 4 | 1 | 4001 | 5 | 519 |
| 27 | 2007-03-02 | 1 | 0 | 0 | 4 | 1 | 4001 | 5 | 520 |
| 28 | 2007-03-09 | 5 | 0 | 0 | 4 | 1 | 4001 | 5 | 525 |
| 29 | 2007-03-10 | 5 | 0 | 0 | 4 | 1 | 4001 | 5 | 530 |
| 30 | 2007-03-11 | 5 | 0 | 0 | 4 | 1 | 4001 | 5 | 535 |
| 31 | 2007-03-12 | 5 | 0 | 0 | 4 | 2 | 4002 | 6 | 540 |
| 32 | 2007-03-13 | 5 | 1 | 0 | 4 | 2 | 4002 | 6 | 545 |
| 41 | 2007-10-07 | 10 | 0 | 1 | 5 | 0 | 5000 | 7 | 555 |
| 42 | 2007-10-14 | 12 | 0 | 0 | 5 | 0 | 5000 | 7 | 567 |
| 43 | 2007-10-21 | 4 | 1 | 0 | 5 | 0 | 5000 | 7 | 571 |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
To get only one given page uncomment the WHERE filter in the final SELECT.
Result with the final WHERE filter
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| ID | dt | VALUE | Lst | Fst | Simple | Extra | Composite | Final | TotalValue |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| 6 | 2006-11-25 | 125 | 0 | 1 | 3 | 0 | 3000 | 3 | 165 |
| 7 | 2006-12-02 | 1 | 0 | 0 | 3 | 0 | 3000 | 3 | 166 |
| 8 | 2006-12-09 | 5 | 0 | 0 | 3 | 0 | 3000 | 3 | 171 |
| 9 | 2006-12-16 | 45 | 0 | 0 | 3 | 0 | 3000 | 3 | 216 |
| 10 | 2006-12-30 | 1 | 1 | 0 | 3 | 0 | 3000 | 3 | 217 |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
The TotalValue column (SumCumulative in the query) in the last row gives the running total that you want to show at the bottom of the page. If you sum all values on this page (125+1+5+45+1 = 177) and subtract that from the last TotalValue (217-177 = 40), you get the total of the previous pages that you want to show at the top of the page. These calculations are better done on the client.
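To make the paging rule and the two totals concrete, here is a minimal Python sketch of the same logic outside the database. It is only an illustration, not part of the answer's query: the function names paginate and page_with_totals and the tuple layout (id, value, last_line_of_page) are assumptions, and the rows are assumed to be already sorted by date.

```python
def paginate(rows, default_lines_per_page):
    """Split rows (already sorted by date) into pages.

    Each row is a hypothetical (id, value, last_line_of_page) tuple.
    A page closes after a flagged row or once it holds the default number of lines.
    """
    pages, current = [], []
    for row in rows:
        current.append(row)
        if row[2] or len(current) == default_lines_per_page:
            pages.append(current)
            current = []
    if current:  # trailing rows without a closing flag
        pages.append(current)
    return pages


def page_with_totals(rows, default_lines_per_page, page_number):
    """Return (page rows, total of previous pages, total up to the end of this page)."""
    pages = paginate(rows, default_lines_per_page)
    previous_total = sum(r[1] for page in pages[:page_number - 1] for r in page)
    page = pages[page_number - 1]
    return page, previous_total, previous_total + sum(r[1] for r in page)


# Same sample data as above: (ID, VALUE, LASTLINEOFPAGE), in date order.
rows = [
    (1, 10, 0), (2, 12, 0), (3, 4, 1), (4, 6, 0), (5, 8, 1),
    (6, 125, 0), (7, 1, 0), (8, 5, 0), (9, 45, 0), (10, 1, 1),
    (16, 125, 0), (17, 1, 0), (18, 5, 0), (19, 45, 0), (20, 1, 0),
    (26, 125, 0), (27, 1, 0), (28, 5, 0), (29, 5, 0), (30, 5, 0),
    (31, 5, 0), (32, 5, 1), (41, 10, 0), (42, 12, 0), (43, 4, 1),
]

page, previous_total, page_total = page_with_totals(rows, 5, 3)
print([r[0] for r in page], previous_total, page_total)  # [6, 7, 8, 9, 10] 40 217
```

For page 3 this reproduces the 40 carried over from the previous pages and the 217 at the bottom of the page.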

I have a partial solution. It doesn't take the default page size into account yet, but it can give you an idea, so let me know what you think. I hope you are familiar with CTEs. Test each step so you can see the partial results.
WITH cte AS (
    SELECT [ID], [DATE], [VALUE], [LASTLINEOFPAGE],
           SUM([VALUE]) OVER (ORDER BY [ID]) AS Total,
           -- SUM does not accept the bit type, so cast the flag to int first
           SUM(CAST([LASTLINEOFPAGE] AS int)) OVER (ORDER BY [ID]) AS page_group
    FROM Table1
),
pages AS (
    SELECT c1.[ID], c1.[Total],
           -- the flagged row itself belongs to the page it closes
           CASE WHEN c1.[ID] = 1 THEN 0
                WHEN c1.[ID] = m.[minID] THEN c1.[page_group] - 1
                ELSE c1.[page_group]
           END AS [page_group]
    FROM cte AS c1
    JOIN (SELECT [page_group], MIN([ID]) AS minID
          FROM cte
          GROUP BY [page_group]) m
      ON c1.[page_group] = m.[page_group]
)
SELECT c.[ID], c.[DATE], c.[VALUE], c.[LASTLINEOFPAGE],
       (SELECT MAX([Total])
        FROM pages p2
        WHERE p2.[page_group] = p.[page_group]) AS [Total],
       p.[page_group]
FROM cte c
JOIN pages p
  ON c.[ID] = p.[ID]
As you can see, the total and the page are in the additional columns, and you shouldn't display those in your app.

Related

How to sum 2 columns and add it with the previous summed columns in sql?

I have a table with these rows:
+------+--------+---------+---------+
| ID | Date | Amount1 | Amount2 |
+------+--------+---------+---------+
| 1 | 13 Nov | 8 | 3 |
| 2 | 11 Nov | 5 | 1 |
| 3 | 15 Nov | 0 | 3 |
| 4 | 18 Nov | 5 | 7 |
| 5 | 20 Nov | 10 | 0 |
+------+--------+---------+---------+
I would like a query that returns this result, using the formula
Total = (Amount1 - Amount2) + Previous Row's Total
+------+--------+---------+---------+---------+
| ID | Date | Plus | Minus | Total |
+------+--------+---------+---------+---------+
| 2 | 11 Nov | 5 | 1 | 4 |
| 1 | 13 Nov | 8 | 3 | 9 |
| 3 | 15 Nov | 0 | 3 | 6 |
| 4 | 18 Nov | 5 | 7 | 4 |
| 5 | 20 Nov | 10 | 0 | 14 |
+------+--------+---------+---------+---------+
Is there any way to query this without binding the Total to a column on temporary table?
To get a running total, you can use SUM(columnname) OVER (ORDER BY sortedcolumnname).
To me it's actually a little counterintuitive compared to most windowed functions, as it doesn't have a partition but produces different results over the set of rows. However, it does work.
Here is some somewhat-obfuscated documentation from Microsoft about it.
I think you can therefore use
SELECT mt.[ID],
mt.[Date],
mt.[Amount1] AS [Plus],
mt.[Amount2] AS [Minus],
SUM(mt.[Amount1] - mt.[Amount2]) OVER (ORDER BY mt.[Date], mt.[ID]) AS Total
FROM mytable mt
ORDER BY mt.[Date],
mt.[ID];
And here are the results - they match yours.
ID Date Plus Minus Total
2 2020-11-11 5 1 4
1 2020-11-13 8 3 9
3 2020-11-15 0 3 6
4 2020-11-18 5 7 4
5 2020-11-20 10 0 14
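As a cross-check of what the ordered window SUM computes, here is a small Python sketch of the same running total using itertools.accumulate. The tuple layout and the ISO date strings are assumptions made for the illustration.

```python
from itertools import accumulate

# (id, date, amount1, amount2); dates written as ISO strings so they sort correctly
rows = [
    (1, "2020-11-13", 8, 3),
    (2, "2020-11-11", 5, 1),
    (3, "2020-11-15", 0, 3),
    (4, "2020-11-18", 5, 7),
    (5, "2020-11-20", 10, 0),
]

# Same ordering as ORDER BY [Date], [ID]
ordered = sorted(rows, key=lambda r: (r[1], r[0]))

# Running sum of (amount1 - amount2), like SUM(...) OVER (ORDER BY ...)
totals = list(accumulate(a1 - a2 for _, _, a1, a2 in ordered))

for (row_id, date, plus, minus), total in zip(ordered, totals):
    print(row_id, date, plus, minus, total)
```

Each total is the previous total plus the current row's difference, which is exactly the formula in the question.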
You can achieve this using a CTE followed by a self join. Note that amount1 - amount2 for id=3 is 0 - 3 = -3, so the result below differs from yours starting at id=3: each row adds only the immediately preceding row's difference, not the accumulated total.
DECLARE @t table(id int, dateval date, amount1 int, amount2 int)
INSERT INTO @t
values
(1 ,'2020-11-13', 8, 3),
(2 ,'2020-11-11', 5, 1),
(3 ,'2020-11-15', 0, 3),
(4 ,'2020-11-18', 5, 7),
(5 ,'2020-11-20',10, 0);
;WITH CTE_First AS
(
    SELECT id, dateval, amount1 as plus, amount2 as minus, (amount1-amount2) as total,
           ROW_NUMBER() OVER (ORDER BY dateval) as rnk
    FROM @t
)
SELECT c.ID, c.DATEVAL, c.plus, c.minus, c.total + isnull(c1.total,0) as new_total
FROM CTE_First AS c
left outer join CTE_First AS C1
    on C1.rnk = c.rnk - 1
+----+------------+------+-------+-----------+
| ID | DATEVAL | plus | minus | new_total |
+----+------------+------+-------+-----------+
| 2 | 2020-11-11 | 5 | 1 | 4 |
| 1 | 2020-11-13 | 8 | 3 | 9 |
| 3 | 2020-11-15 | 0 | 3 | 2 |
| 4 | 2020-11-18 | 5 | 7 | -5 |
| 5 | 2020-11-20 | 10 | 0 | 8 |
+----+------------+------+-------+-----------+

Running Count by Group and Flag in BigQuery?

I have a table that looks like the below:
Row | Fullvisitorid | Visitid | New_Session_Flag
1 | A | 111 | 1
2 | A | 120 | 0
3 | A | 128 | 0
4 | A | 133 | 0
5 | A | 745 | 1
6 | A | 777 | 0
7 | B | 388 | 1
8 | B | 401 | 0
9 | B | 420 | 0
10 | B | 777 | 1
11 | B | 784 | 0
12 | B | 791 | 0
13 | B | 900 | 1
14 | B | 904 | 0
What I want to do is if it's the first row for a fullvisitorid then mark the field as 1, otherwise use the above row as the value, but if the new_session_flag = 1 then use the above row plus 1, example of output I'm looking for below:
Row | Fullvisitorid | Visitid | New_Session_Flag | Rank_Session_Order
1 | A | 111 | 1 | 1
2 | A | 120 | 0 | 1
3 | A | 128 | 0 | 1
4 | A | 133 | 0 | 1
5 | A | 745 | 1 | 2
6 | A | 777 | 0 | 2
7 | B | 388 | 1 | 1
8 | B | 401 | 0 | 1
9 | B | 420 | 0 | 1
10 | B | 777 | 1 | 2
11 | B | 784 | 0 | 2
12 | B | 791 | 0 | 2
13 | B | 900 | 1 | 3
14 | B | 904 | 0 | 3
As you can see:
Row 1 is 1 because it's the first time fullvisitorid A appears
Row 2 is 1 because it's not the first time fullvisitorid A appears and new_session_flag <> 1 therefore it uses the above row (i.e. 1)
Row 5 is 2 because it's not the first time fullvisitorid A appears and new_session_Flag = 1 therefore it uses the above row (i.e 1) plus 1
Row 7 is 1 because it's the first time fullvisitorid B appears
etc.
I believe this can be done through a retain statement in SAS, but is there an equivalent in Google BigQuery?
Hopefully the above makes sense, let me know if not.
Thanks in advance
Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
COUNTIF(New_Session_Flag = 1) OVER(PARTITION BY Fullvisitorid ORDER BY Visitid) Rank_Session_Order
FROM `project.dataset.table`
The answer by Mikhail Berlyant using a conditional window count is correct and works. I am answering because I find that a window sum is even simpler (and possibly more efficient on a large dataset):
select
    t.*,
    sum(new_session_flag) over(partition by fullvisitorid order by visitid) rank_session_order
from mytable t
This works because new_session_flag contains only 0s and 1s, so counting the 1s is equivalent to summing all values.
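The per-visitor running sum of the flag can be sketched in Python as follows. This is only an illustration of the idea; the function name rank_sessions and the tuple layout (fullvisitorid, visitid, new_session_flag) are assumptions.

```python
from itertools import accumulate, groupby


def rank_sessions(rows):
    """Append a running sum of the 0/1 flag, restarted for each visitor.

    rows: hypothetical (fullvisitorid, visitid, new_session_flag) tuples.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    result = []
    # groupby plays the role of PARTITION BY, accumulate the role of the ordered SUM
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        for row, rank in zip(group, accumulate(r[2] for r in group)):
            result.append(row + (rank,))
    return result


rows = [
    ("A", 111, 1), ("A", 120, 0), ("A", 128, 0), ("A", 133, 0),
    ("A", 745, 1), ("A", 777, 0),
    ("B", 388, 1), ("B", 401, 0), ("B", 420, 0), ("B", 777, 1),
    ("B", 784, 0), ("B", 791, 0), ("B", 900, 1), ("B", 904, 0),
]

for row in rank_sessions(rows):
    print(row)
```

Because the first row of each visitor carries flag 1 in the sample data, the running sum starts at 1 for every visitor, as the question requires.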

SQL how to force to display row with 0 if no data available?

My table returns results as following (skips row if HourOfDay does not have data for particular ID)
ID HourOfDay Counts
--------------------------
1 5 5
1 13 10
1 23 3
..........................HourOfDay up till 23
2 9 1
and so on.
What I am trying to achieve is to force showing rows displaying 0 for HoursOfDay, which don't have data, like following:
ID HourOfDay Counts
--------------------------
1 0 0
1 1 0
1 2 0
1......................
1 5 5
1 6 0
1......................
1 23 3
2 0 0
2 1 0
etc.
I have researched this. It looks like I can achieve this result if I create an extra table and outer join it. So I have created a table variable in the SP (as a temporary workaround)
DECLARE @Hours TABLE
(
    [Hour] INT NULL
);
INSERT INTO @Hours VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)
,(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);
However, no matter how I join it, it does not achieve desired result.
How do I proceed? Do I add extra columns to join on? Completely different approach? Any hint in the right direction is appreciated!
Using a derived table for the distinct Ids cross joined to @Hours, left joined to your table:
select
    i.Id
  , h.Hour
  , coalesce(t.Counts,0) as Counts
from (select distinct Id from t) as i
  cross join @Hours as h
  left join t
    on i.Id = t.Id
   and h.Hour = t.HourOfDay
rextester demo: http://rextester.com/XFZYX88502
returns:
+----+------+--------+
| Id | Hour | Counts |
+----+------+--------+
| 1 | 0 | 0 |
| 1 | 1 | 0 |
| 1 | 2 | 0 |
| 1 | 3 | 0 |
| 1 | 4 | 0 |
| 1 | 5 | 5 |
| 1 | 6 | 0 |
| 1 | 7 | 0 |
| 1 | 8 | 0 |
| 1 | 9 | 0 |
| 1 | 10 | 0 |
| 1 | 11 | 0 |
| 1 | 12 | 0 |
| 1 | 13 | 10 |
| 1 | 14 | 0 |
| 1 | 15 | 0 |
| 1 | 16 | 0 |
| 1 | 17 | 0 |
| 1 | 18 | 0 |
| 1 | 19 | 0 |
| 1 | 20 | 0 |
| 1 | 21 | 0 |
| 1 | 22 | 0 |
| 1 | 23 | 3 |
| 2 | 0 | 0 |
| 2 | 1 | 0 |
| 2 | 2 | 0 |
| 2 | 3 | 0 |
| 2 | 4 | 0 |
| 2 | 5 | 0 |
| 2 | 6 | 0 |
| 2 | 7 | 0 |
| 2 | 8 | 0 |
| 2 | 9 | 1 |
| 2 | 10 | 0 |
| 2 | 11 | 0 |
| 2 | 12 | 0 |
| 2 | 13 | 0 |
| 2 | 14 | 0 |
| 2 | 15 | 0 |
| 2 | 16 | 0 |
| 2 | 17 | 0 |
| 2 | 18 | 0 |
| 2 | 19 | 0 |
| 2 | 20 | 0 |
| 2 | 21 | 0 |
| 2 | 22 | 0 |
| 2 | 23 | 0 |
+----+------+--------+
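The same idea (build every Id and hour combination first, then look up the count and default to 0) can be sketched in Python. The dictionary layout keyed by (id, hour) is an assumption made for the illustration.

```python
# Existing rows, keyed by (id, hour_of_day) -> counts
counts = {(1, 5): 5, (1, 13): 10, (1, 23): 3, (2, 9): 1}

# Distinct ids, like the derived table (select distinct Id from t)
ids = sorted({row_id for row_id, _ in counts})

# Cross join ids with hours 0..23, then "left join" back with a default of 0
filled = [
    (row_id, hour, counts.get((row_id, hour), 0))
    for row_id in ids
    for hour in range(24)
]

for row in filled:
    print(row)
```

The cross join is what guarantees 24 rows per Id; the lookup with a default only fills in the values.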

Postgresql change value based on the change of another field

I have a Postgres table like this:
id | value
----+-------
1 | 100
2 | 100
3 | 100
4 | 100
5 | 200
6 | 200
7 | 200
8 | 100
9 | 100
10 | 300
I'd like to have a table like this
id | value |new_id
----+---------+-----
1 | 100 | 1
2 | 100 | 1
3 | 100 | 1
4 | 100 | 1
5 | 200 | 2
6 | 200 | 2
7 | 200 | 2
8 | 100 | 3
9 | 100 | 3
10 | 300 | 4
I'd like a new field, new_id, that changes when value changes and stays the same until value changes again.
My question is similar to this one, but I could not find a solution.
You can identify sequences where the value is the same by using a difference of row_number(). After getting the difference, you have a group identifier and can calculate the minimum id for each group. Then, dense_rank() will renumber the values based on this ordering.
It looks like this:
select t.id, t.value, dense_rank() over (order by minid) as new_id
from (select t.*, min(id) over (partition by value, grp) as minid
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by value order by id)
) as grp
from table t
) t
) t
You can see what happens to your sample data:
id | value | grp | minid | new_id |
----+-------+-----+-------+--------+
1 | 100 | 0 | 1 | 1 |
2 | 100 | 0 | 1 | 1 |
3 | 100 | 0 | 1 | 1 |
4 | 100 | 0 | 1 | 1 |
5 | 200 | 4 | 5 | 2 |
6 | 200 | 4 | 5 | 2 |
7 | 200 | 4 | 5 | 2 |
8 | 100 | 3 | 8 | 3 |
9 | 100 | 3 | 8 | 3 |
10 | 300 | 9 | 10 | 4 |
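The three steps (difference of row numbers, minimum id per group, dense rank) can be traced in Python as below. This is a sketch for understanding only; the function name assign_new_ids is hypothetical and the rows are assumed to be sorted by id.

```python
from collections import defaultdict


def assign_new_ids(rows):
    """rows: (id, value) tuples sorted by id. Returns (id, value, grp, minid, new_id)."""
    seen_per_value = defaultdict(int)
    keyed = []
    for overall_rn, (row_id, value) in enumerate(rows, start=1):
        seen_per_value[value] += 1
        # row_number() over all rows minus row_number() within the same value
        keyed.append((row_id, value, overall_rn - seen_per_value[value]))

    # min(id) over (partition by value, grp)
    min_id = {}
    for row_id, value, grp in keyed:
        key = (value, grp)
        min_id[key] = min(min_id.get(key, row_id), row_id)

    # dense_rank() over (order by minid)
    rank = {m: i for i, m in enumerate(sorted(set(min_id.values())), start=1)}

    return [
        (row_id, value, grp, min_id[(value, grp)], rank[min_id[(value, grp)]])
        for row_id, value, grp in keyed
    ]


values = [100, 100, 100, 100, 200, 200, 200, 100, 100, 300]
rows = list(enumerate(values, start=1))
for row in assign_new_ids(rows):
    print(row)
```

Note that grp alone is not ordered (4, then 3, then 9), which is why the minimum id per group is needed before ranking.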

SQL report show result in one line of group

I am trying to get the following result:
ID | Part | QTY| Boxes| Reference
1 | ABC123 | 20 | 0 | REF0001
2 | ABC345 | 10 | 0 | REF0001
3 | ABC487 | 5 | 1 | REF0001
4 | SEF453 | 4 | 0 | REF0002
5 | ABDS12 | 82 | 4 | REF0002
6 | EFR488 | 64 | 0 | REF0003
7 | XCV345 | 58 | 0 | REF0003
8 | SSFS33 | 23 | 3 | REF0003
Right now I get
ID | Part | QTY| Boxes| Reference
1 | ABC123 | 20 | 1 | REF0001
2 | ABC345 | 10 | 1 | REF0001
3 | ABC487 | 5 | 1 | REF0001
4 | SEF453 | 4 | 4 | REF0002
5 | ABDS12 | 82 | 4 | REF0002
6 | EFR488 | 64 | 3 | REF0003
7 | XCV345 | 58 | 3 | REF0003
8 | SSFS33 | 23 | 3 | REF0003
As you can see, the quantity of boxes per reference repeats on each row, and I need it to appear only once per reference.
Well, here is one way . . .
with t as (<your current query>)
select ID, Part, QTY,
max(Boxes) over (partition by Reference) as Boxes,
Reference
from t
Assigning row numbers grouped per each reference will mark highest ID sharing the same reference as 1; main query checks this mark and outputs zero if it is not satisfied.
; with q as
(
select *,
row_number() over (partition by Reference
order by ID desc) rn
from
(
your-query-here
) a
)
select q.ID,
q.Part,
q.QTY,
case when rn = 1 then q.Boxes else 0 end as Boxes,
q.Reference
from q
order by q.ID
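The effect of the row_number approach (keep Boxes only on the row with the highest ID of each reference, zero elsewhere) can be sketched in Python as follows. The tuple layout (id, part, qty, boxes, reference) is an assumption; the data is the "right now" output from the question.

```python
rows = [
    (1, "ABC123", 20, 1, "REF0001"),
    (2, "ABC345", 10, 1, "REF0001"),
    (3, "ABC487", 5, 1, "REF0001"),
    (4, "SEF453", 4, 4, "REF0002"),
    (5, "ABDS12", 82, 4, "REF0002"),
    (6, "EFR488", 64, 3, "REF0003"),
    (7, "XCV345", 58, 3, "REF0003"),
    (8, "SSFS33", 23, 3, "REF0003"),
]

# Highest ID per reference: the row that gets rn = 1 with ORDER BY ID DESC
last_id = {}
for row_id, _, _, _, reference in rows:
    last_id[reference] = max(last_id.get(reference, row_id), row_id)

# Keep Boxes on that row only, output 0 on every other row of the reference
result = [
    (row_id, part, qty, boxes if row_id == last_id[reference] else 0, reference)
    for row_id, part, qty, boxes, reference in rows
]

for row in result:
    print(row)
```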