Possible drawbacks of my pagination technique and how can I improve it? - sql

I want to paginate the results on my web page. The method I am using (which is also what I mostly found on the internet) is explained below with an example.
Suppose I have the following table user
+----+------+----------+
| id | name | category |
+----+------+----------+
| 1 | a | 1 |
| 2 | b | 2 |
| 3 | c | 2 |
| 4 | d | 3 |
| 5 | e | 1 |
| 6 | f | 3 |
| 7 | g | 1 |
| 8 | h | 3 |
| 9 | i | 2 |
| 10 | j | 2 |
| 11 | k | 1 |
| 12 | l | 3 |
| 13 | m | 3 |
| 14 | n | 3 |
| 15 | o | 1 |
| 16 | p | 1 |
| 17 | q | 2 |
| 18 | r | 1 |
| 19 | s | 3 |
| 20 | t | 3 |
| 21 | u | 3 |
| 22 | v | 3 |
| 23 | w | 1 |
| 24 | x | 1 |
| 25 | y | 2 |
| 26 | z | 2 |
+----+------+----------+
I want to show information about category 3 users, with 2 users per page. I am using the following query for this:
select * from user where category=3 limit 0,2;
+----+------+----------+
| id | name | category |
+----+------+----------+
| 4 | d | 3 |
| 6 | f | 3 |
+----+------+----------+
and for next two
select * from user where category=3 limit 2,2;
+----+------+----------+
| id | name | category |
+----+------+----------+
| 8 | h | 3 |
| 12 | l | 3 |
+----+------+----------+
and so on.
In practice I have around 7000 tuples in a single table. Is there a faster way to achieve this, and does this method have any drawbacks?
Thanks.

You don't want to fetch more values than your current page can handle, so yes, you will essentially be making one query per page. Some other solutions (such as Rails will_paginate) will execute essentially the same queries.
Now, you could build some logic into your client side to do the pagination there - prefetch multiple (or all) pages at once and store them on the client side. This way pagination is handled completely on the client side without need for further queries. It is a bit wasteful if a user is likely to only look at a small percentage of pages overall though.
If your actual production table has more columns, select only the relevant columns instead of *. You should also add an order by: without one, the database does not guarantee the row order, so consecutive pages may overlap or skip rows.
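The advice above can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for the asker's database (an assumption; the question appears to use MySQL, where `limit 2,2` means the same as `LIMIT 2 OFFSET 2`). The helper name `fetch_page` is hypothetical.

```python
import sqlite3

def fetch_page(conn, category, page, per_page):
    """Return one page (1-based) of a category, in a stable order."""
    offset = (page - 1) * per_page
    cur = conn.execute(
        'SELECT id, name FROM "user" WHERE category = ? '
        "ORDER BY id LIMIT ? OFFSET ?",   # explicit columns, explicit order
        (category, per_page, offset),
    )
    return cur.fetchall()

# Rebuild the sample table from the question (ids 1..26, names a..z).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT, category INTEGER)'
)
categories = [1, 2, 2, 3, 1, 3, 1, 3, 2, 2, 1, 3, 3,
              3, 1, 1, 2, 1, 3, 3, 3, 3, 1, 1, 2, 2]
conn.executemany(
    'INSERT INTO "user" VALUES (?, ?, ?)',
    [(i + 1, chr(ord("a") + i), c) for i, c in enumerate(categories)],
)

print(fetch_page(conn, 3, 1, 2))  # first page of category 3
print(fetch_page(conn, 3, 2, 2))  # second page of category 3
```

With only about 7000 rows this kind of query should be cheap; an index covering `category` and `id` is a common further step, though whether it matters here would need measuring.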

I hope this helps. Put your page number in place of your_page_number and the number of records per page in place of records_per_page (2 in your sample):
select A.* from
(select @row := @row + 1 as Row_Number, User.* from User
join (select @row := 0) Row_Temp_View
where category = 3
) A
where Row_Number
between (your_page_number * records_per_page) - records_per_page + 1
and your_page_number * records_per_page;
Note how the last page behaves. Say you have 3 users to show on two pages: this query shows the first and second on the first page and only the third on the second page. The limit form in your sample does the same, because limit 2,2 returns at most 2 rows starting after the first 2, so it also returns just the third user rather than repeating the second. The row-number form is mainly useful when you want the row number itself in the result.
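A quick way to check how the last page behaves is to run both calculations on three rows. The sketch below uses Python's sqlite3 module with a hypothetical table `demo_user` (both are assumptions for illustration); `row_bounds` is the row-number range formula from the query above, and the offset passed to `LIMIT` is simply the first row number minus one.

```python
import sqlite3

def row_bounds(page, per_page):
    """First and last 1-based row numbers of a page (formula from above)."""
    first = page * per_page - per_page + 1
    last = page * per_page
    return first, last

def page_by_limit(conn, page, per_page):
    """Fetch a page with LIMIT/OFFSET; the offset is first row number - 1."""
    first, _ = row_bounds(page, per_page)
    cur = conn.execute(
        "SELECT name FROM demo_user ORDER BY id LIMIT ? OFFSET ?",
        (per_page, first - 1),
    )
    return [name for (name,) in cur]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_user (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO demo_user VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)

print(row_bounds(2, 2))            # row numbers covered by page 2
print(page_by_limit(conn, 1, 2))   # full first page
print(page_by_limit(conn, 2, 2))   # short last page
```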

You can use DataTables. It is meant for exactly what you are looking for. I have used it successfully to paginate more than a million rows; it is very fast and easy to implement.

Related

Transform table from sequential identifier to real with attributes

I changed the context a bit, but it's basically the same issue.
Imagine we are in a never-ending tunnel, shaped like a circle. We split the circle into sections numbered 1 to 10, and we'll call each section a slot (sl). There are 2 groups (gr) of living things walking in the tunnel. Each group has 2 bands, each with a name and global hitpoints (hp). Every group is walking forward (although the bands might change order). If a group is at slot #10 and moves forward, it will be at slot #1. We snapshot their information every day. All the data gathered is stored in a table with this structure:
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
| day_id | gr_1_sl_1_id | gr_1_sl_1_name | gr_1_sl_1_hp | gr_1_sl_2_id | gr_1_sl_2_name | gr_1_sl_2_hp | gr_2_sl_1_id | gr_2_sl_1_name | gr_2_sl_1_hp | gr_2_sl_2_id | gr_2_sl_2_name | gr_2_sl_2_hp |
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
| 1 | 3 | orc | 100 | 4 | goblin | 10 | 10 | human | 50 | 1 | dwarf | 25 |
| 2 | 6 | goblin | 7 | 7 | orc | 76 | 2 | human | 60 | 3 | dwarf | 28 |
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
As you can see, the columns are structured in a sequential way, while the data shows what is the actual value. What I want is to have the information shaped this way instead:
+---------+-------+-------+-----------+---------+
| id_game | gr_id | sl_id | band_name | band_hp |
+---------+-------+-------+-----------+---------+
| 1 | 1 | 3 | orc | 100 |
| 1 | 1 | 4 | goblin | 10 |
| 1 | 2 | 10 | human | 50 |
| 1 | 2 | 1 | dwarf | 25 |
| 2 | 1 | 6 | goblin | 7 |
| 2 | 1 | 7 | orc | 76 |
| 2 | 2 | 2 | human | 60 |
| 2 | 2 | 3 | dwarf | 28 |
+---------+-------+-------+-----------+---------+
I have this information in Power BI, although I can create views in SQL Server if need be. I have tried many things; the closest I got was unpivoting and parsing the original columns to get day_id, gr_id, sl_id, attributes and values. The attributes and values columns basically hold name and hp with their corresponding value (I changed hp into a string), but then I'm stuck and not sure what to do next.
Does anyone have any ideas? Keep in mind that I oversimplified the problem; there are more groups, more slots, more bands and more statistics (e.g. attack and defense rating, etc.)
You seem to want to unpivot the table. In SQL Server, I recommend using apply:
-- one values row per (group, slot position); day_id becomes id_game
select t.day_id as id_game, v.*
from t cross apply
(values (1, gr_1_sl_1_id, gr_1_sl_1_name, gr_1_sl_1_hp),
(1, gr_1_sl_2_id, gr_1_sl_2_name, gr_1_sl_2_hp),
(2, gr_2_sl_1_id, gr_2_sl_1_name, gr_2_sl_1_hp),
(2, gr_2_sl_2_id, gr_2_sl_2_name, gr_2_sl_2_hp)
) v(gr_id, sl_id, band_name, band_hp);
In other databases, you can do something similar with union all.
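As a rough sketch of the union all variant, the snippet below generates one SELECT per (group, slot position) pair and stacks them. It runs against SQLite through Python's sqlite3 module purely for illustration; the table name `t`, the helper names, and the extra `pos` column (used only to keep the original column order) are assumptions, not part of the question.

```python
import sqlite3

GROUPS, SLOTS = 2, 2  # the real table has more; adjust to match

def prefix(g, s):
    """Column-name prefix for group g, slot position s."""
    return f"gr_{g}_sl_{s}"

def unpivot_sql(groups, slots):
    """Build a UNION ALL query with one SELECT per (group, position)."""
    selects = []
    for g in range(1, groups + 1):
        for s in range(1, slots + 1):
            p = prefix(g, s)
            selects.append(
                f"SELECT day_id AS id_game, {g} AS gr_id, {s} AS pos, "
                f"{p}_id AS sl_id, {p}_name AS band_name, {p}_hp AS band_hp "
                "FROM t"
            )
    inner = " UNION ALL ".join(selects)
    return (
        "SELECT id_game, gr_id, sl_id, band_name, band_hp "
        f"FROM ({inner}) ORDER BY id_game, gr_id, pos"
    )

# Recreate the wide sample table from the question.
conn = sqlite3.connect(":memory:")
cols = ["day_id INTEGER"]
for g in range(1, GROUPS + 1):
    for s in range(1, SLOTS + 1):
        p = prefix(g, s)
        cols += [f"{p}_id INTEGER", f"{p}_name TEXT", f"{p}_hp INTEGER"]
conn.execute(f"CREATE TABLE t ({', '.join(cols)})")
conn.executemany(
    f"INSERT INTO t VALUES ({', '.join('?' * len(cols))})",
    [
        (1, 3, "orc", 100, 4, "goblin", 10, 10, "human", 50, 1, "dwarf", 25),
        (2, 6, "goblin", 7, 7, "orc", 76, 2, "human", 60, 3, "dwarf", 28),
    ],
)

long_rows = conn.execute(unpivot_sql(GROUPS, SLOTS)).fetchall()
for row in long_rows:
    print(row)
```

Generating the SELECT list in a loop is one way to cope with the larger number of groups, slots and statistics mentioned in the question; each extra statistic would become one more column in every generated SELECT.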

Looking up parent item based on a bill of materials

I'm trying to figure out how to put together a SQL statement that will let me find an end-item in our database based on its bill of materials. I guess you could say this is like a reverse BOM lookup question.
My table structure is pretty simple.
-End-item table
-Component table
-Linking table to tie together multiple components to an end item record.
The data I have is just the component list, and I want to find the end item. Since every bill of materials is unique, the search has to match the bill of materials perfectly, i.e. the exact number of components and exact matches on the component SKU numbers. In some cases 2 end-items might use all the same components, but one of them uses an extra part or two that makes the end-item SKU number different, so the query has to account for that.
If not an outright answer, could someone at least steer me on the correct path to finding one?
------ UPDATE ----------
Table structure would be something like this.
ManufacturedPart
,--------------------,
| ID | PART_NUM |
|--------------------|
| 1 | V3175-01 |
| 2 | V3367-01 |
| 3 | V3988-01 |
| 4 | V3175-CV |
`--------------------`
Component
,--------------------,
| ID | COMP_NUM |
|--------------------|
| 1 | V3175 |
| 2 | V3367 |
| 3 | V3369 |
| 4 | V3114 |
| 5 | V3370 |
| 6 | V4060 |
| 7 | V3550 |
| 8 | V3988 |
`--------------------`
ManufacturedComponent
,-------------------------------------------------,
| ID | MANUFACTURED_PART_ID | COMPONENT_ID |
|-------------------------------------------------|
| 1 | 1 | 1 |
| 2 | 1 | 4 |
| 3 | 1 | 6 |
| 4 | 2 | 2 |
| 5 | 2 | 3 |
| 6 | 2 | 5 |
| 7 | 2 | 7 |
| 8 | 3 | 1 |
| 9 | 3 | 8 |
| 10 | 4 | 1 |
| 11 | 4 | 4 |
`-------------------------------------------------`
Assuming I have only the COMP_NUMs (component numbers) to search with I want to match back to the ManufacturedPart that contains that exact list of components.
So some examples: If I have components V3175, V3114, and V4060, it should match back to V3175-01 manufactured part. But, if I only have components V3175 and V3114 it should match back to V3175-CV manufactured part. If I have components V3367, V3369, V3370, and V3550 it should match back to manufactured part V3367-01.
I have no SQL written at all yet as I'm unsure of how to break the problem down.
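One possible way to break the problem down is an "exact match" grouping query: group the linking table by manufactured part and keep only the parts whose total component count equals the size of the search list and whose count of components found in the list equals it too. The sketch below is a minimal illustration using Python's sqlite3 module and the sample tables from the question. The function name `find_exact_bom` is hypothetical, and the query assumes a component appears at most once per bill of materials.

```python
import sqlite3

def find_exact_bom(conn, comp_nums):
    """Return PART_NUMs whose component set equals comp_nums exactly."""
    wanted = sorted(set(comp_nums))
    if not wanted:
        return []
    placeholders = ",".join("?" * len(wanted))
    sql = f"""
        SELECT mp.PART_NUM
        FROM ManufacturedPart mp
        JOIN ManufacturedComponent mc ON mc.MANUFACTURED_PART_ID = mp.ID
        JOIN Component c ON c.ID = mc.COMPONENT_ID
        GROUP BY mp.ID, mp.PART_NUM
        HAVING COUNT(*) = ?              -- same number of components
           AND SUM(CASE WHEN c.COMP_NUM IN ({placeholders})
                        THEN 1 ELSE 0 END) = ?   -- and all are in the list
        ORDER BY mp.PART_NUM
    """
    params = [len(wanted), *wanted, len(wanted)]
    return [part for (part,) in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ManufacturedPart (ID INTEGER PRIMARY KEY, PART_NUM TEXT);
    CREATE TABLE Component (ID INTEGER PRIMARY KEY, COMP_NUM TEXT);
    CREATE TABLE ManufacturedComponent (
        ID INTEGER PRIMARY KEY,
        MANUFACTURED_PART_ID INTEGER,
        COMPONENT_ID INTEGER);
    INSERT INTO ManufacturedPart VALUES
        (1,'V3175-01'), (2,'V3367-01'), (3,'V3988-01'), (4,'V3175-CV');
    INSERT INTO Component VALUES
        (1,'V3175'), (2,'V3367'), (3,'V3369'), (4,'V3114'),
        (5,'V3370'), (6,'V4060'), (7,'V3550'), (8,'V3988');
    INSERT INTO ManufacturedComponent VALUES
        (1,1,1), (2,1,4), (3,1,6), (4,2,2), (5,2,3), (6,2,5),
        (7,2,7), (8,3,1), (9,3,8), (10,4,1), (11,4,4);
""")

print(find_exact_bom(conn, ["V3175", "V3114", "V4060"]))
print(find_exact_bom(conn, ["V3175", "V3114"]))
```

The two conditions in the HAVING clause work together: the second rejects parts that contain something outside the list, and the first rejects parts that contain only a subset of it.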

Developing SCV using SQL

I am trying to identify all related records using IDs from two different systems.
I have seen solutions that match SourceA to SourceB and back to SourceA, but obviously this will not pick up everything.
The below table shows that 1-A is seemingly unrelated to 4-C, however when we pair them up we can see that all of the below records are related and the latest ID combination is 4-C.
| SystemA_ID | SystemB_ID | Date | PrimaryA | PrimaryB |
| 1 | A | 1/1/2016 | 4 | C |
| 2 | A | 2/1/2016 | 4 | C |
| 2 | B | 3/1/2016 | 4 | C |
| 3 | B | 4/1/2016 | 4 | C |
| 3 | C | 5/1/2016 | 4 | C |
| 4 | C | 6/1/2016 | 4 | C |
What I need is to populate the PrimaryA and PrimaryB columns with 4 and 'C' respectively.
I was thinking of doing a double loop similar to the solution described here
However, I could not get it working and also there might be a better solution.
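An alternative to a double loop is to treat the ID pairs as edges of a graph and group them into connected components, then take the latest pair in each component as the primary. The sketch below does this outside the database with a small union-find in Python; it is only an illustration under stated assumptions. The function name `assign_primaries` is hypothetical, and the dates are read as consecutive days in January 2016 (the question's format is ambiguous, but either reading gives the same ordering).

```python
from datetime import date

def assign_primaries(rows):
    """rows: (system_a_id, system_b_id, date) tuples.
    Returns each row extended with (primary_a, primary_b)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # Link every SystemA id to the SystemB id it appears with.
    for a, b, _ in rows:
        union(("A", a), ("B", b))

    # Remember the most recent (a, b) pair in each connected group.
    latest = {}
    for a, b, d in rows:
        root = find(("A", a))
        if root not in latest or d > latest[root][0]:
            latest[root] = (d, a, b)

    return [(a, b, d) + latest[find(("A", a))][1:] for a, b, d in rows]

rows = [
    (1, "A", date(2016, 1, 1)),
    (2, "A", date(2016, 1, 2)),
    (2, "B", date(2016, 1, 3)),
    (3, "B", date(2016, 1, 4)),
    (3, "C", date(2016, 1, 5)),
    (4, "C", date(2016, 1, 6)),
]
for row in assign_primaries(rows):
    print(row)
```

The resulting primaries could then be written back with ordinary UPDATE statements; whether to do the grouping in application code or inside the database is a design choice this sketch does not settle.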

Rails 3 - complex query with joins and counts, possible subqueries?

Ok, so I have a fairly complex query I am trying to come up with in my Rails application. I have four tables: Clients, Projects, Invoices, Invoice_Line_Items. I am trying to get certain bits of data from all of those tables and display them in a "reports" type view in my application. This is what the structures look like for the four tables:
Clients
| id | name | archive |
----------------------------------------
| 1 | Client 1 | 0 |
| 2 | Client 2 | 0 |
Projects
| id | client_id | name | archive |
------------------------------------------------------
| 1 | 1 | Project 1 | 0 |
| 2 | 1 | Project 2 | 1 |
| 3 | 2 | Project 3 | 0 |
| 4 | 2 | Project 4 | 1 |
Invoices
| id | client_id | project_id | name | archive |
----------------------------------------------------------------------
| 1 | 1 | 1 | Invoice 1 | 0 |
| 2 | 1 | 1 | Invoice 2 | 0 |
| 3 | 1 | 2 | Invoice 3 | 1 |
| 4 | 1 | 2 | Invoice 4 | 1 |
| 5 | 2 | 3 | Invoice 5 | 0 |
| 6 | 2 | 3 | Invoice 6 | 0 |
| 7 | 2 | 4 | Invoice 7 | 1 |
| 8 | 2 | 4 | Invoice 8 | 1 |
Invoice_Line_Items
| id | invoice_id | name | amount_due |
---------------------------------------------------------
| 1 | 1 | Item 1 | 500 |
| 2 | 1 | Item 2 | 500 |
| 3 | 2 | Item 3 | 500 |
| 4 | 2 | Item 4 | 500 |
| 5 | 3 | Item 5 | 500 |
| 6 | 3 | Item 6 | 500 |
| 7 | 4 | Item 7 | 500 |
| 8 | 4 | Item 8 | 500 |
| 9 | 5 | Item 9 | 500 |
| 10 | 5 | Item 10 | 500 |
| 11 | 6 | Item 11 | 500 |
| 12 | 6 | Item 12 | 500 |
| 13 | 7 | Item 13 | 500 |
| 14 | 7 | Item 14 | 500 |
| 15 | 8 | Item 15 | 500 |
| 16 | 8 | Item 16 | 500 |
Ok, I hope those diagrams make enough sense. What I am looking for as a result set is this (computed from the example data above):
| clients.name | current_projects | archived_projects | total_amount_due | total_amount_paid |
-----------------------------------------------------------------------------------------------------------
| Client 1 | 1 | 1 | 2000 | 2000 |
| Client 2 | 1 | 1 | 2000 | 2000 |
Ok, so here's what's going on there:
Getting all non-archived clients
Getting a count of all non-archived projects
Getting a count of all archived projects
Getting a total_amount_due from the invoice_line_items table that is a sum of all of the non-archived invoices
Getting a total_amount_paid from the invoice_line_items table that is a sum of all of the archived invoices
I am relatively new to Rails and this is a fairly complex query (at least in my head). Please let me know if there is a simpler solution that I am overlooking or if I am just overcomplicating it. If I need to do multiple queries in my controller that's fine; I just wanted to see if I could get away with one SQL call. I'm pretty sure I can do this easily with some subqueries, but I'm not sure how to write those in the controller in Rails.
Thanks for any help or direction you can provide. If this question is just outrageous, let me know and I'll delete it and search Google some more (I have tried already, to no avail).
Ok, well, I ended up figuring out a solution myself. I'm not sure it's the best solution; it feels heavy and messy. I created quite a few objects in the controller to get the SQL statements I needed to pull the data from the database. I basically have one object for each column (each column, not each row). Let me know if anyone can figure out a better solution.
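For comparison, the report can probably be produced with one SQL statement that uses a correlated subquery per output column. The sketch below runs such a query against SQLite via Python's sqlite3 module, using the sample data from the question; it is an illustration only, with lowercase table names assumed. In a Rails application a raw statement like this could be passed to something like `Client.find_by_sql`, but that wiring is not shown here.

```python
import sqlite3

REPORT_SQL = """
SELECT c.name,
       (SELECT COUNT(*) FROM projects p
         WHERE p.client_id = c.id AND p.archive = 0) AS current_projects,
       (SELECT COUNT(*) FROM projects p
         WHERE p.client_id = c.id AND p.archive = 1) AS archived_projects,
       (SELECT COALESCE(SUM(li.amount_due), 0)
          FROM invoices i
          JOIN invoice_line_items li ON li.invoice_id = i.id
         WHERE i.client_id = c.id AND i.archive = 0) AS total_amount_due,
       (SELECT COALESCE(SUM(li.amount_due), 0)
          FROM invoices i
          JOIN invoice_line_items li ON li.invoice_id = i.id
         WHERE i.client_id = c.id AND i.archive = 1) AS total_amount_paid
FROM clients c
WHERE c.archive = 0
ORDER BY c.id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, archive INTEGER);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, client_id INTEGER,
                           name TEXT, archive INTEGER);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, client_id INTEGER,
                           project_id INTEGER, name TEXT, archive INTEGER);
    CREATE TABLE invoice_line_items (id INTEGER PRIMARY KEY,
                           invoice_id INTEGER, name TEXT, amount_due INTEGER);
    INSERT INTO clients VALUES (1,'Client 1',0), (2,'Client 2',0);
    INSERT INTO projects VALUES
        (1,1,'Project 1',0), (2,1,'Project 2',1),
        (3,2,'Project 3',0), (4,2,'Project 4',1);
    INSERT INTO invoices VALUES
        (1,1,1,'Invoice 1',0), (2,1,1,'Invoice 2',0),
        (3,1,2,'Invoice 3',1), (4,1,2,'Invoice 4',1),
        (5,2,3,'Invoice 5',0), (6,2,3,'Invoice 6',0),
        (7,2,4,'Invoice 7',1), (8,2,4,'Invoice 8',1);
""")
# Two line items of 500 per invoice, as in the question's sample data.
conn.executemany(
    "INSERT INTO invoice_line_items VALUES (?, ?, ?, ?)",
    [(n, (n + 1) // 2, f"Item {n}", 500) for n in range(1, 17)],
)

report = conn.execute(REPORT_SQL).fetchall()
for row in report:
    print(row)
```

Separate subqueries avoid the double counting that would occur if projects and line items were joined to clients in a single flat join and then summed.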

Quickly calculating running totals in sql server using set based operations

I have some data that looks like this:
+---+--------+-------------+---------------+--------------+
| | A | B | C | D |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1 | 1 | 0 | 30 |
| 3 | 2 | 1 | 10 | 30 |
| 4 | 3 | 1 | 0 | 30 |
| 5 | 4 | 2 | 5 | 50 |
| 6 | 5 | 2 | 0 | 50 |
| 7 | 6 | 2 | 15 | 50 |
| 8 | 7 | 2 | 5 | 50 |
| 9 | 8 | 2 | 5 | 50 |
+---+--------+-------------+---------------+--------------+
And I am transforming it to look like this:
+---+--------+-------------+---------------+--------------+
| | A | B | C | D |
+---+--------+-------------+---------------+--------------+
| 1 | row_id | disposal_id | excess_weight | total_weight |
| 2 | 1 | 1 | 0 | 30 |
| 3 | 2 | 1 | 10 | 30 |
| 4 | 3 | 1 | 0 | 20 |
| 5 | 4 | 2 | 5 | 50 |
| 6 | 5 | 2 | 0 | 45 |
| 7 | 6 | 2 | 15 | 45 |
| 8 | 7 | 2 | 5 | 30 |
| 9 | 8 | 2 | 5 | 25 |
+---+--------+-------------+---------------+--------------+
Basically, I need to update the total_weight column by subtracting the sum of the excess_weights from previous rows in the table which belong to the same disposal_id.
I'm currently using a cursor because it's faster than the other solutions I've tried (CTE, triangular join, cross apply). My cursor solution keeps a running total that is reset to zero for each new disposal_id, increments it by the excess weight, and performs updates when needed. It runs in about 40 seconds. The other solutions I've tried took anywhere from 3 to 5 minutes, and I'm wondering if there is a relatively performant way to do this using set-based operations.
I've spent a lot of time optimizing such queries and ended up with two performant options: either store precalculated running totals, as described in Denormalizing to enforce business rules: Running Totals, or calculate them on the client, which is also fast and easy.
The other solution you probably already tried is to do something like the answers found here
Unless you are using Oracle, which has decent aggregates for cumulative sums, you're better off using a cursor. At best, you're going to have to rejoin the table to itself or use another method for what should be an O(n) operation. In general, the set-based solutions for problems like these are messy or really messy.
'Previous rows' implies an ordering, so no: no set-based operations there.
Oracle's LEAD and LAG are built for this, but SQL Server before 2012 (the release that added LAG, LEAD and ordered window aggregates) forces you into triangular joins, which I suppose you have investigated.
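The client-side option mentioned in the answers can be sketched as a single ordered pass. The Python function below is a minimal illustration, not the asker's cursor code; the name `adjust_totals` is hypothetical, and it assumes the rows arrive already sorted by disposal_id and then row_id.

```python
def adjust_totals(rows):
    """rows: (row_id, disposal_id, excess_weight, total_weight) tuples,
    sorted by disposal_id, then row_id. Returns rows with total_weight
    reduced by the excess_weight of earlier rows in the same disposal."""
    result = []
    current_disposal = None
    running_excess = 0
    for row_id, disposal_id, excess, total in rows:
        if disposal_id != current_disposal:
            # New disposal: restart the running total.
            current_disposal = disposal_id
            running_excess = 0
        result.append((row_id, disposal_id, excess, total - running_excess))
        # Only later rows see this row's excess weight.
        running_excess += excess
    return result

source = [
    (1, 1, 0, 30), (2, 1, 10, 30), (3, 1, 0, 30),
    (4, 2, 5, 50), (5, 2, 0, 50), (6, 2, 15, 50),
    (7, 2, 5, 50), (8, 2, 5, 50),
]
for row in adjust_totals(source):
    print(row)
```

This is the same O(n) logic the cursor performs, moved to the client; the adjusted values would still need to be written back to the table in a batch.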