SQL Performance for Querying 3-tiered Table

I have a table of people that includes each person's ID as well as the ID of their Boss in the job hierarchy (maximum of 3 tiers). For example, the table may look like:
ID BossID
1 NULL
2 1
3 1
4 1
5 2
6 3
7 3
8 2
9 3
10 NULL
So 1 and 10 don't have a boss, and 1 is the boss of 2, 3, and 4. 2 is the boss of 5 and 8, etc. What I want is a way to query this table so I can find all people that are below a specified ID in the hierarchy. For example, if I query for ID 1 it returns 1, 2, 3, 4, 5, 6, 7, 8, and 9; if I query for 2 it returns 2, 5, and 8.
My current attempt is:
with Hierarchy AS (
select a.ID as AncestorID, a.ID as DescendantID, 0 as Depth from BossTable a
UNION ALL
select CTE.AncestorID, a.ID, CTE.Depth + 1 from BossTable a
inner join Hierarchy CTE on a.BossID = CTE.DescendantID
)
select a.AncestorID, a.DescendantID, a.Depth from Hierarchy a
which does return exactly what I am looking for, but is a slow query and is causing problems in production. My ideal goal is to make this into a view that can be indexed, but currently, that is not possible as this cannot have a unique clustered index as required for other indexes. BossTable can also be edited frequently, moving people to different bosses, adding, or removing, so creating another table with this data is not realistic.
Any suggestions would be greatly appreciated, just looking to make this as efficient as possible.
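One direction that may be worth trying, offered as a sketch rather than a measured fix: anchor the recursive CTE on the single ID being looked up instead of on every row, so only the requested subtree is expanded. The example below runs against SQLite through Python's sqlite3 module purely to stay self-contained. The helper name descendants and the index name IX_BossTable_BossID are made up for illustration, and on SQL Server the same CTE shape would be written without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BossTable (ID INTEGER PRIMARY KEY, BossID INTEGER)")
conn.executemany(
    "INSERT INTO BossTable (ID, BossID) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 1), (5, 2),
     (6, 3), (7, 3), (8, 2), (9, 3), (10, None)],
)
# Hypothetical index: lets each recursive step seek to the children of a row.
conn.execute("CREATE INDEX IX_BossTable_BossID ON BossTable (BossID)")


def descendants(root_id):
    """Return root_id plus everyone below it in the hierarchy, ordered by ID."""
    rows = conn.execute(
        """
        WITH RECURSIVE Hierarchy (DescendantID, Depth) AS (
            -- Anchor: only the requested person, not the whole table.
            SELECT ID, 0 FROM BossTable WHERE ID = ?
            UNION ALL
            -- Recursive step: direct reports of anyone already found.
            SELECT b.ID, h.Depth + 1
            FROM BossTable b
            JOIN Hierarchy h ON b.BossID = h.DescendantID
        )
        SELECT DescendantID FROM Hierarchy ORDER BY DescendantID
        """,
        (root_id,),
    ).fetchall()
    return [r[0] for r in rows]


print(descendants(1))
print(descendants(2))
```

Whether this helps in production depends on how the result is consumed; if every ancestor/descendant pair really is needed at once, the anchor cannot be narrowed this way.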

Related

SQL insert, select performance for categorized product table

I have relational category & product tables. Categories are hierarchical. I will have queries based on category, for example
select *
from products
where CatId = 3
or
select *
from products
where CatId = 1
I have 6 levels of categories and 24 million rows of products, so I have to find a fast and optimal solution. My question is which structure is suitable.
I write some options, feel free to suggest a better alternative.
Current category table:
Id ParentId Name
---------------------
1 null CatA
2 null CatB
3 1 CatAa
4 2 CatBa
Product table option 1
Id Cat Name
------------------
1 3 Product_1
2 4 Product_2
Product table option 2
Id CatLevel1 CatLevel2 ... Name
-------------------------------------
1 1 3 . Product_1
2 2 4 . Product_2
Product table option 3
Id Cats Name
------------------
1 1:3 Product_1
2 2:4 Product_2
Always keep option one, plus some denormalised tables (options two onwards) if you so desire. By keeping option one, you have the source truth to revert to or derive the others from.
Option two is only recommended if the searcher always knows what depth/level to search at. For example, if they know they need Level2=CATAb then it works, but if they don't know CATAb is at level two, they don't know which column to look in. It also relies on knowing how many levels to represent; if you can have a hundred levels, you need a hundred columns, and it's fragile if you need to add more depths. Searchers generally don't know the level in advance, so this is generally not a good optimisation.
Option three is a straight no. Never store multiple values in one field (one column of one row). It will make efficient searching of that column next to impossible.
The alternative to option three is to have a "link" table. Just two columns, category_id and product_id. Then you list all ancestors of a product, just on different rows.
category_id product_id
1 1
3 1
2 2
4 2
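As a rough sketch of how the link table could be queried, here is a small SQLite example driven from Python's sqlite3 module. The table name product_category and the helper products_in are placeholders chosen for illustration; the rows are the four from the listing above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    -- One row per (ancestor category, product) pair.
    CREATE TABLE product_category (
        category_id INTEGER NOT NULL,
        product_id  INTEGER NOT NULL REFERENCES product (id),
        PRIMARY KEY (category_id, product_id)
    );
    INSERT INTO product VALUES (1, 'Product_1'), (2, 'Product_2');
    INSERT INTO product_category VALUES (1, 1), (3, 1), (2, 2), (4, 2);
""")


def products_in(category_id):
    """Products filed under the category, at any depth, via the link table."""
    return conn.execute(
        """
        SELECT p.id, p.name
        FROM product p
        JOIN product_category pc ON pc.product_id = p.id
        WHERE pc.category_id = ?
        ORDER BY p.id
        """,
        (category_id,),
    ).fetchall()


print(products_in(1))
print(products_in(4))
```

Because the primary key leads with category_id, a lookup by category is a plain index range scan regardless of how deep the category sits.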
These are all known as adjacency lists. A different model altogether is Nested Sets. I'm on my phone, and it's hard to describe without lots of formatting, but if you research online you'll find lots of information. They're much harder to comprehend and implement initially, but very fast at retrieval when specifying a parent.
Your product table option 1 is fine and needs no change
product_id,
category_id,
... other attributes
Your problem is in accessing the products based on the category hierarchy, which requires a hierarchical query to get all categories in the tree below your selected category.
Instead of
select * from product where category_id = 1;
you'll need to write an additional hierarchical query to get the whole hierarchy tree
with cat_tree (id) as (
select id
from category where id = 1
UNION ALL
select ca.id
from cat_tree ct
join category ca
on ct.id = ca.parent_id
)
select * from product
where category_id in
(select id from cat_tree);
This may not be practicable, but you can simplify it by denormalizing the category table.
Let's assume your category data is such as
ID PARENT_ID
---------- ----------
1
3 1
5 3
6 3
A query that pre-calculates all direct and indirect parent and child relations can be implemented as a MATERIALIZED VIEW that is refreshed on each category change.
The result is
ID CHILD_ID
---------- ----------
1 1
1 3
1 5
1 6
3 3
3 5
3 6
5 5
6 6
E.g. for 1 you get itself, all its children, their children, etc.
Using this category_denorm object your query can be simplified to
select *
from product
where category_id in
(select child_id from category_denorm where id = 1);
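For reference, one way the category_denorm relation could be produced is a recursive CTE that starts from every category paired with itself and then walks down to its children. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module, and because SQLite has no materialized views, an ordinary table built with CREATE TABLE ... AS stands in for one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO category VALUES (1, NULL), (3, 1), (5, 3), (6, 3);

    -- Stand-in for the materialized view: every category paired with
    -- itself and with each of its direct and indirect children.
    CREATE TABLE category_denorm AS
    WITH RECURSIVE tree (id, child_id) AS (
        SELECT id, id FROM category
        UNION ALL
        SELECT t.id, c.id
        FROM tree t
        JOIN category c ON c.parent_id = t.child_id
    )
    SELECT id, child_id FROM tree;
""")

pairs = conn.execute(
    "SELECT id, child_id FROM category_denorm ORDER BY id, child_id"
).fetchall()
for pair in pairs:
    print(pair)
```

In a database with real materialized views, the SELECT part would go into the view definition and the refresh would be triggered whenever the category table changes.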

SQL query in postgresql to produce pivot table report to turn columns into rows

I am unsure how to create a pivot report query in postgres (newbie to postgres) based on the following tables/report layout.
Please note that I am not able to change the structure of these tables as they are out of my control.
STOCK_REF ( sr_id, stock_name )
STOCK_INVENTORY ( si_id, sr_id, stock_count ) * where sr_id here is a foreign key constraint
Sample data for each table may include the following:
STOCK_REF
1 GUITAR
2 BASS
3 DRUMS
4 KEYBOARDS
STOCK_INVENTORY
1 1 10
2 2 5
3 3 2
4 4 15
Using the above two tables, I need to produce a report that looks like:
STOCK NAME COUNT
--------------------------- --------
GUITAR 10
BASS 5
DRUMS 2
KEYBOARDS 15
which is like a pivot table.
The thing is, I can write the query that will produce the stock name as columns with counts but I actually need to have the stock name and counts as rows, like above.
Any help with this postgres query would be ideal
Try the query below. It's simple SQL. I don't think using Postgres makes much difference here...
SELECT stock_name AS "STOCK NAME", stock_count AS "COUNT"
FROM STOCK_REF INNER JOIN STOCK_INVENTORY
ON (STOCK_REF.sr_id = STOCK_INVENTORY.sr_id)
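To check the join against the sample data, here is a small sketch that runs the same shape of query in SQLite via Python's sqlite3 module. SQLite is used only so the example is self-contained; the ORDER BY is an addition to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STOCK_REF (sr_id INTEGER PRIMARY KEY, stock_name TEXT);
    CREATE TABLE STOCK_INVENTORY (
        si_id       INTEGER PRIMARY KEY,
        sr_id       INTEGER REFERENCES STOCK_REF (sr_id),
        stock_count INTEGER
    );
    INSERT INTO STOCK_REF VALUES
        (1, 'GUITAR'), (2, 'BASS'), (3, 'DRUMS'), (4, 'KEYBOARDS');
    INSERT INTO STOCK_INVENTORY VALUES
        (1, 1, 10), (2, 2, 5), (3, 3, 2), (4, 4, 15);
""")

# One output row per stock item: no pivoting is needed, just a join.
report = conn.execute("""
    SELECT r.stock_name AS "STOCK NAME", i.stock_count AS "COUNT"
    FROM STOCK_REF r
    INNER JOIN STOCK_INVENTORY i ON r.sr_id = i.sr_id
    ORDER BY r.sr_id
""").fetchall()

for name, count in report:
    print(f"{name:<12}{count:>4}")
```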

Oracle 12c - Insert values in a table using values from another table

I have a table (TABLEA) like so:
type_id level
1 7
2 4
3 2
4 5
And another table (TABLEB) like so:
seq_id type_id name order level
1 1 display 1 7
2 1 header 2
3 1 detail 3
4 2 display 1 4
5 2 header 2
6 2 detail 3
TABLEB.TYPE_ID is FK to TABLEA.TYPE_ID. Currently I am entering the data in TABLEB manually.
I have 2 new rows in TABLEA: type_id 3 and 4.
How can I populate data which do not exist in TABLEB automatically using TABLEA? I would like all columns in TABLEB to be inserted automatically.
So, as you can see:
SEQ_ID will be sequential
When ORDER value is 1, NAME value will be "display", and LEVEL will be 7
When ORDER value is 2, NAME value will be "header"
When ORDER value is 3, NAME value will be "detail"
I am expecting after the insert:
seq_id type_id name order level
1 1 display 1 7
2 1 header 2
3 1 detail 3
4 2 display 1 4
5 2 header 2
6 2 detail 3
7 3 display 1 2
8 3 header 2
9 3 detail 3
10 4 display 1 5
11 4 header 2
12 4 detail 3
Any help is appreciated!
You can either:
Have your application code populate both tables, i.e.: it INSERTs the appropriate records into both tables.
(Sounds like you're leaning to) Have Oracle do the work behind the scenes, i.e.: Oracle does the INSERTs in TABLEB for you. The way to do this is by creating a TRIGGER on TABLEA. Here's an example that might get you started: https://stackoverflow.com/a/13356277/1680777
Some people will tell you that TRIGGERs might make debugging difficult because part of your logic is in the database. There's some validity to that criticism. I won't say always/never use triggers. Use them where they make sense: where the value they provide outweighs their complexity.
So this can be done in pure SQL: INSERT ALL which allows us to issue multiple inserts in the same statement.
insert all
into tableb (seq_id, type_id, name, order_id, level_id)
values(tableb_id_seq.nextval, type_id, 'display', 1, level_id)
into tableb (seq_id, type_id, name, order_id)
values(tableb_id_seq.nextval+1, type_id, 'header', 2)
into tableb (seq_id, type_id, name, order_id)
values(tableb_id_seq.nextval+2, type_id, 'detail', 3)
select a.type_id, a.level_id from tablea a
minus
select b.type_id, b.level_id from tableb b
/
The manipulation of the sequence is a bit funny: it is required because every call to NEXTVAL returns the same value in one statement. Obviously in order to make this work, the sequence needs to increment by three:
create sequence tableb_id_seq increment by 3;
This might be enough to rule out such an approach. As an alternative you could use (SEQ_ID, ORDER_ID) as a compound primary key but that's not nice either.
By the way, ORDER and LEVEL are reserved words in Oracle: you can't use them as column names unless you quote them, which is why the example uses ORDER_ID and LEVEL_ID.
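If the sequence trick is off-putting, a different technique (not INSERT ALL, which is Oracle-specific) is to cross join the missing TABLEA rows against a three-row template of (name, order_id) pairs and insert the result. The sketch below shows that idea in SQLite through Python's sqlite3 module so it can be run anywhere. It assumes the ORDER_ID/LEVEL_ID column names from the answer above and numbers SEQ_ID from the current maximum in application code instead of using a sequence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (type_id INTEGER PRIMARY KEY, level_id INTEGER);
    CREATE TABLE tableb (
        seq_id   INTEGER PRIMARY KEY,
        type_id  INTEGER REFERENCES tablea (type_id),
        name     TEXT,
        order_id INTEGER,
        level_id INTEGER
    );
    INSERT INTO tablea VALUES (1, 7), (2, 4), (3, 2), (4, 5);
    INSERT INTO tableb VALUES
        (1, 1, 'display', 1, 7), (2, 1, 'header', 2, NULL), (3, 1, 'detail', 3, NULL),
        (4, 2, 'display', 1, 4), (5, 2, 'header', 2, NULL), (6, 2, 'detail', 3, NULL);
""")

# Three template rows per type_id that has no rows in tableb yet.
# Only the 'display' row (order_id = 1) carries the level.
missing = conn.execute("""
    SELECT a.type_id, t.name, t.order_id,
           CASE WHEN t.order_id = 1 THEN a.level_id END
    FROM tablea a
    CROSS JOIN (
        SELECT 'display' AS name, 1 AS order_id
        UNION ALL SELECT 'header', 2
        UNION ALL SELECT 'detail', 3
    ) t
    WHERE NOT EXISTS (SELECT 1 FROM tableb b WHERE b.type_id = a.type_id)
    ORDER BY a.type_id, t.order_id
""").fetchall()

# Continue numbering after the highest existing seq_id.
next_id = conn.execute(
    "SELECT COALESCE(MAX(seq_id), 0) + 1 FROM tableb"
).fetchone()[0]
conn.executemany(
    "INSERT INTO tableb (seq_id, type_id, name, order_id, level_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [(next_id + i, *row) for i, row in enumerate(missing)],
)

new_rows = conn.execute(
    "SELECT * FROM tableb WHERE seq_id > 6 ORDER BY seq_id"
).fetchall()
for row in new_rows:
    print(row)
```

Numbering from MAX(seq_id) is only safe when nothing else inserts concurrently; with concurrent writers a sequence or identity column is the more robust choice.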

Selecting only rows containing at least the given elements in SQL

Using PostgreSQL, and given the following sample table, how do I select all parents that have at least a child 10 and a child 20?
parent | child
--------+-------
1 | 10
1 | 20
1 | 30
2 | 10
2 | 20
3 | 10
In other words, this is the expected result:
parent
--------
1
2
In general, how do I select all parents that have at least all of the given children x1, x2, ..., xn? What is the most efficient way to do this?
Thanks!
SELECT parent FROM table1 WHERE child IN (10, 20)
GROUP BY parent
HAVING COUNT(DISTINCT child) >= 2
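The GROUP BY / HAVING pattern generalises to any list x1, x2, ..., xn by comparing the distinct count against the length of the list. A minimal sketch, assuming the table is called table1 and using SQLite via Python's sqlite3 module for a self-contained run (the helper name parents_with_all is invented here; the SQL itself is the same shape in PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (parent INTEGER, child INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 30), (2, 10), (2, 20), (3, 10)],
)


def parents_with_all(children):
    """Parents that have every one of the given children (non-empty list)."""
    wanted = sorted(set(children))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT parent
        FROM table1
        WHERE child IN ({placeholders})
        GROUP BY parent
        HAVING COUNT(DISTINCT child) = ?
        ORDER BY parent
    """
    # The last parameter is n, the number of distinct children required.
    return [r[0] for r in conn.execute(sql, (*wanted, len(wanted)))]


print(parents_with_all([10, 20]))
print(parents_with_all([10, 20, 30]))
```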
It's not completely clear what you're asking. However, I shall give it a crack.
If you're going to manually define the children you can do a simple select statement:
SELECT DISTINCT parent
FROM table1
WHERE child IN ('10', '20')
This would select all parents that have 10 or 20 as their child. To add more, just add the number to the IN() part.
If however you want to do this for a large number of children or perhaps an unknown number of children then you can create a temp table to store the children search values and join it to your main table. Something like:
CREATE TABLE #SearchChildren
(
Child int
)
Then input your search values into #SearchChildren. I'd need to know more about what you're doing to do this bit.
SELECT DISTINCT a.parent
FROM table1 as a
JOIN #SearchChildren as s
ON a.child = s.Child
Without knowing more about what you're trying to do it's difficult to give a full answer but hopefully this helps.
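Note that the join above returns parents matching any of the search values. To require all of them, one option is to group by parent and compare the number of matched children with the number of rows in the search table. The sketch below assumes that reading of the question and uses SQLite through Python's sqlite3 module; the temporary table is named search_children here because the #name form is SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (parent INTEGER, child INTEGER);
    INSERT INTO table1 VALUES
        (1, 10), (1, 20), (1, 30), (2, 10), (2, 20), (3, 10);

    -- Search values go in a temporary table, one row per required child.
    CREATE TEMP TABLE search_children (child INTEGER PRIMARY KEY);
    INSERT INTO search_children VALUES (10), (20);
""")

# Keep only parents whose matched children cover the whole search table.
matches = [
    r[0]
    for r in conn.execute("""
        SELECT a.parent
        FROM table1 AS a
        JOIN search_children AS s ON a.child = s.child
        GROUP BY a.parent
        HAVING COUNT(DISTINCT a.child) = (SELECT COUNT(*) FROM search_children)
        ORDER BY a.parent
    """)
]
print(matches)
```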

MySQL Query That Can Pull the Data I am Seeking?

On the project I am working on, I am stuck with the table structure from Hades. Two things to keep in mind:
I can't change the table structure right now. I'm stuck with it for the time being.
The queries are dynamically generated and not hard coded. So, while I am asking for a query that can pull this data, what I am really working toward is an algorithm that will generate the query I need.
Hopefully, I can explain the problem without making your eyes glaze over and your brain implode.
We have an instance table that looks (simplified) along these lines:
Instances
InstanceID active
1 Y
2 Y
3 Y
4 N
5 Y
6 Y
Then, there are multiple data tables along these lines:
Table1
InstanceID field1 reference_field2
1 John 5
2 Sally NULL
3 Fred 6
4 Joe NULL
Table2
InstanceID field3
5 1
6 1
Table3
InstanceID fieldID field4
5 1 Howard
5 2 James
6 2 Betty
Please note that reference_field2 in Table1 contains a reference to another instance.
Field3 in Table2 is a bit more complicated. It contains a fieldID for Table 3.
What I need is a query that will get me a list as follows:
InstanceID field1 field4
1 John Howard
2 Sally
3 Fred
The problem is, in the query I currently have, I do not get Fred because there is no entry in Table3 for fieldID 1 and InstanceID 6. So, the very best list I have been able to get thus far is
InstanceID field1 field4
1 John Howard
2 Sally
In essence, if there is an entry in Table1 for Field 2, and there is not an entry in Table 3 that has the instanceID contained in field2 and the field ID contained in field3, I don't get the data from field1.
I have looked at joins till I'm blue in the face, and I can't see a way to handle the case when table3 has no entry.
LEFT JOIN...
SELECT a.InstanceID, b.field1, d.field4
FROM instances AS a
JOIN Table1 AS b ON a.InstanceID = b.InstanceID
LEFT JOIN Table2 AS c ON b.reference_field2 = c.InstanceID
LEFT JOIN Table3 AS d ON (c.InstanceID = d.InstanceID AND c.field3 = d.fieldId)
WHERE a.active = 'Y'
The two left joins should handle the case where there are no other rows...
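As a quick check of the query against the sample data from the question, here is a sketch using SQLite through Python's sqlite3 module (the question is about MySQL; SQLite is used only to keep the example self-contained, and the ORDER BY is added for a stable row order):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Instances (InstanceID INTEGER PRIMARY KEY, active TEXT);
    CREATE TABLE Table1 (InstanceID INTEGER, field1 TEXT, reference_field2 INTEGER);
    CREATE TABLE Table2 (InstanceID INTEGER, field3 INTEGER);
    CREATE TABLE Table3 (InstanceID INTEGER, fieldID INTEGER, field4 TEXT);

    INSERT INTO Instances VALUES
        (1, 'Y'), (2, 'Y'), (3, 'Y'), (4, 'N'), (5, 'Y'), (6, 'Y');
    INSERT INTO Table1 VALUES
        (1, 'John', 5), (2, 'Sally', NULL), (3, 'Fred', 6), (4, 'Joe', NULL);
    INSERT INTO Table2 VALUES (5, 1), (6, 1);
    INSERT INTO Table3 VALUES (5, 1, 'Howard'), (5, 2, 'James'), (6, 2, 'Betty');
""")

rows = conn.execute("""
    SELECT a.InstanceID, b.field1, d.field4
    FROM Instances AS a
    JOIN Table1 AS b ON a.InstanceID = b.InstanceID
    -- LEFT JOINs keep the Table1 row even when nothing matches further on.
    LEFT JOIN Table2 AS c ON b.reference_field2 = c.InstanceID
    LEFT JOIN Table3 AS d
           ON c.InstanceID = d.InstanceID AND c.field3 = d.fieldID
    WHERE a.active = 'Y'
    ORDER BY a.InstanceID
""").fetchall()

for row in rows:
    print(row)
```

Fred is kept with a NULL field4 because Table3 has no row for InstanceID 6 with fieldID 1, which is the case the inner join was dropping.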
It would help if you posted the query you have, because I think you have some mistakes in the table descriptions here, so it's not very clear how the tables are connected.
Anyway, you probably have an inner join in your query (normally written as just JOIN). Replace it with a left outer join (LEFT JOIN). It will not require the right table to contain the row and return NULL instead of the actual value.