Getting parenthood of tree structure - sql

I am trying to retrieve a tree structure i want specific levels of 'parenthood'. my table has level of depth, pathIndex and mapping. my first approach was to make some kinds of substrings to be able to look for the value via the mapping, but I am getting multiple errors on conversion of strings. one thing that might be possible is that if i try and query an item that is not at the lowest level it should return null for the levels it is missing.
In the table if i where to query for the line while asterisks
Id depth pathindex ItemNumber
4CF91F7F-832E-468D-B44A-E14DC66E710A 0 0 0.0
D34784A3-2134-4D09-828E-0EDA0C275C43 1 1 1
38158804-3EBC-4841-B1AF-1B86AD153010 2 1 1.1
8E25D494-322F-45F9-8A91-2A385F561C71 3 1 1.1.1
**64EB6C43-FF9C-0FF9-133F-01F4F21DA14F** 4 1 1.1.1.1
13AFA35C-80F8-405A-8980-33C3F7733EE2 2 2 1.2
3F1332E9-4D42-4BD8-9423-598430E94CB5 3 1 1.2.1
B3CC1306-A122-46F6-8F67-30FBABA3B590 4 1 1.2.1.1
C3F27C8E-F96B-4498-A85F-E4FC8EA90ED7 4 2 1.2.1.2
This is how it should be looking for the information, the static string are the ones i don't know how to generate in order to get nulls when asking for a level that is not that deep.
Select top 1 VehicleGroupId as Region
from GroupHierarchy where GroupHierarchy.numericalmapping = '1'
Select top 1 VehicleGroupId as gz
from GroupHierarchy where GroupHierarchy.numericalmapping = '1.1'
Select top 1 VehicleGroupId as cedis
from GroupHierarchy where GroupHierarchy.numericalmapping = '1.1.1'

Instead of putting decimals between your hierarchy members rather use forwardslashes (1/2/3) and then you can use Microsoft SQL hiearchy data type and functions to easily join and retain the structure:
https://www.sqlshack.com/use-hierarchyid-sql-server/

Related

Checking integer lists under a column for each record to tag/classify each record (SQL) - reverse index match (or some other way)?

I have a problem I'm trying to resolve. I'd love to get your ideas or help with the method I have in my mind. I created a good analogy for it: I need to tag each record in a table if they're partial purchases or not. It is clearer if you can look at the example below:
I have products classified as Level 2 and Level 1. And each Level 1 product is associated with one or more level 2 products. I only need to check Level 1 purchases, because Level 2 ones can't be partial purchase.
Partial purchase: If a customer who purchased Level 1 product didn't purchase any of the associated Level 2 products, that purchase is partial.
It should be simple, but the columns I have hold the necessary information in a weird way.
Method I have in mind:
For each level 1 purchase
Get all the Level 2 purchases from that customer with a cte
Check the product dependency columns of those:
if this level 1 product's product ID is in any of those lists(!) it's not a partial purchase. If not it's partial.
This should work, but I can't get it to work in SQL.
My question:
How can I check the each element in each list in the entire column for all the Level 2 purchases from that customer? Because I don't have any way of associating these level 1 & level 2 purchases based on their product dependencies.
example image showing the columns etc. - first three records here should explain enough about the problem & the expected result
Any help is appreciated, I'm a bit slow on SQL
What we have is PostgreSQL through Redash
You can do it using unnest() function of postgresql. This function are converted array list to row for easily using.
I don't know type of your product_dependency field. If this field is not a int array so you can convert this field to array using string_to_array() function and after then use unnest().
I wrote an example for you:
select * from your_table t1
where
t1.product_level = 'Level 1' and
t1.product_id not in (select unnest(product_dep) from your_table where customer_id = t1.customer_id)
And result (gets only partial purchases):
id customer_id product_id product_level
3 B 1 Level 1
6 D 5 Level 1
I will write an alternative query for you:
Example:
select
t1.id, t1.customer_id, t1.product_id, t1.product_level,
case when product_level = 'Level 1' then (t1.product_id not in (select unnest(product_dep) from your_table where customer_id = t1.customer_id)) else false end as partial_purchase
from your_table t1
Result:
id customer_id product_id product_level partial_purchase
1 A 1 Level 1 false
2 A 2 Level 2 false
3 B 1 Level 1 true
4 C 3 Level 2 false
5 C 4 Level 1 false
6 D 5 Level 1 true
7 D 6 Level 2 false
You can use this query in your subquery for easily filtering like as:
where partial_purchase = true
or
where partial_purchase = false
or
where product_id = <id>

Calculating relative frequencies in SQL

I am working on a tag recommendation system that takes metadata strings (e.g. text descriptions) of an object, and splits it into 1-, 2- and 3-grams.
The data for this system is kept in 3 tables:
The "object" table (e.g. what is being described),
The "token" table, filled with all 1-, 2- and 3-grams found (examples below), and
The "mapping" table, which maintains associations between (1) and (2), as well as a frequency count for these occurrences.
I am therefore able to construct a table via a LEFT JOIN, that looks somewhat like this:
SELECT mapping.object_id, mapping.token_id, mapping.freq, token.token_size, token.token
FROM mapping LEFT JOIN
token
ON (mapping.token_id = token.id)
WHERE mapping.object_id = 1;
object_id token_id freq token_size token
+-----------+----------+------+------------+--------------
1 1 1 2 'a big'
1 2 1 1 'a'
1 3 1 1 'big'
1 4 2 3 'a big slice'
1 5 1 1 'slice'
1 6 3 2 'big slice'
Now I'd like to be able to get the relative probability of each term within the context of a single object ID, so that I can sort them by probability, and see which terms are most probably (e.g. ORDER BY rel_prob DESC LIMIT 25)
For each row, I'm envisioning the addition of a column which gives the result of freq/sum of all freqs for that given token_size. In the case of 'a big', for instance, that would be 1/(1+3) = 0.25. For 'a', that's 1/3 = 0.333, etc.
I can't, for the life of me, figure out how to do this. Any help is greatly appreciated!
If I understood your problem, here's the query you need
select
m.object_id, m.token_id, m.freq,
t.token_size, t.token,
cast(m.freq as decimal(29, 10)) / sum(m.freq) over (partition by t.token_size, m.object_id)
from mapping as m
left outer join token on m.token_id = t.id
where m.object_id = 1;
sql fiddle example
hope that helps

How to Order By within an Order By in SQL?

I'm trying to create a SQL statement which will recreate the hierarchical container/folder/test structure in SilkCentral Test Manager.
Test containers have no ParentID
Test folders contain a ParentID and IsLeaf = 0
Tests contain a ParentID and IsLeaf = 1
This Query results in all of the test containers, folders, and tests:
SELECT "NodeID", "ParentID", "Name", "IsLeaf", "OrderNumber"
FROM "Silk"."TM_TestPlanNodes" AS TPN
WHERE PROJECTID = 36
ORDER BY "ParentID", "OrderNumber", "IsLeaf"
Here are some of the Results:
NodeID ParentID Name IsLeaf OrderNumber
65408 Installation and Upgrades 0 0
65445 Connectivity 0 1
65448 Focus 0 2
65409 GINA / PLAP 0 3
65446 Graphical User Interface 0 4
71038 Login Properties 0 5
65449 Miscellaneous 0 6
70636 Net Firewall 0 7
70998 Software Updates 0 8
65447 Third-party Services 0 9
70805 SilkTest Automated Tests 0 10
68812 65408 0. Setup 0 0
65454 65408 1. Installations & Upgrades 0 1
65450 65408 Typical/Custom Installation 0 2
I would like this ordering instead:
The ParentID is sorted, but if there exists a Node with the ParentID=thePreviousNode'sID, then that is chosen next. If there are multiple of those nodes, they should be ordered by IsLeaf and then, OrderNumber.
How to accomplish this? I'm very limited in what I can do, because I think very complicated syntax will end up throwing errors in Silk. I was going to try a nested SELECT statement:
SELECT "NodeID", "ParentID", "Name","IsLeaf"
FROM "Silk"."TM_TestPlanNodes"
WHERE PROJECTID = '36'AND ParentID LIKE (
SELECT ParentID
FROM "Silk"."TM_TestPlanNodes"
WHERE NAME = 'Installation and Upgrades')
But this is getting this error: "Could not execute report query: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression."
This is why I'm fiddling with Order By.
You can use a recursive cte to create a hidden column and orderby that column. The hidden column should be Something like:
WITH cte (NodeID, ParentID, Name, IsLeaf, [Order])
AS
(
SELECT NodeID, ParentID, Name, IsLeaf, cast(NodeID as nvarchar(10))
FROM "Silk"."TM_TestPlanNodes"
WHERE PROJECTID = '36'
UNION ALL
SELECT "NodeID", "ParentID", "Name","IsLeaf", cast(leftNode.ParentID as nvarchar(10)) + cast(leftNode.NodeID as nvarchar(10))
FROM "Silk"."TM_TestPlanNodes" as leftNode
INNER JOIN cte on cte.NodeID = leftNode.ParentID
WHERE leftNode.ParentID = cte.NodeID
)
select "NodeID", "ParentID", "Name","IsLeaf" from cte
order by cast([Order] as nvarchar(50))
This was written in notepad so is possible to have some errors, but the idea is to make an [order] column that for example for 65530 would be 654086554569530 (the parent_parent, the parent and the node)
EDIT:
this only works if the ids are all 5 characters long, but from here you can make the proper tweaks.
Although it might not be a PERFECT fit, it is very close with a nested hierarchical representation of parent-child records in a self-joined list and incorporated proper ordering concerns. You may have to tweak it a bit for your table, but here's a link to a prior solution
To clarify that problem with the menu and the corresponding data.
id | parentid | name
1 | 0 | CatOne
2 | 0 | CatTwo
3 | 0 | CatThree
4 | 1 | SubCatOne
5 | 1 | SubCatOne2
6 | 3 | SubCatThree
Desired output
CatOne 1
--SubCatOne 4
--SubCatOne2 5
CatTwo 2
CatThree 3
--SubCatThree 6
The FIRST case is pre-grouping all the like ID's based on the parent... So, when the parent ID is 0, it IS the top-most level, so we keep it's ID. Then, any children under it, we want their respective PARENT IDs so all of the same are correctly pre-grouped.
The purpose of the SECOND group by is to force the entry that represents the actual TOP LEVEL menu item to the top of the list regardless of the child entries.
Say you have a table where IDs are already established, and you now add a new item into position ID = 7 for "New Top Level" and want to move ID #s 2 and 3 into the new "top-level section. If you just to the query with the first CASE, your records would be simulated returned as
ID Parent Name (natural order from the table)
2 7 CatTwo
3 7 CatThree
7 0 New Category. (we want THIS one in FIRST POSITION of the group)
As you can see, this would be a bad representation of the sub-grouping order. The top-level item actually is in the 3rd position... To bring it to the front, we are now sub-grouping and saying... if the Parent ID of the record = 0, then sort it as if it were a '1' priority. Anything else is considered a '2' priority and would simulate the result like
ID Parent Name SubPrioritySort
7 0 New Category. 1
2 7 CatTwo 2
3 7 CatThree 2
Since you are not actually returning these "CASE" values in your result query, you wouldn't otherwise visually see it... but for grins, add them as columns to your query to see the impact. Hopefully this clarified the answer for you.
In your question, you would obviously be able to add your sort order column to the basis of this query.

How to fetch categories and sub-categories in a single query in sql? (mysql)

I would like to know if it's possible to extract the categories and sub-categories in a single DB fetch.
My DB table is something similar to that shown below
table
cat_id parent_id
1 0
2 1
3 2
4 3
5 3
6 1
i.e. when the input is 3, then all the rows with parent_id as 3 AND the row 3 itself AND all the parents of row 3 should be fetched.
output
cat_id parent_id
3 2 -> The row 3 itself
4 3 -> Row with parent as 3
5 3 -> Row with parent as 3
2 1 -> 2 is the parent of row 3
1 0 -> 1 is the parent of row 2
Can this be done using stored procedures and loops? If so, will it be a single DB fetch or multiple? Or are there any other better methods?
Thanks!!!
If you asking about "Is there in mysql recursive queries?" answer "NO".
But there is very good approach to handle it.
Create helper table (saying CatHierarchy)
CatHierarchy:
SuperId, ChildId, Distance
------------------------------
1 1 0
1 2 1
2 2 0
This redundant data allows easily in 1 query to select any hierarchy, and in 2 insert support any hierarchy (deletion also performed in 1 query with help of delete cascade integrity).
So what does this mean. You track all path in hierarchy. Each node of Cat must add reference to itself (distance 0), then support duplication by adding redundant data about nodes are linked.
To select category with sub just write:
SELECT c.* from Category c inner join CatHierarchy ch ON ch.ChildId=c.cat_id
WHERE ch.SuperId = :someSpecifiedRootOfCat
someSpecifiedRootOfCat - is parameter to specify root of category
THATS ALL!
Theres a really good article about this on Sitepoint - look especially at Modified Preorder Tree Traversal
It's tricky. I assume you want to display categories, kind of like a folder view? Three fields: MainID, ParentID, Name... Apply to your table, and it should work like a charm. I think it's called a recursive query?
WITH CATEGORYVIEW (catid, parentid, categoryname) AS
(
SELECT catid, ParentID, cast(categoryname as varchar(255))
FROM [CATEGORIES]
WHERE isnull(ParentID,0) = 0
UNION ALL
SELECT C.catid, C.ParentID, cast(CATEGORYVIEW.categoryname+'/'+C.categoryname as varchar(255))
FROM [CATEGORIES] C
JOIN CATEGORYVIEW ON CATEGORYVIEW.catID = C.ParentID
)
SELECT * FROM CATEGORYVIEW ORDER BY CATEGORYNAME

SQL Recursive Tables

I have the following tables, the groups table which contains hierarchically ordered groups and group_member which stores which groups a user belongs to.
groups
---------
id
parent_id
name
group_member
---------
id
group_id
user_id
ID PARENT_ID NAME
---------------------------
1 NULL Cerebra
2 1 CATS
3 2 CATS 2.0
4 1 Cerepedia
5 4 Cerepedia 2.0
6 1 CMS
ID GROUP_ID USER_ID
---------------------------
1 1 3
2 1 4
3 1 5
4 2 7
5 2 6
6 4 6
7 5 12
8 4 9
9 1 10
I want to retrieve the visible groups for a given user. That it is to say groups a user belongs to and children of these groups. For example, with the above data:
USER VISIBLE_GROUPS
9 4, 5
3 1,2,4,5,6
12 5
I am getting these values using recursion and several database queries. But I would like to know if it is possible to do this with a single SQL query to improve my app performance. I am using MySQL.
Two things come to mind:
1 - You can repeatedly outer-join the table to itself to recursively walk up your tree, as in:
SELECT *
FROM
MY_GROUPS MG1
,MY_GROUPS MG2
,MY_GROUPS MG3
,MY_GROUPS MG4
,MY_GROUPS MG5
,MY_GROUP_MEMBERS MGM
WHERE MG1.PARENT_ID = MG2.UNIQID (+)
AND MG1.UNIQID = MGM.GROUP_ID (+)
AND MG2.PARENT_ID = MG3.UNIQID (+)
AND MG3.PARENT_ID = MG4.UNIQID (+)
AND MG4.PARENT_ID = MG5.UNIQID (+)
AND MGM.USER_ID = 9
That's gonna give you results like this:
UNIQID PARENT_ID NAME UNIQID_1 PARENT_ID_1 NAME_1 UNIQID_2 PARENT_ID_2 NAME_2 UNIQID_3 PARENT_ID_3 NAME_3 UNIQID_4 PARENT_ID_4 NAME_4 UNIQID_5 GROUP_ID USER_ID
4 2 Cerepedia 2 1 CATS 1 null Cerebra null null null null null null 8 4 9
The limit here is that you must add a new join for each "level" you want to walk up the tree. If your tree has less than, say, 20 levels, then you could probably get away with it by creating a view that showed 20 levels from every user.
2 - The only other approach that I know of is to create a recursive database function, and call that from code. You'll still have some lookup overhead that way (i.e., your # of queries will still be equal to the # of levels you are walking on the tree), but overall it should be faster since it's all taking place within the database.
I'm not sure about MySql, but in Oracle, such a function would be similar to this one (you'll have to change the table and field names; I'm just copying something I did in the past):
CREATE OR REPLACE FUNCTION GoUpLevel(WO_ID INTEGER, UPLEVEL INTEGER) RETURN INTEGER
IS
BEGIN
DECLARE
iResult INTEGER;
iParent INTEGER;
BEGIN
IF UPLEVEL <= 0 THEN
iResult := WO_ID;
ELSE
SELECT PARENT_ID
INTO iParent
FROM WOTREE
WHERE ID = WO_ID;
iResult := GoUpLevel(iParent,UPLEVEL-1); --recursive
END;
RETURN iResult;
EXCEPTION WHEN NO_DATA_FOUND THEN
RETURN NULL;
END;
END GoUpLevel;
/
Joe Cleko's books "SQL for Smarties" and "Trees and Hierarchies in SQL for Smarties" describe methods that avoid recursion entirely, by using nested sets. That complicates the updating, but makes other queries (that would normally need recursion) comparatively straightforward. There are some examples in this article written by Joe back in 1996.
I don't think that this can be accomplished without using recursion. You can accomplish it with with a single stored procedure using mySQL, but recursion is not allowed in stored procedures by default. This article has information about how to enable recursion. I'm not certain about how much impact this would have on performance verses the multiple query approach. mySQL may do some optimization of stored procedures, but otherwise I would expect the performance to be similar.
Didn't know if you had a Users table, so I get the list via the User_ID's stored in the Group_Member table...
SELECT GroupUsers.User_ID,
(
SELECT
STUFF((SELECT ',' +
Cast(Group_ID As Varchar(10))
FROM Group_Member Member (nolock)
WHERE Member.User_ID=GroupUsers.User_ID
FOR XML PATH('')),1,1,'')
) As Groups
FROM (SELECT User_ID FROM Group_Member GROUP BY User_ID) GroupUsers
That returns:
User_ID Groups
3 1
4 1
5 1
6 2,4
7 2
9 4
10 1
12 5
Which seems right according to the data in your table. But doesn't match up with your expected value list (e.g. User 9 is only in one group in your table data but you show it in the results as belonging to two)
EDIT: Dang. Just noticed that you're using MySQL. My solution was for SQL Server. Sorry.
-- Kevin Fairchild
There was already similar question raised.
Here is my answer (a bit edited):
I am not sure I understand correctly your question, but this could work My take on trees in SQL.
Linked post described method of storing tree in database -- PostgreSQL in that case -- but the method is clear enough, so it can be adopted easily for any database.
With this method you can easy update all the nodes depend on modified node K with about N simple SELECTs queries where N is distance of K from root node.
Good Luck!
I don't remember which SO question I found the link under, but this article on sitepoint.com (second page) shows another way of storing hierarchical trees in a table that makes it easy to find all child nodes, or the path to the top, things like that. Good explanation with example code.
PS. Newish to StackOverflow, is the above ok as an answer, or should it really have been a comment on the question since it's just a pointer to a different solution (not exactly answering the question itself)?
There's no way to do this in the SQL standard, but you can usually find vendor-specific extensions, e.g., CONNECT BY in Oracle.
UPDATE: As the comments point out, this was added in SQL 99.