Get hierarchical structure using SQL Server - sql

I have a self-referencing table with a primary key, id and a foreign key parent_id.
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PK | NULL | IDENTITY |
| parent_id | int(11) | YES | | NULL | |
| name | varchar(255) | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
The table looks like this (data reduced for clarity):
Table MySiteMap
Id Name parent_id
1 A NULL
2 B 1
3 C 1
4 D 1
20 B1 2
21 B2 2
30 C1 3
31 C2 3
40 D1 4
41 D2 4
I would like to get the hierarchical structure using a SQL Server query:
A
|
B
|
| B1
| B2
C
|
| C1
| C2
D
|
| D1
| D2
Any suggestions?

You can use Common Table Expressions.
WITH LeveledSiteMap(Id, Name, Level)
AS
(
SELECT Id, Name, 1 AS Level
FROM MySiteMap
WHERE Parent_Id IS NULL
UNION ALL
SELECT m.Id, m.Name, l.Level + 1
FROM MySiteMap AS m
INNER JOIN LeveledSiteMap AS l
ON m.Parent_Id = l.Id
)
SELECT *
FROM LeveledSiteMap
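
To try the recursion without a SQL Server instance, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It is an assumption-laden stand-in, not the answer's exact query: SQLite requires the RECURSIVE keyword, and the helper name leveled_site_map is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (Id, Name, Parent_Id)
ROWS = [
    (1, "A", None), (2, "B", 1), (3, "C", 1), (4, "D", 1),
    (20, "B1", 2), (21, "B2", 2), (30, "C1", 3), (31, "C2", 3),
    (40, "D1", 4), (41, "D2", 4),
]

def leveled_site_map(rows):
    """Return (Id, Name, Level) for every row, with the root at level 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MySiteMap (Id INTEGER PRIMARY KEY, Name TEXT, Parent_Id INTEGER)"
    )
    conn.executemany("INSERT INTO MySiteMap VALUES (?, ?, ?)", rows)
    query = """
        WITH RECURSIVE LeveledSiteMap(Id, Name, Level) AS (
            -- anchor member: rows without a parent
            SELECT Id, Name, 1 FROM MySiteMap WHERE Parent_Id IS NULL
            UNION ALL
            -- recursive member: children of rows already found
            SELECT m.Id, m.Name, l.Level + 1
            FROM MySiteMap AS m
            INNER JOIN LeveledSiteMap AS l ON m.Parent_Id = l.Id
        )
        SELECT Id, Name, Level FROM LeveledSiteMap ORDER BY Level, Id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

levels = leveled_site_map(ROWS)
```

The anchor member seeds the result with the root, and each pass of the recursive member adds the next level down until no new rows are produced.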

Use this:
;WITH CTE(Id, Name, parent_id, [Level], ord) AS (
SELECT
MySiteMap.Id,
CONVERT(nvarchar(255), MySiteMap.Name) AS Name,
MySiteMap.parent_id,
1,
CONVERT(nvarchar(255), MySiteMap.Id) AS ord
FROM MySiteMap
WHERE MySiteMap.parent_id IS NULL
UNION ALL
SELECT
MySiteMap.Id,
CONVERT(nvarchar(255), REPLICATE(' ', [Level]) + '|' + REPLICATE(' ', [Level]) + MySiteMap.Name) AS Name,
MySiteMap.parent_id,
CTE.[Level] + 1,
CONVERT(nvarchar(255),CTE.ord + CONVERT(nvarchar(255), MySiteMap.Id)) AS ord
FROM MySiteMap
JOIN CTE ON MySiteMap.parent_id =CTE.Id
WHERE MySiteMap.parent_id IS NOT NULL
)
SELECT Name
FROM CTE
ORDER BY ord
For this:
A
| B
| B1
| B2
| C
| C1
| C2
| D
| D1
| D2
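
As a rough illustration of what the ordered, indented output is meant to be, the sketch below does a depth-first walk in Python instead of SQL. The function name render_tree and the two-space indent are assumptions made for the example. Note that the query above sorts on a string built from concatenated ids, which happens to work for this sample data but could mis-order ids of different lengths; a tree walk avoids that.

```python
from collections import defaultdict

# Sample rows from the question: (Id, Name, parent_id)
ROWS = [
    (1, "A", None), (2, "B", 1), (3, "C", 1), (4, "D", 1),
    (20, "B1", 2), (21, "B2", 2), (30, "C1", 3), (31, "C2", 3),
    (40, "D1", 4), (41, "D2", 4),
]

def render_tree(rows, indent="  "):
    """Return one indented line per node, parents before their children."""
    children = defaultdict(list)
    for node_id, name, parent_id in rows:
        children[parent_id].append((node_id, name))

    lines = []

    def visit(parent_id, depth):
        # Visit siblings in id order, descending into each subtree first.
        for node_id, name in sorted(children[parent_id]):
            lines.append(indent * depth + name)
            visit(node_id, depth + 1)

    visit(None, 0)
    return lines

tree_lines = render_tree(ROWS)
print("\n".join(tree_lines))
```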

I started with my own query, but looking at it now, it is similar to Mark's.
I will add it anyway, since I also created a sqlfiddle with both my query and Mark's.
WITH tList (id,name,parent_id,nameLevel)
AS
(
SELECT t.id, t.name, t.parent_id, 1 AS nameLevel
FROM t as t
WHERE t.parent_id IS NULL
UNION ALL
SELECT tnext.id, tnext.name, tnext.parent_id, tList.nameLevel + 1
FROM t AS tnext
INNER JOIN tList AS tlist
ON tnext.parent_id = tlist.id
)
SELECT id,name,isnull(parent_id,0) 'parent_id',nameLevel FROM tList order by nameLevel;
A good blog:
SQL Query – How to get data in Hierarchical Structure?

I know that changing the structure of a table is always a critical operation, but since SQL Server 2008 introduced the HierarchyId data type I really like working with it. Maybe have a look at:
http://www.codeproject.com/Articles/37171/HierarchyID-Data-Type-in-SQL-Server
http://www.codeproject.com/Tips/740553/Hierarchy-ID-in-SQL-Server
I am sure you will quickly understand how to use this data type and its functions. SQL code that uses it is often more structured and can perform better than recursive CTEs.

Related

SQL Server recursive query to show path of parents

I am working with SQL Server statements and have one table like:
| item | value | parentItem |
+------+-------+------------+
| 1 | 2test | 2 |
| 2 | 3test | 3 |
| 3 | 4test | 4 |
| 5 | 1test | 1 |
| 6 | 3test | 3 |
| 7 | 2test | 2 |
And I would like to get the below result using a SQL Server statement:
| item1 | value1 |
+-------+--------------------------+
| 1 | /4test/3test/2test |
| 2 | /4test/3test |
| 3 | /4test |
| 5 | /4test/3test/2test/1test |
| 6 | /4test/3test |
| 7 | /4test/3test/2test |
I didn't figure out the correct SQL to get all the values for all the ids according to parentItem.
I have tried this SQL :
with all_path as
(
select item, value, parentItem
from table
union all
select a.item, a.value, a.parentItem
from table a, all_path b
where a.item = b.parentItem
)
select
item as item1,
stuff(select '/' + value
from all_path
order by item asc
for xml path ('')), 1, 0, '') as value1
from
all_path
But got the "value1" column in result like
/4test/4test/4test/3test/3test/3test/3test/2test/2test/2test/2test
Could you please help me with that? Thanks a lot.
Based on the expected output you gave, use the recursive part to concatenate the values:
;with yourTable as (
select item, value, parentItem
from (values
(1,'2test',2)
,(2,'3test',3)
,(3,'4test',4)
,(5,'1test',1)
,(6,'3test',3)
,(7,'2test',2)
)x (item,value,parentItem)
)
, DoRecursivePart as (
select 1 as Pos, item, convert(varchar(max),value) value, parentItem
from yourTable
union all
select drp.pos +1, drp.item, convert(varchar(max), yt.value + '/' + drp.value), yt.parentItem
from yourTable yt
inner join DoRecursivePart drp on drp.parentItem = yt.item
)
select drp.item, '/' + drp.value
from DoRecursivePart drp
inner join (select item, max(pos) mpos
from DoRecursivePart
group by item) [filter] on [filter].item = drp.item and [filter].mpos = drp.Pos
order by item
gives
item value
----------- ------------------
1 /4test/3test/2test
2 /4test/3test
3 /4test
5 /4test/3test/2test/1test
6 /4test/3test
7 /4test/3test/2test
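
The logic of the recursive part can also be sketched procedurally: start at each item, keep following parentItem until it points outside the table, and prepend each ancestor's value. The Python below is only an illustration of that walk; the name parent_paths and the cycle guard are additions that are not part of the SQL answer.

```python
# Sample rows from the question: (item, value, parentItem)
ROWS = [
    (1, "2test", 2), (2, "3test", 3), (3, "4test", 4),
    (5, "1test", 1), (6, "3test", 3), (7, "2test", 2),
]

def parent_paths(rows):
    """Map each item to its '/root/.../own value' path."""
    by_item = {item: (value, parent) for item, value, parent in rows}
    paths = {}
    for item in by_item:
        parts = []
        current = item
        seen = set()  # guard against accidental cycles in the data
        while current in by_item and current not in seen:
            seen.add(current)
            value, parent = by_item[current]
            parts.append(value)
            current = parent
        # parts runs from the item up to the root, so reverse it
        paths[item] = "/" + "/".join(reversed(parts))
    return paths

paths = parent_paths(ROWS)
```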
Here's the sample data
drop table if exists dbo.test_table;
go
create table dbo.test_table(
item int not null,
[value] varchar(100) not null,
parentItem int not null);
insert dbo.test_table values
(1,'test1',2),
(2,'test2',3),
(3,'test3',4),
(5,'test4',1),
(6,'test5',3),
(7,'test6',2);
Here's the query
;with recur_cte(item, [value], parentItem, h_level) as (
select item, [value], parentItem, 1
from dbo.test_table tt
union all
select rc.item, tt.[value], tt.parentItem, rc.h_level+1
from dbo.test_table tt join recur_cte rc on tt.item=rc.parentItem)
select rc.item,
stuff((select '/' + cast(parentItem as varchar)
from recur_cte c2
where rc.item = c2.item
order by h_level desc FOR XML PATH('')), 1, 1, '') [value1]
from recur_cte rc
group by item;
Here's the results
item value1
1 4/3/2
2 4/3
3 4
5 4/3/2/1
6 4/3
7 4/3/2

SQL join tables and CASE with CONCAT

I'm a SQL newb... and I need to join two tables (see below)
Table A
| id | Recipe_Web_Codes |
|----|--------------------|
| 1 | GF VGT |
| 2 | |
| 3 | VGN |
Table B
| id | Recipe_Web_Code | Webcode_Fullname | Color |
|----|-----------------|------------------------|---------|
| 1 | VGT | Vegetarian | #ff6038 |
| 2 | VGN | Vegan Friendly | #97002d |
| 3 | GF | Gluten Friendly | #6ca4b6 |
and produce the following table:
RESULT
| id | Recipe_Web_Codes | Webcode_Fullname | Color |
|----|------------------|-----------------------------|------------------|
| 1 | GF VGT | Gluten Friendly, Vegetarian | #6ca4b6, #ff6038 |
| 2 | | | |
| 3 | VGN | Vegan Friendly | #97002d |
I honestly don't know where to begin. I tried this but got stuck on how to concatenate case results into a single field. Am I on the right track?
select Recipe_Web_Codes, Webcode_Fullname =
case
when Recipe_Web_Codes like '%VGT%' then (select Webcode_Fullname FROM TABLE_B where Recipe_Web_Code = 'VGT')
when Recipe_Web_Codes like '%VGN%' then (select Webcode_Fullname FROM TABLE_B where Recipe_Web_Code = 'VGN')
when Recipe_Web_Codes like '%GF%' then (select Webcode_Fullname FROM TABLE_B where Recipe_Web_Code = 'GF')
end,
Color =
case
when Recipe_Web_Codes like '%VGT%' then (select Color FROM TABLE_B where Recipe_Web_Code = 'VGT')
when Recipe_Web_Codes like '%VGN%' then (select Color FROM TABLE_B where Recipe_Web_Code = 'VGN')
when Recipe_Web_Codes like '%GF%' then (select Color FROM TABLE_B where Recipe_Web_Code = 'GF')
end
from TABLE_A
EDIT: It just occurred to me that I left out a very important point about why I need to aggregate these strings. The resulting table is going to be exported to JSON by a separate process, so there is no point in bringing up database normalization. Also, this is SQL Server 2016 SP2, so I don't have the STRING_AGG function available.
You should not be storing multiple values in a single column, so I would advise you to fix your data model.
That said, you can do what you want by pulling apart the strings and re-aggregating:
select a.*, b.*
from a outer apply
(select string_agg(b.Webcode_Fullname, ',') as Webcode_Fullname,
string_agg(b.Color, ',') as Colors
from string_split(a.recipe_web_codes, ' ') s join
b
on s.value = b.Recipe_Web_Code
) b;
Very important: string_split() does not guarantee the ordering of the values. If the ordering of the resulting strings is important, you can handle this -- assuming you have no duplicates -- by using logic such as:
select a.*, b.*
from a outer apply
(select string_agg(b.Webcode_Fullname, ',') within group (order by charindex(b.Recipe_Web_Code, a.recipe_web_codes)) as Webcode_Fullname,
string_agg(b.Color, ',') within group (order by charindex(b.Recipe_Web_Code, a.recipe_web_codes)) as Colors
from string_split(a.recipe_web_codes, ' ') s join
b
on s.value = b.Recipe_Web_Code
) b;
Let me emphasize again that you should put your effort into fixing your data model, by having a separate table for the recipe web codes, with one row per code.
EDIT:
One solution for older versions is a recursive CTE:
with bs as (
select b.*, row_number() over (order by id) as seqnum
from b
),
cte as (
select a.id, convert(varchar(max), Recipe_Web_Codes) as fullnames, convert(varchar(max), Recipe_Web_Codes) as colors, 1 as ind
from a
union all
select cte.id,
replace(cte.fullnames, bs.Recipe_Web_Code, bs.Webcode_Fullname),
replace(cte.colors, bs.Recipe_Web_Code, bs.color),
1 + cte.ind
from cte join
bs
on cte.ind = bs.seqnum and ind < 10
)
select cte.id, cte.fullnames, cte.colors
from (select cte.*, max(ind) over (partition by id) as max_ind
from cte
) cte
where ind = max_ind ;
Here is a db<>fiddle.
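
For comparison, the split, look up, and re-aggregate idea is easy to see outside SQL. The Python sketch below is only an illustration under assumed names (expand_codes, RECIPES, CODES); it keeps the codes in the order they appear in Recipe_Web_Codes, which is the ordering concern raised above.

```python
# Table A: (id, Recipe_Web_Codes); Table B: (Recipe_Web_Code, Webcode_Fullname, Color)
RECIPES = [(1, "GF VGT"), (2, ""), (3, "VGN")]
CODES = [
    ("VGT", "Vegetarian", "#ff6038"),
    ("VGN", "Vegan Friendly", "#97002d"),
    ("GF", "Gluten Friendly", "#6ca4b6"),
]

def expand_codes(recipes, codes):
    """Return (id, codes, full names, colors) with names and colors comma-joined."""
    lookup = {code: (name, color) for code, name, color in codes}
    result = []
    for recipe_id, web_codes in recipes:
        # Split on whitespace; an empty or missing string yields no tokens.
        tokens = (web_codes or "").split()
        matched = [lookup[token] for token in tokens if token in lookup]
        names = ", ".join(name for name, _ in matched)
        colors = ", ".join(color for _, color in matched)
        result.append((recipe_id, web_codes, names, colors))
    return result

expanded = expand_codes(RECIPES, CODES)
```

Matching on whole tokens also avoids the substring problem a plain LIKE '%GF%' test would have if one code were contained in another.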
You can do it with a left join of the tables and the aggregate function string_agg() (works in SQL Server 2017+):
select a.id, a.Recipe_Web_Codes,
string_agg(b.Webcode_Fullname, ', ') Webcode_Fullname,
string_agg(b.Color, ', ') Color
from Table_A a left join Table_B b
on ' ' + a.Recipe_Web_Codes + ' ' like '% ' + b.Recipe_Web_Code + ' %'
group by a.id, a.Recipe_Web_Codes
order by a.id
See the demo.
Results:
> id | Recipe_Web_Codes | Webcode_Fullname | Color
> -: | :--------------- | :-------------------------- | :---------------
> 1 | GF VGT | Vegetarian, Gluten Friendly | #ff6038, #6ca4b6
> 2 | null | null | null
> 3 | VGN | Vegan Friendly | #97002d

Recursive SQL query to find all matching identifiers

I have a table with following structure
CREATE TABLE Source
(
[ID1] INT,
[ID2] INT
);
INSERT INTO Source ([ID1], [ID2])
VALUES (1, 2), (2, 3), (4, 5),
(2, 5), (6, 7)
Example of Source and Result tables:
The Source table stores which id matches which other id. From the diagram it can be seen that 1, 2, 3, 4, 5 are identical, and that 6 and 7 are identical. I need a SQL query to get a Result table with all matches between ids.
I found this item on the site - Recursive query in SQL Server
similar to my task, but with a different result.
I tried to edit the code for my task, but it does not work. "The statement terminated. The maximum recursion 100 has been exhausted before statement completion."
;WITH CTE
AS
(
SELECT DISTINCT
M1.ID1,
M1.ID1 as ID2
FROM Source M1
LEFT JOIN Source M2
ON M1.ID1 = M2.ID2
WHERE M2.ID2 IS NULL
UNION ALL
SELECT
C.ID2,
M.ID1
FROM CTE C
JOIN Source M
ON C.ID1 = M.ID1
)
SELECT * FROM CTE ORDER BY ID1
Thanks a lot for the help!
This is a challenging question. You are trying to walk through a graph in two directions. There are two key ideas:
Add "reverse" edges, so the graph behaves like a digraph but with edges in both directions.
Keep a list of edges that have been visited. In SQL Server, strings are one method.
So:
with s as (
select id1, id2 from source
union -- on purpose
select id2, id1 from source
),
cte as (
select s.id1, s.id2, ',' + cast(s.id1 as varchar(max)) + ',' + cast(s.id2 as varchar(max)) + ',' as ids
from s
union all
select cte.id1, s.id2, ids + cast(s.id2 as varchar(max)) + ','
from cte join
s
on cte.id2 = s.id1
where cte.ids not like '%,' + cast(s.id2 as varchar(max)) + ',%'
)
select *
from cte
order by 1, 2;
Here is a db<>fiddle.
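
The two ideas (reverse edges plus a record of what has been visited) can be sketched in Python to make the recursion easier to follow. This is an illustration rather than a translation of the query: the function name simple_paths is made up, and it tracks visited nodes in a tuple instead of a delimited string.

```python
# Sample edges from the question: (ID1, ID2)
EDGES = [(1, 2), (2, 3), (4, 5), (2, 5), (6, 7)]

def simple_paths(edges):
    """Return every cycle-free path of two or more nodes, as sorted tuples."""
    neighbours = {}
    for a, b in edges:
        # Store each edge in both directions, like the UNION of reversed pairs.
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Anchor step: every directed edge is a path of length one.
    frontier = [(a, b) for a in neighbours for b in neighbours[a]]
    found = []
    while frontier:
        path = frontier.pop()
        found.append(path)
        # Recursive step: extend the path, skipping nodes already on it.
        for nxt in neighbours[path[-1]]:
            if nxt not in path:
                frontier.append(path + (nxt,))
    return sorted(found)

all_paths = simple_paths(EDGES)
pairs = sorted({(p[0], p[-1]) for p in all_paths})
```

Each path keeps its starting node, so the first and last elements give the matching pair.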
Since all node connections are bidirectional - add reversed relations to the original list
Find all possible paths from each node; almost usual recursion, the only difference is - we need to keep root id1
Avoid cycles - we need to be aware of it because we don't have directions
source:
;with src as(
select id1, id2 from source
union
-- reversed connections
select id2, id1 from source
), rec as (
select id1, id2, CAST(CONCAT('/', src.id1, '/', src.id2, '/') as varchar(8000)) path
from src
union all
-- keep the root id1 from the start of each path
select rec.id1, src.id2, CAST(CONCAT(rec.path, src.id2, '/') as varchar(8000))
from rec
-- usual recursion
inner join src on src.id1 = rec.id2
-- avoid cycles
where rec.path not like CONCAT('%/', src.id2, '/%')
)
select id1, id2, path
from rec
order by 1, 2
output
| id1 | id2 | path |
|-----|-----|-----------|
| 1 | 2 | /1/2/ |
| 1 | 3 | /1/2/3/ |
| 1 | 4 | /1/2/5/4/ |
| 1 | 5 | /1/2/5/ |
| 2 | 1 | /2/1/ |
| 2 | 3 | /2/3/ |
| 2 | 4 | /2/5/4/ |
| 2 | 5 | /2/5/ |
| 3 | 1 | /3/2/1/ |
| 3 | 2 | /3/2/ |
| 3 | 4 | /3/2/5/4/ |
| 3 | 5 | /3/2/5/ |
| 4 | 1 | /4/5/2/1/ |
| 4 | 2 | /4/5/2/ |
| 4 | 3 | /4/5/2/3/ |
| 4 | 5 | /4/5/ |
| 5 | 1 | /5/2/1/ |
| 5 | 2 | /5/2/ |
| 5 | 3 | /5/2/3/ |
| 5 | 4 | /5/4/ |
| 6 | 7 | /6/7/ |
| 7 | 6 | /7/6/ |
http://sqlfiddle.com/#!18/76114/13
source table will contain about 100,000 records
There is nothing that can help you with this. The task is unpleasant - finding all possible connections. Almost CROSS JOIN. With even more connections in the end.
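
If the work can be moved out of SQL, one way to keep the grouping step cheap is a union-find (disjoint set) structure, which is a different technique from the recursive CTEs above and is offered here only as a sketch. The names connected_groups and all_matches are hypothetical. Finding the groups is fast, although listing every pair inside a large group still grows quadratically with the group size.

```python
from itertools import permutations

# Sample edges from the question: (ID1, ID2)
EDGES = [(1, 2), (2, 3), (4, 5), (2, 5), (6, 7)]

def connected_groups(pairs):
    """Group ids that are linked directly or indirectly."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())

def all_matches(pairs):
    """Return every ordered pair of distinct ids that share a group."""
    matches = []
    for group in connected_groups(pairs):
        matches.extend(permutations(group, 2))
    return sorted(matches)

groups = connected_groups(EDGES)
matches = all_matches(EDGES)
```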
Looks like I came up with a similar answer as the other posters. My approach was to insert the existing value pairs, and then insert the reverse of each pair.
Once you expand the list of value pairs, you can traverse the table to find all the pairs.
CREATE TABLE #Source
([ID1] int, [ID2] int);
INSERT INTO #Source
(
[ID1]
,[ID2]
)
VALUES
(1, 2)
,(2, 3)
,(4, 5)
,(2, 5)
,(6, 7)
INSERT INTO #Source
(
[ID1]
,[ID2]
)
SELECT
[ID2]
,[ID1]
FROM #Source
;WITH expanded AS
(
SELECT DISTINCT
ID1 = s1.ID1
,ID2 = s1.ID2
FROM #Source s1
LEFT JOIN #Source s2 ON s1.ID2 = s2.ID1
UNION
SELECT DISTINCT
ID1 = s1.ID1
,ID2 = s2.ID2
FROM #Source s1
LEFT JOIN #Source s2 ON s1.ID2 = s2.ID1
WHERE s1.ID1 <> s2.ID2
)
,recur AS
(
SELECT DISTINCT
e1.ID1
,e1.ID2
FROM expanded e1
LEFT JOIN expanded e2 ON e1.ID2 = e2.ID1
WHERE e1.ID1 <> e1.ID2
UNION ALL
SELECT DISTINCT
e1.ID1
,e2.ID2
FROM expanded e1
INNER JOIN expanded e2 ON e1.ID2 = e2.ID1
WHERE e1.ID1 <> e2.ID2
)
SELECT DISTINCT
ID1, ID2
FROM recur
ORDER BY ID1, ID2
DROP TABLE #Source
This is a way to get that output by brute force, but may not be the best solution with a different/larger data set:
select sub1.rnk as ID1
,sub2.rnk as ID2
from
(
select a.*
,rank() over (partition by 1 order by id1, id2) as RNK
from source a
) sub1
cross join
(
select a.*
,rank() over (partition by 1 order by id1, id2) as RNK
from source a
) sub2
where sub1.rnk <> sub2.rnk
union all
select id1 as ID1
,id2 as ID2
from source
where id1 = 6
union all
select id2 as ID1
,id1 as ID2
from source
where id1 = 6;

Redshift create all the combinations of any length for the values in one column

How can we create all the combinations of any length for the values in one column and return the distinct count of another column for that combination?
Table:
+------+--------+
| Type | Name |
+------+--------+
| A | Tom |
| A | Ben |
| B | Ben |
| B | Justin |
| C | Ben |
+------+--------+
Output Table:
+-------------+-------+
| Combination | Count |
+-------------+-------+
| A | 2 |
| B | 2 |
| C | 1 |
| AB | 3 |
| BC | 2 |
| AC | 2 |
| ABC | 3 |
+-------------+-------+
When the combination is only A, there are Tom and Ben so it's 2.
When the combination is only B, 2 distinct names so it's 2.
When the combination is A and B, 3 distinct names: Tom, Ben, Justin so it's 3.
I'm working in Amazon Redshift. Thank you!
NOTE: This answers the original version of the question which was tagged Postgres.
You can generate all combinations with this code
with recursive td as (
select distinct type
from t
),
cte as (
select td.type, td.type as lasttype, 1 as len
from td
union all
select cte.type || td.type, td.type as lasttype, cte.len + 1
from cte join
td
on td.type > cte.lasttype
)
select type, len
from cte;
You can then use this in a join:
with recursive td as (
select distinct type
from t
),
cte as (
select td.type, td.type as lasttype, 1 as len
from td
union all
select cte.type || td.type, td.type as lasttype, cte.len + 1
from cte join
td
on td.type > cte.lasttype
)
-- a name counts toward a combination if any of its types appears in it
select cte.type as combination, count(distinct t.name) as cnt
from cte join
t
on cte.type like '%' || t.type || '%'
group by cte.type
order by cte.type;
There is no way to generate all possible combinations (A, B, C, AB, AC, BC, etc) in Amazon Redshift.
(Well, you could select each unique value, smoosh them into one string, send it to a User-Defined Function, extract the result into multiple rows and then join it against a big query, but that really isn't something you'd like to attempt.)
One approach would be to create a table containing all possible combinations; you'd need to write a little program to do that (e.g. using itertools in Python). Then you could join the data against that table reasonably easily to get the desired result (e.g. with a condition like 'ABC' LIKE '%A%').
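
A minimal sketch of that little program, assuming the rows have already been fetched into Python as (Type, Name) pairs; the function name combination_counts is made up for illustration. It uses itertools.combinations to enumerate every non-empty subset of types and counts the distinct names covered by each subset.

```python
from itertools import combinations

# Sample rows from the question: (Type, Name)
ROWS = [("A", "Tom"), ("A", "Ben"), ("B", "Ben"), ("B", "Justin"), ("C", "Ben")]

def combination_counts(rows):
    """Map each combination of types to the number of distinct names it covers."""
    names_by_type = {}
    for type_, name in rows:
        names_by_type.setdefault(type_, set()).add(name)

    types = sorted(names_by_type)
    counts = {}
    for size in range(1, len(types) + 1):
        for combo in combinations(types, size):
            # Union of the name sets of every type in this combination.
            names = set().union(*(names_by_type[t] for t in combo))
            counts["".join(combo)] = len(names)
    return counts

counts = combination_counts(ROWS)
```

The number of combinations doubles with every additional type, so this is only practical when the number of distinct types is small.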

Oracle SQL Get unique symbols from table

I have a table with descriptions of something. For example:
My_Table
id description
================
1 ABC
2 ABB
3 OPAC
4 APEЧ
I need to get all unique symbols from all "description" columns.
Result should look like that:
symbol
================
A
B
C
O
P
E
Ч
And it should work for all languages, so, as far as I can see, regular expressions can't help.
Please help me. Thanks.
with cte (c,description_suffix) as
(
select substr(description,1,1)
,substr(description,2)
from mytable
where description is not null
union all
select substr(description_suffix,1,1)
,substr(description_suffix,2)
from cte
where description_suffix is not null
)
select c
,count(*) as cnt
from cte
group by c
order by c
or
with cte(n) as
(
select level
from dual
connect by level <= (select max(length(description)) from mytable)
)
select substr(t.description,c.n,1) as c
,count(*) as cnt
from mytable t
join cte c
on c.n <= length(description)
group by substr(t.description,c.n,1)
order by c
+---+-----+
| C | CNT |
+---+-----+
| A | 4 |
| B | 3 |
| C | 2 |
| E | 1 |
| O | 1 |
| P | 2 |
| Ч | 1 |
+---+-----+
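
As a cross-check on the expected counts, the same character-by-character split can be sketched in a few lines of Python. This is only an illustration with an assumed function name (symbol_counts); it works on Unicode code points, so non-Latin letters such as Ч are handled like any other character.

```python
from collections import Counter

# Sample descriptions from the question
DESCRIPTIONS = ["ABC", "ABB", "OPAC", "APEЧ"]

def symbol_counts(descriptions):
    """Count how often each character occurs across all descriptions."""
    counts = Counter()
    for description in descriptions:
        if description:  # skip NULL-like or empty values
            counts.update(description)
    return counts

counts = symbol_counts(DESCRIPTIONS)
symbols = sorted(counts)
```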
Create a numbers table and populate it with all the relevant ids you'd need (in this case 1..max length of string), then pick out the character at each position:
SELECT DISTINCT
SUBSTR(your_table.description, numbers.id, 1) AS symbol
FROM
your_table
INNER JOIN
numbers
ON numbers.id >= 1
AND numbers.id <= LENGTH(your_table.description)
SELECT DISTINCT SUBSTR(ll, LEVEL, 1) AS symbol -- one row per distinct character
FROM
(
-- LISTAGG concatenates every description into a single string with no separator
SELECT LISTAGG(description, '')
WITHIN GROUP (ORDER BY id) ll
FROM My_Table
)
CONNECT BY LEVEL <= LENGTH(ll);