Minimizing a graph with SQL - sql

Suppose we have a directed graph defined as follows:
node | neighbor
-----------------
1 | 2
1 | 3
2 | 4
2 | 3
3 | 4
The table above defines only the edges between nodes; the pair (1,2), for example, means that there is an edge from node 1 to node 2. Here is a plot of the graph.
I also have a table containing the transitive closure of the graph. It holds every possible path in the graph; for example, (1,3) appears twice because node 3 can be reached from node 1 either directly or via the path 1=>2=>3. Here is the table:
node | neighbor
-----------------
1 | 2
1 | 3
2 | 4
2 | 3
3 | 4
1 | 3
1 | 4
1 | 4
2 | 4
From these two tables, I want to return a minimized graph without losing any reachability. One idea is to keep only the edges that are not implied by combining the two tables. For example:
(1,2) is in the first table and (2,3) is in the second, so (1,3) can be deleted from the first table, because node 3 can still be reached from node 1 by passing through node 2.
The output table should then look like this:
node | neighbor
-----------------
1 | 2
2 | 3
3 | 4
How can I write an SQL query that does this?

Here is one approach:
with recursive cte as (
    select node, neighbor, 1 is_initial from graph
    union all
    select c.node, g.neighbor, 0
    from cte c
    inner join graph g on g.node = c.neighbor
)
select node, neighbor
from graph g
where not exists (
    select 1
    from cte c
    where c.node = g.node and c.neighbor = g.neighbor and c.is_initial = 0
)
order by node, neighbor
This uses the first table only (I called it graph). We start by generating all possible paths with a recursive query. This is quite similar to your closure table, but with one extra column, is_initial, that indicates whether the path comes from the original table, or was generated during a further iteration.
Then, all that is left to do is filter the graph to remove tuples that match a "non-initial" path.
Demo on DB Fiddle:
node | neighbor
---: | -------:
1 | 2
2 | 3
3 | 4
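To try the query above locally, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The table name graph comes from the answer; the choice of SQLite and the variable names are assumptions made for illustration. The sketch also assumes the graph is acyclic, since the recursion would not stop on a cycle.

```python
import sqlite3

# In-memory database holding the edge list from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE graph (node INTEGER, neighbor INTEGER);
INSERT INTO graph VALUES (1, 2), (1, 3), (2, 4), (2, 3), (3, 4);
""")

query = """
WITH RECURSIVE cte AS (
    -- direct edges are flagged as initial
    SELECT node, neighbor, 1 AS is_initial FROM graph
    UNION ALL
    -- every extension of a path by one more edge is flagged as non-initial
    SELECT c.node, g.neighbor, 0
    FROM cte AS c
    JOIN graph AS g ON g.node = c.neighbor
)
SELECT node, neighbor
FROM graph AS g
WHERE NOT EXISTS (
    -- drop an edge when a longer path connects the same two nodes
    SELECT 1
    FROM cte AS c
    WHERE c.node = g.node AND c.neighbor = g.neighbor AND c.is_initial = 0
)
ORDER BY node, neighbor
"""

minimized = conn.execute(query).fetchall()
print(minimized)
```

The edges (1,3) and (2,4) are dropped because a longer path connects the same endpoints.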

Related

Flatten tree structure represented in SQL [duplicate]

I'm using an engineering calculation package and trying to extract some information from it with a built-in reporting tool that allows SQL queries.
An abbreviated example of the SQL table is as follows:
Id | Description | Ref
---|---------------------
1 | system 1 |
3 | block 4 | 6
3 | block 4 | 1
5 | formula1 | 3
6 | f |
7 | something | 1
9 | cheese | 5
The "Ref" column identifies rows that are subrecords of other items.
What I want to do is run a query that produces a list showing all the items that appear on each page. As you can see from the table above, "Id" is not a unique key; each item can appear in multiple locations within the table. In the example above:
ID 5 is a subitem of ID 3
ID 3 is a subitem of ID 1 AND ID 6
ID 1 and ID 6 aren't subitems of anything
So effectively it is representing a tree structure:
ID 1
+---- ID 7
+---- ID 3
      +---- ID 5
            +---- ID 9
ID 6
+---- ID 3
      +---- ID 5
            +---- ID 9
What I'm hoping to do is work out which items appear under each top-level item (so the end result should be a table where only top-level items appear in the "Ref" column):
Id | Description | Ref
---|---------------------
1 | system 1 |
3 | block 4 | 6
3 | block 4 | 1
5 | formula1 | 1
5 | formula1 | 6
6 | f |
9 | cheese | 1
9 | cheese | 6
7 | something | 1
The tree structure can be a total of 5 levels deep
I've been trying to use left joins to build up a list of page references, but I think I'm also going to need to union results tables (because obviously rows like ID=9, ID=5, and ID = 6 have to be duplicated in the final results set). It starts to get a bit messy!
WITH A
AS (SELECT *
    FROM [RbdBlocks]),
B
AS (SELECT [x].[Id],
           [x].[Description],
           [x].[Page] AS Page1,
           [y].[Page] AS Page2
    FROM A AS x
    LEFT OUTER JOIN
    A AS y
    ON y.Id = x.Page)
SELECT *
FROM B
The above gives me some of the nested references, but I'm not sure if there's a better way to get this data together, and to manage the recursion rather than just duplicating the set of queries 4 times?
Have a look at Recursive Common Table Expressions (CTEs). They should be able to accomplish exactly what you need.
Have a look at Example D on the SQL Docs page.
Basically what you'd do in your case is:
In the "anchor member" of the CTE, select all top-level items
In the "recursive member" of the CTE, join all of the nested children to the top-level item
Recursive CTEs are not really trivial to understand, so be sure to read the docs carefully.
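As a rough illustration of the anchor/recursive split described above, here is a sketch run against SQLite through Python's sqlite3 module. The table name RbdBlocks comes from the question and the columns follow its example table; the CTE name tree, the column TopRef, and the use of SQLite are assumptions. SQL Server, for instance, omits the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RbdBlocks (Id INTEGER, Description TEXT, Ref INTEGER);
INSERT INTO RbdBlocks VALUES
    (1, 'system 1', NULL), (3, 'block 4', 6), (3, 'block 4', 1),
    (5, 'formula1', 3), (6, 'f', NULL), (7, 'something', 1),
    (9, 'cheese', 5);
""")

query = """
WITH RECURSIVE tree (Id, Description, TopRef) AS (
    -- anchor member: top-level items have no parent reference
    SELECT Id, Description, NULL FROM RbdBlocks WHERE Ref IS NULL
    UNION ALL
    -- recursive member: attach children and carry the top-level Id down
    SELECT b.Id, b.Description, COALESCE(t.TopRef, t.Id)
    FROM RbdBlocks AS b
    JOIN tree AS t ON b.Ref = t.Id
)
SELECT DISTINCT Id, Description, TopRef
FROM tree
ORDER BY Id, TopRef
"""

flattened = conn.execute(query).fetchall()
for row in flattened:
    print(row)
```

Each nested item is listed once per top-level item it sits under, which matches the nine rows of the desired result in the question.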

repeating / duplicating query entries based on a table value

Related to / copied from this PostgreSQL topic: so-link
Let's say I have a table with two rows
id | value |
----+-------+
1 | 2 |
2 | 3 |
I want to write a query that will duplicate (repeat) each row based on
the value. I want this result (5 rows total):
id | value |
----+-------+
1 | 2 |
1 | 2 |
2 | 3 |
2 | 3 |
2 | 3 |
How is this possible in SQL Anywhere (Sybase SQL)?
The easiest way to do this is to have a numbers table . . . one that generates integers. Perhaps you have one handy. There are other ways. For instance, using a recursive CTE:
with numbers as (
    select 1 as n
    union all
    select n + 1
    from numbers
    where n < 100
)
select t.*
from yourtable t join
     numbers n
     on n.n <= t.value;
Not all versions of Sybase necessarily support recursive CTEs. There are other ways to generate such a table, or you might already have one handy.
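The numbers-table join can be checked quickly with SQLite through Python's sqlite3 module. This is only a sketch: SQLite stands in for SQL Anywhere here, so the exact recursive CTE syntax on Sybase may differ, and the table name yourtable is the placeholder used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourtable (id INTEGER, value INTEGER);
INSERT INTO yourtable VALUES (1, 2), (2, 3);
""")

query = """
WITH RECURSIVE numbers (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 100   -- upper bound on repetitions
)
SELECT t.id, t.value
FROM yourtable AS t
JOIN numbers AS n ON n.n <= t.value            -- one output row per n up to value
ORDER BY t.id
"""

repeated_rows = conn.execute(query).fetchall()
print(repeated_rows)
```

The bound of 100 caps how many times a row can be repeated; a row with a larger value would be silently truncated to 100 copies.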

Oracle Spatial Objects to Vertex List

I am struggling to convert a series of Oracle SDOs (polygons specifically) into a more usable format.
I have data that is in this format:
PolygonID | Polygon
1 | SDO Geometry
2 | SDO Geometry
3 | SDO Geometry
And so on...
What I want to get is the following:
PolygonID | Vertex.X | Vertex.Y | Vertex.Order
1 | 1 | 1 | 1
1 | 3 | 5 | 2
1 | 2 | 3 | 3
2 | 1 | 2 | 1
And so on. So I just need each polygon converted into an ordered list of vertices. I can successfully convert a single SDO geometry into an ordered list using the code below, but I can't link it to its polygon ID.
select x, y
from table (
    select sdo_util.getvertices(
        SDO
    )
    from POLYGONS
    where ID = 1
)
order by id;
I am a bit lost on how to link that data back to its original polygon ID. Any help would be greatly appreciated!
So I eventually found the solution. See below.
SELECT
    c.POLYGONID,
    t.X,
    t.Y,
    t.ID AS VERTEX_ORDER
FROM POLYGONS c,
     TABLE(SDO_UTIL.GETVERTICES(c.POLYGON)) t
The table function creates a sub-query linked to each polygon ID.

SQL Server: Select hierarchically related items from one table

Say, I have an organizational structure that is 5 levels deep:
CEO -> DeptHead -> Supervisor -> Foreman -> Worker
The hierarchy is stored in a table Position like this:
PositionId | PositionCode | ManagerId
1 | CEO | NULL
2 | DEPT01 | 1
3 | DEPT02 | 1
4 | SPRV01 | 2
5 | SPRV02 | 2
6 | SPRV03 | 3
7 | SPRV04 | 3
... | ... | ...
PositionId is a uniqueidentifier. ManagerId is the ID of the employee's manager and refers to PositionId in the same table.
I need a SQL query to get the hierarchy tree going down from a position, provided as parameter, including the position itself. I managed to develop this:
-- Select the original position itself
SELECT
    'Rank' = 0,
    Position.PositionCode
FROM Position
WHERE Position.PositionCode = 'CEO' -- Parameter
-- Select the subordinates
UNION
SELECT DISTINCT
    'Rank' =
        CASE WHEN Pos2.PositionCode IS NULL THEN 0 ELSE 1+
            CASE WHEN Pos3.PositionCode IS NULL THEN 0 ELSE 1+
                CASE WHEN Pos4.PositionCode IS NULL THEN 0 ELSE 1+
                    CASE WHEN Pos5.PositionCode IS NULL THEN 0 ELSE 1
                    END
                END
            END
        END,
    'PositionCode' = RTRIM(ISNULL(Pos5.PositionCode, ISNULL(Pos4.PositionCode, ISNULL(Pos3.PositionCode, Pos2.PositionCode))))
FROM Position Pos1
LEFT JOIN Position Pos2
    ON Pos1.PositionId = Pos2.ManagerId
LEFT JOIN Position Pos3
    ON Pos2.PositionId = Pos3.ManagerId
LEFT JOIN Position Pos4
    ON Pos3.PositionId = Pos4.ManagerId
LEFT JOIN Position Pos5
    ON Pos4.PositionId = Pos5.ManagerId
WHERE Pos1.PositionCode = 'CEO' -- Parameter
ORDER BY Rank ASC
It works not only for 'CEO' but for any position, displaying its subordinates. Which gives me the following output:
Rank | PositionCode
0 | CEO
... | ...
2 | SPRV55
2 | SPRV68
... | ...
3 | FRMN10
3 | FRMN12
... | ...
4 | WRKR01
4 | WRKR02
4 | WRKR03
4 | WRKR04
My problems are:
The output does not include intermediate nodes: it only outputs end nodes, i.e. workers and intermediate managers who have no subordinates. I need all intermediate managers as well.
I have to manually UNION the row with the original position on top of the output. Is there a more elegant way to do this?
I want the output to be sorted in hierarchical tree order: not all DeptHeads, then all Supervisors, then all Foremen, then all Workers, but like this:
Rank | PositionCode
0 | CEO
1 | DEPT01
2 | SPRV01
3 | FRMN01
4 | WRKR01
4 | WRKR02
... | ...
3 | FRMN02
4 | WRKR03
4 | WRKR04
... | ...
Any help would be greatly appreciated.
Try a recursive CTE, the example on TechNet is almost identical to your problem I believe:
http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx
Thx, everyone suggesting CTE. I got the following code and it's working okay:
WITH HierarchyTree (PositionId, PositionCode, Rank)
AS
(
    -- Anchor member definition
    SELECT PositionId, PositionCode,
           0 AS Rank
    FROM Position AS e
    WHERE PositionCode = 'CEO'
    UNION ALL
    -- Recursive member definition
    SELECT e.PositionId, e.PositionCode,
           Rank + 1
    FROM Position AS e
    INNER JOIN HierarchyTree AS d
        ON e.ManagerId = d.PositionId
)
SELECT Rank, PositionCode
FROM HierarchyTree
GO
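The CTE above returns every node, but the question also asked for rows in tree order. One possible way to get that, sketched here with SQLite through Python's sqlite3 module, is to carry a path string down the recursion and sort by it. The SortPath and Depth columns are assumptions added for illustration (Depth plays the role of Rank), the sample rows are the ones shown in the question, and the approach assumes sibling codes sort correctly as plain strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Position (PositionId INTEGER PRIMARY KEY,
                       PositionCode TEXT, ManagerId INTEGER);
INSERT INTO Position VALUES
    (1, 'CEO', NULL), (2, 'DEPT01', 1), (3, 'DEPT02', 1),
    (4, 'SPRV01', 2), (5, 'SPRV02', 2), (6, 'SPRV03', 3), (7, 'SPRV04', 3);
""")

query = """
WITH RECURSIVE HierarchyTree (PositionId, PositionCode, Depth, SortPath) AS (
    -- anchor member: the starting position, passed as a parameter
    SELECT PositionId, PositionCode, 0, PositionCode
    FROM Position
    WHERE PositionCode = ?
    UNION ALL
    -- recursive member: subordinates, with the parent's path as a prefix
    SELECT e.PositionId, e.PositionCode, d.Depth + 1,
           d.SortPath || '/' || e.PositionCode
    FROM Position AS e
    JOIN HierarchyTree AS d ON e.ManagerId = d.PositionId
)
SELECT Depth, PositionCode
FROM HierarchyTree
ORDER BY SortPath
"""

tree_rows = conn.execute(query, ("CEO",)).fetchall()
for depth, code in tree_rows:
    print("  " * depth + code)
```

Because each child's path starts with its parent's path, sorting by SortPath places every subtree directly under its manager.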
I had a similar problem to yours on a recent project but with a variable recursion length - typically between 1 and 10 levels.
I wanted to simplify the SQL side of things so I put some extra work into the logic of storing the recursive elements by storing a "hierarchical path" in addition to the direct manager Id.
So a very contrived example:
Employee
Id | JobDescription | Hierarchy | ManagerId
1 | DIRECTOR | 1\ | NULL
2 | MANAGER 1 | 1\2\ | 1
3 | MANAGER 2 | 1\3\ | 1
4 | SUPERVISOR 1 | 1\2\4\ | 2
5 | SUPERVISOR 2 | 1\3\5\ | 3
6 | EMPLOYEE 1 | 1\2\4\6\ | 4
7 | EMPLOYEE 2 | 1\3\5\7\ | 5
This means you can very quickly query any level of the tree and get all of its descendants by using a LIKE query on the Hierarchy column.
For example,
SELECT * FROM dbo.Employee WHERE Hierarchy LIKE '1\2\%'
would return
MANAGER 1
SUPERVISOR 1
EMPLOYEE 1
Additionally you can also easily get one level of the tree by using the ManagerId column.
The downside to this approach is you have to construct the hierarchy when inserting or updating records but believe me when I say this storage structure saved me a lot of pain later on without the need for unnecessary query complexity.
One thing to note is that my approach gives you the raw data - I then parse the result set into a recursive strongly typed structure in my services layer. As a rule I don't tend to format output in SQL.
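Here is a small sketch of the hierarchy-path lookup, using SQLite through Python's sqlite3 module. It assumes every stored path ends with a backslash, so that a prefix such as 1\2\ cannot accidentally match an Id like 20. The helper name descendants is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Id INTEGER, JobDescription TEXT, "
    "Hierarchy TEXT, ManagerId INTEGER)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (1, "DIRECTOR", "1\\", None),
    (2, "MANAGER 1", "1\\2\\", 1),
    (3, "MANAGER 2", "1\\3\\", 1),
    (4, "SUPERVISOR 1", "1\\2\\4\\", 2),
    (5, "SUPERVISOR 2", "1\\3\\5\\", 3),
    (6, "EMPLOYEE 1", "1\\2\\4\\6\\", 4),
    (7, "EMPLOYEE 2", "1\\3\\5\\7\\", 5),
])


def descendants(conn, path_prefix):
    """Return the job descriptions of a node and everything below it."""
    rows = conn.execute(
        "SELECT JobDescription FROM Employee "
        "WHERE Hierarchy LIKE ? ORDER BY Hierarchy",
        (path_prefix + "%",),  # prefix match on the stored path
    )
    return [row[0] for row in rows]


subtree = descendants(conn, "1\\2\\")
print(subtree)
```

Passing the prefix as a bound parameter avoids having to escape the backslashes inside the SQL text itself.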

How to properly group SQL results set?

SQL noob, please bear with me!
I am storing a 3-tuple in a database: (x, y, {signal1, signal2, ...}).
I have a database with a table called coordinates (x, y) and another table called signals (signal, coordinate_id, group), which stores the individual signal values. There can be several signals at the same coordinate.
The group is just an arbitrary integer that marks the entries in the signals table as belonging to the same set (provided they belong to the same coordinate). Any signals with the same 'coordinate_id' and 'group' together form a tuple as shown above.
For example,
Coordinates table
--------------------
| id | x | y |
| 1 | 1 | 2 |
| 2 | 2 | 5 |

Signals table
-----------------------------
| id | signal | coordinate_id | group |
| 1 | 45 | 1 | 1 |
| 2 | 95 | 1 | 1 |
| 3 | 33 | 1 | 1 |
| 4 | 65 | 1 | 2 |
| 5 | 57 | 1 | 2 |
| 6 | 63 | 2 | 1 |
This would produce the tuples (1,2,{45,95,33}), (1,2,{65,57}), (2,5,{63}), and so on.
I would like to retrieve the sets of {signal1, signal2,...} for each coordinate. The signals belonging to a set have the same coordinate_id and group, but I do not necessarily know the group value. I only know that if the group value is the same for a particular coordinate_id, then all those with that group form one set.
I tried looking into SQL GROUP BY, but I realized that it is for use with aggregate functions.
Can someone point out how to do this properly in SQL, or give tips for improving my database structure?
SQLite supports the GROUP_CONCAT() aggregate function similar to MySQL. It rolls up a set of values in the group and concatenates them together comma-separated.
SELECT c.x, c.y, GROUP_CONCAT(s.signal) AS signal_list
FROM Signals s
JOIN Coordinates c ON s.coordinate_id = c.id
GROUP BY s.coordinate_id, s."group"
SQLite also permits the mismatch between columns in the select-list and columns in the group-by clause, even though this isn't strictly permitted by ANSI SQL and most implementations.
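To see what GROUP_CONCAT returns for the sample data in the question, here is a short sketch using Python's sqlite3 module. The table and column names follow the question; the variable names are assumptions. SQLite does not promise a particular order of values inside the concatenated string, so the sketch sorts each set after splitting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Coordinates (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER);
CREATE TABLE Signals (id INTEGER PRIMARY KEY, signal INTEGER,
                      coordinate_id INTEGER, "group" INTEGER);
INSERT INTO Coordinates VALUES (1, 1, 2), (2, 2, 5);
INSERT INTO Signals VALUES
    (1, 45, 1, 1), (2, 95, 1, 1), (3, 33, 1, 1),
    (4, 65, 1, 2), (5, 57, 1, 2), (6, 63, 2, 1);
""")

rows = conn.execute("""
    SELECT c.x, c.y, GROUP_CONCAT(s.signal) AS signal_list
    FROM Signals AS s
    JOIN Coordinates AS c ON s.coordinate_id = c.id
    GROUP BY s.coordinate_id, s."group"   -- group is a keyword, so it is quoted
""").fetchall()

# Split each comma-separated list back into integers and sort for stable output.
signal_sets = sorted(
    (x, y, sorted(int(value) for value in signal_list.split(",")))
    for x, y, signal_list in rows
)
print(signal_sets)
```

Each result row corresponds to one (coordinate_id, group) pair, so the caller never needs to know the group values in advance.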
Personally, I would structure the database as three tables:
x_y(x, y, id) coords_groups(pos, group, id) signals(group, signal)
with signals.group -> coords_groups.id and coords_groups.pos -> x_y.id,
since you are trying to represent a sort of 4-dimensional array.
Then, to get an ArrayList of List of Signal for a pair of coordinates (X, Y), you can use this:
SELECT temp."group", signals.signal
FROM (
    SELECT cg."group", cg.id
    FROM x_y JOIN coords_groups AS cg ON x_y.id = cg.pos
    WHERE x_y.x=X AND x_y.y=Y )
AS temp JOIN signals ON temp.id=signals."group"
ORDER BY temp."group" ASC
(X and Y appear in the innermost WHERE clause)
inside this sort of pseudo-code:
getSignalsGroups(X, Y)
    ArrayList<List<Signals>> a = new ArrayList
    List<Signals> temp = new List
    query = sqlLiteExecute(THE_SQL_SNIPPET, X, Y)
    row = query.fetch()                  // fetch the first row to initialise the current group
    if (row == null) return a            // no signals at these coordinates
    actualGroup = row.group
    temp.add(row.signal)
    for (row : query)                    // for each remaining row, add the signal to the current list
        if (row.group != actualGroup)    // a new group starts: store the finished list and reset
            a.add(temp)
            actualGroup = row.group; temp = new List
        temp.add(row.signal)
    a.add(temp)                          // store the last group
    return a
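The same grouping loop can be sketched in runnable form with Python's sqlite3 module and itertools.groupby. The three-table schema follows the answer above; the sample rows, the flattened join, and the function name get_signal_groups are assumptions made for illustration.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x_y (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER);
CREATE TABLE coords_groups (id INTEGER PRIMARY KEY, pos INTEGER, "group" INTEGER);
CREATE TABLE signals ("group" INTEGER, signal INTEGER);
INSERT INTO x_y VALUES (1, 1, 2), (2, 2, 5);
INSERT INTO coords_groups VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
INSERT INTO signals VALUES (1, 45), (1, 95), (1, 33), (2, 65), (2, 57), (3, 63);
""")


def get_signal_groups(conn, x, y):
    """Return one list of signals per group stored at coordinate (x, y)."""
    rows = conn.execute("""
        SELECT cg."group", s.signal
        FROM x_y
        JOIN coords_groups AS cg ON cg.pos = x_y.id
        JOIN signals AS s ON s."group" = cg.id
        WHERE x_y.x = ? AND x_y.y = ?
        ORDER BY cg."group", s.signal
    """, (x, y)).fetchall()
    # Rows arrive sorted by group, so consecutive rows with the same key form one set.
    return [[signal for _, signal in members]
            for _, members in groupby(rows, key=itemgetter(0))]


groups_at_1_2 = get_signal_groups(conn, 1, 2)
print(groups_at_1_2)
```

groupby only merges adjacent rows, so the ORDER BY on the group column is what makes this correct.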