Recursive query - sql

I have a table which contains the following fields
Supervisorid
Empid
This is just like a referral program. A person can refer three people under them: for example, 3 refers 4, 5 and 8; 4 in turn refers 9, 10 and 11; 8 refers 12 and 13; and so on.
I want a query to get all EmpId values under Supervisor 3.

Do you want us to write the solution for you, or explain a bit about how recursive queries are built up?
An example of how they are built up is on http://publib.boulder.ibm.com/infocenter/db2luw/v8//topic/com.ibm.db2.udb.doc/ad/samples/clp/s-flt-db2.htm.
The IBM DB2 redbook has an entire chapter on SQL recursion.
The gist is that the following steps are generally involved:
You define a "seed": SELECT SUPID, EMPID, 1 AS LVL FROM EMP WHERE SUPID = 3
You assign a name to it: WITH SRC (SUPID, EMPID, LVL) AS (<your seed here>)
You define how to get to the "next level", starting from the seed and using the assigned name: SELECT SRC.SUPID, EMP.EMPID, SRC.LVL+1 FROM SRC, EMP WHERE SRC.EMPID = EMP.SUPID
You combine the two inside the WITH clause: WITH SRC (SUPID, EMPID, LVL) AS (<your seed here> UNION ALL <the other SELECT here>)
(Optionally) you define which columns to select: SELECT EMPID, LVL FROM SRC
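Put together, the steps above might look like the sketch below. It runs the query through Python's sqlite3 module only so that it is self-contained: SQLite is an assumed stand-in for DB2 (SQLite wants the RECURSIVE keyword, DB2 does not), the helper name subordinates is made up for illustration, and the sample rows are the referrals described in the question.

```python
import sqlite3

# Assumption: SQLite stands in for DB2; table/column names follow the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (SUPID INTEGER, EMPID INTEGER)")
conn.executemany(
    "INSERT INTO EMP (SUPID, EMPID) VALUES (?, ?)",
    [(3, 4), (3, 5), (3, 8), (4, 9), (4, 10), (4, 11), (8, 12), (8, 13)],
)

QUERY = """
WITH RECURSIVE SRC (SUPID, EMPID, LVL) AS (
    -- seed: direct reports of the starting supervisor
    SELECT SUPID, EMPID, 1 FROM EMP WHERE SUPID = ?
    UNION ALL
    -- next level: reports of anyone already in SRC
    SELECT SRC.SUPID, EMP.EMPID, SRC.LVL + 1
    FROM SRC JOIN EMP ON SRC.EMPID = EMP.SUPID
)
SELECT EMPID, LVL FROM SRC ORDER BY LVL, EMPID
"""

def subordinates(supervisor_id):
    """Return (EMPID, LVL) for everyone below the given supervisor."""
    return conn.execute(QUERY, (supervisor_id,)).fetchall()

rows = subordinates(3)
for empid, lvl in rows:
    print(empid, lvl)
```

LVL is 1 for direct referrals and grows by one per level, so it can also be used to cap the depth of the search.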

Related

SQL PIVOT, JOIN, and aggregate function to generate report

I am working on creating a report which will incorporate data across 4 different tables. For this question, I have consolidated the data into 2 tables and am stuck trying to figure out exactly how to create this report using PIVOT.
The report will hold the top 5 strengths of an employee based on the Clifton StrengthsFinder assessment.
This is the table with the Names of the Clifton Strengths (34 rows total):
As mentioned, each employee has 5 strengths:
I would like to use PIVOT to generate a table which will ultimately look like this:
With a twist, I don't need the Team Name as a Row, it should be a column. The Count at the bottom and Themes at the top (Executing, Influencing, etc) can be ignored.
The columns of the table I'm trying to output are PersonFk, PersonName, TeamName, Achiever, Arranger, etc... (34 Strengths) and each row of the table with Values (personfk, name, team, 1 if person has the strength, 0 otherwise). This table should be SQL, not excel (sorry, just the best example I have on hand without spending an hour learning how to use Paint or something).
I'm not very familiar with aggregate functions, and am just now getting into the more complex SQL queries..
Interesting. PIVOT requires an aggregate function to build the 1-5 values, so you'll probably have to rewrite your inner query as a union and use MAX() as a throwaway aggregate function (throwaway because every record should be unique, so MAX, MIN, SUM, etc. should all return the same value):
SELECT * INTO #newblah from (
SELECT PersonFK, 1 as StrengthIndex, Strength1 as Strength from blah UNION ALL
SELECT PersonFK, 2 as StrengthIndex, Strength2 as Strength from blah UNION ALL
SELECT PersonFK, 3 as StrengthIndex, Strength3 as Strength from blah UNION ALL
SELECT PersonFK, 4 as StrengthIndex, Strength4 as Strength from blah UNION ALL
SELECT PersonFK, 5 as StrengthIndex, Strength5 as Strength from blah
) AS unpivoted -- SQL Server requires an alias on a derived table
Then
select PersonFK, [Achiever], [Activator], [Adaptability], [Analytical], [Belief] .....
from
(
select PersonFK, StrengthIndex, Strength
from #newblah
) pivotsource
pivot
(
max(StrengthIndex)
for Strength in ([Achiever], [Activator], [Adaptability], [Analytical], [Belief] ..... )
) myPivot;
The result of that query can be joined back to your other tables to get the person name, strength category, and team name, so I'll leave that to you. You don't HAVE to do the first step (the UNION ALL) as a temporary table; you could do it as an inline subselect, so this could all be done in one SQL query, but that seems painful if you can avoid it.
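For anyone who wants to try the shape of this without SQL Server, here is a rough sketch using Python's sqlite3 module. SQLite has no PIVOT operator, so this swaps in conditional aggregation (MAX(CASE ...) with GROUP BY), which is a different technique that produces the same kind of wide result. The table name blah and the Strength1..Strength5 columns follow the answer above; the two sample people and the three-column STRENGTHS subset are made up. It emits 1/0 flags, as the question asked, rather than the 1-5 index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blah (PersonFK INTEGER, Strength1 TEXT, Strength2 TEXT, "
    "Strength3 TEXT, Strength4 TEXT, Strength5 TEXT)"
)
# Hypothetical sample data: two people with five strengths each.
conn.executemany(
    "INSERT INTO blah VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Achiever", "Belief", "Analytical", "Activator", "Adaptability"),
        (2, "Belief", "Achiever", "Input", "Learner", "Focus"),
    ],
)

# Only a subset of the 34 strengths, to keep the sketch short.
# These are fixed constants, so building the SQL text from them is safe here.
STRENGTHS = ["Achiever", "Activator", "Belief"]

# Step 1: unpivot Strength1..Strength5 into one row per (person, strength).
unpivot = " UNION ALL ".join(
    f"SELECT PersonFK, Strength{i} AS Strength FROM blah" for i in range(1, 6)
)
# Step 2: one 1/0 flag column per strength (stand-in for PIVOT).
flags = ", ".join(
    f"MAX(CASE WHEN Strength = '{s}' THEN 1 ELSE 0 END) AS \"{s}\""
    for s in STRENGTHS
)
query = (
    f"SELECT PersonFK, {flags} FROM ({unpivot}) AS src "
    "GROUP BY PersonFK ORDER BY PersonFK"
)
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same CASE expressions also work in SQL Server, which can be handy if the 1/0 output matters more than using PIVOT itself.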
Use one of the techniques from this post. For your purposes, you may want to use a delimiter in your column names, along the lines of 'StrengthTheme-Strength', which your web report can then parse for the headers.

SSRS Recursive Parent gives distinct children only, when children have multiple parents

I have made an SSRS report using the recursive parent functionality to show a hierarchical tree of values. The problem I have is that some children have more than one parent, but because (in order to use the recursive parent nicely) I need to group the results by Id, I only see distinct entries. This means that I only see each child once, even if it "should" appear in multiple locations in the report (under each of its parents).
Here is an example dataset that shows what I mean:
DECLARE @Bear TABLE
( ParentId INT NOT NULL
,Id INT NOT NULL
,Name VARCHAR(255))
INSERT INTO @Bear
SELECT * FROM
(SELECT 0 AS ParentId, 1 AS Id, 'Daddy Bear' AS Name UNION
SELECT 0 AS ParentId, 2 AS Id, 'Mummy Bear' AS Name UNION
SELECT 1 AS ParentId, 3 AS Id, 'Baby Bear' AS Name UNION
SELECT 2 AS ParentId, 3 AS Id, 'Baby Bear' AS Name) AS FamilyMember
ORDER BY FamilyMember.Id
SELECT * FROM @Bear
My actual data contains lots of "Baby Bears": for instance, a function that is used by more than one procedure, or a procedure that is used by more than one report.
When I make the report, I group on Bear.Id, with a recursive parent of Bear.ParentId, which gives me something like this (in the report table):
Level 1      Level 2
Daddy Bear
             Baby Bear
Mummy Bear
As you can see, "Baby Bear" only appears once (normally, Id would be unique and this would make perfect sense). What I would like SSRS to display is something more like this:
Level 1      Level 2
Daddy Bear
             Baby Bear
Mummy Bear
             Baby Bear
This would give the users a much better idea of the actual structure they are looking at.
So far, I have tried changing the report group to group by "cstr(Fields!Id.Value) & cstr(Fields!ParentId.Value)", in order to re-establish a unique grouping, so that no records are aggregated into invisibility, but this loses the ordering where children appear immediately after their parent, so I get something like this:
Level 1 Level 2
Daddy Bear
Baby Bear
Baby Bear
Mummy Bear
I have also tried adding ROW_NUMBER() OVER (ORDER BY Id, ParentId) as a new column in the query, to group on that uniquely, but SSRS seems to have a problem with this. The workaround I am now using is to list only the distinct values, as in the first example, but use an Action in each table row to run the report again for that node on click. This is far from ideal, however.
I have also Googled without result.
I am stuck as to what to do.
Any help would be greatly appreciated - what should I do?
Thanks for your time,
Mark
Why can't you add the ROW_NUMBER() exactly?
SELECT ROW_NUMBER() over (order by parentid) as rn, * FROM
(SELECT 0 AS ParentId, 1 AS Id, 'Daddy Bear' AS Name UNION
SELECT 0 AS ParentId, 2 AS Id, 'Mummy Bear' AS Name UNION
SELECT 1 AS ParentId, 3 AS Id, 'Baby Bear' AS Name UNION
SELECT 2 AS ParentId, 3 AS Id, 'Baby Bear' AS Name) AS FamilyMember
Produces a "unique" id per row for grouping on.
UPDATE
So based on my understanding of your problem, you want a recursive CTE. There are quite a few questions here on SO about them, so between that and that link I encourage you to figure out how to produce a dataset that fits your needs.
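As a starting point, a recursive CTE over the bear data could look roughly like this. It is a sketch run through Python's sqlite3 module so it is self-contained; SQLite stands in for SQL Server here (T-SQL drops the RECURSIVE keyword and uses + with CAST instead of ||). The Path column is my own addition: it is unique per row and sorts each child directly under the parent it was reached through. Whether SSRS is happy grouping on such a column is not something this thread confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Bear (ParentId INTEGER NOT NULL, Id INTEGER NOT NULL, Name TEXT)"
)
conn.executemany(
    "INSERT INTO Bear VALUES (?, ?, ?)",
    [(0, 1, "Daddy Bear"), (0, 2, "Mummy Bear"), (1, 3, "Baby Bear"), (2, 3, "Baby Bear")],
)

QUERY = """
WITH RECURSIVE Tree (Path, Level, Name, Id) AS (
    -- anchor: top-level rows (ParentId 0 marks a root in the sample data)
    SELECT CAST(Id AS TEXT), 1, Name, Id FROM Bear WHERE ParentId = 0
    UNION ALL
    -- one output row per (parent row, child) pair, so a shared child repeats
    SELECT t.Path || '/' || b.Id, t.Level + 1, b.Name, b.Id
    FROM Bear b JOIN Tree t ON b.ParentId = t.Id
)
-- Path is compared as text; zero-pad the ids if they can exceed one digit
SELECT Path, Level, Name FROM Tree ORDER BY Path
"""

rows = conn.execute(QUERY).fetchall()
for path, level, name in rows:
    print("    " * (level - 1) + name, "(" + path + ")")
```

Baby Bear comes back twice, once with path 1/3 and once with path 2/3, which is the duplication the report needs.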

Building a hierarchical tree with a single SQL query

I have a SQL table with the following structure.
id - int
par - int (relational to id)
name - varchar
Column par contains a reference to id, or NULL if there is no parent. The table is meant to represent a hierarchical tree.
Then, given the data:
id  par   name
1   NULL  John
2   NULL  Mario
3   1     George
4   3     Alfred
5   4     Nicole
6   2     Margaret
I want to retrieve a hierarchical tree, up to the last parent, from a given single id.
Example, I want to know the tree from Nicole to the last parent. So the query result will be:
id  par   name
5   4     Nicole
4   3     Alfred
3   1     George
1   NULL  John
I would normally do this with a SQL query repeating over and over and building the tree server side but I do not want that now.
Is there any way to achieve this with a single SQL query?
I need this for either MySQL or PgSQL.
I would also like to know, if possible, how widely this is supported: in which versions of MySQL or PgSQL can I expect support?
It is possible with a single query in Postgres using a recursive common table expression. This was not possible in MySQL when this was written, as it was one of the few databases that did not support recursive CTEs; MySQL 8.0 and later do support them.
It would look something like this (not tested)
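WITH RECURSIVE tree (id, par, name) AS (
-- start at the requested row
SELECT id, par, name
FROM the_table
WHERE name = 'Nicole'
UNION ALL
-- walk upwards: fetch the row whose id is the current row's par
SELECT tt.id, tt.par, tt.name
FROM the_table tt
JOIN tree tr ON tt.id = tr.par
)
SELECT *
FROM tree;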
For Postgres, see http://www.postgresql.org/docs/8.4/static/queries-with.html
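To check the direction of the join (child to parent), here is a small sketch that runs the same kind of query against the sample data from the question. It uses Python's sqlite3 module as an assumed stand-in for Postgres, since SQLite accepts the same WITH RECURSIVE syntax. The depth column and the helper name path_to_root are additions of mine, there only to make the row order explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, par INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        (1, None, "John"),
        (2, None, "Mario"),
        (3, 1, "George"),
        (4, 3, "Alfred"),
        (5, 4, "Nicole"),
        (6, 2, "Margaret"),
    ],
)

QUERY = """
WITH RECURSIVE tree (id, par, name, depth) AS (
    SELECT id, par, name, 0 FROM the_table WHERE name = ?
    UNION ALL
    -- climb one step: the parent row is the one whose id equals tr.par
    SELECT tt.id, tt.par, tt.name, tr.depth + 1
    FROM the_table tt JOIN tree tr ON tt.id = tr.par
)
SELECT id, par, name FROM tree ORDER BY depth
"""

def path_to_root(name):
    """Rows from the named node up to its topmost ancestor."""
    return conn.execute(QUERY, (name,)).fetchall()

rows = path_to_root("Nicole")
for row in rows:
    print(row)
```

The recursion stops on its own at John, because a NULL par never matches any id.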
MySQL doesn't support this syntax (unless it's in a beta/development tree somewhere). Oracle has something similar using connect by prior.
This article is probably what you need to look at:
http://explainextended.com/2009/03/17/hierarchical-queries-in-mysql/
In Oracle, this is done via:
SELECT [[LEVEL,]] id, par, name FROM my_table
START WITH name = 'Nicole'
CONNECT BY [[NOCYCLE]] id = PRIOR par
[[ORDER SIBLINGS BY name ASC]]
(My [[…]] syntax denotes optional query bits.)
MySQL is planning to integrate such a feature. For PostgreSQL there is another answer helping you.

Reporting against a CSV field in a SQL server 2005 DB

OK, so I am writing a report against a third-party database in SQL Server 2005. For the most part it's normalized, except for one field in one table. They have a table of users (which includes groups). This table has a UserID field (PK), an IsGroup field (bit), and a Members field (text). The Members field holds a comma-separated list of all the members of the group or, if the row is not a group, a comma-separated list of the groups the member belongs to.
The question is: what is the best way to write a stored procedure that displays which users are in which groups? I have a function that parses the IDs out into a table. The best approach I could come up with was to create a cursor that cycles through each group, parses out the user IDs, writes them to a temp table (with the group ID), and then selects from the temp table.
UserTable
Example:
ID|IsGroup|Name|Members
1|True|Admin|3
2|True|Power|3,4
3|False|Bob|1,3
4|False|Susan|2
5|True|Normal|6
6|False|Bill|5
I want my query to show:
GroupID|UserID
1|3
2|3
2|4
5|6
Hope that makes sense...
If you have (or could create) a separate table containing the groups you could join it with the users table and match them with the charindex function with comma padding of your data on both sides. I would test the performance of this method with some fairly extreme workloads before deploying. However, it does have the advantage of being self-contained and simple. Note that changing the example to use a cross-join with a where clause produces the exact same execution plan as this one.
Example with data:
SELECT *
FROM (SELECT 1 AS ID,
'1,2,3' AS MEMBERS
UNION
SELECT 2,
'2'
UNION
SELECT 3,
'3,1'
UNION
SELECT 4,
'2,1') USERS
LEFT JOIN (SELECT '1' AS MEMBER
UNION
SELECT '2'
UNION
SELECT '3'
UNION
SELECT '4') GROUPS
ON CHARINDEX(',' + GROUPS.MEMBER + ',',',' + USERS.MEMBERS + ',') > 0
Results:
id members group
1 1,2,3 1
1 1,2,3 2
1 1,2,3 3
2 2 2
3 3,1 1
3 3,1 3
4 2,1 1
4 2,1 2
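Applied to the sample table from the question, the same comma-padding trick might look like the sketch below. It runs through Python's sqlite3 module so it is self-contained: SQLite's instr(haystack, needle) and || stand in for CHARINDEX and + in SQL Server, and the table is joined to itself (group rows against user rows) rather than to a separate groups table. Both of those are my assumptions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserTable (ID INTEGER PRIMARY KEY, IsGroup INTEGER, "
    "Name TEXT, Members TEXT)"
)
# Sample rows copied from the question (True/False stored as 1/0).
conn.executemany(
    "INSERT INTO UserTable VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Admin", "3"),
        (2, 1, "Power", "3,4"),
        (3, 0, "Bob", "1,3"),
        (4, 0, "Susan", "2"),
        (5, 1, "Normal", "6"),
        (6, 0, "Bill", "5"),
    ],
)

QUERY = """
SELECT g.ID AS GroupID, u.ID AS UserID
FROM UserTable g
JOIN UserTable u
  -- pad both sides with commas so that 3 does not match inside 13 or 31
  ON instr(',' || g.Members || ',', ',' || u.ID || ',') > 0
WHERE g.IsGroup = 1 AND u.IsGroup = 0
ORDER BY g.ID, u.ID
"""

rows = conn.execute(QUERY).fetchall()
for group_id, user_id in rows:
    print(f"{group_id}|{user_id}")
```

This only reads the Members list on the group rows; the lists on the user rows are ignored, so any disagreement between the two is not detected.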
Your technique will probably be the best method.

SQL - How to store and navigate hierarchies?

What are the ways that you use to model and retrieve hierarchical info in a database?
I like the Modified Preorder Tree Traversal Algorithm. This technique makes it very easy to query the tree.
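To give an idea of why querying is easy, here is a minimal sketch with Python's sqlite3 module. Each node stores a left and right number assigned by a preorder walk, and descendants or ancestors become a simple range comparison with no recursion. The table name tree, the column names lft/rgt, the sample categories and the two helper functions are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (name TEXT, lft INTEGER, rgt INTEGER)")
# Hypothetical tree, numbered by a preorder walk:
# Food(1,10) -> Fruit(2,7) -> Apple(3,4), Banana(5,6); Food -> Meat(8,9)
conn.executemany(
    "INSERT INTO tree VALUES (?, ?, ?)",
    [("Food", 1, 10), ("Fruit", 2, 7), ("Apple", 3, 4), ("Banana", 5, 6), ("Meat", 8, 9)],
)

def descendants(name):
    """Every node whose interval lies strictly inside the named node's interval."""
    sql = (
        "SELECT c.name FROM tree p JOIN tree c "
        "ON c.lft > p.lft AND c.rgt < p.rgt "
        "WHERE p.name = ? ORDER BY c.lft"
    )
    return [row[0] for row in conn.execute(sql, (name,))]

def ancestors(name):
    """Every node whose interval strictly contains the named node's interval."""
    sql = (
        "SELECT p.name FROM tree c JOIN tree p "
        "ON p.lft < c.lft AND p.rgt > c.rgt "
        "WHERE c.name = ? ORDER BY p.lft"
    )
    return [row[0] for row in conn.execute(sql, (name,))]

print(descendants("Fruit"))
print(ancestors("Banana"))
```

The price is paid on writes: inserting or moving a node means renumbering every node to its right.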
But here is a list of links about the topic which I copied from the Zend Framework (PHP) contributors webpage (posted there by Laurent Melmoux on Jun 05, 2007 at 15:52).
Many of the links are language agnostic:
There are two main representations and algorithms for hierarchical structures in databases:
nested set also known as modified preorder tree traversal algorithm
adjacency list model
It's well explained here:
http://www.sitepoint.com/article/hierarchical-data-database
Managing Hierarchical Data in MySQL
http://www.evolt.org/article/Four_ways_to_work_with_hierarchical_data/17/4047/index.html
Here are some more links that I've collected:
http://en.wikipedia.org/wiki/Tree_%28data_structure%29
http://en.wikipedia.org/wiki/Category:Trees_%28structure%29
adjacency list model
http://www.sqlteam.com/item.asp?ItemID=8866
nested set
http://www.sqlsummit.com/AdjacencyList.htm
http://www.edutech.ch/contribution/nstrees/index.php
http://www.phpriot.com/d/articles/php/application-design/nested-trees-1/
http://www.dbmsmag.com/9604d06.html
http://en.wikipedia.org/wiki/Tree_traversal
http://www.cosc.canterbury.ac.nz/mukundan/dsal/BTree.html (Java applet showing how it works)
Graphs
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
Classes:
Nested Sets DB Tree Adodb
http://www.phpclasses.org/browse/package/2547.html
Visitation Model ADOdb
http://www.phpclasses.org/browse/package/2919.html
PEAR::DB_NestedSet
http://pear.php.net/package/DB_NestedSet
usage: https://www.entwickler.com/itr/kolumnen/psecom,id,26,nodeid,207.html
PEAR::Tree
http://pear.php.net/package/Tree/download/0.3.0/
http://www.phpkitchen.com/index.php?/archives/337-PEARTree-Tutorial.html
nstrees
http://www.edutech.ch/contribution/nstrees/index.php
The definitive pieces on this subject have been written by Joe Celko, and he has worked a number of them into a book called Joe Celko's Trees and Hierarchies in SQL for Smarties.
He favours a technique called directed graphs. An introduction to his work on this subject can be found here
What's the best way to represent a hierarchy in a SQL database? A generic, portable technique?
Let's assume the hierarchy is mostly read, but isn't completely static. Let's say it's a family tree.
Here's how not to do it:
create table person (
person_id integer autoincrement primary key,
name varchar(255) not null,
dob date,
mother integer,
father integer
);
And inserting data like this:
person_id name dob mother father
1 Pops 1900/1/1 null null
2 Grandma 1903/2/4 null null
3 Dad 1925/4/2 2 1
4 Uncle Kev 1927/3/3 2 1
5 Cuz Dave 1953/7/8 null 4
6 Billy 1954/8/1 null 3
Instead, split your nodes and your relationships into two tables.
create table person (
person_id integer autoincrement primary key,
name varchar(255) not null,
dob date
);
create table ancestor (
ancestor_id integer,
descendant_id integer,
distance integer
);
Data is created like this:
person_id name dob
1 Pops 1900/1/1
2 Grandma 1903/2/4
3 Dad 1925/4/2
4 Uncle Kev 1927/3/3
5 Cuz Dave 1953/7/8
6 Billy 1954/8/1
ancestor_id descendant_id distance
1 1 0
2 2 0
3 3 0
4 4 0
5 5 0
6 6 0
1 3 1
2 3 1
1 4 1
2 4 1
1 5 2
2 5 2
4 5 1
1 6 2
2 6 2
3 6 1
You can now run arbitrary queries that don't involve joining the table back on itself, which would happen if you had the hierarchy relationship in the same row as the node.
Who has grandparents?
select * from person where person_id in
(select descendant_id from ancestor where distance=2);
All your descendants:
select * from person where person_id in
(select descendant_id from ancestor
where ancestor_id=1 and distance>0);
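Who are uncles?
-- a child of some person P who is not the parent of one of P's grandchildren
select distinct u.descendant_id as uncle
from ancestor u
where u.distance = 1 and exists
(select 1 from ancestor g
where g.ancestor_id = u.ancestor_id and g.distance = 2
and not exists
(select 1 from ancestor p
where p.ancestor_id = u.descendant_id
and p.descendant_id = g.descendant_id and p.distance = 1)
);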
You avoid all the problems of joining a table to itself via subqueries; a common limitation is 16 subqueries.
Trouble is, maintaining the ancestor table is kind of hard - best done with a stored procedure.
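For the simplest case, adding a new person at the bottom of the tree, the maintenance is not too bad. Here is a rough sketch using Python's sqlite3 module in place of a stored procedure. The helper name add_person is made up, and it assumes parents are always inserted before their children. Moving or deleting people needs more work and is not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ancestor (ancestor_id INTEGER, descendant_id INTEGER, distance INTEGER)"
)

def add_person(person_id, parent_ids=()):
    """Hypothetical helper: register a new leaf and all of its ancestor rows."""
    # Every person is their own ancestor at distance 0.
    conn.execute("INSERT INTO ancestor VALUES (?, ?, 0)", (person_id, person_id))
    for parent_id in parent_ids:
        # Copy every ancestor row of the parent (including its own distance-0
        # row), re-pointed at the new person and one step further away.
        conn.execute(
            "INSERT INTO ancestor (ancestor_id, descendant_id, distance) "
            "SELECT ancestor_id, ?, distance + 1 FROM ancestor WHERE descendant_id = ?",
            (person_id, parent_id),
        )

# Rebuild the family from the example above (ids as in the person table).
add_person(1)           # Pops
add_person(2)           # Grandma
add_person(3, (2, 1))   # Dad
add_person(4, (2, 1))   # Uncle Kev
add_person(5, (4,))     # Cuz Dave
add_person(6, (3,))     # Billy

descendants_of_pops = [
    row[0]
    for row in conn.execute(
        "SELECT descendant_id FROM ancestor "
        "WHERE ancestor_id = 1 AND distance > 0 ORDER BY descendant_id"
    )
]
print(descendants_of_pops)
```

If two parents of a new person share an ancestor, this sketch inserts that ancestor twice, so real data would want a uniqueness rule on (ancestor_id, descendant_id).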
I've got to disagree with Josh. What happens if you're using a huge hierarchical structure like a company organization? People can join or leave the company, change reporting lines, etc. Maintaining the "distance" would be a big problem, and you would have to maintain two tables of data.
This query (SQL Server 2005 and above) would let you see the complete line of any person AND calculates their place in the hierarchy and it only requires a single table of user information. It can be modified to find any child relationship.
--Create table of dummy data
create table #person (
personID integer IDENTITY(1,1) NOT NULL,
name varchar(255) not null,
dob date,
father integer
);
INSERT INTO #person(name,dob,father)Values('Pops','1900/1/1',NULL);
INSERT INTO #person(name,dob,father)Values('Grandma','1903/2/4',null);
INSERT INTO #person(name,dob,father)Values('Dad','1925/4/2',1);
INSERT INTO #person(name,dob,father)Values('Uncle Kev','1927/3/3',1);
INSERT INTO #person(name,dob,father)Values('Cuz Dave','1953/7/8',4);
INSERT INTO #person(name,dob,father)Values('Billy','1954/8/1',3);
DECLARE @OldestPerson INT;
SET @OldestPerson = 1; -- Set this value to the ID of the oldest person in the family
WITH PersonHierarchy (personID,Name,dob,father, HierarchyLevel) AS
(
SELECT
personID
,Name
,dob
,father,
1 as HierarchyLevel
FROM #person
WHERE personID = @OldestPerson
UNION ALL
SELECT
e.personID,
e.Name,
e.dob,
e.father,
eh.HierarchyLevel + 1 AS HierarchyLevel
FROM #person e
INNER JOIN PersonHierarchy eh ON
e.father = eh.personID
)
SELECT *
FROM PersonHierarchy
ORDER BY HierarchyLevel, father;
DROP TABLE #person;
FYI: SQL Server 2008 introduces a new HierarchyID data type for this sort of situation. Gives you control over where in the "tree" your row sits, horizontally as well as vertically.
Oracle: SELECT ... START WITH ... CONNECT BY
Oracle has an extension to SELECT that allows easy tree-based retrieval. Perhaps SQL Server has some similar extension?
This query will traverse a table where the nesting relationship is stored in parent and child columns.
select * from my_table
start with parent = :TOP
connect by prior child = parent;
http://www.adp-gmbh.ch/ora/sql/connect_by.html
I prefer a mix of the techniques used by Josh and Mark Harrison:
Two tables, one with the data of the person and the other with the hierarchical info (person_id, parent_id [, mother_id]). If the PK of this table is person_id, you have a simple tree with only one parent per node (which makes sense in this case, but not in other cases such as accounting accounts).
This hierarchy table can be traversed by recursive procedures or, if your DB supports it, by statements like SELECT ... CONNECT BY PRIOR (Oracle).
Another possibility, if you know the maximum depth of the hierarchy you want to maintain, is to use a single table with a set of columns per level of the hierarchy.
We had the same issue when we implemented a tree component for [fleXive] and used the nested set tree model approach mentioned by tharkun from the MySQL docs.
To speed things up (dramatically), we also used a spread approach, which simply means we used the maximum Long value for the top-level right bound; this allows us to insert and move nodes without recalculating all left and right values. Values for left and right are calculated by dividing the range for a node by 3 and using the inner third as the bounds for the new node.
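The arithmetic behind the inner-third rule can be sketched in a few lines of Python. This is only my reading of the description above, not the fleXive code: the function name inner_third is made up, it handles just the case of placing one node into an empty range, and how siblings share a range is not described here, so it is left out.

```python
MAX_LONG = 2**63 - 1  # Java's Long.MAX_VALUE, used as the top-level right bound

def inner_third(left, right):
    """Return (new_left, new_right): the middle third of the range left..right.

    Assumption: the range is empty, i.e. no other node has bounds inside it yet.
    """
    third = (right - left) // 3
    if third < 1:
        # No room left between the bounds; the tree would need respacing.
        raise ValueError("range exhausted")
    return left + third, right - third

root = (0, MAX_LONG)
child = inner_third(*root)        # nested inside the root's bounds
grandchild = inner_third(*child)  # nested inside the child's bounds
print(root, child, grandchild)
```

Because each new node leaves a third of the range free on either side, later inserts next to it can pick bounds from those gaps without touching existing rows, until the gaps run out.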
A java code example can be seen here.
If you're using SQL Server 2005 then this link explains how to retrieve hierarchical data.
Common Table Expressions (CTEs) can be your friends once you get comfortable using them.