Oracle Connect_is_leaf similar in SQL server - sql

Here is my query which is in Oracle PL/SQL syntax, How can I Change it to SQL server format?
Any alternatives for Connect_by_isleaf?
(
select PARTY_KEY, ltrim(sys_connect_by_path(alt_name, '|'), '|') AS alt_name_list
from
(select PARTY_KEY, alt_name, row_number() over(partition by PARTY_KEY order by alt_name) rno
from (
select party_key, (select alt_name_type_desc from "CRMS"."PRJ_APP_ALT_NAME_TYPE" where alt_name_type_cd = alt_name_type) || ' - ' || alt_name as alt_name
from "CDD_PROFILES"."PRJ_PRF_ALT_NAME" order by party_key, alt_name_type
) alt
)
where connect_by_isleaf = 1
connect by PARTY_KEY = prior PARTY_KEY
and rno = prior rno+1
start with rno = 1
)
tried to use With AS clause but it is not working somehow.
Thanks in advance

The equivalent in SQL Server is called a "recursive CTE".
You can read about it here:
https://learn.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-2017

Oracle Hierarchical queries can be rewritten as recursive CTE statements in databases that support them (SQL Server included). A classic set of hierarchical data would be an organization hierarchy such as the one below:
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE ORGANIZATIONS
([ID] int primary key
, [ORG_NAME] varchar(30)
, [ORG_TYPE] varchar(30)
, [PARENT_ID] int foreign key references organizations)
;
INSERT INTO ORGANIZATIONS
([ID], [ORG_NAME], [ORG_TYPE], [PARENT_ID])
VALUES
(1, 'ACME Corp', 'Company', NULL),
(2, 'Finance', 'Division', 1),
(6, 'Accounts Payable', 'Department', 2),
(7, 'Accounts Receivables', 'Department', 2),
(8, 'Payroll', 'Department', 2),
(3, 'Operations', 'Division', 1),
(4, 'Human Resources', 'Division', 1),
(10, 'Benefits Admin', 'Department', 4),
(5, 'Marketing', 'Division', 1),
(9, 'Sales', 'Department', 5)
;
In the recursive t1 below the select statement before the union all is the anchor query and the select statement after the union all is the recursive part. The recursive part has exactly one reference to t1 in its from clause. The org_path column simulates oracles sys_connect_by_path function concatenating the org_names together. The level column simulates oracles LEVEL pseudo column and is utilized in the output query to determine the leaf status (is_leaf column) similar to oracles connect_by_isleaf pseudo column:
with t1(id, org_name, org_type, parent_id, org_path, level) as (
select o.*
, cast('|' + org_name as varchar(max))
, 1
from organizations o
where parent_id is null
union all
select o.*
, t1.org_path+cast('|'+o.org_name as varchar(max))
, t1.level+1
from organizations o
join t1
on t1.id = o.parent_id
)
select t1.*
, case when t1.level < lead(t1.level) over (order by org_path) then 0 else 1 end is_leaf
from t1 order by org_path
Results:
| id | org_name | org_type | parent_id | org_path | level | is_leaf |
|----|----------------------|------------|-----------|-------------------------------------------|-------|---------|
| 1 | ACME Corp | Company | (null) | |ACME Corp | 1 | 0 |
| 2 | Finance | Division | 1 | |ACME Corp|Finance | 2 | 0 |
| 6 | Accounts Payable | Department | 2 | |ACME Corp|Finance|Accounts Payable | 3 | 1 |
| 7 | Accounts Receivables | Department | 2 | |ACME Corp|Finance|Accounts Receivables | 3 | 1 |
| 8 | Payroll | Department | 2 | |ACME Corp|Finance|Payroll | 3 | 1 |
| 4 | Human Resources | Division | 1 | |ACME Corp|Human Resources | 2 | 0 |
| 10 | Benefits Admin | Department | 4 | |ACME Corp|Human Resources|Benefits Admin | 3 | 1 |
| 5 | Marketing | Division | 1 | |ACME Corp|Marketing | 2 | 0 |
| 9 | Sales | Department | 5 | |ACME Corp|Marketing|Sales | 3 | 1 |
| 3 | Operations | Division | 1 | |ACME Corp|Operations | 2 | 1 |
To select just the leaf nodes, change the output query from above to another CTE (T2) dropping the order by clause or moving it to final output query and limiting by the is_leaf column:
with t1(id, org_name, org_type, parent_id, org_path, level) as (
select o.*
, cast('|' + org_name as varchar(max))
, 1
from organizations o
where parent_id is null
union all
select o.*
, t1.org_path+cast('|'+o.org_name as varchar(max))
, t1.level+1
from organizations o
join t1
on t1.id = o.parent_id
), t2 as (
select t1.*
, case when t1.level < lead(t1.level) over (order by org_path) then 0 else 1 end is_leaf
from t1
)
select * from t2 where is_leaf = 1
Results:
| id | org_name | org_type | parent_id | org_path | level | is_leaf |
|----|----------------------|------------|-----------|-------------------------------------------|-------|---------|
| 6 | Accounts Payable | Department | 2 | |ACME Corp|Finance|Accounts Payable | 3 | 1 |
| 7 | Accounts Receivables | Department | 2 | |ACME Corp|Finance|Accounts Receivables | 3 | 1 |
| 8 | Payroll | Department | 2 | |ACME Corp|Finance|Payroll | 3 | 1 |
| 10 | Benefits Admin | Department | 4 | |ACME Corp|Human Resources|Benefits Admin | 3 | 1 |
| 9 | Sales | Department | 5 | |ACME Corp|Marketing|Sales | 3 | 1 |
| 3 | Operations | Division | 1 | |ACME Corp|Operations | 2 | 1 |
Alternatively if you realize that leaf nodes can be identified by their lack of child nodes, you can flip this on its head and start with the leaf nodes, and search up the tree, retaining all the original record values, building out the org_path in reverse, and passing along the next parent id as next_id. In the final output, stage, selecting only those records whose next_id is null will yield the same results as the prior query:
with t1(id, org_name, org_type, parent_id, org_path, level, next_id) as (
select o.*
, cast('|'+org_name as varchar(max))
, 1
, parent_id
from organizations o
where not exists (select 1 from organizations c where c.parent_id = o.id)
union all
select t1.id
, t1.org_name
, t1.org_type
, t1.parent_id
, cast('|'+p.org_name as varchar(max))+t1.org_path
, level+1
, p.parent_id
from organizations p
join t1
on t1.next_id = p.id
)
select * from t1 where next_id is null order by org_path
Results:
| id | org_name | org_type | parent_id | org_path | level | next_id |
|----|----------------------|------------|-----------|-------------------------------------------|-------|---------|
| 6 | Accounts Payable | Department | 2 | |ACME Corp|Finance|Accounts Payable | 3 | (null) |
| 7 | Accounts Receivables | Department | 2 | |ACME Corp|Finance|Accounts Receivables | 3 | (null) |
| 8 | Payroll | Department | 2 | |ACME Corp|Finance|Payroll | 3 | (null) |
| 10 | Benefits Admin | Department | 4 | |ACME Corp|Human Resources|Benefits Admin | 3 | (null) |
| 9 | Sales | Department | 5 | |ACME Corp|Marketing|Sales | 3 | (null) |
| 3 | Operations | Division | 1 | |ACME Corp|Operations | 2 | (null) |
One of these two methods may prove more performant than the other, but you'll need to try them each out on your data to see which one works better.

Related

Postgres - Unique values for id column using CTE, Joins alongside GROUP BY

I have a table referrals:
id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
3 | 2 | c | t | agent | 3
5 | 3 | e | f | customer | 5
4 | 1 | d | t | agent | 4
2 | 1 | b | f | agent | 2
1 | 1 | a | t | agent | 1
And another table activations
id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
2 | 2 | 3 | 3.0 | 3 | a
4 | 1 | 1 | 6.0 | 5 | b
5 | 4 | 4 | 3.0 | 6 | c
1 | 1 | 2 | 2.0 | 2 | b
3 | 1 | 2 | 5.0 | 4 | b
6 | 1 | 2 | 7.0 | 8 | a
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.
Here is the query I ran:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id )
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
Here is the result I got:
id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
2 | 3 | 14.0 | 2 | 8 | 2 | b | 2 | 1
1 | 1 | 6.0 | 1 | 5 | 1 | b | 1 | 2
2 | 3 | 14.0 | 2 | 8 | 2 | a | 1 | 2
4 | 1 | 3.0 | 4 | 6 | 4 | c | 1 | 2
The problem with this result is that the table has a duplicate id of 2. I only need unique values for the id column.
I tried a workaround by harnessing distinct that gave desired result but I fear the query results may not be reliable and consistent.
Here is the workaround query:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select
distinct on(id), app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id
order by id, best_selling_app_count desc)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
I need a recommendation on how best to achieve this.
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.
Your question is really complicated with a very complicated SQL query. However, the above is what looks like the actual question. If so, you can use:
select r.*,
a.app_id as most_common_app_id,
a.cnt as most_common_app_id_count
from referrals r left join
(select distinct on (a.referral_id) a.referral_id, a.app_id, count(*) as cnt
from activations a
group by a.referral_id, a.app_id
order by a.referral_id, count(*) desc
) a
on a.referral_id = r.id;
You have not explained the other columns that are in your result set.

joining more than two tables without repeating values

I want to join three tables,
I have three tables user, profession and education where "uid" is primary key for user table and foreign key for other two tables. I want to join these tables to produce result in one single table
user profession education
+------+-------+ +-----+----------+ +-----+---------+
| uid | uName | | uid | profName | | uid | eduName |
+------+-------+ +-----+----------+ +-----+---------+
| 1 | aaa | | 1 | prof1 | | 1 | edu1 |
| 2 | bbb | | 1 | prof2 | | 1 | edu2 |
| 3 | ccc | | 2 | prof1 | | 1 | edu3 |
| | | | 3 | prof3 | | 3 | edu4 |
| | | | 3 | prof2 | | | |
+------+-------+ +-----+----------+ +-----+---------+
Expected output
+------+-------+-----+----------+-----+---------+
| uid | uName | uid | profName | uid | eduName |
+------+-------+-----+----------+-----+---------+
| 1 | aaa | 1 | prof1 | 1 | edu1 |
| null | null | 1 | prof2 | 1 | edu2 |
| null | null |null | null | 1 | edu3 |
| 2 | bbb | 2 | prof1 | null| null |
| 3 | ccc | 3 | prof3 | 3 | edu4 |
| null | null | 3 | prof2 | null| null |
+------+-------+-----+----------+-----+---------+
I tried following query
select u.uid ,u.uName,p.uid , p.profName,e.uid,e.eduName
from user u inner join profession p on u.uid=p.pid
inner join education e on u.uid = e.uid
where u.uid=p.uid
and u.uid=e.uid
and i.uid=1
Which gives me duplicate values
+------+-------+-----+----------+-----+---------+
| uid | uName | uid | profName | uid | eduName |
+------+-------+-----+----------+-----+---------+
| 1 | aaa | 1 | prof1 | 1 | edu1 |
| 1 | aaa | 1 | prof2 | 1 | edu1 |
| 1 | aaa | 1 | prof1 | 1 | edu2 |
| 1 | aaa | 1 | prof2 | 1 | edu2 |
| 1 | aaa | 1 | prof1 | 1 | edu3 |
| 1 | aaa | 1 | prof2 | 1 | edu3 |
+------+-------+-----+----------+-----+---------+
Is there a way to get the output with not repeating the values.
Thanks
Bit of a swine this one.
I agree with #GordonLinoff that ideally this presentation would be done on the client side.
However, if we wish to do it in SQL, then the basic approach is that you have to get the maximum number of rows that will be consumed by each user (based on a count of how many entries they have in each of the professions and educations tables, and then of these counts, the max count).
Once we have the number of rows required for each user, we expand the rows out for each user as necessary using a numbers table (I've included a number generator for the purpose).
Then we join each table on, according to the uid and the row number of the entry in the joined table relative to the row number of the "expanded" rows for each user. Then we select the relevant columns, and that's us done. Pay the nurse on the way out!
WITH
number_table(number) AS
(
SELECT
(ones.n) + (10 * tens.n) + (100 * hundreds.n) AS number
FROM --available range 0 to 999
(VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS ones(n)
,(VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS tens(n)
,(VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS hundreds(n)
)
,users(u_uid, userName) AS
(
SELECT 1, 'aaa'
UNION ALL
SELECT 2, 'bbb'
UNION ALL
SELECT 3, 'ccc'
)
,professions(p_u_uid, profName) AS
(
SELECT 1, 'prof1'
UNION ALL
SELECT 1, 'prof2'
UNION ALL
SELECT 2, 'prof1'
UNION ALL
SELECT 3, 'prof3'
UNION ALL
SELECT 3, 'prof2'
)
,educations(e_u_uid, eduName) AS
(
SELECT 1, 'edu1'
UNION ALL
SELECT 1, 'edu2'
UNION ALL
SELECT 1, 'edu3'
UNION ALL
SELECT 3, 'edu4'
)
,row_counts(uid, row_count) AS
(
SELECT u_uid, COUNT(u_uid) FROM users GROUP BY u_uid
UNION ALL
SELECT p_u_uid, COUNT(p_u_uid) FROM professions GROUP BY p_u_uid
UNION ALL
SELECT e_u_uid, COUNT(e_u_uid) FROM educations GROUP BY e_u_uid
)
,max_counts(uid, max_count) AS
(
SELECT uid, MAX(row_count) FROM row_counts GROUP BY uid
)
SELECT
u_uid
,userName
,p_u_uid
,profName
,e_u_uid
,eduName
FROM
max_counts
INNER JOIN
number_table ON number BETWEEN 1 AND max_count
LEFT JOIN
(
SELECT u_uid, userName, ROW_NUMBER() OVER (PARTITION BY u_uid ORDER BY userName) AS user_match
FROM users
) AS users
ON u_uid = uid
AND number = user_match
LEFT JOIN
(
SELECT p_u_uid, profName, ROW_NUMBER() OVER (PARTITION BY p_u_uid ORDER BY profName) AS prof_match
FROM professions
) AS professions
ON p_u_uid = uid
AND number = prof_match
LEFT JOIN
(
SELECT e_u_uid, eduName, ROW_NUMBER() OVER (PARTITION BY e_u_uid ORDER BY eduName) AS edu_match
FROM educations
) AS educations
ON e_u_uid = uid
AND number = edu_match
ORDER BY
IIF(COALESCE(u_uid, p_u_uid, e_u_uid) IS NULL, 1, 0) ASC --nulls last
,COALESCE(u_uid, p_u_uid, e_u_uid) ASC
,IIF(COALESCE(p_u_uid, e_u_uid) IS NULL, 1, 0) ASC --nulls last
,COALESCE(p_u_uid, e_u_uid) ASC
,IIF(e_u_uid IS NULL, 1, 0) ASC --nulls last
,e_u_uid ASC
And the results:
u_uid userName p_u_uid profName e_u_uid eduName
----------- -------- ----------- -------- ----------- -------
1 aaa 1 prof1 1 edu1
NULL NULL 1 prof2 1 edu2
NULL NULL NULL NULL 1 edu3
2 bbb 2 prof1 NULL NULL
3 ccc 3 prof2 3 edu4
NULL NULL 3 prof3 NULL NULL
Did you try the distinct keyword?
select DISTINCT u.uid ,u.uName,p.uid , p.profName,e.uid,e.eduName
from user u inner join profession p on u.uid=p.pid
inner join education e on u.uid = e.uid
where u.uid=p.uid
and u.uid=e.uid
and i.uid=1

Postgresql : Mark the first row of a group

I have a table t like this :
id | group_id | name
------------------------
1 | 1 | richard
2 | 1 | ray
3 | 2 | enzo
4 | 2 | shiela
5 | 2 | anne
I have no problem selecting each group, however I want to mark the first occurrence for each group by group_id. Then add it as column to mark that the row is the first occurrence of that group.
E.g, Richard for group 1, or Enzo for group 2 and so on.
I should be able to use:
select
t.*
case
when (condition)
...(boolean result here)
end as is_first_row
from t
and result to :
id | group_id | name |is_first_row
-------------------------------
1 | 1 | richard | t
2 | 1 | ray | f
3 | 2 | enzo | t
4 | 2 | shiela | f
5 | 2 | anne | f
How do I formulate the condition statement for the select query?
Use row_number():
with my_table(id, group_id, name) as (
values
(1, 1, 'richard'),
(2, 1, 'ray'),
(3, 2, 'enzo'),
(4, 2, 'shiela'),
(5, 2, 'anne')
)
select *, row_number() over w = 1 as is_first_row
from my_table
window w as (partition by group_id order by id);
id | group_id | name | is_first_row
----+----------+---------+--------------
1 | 1 | richard | t
2 | 1 | ray | f
3 | 2 | enzo | t
4 | 2 | shiela | f
5 | 2 | anne | f
(5 rows)
Select row_number() to see how it works. Row numbers are calculated in partitions by group_id i.e. for every group_id separately, in order by id:
with my_table(id, group_id, name) as (
values
(1, 1, 'richard'),
(2, 1, 'ray'),
(3, 2, 'enzo'),
(4, 2, 'shiela'),
(5, 2, 'anne')
)
select *, row_number() over w
from my_table
window w as (partition by group_id order by id);
id | group_id | name | row_number
----+----------+---------+------------
1 | 1 | richard | 1
2 | 1 | ray | 2
3 | 2 | enzo | 1
4 | 2 | shiela | 2
5 | 2 | anne | 3
(5 rows)
please check my answer and let me know in case of any error in the logic
Create Table #Temp(id int,group_id int,name nvarchar(max))
Insert into #Temp values
(1,1,'richard')
,(2,1,'ray')
,(3,2,'enzo')
,(4,2,'shiela')
,(5,2,'anne')
Select t2.id,t2.group_id,t2.name,t1.group_id_c, case
when t1.group_id_c=1 then 't'
else 'f'
end AS is_firstrow from #temp t2 join
(Select t.*, row_number() over (partition by group_id order by id) as group_id_c from #Temp t ) t1
on t1.id=t2.id

Recursive SQL - count number of descendants in hierarchical structure

Consider a database table with the following columns:
mathematician_id
name
advisor1
advisor2
The database represents data from the Math Genealogy Project, where each mathematician usually has one single advisor, but there are situations when there are two advisors.
Visual aid to make things clearer:
How do I count the number of descendants for each of the mathematicians?
I should probably use Common Table Expressions (WITH RECURSIVE), but I am pretty much stuck at the moment. All the similar examples I found deal with hierarchies having only one parent, not two.
Update:
I adapted the solution for SQL Server provided by Vladimir Baranov to also work in PostgreSQL:
WITH RECURSIVE cte AS (
SELECT m.id as start_id,
m.id,
m.name,
m.advisor1,
m.advisor2,
1 AS level
FROM public.mathematicians AS m
UNION ALL
SELECT cte.start_id,
m.id,
m.name,
m.advisor1,
m.advisor2,
cte.level + 1 AS level
FROM public.mathematicians AS m
INNER JOIN cte ON cte.id = m.advisor1
OR cte.id = m.advisor2
),
cte_distinct AS (
SELECT DISTINCT start_id, id
FROM cte
)
SELECT cte_distinct.start_id,
m.name,
COUNT(*)-1 AS descendants_count
FROM cte_distinct
INNER JOIN public.mathematicians AS m ON m.id = cte_distinct.start_id
GROUP BY cte_distinct.start_id, m.name
ORDER BY cte_distinct.start_id
You didn't say what DBMS you use. I'll use SQL Server for this example, but it will work in other databases that support recursive queries as well.
Sample data
I entered only the right part of your tree, starting from Euler.
The most interesting part is the multiple paths between Lagrange and Dirichlet.
DECLARE #T TABLE (ID int, name nvarchar(50), Advisor1ID int, Advisor2ID int);
INSERT INTO #T (ID, name, Advisor1ID, Advisor2ID) VALUES
(1, 'Euler', NULL, NULL),
(2, 'Lagrange', 1, NULL),
(3, 'Laplace', NULL, NULL),
(4, 'Fourier', 2, NULL),
(5, 'Poisson', 2, 3),
(6, 'Dirichlet', 4, 5),
(7, 'Lipschitz', 6, NULL),
(8, 'Klein', NULL, 7),
(9, 'Lindemann', 8, NULL),
(10, 'Furtwangler', 8, NULL),
(11, 'Hilbert', 9, NULL),
(12, 'Taussky-Todd', 10, NULL);
This is how it looks like:
SELECT * FROM #T;
+----+--------------+------------+------------+
| ID | name | Advisor1ID | Advisor2ID |
+----+--------------+------------+------------+
| 1 | Euler | NULL | NULL |
| 2 | Lagrange | 1 | NULL |
| 3 | Laplace | NULL | NULL |
| 4 | Fourier | 2 | NULL |
| 5 | Poisson | 2 | 3 |
| 6 | Dirichlet | 4 | 5 |
| 7 | Lipschitz | 6 | NULL |
| 8 | Klein | NULL | 7 |
| 9 | Lindemann | 8 | NULL |
| 10 | Furtwangler | 8 | NULL |
| 11 | Hilbert | 9 | NULL |
| 12 | Taussky-Todd | 10 | NULL |
+----+--------------+------------+------------+
Query
It is a classic recursive query with two interesting points.
1) The recursive part of the CTE joins to the anchor part using both Advisor1ID and Advisor2ID:
INNER JOIN CTE
ON CTE.ID = T.Advisor1ID
OR CTE.ID = T.Advisor2ID
2) Since it is possible to have multiple paths to the descendant, recursive query may output the node several times. To eliminate these duplicates I used DISTINCT in CTE_Distinct. It may be possible to solve it more efficiently.
To understand better how the query works run each CTE separately and examine intermediate results.
WITH
CTE
AS
(
SELECT
T.ID AS StartID
,T.ID
,T.name
,T.Advisor1ID
,T.Advisor2ID
,1 AS Lvl
FROM #T AS T
UNION ALL
SELECT
CTE.StartID
,T.ID
,T.name
,T.Advisor1ID
,T.Advisor2ID
,CTE.Lvl + 1 AS Lvl
FROM
#T AS T
INNER JOIN CTE
ON CTE.ID = T.Advisor1ID
OR CTE.ID = T.Advisor2ID
)
,CTE_Distinct
AS
(
SELECT DISTINCT
StartID
,ID
FROM CTE
)
SELECT
CTE_Distinct.StartID
,T.name
,COUNT(*) AS DescendantCount
FROM
CTE_Distinct
INNER JOIN #T AS T ON T.ID = CTE_Distinct.StartID
GROUP BY
CTE_Distinct.StartID
,T.name
ORDER BY CTE_Distinct.StartID;
Result
+---------+--------------+-----------------+
| StartID | name | DescendantCount |
+---------+--------------+-----------------+
| 1 | Euler | 11 |
| 2 | Lagrange | 10 |
| 3 | Laplace | 9 |
| 4 | Fourier | 8 |
| 5 | Poisson | 8 |
| 6 | Dirichlet | 7 |
| 7 | Lipschitz | 6 |
| 8 | Klein | 5 |
| 9 | Lindemann | 2 |
| 10 | Furtwangler | 2 |
| 11 | Hilbert | 1 |
| 12 | Taussky-Todd | 1 |
+---------+--------------+-----------------+
Here DescendantCount counts the node itself as a descendant. You can subtract 1 from this result if you want to see 0 instead of 1 for the leaf nodes.
Here is SQL Fiddle.

How to select hierarchy collection? (mixed with non hierarchy data, etc)

Having the table:
I need to show the following:
| ID | PERSONID | MASTERID | CHILDID | VALUE | DEPTHLEVEL |
---------------------------------------------------------------
| 1 | 3 | 78452 | 21456 | 100 | 1 |
| 2 | 3 | 21456 | | 0 | 2 |
| 3 | 3 | 652314 | 417859 | 115 | 1 |
| 4 | 3 | 417859 | | 0 | 2 |
| 5 | 4 | 998654 | 223655 | 300 | 1 |
| 6 | 4 | 223655 | | 0 | 2 |
| 7 | 4 | 201302 |789654,441592| 200 | 1 |
| 8 | 4 | 789654 | | 0 | 2 |
| 9 | 4 | 441592 | | 0 | 2 |
| 10 | 5 | 999852 | | 123 | 1 |
Look at the row with id 10 this row has not relations (childs), the row with id 7 has two childs.
I need to quit (put value to 0) the value for every child/leaf.
For the row 1-9 I try the following query:
select v.* from
(
select v.id, v.personid,
case when level > 1
then 0
else
v.value
end thevalue,
v.masterid, v.childid, level depthlevel
from tmpsimpleexample v
start with v.childid is not null
connect by v.masterid = prior v.childid
) v
order by v.id
Results:
Look the rows with id 7, 8 is the master with two childs, I need to put this in one row.
This is the first problem.
Also I need to show the data with no hierarchy relation(id 10 in expected result table, id 11 in image table data).
I think that I can query all rows with masterid not referenced by a childid and then make an union between the first query(above) and the query to search all master id not referenced by childid.
The query to to search all rows with masterid not referenced by childid will show me the row without relation and the master rows of level 1.
select id, personid, value thevalue, masterid, childid, 1 depthlevel
from TMPSIMPLEEXAMPLE
where masterid not in
(select childid from TMPSIMPLEEXAMPLE where childid is not null)
Here I can do an union and the result will fit my requirements(except the childid concatenate for master row).
select v.* from
(
select v.id, v.personid,
case when level > 1
then 0
else
v.value
end thevalue,
v.masterid, v.childid, level depthlevel
from tmpsimpleexample v
start with v.childid is not null
connect by v.masterid = prior v.childid
union
select id, personid, value thevalue, masterid, childid, 1 depthlevel
from TMPSIMPLEEXAMPLE
where masterid not in
(select childid from TMPSIMPLEEXAMPLE where childid is not null)
) v
order by v.id
Almost final result:
But knowing that my real table has hundred of thousands of records make union like that are a good approach?
I've taken a stab at what I think your source data looks like:
| ID | PERSONID | MASTERID | CHILDID | VALUE |
-----------------------------------------------
| 1 | 3 | 78452 | 21456 | 100 |
| 2 | 3 | 21456 | | -1 |
| 3 | 3 | 652314 | 417859 | 115 |
| 4 | 3 | 417859 | | -1 |
| 5 | 4 | 998654 | 223655 | 300 |
| 6 | 4 | 223655 | | -1 |
| 7 | 4 | 201302 | 441592 | 200 |
| 7 | 4 | 201302 | 789654 | 200 |
| 9 | 4 | 441592 | | -1 |
| 8 | 4 | 789654 | | -1 |
| 10 | 4 | 999852 | | 123 |
-----------------------------------------------
The following query gets you your desired results:
enter code here
select id,
personid,
masterid,
listagg(childid, ',') within group (order by childid) childid,
-- Took a guess that all values for a personid were the same and didn't need to be aggregated...
min(decode(depthlevel, 1, value, null)) value,
min(depthlevel) depthlevel
from (select v.*, level depthlevel
from tmpsimpleexample v
connect by v.masterid = prior v.childid
-- Trick here is to start with all of the desired starting conditions...
start with not exists ( select 'X' from tmpsimpleexample v2 where v2.childid = v.masterid ))
group by id, personid, masterid;
If ordering of your CHILDID is important, you would need to re-join the nested view with TMPSIMPLEEXAMPLE:
select v1.id,
v1.personid,
v1.masterid,
listagg(v1.childid, ',') within group (order by v2.id) childid,
min(decode(depthlevel, 1, v1.value, null)) value,
min(depthlevel) depthlevel
from (select v.*, level depthlevel
from tmpsimpleexample v
connect by v.masterid = prior v.childid
start with not exists ( select 'X' from tmpsimpleexample v2 where v2.childid = v.masterid )) v1,
tmpsimpleexample v2
-- Outer Join is important!
where v1.childid = v2.masterid (+)
group by v1.id, v1.personid, v1.masterid;
The real magic here is the LISTAGGG function. If you are not on 11g or better yet (why not?!?), then the following article can guide you in building your own aggregate function:
http://www.oracle-base.com/articles/misc/string-aggregation-techniques.php