How to display multiple hierarchical tree structure in BigQuery

How to display multiple hierarchical tree structure in BigQuery - google-bigquery

I am working on the tree hierarchy of supervisors and their supervised employees. The difficulty is that some supervisors are employees supervised by other supervisors, and there are lots of it.
For SQL queries I acquired from class, only about simple self-joins, which might be only like two levels: A is supervised by B, and that's it.
But the issue from the real world is far more complicated. There are multiple levels, and I am not sure about the exact number. For example, A is supervised by B, and B is supervised by C, and C is supervised by D, etc.
I assume there are only like 5 or more levels for supervision. The raw data might be like this:
Employee Supervisor
A B
C B
D B
B V
E V
F E
G V
V (Blank which indicates no boss)
H A
The codes provided by some BigQuery expert is as below:
#standardSQL
SELECT t.Supervisor,
IF(t.Supervisor = t5.Supervisor,
STRUCT(Employee2 AS Employee1, NULL AS Employee2),
STRUCT(t5.Supervisor AS Employee1, Employee2 AS Employee2)
).*
FROM (
SELECT t1.Employee Supervisor,
COALESCE(t4.Employee, t3.Employee, t2.Employee) Employee2
FROM `project.dataset.table` t1
LEFT JOIN `project.dataset.table` t2 ON t2.Supervisor = t1.Employee
LEFT JOIN `project.dataset.table` t3 ON t3.Supervisor = t2.Employee
LEFT JOIN `project.dataset.table` t4 ON t4.Supervisor = t3.Employee
WHERE t1.Supervisor IS NULL
) t
LEFT JOIN `project.dataset.table` t5 ON t5.Employee = t.Employee2
and the result turned to be like this:
Row Supervisor Employee1 Employee2
1 V B A
2 V B C
3 V B D
4 V E F
5 V G null
But we want is :
Row Supervisor Employee1 Employee2 Employee3
1 V B A H
2 V B C Null
3 V B D Null
4 V E F Null
5 V G null Null
Then how to change the codes if I would like to have more levels of hierarchies? which means if I would like to add employee3, or 4, how can I edit it? Thanks!

Below is for BigQuery Standard SQL
#standardSQL
WITH e0 AS (
SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
SELECT e.Supervisor, Employee AS Employee1
FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
SELECT e.Supervisor, Employee1, Employee AS Employee2
FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
)
SELECT * FROM e3
if to apply to sample data from your question - result/output is
Row Supervisor Employee1 Employee2 Employee3
1 V B A H
2 V B C null
3 V B D null
4 V E F null
5 V G null null
You can easily extend above adding more levels like below (replacing and with respective numbers like 4, 5, 6, 7 etc.) Obviously up to reasonable extend
e<N> AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, ... , Employee AS Employee<N>
FROM e<N-1> e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee<N-1>
)
SELECT * FROM e<N>
for example
#standardSQL
WITH e0 AS (
SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
SELECT e.Supervisor, Employee AS Employee1
FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
SELECT e.Supervisor, Employee1, Employee AS Employee2
FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
), e4 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, Employee AS Employee4
FROM e3 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee3
), e5 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, Employee4, Employee AS Employee5
FROM e4 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee4
)
SELECT * FROM e5

Related

How to get the cumulative sum of children up its parents?

So I have written a query to get the cumulative sum of children but I think partition sum has error as its totalling for the parent that is not part of the children.
My fiddle is http://sqlfiddle.com/#!15/88828/1
I have dont a running total but thats wrong. I want the siblings total to a child and child total back to its parent.So basically cumulative total of a child up the tree.
expected output
parent_child_tree id name dimensionvalueid level order_sequence volume cummulative_total
A1 1 A1 (null) 0 1 20 840
-----A1:1 2 A1:1 1 1 1_1 (null) 820
----------A1:1:1 3 A1:1:1 2 2 1_1_2 20 820
-----------A1:1:1:1 4 A1:1:1:1 3 3 1_1_2_3 300 800
-----------A1:1:1:2 5 A1:1:1:2. 3 3 1_1_2_3 500 500
B1 6 B1 (null) 0 6 200 300
-----B1:2 8 B1:2 6 1 6_6 (null) null
-----B1:1 7 B1:1 6 1 6_6 (null) 100
----------B1:2:1 9 B1:2:1 8 2 6_6_8 100 100

To get totals for tree nodes you need to generate hierarchy tree for every node in a subquery like this
SELECT
d.*,
v.volume,
(
WITH RECURSIVE cte AS (
SELECT
dd.id AS branch_id,
dd.id
FROM dimensionvalue dd
WHERE dd.id = d.id
UNION ALL
SELECT
cte.branch_id,
dd.id
FROM dimensionvalue dd
JOIN cte ON dd.dimensionvalueid = cte.id
)
SELECT SUM(v.volume)
FROM cte
JOIN valuation v ON v.dimensionvalueid = cte.id
GROUP BY cte.branch_id
) AS totals
FROM dimensionvalue d
LEFT JOIN valuation v ON v.dimensionvalueid = d.id
ORDER BY d.name;
If you really need all those "decoration" columns that you generate in your query for each tree node than you can combine your recursive CTE hierarchy with subquery for totals calculation like this
WITH RECURSIVE hierarchy AS (
SELECT
d.id,
d.name,
d.dimensionvalueid,
0 AS level,
CAST(d.id AS varchar(50)) AS order_sequence
FROM dimensionvalue d
WHERE d.dimensionvalueid IS NULL
UNION ALL
SELECT
e.id,
e.name,
e.dimensionvalueid,
hierarchy.level + 1 AS level,
CAST(hierarchy.order_sequence || '_' || CAST(hierarchy.id AS VARCHAR(50)) AS VARCHAR(50)) AS order_sequence
FROM hierarchy
JOIN dimensionvalue e ON e.dimensionvalueid = hierarchy.id
)
SELECT
RIGHT('-----------', h.level * 5) || h.name || ' ' AS parent_child_tree,
h.*,
v.volume,
(
WITH RECURSIVE cte AS (
SELECT
dd.id AS branch_id,
dd.id
FROM dimensionvalue dd
WHERE dd.id = h.id
UNION ALL
SELECT
cte.branch_id,
dd.id
FROM dimensionvalue dd
JOIN cte ON dd.dimensionvalueid = cte.id
)
SELECT SUM(v.volume)
FROM cte
JOIN valuation v ON v.dimensionvalueid = cte.id
GROUP BY cte.branch_id
) AS totals
FROM hierarchy h
LEFT JOIN valuation v ON v.dimensionvalueid = h.id
ORDER BY h.name
You can check a working demo here

show dummy row as group by heading for left outer join

I have 3 tables say company, department and employee. Now I want to find out all the employees who works in company A under department D and want to display data for inner as well as left outer join to see any department that does not have any employee and company which does not have any department.
There is parent key relation between all this tables
**TABLE COMPANY**
COMPANY_NAME COMPANY ID
C1 COMP1
C2 COMP2
C3 COMP3
**TABLE DEPARTMENT**
DEPARTMENT_NAME COMPANY_NAME
D1 C1
D2 C1
D3 C2
**TABLE EMPLOYEE**
EMPLOYEE_ID DEPARTMENT_NAME
E1 D1
E2 D1
E3 D1
E4 D2
E5 D2
Company -- > Department --- > Employee.
I also want to display entity against each column as dummy column.
ENTITY COMPANY_NAME DEPARTMENT_NAME EMPLOYEE_ID
COMPANY C1 - -
DEPARTMENT C1 D1
EMPLOYEE C1 D1 E1
EMPLOYEE C1 D1 E2
EMPLOYEE C1 D1 E3
DEPARTMENT C1 D2 -
EMPLOYEE C1 D2 E4
EMPLOYEE C1 D2 E5
COMPANY C2 - -
DEPARTMENT C2 D3 -
COMPANY C3 - -

You can use the SET operator UNION ALL as follows:
SELECT 'COMPANY' AS ENTITY,
COMPANY_NAME,
'-' AS DEPARTMENT_NAME,
'-' AS EMPLOYEE_ID
FROM COMPANY C
UNION ALL
SELECT 'DEPARTMENT' AS ENTITY,
C.COMPANY_NAME,
D.DEPARTMENT_NAME,
'-' AS EMPLOYEE_ID
FROM DEPARTMENT D
JOIN COMPANY C ON C.COMPANY_ID = D.COMPANY_ID
UNION ALL
SELECT 'EMPLOYEE' AS ENTITY,
C.COMPANY_NAME,
D.DEPARTMENT_NAME,
E.EMPLOYEE_ID
FROM DEPARTMENT D
JOIN COMPANY C ON C.COMPANY_ID = D.COMPANY_ID
JOIN EMPLOYEE E ON E.DEPARTMENT_NAME = D.DEPARTMENT_NAME
ORDER BY COMPANY_NAME,
CASE WHEN DEPARTMENT_NAME = '-' THEN 1 ELSE 2 END,
DEPARTMENT_NAME,
CASE WHEN EMPLOYEE_ID = '-' THEN 1 ELSE 2 END,
EMPLOYEE_ID

Please try this:
select
case entity_disp
when 1 then 'COMPANY'
when 2 then 'DEPARTMENT'
when 3 then 'EMPLOYEE'
end as entity,
company_name,
department_name,
employee_id
from
(select
c.company_name,
d.department_name,
e.employee_id,
1 + decode(d.department_name, null, 0, 1) + decode(e.employee_id, null, 0, 1) as entity_disp
from
company c
left outer join department d on (d.company_name = c.company_name)
left outer join employee e on (e.department_name = d.department_name)
) c
order by c.entity_disp
Take into consideration that according yor tabel model expample it looks like that Company and Department tabels are joined based on company_name. Also Department and Employee tables are joined based on Department_name.
This is how select is implemented. If needed - rework joins with proper relational columns.

How to compare IDs with comma separated string in Oracle SQL?

I have two tables, for instance "Employees" and "Projects". I need a list of all projects with the name of all the employes involved.
Problem now is, that the employe_ids are saved with commas, like in the example below:
employe | ID project | employe_id
-------------- -----------------------
Person A | 1 Project X | ,2,
Person B | 2 Project Y |
Person C | 3 Project Z | ,1,3,
select
p.project, e.employe
from
projects p
left join employees e on e.id = p.employe_id ???
How do I have to write the join to get the desired output:
project | employe
--------------------
Project X | Person B,
Project Y |
Project Z | Person A, Person C

You can try group by and listagg as following:
select
p.project,
listagg(e.employe,',') within group (order by e.id) as employee
from
projects p
left join employees e on p.employe_id like '%,' || e.id || ',%'
Group by p.project
Cheers!!

Here's a kind of fun way to do it:
WITH cteId_counts_by_project AS (SELECT PROJECT,
NVL(REGEXP_COUNT(EMPLOYE_ID, '[^,]'), 0) AS ID_COUNT
FROM PROJECTS),
cteMax_id_count AS (SELECT MAX(ID_COUNT) AS MAX_ID_COUNT
FROM cteId_counts_by_project),
cteProject_employee_ids AS (SELECT PROJECT,
EMPLOYE_ID,
REGEXP_SUBSTR(EMPLOYE_ID, '[^,]',1, LEVEL) AS ID
FROM PROJECTS
CROSS JOIN cteMax_id_count m
CONNECT BY LEVEL <= m.MAX_ID_COUNT),
cteProject_emps AS (SELECT DISTINCT PROJECT, ID
FROM cteProject_employee_ids
WHERE ID IS NOT NULL),
cteProject_empnames AS (SELECT pe.PROJECT, pe.ID, e.EMPLOYE
FROM cteProject_emps pe
LEFT OUTER JOIN EMPLOYEES e
ON e.ID = pe.ID
ORDER BY pe.PROJECT, e.EMPLOYE)
SELECT p.PROJECT,
LISTAGG(pe.EMPLOYE, ',') WITHIN GROUP (ORDER BY pe.EMPLOYE) AS EMPLOYEE_LIST
FROM PROJECTS p
LEFT OUTER JOIN cteProject_empnames pe
ON pe.PROJECT = p.PROJECT
GROUP BY p.PROJECT
ORDER BY p.PROJECT
You can certainly compress some of the CTE's together to save space, but I kept them separate so you can see how each little bit adds to the solution.
dbfiddle here

Yet another option; code you need (as you already have those tables) begins at line #12:
SQL> -- Your sample data
SQL> with employees (employee, id) as
2 (select 'Person A', 1 from dual union all
3 select 'Person B', 2 from dual union all
4 select 'Person C', 3 from dual
5 ),
6 projects (project, employee_id) as
7 (select 'Project X', ',2,' from dual union all
8 select 'Project Y', null from dual union all
9 select 'Project Z', ',1,3,' from dual
10 ),
11 -- Employees per project
12 emperpro as
13 (select project, regexp_substr(employee_id, '[^,]+', 1, column_value) id
14 from projects cross join table(cast(multiset(select level from dual
15 connect by level <= regexp_count(employee_id, ',') + 1
16 ) as sys.odcinumberlist))
17 )
18 -- Final result
19 select p.project, listagg(e.employee, ', ') within group (order by null) employee
20 from emperpro p left join employees e on e.id = p.id
21 group by p.project
22 /
PROJECT EMPLOYEE
--------- ----------------------------------------
Project X Person B
Project Y
Project Z Person A, Person C
SQL>

You can extract the numeric IDs from the employee_id column of the projects table by using regexp_substr() and rtrim() ( trimming the last extra comma ) functions together, and then concatenate by listagg() function :
with p2 as
(
select distinct p.*, regexp_substr(rtrim(p.employee_id,','),'[^,]',1,level) as p_eid,
level as rn
from projects p
connect by level <= regexp_count(rtrim(p.employee_id,','),',')
)
select p2.project, listagg(e.employee,', ') within group (order by p2.rn) as employee
from p2
left join employees e on e.id = p2.p_eid
group by p2.project
Demo

Pivot and left join

T-SQL, need help with combining 2 values in an column when Pivoting.
I have a Employee table with the below data -
EmpId Status L#
E1 A 1
E1 B 1
E2 A 2
E2 B 2
E3 B 3
E3 C 3
E3 D 3
and a Supervisor Table -
EmpId Sup
E1 S1
E2 S2
E3 S3
I would like to combine the values for L# when the Status is B or C
EmpId Sup A B,C D
E1 S1 1 1 0
E2 S2 1 2 0
E3 S3 0 2 1

Just use conditional aggregation. I am a little unclear what the results are. The following counts the statuses:
select s.empid, s.sup,
sum(case when status = 'A' then 1 else 0 end) as a,
sum(case when status in ('B', 'C') then 1 else 0 end) as bc,
sum(case when status = 'D' then 1 else 0 end) as d
from employee e join
supervisor s
on e.empid = s.empid
group by s.empid, s.sup;

This may help
DECLARE #TableEmp TABLE (EmpId VARCHAR(10), Status VARCHAR(10), L# INT)
INSERT INTO #TableEmp VALUES
('E1', 'A', 1 ),
('E1', 'B', 1 ),
('E2', 'A', 2 ),
('E2', 'B', 2 ),
('E3', 'B', 3 ),
('E3', 'C', 3 ),
('E3', 'D', 3 )
DECLARE #TableSup TABLE(EmpId VARCHAR(10), Sup VARCHAR(10))
INSERT INTO #TableSup VALUES
('E1', 'S1'),
('E2', 'S2'),
('E3', 'S3')
SELECT EmpId, Sup ,ISNULL(A,0) A, ISNULL(B,0) B,ISNULL(C,0) C,ISNULL(D,0) D
FROM(
SELECT e.EmpId, Sup,Status,ISNULL(L#,0) L#
FROM #TableEmp e
LEFT JOIN #TableSup s ON e.EmpId = s.EmpId
) p
PIVOT(
SUM(L#)
FOR Status IN (A, B, C,D)
) piv

The following query should do what you want:
CREATE TABLE #emp (EmpID VARCHAR(10), [Status] VARCHAR(10), L# INT)
CREATE TABLE #sup (EmpID VARCHAR(10), [Sup] VARCHAR(10))
INSERT INTO #emp VALUES
('E1','A',0),
('E1','B',1),
('E2','A',2),
('E2','B',2),
('E3','B',3),
('E3','C',5),
('E3','D',5)
INSERT INTO #sup VALUES
('E1','S1'),
('E2','S2'),
('E3','S3')
SELECT * FROM (
SELECT e.EmpID, s.Sup, t.[Status], COUNT(L#) AS [Cnt]
FROM #emp e
INNER JOIN #sup s ON e.EmpID = s.EmpID
CROSS APPLY (VALUES(CASE WHEN e.[Status] IN ('B','C') THEN 'B,C' ELSE e.[Status] END)) AS T([Status])
GROUP BY e.EmpID, s.Sup, t.[Status] ) A
PIVOT (MAX([Cnt]) FOR [Status] IN ([A],[B,C],[D])) pvt
The result is as below,
EmpID Sup A B,C D
E1 S1 1 1 NULL
E2 S2 1 1 NULL
E3 S3 NULL 2 1

Sql Query to find sum from different tables

Hi i require a help in writing a query.
Tables are:
tblStandard1students
tblStandard2students
tblStandard1students
tblDivision
tblCandidateinfo
tblStandard1students,tblStandard2students,tblStandard1studentstbl contain information about students enrolled in standard 1,2 and 3
tblStandars1students
Candid admitted
1 Y
2 N
3 Y
tblDivision contains only 2 columns
ID Division
1 A
2 B
3 C
tblCandidateinfo
Candid gender Division
1 M 1
2 F 2
and so on...
Now I want the table like this
Division Students(Standard1) Students(Standard2) Students(Standard3)
M F M F M F
------------------------------------------------------------------------
A 1 0 0 0 0 1
B 2 2 3 3 4 4
C 1 0 0 0 0 0
I tried this following query:
SELECT Division,
( SELECT count(*)
FROM tblStandard1students A
INNER JOIN tblCandidateinfo B ON A.Candid=B.Candid
INNER JOIN tblDivision C ON C.ID=B.Division) AS Students(Standard1),
( SELECT count(*)
FROM tblStandard2students A
INNER JOIN tblCandidateinfo B ON A.Candid=B.Candid
INNER JOIN tblDivision C ON C.ID=B.Division) AS Students(Standard2),
( SELECT count(*)
FROM tblStandard3students A
INNER JOIN tblCandidateinfo B ON A.Candid=B.Candid
INNER JOIN tblDivision C ON C.ID=B.Division ) AS Students(Standard3)
FROM tblDivision Z
but this is only half the query i din segregate it gender wise...help me to complete it.

;WITH combined AS
(
SELECT ci.Division, 'Students(Standard1) ' + ci.gender AS grp
FROM tblCandidateInfo ci
INNER JOIN tblStandard1students s ON ci.Candid = s.Candid
UNION ALL
SELECT ci.Division, 'Students(Standard2) ' + ci.gender AS grp
FROM tblCandidateInfo ci
INNER JOIN tblStandard2students s ON ci.Candid = s.Candid
UNION ALL
SELECT ci.Division, 'Students(Standard3) ' + ci.gender AS grp
FROM tblCandidateInfo ci
INNER JOIN tblStandard1studentstbl s ON ci.Candid = s.Candid
)
SELECT Division,
[Students(Standard1) M], [Students(Standard1) F],
[Students(Standard2) M], [Students(Standard2) F],
[Students(Standard3) M], [Students(Standard3) F]
FROM
(
SELECT d.Division, grp
FROM tblDivision d
LEFT OUTER JOIN combined c ON d.ID = c.Division
) x
PIVOT
(
COUNT(grp)
FOR grp IN ([Students(Standard1) M], [Students(Standard1) F],
[Students(Standard2) M], [Students(Standard2) F],
[Students(Standard3) M], [Students(Standard3) F])
) y
ORDER BY Division

SELECT divison.Division ,IFNULL(stander1.M,0),IFNULL(stander1.F,0) FROM test.tblDivision divison
Left join (SELECT division ,count( case gender when 'M' then 1 else null end) as M,count( case gender when 'F' then 1 else null end) as F
FROM
test.tblCandidateinfo tc inner join test.tblStandars1students ts1
ON tc.Candid=ts1.Candid
group by division) as stander1 on stander1.division= divison.id
group by divison.id
;
Insted of IFNULL use ISNULL and
take left join for all standar tables

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to display multiple hierarchical tree structure in BigQuery - google-bigquery

Related

How to get the cumulative sum of children up its parents?

show dummy row as group by heading for left outer join

How to compare IDs with comma separated string in Oracle SQL?

Pivot and left join

Sql Query to find sum from different tables

Categories

Resources