How to collapse similar rows in SQL query - sql

Using an MS SQL query, I would like to collapse redundant data in sequential rows to make a report easier to read. Suppose I have the following data:
ID | Department | Name | Task
-------------------------------------------------
1 Sales Mike Call customer
2 Sales Mike Create quote
3 Sales Sarah Create order
4 Engineering Sam Design prototype
5 Engineering Sam Calculate loads
6 Production Team1 Build parts
7 Production Team2 Build parts
8 Production Team1 Assemble parts
9 Accounting Amy Invoice job
10 Sales Mike Call customer
11 Sales Mike Follow up after 30 days
How can I get the following summarized rows from my query:
Sales: Mike (Call customer, Create quote) Sarah (Create order)
Engineering: Sam (Design prototype, Calculate loads)
Production: Team1 (Build parts) Team2 (Build parts) Team1 (Assemble parts)
Accounting: Amy (Invoice job)
Sales: Mike (Call customer, Follow up after 30 days)
Essentially if the previous department and name are the same, add the task to a comma separated sub-list. If only the department is the same, start a new named sub-list and if the department is new then start a new row. Notice that items 10 and 11 should not be included in the first row so that the order of tasks is not lost.
In C# this task is easy using loops and/or Linq, however, I now need to produce the same result from a SQL query.

Use recursive Common Table Expressions (working SQLFiddle example)
;with RecursivePersonTask( Id, Department, Name, Tasks )
as
(
select
a.Id
, a.Department
, a.Name
, a.Task
from
dbo.Task a
left outer join dbo.Task b
on a.Department = b.Department
and a.Name = b.Name
and a.Id = b.Id + 1
where
b.Id is null
union all
select
t.Id
, t.Department
, t.Name
, rpt.Tasks + ', ' + t.Task
from
RecursivePersonTask rpt
inner join dbo.Task t
on rpt.Department = t.Department
and rpt.Name = t.Name
and rpt.Id = t.Id - 1
)
, CombinedPersonTasks( Id, Department, Name, Tasks )
as
(
select
ROW_NUMBER() over ( order by a.Id )
, a.Department
, a.Name
, '(' + a.Tasks + ')'
from
RecursivePersonTask a
left outer join RecursivePersonTask b
on a.Department = b.Department
and a.Name = b.Name
and a.Id = b.Id - 1
where
b.Id is null
)
, RecursiveDepartmentTasks( Id, Department, Tasks )
as
(
select
a.Id
, a.Department
, a.Name + ' ' + a.Tasks
from
CombinedPersonTasks a
left outer join CombinedPersonTasks b
on a.Department = b.Department
and a.Id = b.Id + 1
where
b.Id is null
union all
select
cpt.Id
, cpt.Department
, rdt.Tasks + ' ' + cpt.Name + ' ' + cpt.Tasks
from
RecursiveDepartmentTasks rdt
inner join CombinedPersonTasks cpt
on rdt.Department = cpt.Department
and rdt.Id = cpt.Id - 1
)
, CombinedDepartmentTasks( Id, Department, Tasks )
as
(
select
ROW_NUMBER() over ( order by a.Id )
, a.Department
, a.Tasks
from
RecursiveDepartmentTasks a
left outer join RecursiveDepartmentTasks b
on a.Department = b.Department
and a.Id = b.Id - 1
where
b.Id is null
)
select
*
from
CombinedDepartmentTasks
order by
Id

If you have SQL Server 2012, you can use the ROWS parameters of the windowing functions:
with tasks as (
select 1 as id, 'Sales' as dept ,'Mike' as name, 'Call customer' as task union
select 2, 'Sales' ,'Mike', 'Create quote' union
select 3, 'Sales' ,'Sarah', ' Create order' union
select 4, 'Engineering' ,'Sam', ' Design prototype' union
select 5, 'Engineering' ,'Sam', ' Calculate loads' union
select 6, 'Production' ,'Team1', ' Build parts' union
select 7, 'Production' ,'Team2', ' Build parts' union
select 8, 'Production' ,'Team1', ' Assemble parts' union
select 9, 'Accounting' ,'Amy', ' Invoice job' union
select 10, 'Sales' ,'Mike', ' Call customer' union
select 11, 'Sales' ,'Mike', ' Follow up after 30 days'
)
select max(NewDepartmentRollover) over (order by id rows unbounded preceding) as ColumnToGroupOn , * from (
select
case when max(dept) over (order by id rows between 1 preceding and 1 preceding) = dept then NULL else id end as NewDepartmentRollover,
* from tasks
) GroupOn
If you only have SQL Server 2005 or 2008, something like this will do:
with tasks as (
select 1 as id, 'Sales' as dept ,'Mike' as name, 'Call customer' as task union
select 2, 'Sales' ,'Mike', 'Create quote' union
select 3, 'Sales' ,'Sarah', ' Create order' union
select 4, 'Engineering' ,'Sam', ' Design prototype' union
select 5, 'Engineering' ,'Sam', ' Calculate loads' union
select 6, 'Production' ,'Team1', ' Build parts' union
select 7, 'Production' ,'Team2', ' Build parts' union
select 8, 'Production' ,'Team1', ' Assemble parts' union
select 9, 'Accounting' ,'Amy', ' Invoice job' union
select 10, 'Sales' ,'Mike', ' Call customer' union
select 11, 'Sales' ,'Mike', ' Follow up after 30 days'
)
, b as (
select
row_number() over (order by tasks.id) as rn,
tasks.id as lefty
from tasks
left join tasks t3
on t3.id = tasks.id - 1
where tasks.dept <> isnull(t3.dept,'')
)
select tasks.*, lefty as columnToGroupOn from tasks left join (
select b.lefty, isnull(b2.lefty,999)-1 as righty from b
left join
b b2 on b.rn = b2.rn - 1
) c
on tasks.id between lefty and righty

Related

Is it possible to replace values in row to values from the same table by the reference in Oracle SQL

There is a table that contains values that are used in formulas. There are simple variables, that do not contain any expression, and also there are some variables that combined from simple variables into formula. I need to figure out if is it possible to do a SELECT query to get a readable formula based on aliases it contains. Each of these aliases could be used in other formulas.
Let's say that there are two tables:
ITEM TABLE
ID
Name
FORMULA_ID
1
Item name 1
f_3
2
Item name 2
f_26
FORMULA TABLE
ID
EXPRESSION
ALIASE
NAME
f_1
null
var_100
Ticket
f_2
null
var_200
Amount
f_3
var_100 * var_200
var_300
Some description
So is there any chance to query, with result like:
ITEM_NAME
READABLE_EXPRESSION
Item name 1
Ticket * Amount
Try this:
with items(ID,Name,Formula_Id) AS (
select 1, 'Item name 1', 'f_3' from dual union all
select 2, 'Item name 2', 'f_26' from dual
),
formulas (ID, EXPRESSION, ALIAS, NAME) as (
select 'f_1', null, 'var_100', 'Ticket' from dual union all
select 'f_2', null, 'var_200', 'Amount' from dual union all
select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' from dual
),
rnformulas (id, EXPRESSION, ALIAS, NAME, rn) as (
select fm.*, row_number() over(order by id) as rn from formulas fm
),
recsubstitute( lvl, item_id, rn, expression ) as (
select 1, it.id, 0, fm.expression
from items it
join rnformulas fm on it.formula_id = fm.id
union all
select lvl+1, item_id, fm.rn, replace(r.expression, fm.alias, fm.name)
from recsubstitute r
join rnformulas fm on instr(r.expression, fm.alias) > 0 and fm.rn > r.rn
)
select item_id, expression from (
select item_id, expression, row_number() over(partition by item_id order by lvl desc, rn asc) as rn
from recsubstitute
)
where rn = 1
;
ITEM_ID EXPRESSION
---------- ------------------------------------------------------------
1 Ticket * Amount
Note that it's far to be bullet proof against all situations, especially recursion in the aliases.
Some improvement with another set of data:
with items(ID,Name,Formula_Id) AS (
select 1, 'Item name 1', 'f_3' from dual union all
select 2, 'Item name 2', 'f_4' from dual
),
formulas (ID, EXPRESSION, ALIAS, NAME) as (
select 'f_1', null, 'var_100', 'Ticket' from dual union all
select 'f_2', null, 'var_200', 'Amount' from dual union all
select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' from dual union all
select 'f_4', 'var_300', null, 'Other description' from dual
),
rnformulas (id, EXPRESSION, ALIAS, NAME, rn) as (
select fm.*, row_number() over(order by id) as rn from formulas fm
),
recsubstitute( lvl, item_id, rn, expression ) as (
select 1, it.id, 0, fm.expression
from items it
join rnformulas fm on it.formula_id = fm.id
union all
select lvl+1, item_id, fm.rn, replace(r.expression, fm.alias, nvl(fm.expression,fm.name))
from recsubstitute r
join rnformulas fm on instr(r.expression, fm.alias) > 0
)
select item_id, expression from (
select item_id, expression, row_number() over(partition by item_id order by lvl desc, rn asc) as rn
from recsubstitute
)
where rn = 1
;
1 Ticket * Amount
2 Ticket * Amount
Split the space-delimited formulas into rows. Join the expression parts to the aliases and replace the alias with the name. Join this to the item_table using LISTAGG to concatenate the rows back into a single column.
WITH formula_split AS (
SELECT DISTINCT ft.id
,level lvl
,regexp_substr(ft.expression,'[^ ]+',1,level) expression_part
FROM formula_table ft
CONNECT BY ( ft.id = ft.id
AND level <= length(ft.expression) - length(replace(ft.expression,' ')) + 1 ) START WITH ft.expression IS NOT NULL
),readable_tbl AS (
SELECT ft.id
,ft.lvl
,replace(ft.expression_part,ftn1.aliase,ftn1.name) readable_expression
FROM formula_split ft
LEFT JOIN formula_table ftn1 ON ( ft.expression_part = ftn1.aliase )
)
SELECT it.name item_name
,LISTAGG(readable_expression,' ') WITHIN GROUP(ORDER BY lvl) readable_expression
FROM item_table it
JOIN readable_tbl rt ON ( it.formula_id = rt.id )
GROUP BY it.name
With sample data create CTE (calc_data) for modeling
WITH
items (ITEM_ID, ITEM_NAME, FORMULA_ID) AS
(
Select 1, 'Item name 1', 'f_3' From Dual Union All
Select 2, 'Item name 2', 'f_26' From Dual
),
formulas (FORMULA_ID, EXPRESSION, ALIAS, ELEMENT_NAME) AS
(
Select 'f_1', null, 'var_100', 'Ticket' From Dual Union All
Select 'f_2', null, 'var_200', 'Amount' From Dual Union All
Select 'f_3', 'var_100 * var_200', 'var_300', 'Some description' From Dual
),
calc_data AS
( SELECT e.ITEM_NAME, e.FORMULA_ID, e.FORMULA, e.X, e.OPERAND, e.Y,
ROW_NUMBER() OVER(Partition By e.ITEM_NAME Order By e.FORMULA_ID) "RN", f.ELEMENT_NAME
FROM( Select CAST('.' as VARCHAR2(32)) "FORMULA", i.ITEM_NAME, f.FORMULA_ID,
SubStr(Replace(f.EXPRESSION, ' ', ''), 1, InStr(Replace(f.EXPRESSION, ' ', ''), '*') - 1) "X",
CASE
WHEN InStr(f.EXPRESSION, '+') > 0 THEN '+'
WHEN InStr(f.EXPRESSION, '-') > 0 THEN '-'
WHEN InStr(f.EXPRESSION, '*') > 0 THEN '*'
WHEN InStr(f.EXPRESSION, '/') > 0 THEN '/'
END "OPERAND",
--
SubStr(Replace(f.EXPRESSION, ' ', ''), InStr(Replace(f.EXPRESSION, ' ', ''), '*') + 1) "Y"
From formulas f
Inner Join items i ON(f.FORMULA_ID = i.FORMULA_ID)
) e
Inner Join formulas f ON(f.FORMULA_ID <> e.FORMULA_ID)
)
Main SQL with MODEL clause
SELECT ITEM_NAME, FORMULA
FROM ( SELECT *
FROM calc_data
MODEL
PARTITION BY (ITEM_NAME)
DIMENSION BY (RN)
MEASURES (X, OPERAND, Y, FORMULA, ELEMENT_NAME)
RULES ( FORMULA[1] = ELEMENT_NAME[1] || ' ' || OPERAND[1] || ' ' || ELEMENT_NAME[2] )
)
WHERE RN = 1
R e s u l t :
ITEM_NAME
FORMULA
Item name 1
Amount * Ticket
Just as an option...
The same result without any analytic functions, pseudo columns, unions, etc... - just selecting over and over and over. Not readable, though...
Select
i.ITEM_NAME,
REPLACE( REPLACE( (Select EXPRESSION From formulas Where FORMULA_ID = f.FORMULA_ID),
(Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID),
(Select ELEMENT_NAME From formulas Where FORMULA_ID <> f.FORMULA_ID And ALIAS = (Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) )
) ||
REPLACE( (Select EXPRESSION From formulas Where FORMULA_ID = f.FORMULA_ID),
(Select Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID),
(Select ELEMENT_NAME From formulas Where FORMULA_ID <> f.FORMULA_ID And ALIAS = (Select Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) )
),
(SELECT Max(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID ) || (Select Min(ALIAS) From formulas Where FORMULA_ID <> f.FORMULA_ID) ||
SubStr(f.EXPRESSION, InStr(f.EXPRESSION, ' ', 1, 1), (InStr(f.EXPRESSION, ' ', 1, 2) - InStr(f.EXPRESSION, ' ', 1, 1)) + 1 ), ''
) "FORMULA"
From
formulas f
Left Join
items i ON(i.FORMULA_ID = f.FORMULA_ID)
Where i.ITEM_NAME Is Not Null
Thank you all for your answers!
I've decided to create a pl/sql function, just to modify a formula to readable row. So the function just looks for variables using regex, and uses indexes to replace every variable with a name.
CREATE OR REPLACE FUNCTION READABLE_EXPRESSION(inExpression IN VARCHAR2)
RETURN VARCHAR2
IS
matchesCount INTEGER;
toReplace VARCHAR2(32767);
readableExpression VARCHAR2(32767);
selectString VARCHAR2(32767);
BEGIN
matchesCount := REGEXP_COUNT(inExpression, '(var_)(.*?)');
IF matchesCount IS NOT NULL AND matchesCount > 0 THEN
readableExpression := inExpression;
FOR i in 1..matchesCount
LOOP
toReplace := substr(inExpression, REGEXP_INSTR(inExpression, '(var_)(.*?)', 1, i, 0),
REGEXP_INSTR(inExpression, '(var_)(.*?)', 1, i, 1) -
REGEXP_INSTR(inExpression, '(var_)(.*?)', 1, i, 0)
);
SELECT DISTINCT F.NAME
INTO selectString
FROM FORMULA F
WHERE F.ALIASE = toReplace FETCH FIRST 1 ROW ONLY;
readableExpression := REPLACE(readableExpression,
toReplace,
selectString
);
end loop;
end if;
return readableExpression;
END;
So such function returns 1 result row with replaced values for 1 input row with FORMULA. All you need to do is join the ITEM and FORMULA tables in the SELECT.
SELECT item.name, READABLE_EXPRESSION(formula.expression)
FROM item
JOIN formula ON item.formula_id = formula.id;
Please note that the tables are fictitious so as not to reveal the actual data structure, so there might be some inaccuracies. But the general idea should be clear.

How to display null values in IN operator for SQL with two conditions in where

I have this query
select *
from dbo.EventLogs
where EntityID = 60181615
and EventTypeID in (1, 2, 3, 4, 5)
and NewValue = 'Received'
If 2 and 4 does not exist with NewValue 'Received' it shows this
current results
What I want
Ideally you should maintain somewhere a table containing all possible EventTypeID values. Sans that, we can use a CTE in place along with a left join:
WITH EventTypes AS (
SELECT 1 AS ID UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
)
SELECT et.ID AS EventTypeId, el.*
FROM EventTypes et
LEFT JOIN dbo.EventLogs el
ON el.EntityID = 60181615 AND
el.NewValue = 'Received'
WHERE
et.ID IN (1,2,3,4,5);

Recursive/hierarchical query in BigQuery

I have a recursion/hierarchical problem that I'm trying to figure out in BigQuery.
I have a list of employees and each employee has a manager ID. I need to be able to enter a single Employee_ID and return an array of every person beneath them.
CREATE TABLE p_RLS.testHeirarchy
(
Employee_ID INT64,
Employee_Name STRING,
Position STRING,
Line_Manager_ID INT64
);
INSERT INTO p_RLS.testHeirarchy (Employee_ID, Employee_Name, Position, Line_Manager_ID)
VALUES(1,'Joe','Worker',11),
(2,'James','Worker',11),
(3,'Jack','Worker',11),
(4,'Jill','Worker',12),
(5,'Jan','Worker',12),
(6,'Jacquie','Worker',13),
(7,'Joaquin','Worker',14),
(8,'Jeremy','Worker',14),
(9,'Jade','Worker',15),
(10,'Jocelyn','Worker',15),
(11, 'Bob', 'Store Manager',16),
(12, 'Bill', 'Store Manager',16),
(13, 'Barb', 'Store Manager',16),
(14, 'Ben', 'Store Manager',17),
(15, 'Burt', 'Store Manager',17),
(16, 'Sally','Group Manager',18),
(17, 'Sam','Group Manager',19),
(18, 'Anna', 'Ops Manager',20),
(19, 'Amy', 'Ops Manager',20),
(20, 'Zoe', 'State Manager', NULL);
My desired output would resemble:
SELECT 20 as Employee_ID, [19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1] as Reports;
SELECT 11 as Employee_ID, [3,2,1] as Reports;
SELECT 1 as Employee_ID, [] as Reports;
I have got the following working but it seems very ugly/inconvenient and doesn't support unlimited levels:
WITH test as (
SELECT L0.Employee_ID, L0.Employee_Name, L0.Position, L0.Line_Manager_ID,
ARRAY_AGG(DISTINCT L1.Employee_ID IGNORE NULLS) as Lvl1,
ARRAY_AGG(DISTINCT L2.Employee_ID IGNORE NULLS) as Lvl2,
ARRAY_AGG(DISTINCT L3.Employee_ID IGNORE NULLS) as Lvl3,
ARRAY_AGG(DISTINCT L4.Employee_ID IGNORE NULLS) as Lvl4,
ARRAY_AGG(DISTINCT L5.Employee_ID IGNORE NULLS) as Lvl5,
ARRAY_AGG(DISTINCT L6.Employee_ID IGNORE NULLS) as Lvl6,
ARRAY_AGG(DISTINCT L7.Employee_ID IGNORE NULLS) as Lvl7
FROM p_RLS.testHeirarchy as L0
LEFT OUTER JOIN p_RLS.testHeirarchy L1 ON L0.Employee_ID = L1.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L2 ON L1.Employee_ID = L2.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L3 ON L2.Employee_ID = L3.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L4 ON L3.Employee_ID = L4.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L5 ON L4.Employee_ID = L5.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L6 ON L5.Employee_ID = L6.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L7 ON L6.Employee_ID = L7.Line_Manager_ID
WHERE L0.Employee_ID = 16
GROUP BY 1,2,3,4)
SELECT
Employee_ID, ARRAY_CONCAT(
IFNULL(Lvl1,[]),
IFNULL(Lvl2,[]),
IFNULL(Lvl3,[]),
IFNULL(Lvl4,[]),
IFNULL(Lvl5,[]),
IFNULL(Lvl6,[]),
IFNULL(Lvl7,[])) as All_reports
FROM test
Is there a better way to do this? Is a recursive approach possible in BigQuery?
Recursive CTE was recently introduced !
This makes things so much easier
with recursive iterations as (
select line_manager_id, employee_id, 1 pos from your_table
union all
select b.line_manager_id, a.employee_id, pos + 1
from your_table a join iterations b
on b.employee_id = a.line_manager_id
)
select line_manager_id, string_agg('' || employee_id order by pos, employee_id desc) as reports_as_list
from iterations
where not line_manager_id is null
group by line_manager_id
order by line_manager_id desc
If applied to sample data in question - output is
Below is for BigQuery Standard SQL
DECLARE rows_count, run_away_stop INT64 DEFAULT 0;
CREATE TEMP TABLE initialData AS WITH input AS (
SELECT 1 Employee_ID,'Joe' Employee_Name,'Worker' Position,11 Line_Manager_ID UNION ALL
SELECT 2,'James','Worker',11 UNION ALL
SELECT 3,'Jack','Worker',11 UNION ALL
SELECT 4,'Jill','Worker',12 UNION ALL
SELECT 5,'Jan','Worker',12 UNION ALL
SELECT 6,'Jacquie','Worker',13 UNION ALL
SELECT 7,'Joaquin','Worker',14 UNION ALL
SELECT 8,'Jeremy','Worker',14 UNION ALL
SELECT 9,'Jade','Worker',15 UNION ALL
SELECT 10,'Jocelyn','Worker',15 UNION ALL
SELECT 11, 'Bob', 'Store Manager',16 UNION ALL
SELECT 12, 'Bill', 'Store Manager',16 UNION ALL
SELECT 13, 'Barb', 'Store Manager',16 UNION ALL
SELECT 14, 'Ben', 'Store Manager',17 UNION ALL
SELECT 15, 'Burt', 'Store Manager',17 UNION ALL
SELECT 16, 'Sally','Group Manager',18 UNION ALL
SELECT 17, 'Sam','Group Manager',19 UNION ALL
SELECT 18, 'Anna', 'Ops Manager',20 UNION ALL
SELECT 19, 'Amy', 'Ops Manager',20 UNION ALL
SELECT 20, 'Zoe', 'State Manager', NULL
)
SELECT * FROM input;
CREATE TEMP TABLE ttt AS
SELECT Line_Manager_ID, ARRAY_AGG(Employee_ID) Reports FROM initialData WHERE NOT Line_Manager_ID IS NULL GROUP BY Line_Manager_ID;
LOOP
SET (run_away_stop, rows_count) = (SELECT AS STRUCT run_away_stop + 1, COUNT(1) FROM ttt);
CREATE OR REPLACE TEMP TABLE ttt1 AS
SELECT Line_Manager_ID, ARRAY(SELECT DISTINCT Employee_ID FROM UNNEST(Reports) Employee_ID ORDER BY Employee_ID DESC) Reports
FROM (
SELECT Line_Manager_ID, ARRAY_CONCAT_AGG(Reports) Reports
FROM (
SELECT t2.Line_Manager_ID, ARRAY_CONCAT(t1.Reports, t2.Reports) Reports
FROM ttt t1, ttt t2
WHERE (SELECT COUNTIF(t1.Line_Manager_ID = Employee_ID) FROM UNNEST(t2.Reports) Employee_ID) > 0
) GROUP BY Line_Manager_ID
);
CREATE OR REPLACE TEMP TABLE ttt AS
SELECT * FROM ttt1 UNION ALL
SELECT * FROM ttt WHERE NOT Line_Manager_ID IN (SELECT Line_Manager_ID FROM ttt1);
IF (rows_count = (SELECT COUNT(1) FROM ttt) AND run_away_stop > 1) OR run_away_stop > 10 THEN BREAK; END IF;
END LOOP;
SELECT Employee_ID,
(
SELECT STRING_AGG(CAST(Employee_ID AS STRING), ',' ORDER BY Employee_ID DESC)
FROM ttt.Reports Employee_ID
) Reports_as_list
FROM (SELECT DISTINCT Employee_ID FROM initialData) d
LEFT JOIN ttt ON Employee_ID = Line_Manager_ID
ORDER BY Employee_ID DESC;
with result
Row Employee_ID Reports_as_list
1 20 19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1
2 19 17,15,14,10,9,8,7
3 18 16,13,12,11,6,5,4,3,2,1
4 17 15,14,10,9,8,7
5 16 13,12,11,6,5,4,3,2,1
6 15 10,9
7 14 8,7
8 13 6
9 12 5,4
10 11 3,2,1
11 10 null
12 9 null
13 8 null
14 7 null
15 6 null
16 5 null
17 4 null
18 3 null
19 2 null
20 1 null
In case if you need Reports as array - replace last statement in above script with below
SELECT Employee_ID, Reports Reports_as_array
FROM (SELECT DISTINCT Employee_ID FROM initialData) d
LEFT JOIN ttt ON Employee_ID = Line_Manager_ID
ORDER BY Employee_ID DESC;
Note: depends on level of nesting in your hierarchy - you might need to adjust 10 in OR run_away_stop > 10
To the question: "Is a recursive approach possible in BigQuery?"
Yes!
Now that BigQuery supports scripting and loops I solved some recursive problems from the Advent of Code with BigQuery:
https://towardsdatascience.com/advent-of-code-sql-bigquery-31e6a04964d4
CREATE TEMP TABLE planets AS SELECT 'YOU' planet;
LOOP
SET steps = steps+1
;
CREATE OR REPLACE TEMP TABLE planets AS
SELECT DISTINCT planet
FROM (
SELECT origin planet FROM t1 WHERE dest IN (SELECT planet FROM planets)
UNION ALL
SELECT dest planet FROM t1 WHERE origin IN (SELECT planet FROM planets)
)
;
IF 'SAN' IN (SELECT * FROM planets )
THEN LEAVE;
END IF;
END LOOP
;
SELECT steps-2
I would use a similar approach to navigate the graph and annotate all parent relationships.
Soon: I'll write a blog post on the specifics of tree traversal to get everyone under x. But this code will help you in the meantime.

SQL hierarchy count totals report

I'm creating a report with SQL server 2012 and Report Builder which must show the total number of Risks at a high, medium and low level for each Parent Element.
Each Element contains a number of Risks which are rated at a certain level. I need the total for the Parent Elements. The total will include the number of all the Child Elements and also the number the Element itself may have.
I am using CTEs in my query- the code I have attached isn't working (there are no errors - it's just displaying the incorrect results) and I'm not sure that my logic is correct??
Hopefully someone can help. Thanks in advance.
My table structure is:
ElementTable
ElementTableId(PK) ElementName ElementParentId
RiskTable
RiskId(PK) RiskName RiskRating ElementId(FK)
My query:
WITH cte_Hierarchy(ElementId, ElementName, Generation, ParentElementId)
AS (SELECT ElementId,
NAME,
0,
ParentElementId
FROM Extract.Element AS FirtGeneration
WHERE ParentElementId IS NULL
UNION ALL
SELECT NextGeneration.ElementId,
NextGeneration.NAME,
Parent.Generation + 1,
Parent.ElementId
FROM Extract.Element AS NextGeneration
INNER JOIN cte_Hierarchy AS Parent
ON NextGeneration.ParentElementId = Parent.ElementId),
CTE_HighRisk
AS (SELECT r.ElementId,
Count(r.RiskId) AS HighRisk
FROM Extract.Risk r
WHERE r.RiskRating = 'High'
GROUP BY r.ElementId),
CTE_LowRisk
AS (SELECT r.ElementId,
Count(r.RiskId) AS LowRisk
FROM Extract.Risk r
WHERE r.RiskRating = 'Low'
GROUP BY r.ElementId),
CTE_MedRisk
AS (SELECT r.ElementId,
Count(r.RiskId) AS MedRisk
FROM Extract.Risk r
WHERE r.RiskRating = 'Medium'
GROUP BY r.ElementId)
SELECT rd.ElementId,
rd.ElementName,
rd.ParentElementId,
Generation,
HighRisk,
MedRisk,
LowRisk
FROM cte_Hierarchy rd
LEFT OUTER JOIN CTE_HighRisk h
ON rd.ElementId = h.ElementId
LEFT OUTER JOIN CTE_MedRisk m
ON rd.ElementId = m.ElementId
LEFT OUTER JOIN CTE_LowRisk l
ON rd.ElementId = l.ElementId
WHERE Generation = 1
Edit:
Sample Data
ElementTableId(PK) -- ElementName -- ElementParentId
1 ------------------- Main --------------0
2 --------------------Element1-----------1
3 --------------------Element2 ----------1
4 --------------------SubElement1 -------2
RiskId(PK) RiskName RiskRating ElementId(FK)
a -------- Financial -- High ----- 2
b -------- HR --------- High ----- 3
c -------- Marketing -- Low ------- 2
d -------- Safety -----Medium ----- 4
Sample Output:
Element Name High Medium Low
Main ---------- 2 ---- 1 -------1
Here is your sample tables
SELECT * INTO #TABLE1
FROM
(
SELECT 1 ElementTableId, 'Main' ElementName ,0 ElementParentId
UNION ALL
SELECT 2,'Element1',1
UNION ALL
SELECT 3, 'Element2',1
UNION ALL
SELECT 4, 'SubElement1',2
)TAB
SELECT * INTO #TABLE2
FROM
(
SELECT 'a' RiskId, 'Fincancial' RiskName,'High' RiskRating ,2 ElementId
UNION ALL
SELECT 'b','HR','High',3
UNION ALL
SELECT 'c', 'Marketing','Low',2
UNION ALL
SELECT 'd', 'Safety','Medium',4
)TAB
We are finding the children of a parent, its count of High,Medium and Low and use cross join to show parent with all the combinations of its children's High,Medium and Low
UPDATE
The below variable can be used to access the records dynamically.
DECLARE #ElementTableId INT;
--SET #ElementTableId = 1
And use the above variable inside the query
;WITH CTE1 AS
(
SELECT *,0 [LEVEL] FROM #TABLE1 WHERE ElementTableId = #ElementTableId
UNION ALL
SELECT E.*,e2.[LEVEL]+1 FROM #TABLE1 e
INNER JOIN CTE1 e2 on e.ElementParentId = e2.ElementTableId
AND E.ElementTableId<>#ElementTableId
)
,CTE2 AS
(
SELECT E1.*,E2.*,COUNT(RiskRating) OVER(PARTITION BY RiskRating) CNT
from CTE1 E1
LEFT JOIN #TABLE2 E2 ON E1.ElementTableId=E2.ElementId
)
,CTE3 AS
(
SELECT DISTINCT T1.ElementName,C2.RiskRating,C2.CNT
FROM #TABLE1 T1
CROSS JOIN CTE2 C2
WHERE T1.ElementTableId = #ElementTableId
)
SELECT *
FROM CTE3
PIVOT(MIN(CNT)
FOR RiskRating IN ([High], [Medium],[Low])) AS PVTTable
SQL FIDDLE
RESULT
UPDATE 2
I am updating as per your new requirement
Here is sample table in which I have added extra data to test
SELECT * INTO #ElementTable
FROM
(
SELECT 1 ElementTableId, 'Main' ElementName ,0 ElementParentId
UNION ALL
SELECT 2,'Element1',1
UNION ALL
SELECT 3, 'Element2',1
UNION ALL
SELECT 4, 'SubElement1',2
UNION ALL
SELECT 5, 'Main 2',0
UNION ALL
SELECT 6, 'Element21',5
UNION ALL
SELECT 7, 'SubElement21',6
UNION ALL
SELECT 8, 'SubElement22',7
UNION ALL
SELECT 9, 'SubElement23',7
)TAB
SELECT * INTO #RiskTable
FROM
(
SELECT 'a' RiskId, 'Fincancial' RiskName,'High' RiskRating ,2 ElementId
UNION ALL
SELECT 'b','HR','High',3
UNION ALL
SELECT 'c', 'Marketing','Low',2
UNION ALL
SELECT 'd', 'Safety','Medium',4
UNION ALL
SELECT 'e' , 'Fincancial' ,'High' ,5
UNION ALL
SELECT 'f','HR','High',6
UNION ALL
SELECT 'g','HR','High',6
UNION ALL
SELECT 'h', 'Marketing','Low',7
UNION ALL
SELECT 'i', 'Safety','Medium',8
UNION ALL
SELECT 'j', 'Safety','High',8
)TAB
I have written the logic in query
;WITH CTE1 AS
(
-- Here you will find the level of every elements in the table
SELECT *,0 [LEVEL]
FROM #ElementTable WHERE ElementParentId = 0
UNION ALL
SELECT ET.*,CTE1.[LEVEL]+1
FROM #ElementTable ET
INNER JOIN CTE1 on ET.ElementParentId = CTE1.ElementTableId
)
,CTE2 AS
(
-- Filters the level and find the major parant of each child
-- ie, 100->150->200, here the main parent of 200 is 100
SELECT *,CTE1.ElementTableId MajorParentID,CTE1.ElementName MajorParentName
FROM CTE1 WHERE [LEVEL]=1
UNION ALL
SELECT CTE1.*,CTE2.MajorParentID,CTE2.MajorParentName
FROM CTE1
INNER JOIN CTE2 on CTE1.ElementParentId = CTE2.ElementTableId
)
,CTE3 AS
(
-- Since each child have columns for main parent id and name,
-- you will get the count of each element corresponding to the level you have selected directly
SELECT DISTINCT CTE2.MajorParentName,RT.RiskRating ,
COUNT(RiskRating) OVER(PARTITION BY MajorParentID,RiskRating) CNT
FROM CTE2
JOIN #RiskTable RT ON CTE2.ElementTableId=RT.ElementId
)
SELECT MajorParentName, ISNULL([High],0)[High], ISNULL([Medium],0)[Medium],ISNULL([Low],0)[Low]
FROM CTE3
PIVOT(MIN(CNT)
FOR RiskRating IN ([High], [Medium],[Low])) AS PVTTable
SQL FIDDLE

SQL Server : CTE going a level back in the where clause

I would like to set a limit to a recursive query. But I don't want to lose the last entry on which i limited the recursion.
Right now the query looks like this:
WITH MyCTE (EmployeeID, FirstName, LastName, ManagerID, level, SPECIAL)
AS (
SELECT a.EmployeeID, a.FirstName, a.LastName, a.ManagerID, 0 as Level
FROM MyEmployees as a
WHERE ManagerID IS NULL
UNION ALL
SELECT b.EmployeeID, b.FirstName, b.LastName, b.ManagerID, level + 1
FROM MyEmployees as b
INNER JOIN MyCTE ON b.ManagerID = MyCTE.EmployeeID
WHERE b.ManagerID IS NOT NULL and b.LastName != 'Welcker'
)
SELECT *
FROM MyCTE
OPTION (MAXRECURSION 25)
And it gets limited by the Manager Welcker, the result looks like this:
EmployeeID; FirstName; LastName; ManagerID; level
1 Ken Sánchez NULL 0
What I want is to have the employee 'Welcker' included. The problem is I only have the name of the last person I need to have on my list. There are some entries below in the command structure but I don't know them and I don't want to see them.
Here's a example of how I imagine the result of my query to look.
EmployeeID FirstName LastName ManagerID level
1 Ken Sánchez NULL 0
273 Brian Welcker 1 1
Any help would be very much appreciated
WITH MyCTE (EmployeeID, FirstName, LastName, ManagerID, level, NotMatched)
AS (
SELECT a.EmployeeID, a.FirstName, a.LastName, a.ManagerID, 0 as Level, 0 AS NotMatched
FROM MyEmployees as a
WHERE ManagerID IS NULL
UNION ALL
SELECT b.EmployeeID, b.FirstName, b.LastName, b.ManagerID, level + 1,
CASE WHEN (MyCTE.LastName = 'Welcker' AND MyCTE.level = 1) THEN 1 ELSE MyCTE.NotMatched END
FROM MyEmployees as b
INNER JOIN MyCTE ON b.ManagerID = MyCTE.EmployeeID
)
SELECT *
FROM MyCTE
WHERE NotMatched != 1
Can you just change that one line to b.LastName = 'Welcker'?
select distinct
t.b_eid as empId,
t.b_name as name,
case t.b_mid when 0 then null else t.b_mid end as managerId
--convert(varchar,t.b_eid) + ', ' +
--t.b_name + ', ' +
--case t.b_mid when '0' then 'null' else convert(varchar,t.b_mid) end
--as bvals
from(
select
coalesce(a.mid, 0) as a_mid,
a.eid as a_eid,
a.name as a_name,
coalesce(b.mid, 0) as b_mid,
b.eid as b_eid,
b.name as b_name
from
(values(null,1,'adam'),(null,2,'barb'),(201,3,'chris')) as a(mid,eid,name) cross join
(values(null,1,'adam'),(null,2,'barb'),(201,3,'chris')) as b(mid,eid,name)
) as t
where (t.a_mid = t.b_eid or t.a_mid = 0 or t.b_mid = 0) and t.a_name != 'chris'