Recursive view that sums values from a double tree structure in SQL Server

First, sorry for the numerous reposts of my question; I'm new around here and still getting used to asking questions properly and clearly.
I'm working on a recursive view that sums up values from a double tree structure.
I have researched around and found many questions about recursive sums, but none of their solutions seemed to work for my specific issue.
As of now, I have trouble aggregating the values in the right cells. The logic is that I need the sum of each element per year in its parent, and also the sum of all the years for a given element.
Here is a fiddle of my tables and actual script:
SQL Fiddle
And here is a screenshot of the output I'm looking for:
My question is:
How can I get my view to aggregate the value from child to parent in this double tree structure?

If I understand your question correctly, you are trying to get an aggregation at 2 different levels to show in a single result set.
Clarification Scenario:
Below is an over-simplified sample data set for what I believe you are trying to achieve.
create table #agg_table
(
group_one int
, group_two int
, group_val int
)
insert into #agg_table
values (1, 1, 6)
, (1, 1, 7)
, (1, 2, 8)
, (1, 2, 9)
, (2, 3, 10)
, (2, 3, 11)
, (2, 4, 12)
, (2, 4, 13)
Given the sample data above, you want to see the following output:
+-----------+-----------+-----------+
| group_one | group_two | group_val |
+-----------+-----------+-----------+
| 1 | NULL | 30 |
| 1 | 1 | 13 |
| 1 | 2 | 17 |
| 2 | NULL | 46 |
| 2 | 3 | 21 |
| 2 | 4 | 25 |
+-----------+-----------+-----------+
This output can be achieved with the GROUP BY GROUPING SETS syntax in SQL Server (example G. in the link), as shown in the query below:
select a.group_one
, a.group_two
, sum(a.group_val) as group_val
from #agg_table as a
group by grouping sets
(
(
a.group_one
, a.group_two
)
,
(
a.group_one
)
)
order by a.group_one
, a.group_two
For your scenario, I believe this means your recursive CTE is not the issue. The only thing that needs to change is the final select query over the CTE.
Answer:
with Temp (EntityOneId, EntityOneParentId, EntityTwoId, EntityTwoParentId, Year, Value)
as
(
SELECT E1.Id, E1.ParentId, E2.Id, E2.ParentId, VY.Year, VY.Value
FROM ValueYear AS VY
FULL OUTER JOIN EntityOne AS E1
ON VY.EntityOneId = E1.Id
FULL OUTER JOIN EntityTwo AS E2
ON VY.EntityTwoId = E2.Id
),
T (EntityOneId, EntityOneParentId, EntityTwoId, EntityTwoParentId, Year, Value, Levels)
as
(
Select
T1.EntityOneId,
T1.EntityOneParentId,
T1.EntityTwoId,
T1.EntityTwoParentId,
T1.Year,
T1.Value,
0 as Levels
From
Temp
As T1
Where
T1.EntityOneParentId is null
union all
Select
T1.EntityOneId,
T1.EntityOneParentId,
T1.EntityTwoId,
T1.EntityTwoParentId,
T1.Year,
T1.Value,
T.Levels +1
From
Temp
AS T1
join
T
On T.EntityOneId = T1.EntityOneParentId
)
Select
T.EntityOneId,
T.EntityOneParentId,
T.EntityTwoId,
T.EntityTwoParentId,
T.Year,
sum(T.Value) as Value
from T
group by grouping sets
(
(
T.EntityOneId,
T.EntityOneParentId,
T.EntityTwoId,
T.EntityTwoParentId,
T.Year
)
,
(
T.EntityOneId,
T.EntityOneParentId,
T.EntityTwoId,
T.EntityTwoParentId
)
)
order by T.EntityOneID
, T.EntityOneParentID
, T.EntityTwoID
, T.EntityTwoParentID
, T.Year
FYI - I believe the sample data did not have the records necessary to match the expected output completely, but the last 20 records in the SQL Fiddle match the expected output perfectly.
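One thing to watch with grouping sets: a subtotal row shows NULL in the rolled-up column, which looks the same as a genuine NULL in the data. A minimal sketch, assuming the #agg_table sample from the clarification scenario still exists, uses the built-in GROUPING() function to flag subtotal rows. The column alias is_subtotal is only an illustrative name.

```sql
-- Assumes #agg_table from the clarification scenario has already been created and populated.
select a.group_one
     , a.group_two
     , sum(a.group_val)      as group_val
     , grouping(a.group_two) as is_subtotal -- 1 when group_two was rolled up, 0 otherwise
from #agg_table as a
group by grouping sets
(
    (a.group_one, a.group_two)
  , (a.group_one)
)
order by a.group_one
       , grouping(a.group_two) desc -- subtotal row first within each group_one
       , a.group_two;
```

With the sample data, the rows with group_val 30 and 46 come back with is_subtotal = 1, and the four detail rows with 0. The same idea should carry over to the final select of the CTE, for example grouping(T.Year) to mark the all-years rows.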

Related

Get Ids from constant list for which there are no rows in corresponding table

Let's say I have a table Vehicles(Id, Name) with the values below:
1 Car
2 Bike
3 Bus
and a constant list of Ids:
1, 2, 3, 4, 5
I want to write a query returning Ids from above list for which there are no rows in Vehicles table. In the above example it should return:
4, 5
But when I add new row to Vehicles table:
4 Plane
It should return only:
5
And similarly, when from the first version of Vehicle table I remove the third row (3, Bus) my query should return:
3, 4, 5
I tried the EXISTS operator, but it doesn't give me the correct results:
select top v.Id from Vehicle v where Not Exists ( select v2.Id from Vehicle v2 where v.id = v2.id and v2.id in ( 1, 2, 3, 4, 5 ))
You need to treat your "list" as a dataset, and then use the EXISTS:
SELECT V.I
FROM (VALUES(1),(2),(3),(4),(5))V(I) --Presumably this would be a table (type parameter),
--or a delimited string split into rows
WHERE NOT EXISTS (SELECT 1
FROM dbo.YourTable YT
WHERE YT.YourColumn = V.I);
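Applied to the data from the question, the same pattern can be sketched as follows. This is only an illustration: the table variable @Vehicles and the alias L are stand-ins for the real Vehicles table and for however the list actually arrives.

```sql
-- Stand-in for the Vehicles table from the question.
DECLARE @Vehicles TABLE (Id INT PRIMARY KEY, Name VARCHAR(30));
INSERT INTO @Vehicles (Id, Name) VALUES
    (1, 'Car'),
    (2, 'Bike'),
    (3, 'Bus');

-- The constant list is turned into a derived table, then anti-joined with NOT EXISTS.
SELECT L.Id
FROM (VALUES (1), (2), (3), (4), (5)) AS L(Id)
WHERE NOT EXISTS (SELECT 1
                  FROM @Vehicles AS V
                  WHERE V.Id = L.Id)
ORDER BY L.Id;
```

With the three rows above this returns 4 and 5. Adding (4, 'Plane') leaves only 5, and removing (3, 'Bus') from the original rows gives 3, 4 and 5.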
Please try the following solution.
It is using EXCEPT set operator.
Set Operators - EXCEPT and INTERSECT (Transact-SQL)
SQL
-- DDL and sample data population, start
DECLARE @Vehicles TABLE (ID INT PRIMARY KEY, vehicleType VARCHAR(30));
INSERT INTO @Vehicles (ID, vehicleType) VALUES
(1, 'Car'),
(2, 'Bike'),
(3, 'Bus');
-- DDL and sample data population, end
DECLARE @vehicleList VARCHAR(20) = '1, 2, 3, 4, 5'
, @separator CHAR(1) = ',';
SELECT TRIM(value) AS missingID
FROM STRING_SPLIT(@vehicleList, @separator)
EXCEPT
SELECT ID FROM @Vehicles;
Output
+-----------+
| missingID |
+-----------+
| 4 |
| 5 |
+-----------+
In SQL we store our values in tables, so we store your list in a table as well.
It is then simple to work with, and we can easily find the information we want.
I fully agree that other functions can solve the problem, but it is better to design the database so that basic SQL is enough. It will run faster, be easier to maintain, and scale to a table of a million rows without any problems. When we add the 4th mode of transport, we don't have to modify anything else.
CREATE TABLE vehicules(
id int, name varchar(25));
INSERT INTO vehicules VALUES
(1 ,'Car'),
(2 ,'Bike'),
(3 ,'Bus');
CREATE TABLE ids (iid int)
INSERT INTO ids VALUES
(1),(2),(3),(4),(5);
CREATE VIEW unknownIds AS
SELECT iid unknown_id FROM ids
LEFT JOIN vehicules
ON iid = id
WHERE id IS NULL;
SELECT * FROM unknownIds;
| unknown_id |
| ---------: |
| 4 |
| 5 |
INSERT INTO vehicules VALUES (4,'Plane')
SELECT * FROM unknownIds;
| unknown_id |
| ---------: |
| 5 |
db<>fiddle here

SQL get top level object from joins

Working on a query right now where we want to understand which business is referring the most downstream orders for us. I've put together a very basic table for demonstration purposes here with 4 businesses listed. Bar and Donut were both ultimately referred by Foo, and I want to be able to show that Foo, as a business, has generated X number of orders. Obviously, getting the single referral for Foo (from Bar) and Bar (from Donut) are simple joins. But how do you go from Bar to get back to Foo?
I'll add that I've done some more googling this AM and found a few very similar questions about the top level parent and most of the responses suggest recursive CTE. It's been awhile since I've dug deep into SQL stuff, but 8 years ago I know these were not overly popular. Is there another way around this? Perhaps better to just store that parent ID on the order table at the time of order?
+----+--------+--------------------+
| Id | Name | ReferralBusinessId |
+----+--------+--------------------+
| 1 | Foo | |
| 2 | Bar | 1 |
| 3 | Donut | 2 |
| 4 | Coffee | |
+----+--------+--------------------+
WITH RECURSIVE entity_hierarchy AS (
SELECT id, name, parent FROM entities WHERE name = 'Donut'
UNION
SELECT e.id, e.name, e.parent FROM entities e INNER JOIN entity_hierarchy eh on e.id = eh.parent
)
SELECT id, name, parent FROM entity_hierarchy;
SQL Fiddle Example
Assuming you're using SQL Server, you could use a query like the one below to generate a hierarchical Id path for a particular business.
declare @tbl as table (Id int, Name varchar(30), ReferralBusinessId int)
insert into @tbl (id, Name, ReferralBusinessId) values
(1, 'Foo', null),
(2, 'Bar', 1),
(3, 'Donut', 2),
(4, 'Coffee', null);
;WITH business AS (
SELECT Id, Name, ReferralBusinessId
, 0 AS Level
, CAST(Id AS VARCHAR(255)) AS Path
FROM @tbl
UNION ALL
SELECT R.Id, R.Name, R.ReferralBusinessId
, Level + 1
, CAST(Path + '.' + CAST(R.Id AS VARCHAR(255)) AS VARCHAR(255))
FROM @tbl R
INNER JOIN business b ON b.Id = R.ReferralBusinessId
)
SELECT * FROM business ORDER BY Path
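If the goal is only the top-level referrer for each business, a variation is to start the recursion at the businesses that have no referrer and carry their Id down the chain. The following is a sketch under that assumption. The CTE name roots and the columns RootId and RootName are made up for illustration, and the orders table is not shown in the question, so the join to it is left out.

```sql
DECLARE @tbl TABLE (Id INT, Name VARCHAR(30), ReferralBusinessId INT);
INSERT INTO @tbl (Id, Name, ReferralBusinessId) VALUES
    (1, 'Foo', NULL),
    (2, 'Bar', 1),
    (3, 'Donut', 2),
    (4, 'Coffee', NULL);

WITH roots AS (
    -- Anchor: businesses nobody referred are their own root.
    SELECT Id, Name, Id AS RootId, Name AS RootName
    FROM @tbl
    WHERE ReferralBusinessId IS NULL
    UNION ALL
    -- Recursive step: a referred business inherits the root of its referrer.
    SELECT c.Id, c.Name, r.RootId, r.RootName
    FROM @tbl AS c
    INNER JOIN roots AS r ON c.ReferralBusinessId = r.Id
)
SELECT Id, Name, RootId, RootName
FROM roots
ORDER BY Id;
```

For the sample rows, Foo, Bar and Donut all come back with RootId 1 (Foo), and Coffee with RootId 4. Joining this result to an orders table on the business Id and grouping by RootId would then give the order count per top-level referrer.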

How do I trace former ids using a recursive query?

I have a table of provider information (providers) that contains the columns reporting_unit and predesessor. Predesessor is either null or contains the reporting_unit that the row used to represent. I need to find the current reporting_unit for any provider. By that I mean: for any reporting_unit with a predesessor, that reporting_unit is the current_reporting_unit for the predesessor.
I am trying to use a recursive CTE to accomplish this because sometimes there are multiple links.
The table looks like this:
CREATE TABLE providers (
reporting_unit TEXT,
predesessor TEXT
);
INSERT INTO providers
VALUES
(NULL, NULL),
('ARE88', NULL),
('99BX7', '99BX6'),
('99BX6', '99BX5'),
('99BX5', NULL)
;
The results I would like to get from that are:
reporting_unit | current_reporting_unit
---------------------------------------
'99BX5' | '99BX7'
'99BX6' | '99BX7'
My current query is :
WITH RECURSIVE current_ru AS (
SELECT reporting_unit, predesessor
FROM providers
WHERE predesessor IS NULL
UNION ALL
SELECT P.reporting_unit, P.predesessor
FROM providers P
JOIN current_ru CR
ON P.reporting_unit = CR.predesessor
)
SELECT *
FROM current_ru
;
But that isn't giving me the results I'm looking for. I have tried a number of variations on this query but they all seem to end up in an infinite loop. How
You should find the relations in the reverse order. Add a depth column to find the deepest link:
with recursive current_ru (reporting_unit, predesessor, depth) as (
select reporting_unit, predesessor, 1
from providers
where predesessor is not null
union
select r.reporting_unit, p.predesessor, depth+ 1
from providers p
join current_ru r
on p.reporting_unit = r.predesessor
)
select *
from current_ru;
reporting_unit | predesessor | depth
----------------+-------------+-------
99BX7 | 99BX6 | 1
99BX6 | 99BX5 | 1
99BX6 | | 2
99BX7 | 99BX5 | 2
99BX7 | | 3
(5 rows)
Now switch the two columns, change their names, eliminate null rows and select the deepest links:
with recursive current_ru (reporting_unit, predesessor, depth) as (
select reporting_unit, predesessor, 1
from providers
where predesessor is not null
union
select r.reporting_unit, p.predesessor, depth+ 1
from providers p
join current_ru r
on p.reporting_unit = r.predesessor
)
select distinct on(predesessor)
predesessor reporting_unit,
reporting_unit current_reporting_unit
from current_ru
where predesessor is not null
order by predesessor, depth desc;
reporting_unit | current_reporting_unit
----------------+------------------------
99BX5 | 99BX7
99BX6 | 99BX7
(2 rows)

Trying to build an SQL Server query with a specific output

I have two main tables with a joining table. One table has all of the main records; the second table has the categories that the main records are associated with. The joining (link) table has entries with IDs from both the categories and the main records, and it builds the associations (main record ID 2 to category ID 245, as an example).
I am trying to build a query that outputs all of the main records, with all of the categories for each row of the main records, as some rows can have many categories.
What I would love it to do is output them in a delimited fashion so that I can keep it to one row per main record. Right now the best I seem to be able to do is a row for each category that a main item has. Here is an example of what I want (highly simplified):
ID | Name | Category
-------------------------------------
2 | thing | shiny,special,explosive
What I get now is:
ID | Name | Category
-------------------------
2 | thing | shiny
2 | thing | special
2 | thing | explosive
etc.
Here is my current query in its current state. So many columns are selected because the table has many columns and I only need a few of them shown.
SELECT Attractions.ID
, Attractions.HotelName
, Attractions.Enabled
, Attractions.HotelAddress1
, Attractions.HotelAddress2
, Attractions.City
, Attractions.Prov
, Attractions.Country
, Attractions.PostalCode
, Attractions.Latitude
, Attractions.Longitude
, Attractions.Ratings
, Attractions.Phone
, Attractions.Fax
, Attractions.TollFree
, Attractions.Email
, Attractions.Website
, Attractions.ShowInSearch
, Attractions.MoreInfoCounter
, Attractions.ContactPerson
, Attractions.ContactPersonFirst
, Attractions.ContactPersonLast
, Attractions.Notes
, Attractions.SponsorID
, Attraction_Sub_Types.Name
FROM
dbo.Attractions_Attraction_Sub_Types_Link
INNER JOIN dbo.Attractions
ON Attractions_Attraction_Sub_Types_Link.AttractionID = Attractions.ID
INNER JOIN dbo.Attraction_Sub_Types
ON Attractions_Attraction_Sub_Types_Link.Sub_TypeID = Attraction_Sub_Types.ID
WHERE
Attractions.ShowInSearch = 1
ORDER BY
Attractions.ID
I had at first experimented with sub queries but I could never get one to validate or even where to start so I abandoned that.
This is a sample of how to do this:
select 'test' as Test, 1 as Item
into #test
union select 'test2', 2
union select 'test', 3
union select NUll, 4
union select 'test', 5
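-- STUFF(..., 1, 2, '') strips the leading ', ' (comma and space) from the concatenated list
select t2.test, STUFF((SELECT ', ' + cast(t1.Item as varchar (10) )
FROM #test t1 where t2.test = t1.test
FOR XML PATH('')), 1, 2, '') as Items
from #test t2
group by t2.test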
Use the COALESCE function. Try something like:
DECLARE @Category VARCHAR(8000)
SELECT @Category = COALESCE(@Category + ', ', '') + Category
FROM categories_table
WHERE Category IS NOT NULL
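On SQL Server 2017 or later, the built-in STRING_AGG function is another option that avoids both the XML trick and the variable. A minimal sketch follows, using the simplified rows from the question as an inline VALUES list. The alias t and its column names are illustrative only; in the real query they would be replaced by the three joined Attractions tables.

```sql
-- Inline sample rows standing in for the joined Attractions / link / sub-type tables.
SELECT t.ID
     , t.Name
     , STRING_AGG(t.Category, ',') WITHIN GROUP (ORDER BY t.Category) AS Category
FROM (VALUES (2, 'thing', 'shiny')
           , (2, 'thing', 'special')
           , (2, 'thing', 'explosive')) AS t(ID, Name, Category)
GROUP BY t.ID
       , t.Name;
```

This returns a single row for ID 2 with Category = explosive,shiny,special, because the WITHIN GROUP clause sorts the values alphabetically. Every non-aggregated column in the select list has to appear in the GROUP BY, so a query that selects many Attractions columns needs all of them listed there.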

Find duplicate groups of rows in SQL Server

I have a table with materials information where one material has from one to many constituents.
The table looks like this:
material_id constituent_id constituent_wt_pct
1 1 10.5
1 2 89.5
2 1 10.5
2 5 15.5
2 7 74
3 1 10.5
3 2 89.5
Generally, I can have different material ID's with the same constituents (both ID's and weight percent), but also the same constituent id with the same weight percent can be in multiple materials.
I need to find the material ID's that have exactly the same amount of constituents, same constituents id's and same weight percent (in the example of data that will be material ID 1 and 3)
What would be great is to have the output like:
ID Duplicate ID's
1 1,3
2 15,25
....
Just to clarify the question: I have several thousands of materials and it won't help me if I get just the id's of duplicate rows - I would like to see if it is possible to get the groups of duplicate material id's in the same row or field.
Build an XML string in a CTE that contains all constituents of a material, and use that string to figure out which materials are duplicates.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table Materials
(
material_id int,
constituent_id int,
constituent_wt_pct decimal(10, 2)
);
insert into Materials values
(1, 1, 10.5),
(1, 2, 89.5),
(2, 1, 10.5),
(2, 5, 15.5),
(2, 7, 74),
(3, 1, 10.5),
(3, 2, 89.5);
Query 1:
with C as
(
select M1.material_id,
(
select M2.constituent_id as I,
M2.constituent_wt_pct as P
from Materials as M2
where M1.material_id = M2.material_id
order by M2.constituent_id,
M2.material_id
for xml path('')
) as constituents
from Materials as M1
group by M1.material_id
)
select row_number() over(order by 1/0) as ID,
stuff((
select ','+cast(C2.material_id as varchar(10))
from C as C2
where C1.constituents = C2.constituents
for xml path('')
), 1, 1, '') as MaterialIDs
from C as C1
group by C1.constituents
having count(*) > 1
Results:
| ID | MATERIALIDS |
--------------------
| 1 | 1,3 |
Well you can use the following code to get the duplicate value,
Select EMP_NAME as NameT,count(EMP_NAME) as DuplicateValCount From dbo.Emp_test
group by Emp_name having count(EMP_NAME) > 1