Combining related organisation records in SQL where there is a Parent-Child relationship between organisations

I am trying to build a table of data for use in Yellowfin BI reporting. One limitation of this is that no temporary tables can be created and then dropped in the database. I am pulling the data from an existing database, which I have no control over. I can only use SQL to query the existing data.
There are two tables in the source database I need to work with. I've simplified them for clarity. The first contains organisations. It has an ORG_ID column which contains a unique ID for each organisation and a PARENT_ORG_ID column indicating which organisation is the parent company of others in the list:
ORG_ID PARENT_ORG_ID
1 Null
2 1
3 5
4 5
5 Null
6 1
Using the table above I can see that there are the following relationships between organisations:
ORG_ID RELATED_ORGANISATIONS
1 2 and 6
2 1 and 6
3 5 and 4
4 5 and 3
5 4 and 3
6 1 and 2
I'm not sure of the best way to represent these connections in a query, as I need to use these relationships with a second table.
The second table I have is a list of organisations and money owed:
ORG_ID MONEY_OWED
1 5
2 10
3 0
4 15
5 20
6 5
What I need to achieve is a table that I can search for any single ORG_ID, and see the combined data for that organisation and all related organisations. In the case of my example, this could be a results table something like this:
ORG_ID MONEY_OWED_BY_ALL_RELATED_ORGS
1 20
2 20
3 35
4 35
5 35
6 20
I'm thinking I should use a CTE to handle the relationships between organisations but I can't get my head around it.
Any help would be greatly appreciated!

For your particular example, you can use:
select o.*,
sum(mo.money_owed) over (partition by coalesce(o.parent_org_id, o.org_id)) as parent_owed
from organizations o left join
money_owed mo
on mo.org_id = o.org_id;
This works because your organizations are only one level deep -- which is consistent with your sample data.
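To try the query above against the sample data, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption; the asker's actual engine is not stated, and SQLite needs version 3.25 or later for window functions). The second query is a hypothetical variant, not part of the answer: it uses a recursive CTE to find each organisation's top-level ancestor, which should also cope with hierarchies more than one level deep.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (org_id INTEGER PRIMARY KEY, parent_org_id INTEGER);
    CREATE TABLE money_owed (org_id INTEGER, money_owed INTEGER);
    INSERT INTO organizations VALUES
        (1, NULL), (2, 1), (3, 5), (4, 5), (5, NULL), (6, 1);
    INSERT INTO money_owed VALUES
        (1, 5), (2, 10), (3, 0), (4, 15), (5, 20), (6, 5);
""")

# The answer's query: a parent and its direct children share one partition key.
one_level = conn.execute("""
    SELECT o.org_id,
           SUM(mo.money_owed) OVER (
               PARTITION BY COALESCE(o.parent_org_id, o.org_id)
           ) AS parent_owed
    FROM organizations o
    LEFT JOIN money_owed mo ON mo.org_id = o.org_id
    ORDER BY o.org_id
""").fetchall()

# Hypothetical variant: walk down from the top-level organisations so that
# every row carries the id of its root, however deep it sits.
any_depth = conn.execute("""
    WITH RECURSIVE roots(org_id, root_id) AS (
        SELECT org_id, org_id
        FROM organizations
        WHERE parent_org_id IS NULL
        UNION ALL
        SELECT o.org_id, r.root_id
        FROM organizations o
        JOIN roots r ON o.parent_org_id = r.org_id
    )
    SELECT r.org_id,
           SUM(mo.money_owed) OVER (PARTITION BY r.root_id) AS parent_owed
    FROM roots r
    LEFT JOIN money_owed mo ON mo.org_id = r.org_id
    ORDER BY r.org_id
""").fetchall()

print(one_level)
print(any_depth)
```

On the sample data both queries give 20 for organisations 1, 2 and 6, and 35 for organisations 3, 4 and 5, matching the table the question asks for.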

Related

Querying duplicates table into related sets

We have a process that creates a table of duplicate records based on some arbitrary rules (details not relevant).
Every record gets checked against all other records and if a suspected duplicate is found both it and the duplicate are stored in a dupes table to be manually reviewed.
This results in a table something like this:
dupId, originalId, duplicateId
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
7 5 6
8 5 7
9 6 7
10 8 9
You can see here record #1 has 3 other records it is similar to (#2,#3 and #4) and they are each similar to each other.
Record #5 has 2 duplicates (#6 and #7) and record #8 has only 1 (#9).
I want to query the duplicates into sets, so my results would look something like this:
setId recordId
1 1
1 2
1 3
1 4
2 5
2 6
2 7
3 8
3 9
But I am too old/slow/tired/rubbish and a bit out of my depth here.
Currently, when checking for duplicates if the record pairing is already in the table we don't insert it twice (i.e. you don't see both sides of the duplicate pairing) but can easily do so if it makes the querying simpler.
Any advice much appreciated!
Duplicates seem to be transitive, so you have all pairs. That is, the "original" id of the first record in each set already identifies the set.
But that record is not itself listed among the duplicates, and you want it in the result. So:
select dense_rank() over (order by originalid) as setid, duplicateid
from ((select originalid, duplicateid
from t
where not exists (select 1 from t t2 where t.originalid = t2.duplicateid)
) union all
(select distinct originalid, originalid
from t
where not exists (select 1 from t t2 where t.originalid = t2.duplicateid)
)
) i
order by setid;
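A quick way to check this against the sample dupes table is the sketch below, again using Python's sqlite3 as a stand-in engine (an assumption; window functions need SQLite 3.25+). The inner parentheses around each half of the UNION ALL are dropped because SQLite does not accept them, and the recordId alias and the secondary sort key are additions for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (dupId INTEGER, originalId INTEGER, duplicateId INTEGER);
    INSERT INTO t VALUES
        (1, 1, 2), (2, 1, 3), (3, 1, 4), (4, 2, 3), (5, 2, 4),
        (6, 3, 4), (7, 5, 6), (8, 5, 7), (9, 6, 7), (10, 8, 9);
""")

# Keep only pairs whose originalId never appears as someone's duplicate
# (the "root" of each set), then add the root itself as a member.
sets = conn.execute("""
    SELECT DENSE_RANK() OVER (ORDER BY originalId) AS setId,
           duplicateId AS recordId
    FROM (
        SELECT originalId, duplicateId
        FROM t
        WHERE NOT EXISTS (SELECT 1 FROM t t2 WHERE t.originalId = t2.duplicateId)
        UNION ALL
        SELECT DISTINCT originalId, originalId
        FROM t
        WHERE NOT EXISTS (SELECT 1 FROM t t2 WHERE t.originalId = t2.duplicateId)
    ) i
    ORDER BY setId, recordId
""").fetchall()

for set_id, record_id in sets:
    print(set_id, record_id)
```

Note that this relies on every pair within a set being present in the table, as in the sample; with missing pairs a set could split or lose members.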

SQL Calculations With Multi-Group Affiliations

I'm attempting to have a function or view that is able to calculate and roll up various counts while being able to search on a many to many affiliation.
Here is an example data set:
Invoice Table:
InvoiceID LocationID StatusID
1 5 1
2 5 1
3 5 1
4 5 2
5 7 2
5 7 1
5 7 2
Group Table:
GroupID GroupName
1 Group 1
2 Group 2
GroupToLocation Table:
GroupToLocationID GroupID LocationID
1 1 5
2 2 5
3 2 7
I have gotten to the point where I could sum up the various statuses per location and get this:
LocationID Status1 Status2
5 3 1
7 1 2
Location 5 has 3 Invoices with a status of 1, and 1 invoice with a status of 2 while Location 7 has 1 status 1 and 2 status 2
There are two groups, and Location 5 is in both, while Location 7 is only in the second. I need to be able to set it up where I can append a where statement like this:
select * from vw_GroupCounts
where GroupName = 'Group 2'
or
select Invoice, SUM(*) from vw_GroupCounts
where GroupName = 'Group 2'
And have that result in only getting Location 7. Whenever I do this, as I have to use left joins or something along those lines, the counts are duplicated for each group that the Location is affiliated with. I know I could do something along the lines of a subquery and pass the GroupName into that, but the system I am working with uses a dynamic query builder that appends WHERE statements based on user input.
I don't mind using view, or functions, or any number of functions inside of functions, but I hope there is a way to do what I'm looking for.
Since locations 5 and 7 are both in Group 2, filtering on Group 2 in the where clause after joining all the tables returns the records for both locations. This isn't duplication, it is just the way the data is. A different join wouldn't change this; only changing the data would. Let me know if I am misunderstanding something though.
Here is how you would join them to do that search.
Here it is with your first example of the location and status count.
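A minimal sketch of one such join, assuming the three tables from the question (with the group table named Groups here, since GROUP is a reserved word) and using Python's sqlite3 as a stand-in engine. The view name vw_GroupCounts comes from the question; its column list is a guess. Keeping GroupName as a view column lets a query builder append its WHERE clause afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoice (InvoiceID INTEGER, LocationID INTEGER, StatusID INTEGER);
    CREATE TABLE Groups (GroupID INTEGER, GroupName TEXT);
    CREATE TABLE GroupToLocation (GroupToLocationID INTEGER, GroupID INTEGER, LocationID INTEGER);
    INSERT INTO Invoice VALUES
        (1, 5, 1), (2, 5, 1), (3, 5, 1), (4, 5, 2), (5, 7, 2), (5, 7, 1), (5, 7, 2);
    INSERT INTO Groups VALUES (1, 'Group 1'), (2, 'Group 2');
    INSERT INTO GroupToLocation VALUES (1, 1, 5), (2, 2, 5), (3, 2, 7);

    -- One row per (group, location) with conditional counts per status.
    CREATE VIEW vw_GroupCounts AS
    SELECT g.GroupName,
           gl.LocationID,
           SUM(CASE WHEN i.StatusID = 1 THEN 1 ELSE 0 END) AS Status1,
           SUM(CASE WHEN i.StatusID = 2 THEN 1 ELSE 0 END) AS Status2
    FROM Groups g
    JOIN GroupToLocation gl ON gl.GroupID = g.GroupID
    JOIN Invoice i ON i.LocationID = gl.LocationID
    GROUP BY g.GroupName, gl.LocationID;
""")

# The filter is appended from outside, as a dynamic query builder would do.
group2 = conn.execute("""
    SELECT GroupName, LocationID, Status1, Status2
    FROM vw_GroupCounts
    WHERE GroupName = 'Group 2'
    ORDER BY LocationID
""").fetchall()

print(group2)
```

Filtering on 'Group 2' returns one row for Location 5 and one for Location 7, each counted once. Location 5 would only be counted twice if the query were run without a group filter and then summed across groups.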

T-SQL - Getting a list of records depending on other related records values

I'm trying to make a query and need a little help (SQL Server).
Imagine the following scenario: user is viewing a web page which has several related categories. According to some rules, the page should not be displayed if a specific category has been put together with another category.
For this I've got 2 tables:
1) Has the page Id and the related categories:
Pk CategoryNumber
--------------------
1 30
1 31
1 45
2 30
3 21
3 26
3 64
4 25
4 12
5 25
5 31
5 30
5 45
2) Rules table. First row means: when viewing a page with the category 30 it should not be retrieved if it also has the 45 category.
WhenViewingCategoryNumber HideEverythingWithCategoryNumber
-------------------------------------------------------
30 45
25 31
Output expected:
2
3
4
I've spent a few hours on this and I'm not getting anywhere, so I would appreciate it if someone could help. If possible, an answer with a single SELECT statement would be better, so that I can integrate it directly into a larger CTE statement. Many thanks.
You can use the following query to identify those page ids related to conflicting categories:
SELECT DISTINCT c1.PageId
FROM Categories AS c1
INNER JOIN Rules AS r ON c1.ItemNumber = r.WhenViewingCategoryNumber
INNER JOIN Categories AS c2 ON c1.PageId = c2.PageId
AND r.HideEverythingWithCategoryNumber = c2.ItemNumber
This will return:
PageId
------
1
5
Now you can get expected result by simply using NOT IN:
SELECT DISTINCT PageId
FROM Categories
WHERE PageId NOT IN ( ... above query here ....)
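Putting the two steps together, here is a small runnable sketch using Python's sqlite3 as a stand-in for SQL Server (an assumption). It follows the answer's table and column names (Categories, PageId, ItemNumber), which differ from the Pk and CategoryNumber headings shown in the question; the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (PageId INTEGER, ItemNumber INTEGER);
    CREATE TABLE Rules (WhenViewingCategoryNumber INTEGER,
                        HideEverythingWithCategoryNumber INTEGER);
    INSERT INTO Categories VALUES
        (1, 30), (1, 31), (1, 45), (2, 30), (3, 21), (3, 26), (3, 64),
        (4, 25), (4, 12), (5, 25), (5, 31), (5, 30), (5, 45);
    INSERT INTO Rules VALUES (30, 45), (25, 31);
""")

# Pages that hold both sides of at least one rule are excluded via NOT IN.
visible = conn.execute("""
    SELECT DISTINCT PageId
    FROM Categories
    WHERE PageId NOT IN (
        SELECT c1.PageId
        FROM Categories AS c1
        INNER JOIN Rules AS r
            ON c1.ItemNumber = r.WhenViewingCategoryNumber
        INNER JOIN Categories AS c2
            ON c1.PageId = c2.PageId
           AND r.HideEverythingWithCategoryNumber = c2.ItemNumber
    )
    ORDER BY PageId
""").fetchall()

print([page_id for (page_id,) in visible])
```

PageId has no NULLs in this data, so NOT IN is safe here; if NULLs were possible, NOT EXISTS would be the more cautious choice.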

SQL 3 table Join While taking all values from 1 table but only filled from other 2

I have three tables: the first has a list of category IDs, the second has dataset information, and the third has import information.
What I have
select dataset.pc_id, count(*)
from import
join dataset on CAST(dataset.internal_id as varchar(20)) = import.product_id
group by dataset.pc_id
order by pc_id asc
This will output:
3 4
4 5
6 200
7 192
8 1000
Where product_category comes into play is this: I want the output to look like:
1 0
2 0
3 4
4 5
6 200
...
16 0
The 16 is the number of distinct product categories in the product_category table, which I currently cannot figure out how to fit into that statement.
How do I get all the ids from product_category into this list, with the joined information filling in the counts?
Figured it out: I needed to stop selecting dataset.pc_id, select product_category.id instead, and then right join product_category.
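A minimal sketch of that idea, run through Python's sqlite3 as a stand-in engine (an assumption). The rows below are made up for illustration, and the query is written as a LEFT JOIN starting from product_category, which is equivalent to right joining it last and also works on engines without RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows; only the table and column names come from the question.
    CREATE TABLE product_category (id INTEGER PRIMARY KEY);
    CREATE TABLE dataset (internal_id INTEGER, pc_id INTEGER);
    CREATE TABLE import (product_id VARCHAR(20));
    INSERT INTO product_category VALUES (1), (2), (3), (4);
    INSERT INTO dataset VALUES (101, 3), (102, 3), (103, 4);
    INSERT INTO import VALUES ('101'), ('101'), ('102'), ('103');
""")

# Start from product_category so every category appears, then count only
# the rows that actually matched an import.
counts = conn.execute("""
    SELECT pc.id, COUNT(i.product_id) AS import_count
    FROM product_category pc
    LEFT JOIN dataset d ON d.pc_id = pc.id
    LEFT JOIN import i ON i.product_id = CAST(d.internal_id AS VARCHAR(20))
    GROUP BY pc.id
    ORDER BY pc.id
""").fetchall()

print(counts)
```

Counting a column from the import side matters: COUNT(*) would report 1 for a category with no imports, because the outer join still produces one row for it.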

SortBy for one-to-many relationship

I have two tables which are linked by a one-to-many relationship. Now I need to sort the rows based on the key which I have in my first table.
For example:
TeacherID StudentID
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
2 9
2 10
2 11
2 12
If I sort the rows by TeacherID, the order of the student IDs changes on each and every execution.
In the result set the StudentIDs come back in a random order. I need their order to stay the same.
Is there anything I can do to solve this problem without sorting by the StudentID column?
You can sort on both.
ORDER BY
TeacherID, StudentID
Selecting data from any database engine without specifying an ORDER BY doesn't guarantee any order at all.
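As a small illustration, the sketch below loads the rows in a deliberately scrambled order and sorts on both columns. It uses Python's sqlite3 as a stand-in engine, and the table name teacher_student is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher_student (TeacherID INTEGER, StudentID INTEGER)")

# Insert the sample pairs out of order to mimic an arbitrary physical order.
pairs = [(2, 11), (1, 4), (1, 8), (2, 9), (1, 1), (1, 6),
         (2, 12), (1, 3), (1, 7), (2, 10), (1, 2), (1, 5)]
conn.executemany("INSERT INTO teacher_student VALUES (?, ?)", pairs)

# With both columns in ORDER BY, ties on TeacherID are broken by StudentID,
# so the row order is fully determined.
ordered = conn.execute("""
    SELECT TeacherID, StudentID
    FROM teacher_student
    ORDER BY TeacherID, StudentID
""").fetchall()

print(ordered)
```

Sorting on TeacherID alone leaves the order of rows within each teacher up to the engine, which is why it can differ between runs.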