Nested sum loop until foreign key 'dies out' - sql

I am pulling my hair out over a data retrieval query I'm trying to write. In essence, the query should count the voorwerpnummers in the Voorwerp_in_Rubriek table, grouped by the rubrieknummer taken from Rubriek.
After that I want to keep rolling those counts up until they reach their 'top-level parent'. Rubriek has a foreign key to itself, 'hoofdrubriek', which is best read as the row's parent in a category tree.
This also means categories can be nested. A NULL in the hoofdrubriek column means the row is a top-level parent. The idea is to count the voorwerpnummers in Voorwerp_in_rubriek and add the counts together all the way up to the top-level parent.
As the database and test data are quite large, I've decided not to paste the code into this question but to link to a dbfiddle instead, so there's more structure.
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=8068a52da6a29afffe6dc793398f0998
I got it working in some degree using this query:
SELECT R2.hoofdrubriek ,
COUNT(Vr.rubrieknummer) AS aantal
FROM Rubriek R1
RIGHT OUTER JOIN Rubriek R2 ON R1.rubrieknummer = R2.hoofdrubriek
INNER JOIN Voorwerp_in_rubriek Vr ON R2.rubrieknummer = Vr.rubrieknummer
WHERE NOT EXISTS ( SELECT *
FROM Rubriek
WHERE hoofdrubriek = R2.rubrieknummer )
AND R1.hoofdrubriek IS NOT NULL
GROUP BY Vr.rubrieknummer ,
R2.hoofdrubriek
But that doesn't return all items and falls short in general. I hope someone can help me.

If I got it right:
-- Count the items that sit directly in each rubriek.
declare @t table (
    rubrieknummer int,
    cnt int);
INSERT @t(rubrieknummer, cnt)
SELECT R.rubrieknummer, COUNT(Vr.voorwerpnummer)
FROM Rubriek R
INNER JOIN voorwerp_in_rubriek Vr ON R.rubrieknummer = Vr.rubrieknummer
GROUP BY Vr.rubrieknummer, R.rubrieknummer;
--select * from @t;
-- Recursively pass every count up to the parent (hoofdrubriek) until the parent is NULL.
with t as(
    select rubrieknummer, cnt
    from @t
    union all
    select r.hoofdrubriek, cnt
    from t
    join Rubriek r on t.rubrieknummer = r.rubrieknummer
)
select rubrieknummer, sum(cnt) cnt
from t
group by rubrieknummer;
Applied to your fiddle data, this returns:
rubrieknummer cnt
<null> 42
100 42
101 26
102 6
103 10
10000 8
10100 4
10101 1
10102 3
10500 4
10501 2
10502 2
15000 18
15100 6
15101 2
15102 2
15103 2
15500 12
15501 4
15502 3
15503 5
20000 6
20001 2
20002 1
20003 1
20004 2
25000 4
25001 1
25002 1
25003 1
25004 1
30001 2
30002 1
30004 3
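For anyone who wants to try the roll-up idea without the fiddle, here is a minimal sketch using Python's built-in sqlite3 module (so SQLite syntax, not SQL Server). The four-category tree and the five items are made-up toy data, not the fiddle data. Unlike the query above, this variant filters out the NULL parent, so no `<null>` row appears in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rubriek (rubrieknummer INTEGER PRIMARY KEY, hoofdrubriek INTEGER);
CREATE TABLE Voorwerp_in_rubriek (voorwerpnummer INTEGER, rubrieknummer INTEGER);
-- Hypothetical toy tree: 1 is the root, 10 and 11 are its children, 100 is a child of 10.
INSERT INTO Rubriek VALUES (1, NULL), (10, 1), (11, 1), (100, 10);
-- Three items sit in 100, two in 11, none directly in 1 or 10.
INSERT INTO Voorwerp_in_rubriek VALUES (1, 100), (2, 100), (3, 100), (4, 11), (5, 11);
""")

rollup_sql = """
WITH RECURSIVE t(rubrieknummer, cnt) AS (
    -- Anchor: direct item count per category.
    SELECT rubrieknummer, COUNT(voorwerpnummer)
    FROM Voorwerp_in_rubriek
    GROUP BY rubrieknummer
    UNION ALL
    -- Recursive step: hand each count to the parent category.
    SELECT r.hoofdrubriek, t.cnt
    FROM t
    JOIN Rubriek r ON r.rubrieknummer = t.rubrieknummer
    WHERE r.hoofdrubriek IS NOT NULL
)
SELECT rubrieknummer, SUM(cnt)
FROM t
GROUP BY rubrieknummer
ORDER BY rubrieknummer
"""

# Maps each category to its own items plus those of all descendants.
totals = dict(conn.execute(rollup_sql).fetchall())
print(totals)
```

The root (1) ends up with all five items, and category 10 inherits the three items of its child 100.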

Related

How to get these rows as columns in an SQL query

I need some help in writing up this SQL query using a single table. Something like this
User ID  Category  Spend  Transactions  Country
-----------------------------------------------
1        Sport     30     2             USA
1        Bills     60     3             USA
2        Sport     10     1             MEX
3        Grocery   50     8             CAN
2        Grocery   70     4             MEX
3        Sport     20     5             CAN
3        Bills     30     2             CAN
1        Petrol    60     5             USA
I then want to group the rows by User ID, with one spend column and one transactions column per category, and the country as a column of its own, like this:
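User ID  Sport_Spend  Bills_Spend  Grocery_Spend  Petrol_Spend  Sport_Transactions  Bills_Transactions  Grocery_Transactions  Petrol_Transactions  Country
-----------------------------------------------------------------------------------------------------------------------------------------------------------
1        30           60           0              60            2                   3                   0                     5                    USA
2        10           0            70             0             1                   0                   4                     0                    MEX
3        20           30           50             0             5                   2                   8                     0                    CAN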
It's stumping me a bit; I would appreciate some help.
@jarlh's comments are most relevant and need to be addressed. But here is something to start with (MS SQL code; I left out the transactions columns to reduce the problem, but the coding is just the same): https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=25550539029ba1c4be0826725bf9e00a
with data (UserID,Category,Spend,Transactions,Country) as(
select 1,'Sport',30,2,'USA' union all
select 1,'Bills',60,3,'USA' union all
select 2,'Sport',10,1,'MEX' union all
select 3,'Grocery',50,8,'CAN' union all
select 2,'Grocery',70,4,'MEX' union all
select 3,'Sport',20,5,'CAN' union all
select 3,'Bills',30,2,'CAN' union all
select 1,'Petrol',60,5,'USA'
)
select UserID
,isnull(SUM([Sport]),0)as Sport
,isnull(SUM([Bills]),0)as Bills
,isnull(SUM([Grocery]),0)as Grocery
,isnull(SUM([Petrol]),0)as Petrol
,MAX(Country)as Country
from (
select UserID,Category,Spend,Transactions,Country
from data) p
PIVOT(
SUM(SPEND)
For CATEGORY in ([Sport] ,[Bills] ,[Grocery] ,[Petrol])
)as PivotTable
group by UserID
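Alternatively, use one LEFT JOIN per category (the USING clause is not available in SQL Server, so this suits engines such as MySQL or PostgreSQL):
select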
COALESCE(user_id,0) as user_id,
COALESCE(Sport_Spend,0) as Sport_Spend,
COALESCE(Bills_Spend,0) as Bills_Spend,
COALESCE(Grocery_Spend,0) as Grocery_Spend,
COALESCE(Petrol_Spend,0) as Petrol_Spend,
COALESCE(Sport_Transactions,0) as Sport_Transactions,
COALESCE(Bills_Transactions,0) as Bills_Transactions,
COALESCE(Grocery_Transactions,0) as Grocery_Transactions,
COALESCE(Petrol_Transactions,0) as Petrol_Transactions
,country from
(SELECT DISTINCT user_id,country from table_name) as A
LEFT JOIN
(select user_id, spend as Sport_Spend ,transactions as Sport_Transactions from table_name where category='Sport') as B using (user_id)
LEFT JOIN
(select user_id, spend as Bills_Spend ,transactions as Bills_Transactions from table_name where category='Bills') as C using (user_id)
LEFT JOIN
(select user_id, spend as Grocery_Spend ,transactions as Grocery_Transactions from table_name where category='Grocery') as D using (user_id)
LEFT JOIN
(select user_id, spend as Petrol_Spend ,transactions as Petrol_Transactions from table_name where category='Petrol') as E using (user_id)
ORDER BY user_id;
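A third option, not used in either answer above, is plain conditional aggregation: one SUM(CASE ...) per output column. Here is a rough sketch run through Python's built-in sqlite3 module (SQLite, not SQL Server) on the sample rows from the question. The table name `spend` is a placeholder, and the category list is hard-coded, so a new category would need new columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spend (user_id INT, category TEXT, spend INT, transactions INT, country TEXT)"
)
# Sample rows from the question.
conn.executemany("INSERT INTO spend VALUES (?, ?, ?, ?, ?)", [
    (1, "Sport", 30, 2, "USA"), (1, "Bills", 60, 3, "USA"),
    (2, "Sport", 10, 1, "MEX"), (3, "Grocery", 50, 8, "CAN"),
    (2, "Grocery", 70, 4, "MEX"), (3, "Sport", 20, 5, "CAN"),
    (3, "Bills", 30, 2, "CAN"), (1, "Petrol", 60, 5, "USA"),
])

# Each CASE picks the value for one category and contributes 0 otherwise.
pivot_sql = """
SELECT user_id,
       SUM(CASE WHEN category = 'Sport'   THEN spend ELSE 0 END) AS Sport_Spend,
       SUM(CASE WHEN category = 'Bills'   THEN spend ELSE 0 END) AS Bills_Spend,
       SUM(CASE WHEN category = 'Grocery' THEN spend ELSE 0 END) AS Grocery_Spend,
       SUM(CASE WHEN category = 'Petrol'  THEN spend ELSE 0 END) AS Petrol_Spend,
       SUM(CASE WHEN category = 'Sport'   THEN transactions ELSE 0 END) AS Sport_Transactions,
       SUM(CASE WHEN category = 'Bills'   THEN transactions ELSE 0 END) AS Bills_Transactions,
       SUM(CASE WHEN category = 'Grocery' THEN transactions ELSE 0 END) AS Grocery_Transactions,
       SUM(CASE WHEN category = 'Petrol'  THEN transactions ELSE 0 END) AS Petrol_Transactions,
       MAX(country) AS country
FROM spend
GROUP BY user_id
ORDER BY user_id
"""

rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

The same SELECT should carry over to SQL Server largely unchanged, since it only relies on CASE, SUM and GROUP BY. Like the PIVOT answer, it takes MAX(country), which assumes one country per user.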

Finding duplicate **Set of Values** in multiple rows in SQL Server table

I have a SQL Server database table with this sample data:
ProductID GenericID MG
---------------------------------
1 1 2g
1 2 5g
2 2 5g
3 1 2g
3 2 5g
4 1 2g
5 1 2g
5 3 7g
6 2 5g
7 1 2g
8 1 2g
I want a query that does the following.
If I select ProductID=1, the query should first check which GenericID values are associated with ProductID=1.
With the data above, that means GenericID=1 and 2.
Then I want to go through all rows and select every ProductID whose GenericID values are exactly 1 and 2, nothing more and nothing less.
In this case the output has four rows, because only ProductID 3 has the same GenericID values as ProductID 1.
In other words, if I select ProductID=1, I want all rows of every product with exactly the same set of GenericID values as ProductID=1, which is the set { 1, 2 } in my sample data. I am struggling with the query logic.
For example, if I select ProductID=1, this is the output I want:
ProductID GenericID MG
-------------------------------
1 1 2g
1 2 5g
3 1 2g
3 2 5g
GenericID can be one or multiple values, and the set is dynamic.
Another example: if I select ProductID=7, only products whose sole GenericID is 1 are returned, because ProductID=7 has only GenericID=1. Any product that has GenericID=1 together with other GenericID values is left out. This is the output I want:
ProductID GenericID MG
------------------------------
7 1 2g
8 1 2g
4 1 2g
I need a query that produces this output: all of the products that have the same set of GenericIDs as the selected product.
The simplest method is probably to use string_agg() (available from SQL Server 2017):
with t as (
select productID, string_agg(genericId, ',') within group (order by genericId) as genericIds
from sample
group by productID
)
select s.*
from t join
t t2
on t.genericIds = t2.genericIds and t2.productId = 1 join
sample s
on s.productId = t.productId;
Gordon, thanks a lot for your prompt response. I forgot to mention that I am using SQL Server 2014, which is why string_agg() didn't work for me, but I really appreciate your help; it made my day. I built the query below with the help of another query of yours, which was a very helpful resource.
select PG.PID2 as Alternatives
from (select d1.ProductID as PID1, d2.ProductID as PID2
from (select distinct ProductID from ProductsGenerics Where ProductID=@PID) d1 cross join
(select distinct ProductID from ProductsGenerics) d2
) PG left outer join
ProductsGenerics e1
on e1.ProductID = PG.PID1 full outer join
ProductsGenerics e2
on PG.PID2 = e2.ProductID and e1.genericid = e2.GenericID-- and e1.MG = e2.MG
group by PG.PID1, PG.PID2
having SUM(case when e1.GenericID is null then 1 else 0 end) = 0 and
SUM(case when e2.GenericID is null then 1 else 0 end) = 0
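For comparison, here is a count-based sketch of the same "exact same set" test that needs neither string_agg() nor a full outer join. It is run through Python's built-in sqlite3 module (SQLite, not SQL Server) on the sample data from the question, and it assumes each (ProductID, GenericID) pair occurs at most once. The helper name `products_with_same_generics` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductsGenerics (ProductID INT, GenericID INT, MG TEXT)")
# Sample rows from the question.
conn.executemany("INSERT INTO ProductsGenerics VALUES (?, ?, ?)", [
    (1, 1, "2g"), (1, 2, "5g"), (2, 2, "5g"), (3, 1, "2g"), (3, 2, "5g"),
    (4, 1, "2g"), (5, 1, "2g"), (5, 3, "7g"), (6, 2, "5g"), (7, 1, "2g"),
    (8, 1, "2g"),
])

# A product qualifies when every one of its generics also belongs to the
# target product (COUNT(*) = COUNT(t.GenericID)) and it has as many generics
# as the target (so nothing from the target's set is missing).
same_set_sql = """
SELECT p.ProductID
FROM ProductsGenerics p
LEFT JOIN ProductsGenerics t
       ON t.ProductID = :pid AND t.GenericID = p.GenericID
GROUP BY p.ProductID
HAVING COUNT(*) = COUNT(t.GenericID)
   AND COUNT(*) = (SELECT COUNT(*) FROM ProductsGenerics WHERE ProductID = :pid)
ORDER BY p.ProductID
"""

def products_with_same_generics(pid):
    """Return the ProductIDs whose GenericID set equals that of `pid`."""
    return [row[0] for row in conn.execute(same_set_sql, {"pid": pid})]

print(products_with_same_generics(1))
print(products_with_same_generics(7))
```

To get the full rows rather than just the ProductIDs, the result can be joined back to ProductsGenerics.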

SQL join query for a view with sum of columns across 3 tables

I have 3 tables as below
Table - travel_requests
id industry_id travel_cost stay_cost other_cost
1 2 1000 500 200
2 4 4000 100 200
3 5 3000 0 400
4 1 3000 250 100
5 1 200 100 75
Table - industry_tech_region
id industry_name
1 Auto
2 Aero
3 Machinery
4 Education
5 MTV
Table - industry_allocation
id industry_id allocation
1 1 500000
2 2 300000
3 3 500000
4 4 300000
5 5 500000
6 1 200000
I want to create a view which has 3 columns
industry_name, total_costs, total_allocation
I created a view as below
SELECT industry_tech_region.industry_name,
SUM(travel_requests.travel_cost + travel_requests.stay_cost + travel_requests.other_cost) AS total_cost,
SUM(industry_allocation.allocation) AS total_allocation
FROM industry_tech_region
INNER JOIN industry_allocation
ON industry_tech_region.id = industry_allocation.industry_id
INNER JOIN travel_requests
ON industry_tech_region.id = travel_requests.industry_id
GROUP BY industry_tech_region.industry_name
But the result I get is as below which is incorrect
industry_name total_cost total_allocation
Aero 1700 300000
Auto 7450 1400000 (wrong should be 3725 and 700000)
Education 4300 300000
MTV 3400 500000
This is probably happening because there are 2 entries for industry_id 1 in the travel_requests table. But they should be counted only once.
Please let me know how to correct the view statement.
I also want to add another column to the view, remaining_allocation, which is the difference between total_allocation and total_cost for each industry.
You should join the sums (and not sum the join):
select
a.industry_name
, t1.total_cost
, t2.total_allocation
from dbo.industry_tech_region a
left join (
select dbo.travel_requests.industry_id
, SUM(dbo.travel_requests.travel_cost + dbo.travel_requests.stay_cost + dbo.travel_requests.other_cost) AS total_cost
FROM dbo.travel_requests
group by dbo.travel_requests.industry_id
) t1 on a.id = t1.industry_id
left join (
select dbo.industry_allocation.industry_id
, SUM(dbo.industry_allocation.allocation) AS total_allocation
from dbo.industry_allocation
group by dbo.industry_allocation.industry_id
) t2 on a.id = t2.industry_id
This happens because there are two entries for industry_id 1, so the rows are joined twice. If you aggregate the rows in subqueries before joining, this can't happen.
I used left joins because it seems that not every industry_id appears in all 3 tables.
You can use this approach too (drop the ORDER BY when you put it in a view, because views do not allow it).
;WITH q AS (
SELECT industry_id
, sum(allocation) AS total_allocation
FROM #industry_allocation
GROUP BY industry_id
)
SELECT #industry_tech_region.industry_name
, isnull(SUM(#travel_request.travel_cost
+ #travel_request.stay_cost
+ #travel_request.other_cost),0.0) AS total_cost
,q.total_allocation AS total_allocation
FROM #industry_tech_region
LEFT JOIN q ON #industry_tech_region.id = q.industry_id
LEFT JOIN #travel_request ON #industry_tech_region.id = #travel_request.industry_id
GROUP BY #industry_tech_region.industry_name,q.total_allocation
ORDER BY industry_name
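Neither answer covers the remaining_allocation column from the question. Below is a rough sketch of how it could be added on top of the aggregate-then-join idea, run through Python's built-in sqlite3 module (SQLite, not SQL Server) on the sample data from the question. The view name `industry_budget` is an assumption, and COALESCE is used so that an industry without travel requests (Machinery here) shows a cost of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE industry_tech_region (id INT, industry_name TEXT);
CREATE TABLE travel_requests (id INT, industry_id INT, travel_cost INT, stay_cost INT, other_cost INT);
CREATE TABLE industry_allocation (id INT, industry_id INT, allocation INT);

-- Sample data from the question.
INSERT INTO industry_tech_region VALUES
    (1, 'Auto'), (2, 'Aero'), (3, 'Machinery'), (4, 'Education'), (5, 'MTV');
INSERT INTO travel_requests VALUES
    (1, 2, 1000, 500, 200), (2, 4, 4000, 100, 200), (3, 5, 3000, 0, 400),
    (4, 1, 3000, 250, 100), (5, 1, 200, 100, 75);
INSERT INTO industry_allocation VALUES
    (1, 1, 500000), (2, 2, 300000), (3, 3, 500000),
    (4, 4, 300000), (5, 5, 500000), (6, 1, 200000);

-- Aggregate each child table on its own, then join the one-row-per-industry results.
CREATE VIEW industry_budget AS
SELECT r.industry_name,
       COALESCE(c.total_cost, 0) AS total_cost,
       COALESCE(a.total_allocation, 0) AS total_allocation,
       COALESCE(a.total_allocation, 0) - COALESCE(c.total_cost, 0) AS remaining_allocation
FROM industry_tech_region r
LEFT JOIN (SELECT industry_id,
                  SUM(travel_cost + stay_cost + other_cost) AS total_cost
           FROM travel_requests
           GROUP BY industry_id) c ON c.industry_id = r.id
LEFT JOIN (SELECT industry_id, SUM(allocation) AS total_allocation
           FROM industry_allocation
           GROUP BY industry_id) a ON a.industry_id = r.id;
""")

# industry_name -> [total_cost, total_allocation, remaining_allocation]
budget = {name: rest for name, *rest in conn.execute("SELECT * FROM industry_budget")}
for name, values in budget.items():
    print(name, values)
```

Auto now comes out as 3725 and 700000, the values the question expects, because each subquery returns a single row per industry before the join.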

How to update a field's value by an incremental value without using loop in SQL Server 2008?

I have a table #temp in SQL Server 2008 like this:
Id p_id h_no f_id
------------------
1 100 A01 null
2 200 A02 null
3 300 A02 null
4 400 null null
5 500 null null
6 600 A03 null
7 700 A01 null
8 400 null null
So basically, every record has a p_id, but may or may not have h_no.
What I want is to replace f_id values with a dummy incremental number based on:
if h_no value of a record matches another(s), this (those) ones will have same f_id (check ids:1 & 7 or ids:2 & 3 in the example)
if h_no is null but p_id values are equal for some cases, they will have same f_id (check ids: 4 & 8 in the example)
For example, the sample table above should be:
Id p_id h_no f_id
-----------------
1 100 A01 1
2 200 A02 2
3 300 A02 2
4 400 null 3
5 500 null 4
6 600 A03 5
7 700 A01 1
8 400 null 3
I do not want to use a loop for this process. I am trying to find a more optimal solution for this. I need a query something like below, could not find the correct syntax.
declare @tempFID int = 1;
update t
set t.f_id = @tempFID++ --syntax error
from #temp t
inner join #temp t2 on t.Id = t2.Id
where (t.h_no is not null and t.h_no = t2.h_no)
or (t.h_no is null and t.p_id = t2.p_id)
I also tried but had syntax error:
update t
set t.f_id = (set @tempFID = @tempFID + 1) --syntax error
...
Any help would be so appreciated!
;WITH cte AS (
    SELECT *
        ,CASE WHEN h_no IS NULL THEN p_id ELSE MIN(p_id) OVER (PARTITION BY h_no) END as PIdGroup
    FROM
        #temp
)
, cteFIdValue AS (
    SELECT
        Id
        ,DENSE_RANK() OVER (ORDER BY PIdGroup) as f_id
    FROM
        cte
)
UPDATE t
SET f_id = u.f_id
FROM
    #temp t
    INNER JOIN cteFIdValue u
    ON t.Id = u.Id
Find the minimum p_id for each h_no, or keep the row's own p_id if h_no is null.
Then compute a dense rank over PIdGroup.
Finally, update the table.
You have problems besides the syntax error in your code above. First, your join on t.Id = t2.Id only ever matches a row with itself; you would have to change it to t.Id <> t2.Id as a left join, and you would still need some sort of ranking. Honestly, I am not sure what you are attempting there.
This approach might be simpler:
update #temp
set f_id = isnull(f_id, 0) +
case when condition1 is met then value 1
etc
when final condition is met then 0
else null
end
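To check the expected f_id values from the question without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module (SQLite, not SQL Server). It is a different technique from the DENSE_RANK answer: each row gets a group key (its h_no, or its p_id when h_no is null), and groups are numbered in order of the smallest Id they contain. The names `records` (standing in for #temp) and `first_seen` are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (Id INT, p_id INT, h_no TEXT, f_id INT)")
# Sample rows from the question; f_id starts out NULL.
conn.executemany("INSERT INTO records (Id, p_id, h_no) VALUES (?, ?, ?)", [
    (1, 100, "A01"), (2, 200, "A02"), (3, 300, "A02"), (4, 400, None),
    (5, 500, None), (6, 600, "A03"), (7, 700, "A01"), (8, 400, None),
])

conn.executescript("""
-- One row per group: rows sharing an h_no, or sharing a p_id when h_no is NULL.
-- The 'h:' / 'p:' prefixes keep the two kinds of key from colliding.
CREATE TEMP TABLE first_seen AS
SELECT COALESCE('h:' || h_no, 'p:' || p_id) AS grp, MIN(Id) AS first_id
FROM records
GROUP BY COALESCE('h:' || h_no, 'p:' || p_id);

-- A group's number is the count of groups that first appear at or before it.
UPDATE records
SET f_id = (
    SELECT COUNT(*)
    FROM first_seen f
    WHERE f.first_id <= (
        SELECT g.first_id
        FROM first_seen g
        WHERE g.grp = COALESCE('h:' || records.h_no, 'p:' || records.p_id)
    )
);
""")

f_ids = [row[0] for row in conn.execute("SELECT f_id FROM records ORDER BY Id")]
print(f_ids)
```

The whole numbering happens in a single set-based UPDATE, so no loop or counter variable is involved.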

SELECT clause with SUM condition

I have these tables:
//TEST
NUMBER TOTAL
----------------------------
1 158
2 355
3 455
//TEST1
NUMBER QUANTITY UNITPRICE
--------------------------------------------
1 3 5
1 3 6
1 3 4
2 4 8
3 5 4
I used following query:
SELECT t.NUMBER,sum(t.TOTAL),NVL(SUM(t2.quantity*t2.unitprice),0)
FROM test t INNER JOIN test1 t2 ON t.NUMBER=t2.NUMBER
GROUP BY t.NUMBER;
OUTPUT:
NUMBER SUM(TOTAL) SUM(t2.quantity*t2.unitprice)
-----------------------------------------------------------
1 474 45 <--- only this wrong
2 355 32
It seems to loop three times, so the record shows 158*3.
EXPECTED OUTPUT:
NUMBER SUM(TOTAL) SUM(t2.quantity*t2.unitprice)
-----------------------------------------------------------
1 158 45
2 355 32
You have to understand that the result of your join is something like this:
//TEST1
NUMBER QUANTITY UNITPRICE TOTAL
--------------------------------------------------------------
1 3 5 158
1 3 6 158
1 3 4 158
2 4 8 355
3 5 4 455
It means you don't need to apply a SUM on TOTAL
SELECT t.NUMBER,t.TOTAL,NVL(SUM(t2.quantity*t2.unitprice),0)
FROM test t INNER JOIN test1 t2 ON t.NUMBER=t2.NUMBER
GROUP BY t.NUMBER, t.TOTAL;
Something like this should work using a subquery separating the sums:
select t.num,
sum(t.total),
test1sum
from test t
join (
select num, sum(qty*unitprice) test1sum
from test1
group by num
) t2 on t.num = t2.num
group by t.num, test1sum
In regards to your sample data, you may not even need the additional group by on the test total field. If that table only contains distinct ids, then this would work the same:
select t.num,
t.total,
sum(qty*unitprice)
from test t
join test1 t2 on t.num = t2.num
group by t.num, t.total
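The fan-out is easy to reproduce. This is a quick sketch with Python's built-in sqlite3 module (SQLite rather than Oracle, so NVL is left out), using the sample rows from the question and the renamed `num` column from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (num INT, total INT);
CREATE TABLE test1 (num INT, quantity INT, unitprice INT);
INSERT INTO test VALUES (1, 158), (2, 355), (3, 455);
INSERT INTO test1 VALUES (1, 3, 5), (1, 3, 6), (1, 3, 4), (2, 4, 8), (3, 5, 4);
""")

# Summing TOTAL after the join counts it once per matching test1 row.
naive = conn.execute("""
    SELECT t.num, SUM(t.total), SUM(t2.quantity * t2.unitprice)
    FROM test t
    JOIN test1 t2 ON t.num = t2.num
    GROUP BY t.num
    ORDER BY t.num
""").fetchall()

# Grouping by TOTAL instead of summing it keeps the original value.
fixed = conn.execute("""
    SELECT t.num, t.total, SUM(t2.quantity * t2.unitprice)
    FROM test t
    JOIN test1 t2 ON t.num = t2.num
    GROUP BY t.num, t.total
    ORDER BY t.num
""").fetchall()

print(naive)
print(fixed)
```

The first query shows 474 for number 1 (158 counted three times), while the second shows 158.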