SQL Server: Two COUNTs in one query multiplying with one another in output - sql

I have a query that is used to display information in a queue. Part of that information is the number of child entities (packages and labs) that belong to the parent entity (change). However, instead of showing the individual counts of each type of child, the counts multiply with one another.
In the case below, there are supposed to be 3 labs and 18 packages; however, they multiply with one another and the output is 54 for each.
Below is the offending portion of the query.
SELECT cef.ChangeId, COUNT(pac.PackageId) AS 'Packages', COUNT(lab.LabRequestId) AS 'Labs'
FROM dbo.ChangeEvaluationForm cef
LEFT JOIN dbo.Lab
ON cef.ChangeId = Lab.ChangeId
LEFT JOIN dbo.Package pac
ON (cef.ChangeId = pac.ChangeId AND pac.PackageStatus != 6 AND pac.PackageStatus !=7)
WHERE cef.ChangeId = 255
GROUP BY cef.ChangeId
I feel like this is obvious, but it's not occurring to me how to fix it so that the two counts are independent of one another, as I think they should be. There doesn't seem to be a scenario like this in any of my research either. Can anyone guide me in the right direction?

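Each LEFT JOIN multiplies the rows produced so far, so joining two independent child tables to the same parent behaves like a cross join between the children. Counting each child table in its own OUTER APPLY avoids this: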
SELECT cef.ChangeId, p.Packages, l.Labs
FROM dbo.ChangeEvaluationForm cef
OUTER APPLY(
SELECT COUNT(*) as Labs
FROM dbo.Lab
WHERE cef.ChangeId = Lab.ChangeId
) l
OUTER APPLY(
SELECT COUNT(*) AS Packages
FROM dbo.Package pac
WHERE (cef.ChangeId = pac.ChangeId AND pac.PackageStatus != 6 AND pac.PackageStatus !=7)
) p
WHERE cef.ChangeId = 255
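GROUP BY is not needed here: each OUTER APPLY already returns a single count per ChangeId, and keeping GROUP BY cef.ChangeId would raise an error because p.Packages and l.Labs are neither aggregated nor in the GROUP BY list.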

From your question it is difficult to tell what result you expect, so I presume you want the following:
+----------+----------+------+
| ChangeId | Packages | Labs |
+----------+----------+------+
| 255 | 18 | 3 |
+----------+----------+------+
If that is the result you are looking for, try the query below.
SELECT cef.ChangeId, ISNULL(pac.PacCount, 0) AS 'Packages', ISNULL(Lab.LabCount, 0) AS 'Labs'
FROM dbo.ChangeEvaluationForm cef
LEFT JOIN (SELECT Lab.ChangeId, COUNT(*) LabCount FROM dbo.Lab GROUP BY Lab.ChangeId) Lab
ON cef.ChangeId = Lab.ChangeId
LEFT JOIN (SELECT pac.ChangeId, COUNT(*) PacCount FROM dbo.Package pac WHERE pac.PackageStatus != 6 AND pac.PackageStatus !=7 GROUP BY pac.ChangeId) pac
ON cef.ChangeId = pac.ChangeId
WHERE cef.ChangeId = 255
Query explanation:
In your query the aggregation runs after both joins, so every lab row is paired with every package row and each COUNT sees the Cartesian product (3 x 18 = 54).
In this query I group by ChangeId and compute each aggregate before joining the tables, so the 3 labs and 18 packages are counted separately.
You will also notice that I moved the PackageStatus filter inside the pac subquery, before its GROUP BY, so unwanted records don't affect the count.

You start with a particular ChangeId from the dbo.ChangeEvaluationForm table (ChangeId = 255 from your example), then join to the dbo.Lab table. This join makes your result go from 1 row to 3, considering there are 3 Labs with ChangeId = 255. Your problem is on the next join, you are joining all 3 resulting rows from the previous join with the dbo.Package table, which has 18 rows for ChangeId = 255. The resulting count for columns pac.PackageId and lab.LabRequestId will then be 3 x 18 = 54.
To get what you want, there are 2 easy solutions:
Use COUNT(DISTINCT ...) instead of COUNT(...). This counts only the distinct values of pac.PackageId and lab.LabRequestId, not the repeats introduced by the joins.
Split the joins into 2 subqueries and join their result (by ChangeId)
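Both fixes can be checked on a small in-memory database. The sketch below is not SQL Server: it uses Python's sqlite3 module with made-up rows (3 labs and 18 packages for ChangeId 255), and COALESCE stands in for ISNULL. Under those assumptions it reproduces the 54/54 result and shows that either fix returns 18 and 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ChangeEvaluationForm (ChangeId INTEGER PRIMARY KEY);
    CREATE TABLE Lab (LabRequestId INTEGER PRIMARY KEY, ChangeId INTEGER);
    CREATE TABLE Package (PackageId INTEGER PRIMARY KEY, ChangeId INTEGER,
                          PackageStatus INTEGER);
    INSERT INTO ChangeEvaluationForm VALUES (255);
""")
# Made-up sample rows: 3 labs and 18 packages (all with status 1) for change 255.
conn.executemany("INSERT INTO Lab (ChangeId) VALUES (?)", [(255,)] * 3)
conn.executemany(
    "INSERT INTO Package (ChangeId, PackageStatus) VALUES (?, 1)", [(255,)] * 18
)

# Shared FROM/JOIN part of the original query: both child tables joined at once.
JOINS = """
    FROM ChangeEvaluationForm cef
    LEFT JOIN Lab lab ON cef.ChangeId = lab.ChangeId
    LEFT JOIN Package pac ON cef.ChangeId = pac.ChangeId
                         AND pac.PackageStatus NOT IN (6, 7)
    WHERE cef.ChangeId = 255
    GROUP BY cef.ChangeId
"""

# Original approach: counts every row of the 3 x 18 join result.
naive = conn.execute(
    "SELECT cef.ChangeId, COUNT(pac.PackageId), COUNT(lab.LabRequestId)" + JOINS
).fetchone()

# Fix 1: count distinct keys instead of joined rows.
distinct = conn.execute(
    "SELECT cef.ChangeId, COUNT(DISTINCT pac.PackageId),"
    " COUNT(DISTINCT lab.LabRequestId)" + JOINS
).fetchone()

# Fix 2: aggregate each child table first, then join the one-row results.
pre_aggregated = conn.execute("""
    SELECT cef.ChangeId, COALESCE(p.PacCount, 0), COALESCE(l.LabCount, 0)
    FROM ChangeEvaluationForm cef
    LEFT JOIN (SELECT ChangeId, COUNT(*) AS LabCount
               FROM Lab GROUP BY ChangeId) l ON cef.ChangeId = l.ChangeId
    LEFT JOIN (SELECT ChangeId, COUNT(*) AS PacCount
               FROM Package WHERE PackageStatus NOT IN (6, 7)
               GROUP BY ChangeId) p ON cef.ChangeId = p.ChangeId
    WHERE cef.ChangeId = 255
""").fetchone()

print(naive)           # (255, 54, 54)
print(distinct)        # (255, 18, 3)
print(pre_aggregated)  # (255, 18, 3)
```

COUNT(DISTINCT ...) is the smaller change, but it still builds the full 54-row join first; the pre-aggregated form avoids the fan-out altogether.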

Related

SQL Query with 2 joins and different values

I'm quite the beginner so I suppose some of you would have an easy time on my task but I need some help:
I have 3 tables: dbo_A_Personal, dbo_Z_Ferien and dbo_Z_ERFASSUNG.
A_Pers has a Pers_ID (LPE_ID) that I can use to join Z_Ferien and Z_ERFASSUNG on.
In Z_Ferien I have 4 rows with that pers_ID and in Z_ERFASSUNG 96.
What I need is a result with columns that basically look like this:
PersID Erf Fer
1224 5 0
1234 4 0
1234 6 0
1234 0 6
so far I have this:
SELECT dbo_A_PERSONAL.LPE_ID, dbo_Z_Ferien.ZFE_TAGE, dbo_Z_ERFASSUNG.ZER_Std100
FROM dbo_A_PERSONAL
INNER JOIN dbo_Z_Ferien ON dbo_A_PERSONAL.LPE_ID = dbo_Z_Ferien.ZFE_LPE_ID
INNER JOIN dbo_Z_ERFASSUNG ON dbo_A_PERSONAL.LPE_ID = dbo_Z_ERFASSUNG.ZER_LPE
WHERE dbo_A_PERSONAL.LPE_ID=804 AND dbo_Z_ERFASSUNG.ZER_EIGENSCH = 3;
I need that so I can sum up the value I need from Z_ERFASSUNG and Z_Ferien but I don't know how to make it so each value is only "printed" once.
I hope I explained it well enough so you guys can help me out.
If I understand correctly, an aggregate function is what you need here.
I wrapped the dbo_Z_Ferien and dbo_Z_ERFASSUNG columns in SUM and added a GROUP BY on LPE_ID, which tells SQL to compute the sums per LPE_ID.
SELECT dbo_A_PERSONAL.LPE_ID, sum(dbo_Z_Ferien.ZFE_TAGE), sum(dbo_Z_ERFASSUNG.ZER_Std100)
FROM dbo_A_PERSONAL
INNER JOIN dbo_Z_Ferien ON dbo_A_PERSONAL.LPE_ID = dbo_Z_Ferien.ZFE_LPE_ID
INNER JOIN dbo_Z_ERFASSUNG ON dbo_A_PERSONAL.LPE_ID = dbo_Z_ERFASSUNG.ZER_LPE
WHERE dbo_A_PERSONAL.LPE_ID=804 AND dbo_Z_ERFASSUNG.ZER_EIGENSCH = 3
GROUP BY dbo_A_PERSONAL.LPE_ID
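One caveat worth checking with this approach: both child tables are joined to the same person row, so every ZFE_TAGE value is repeated once per matching Z_ERFASSUNG row (and vice versa) before SUM runs. The sketch below uses Python's sqlite3 module and invented numbers (2 Ferien rows and 3 matching Erfassung rows, not the 4 and 96 from the question) to show the inflated sums and one possible way around them, summing each table in its own subquery first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbo_A_PERSONAL (LPE_ID INTEGER PRIMARY KEY);
    CREATE TABLE dbo_Z_Ferien (ZFE_LPE_ID INTEGER, ZFE_TAGE INTEGER);
    CREATE TABLE dbo_Z_ERFASSUNG (ZER_LPE INTEGER, ZER_EIGENSCH INTEGER,
                                  ZER_Std100 INTEGER);
    -- Invented sample data for person 804.
    INSERT INTO dbo_A_PERSONAL VALUES (804);
    INSERT INTO dbo_Z_Ferien VALUES (804, 2), (804, 3);           -- true sum: 5
    INSERT INTO dbo_Z_ERFASSUNG VALUES (804, 3, 1), (804, 3, 2),
                                       (804, 3, 4),               -- true sum: 7
                                       (804, 1, 100);             -- filtered out
""")

# SUM over the joined rows: 2 x 3 = 6 rows, so each value is counted repeatedly.
joined = conn.execute("""
    SELECT p.LPE_ID, SUM(f.ZFE_TAGE), SUM(e.ZER_Std100)
    FROM dbo_A_PERSONAL p
    INNER JOIN dbo_Z_Ferien f ON p.LPE_ID = f.ZFE_LPE_ID
    INNER JOIN dbo_Z_ERFASSUNG e ON p.LPE_ID = e.ZER_LPE
    WHERE p.LPE_ID = 804 AND e.ZER_EIGENSCH = 3
    GROUP BY p.LPE_ID
""").fetchone()

# Sum each child table separately, then join the per-person totals.
pre_aggregated = conn.execute("""
    SELECT p.LPE_ID, f.Fer, e.Erf
    FROM dbo_A_PERSONAL p
    INNER JOIN (SELECT ZFE_LPE_ID, SUM(ZFE_TAGE) AS Fer
                FROM dbo_Z_Ferien GROUP BY ZFE_LPE_ID) f
            ON p.LPE_ID = f.ZFE_LPE_ID
    INNER JOIN (SELECT ZER_LPE, SUM(ZER_Std100) AS Erf
                FROM dbo_Z_ERFASSUNG WHERE ZER_EIGENSCH = 3
                GROUP BY ZER_LPE) e
            ON p.LPE_ID = e.ZER_LPE
    WHERE p.LPE_ID = 804
""").fetchone()

print(joined)          # (804, 15, 14)
print(pre_aggregated)  # (804, 5, 7)
```

Unlike COUNT, a SUM cannot be repaired with DISTINCT, because two genuinely different rows may hold the same value.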

Postgresql: Values of multiple rows in one row

I have the following database:
Car: {[CarID, HorsePower, Brand, HeadDesigner]}
DesignsCar:{[CarID, DesID]}
Designer:{[DesID, Name]}
You should note that while every Car has only 1 HeadDesigner, multiple people can design cars (as in work on them).
Say I have 10 cars in my database. For CarID (1..9) only one DesID per CarID in DesignsCar.
However, for carID 10 we have 3 people working on it (carID has 3 entries in DesignsCar because 3 people worked on it).
Say I do this:
select *
from car c
left outer join designscar ds on c.carid = ds.carid
left outer join designer d on ds.desid = d.desid
This gives me 12 rows, when I only want 10. The reason why this gives me 12 rows should be clear: for carID 10 we have 3 people working on it (carID has 3 entries in DesignsCar because 3 people worked on it).
I hope I've done a good job explaining this problem, so here comes my question:
How do I modify the query above so I get 10 Rows. For CarID 10 I'd like the 3 designers to be written in one column (like, comma separated but anything works as long it's in one column).
Is that possible?
You need to aggregate the values. Here is one possibility:
select c.*,
array_agg(d.name) as designer_names
from car c left outer join
designscar ds
on c.carid = ds.carid left outer join
designer d
on ds.desid = d.desid
group by c.carid ; -- allowed assuming `carid` is the primary key
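The same idea can be tried without a PostgreSQL server. The sketch below is a stand-in, not the query above: it runs on Python's sqlite3 module, where group_concat plays the role that array_agg (or string_agg) plays in PostgreSQL, and the designer names are invented. It sets up 10 cars, with car 10 having three designers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Car (CarID INTEGER PRIMARY KEY);
    CREATE TABLE Designer (DesID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE DesignsCar (CarID INTEGER, DesID INTEGER);
    -- Invented designer names.
    INSERT INTO Designer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
""")
conn.executemany("INSERT INTO Car VALUES (?)", [(i,) for i in range(1, 11)])
# Cars 1..9 have one designer each; car 10 has three.
links = [(i, 1) for i in range(1, 10)] + [(10, 1), (10, 2), (10, 3)]
conn.executemany("INSERT INTO DesignsCar VALUES (?, ?)", links)

JOINS = """
    FROM Car c
    LEFT JOIN DesignsCar ds ON c.CarID = ds.CarID
    LEFT JOIN Designer d ON ds.DesID = d.DesID
"""

# Plain join: one row per (car, designer) pair.
joined_count = conn.execute("SELECT COUNT(*)" + JOINS).fetchone()[0]

# Grouped: one row per car, designer names collapsed into a single column.
grouped = conn.execute(
    "SELECT c.CarID, group_concat(d.Name, ', ') AS designer_names"
    + JOINS
    + " GROUP BY c.CarID ORDER BY c.CarID"
).fetchall()

print(joined_count)   # 12
print(len(grouped))   # 10
print(grouped[-1])    # car 10 with all three names in one column
```

The order of names inside the concatenated value is not guaranteed here; in PostgreSQL an ORDER BY inside the aggregate call can pin it down.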

SQL join two tables and the elements that satisfies one condition

Good afternoon,
I'm having an issue with two tables that I'm trying to join.
What I am trying to do is print a table with all products that are registered in some agenda (codControl), so the person can enter his price.
But first I have to look into lctocotacao to see whether he has already given a price for some product. When I do this, I only get the products that have a price, and I don't see the other ones.
Here is an example of my table cadprodutoscotacao
codProduct desc codControl
1 abc 197
2 cde 197
3 fgh 197
1 abc 198
And my table lctocotacao
codProduct price codControl codPerson
1 2.5000 197 19
2 3.0000 197 37
3 4.5000 198 37
I have this SQL statement at the moment:
SELECT cadc.cod, cadc.desc, lcto.codEnt, lcto.price
FROM cadprodutoscotacao cadc JOIN lctocotacao lcto
ON cadc.codControl = lcto.codControl
AND cadc.codProduct = lcto.codProduct
AND cadc.codControl = '197'
AND lcto.codPerson = '19'
ORDER BY cadc.codControl;
What I'm getting:
cod desc price codPerson codControl
1 abc 2.5000 19 197
And the table I expect
cod desc price codPerson codControl
1 abc 2.5000 19 197
2 cde 197
3 fgh 197
197 and 19 will be parameters to my query.
Any ideas on how to proceed?
E D I T
Basically, I have two queries:
SELECT *
FROM cadprodutoscotacao
WHERE cadc_codControl = '197'
This first, to return all products registered in the agenda '197'.
And the second one:
SELECT *
FROM lctocotacao
WHERE codPerson = 19
AND codControl = '197'
This second one returns the products that already have a price added by Person 19 in agenda 197.
I have to return one table, including all records from the first query, and, if there is some price in the second one, I have to "concatenate" them.
Thanks in advance.
You need a LEFT JOIN, but you also need to be careful about the filtering conditions:
SELECT cadc.cod, cadc.desc, lcto.codEnt, lcto.price
FROM cadprodutoscotacao cadc LEFT JOIN
lctocotacao lcto
ON cadc.codControl = lcto.codControl AND
cadc.cod = lcto.cod AND
lcto.codEnt = '19'
WHERE cadc.codControl = '197'
ORDER BY cadc.cod;
A LEFT JOIN keeps all rows in the first table, regardless of whether a match is found in the ON conditions. This applies to conditions on the first table as well as the second. Hence, you don't want to put filters on the first table in the ON clause.
The rule is: When using LEFT JOIN put filters on the first table in the WHERE clause. Filters on the second table go in the ON clause (otherwise the outer join is generally turned into an inner join).
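The rule can be checked against the sample rows from the question. The sketch below uses Python's sqlite3 module and the column names from the sample tables (codProduct, codPerson), with desc renamed to descr because DESC is a reserved word; those choices are assumptions for illustration, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cadprodutoscotacao (codProduct INTEGER, descr TEXT,
                                     codControl INTEGER);
    CREATE TABLE lctocotacao (codProduct INTEGER, price REAL,
                              codControl INTEGER, codPerson INTEGER);
    INSERT INTO cadprodutoscotacao VALUES
        (1, 'abc', 197), (2, 'cde', 197), (3, 'fgh', 197), (1, 'abc', 198);
    INSERT INTO lctocotacao VALUES
        (1, 2.5, 197, 19), (2, 3.0, 197, 37), (3, 4.5, 198, 37);
""")

# Filter on the second table in ON, filter on the first table in WHERE.
filter_in_on = conn.execute("""
    SELECT cadc.codProduct, cadc.descr, lcto.price, lcto.codPerson
    FROM cadprodutoscotacao cadc
    LEFT JOIN lctocotacao lcto
           ON cadc.codControl = lcto.codControl
          AND cadc.codProduct = lcto.codProduct
          AND lcto.codPerson = 19
    WHERE cadc.codControl = 197
    ORDER BY cadc.codProduct
""").fetchall()

# Same query, but the second-table filter moved to WHERE:
# rows with no match have codPerson NULL and are discarded.
filter_in_where = conn.execute("""
    SELECT cadc.codProduct, cadc.descr, lcto.price, lcto.codPerson
    FROM cadprodutoscotacao cadc
    LEFT JOIN lctocotacao lcto
           ON cadc.codControl = lcto.codControl
          AND cadc.codProduct = lcto.codProduct
    WHERE cadc.codControl = 197
      AND lcto.codPerson = 19
    ORDER BY cadc.codProduct
""").fetchall()

print(filter_in_on)
# [(1, 'abc', 2.5, 19), (2, 'cde', None, None), (3, 'fgh', None, None)]
print(filter_in_where)
# [(1, 'abc', 2.5, 19)]
```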
Your rows are filtered because you specified JOIN, which is a shortcut for INNER JOIN.
If you want all the records from the left table, even if they don't have correlated records in the right table, you should do a LEFT JOIN:
SELECT cadc.cod, cadc.desc, lcto.codEnt, lcto.price
FROM cadprodutoscotacao cadc
LEFT JOIN lctocotacao lcto
ON cadc.codControl = lcto.codControl
AND cadc.cod = lcto.cod
AND lcto.codEnt = '19'
WHERE cadc.codControl = '197' -- filter on the left table belongs in WHERE, not ON
ORDER BY cadc.cod;
I don't understand your example. What are the primary keys? "cod" and "codentry" appear in both tables. Your schema seems to be very redundant.
But whenever someone JOINs and is missing some entries, it might be solved by using a LEFT OUTER JOIN.

Get MAX() on repeating IDs

This is what my query results currently look like. How can I get the MAX() value for each unique ID?
For example:
for 5267139 it is 8,
for 5267145 it is 4.
5267136 5
5267137 8
5267137 2
5267139 8
5267139 5
5267139 3
5267141 4
5267141 3
5267145 4
5267145 3
5267146 1
5267147 2
5267152 3
5267153 3
5267155 8
SELECT DISTINCT st.ScoreID, st.ScoreTrackingTypeID
FROM ScoreTrackingType stt
LEFT JOIN ScoreTracking st
ON stt.ScoreTrackingTypeID = st.ScoreTrackingTypeID
ORDER BY st.ScoreID, st.ScoreTrackingTypeID DESC
GROUP BY will partition your table into separate blocks based on the column(s) you specify. You can then apply an aggregate function (MAX in this case) against each of the blocks -- this behavior applies by default with the below syntax:
SELECT First_column, MAX(Second_column) AS Max_second_column
FROM Table
GROUP BY First_column
EDIT: Based on the query above, it looks like you don't really need the ScoreTrackingType table at all, but leaving it in place, you could use:
SELECT st.ScoreID, MAX(st.ScoreTrackingTypeID) AS ScoreTrackingTypeID
FROM ScoreTrackingType stt
LEFT JOIN ScoreTracking st ON stt.ScoreTrackingTypeID = st.ScoreTrackingTypeID
GROUP BY st.ScoreID
ORDER BY st.ScoreID
The GROUP BY will obviate the need for DISTINCT, MAX will give you the value you are looking for, and the ORDER BY will still apply, but since there will only be a single ScoreTrackingTypeID value for each ScoreID you can pull it out of the ordering.
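As a quick check, the sketch below loads a few of the ID/value pairs from the question into an in-memory SQLite table through Python's sqlite3 module and applies GROUP BY with MAX. It queries ScoreTracking directly, on the assumption made above that the ScoreTrackingType table is not needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ScoreTracking (ScoreID INTEGER, ScoreTrackingTypeID INTEGER)"
)
# A subset of the rows shown in the question.
conn.executemany(
    "INSERT INTO ScoreTracking VALUES (?, ?)",
    [
        (5267137, 8), (5267137, 2),
        (5267139, 8), (5267139, 5), (5267139, 3),
        (5267145, 4), (5267145, 3),
    ],
)

# One row per ScoreID, carrying the largest ScoreTrackingTypeID in its group.
max_per_id = conn.execute("""
    SELECT ScoreID, MAX(ScoreTrackingTypeID) AS ScoreTrackingTypeID
    FROM ScoreTracking
    GROUP BY ScoreID
    ORDER BY ScoreID
""").fetchall()

print(max_per_id)
# [(5267137, 8), (5267139, 8), (5267145, 4)]
```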

Problem with SQL Join

I have two tables, tblEntities and tblScheduling.
tblEntities:
EntityID ShortName Active
1 Dirtville 1
2 Goldtown 1
3 Blackston 0
4 Cornfelt 1
5 Vick 1
tblScheduling:
ScheduleID EntityID SchedulingYearID
1 1 20
2 1 21
3 2 20
4 3 19
5 5 20
I need a query that will show ALL ACTIVE Entities and their schedule information for a particular ScheduleYearID.
Output should look like (the desired SchedulingYearID in this case is 20):
EntityID ScheduleID
1 1
2 3
4 NULL
5 5
The query that I have written so far is:
SELECT tblEntities.EntityID, tblEntities.ShortName, tblScheduling.ScheduleID
FROM tblScheduling RIGHT OUTER JOIN
tblEntities ON tblScheduling.EntityID = tblEntities.EntityID
WHERE (tblScheduling.SchedulingYearID = #SchedulingYearID)
AND (tblEntities.Active = 1)
ORDER BY tblEntities.EntityID
My problem is that using this query it will not include active entities without schedule information (such as EntityID 4 in the example above). I can write the query to display all active entities and their schedule status fine, but once I start limiting it via the SchedulingYearID I lose those particular entities.
Are there any solutions that I am obviously missing without having to resort to subqueries, cursors, etc.? If not it's not a big deal, I just feel like I am missing something simple here.
Try this. Join conditions are evaluated first to produce the intermediate join result set, and then (for an outer join) all the unmatched rows from the "outer" side are added back in. WHERE conditions are evaluated after all joins are done, so a filter on the optional table belongs in the ON clause.
SELECT E.EntityID, E.ShortName, S.ScheduleID
FROM tblEntities E
Left Join tblScheduling S
ON S.EntityID = E.EntityID
And S.SchedulingYearID = #SchedulingYearID
WHERE E.Active = 1
ORDER BY E.EntityID
I changed your join order because I prefer left joins, but that doesn't affect the result.
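Run against the sample tables from the question, this behaves as described. The sketch below uses Python's sqlite3 module, with a named parameter (:year) standing in for the #SchedulingYearID placeholder, and compares the year filter in the ON clause with the same filter in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblEntities (EntityID INTEGER PRIMARY KEY, ShortName TEXT,
                              Active INTEGER);
    CREATE TABLE tblScheduling (ScheduleID INTEGER PRIMARY KEY, EntityID INTEGER,
                                SchedulingYearID INTEGER);
    INSERT INTO tblEntities VALUES
        (1, 'Dirtville', 1), (2, 'Goldtown', 1), (3, 'Blackston', 0),
        (4, 'Cornfelt', 1), (5, 'Vick', 1);
    INSERT INTO tblScheduling VALUES
        (1, 1, 20), (2, 1, 21), (3, 2, 20), (4, 3, 19), (5, 5, 20);
""")

# Year filter in ON: unmatched active entities survive with a NULL ScheduleID.
year_in_on = conn.execute("""
    SELECT E.EntityID, E.ShortName, S.ScheduleID
    FROM tblEntities E
    LEFT JOIN tblScheduling S
           ON S.EntityID = E.EntityID
          AND S.SchedulingYearID = :year
    WHERE E.Active = 1
    ORDER BY E.EntityID
""", {"year": 20}).fetchall()

# Year filter in WHERE: the NULL rows added by the outer join are removed again.
year_in_where = conn.execute("""
    SELECT E.EntityID, E.ShortName, S.ScheduleID
    FROM tblEntities E
    LEFT JOIN tblScheduling S ON S.EntityID = E.EntityID
    WHERE E.Active = 1
      AND S.SchedulingYearID = :year
    ORDER BY E.EntityID
""", {"year": 20}).fetchall()

print(year_in_on)
# [(1, 'Dirtville', 1), (2, 'Goldtown', 3), (4, 'Cornfelt', None), (5, 'Vick', 5)]
print(year_in_where)
# [(1, 'Dirtville', 1), (2, 'Goldtown', 3), (5, 'Vick', 5)]
```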
It's your condition in the WHERE clause:
(tblScheduling.SchedulingYearID = #SchedulingYearID)
When there is no tblScheduling info this will always fail. Add a NULL check:
((tblScheduling.SchedulingYearID = #SchedulingYearID) OR (tblScheduling.SchedulingYearID IS NULL))
or whatever NULL-checking syntax your DB uses.
I think the trouble is that the WHERE clause is filtering out the rows where SchedulingYearID is null. So don't.
SELECT tblEntities.EntityID, tblEntities.ShortName, tblScheduling.ScheduleID
FROM tblScheduling RIGHT OUTER JOIN
tblEntities ON tblScheduling.EntityID = tblEntities.EntityID
WHERE (tblScheduling.SchedulingYearID = #SchedulingYearID OR
tblScheduling.SchedulingYearID IS NULL)
AND (tblEntities.Active = 1)
ORDER BY tblEntities.EntityID;