Select SUM from multiple tables - sql

I keep getting the wrong sum value when I join 3 tables.
Here is a pic of the ERD of the table:
(Original here: http://dl.dropbox.com/u/18794525/AUG%207%20DUMP%20STAN.png )
Here is the query:
select SUM(gpCutBody.actualQty) as cutQty , SUM(gpSewBody.quantity) as sewQty
from jobOrder
inner join gpCutHead on gpCutHead.joNum = jobOrder.joNum
inner join gpSewHead on gpSewHead.joNum = jobOrder.joNum
inner join gpCutBody on gpCutBody.gpCutID = gpCutHead.gpCutID
inner join gpSewBody on gpSewBody.gpSewID = gpSewHead.gpSewID

If you are only interested in the quantities of cuts and sews for all orders, the simplest way to do it would be like this:
select (select SUM(gpCutBody.actualQty) from gpCutBody) as cutQty,
(select SUM(gpSewBody.quantity) from gpSewBody) as sewQty
(This assumes that cuts and sews will always have associated job orders.)
If you want to see a breakdown of cuts and sews by job order, something like this might be preferable:
select joNum, SUM(actualQty) as cutQty, SUM(quantity) as sewQty
from (select joNum, actualQty, 0 as quantity
from gpCutBody
union all
select joNum, 0 as actualQty, quantity
from gpSewBody) sc
group by joNum

Mark's approach is a good one. I want to suggest the alternative of doing the group by's before the union, simply because this can be a more general approach for summing along multiple dimensions.
Your problem is that you have two dimensions that you want to sum along, and you are getting a cross product of the values in the join.
select joNum, act.quantity as ActualQty, q.quantity as Quantity
from (select joNum, sum(actualQty) as quantity
from gpCutBody
group by joNum
) act full outer join
(select joNum, sum(quantity) as quantity
from gpSewBody
group by joNum
) q
on act.joNum = q.joNum
(I have kept Mark's assumption that doing this by joNum is the desired output.)

Related

Sum fields of an Inner join

How I can add two fields that belong to an inner join?
I have this code:
select
SUM(ACT.NumberOfPlants ) AS NumberOfPlants,
SUM(ACT.NumOfJornales) AS NumberOfJornals
FROM dbo.AGRMastPlanPerformance MPR (NOLOCK)
INNER JOIN GENRegion GR ON (GR.intGENRegionKey = MPR.intGENRegionLink )
INNER JOIN AGRDetPlanPerformance DPR (NOLOCK) ON
(DPR.intAGRMastPlanPerformanceLink =
MPR.intAGRMastPlanPerformanceKey)
INNER JOIN vwGENPredios P โ€‹โ€‹(NOLOCK) ON ( DPR.intGENPredioLink =
P.intGENPredioKey )
INNER JOIN AGRSubActivity SA (NOLOCK) ON (SA.intAGRSubActivityKey =
DPR.intAGRSubActivityLink)
LEFT JOIN (SELECT RA.intGENPredioLink, AR.intAGRActividadLink,
AR.intAGRSubActividadLink, SUM(AR.decNoPlantas) AS
intPlantasTrabajads, SUM(AR.decNoPersonas) AS NumOfJornales,
SUM(AR.decNoPlants) AS NumberOfPlants
FROM AGRRecordActivity RA WITH (NOLOCK)
INNER JOIN AGRActividadRealizada AR WITH (NOLOCK) ON
(AR.intAGRRegistroActividadLink = RA.intAGRRegistroActividadKey AND
AR.bitActivo = 1)
INNER JOIN AGRSubActividad SA (NOLOCK) ON (SA.intAGRSubActividadKey
= AR.intAGRSubActividadLink AND SA.bitEnabled = 1)
WHERE RA.bitActive = 1 AND
AR.bitActive = 1 AND
RA.intAGRTractorsCrewsLink IN(2)
GROUP BY RA.intGENPredioLink,
AR.decNoPersons,
AR.decNoPlants,
AR.intAGRAActivityLink,
AR.intAGRSubActividadLink) ACT ON (ACT.intGENPredioLink IN(
DPR.intGENPredioLink) AND
ACT.intAGRAActivityLink IN( DPR.intAGRAActivityLink) AND
ACT.intAGRSubActivityLink IN( DPR.intAGRSubActivityLink))
WHERE
MPR.intAGRMastPlanPerformanceKey IN(4) AND
DPR.intAGRSubActivityLink IN( 1153)
GROUP BY
P.vchRegion,
ACT.NumberOfFloors,
ACT.NumOfJournals
ORDER BY ACT.NumberOfFloors DESC
However, it does not perform the complete sum. It only retrieves all the values โ€‹โ€‹of the columns and adds them 1 by 1, instead of doing the complete sum of the whole column.
For example, the query returns these results:
What I expect is the final sums. In NumberOfPlants the result of the sum would be 163,237 and of NumberJornales would be 61.
How can I do this?
First of all the (nolock) hints are probably not accomplishing the benefit you hope for. It's not an automatic "go faster" option, and if such an option existed you can be sure it would be already enabled. It can help in some situations, but the way it works allows the possibility of reading stale data, and the situations where it's likely to make any improvement are the same situations where risk for stale data is the highest.
That out of the way, with that much code in the question we're better served with a general explanation and solution for you to adapt.
The issue here is GROUP BY. When you use a GROUP BY in SQL, you're telling the database you want to see separate results per group for any aggregate functions like SUM() (and COUNT(), AVG(), MAX(), etc).
So if you have this:
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
You will get a separate row per ColumnA group, even though it's not in the SELECT list.
If you don't really care about that, you can do one of two things:
Remove the GROUP BY If there are no grouped columns in the SELECT list, the GROUP BY clause is probably not accomplishing anything important.
Nest the query
If option 1 is somehow not possible (say, the original is actually a view) you could do this:
SELECT SUM(SumB)
FROM (
SELECT Sum(ColumnB) As SumB
FROM [Table]
GROUP BY ColumnA
) t
Note in both cases any JOIN is irrelevant to the issue.

SQL Sum returning wrong number

I am adding up the amount of tickets sold for a sporting event, the answer should be under 100 but my answer is in the thousands.
SELECT Stubhub.Active.Opponent,
SUM(Stubhub.Active.Qty) AS AQty, SUM(Stubhub.Sold.Qty) AS SQty
FROM Stubhub.Active INNER JOIN
Stubhub.Sold ON Stubhub.Active.Opponent = Stubhub.Sold.Opponent
GROUP BY Stubhub.Active.Opponent
This is type of problem occurs because you are getting a cartesian product between each table for each opponent. The solution is to pre-aggregate by opponent:
SELECT a.Opponent, a.AQty, s.SQty
FROM (SELECT a.Opponent, SUM(a.Qty) as AQty
FROM Stubhub.Active a
GROUP BY a.Opponent
) a INNER JOIN
(SELECT s.Opponent, SUM(s.QTY) as SQty
FROM Stubhub.Sold s
GROUP BY s.Opponent
) s
ON a.Opponent = s.Opponent;
Notice that in this case, you do not need the aggregation in the outer query.

Sum Distinct Rows Only In Sql Server

I have four tables,in which First has one to many relation with rest of three tables named as (Second,Third,Fourth) respectively.I want to sum only Distinct Rows returned by select query.Here is my query, which i try so far.
select count(distinct First.Order_id) as [No.Of Orders],sum( First.Amount) as [Amount] from First
inner join Second on First.Order_id=Second.Order_id
inner join Third on Third.Order_id=Second.Order_id
inner join Fourth on Fourth.Order_id=Third.Order_id
The outcome of this query is :
No.Of Orders Amount
7 69
But this Amount should be 49,because the sum of First column Amount is 49,but due to inner join and one to many relationship,it calculate sum of also duplicate rows.How to avoid this.Kindly guide me
I think the problem is cartesian products in the joins (for a given id). You can solve this using row_number():
select count(t1234.Order_id) as [No.Of Orders], sum(t1234.Amount) as [Amount]
from (select First.*,
row_number() over (partition by First.Order_id order by First.Order_id) as seqnum
from First inner join
Second
on First.Order_id=Second.Order_id inner join
Third
on Third.Order_id=Second.Order_id inner join
Fourth
on Fourth.Order_id=Third.Order_id
) t1234
where seqnum = 1;
By the way, you could also express this using conditions in the where clause, because you appear to be using the joins only for filtering:
select count(First.Order_id) as [No.Of Orders], sum(First.Amount) as [Amount]
from First
where exists (select 1 from second where First.Order_id=Second.Order_id) and
exists (select 1 from third where First.Order_id=third.Order_id) and
exists (select 1 from fourth where First.Order_id=fourth.Order_id);

Missing rows when selecting from 2 tables

Hi and sorry if that's a stupud question but I am trying to get results from two different tables and since one of them might have zero records I am not sure how to proceed. Here is the query:
SELECT aid.*, T.ItemId, T.Total, T.Stack
FROM (SELECT ItemId, Stack, count(ItemId) as Total FROM auction_house WHERE Sale = 0 GROUP BY ItemId, Stack) as T, ahbot_item_data aid
WHERE T.ItemId = aid.ItemId
AND T.Stack = aid.Stack
Basically I want to get a list of items from the aid table, and the current count of those items in the ah table. But since the count might just be zero, it might not return that row. I want the row and the Total to be 0.
Thank you in advance ^^;;
Similar to what #Twelfth was saying, you want to have a left outer join, which always outputs data from the left table even if there is no matching data in the right table.
SELECT
aid.ItemId,
count(ah.ItemId) AS "Total",
aid.Stack
FROM
ahbot_item_data aid
LEFT OUTER JOIN auction_house ah
ON (aid.ItemID = ah.ItemID AND aid.Stack = ah.Stack)
GROUP BY
aid.ItemID, aid.Stack
You are using old syntax...I'm going to switch to join syntax (newer and more accepted...same effect, easier to read). What you want is an 'outer join', which will produce nulls where no record is found. I'll arrange the tables and we'll use a left outer join
SELECT aid.*, T.ItemId, T.Total, T.Stack
FROM ahbot_item_data aid left join (SELECT ItemId, Stack, count(ItemId) as Total FROM auction_house WHERE Sale = 0 GROUP BY ItemId, Stack) as T,
on T.ItemId = aid.ItemId
AND T.Stack = aid.Stack
I'm not entirely sure on the syntax anymore...but it's also possible to use != to do this outer join in the syntax you've used (where aid.itemID != t.itemID I think?)

Distinct on multi-columns in sql

I have this query in sql
select cartlines.id,cartlines.pageId,cartlines.quantity,cartlines.price
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
I want to get rows distinct by pageid ,so in the end I will not have rows with same pageid more then once(duplicate)
any Ideas
Thanks
Baaroz
Going by what you're expecting in the output and your comment that says "...if there rows in output that contain same pageid only one will be shown...," it sounds like you're trying to get the top record for each page ID. This can be achieved with ROW_NUMBER() and PARTITION BY:
SELECT *
FROM (
SELECT
ROW_NUMBER() OVER(PARTITION BY c.pageId ORDER BY c.pageID) rowNumber,
c.id,
c.pageId,
c.quantity,
c.price
FROM orders o
INNER JOIN cartlines c ON c.orderId = o.id
WHERE userId = 5
) a
WHERE a.rowNumber = 1
You can also use ROW_NUMBER() OVER(PARTITION BY ... along with TOP 1 WITH TIES, but it runs a little slower (despite being WAY cleaner):
SELECT TOP 1 WITH TIES c.id, c.pageId, c.quantity, c.price
FROM orders o
INNER JOIN cartlines c ON c.orderId = o.id
WHERE userId = 5
ORDER BY ROW_NUMBER() OVER(PARTITION BY c.pageId ORDER BY c.pageID)
If you wish to remove rows with all columns duplicated this is solved by simply adding a distinct in your query.
select distinct cartlines.id,cartlines.pageId,cartlines.quantity,cartlines.price
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
If however, this makes no difference, it means the other columns have different values, so the combinations of column values creates distinct (unique) rows.
As Michael Berkowski stated in comments:
DISTINCT - does operate over all columns in the SELECT list, so we
need to understand your special case better.
In the case that simply adding distinct does not cover you, you need to also remove the columns that are different from row to row, or use aggregate functions to get aggregate values per cartlines.
Example - total quantity per distinct pageId:
select distinct cartlines.id,cartlines.pageId, sum(cartlines.quantity)
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
If this is still not what you wish, you need to give us data and specify better what it is you want.