I'm using PostgreSQL 8.4.
I have the following sql-query:
SELECT p.partner_id,
CASE WHEN pa.currency_id = 1 THEN SUM(amount) ELSE 0 END AS curUsdAmount,
CASE WHEN pa.currency_id = 2 THEN SUM(amount) ELSE 0 END AS curRubAmount,
CASE WHEN pa.currency_id = 3 THEN SUM(amount) ELSE 0 END AS curUahAmount
FROM public.player_account AS pa
JOIN player AS p ON p.id = pa.player_id
WHERE p.partner_id IN (819)
GROUP BY p.partner_id, pa.currency_id
The thing is that the query does not do what I expected. I realize that, but now I want to understand what exactly it does, that is, which SUM is computed when the query executes. Could you clarify?
I think you have the conditions backwards in the query:
SELECT p.partner_id,
SUM(CASE WHEN pa.currency_id = 1 THEN amount ELSE 0 END) AS curUsdAmount,
SUM(CASE WHEN pa.currency_id = 2 THEN amount ELSE 0 END) AS curRubAmount,
SUM(CASE WHEN pa.currency_id = 3 THEN amount ELSE 0 END) AS curUahAmount
FROM public.player_account pa JOIN
player p
ON p.id = pa.player_id
WHERE p.partner_id IN (819)
GROUP BY p.partner_id;
Note that I also removed currency_id from the group by clause.
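To see concretely what the original query returns compared with the rewritten one, here is a small sketch that runs both shapes against an in-memory SQLite database from Python. SQLite is only a stand-in for PostgreSQL, and the sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, partner_id INTEGER);
    CREATE TABLE player_account (player_id INTEGER, currency_id INTEGER, amount INTEGER);
    -- Made-up sample rows: two players of partner 819, accounts in three currencies.
    INSERT INTO player VALUES (1, 819), (2, 819);
    INSERT INTO player_account VALUES (1, 1, 10), (1, 2, 20), (2, 1, 5), (2, 3, 7);
""")

# Original shape: CASE wraps the aggregate and currency_id is in GROUP BY,
# so there is one row per (partner, currency) with a single non-zero column.
original = conn.execute("""
    SELECT p.partner_id,
           CASE WHEN pa.currency_id = 1 THEN SUM(amount) ELSE 0 END,
           CASE WHEN pa.currency_id = 2 THEN SUM(amount) ELSE 0 END,
           CASE WHEN pa.currency_id = 3 THEN SUM(amount) ELSE 0 END
    FROM player_account AS pa
    JOIN player AS p ON p.id = pa.player_id
    WHERE p.partner_id IN (819)
    GROUP BY p.partner_id, pa.currency_id
    ORDER BY pa.currency_id
""").fetchall()

# Rewritten shape: CASE inside SUM, grouped by partner only, gives one row.
pivoted = conn.execute("""
    SELECT p.partner_id,
           SUM(CASE WHEN pa.currency_id = 1 THEN amount ELSE 0 END),
           SUM(CASE WHEN pa.currency_id = 2 THEN amount ELSE 0 END),
           SUM(CASE WHEN pa.currency_id = 3 THEN amount ELSE 0 END)
    FROM player_account AS pa
    JOIN player AS p ON p.id = pa.player_id
    WHERE p.partner_id IN (819)
    GROUP BY p.partner_id
""").fetchall()

print(original)  # [(819, 15, 0, 0), (819, 0, 20, 0), (819, 0, 0, 7)]
print(pivoted)   # [(819, 15, 20, 7)]
```

So the original query does compute each currency's sum, but spreads the sums over separate rows instead of separate columns of one row.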
Maybe one row per (partner_id, currency_id) does the job. Faster and cleaner that way:
SELECT p.partner_id, pa.currency_id, sum(amount) AS sum_amount
FROM player_account pa
JOIN player p ON p.id = pa.player_id
WHERE p.partner_id = 819
AND pa.currency_id IN (1,2,3) -- may be redundant if there are no other currencies
GROUP BY 1, 2;
If you need 1 row per partner_id, you are actually looking for "cross-tabulation" or a "pivot table". In Postgres, use crosstab() from the additional module tablefunc, which is very fast (and is also available for the outdated version 8.4):
SELECT * FROM crosstab(
'SELECT p.partner_id, pa.currency_id, sum(amount)
FROM player_account pa
JOIN player p ON p.id = pa.player_id
WHERE p.partner_id = 819
AND pa.currency_id IN (1,2,3)
GROUP BY 1, 2
ORDER BY 1, 2'
,'VALUES (1), (2), (3)'
) AS t (partner_id int, "curUsdAmount" numeric
, "curRubAmount" numeric
, "curUahAmount" numeric); -- guessing data types
Adapt to your actual data types.
Detailed explanation:
PostgreSQL Crosstab Query
An application which we have built has undergone a large change in its database schema, particularly in the way financial data is stored. We have functions that calculate the total amount of billing, based on various scenarios; and the change is causing huge performance problems when the functions must be run many times in a row.
I'll include an explanation, the function and the relevant schema, and I hope someone sees a much better way to write the function. This is SQL Server 2008.
First, the business basis: think of a medical Procedure. The healthcare Provider performing the Procedure sends one or more Bills, each of which may have one or more line items (BillItems).
That Procedure is then re-billed to another party. The amount billed to the third party may be:
The total of the Provider's billing
The total of the Provider's billing plus a Copay amount, or
A completely separate amount (a Rebill amount)
The current function for calculating the billing for a Procedure looks at all three scenarios:
CREATE FUNCTION [dbo].[fnProcTotalBilled] (@PROCEDUREID INT)
RETURNS MONEY AS
BEGIN
DECLARE @billed MONEY
SELECT @billed = (SELECT COALESCE((SELECT COALESCE(sum(bi.Amount),0)
FROM BillItems bi INNER JOIN Bills b ON b.BillID=bi.BillID
INNER JOIN Procedures p on p.ProcedureID=b.ProcedureID
WHERE b.ProcedureID=@PROCEDUREID
AND p.StatusID=3
AND b.HasCopay=0
AND b.Rebill=0),0))
-- the total of the provider's billing, with no copay and not rebilled
+
(SELECT COALESCE((SELECT sum(bi.Amount) + COALESCE(b.CopayAmt,0)
FROM BillItems bi INNER JOIN Bills b ON b.BillID=bi.BillID
INNER JOIN Procedures p on p.ProcedureID=b.ProcedureID
WHERE b.ProcedureID=@PROCEDUREID
AND p.StatusID=3
AND b.HasCopay=1
GROUP BY b.billid,b.CopayAmt),0))
-- the total of the provider's billing, plus a Copay amount
+
(SELECT COALESCE((SELECT sum(COALESCE(b.RebillAmt,0))
FROM Bills b
INNER JOIN Procedures p on p.ProcedureID=b.ProcedureID
WHERE b.ProcedureID=@PROCEDUREID
AND p.StatusID=3
AND b.Rebill=1),0))
-- the Rebill amount, instead of the provider's billing
RETURN @billed
END
I'll omit the DDL for the Procedure. Suffice to say, it must have a certain status (shown in the function as p.StatusID= 3).
Here are the DDLs for Bills and related BillItems:
CREATE TABLE dbo.Bills (
BillID int IDENTITY(1,1) NOT NULL,
InvoiceID int DEFAULT ((0)),
CaseID int NOT NULL,
ProcedureID int NOT NULL,
TherapyGroupID int DEFAULT ((0)) NOT NULL,
ProviderID int NOT NULL,
Description varchar(1000),
ServiceDescription varchar(255),
BillReferenceNumber varchar(100),
TreatmentDate datetime,
DateBilled datetime,
DateBillReceived datetime,
DateBillApproved datetime,
HasCopay bit DEFAULT ((0)) NOT NULL,
CopayAmt money,
Rebill bit DEFAULT ((0)) NOT NULL,
RebillAmt money,
IncludeInDemand bit DEFAULT ((1)) NOT NULL,
CreateDate datetime DEFAULT (getdate()) NOT NULL,
CreatedByID int,
ChangeDate datetime,
ChangeUserID int,
PRIMARY KEY (BillID)
);
CREATE TABLE dbo.BillItems (
BillItemID int IDENTITY(1,1) NOT NULL,
BillID int NOT NULL,
ItemDescription varchar(1000),
Amount money,
WillNotBePaid bit DEFAULT ((0)) NOT NULL,
CreateDate datetime DEFAULT (getdate()),
CreatedByID int,
ChangeDate datetime,
ChangeUserID varchar(25),
PRIMARY KEY (BillItemID)
);
I fully realize how complex the function is; but I couldn't find another way to account for all the scenarios.
I'm hoping that a far better SQL programmer or DBA will see a more performant solution.
Any help will be greatly appreciated.
Thanks,
Tom
UPDATE:
Thanks to everyone for their replies. I tried to add a little clarification in comments, but I'll do so here, too.
First, a definition: a Procedure is a medical service from a Provider on a single Date of Service. We only concern ourselves with the total amount billed for a procedure; multiple persons do not receive bills.
A "Case" can have many Procedures.
Generally, a single Procedure will have a single Bill - but not always. A Bill may have one or more BillItems. The Copay (if one exists) is added to the sum of the BillItems. A Rebill Amount trumps everything.
The performance issue comes into play at a higher level, when calculating the totals for an entire Case (many Procedures) and when needing to display grid data that shows hundreds of Cases at once.
My query was at the Procedure level, because it was simpler to describe the problem.
As to sample data, the data in @Serpiton's SQL Fiddle is an excellent, concise example. Thank you very much for it.
In reviewing the answers, it seems to me that @Serpiton's CTE approach and @GarethD's view approach are both strong improvements on my original. For the moment, I'm going to work with the CTE approach, simply to avoid the necessity of dealing with the multiple results from the SELECT.
I have modified @Serpiton's CTE to work at the Case level. It's working well in my testing, but I'd appreciate it if he or others would take a look at it.
It goes like this:
WITH Normal As (
SELECT b.BillID
, b.CaseID
, sum(coalesce(n.Amount * (1 - b.Rebill), 0)) Amount
FROM Procedures p
INNER JOIN Bills b ON p.ProcedureID = b.ProcedureID
LEFT JOIN BillItems n ON b.BillID = n.BillID
WHERE b.CaseID = 3444
AND p.StatusID = 3
GROUP BY b.CaseID,b.BillID, b.HasCopay
)
SELECT Amount = Sum(b.Amount)
+ Sum(Coalesce(c.CopayAmt, 0))
+ Sum(Coalesce(r.RebillAmt, 0))
FROM Normal b
LEFT JOIN Bills c ON b.BillID = c.BillID And c.HasCopay = 1
LEFT JOIN Bills r ON b.BillID = r.BillID And r.Rebill = 1
GROUP BY b.caseid
A very quick win is to use a (TABLE VALUED) (INLINE) FUNCTION instead of a (SCALAR) (MULTI-STATEMENT) FUNCTION.
CREATE FUNCTION [dbo].[fnProcTotalBilled] (@PROCEDUREID INT)
RETURNS TABLE
AS
RETURN (
SELECT
(sub-query1)
+
(sub-query2)
+
(sub-query3) AS amount
);
This can then be used as follows:
SELECT
something.*,
totalBilled.*
FROM
something
CROSS APPLY -- Or OUTER APPLY
[dbo].[fnProcTotalBilled](something.procedureID) AS totalBilled
Over larger data-sets this is significantly faster than using scalar functions.
- It must be INLINE (Not Multi-Statement)
- It must be TABLE-VALUED (Not Scalar)
If you work out better business logic for the calculation, you'll get even more performance benefits again.
EDIT :
This may be functionally the same as you have described, but it's hard to tell. Please add comments to my question to investigate further.
SELECT
SUM(
CASE WHEN b.HasCopay = 0 AND b.Rebill = 0 THEN COALESCE(bi.TotalAmount, 0)
WHEN b.HasCopay = 1 THEN b.CopayAmt + COALESCE(bi.TotalAmount, 0)
WHEN b.Rebill = 1 THEN b.RebillAmt
ELSE 0
END
) AS Amount
FROM
Procedures p
INNER JOIN
Bills b
ON b.ProcedureID = p.ProcedureID
LEFT JOIN
(
SELECT BillID, SUM(Amount) AS TotalAmount
FROM BillItems
GROUP BY BillID
)
AS bi
ON bi.BillID = b.BillID
WHERE
p.ProcedureID=@PROCEDUREID
AND p.StatusID=3
The 'trick' that makes this simpler is the sub-query that aggregates all the BillItems together into one record per BillID. The optimiser won't actually do that for the whole table, but only for the relevant records based on your JOINs and WHERE clause.
This then means that Bill:BillItem is 1:0..1, and everything simplifies. I believe ;)
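As a rough check of that idea, the sketch below runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, amounts are stored as whole numbers rather than MONEY, the tables are cut down to the columns the query touches, and the three sample bills (plain, copay, rebilled) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Procedures (ProcedureID INTEGER PRIMARY KEY, StatusID INTEGER);
    CREATE TABLE Bills (BillID INTEGER PRIMARY KEY, ProcedureID INTEGER,
                        HasCopay INTEGER, CopayAmt INTEGER,
                        Rebill INTEGER, RebillAmt INTEGER);
    CREATE TABLE BillItems (BillItemID INTEGER PRIMARY KEY, BillID INTEGER, Amount INTEGER);
    INSERT INTO Procedures VALUES (1, 3);
    INSERT INTO Bills VALUES
        (10, 1, 0, NULL, 0, NULL),  -- plain bill: items only
        (11, 1, 1, 20,   0, NULL),  -- copay bill: items plus copay
        (12, 1, 0, NULL, 1, 500);   -- rebilled: RebillAmt replaces the items
    INSERT INTO BillItems (BillID, Amount) VALUES
        (10, 100), (10, 50), (11, 200), (12, 300);
""")

# BillItems is collapsed to one row per bill first, so each bill contributes
# exactly one row to the outer SUM and the CASE picks the right scenario.
sql = """
    SELECT SUM(CASE WHEN b.HasCopay = 0 AND b.Rebill = 0 THEN COALESCE(bi.TotalAmount, 0)
                    WHEN b.HasCopay = 1 THEN b.CopayAmt + COALESCE(bi.TotalAmount, 0)
                    WHEN b.Rebill = 1 THEN b.RebillAmt
                    ELSE 0 END) AS Amount
    FROM Procedures p
    JOIN Bills b ON b.ProcedureID = p.ProcedureID
    LEFT JOIN (SELECT BillID, SUM(Amount) AS TotalAmount
               FROM BillItems GROUP BY BillID) AS bi
           ON bi.BillID = b.BillID
    WHERE p.ProcedureID = :proc_id AND p.StatusID = 3
"""
total = conn.execute(sql, {"proc_id": 1}).fetchone()[0]
print(total)  # 870 = (100 + 50) + (20 + 200) + 500
```

The 300 in items on the rebilled bill is ignored, which matches the rule that a Rebill amount replaces the provider's billing.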
Answer to the update
To increase the performance you can create a view with the same definition of the CTE, so that the query plan will be stored and reused.
If you have to calculate more than one total amount, don't try to get them individually. A better plan is to get all of them with a single query, using a condition like
WHERE b.CaseID IN (list of cases)
or some other condition that fits your needs, and adding some more information to the main query's output, at least the CaseID.
Update
@DRapp pointed out a problem with my previous solution (which I wrote without testing, sorry pals). To fix it, I have removed BillItems from the main query, which now works only with the Bills.
WITH Normal As (
SELECT b.BillID
, b.ProcedureID
, sum(coalesce(n.Amount * (1 - b.Rebill), 0)) Amount
FROM Procedures p
INNER JOIN Bills b ON p.ProcedureID = b.ProcedureID
LEFT JOIN BillItems n ON b.BillID = n.BillID
WHERE p.ProcedureID = @PROCEDUREID
AND p.StatusID = 3
GROUP BY b.ProcedureID, b.BillID, b.HasCopay
)
SELECT @Billed = Sum(b.Amount)
+ Sum(Coalesce(c.CopayAmt, 0))
+ Sum(Coalesce(r.RebillAmt, 0))
FROM Normal b
LEFT JOIN Bills c ON b.BillID = c.BillID And c.HasCopay = 1
LEFT JOIN Bills r ON b.BillID = r.BillID And r.Rebill = 1
GROUP BY b.ProcedureID
How it works
The Normal CTE gets all the bills related to the ProcedureID and calculates each bill's total; the factor Amount * (1 - Rebill) sets the Amount to 0 if the bill is a rebill.
In the main query, the Normal CTE is joined to the special types of bill (copay and rebill). Since Normal already contains all the Bills for the selected ProcedureID, the Procedures table is not needed there.
Demo with random data.
Old Query
Without data to test the query, this is flying blind:
SELECT @billed = Sum(Coalesce(n.Amount, 0))
+ Sum(Coalesce(c.CopayAmt, 0))
+ Sum(Coalesce(r.RebillAmt, 0))
FROM Procedures p
INNER JOIN Bills b ON p.ProcedureID = b.ProcedureID And b.Rebill = 0
INNER JOIN BillItems n ON b.BillID = n.BillID
INNER JOIN Bills c ON p.ProcedureID = c.ProcedureID And c.HasCopay = 1
INNER JOIN Bills r ON p.ProcedureID = r.ProcedureID And r.Rebill = 1
Where p.ProcedureID = @PROCEDUREID
AND p.StatusID = 3
Here b is the alias for the "normal" bill (with n for the bill items), c for the bill with a copay, and r for the rebilled one.
The JOIN condition on b checks only b.Rebill = 0, to get the bill items for both the "normal" bills and the copay ones.
I assume that no bill can have both HasCopay and Rebill set to 1.
The first thing I have noticed is that your query could fail if there is more than one billID for a procedureID (I don't know if this is possible in your design though). If it is and it happens then this part will fail:
(SELECT COALESCE((SELECT sum(bi.Amount) + COALESCE(b.CopayAmt,0)
FROM BillItems bi INNER JOIN Bills b ON b.BillID=bi.BillID
INNER JOIN Procedures p on p.ProcedureID=b.ProcedureID
WHERE b.ProcedureID=@PROCEDUREID
AND p.StatusID=3
AND b.HasCopay=1
GROUP BY b.billid,b.CopayAmt),0))
Due to the grouping, you will get more than one result returned in the subquery which is not allowed. I don't think this would affect my overall decision on how to alter your schema though.
I would consider turning this into a view. When you operate this as a scalar UDF it is executed once per row; when you use a view, the definition is expanded out into the outer query and can be optimised accordingly.
You can also turn this into a single select. The first step would be to get the components common to all three subqueries:
SELECT p.ProcedureID,
bi.Amount,
b.HasCopay,
b.CopayAmt,
b.Rebill,
b.RebillAmt
FROM ( SELECT BillID, Amount = SUM(Amount)
FROM Billitems
GROUP BY BillID
) bi
INNER JOIN Bills b
ON b.BillID = bi.BillID
INNER JOIN Procedures p
ON p.ProcedureID = b.ProcedureID
WHERE p.StatusID = 3;
You can now combine the logic of the 3 subqueries to get the same total:
SELECT p.ProcedureID,
Amount = CASE WHEN b.Rebill = 0 THEN bi.Amount ELSE 0 END,
CopayAmt = CASE WHEN b.HasCopay = 1 THEN b.CopayAmt ELSE 0 END,
RebillAmt = CASE WHEN b.Rebill = 1 THEN b.RebillAmt ELSE 0 END
FROM ( SELECT BillID, Amount = SUM(Amount)
FROM Billitems
GROUP BY BillID
) bi
INNER JOIN Bills b
ON b.BillID = bi.BillID
INNER JOIN Procedures p
ON p.ProcedureID = b.ProcedureID
WHERE p.StatusID = 3;
You can now aggregate this and move it to a view for reusability (I have moved the case expressions above into an APPLY simply to avoid repeating them in the Total column):
CREATE VIEW dbo.ProcTotalBilled
AS
SELECT p.ProcedureID,
Amount = SUM(calc.Amount),
CopayAmt = SUM(calc.CopayAmt),
Rebill = SUM(calc.RebillAmt),
Total = SUM(calc.Amount + calc.CopayAmt + calc.RebillAmt)
FROM ( SELECT BillID, Amount = SUM(Amount)
FROM Billitems
GROUP BY BillID
) bi
INNER JOIN Bills b
ON b.BillID = bi.BillID
INNER JOIN Procedures p
ON p.ProcedureID = b.ProcedureID
CROSS APPLY
( SELECT Amount = CASE WHEN b.Rebill = 0 THEN bi.Amount ELSE 0 END,
CopayAmt = CASE WHEN b.HasCopay = 1 THEN b.CopayAmt ELSE 0 END,
RebillAmt = CASE WHEN b.Rebill = 1 THEN b.RebillAmt ELSE 0 END
) calc
WHERE p.StatusID = 3
GROUP BY p.ProcedureID;
Then instead of using something like:
SELECT Total = dbo.fnProcTotalBilled(p.ProcedureID)
FROM dbo.Procedures p;
You would use
SELECT Total = ISNULL(ptb.Total, 0)
FROM dbo.Procedures p
LEFT JOIN dbo.ProcTotalBilled ptb
ON ptb.ProcedureID = p.ProcedureID;
Slightly more verbose, but I would be surprised if it didn't outperform your scalar UDF considerably.
Can you show some sample data that covers the variety of scenarios? Also, I would expect the procedure to be more of a lookup table, with many people possibly billed for the same procedure, in which case the BillID would be critical to the function: what has been billed to a given person for a given procedure. The function would then have TWO parameters, one for the procedure you are interested in and a second for the patient's actual Bill.
Then, the inner queries would be restricted down to the one person's bill... Unless the procedure is unique per person having the procedure done, but that is unclear since DDL for procedure is not provided.
I have other thoughts on the querying, but would need clarification from above context so I do not throw crud here just to show a query.
After all, you need some values from bills along with the sum of bill items. You could simplify the query thus:
select sum
(
coalesce( case when b.rebill = 1 then b.rebillamt end , 0 ) +
coalesce( case when b.rebill = 0 then (select sum(bi.amount) from billitems bi where bi.billid = b.billid) end , 0 ) +
coalesce( case when b.rebill = 0 and b.hascopay = 1 then b.copayamt end , 0 )
) as value
from procedures p
inner join bills b on b.procedureid = p.procedureid
where p.ProcedureID = @PROCEDUREID
and p.StatusID = 3;
but T-SQL rejects this with "Cannot perform an aggregate function on an expression containing an aggregate or a subquery". So you will have to use an inner and outer select instead.
select sum(value) as total
from
(
select
coalesce( case when b.rebill = 1 then b.rebillamt end , 0 ) +
coalesce( case when b.rebill = 0 then (select sum(bi.amount) from billitems bi where bi.billid = b.billid) end , 0 ) +
coalesce( case when b.rebill = 0 and b.hascopay = 1 then b.copayamt end , 0 ) as value
from procedures p
inner join bills b on b.procedureid = p.procedureid
where p.ProcedureID = @PROCEDUREID
and p.StatusID = 3
) allvalues;
You wouldn't even have to join table procedures with the bills table, but get the procedure id in an inner select. But I tried it with Serpiton's SQL fiddle (thanks to Serpiton for this) and T-SQL processes this slower than the join. You can try it anyhow. Maybe it is faster in your SQL Server version with your tables:
select sum(value) as total
from
(
select
coalesce( case when b.rebill = 1 then b.rebillamt end , 0 ) +
coalesce( case when b.rebill = 0 then (select sum(bi.amount) from billitems bi where bi.billid = b.billid) end , 0 ) +
coalesce( case when b.rebill = 0 and b.hascopay = 1 then b.copayamt end , 0 ) as value
from bills b
where b.procedureid =
(
select p.procedureid
from procedures p
where p.ProcedureID = @PROCEDUREID
and p.StatusID = 3
)
) allvalues;
EDIT: Here is one more option. Provided the given procedure id always exists and you only want to check if the status id is 3, then you can write the statement so that the bill select is only executed in the case of status id = 3. That doesn't have to be faster; it can even turn out to be slower. It's just one more option you can try.
select
case when p.StatusID = 3 then
(
select sum(value)
from
(
select
coalesce( case when b.rebill = 1 then b.rebillamt end , 0 ) +
coalesce( case when b.rebill = 0 then (select sum(bi.amount) from billitems bi where bi.billid = b.billid) end , 0 ) +
coalesce( case when b.rebill = 0 and b.hascopay = 1 then b.copayamt end , 0 ) as value
from bills b
where b.procedureid = p.procedureid
) allvalues
)
else
0
end as value
from procedures p
where p.ProcedureID = @PROCEDUREID;
Here is a bit of ASP code which, in context, builds a timeline of each product's sales performance, from the year the product first went on sale through the number of years it has been active. For example, a product published in 2000 might have its peak sales for a couple of years, but by 2004 there would be none. The code does this for all products in the system, grouping by year. (If you have questions I can answer them, though I think looking at the code gives a fair idea of what it's doing.)
First, it runs this query to get the data (which I want to change):
SELECT Products.ProductID, Products.AnticipatedSalesPattern, Convert(Char(10),Invoices.Date,103) As [date], Orders.Cost
FROM (Orders INNER JOIN Products ON Orders.ProductID = Products.ProductID)
INNER JOIN Invoices ON Orders.Invoice = Invoices.InvoiceNum
WHERE (Products.IsResource=1 AND Orders.Returned<>1 AND Orders.Cost<>0)
ORDER BY Products.ProductID, Invoices.Date;
Here is how it processes the data (in ASP). I know it's not efficient; it was just written to model sample data.
while not dbrecords.eof
RecordsCount = RecordsCount + 1
if dbrecords("ProductID") <> LastPID then
PIDsCount = PIDsCount + 1
LastPID = dbrecords("ProductID")
FirstSaleDate = dbrecords("Date")
if month(FirstSaleDate) < 9 then
FirstSchoolYear = Year(FirstSaleDate) - 1
else
FirstSchoolYear = Year(FirstSaleDate)
end if
end if
YearsSinceFirstSale = int(DateDiff("d",FirstSaleDate,dbrecords("Date"))/365)
MyArray(FirstSchoolYear-2000,YearsSinceFirstSale) = MyArray(FirstSchoolYear-2000,YearsSinceFirstSale) + dbrecords("Cost")
MyArrayTotals(FirstSchoolYear-2000) = MyArrayTotals(FirstSchoolYear-2000) + dbrecords("Cost")
TotalSales = TotalSales + dbrecords("Cost")
dbrecords.movenext
wend
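For readers who don't know classic ASP, the bucketing rule in the loop above can be restated compactly. This is a sketch in Python rather than the original ASP: the function names and the sample rows are made up, and a dictionary keyed by (first school year, years since first sale) replaces the 2-D array indexed by FirstSchoolYear-2000.

```python
from collections import defaultdict
from datetime import date


def first_school_year(first_sale: date) -> int:
    """A first sale before September counts toward the previous school year."""
    return first_sale.year - 1 if first_sale.month < 9 else first_sale.year


def sales_by_cohort(rows):
    """rows: (product_id, sale_date, cost) tuples sorted by product_id, then sale_date."""
    totals = defaultdict(int)  # (first school year, whole years since first sale) -> cost
    last_pid = None
    first_sale = None
    for product_id, sale_date, cost in rows:
        if product_id != last_pid:
            # First row of a new product: remember its first sale date.
            last_pid, first_sale = product_id, sale_date
        years_since = (sale_date - first_sale).days // 365
        totals[(first_school_year(first_sale), years_since)] += cost
    return dict(totals)


rows = [
    (1, date(2003, 10, 15), 100),  # first sale of product 1, school year 2003
    (1, date(2004, 11, 1), 40),    # a little over a year later
    (2, date(2003, 8, 1), 70),     # August sale, so school year 2002
]
result = sales_by_cohort(rows)
print(result)  # {(2003, 0): 100, (2003, 1): 40, (2002, 0): 70}
```

Like the ASP loop, this relies on the rows arriving ordered by ProductID and then by date.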
What I would like is to remove the whole step of putting the data into an array, and have the query return the yearly data directly.
What I'm currently stuck on is writing SQL that computes the years since first sale while keeping track of each product (how to implement that in SQL). I believe parameters are accepted via CTE statements...
Also, if you think this can be achieved more efficiently some other way, that would be great too.
Any help would be greatly appreciated...
So far I have got to the following.
It doesn't produce data identical to what the ASP produces. Please also let me know if there is a better way to do this.
DROP VIEW inline_view;
GO
CREATE VIEW inline_view AS
SELECT p2.ProductID, p2.AnticipatedSalesPattern, Invoices.Date, Orders.Cost, case when MONTH(date) < 9 then YEAR(date)-1 else YEAR(date) end as year,
(select top 1 case when MONTH(i.date) < 9 then YEAR(i.date)-1 else YEAR(i.date) end from Invoices i inner join Orders o on i.InvoiceNum=o.Invoice
inner join Products p on o.ProductID = p.ProductID where p2.ProductID = p.productID order by i.Date asc) as startsale
FROM (Orders INNER JOIN Products p2 ON Orders.ProductID = p2.ProductID)
INNER JOIN Invoices ON Orders.Invoice = Invoices.InvoiceNum
WHERE (p2.IsResource=1 AND Orders.Returned<>1 AND Orders.Cost<>0)
;
GO
SELECT * FROM (
select SUM(cost) sum, datediff(y, startsale, year) as year, startsale from inline_view
group by year, startsale
) as data
PIVOT
(
sum(sum)
--years after product is online
FOR year IN ([1], [2], [3],[4],[5],[6],[7],[8],[9], [10],[11], [12])
) as pvt;
I have this query over an Acquisitions table (incoming) and an Invoice table (outgoing). I am trying to calculate the value on hand by taking the AVG of dbo.tblAcqDetail.AcqPrice times the QtyOnHand, which is computed as Incoming - Outgoing. When I add a line item to the Acquisitions table that has a different cost for the same item, the AVG is not grouping and two line items are shown instead (example below). The shipment side works fine with multiple line items.
Product QtyIn QtyOut On_Hand AVGPrice Value_OnHand
Screws 100 30 70 25.0000 1750.0000
Nuts 50 10 40 40.0000 1600.0000
Nuts 100 10 90 50.0000 4500.0000
Bolts 100 20 80 100.000 8000.0000
SELECT DISTINCT
dbo.tblProduct.Product ,
SUM(DISTINCT dbo.tblAcqDetail.AcqQuantity) AS QtyIN ,
SUM(DISTINCT dbo.tblInvoiceDetail.InvQuantity) AS QtyOut ,
SUM(DISTINCT dbo.tblAcqDetail.AcqQuantity)
- SUM(DISTINCT dbo.tblInvoiceDetail.InvQuantity) AS On_Hand ,
dbo.tblAcqDetail.AcqPrice ,
dbo.tblAcqDetail.AcqPrice
* ( SUM(DISTINCT dbo.tblAcqDetail.AcqQuantity)
- SUM(DISTINCT dbo.tblInvoiceDetail.InvQuantity) ) AS Value_Hand
FROM dbo.tblAcq
INNER JOIN dbo.tblAcqDetail ON dbo.tblAcq.acqID = dbo.tblAcqDetail.AcqID
INNER JOIN dbo.tblProduct ON dbo.tblAcqDetail.ProductID = dbo.tblProduct.ProductID
INNER JOIN dbo.tblInvoiceDetail ON dbo.tblProduct.ProductID = dbo.tblInvoiceDetail.ProductID
INNER JOIN dbo.tblInvoice ON dbo.tblInvoiceDetail.InvoiceID = dbo.tblInvoice.InvoiceID
GROUP BY dbo.tblProduct.Product ,
dbo.tblAcqDetail.AcqPrice
Building on PinnyM's answer: you don't need DISTINCT. I rewrote your query as follows, using table aliases:
SELECT
P.Product ,
SUM( AcD.AcqQuantity) AS QtyIN ,
SUM( InD.InvQuantity) AS QtyOut ,
SUM( AcD.AcqQuantity)
- SUM( InD.InvQuantity) AS On_Hand ,
AcD.AcqPrice ,
AcD.AcqPrice
* ( SUM(AcD.AcqQuantity)
- SUM( InD.InvQuantity) ) AS Value_Hand
FROM dbo.tblAcq Ac
INNER JOIN dbo.tblAcqDetail AcD ON Ac.acqID = AcD.AcqID
INNER JOIN dbo.tblProduct P ON AcD.ProductID = P.ProductID
INNER JOIN dbo.tblInvoiceDetail InD ON P.ProductID = InD.ProductID
INNER JOIN dbo.tblInvoice Inv ON InD.InvoiceID = Inv.InvoiceID
GROUP BY P.Product ,
AcD.AcqPrice
Reading this query, I don't understand why you need the table dbo.tblInvoice; it is not part of the aggregation.
The reason you still see the same product more than once is that you group by two columns, P.Product and AcD.AcqPrice, not only by Product; in your result you can see that each combination is unique.
To be mathematically accurate, you should not be using SUM(DISTINCT fieldname), but just SUM(fieldname). Otherwise, it will eliminate entries that happen to have the same quantity.
For that matter, you shouldn't be using DISTINCT at the beginning of your query either, GROUP BY already handles that.
If you believe you have duplicate rows being returned by your JOINs (which you shouldn't really if you're doing it right), wrap them in a subquery using DISTINCT before trying to aggregate.
As an example, a subquery to eliminate duplicates can be written like so:
SELECT
Product ,
SUM(AcqQuantity) AS QtyIN ,
SUM(InvQuantity) AS QtyOut ,
SUM(AcqQuantity)
- SUM(InvQuantity) AS On_Hand ,
AcqPrice ,
AcqPrice
* ( SUM(AcqQuantity)
- SUM(InvQuantity) ) AS Value_Hand
FROM (SELECT DISTINCT
dbo.tblProduct.Product ,
dbo.tblAcqDetail.AcqQuantity,
dbo.tblInvoiceDetail.InvQuantity,
dbo.tblAcqDetail.AcqPrice
FROM
dbo.tblAcqDetail
INNER JOIN dbo.tblProduct ON dbo.tblAcqDetail.ProductID = dbo.tblProduct.ProductID
INNER JOIN dbo.tblInvoiceDetail ON dbo.tblProduct.ProductID = dbo.tblInvoiceDetail.ProductID
INNER JOIN dbo.tblInvoice ON dbo.tblInvoiceDetail.InvoiceID = dbo.tblInvoice.InvoiceID ) productInfo
GROUP BY Product, AcqPrice
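The point above about SUM(DISTINCT ...) dropping rows that happen to share a quantity is easy to check. A minimal sketch using an in-memory SQLite database from Python; the table name acq_detail and its three rows are made up for illustration, and SQLite is only a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE acq_detail (product TEXT, qty INTEGER);
    -- Two separate acquisitions of 10 units each, plus one of 30.
    INSERT INTO acq_detail VALUES ('Nuts', 10), ('Nuts', 10), ('Nuts', 30);
""")

plain, distinct = conn.execute(
    "SELECT SUM(qty), SUM(DISTINCT qty) FROM acq_detail WHERE product = 'Nuts'"
).fetchone()

print(plain)     # 50: every row counted
print(distinct)  # 40: the second 10 is silently discarded
```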
First, I will explain what is being captured. Users have a member level associated with their accounts (Bronze, Gold, Diamond, etc.). A nightly job needs to run to calculate the orders from today back one year. If the order total for a given user goes over or under a certain amount, their level is upgraded or downgraded. The table where the level information is stored will not change much, but the minimum and maximum amount thresholds may change over time. This is what the table looks like:
CREATE TABLE [dbo].[MemberAdvantageLevels] (
[Id] int NOT NULL IDENTITY(1,1) ,
[Name] varchar(255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[MinAmount] int NOT NULL ,
[MaxAmount] int NOT NULL ,
CONSTRAINT [PK__MemberAd__3214EC070D9DF1C7] PRIMARY KEY ([Id])
)
ON [PRIMARY]
GO
I wrote a query that will group the orders by user for the year to date. The query includes their current member level.
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.UserProfile.UserId) AS UserOrders,
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.UserProfile ON dbo.tbh_Orders.CustomerID = dbo.UserProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.UserProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
So, I need to check the OrdersTotal and if it exceeds the current level threshold, I then need to find the Level that fits their current order total and create a new record with their new level.
So for example, let's say jon@doe.com is currently at Bronze. The MinAmount for Bronze is 0 and the MaxAmount is 999. Currently his orders for the year are at $2500. I need to find the level that $2500 fits within and upgrade his account. I also need to check the LevelAchievmentDate, and if it is outside of the current year we may need to demote the user if there has been no activity.
I was thinking I could create a temp table that holds all the levels and then somehow use a CASE expression in the query above to determine the new level. I don't know if that is possible. Or is it better to iterate over my order results and perform additional queries? If I use the iteration pattern, I know I can use the WHILE statement to iterate over the rows.
Update
I updated my query a bit and so far came up with this, but I may need more information than just the Id from the subquery:
Select * into #memLevels from MemberAdvantageLevels
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.AZProfile.UserId) AS UserOrders,
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AZProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId,
(Select Id from #memLevels where Sum(dbo.tbh_Orders.SubTotal) >= #memLevels.MinAmount and Sum(dbo.tbh_Orders.SubTotal) <= #memLevels.MaxAmount) as NewLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.AZProfile ON dbo.tbh_Orders.CustomerID = dbo.AZProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.AZProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AzProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
This hasn't been syntax-checked or tested, but it should handle the inserts and updates you describe. The insert can be done as a single statement using a derived/virtual table which contains the grouped orders calculation. Note that both the insert and update statements should be done within the same transaction to ensure that no two records for the same user can end up with IsCurrent = 1.
INSERT UserMemberAdvantageLevels (UserId, MemberAdvantageLevelId, IsCurrent,
LevelAchiementAmount, LevelAchievmentDate)
SELECT t.UserId, mal.Id, 1, t.OrderTotals, GETDATE()
FROM
(SELECT ulp.UserId, SUM(ord.SubTotal) OrderTotals, COUNT(ulp.UserId) UserOrders
FROM UserLevelProfile ulp
INNER JOIN tbh_Orders ord ON (ord.CustomerId = ulp.UserId)
WHERE ord.StatusID = 4
AND ord.AddedDate BETWEEN DATEADD(year,-1,GETDATE()) AND GETDATE()
GROUP BY ulp.UserId) AS t
INNER JOIN MemberAdvantageLevels mal
ON (t.OrderTotals BETWEEN mal.MinAmount AND mal.MaxAmount)
-- Left join needed on next line in case user doesn't currently have a level
LEFT JOIN UserMemberAdvantageLevels umal ON (umal.UserId = t.UserId)
WHERE umal.MemberAdvantageLevelId IS NULL -- First time user has been awarded a level
OR (mal.Id <> umal.MemberAdvantageLevelId -- Level has changed
AND (t.OrderTotals > umal.LevelAchiementAmount -- Achievement has increased (promotion)
OR t.UserOrders = 0)) -- No. of orders placed is zero (demotion)
/* Reset IsCurrent flag where new record has been added */
UPDATE umal1
SET IsCurrent = 0
FROM UserMemberAdvantageLevels umal1
INNER JOIN UserMemberAdvantageLevels umal2 ON (umal2.UserId = umal1.UserId)
WHERE umal1.IsCurrent = 1
AND umal2.IsCurrent = 1
AND umal1.LevelAchievmentDate < umal2.LevelAchievmentDate
One approach:
with cte as
(SELECT Sum(o.SubTotal) AS OrderTotals,
Count(p.UserId) AS UserOrders,
p.UserId,
p.UserName,
p.Email,
l.Name,
l.MinAmount,
l.MaxAmount,
ul.LevelAchievmentDate,
ul.LevelAchiementAmount,
ul.IsCurrent as IsCurrentLevel,
l.Id as MemberLevelId
FROM dbo.tbh_Orders o
INNER JOIN dbo.UserProfile p ON o.CustomerID = p.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ul ON p.UserId = ul.UserId
INNER JOIN dbo.MemberAdvantageLevels l ON ul.MemberAdvantageLevelId = l.Id
WHERE o.StatusID = 4 AND
o.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE() and
IsCurrent = 1
GROUP BY
p.UserId, p.UserName, p.Email, l.Name, l.MinAmount, l.MaxAmount,
ul.LevelAchievmentDate, ul.LevelAchiementAmount, ul.IsCurrent, l.Id)
select cte.*, ml.*
from cte
join #memLevels ml
on cte.OrderTotals >= ml.MinAmount and cte.OrderTotals <= ml.MaxAmount
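The range join at the end is what picks the new level. Here is a stripped-down sketch of just that step, run on an in-memory SQLite database from Python. The user_totals table is a hypothetical stand-in for the CTE's output, and only the Bronze range (0 to 999) comes from the question; the Gold and Diamond thresholds are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MemberAdvantageLevels (Id INTEGER PRIMARY KEY, Name TEXT,
                                        MinAmount INTEGER, MaxAmount INTEGER);
    -- Hypothetical stand-in for the per-user totals produced by the CTE.
    CREATE TABLE user_totals (UserId INTEGER, CurrentLevelId INTEGER, OrderTotals INTEGER);
    INSERT INTO MemberAdvantageLevels VALUES
        (1, 'Bronze', 0, 999), (2, 'Gold', 1000, 4999), (3, 'Diamond', 5000, 999999);
    INSERT INTO user_totals VALUES (7, 1, 2500), (8, 2, 1200);
""")

# Join each user's total to the level whose [MinAmount, MaxAmount] range
# contains it, and keep only users whose level would change.
changes = conn.execute("""
    SELECT t.UserId, cur_lvl.Name, new_lvl.Name
    FROM user_totals t
    JOIN MemberAdvantageLevels cur_lvl ON cur_lvl.Id = t.CurrentLevelId
    JOIN MemberAdvantageLevels new_lvl
      ON t.OrderTotals BETWEEN new_lvl.MinAmount AND new_lvl.MaxAmount
    WHERE new_lvl.Id <> cur_lvl.Id
    ORDER BY t.UserId
""").fetchall()

print(changes)  # [(7, 'Bronze', 'Gold')]
```

This only gives one row per user if the level ranges do not overlap; with overlapping ranges a user would match several levels.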