SQL Server : query with incorrect SUM result


My SQL Server query is supposed to get a count of each customer's number of orders, and the SUM of their reward points. For most customers the result is accurate (most people only have one or two orders). For a few people, the result is wildly off.
Here's the original query:
SELECT
c.email,
c.lastlogindate,
c.custenabled,
c.maillist,
d.GroupName,
COUNT(o.orderid) AS orders,
SUM(r.points) AS total_points
FROM
((customers c
LEFT JOIN orders o ON (c.contactid = o.ocustomerid AND o.ostep = 'step 5')
)
LEFT JOIN discount_group d ON c.discount = d.id
)
LEFT JOIN
customer_rewards r ON r.contactid = c.contactid
WHERE
c.last_update > '2014-02-01'
OR c.lastlogindate > '2014-02-01'
GROUP BY
c.email, c.custenabled, c.maillist, c.lastlogindate, d.GroupName;
For one example, customerid 1234 has placed 21 orders, totaling 2724 points. The query reports that he has placed 441 orders (21 * 21) valued at 57204 points (2724 * 21). The raw data is fine, but each order row is being duplicated by the number of orders they placed (but not for most customers...).
If I change the query to this:
SELECT
o.orderid,
c.email,
COUNT(o.orderid) AS orders,
SUM(r.points) AS total_points
FROM
((customers c
INNER JOIN orders o ON (c.contactid = o.ocustomerid AND o.ostep = 'step 5')
)
)
INNER JOIN
customer_rewards r ON r.contactid = c.contactid
WHERE
c.last_update > '2014-02-01'
OR c.lastlogindate > '2014-02-01'
GROUP BY
c.email, o.orderid;
The aggregate functions are calculated properly, but it will display one result for each order placed. So it will show "Customer 1234/21 orders/2724 points", 21 times.
I did remove the 'discount_group' join in the second query, but that was just to make it easier to read and change. That hasn't had any effect on results.

Here is a solution using common table expressions to aggregate your results.
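Note: this will not show customers that have 0 orders or 0 reward points. If you would like to show these, change the INNER JOINs to LEFT JOINs (the missing counts and totals will then come back as NULL, which you can wrap in COALESCE to get 0).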
WITH cteOrders AS
(
SELECT o.ocustomerid, orderCount = count(*)
FROM orders o
WHERE o.ostep = 'step 5'
GROUP BY o.ocustomerid
)
, cteRewards as
(
SELECT cr.contactid, total_points = SUM(cr.points)
FROM customer_rewards cr
GROUP BY cr.contactid
)
SELECT
c.email,
o.orderCount as orders,
r.total_points
FROM
customers c
INNER JOIN cteOrders o ON c.contactid = o.ocustomerid
INNER JOIN cteRewards r ON r.contactid = c.contactid
WHERE
c.last_update > '2014-02-01'
OR c.lastlogindate > '2014-02-01'
;
Or using subqueries:
SELECT
c.email,
o.orderCount as orders,
r.total_points
FROM
customers c
INNER JOIN
(
SELECT o.ocustomerid, orderCount = count(*)
FROM orders o
WHERE o.ostep = 'step 5'
GROUP BY o.ocustomerid
) o ON c.contactid = o.ocustomerid
INNER JOIN
(
SELECT cr.contactid, total_points = SUM(cr.points)
FROM customer_rewards cr
GROUP BY cr.contactid
) r ON r.contactid = c.contactid
WHERE
c.last_update > '2014-02-01'
OR c.lastlogindate > '2014-02-01'
;
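To see why pre-aggregating fixes the numbers, here is a minimal sketch that reproduces the effect on a tiny in-memory SQLite database driven from Python. The schema is cut down to the columns that matter, the sample rows are invented for illustration, and SQLite syntax (`COUNT(*) AS ...`) stands in for the T-SQL `alias = expr` form; the join logic is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (contactid INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE orders (orderid INTEGER PRIMARY KEY, ocustomerid INTEGER, ostep TEXT);
CREATE TABLE customer_rewards (contactid INTEGER, points INTEGER);
-- hypothetical sample data: one customer, 3 orders, 3 reward rows (60 points)
INSERT INTO customers VALUES (1, 'a@example.com');
INSERT INTO orders VALUES (1, 1, 'step 5'), (2, 1, 'step 5'), (3, 1, 'step 5');
INSERT INTO customer_rewards VALUES (1, 10), (1, 20), (1, 30);
""")

# Joining both child tables before aggregating pairs every order
# with every reward row: 3 x 3 = 9 rows per customer.
naive = conn.execute("""
SELECT c.email, COUNT(o.orderid), SUM(r.points)
FROM customers c
LEFT JOIN orders o ON c.contactid = o.ocustomerid AND o.ostep = 'step 5'
LEFT JOIN customer_rewards r ON r.contactid = c.contactid
GROUP BY c.email
""").fetchall()

# Aggregating each child table to one row per customer first
# leaves nothing to multiply.
fixed = conn.execute("""
WITH cte_orders AS (
    SELECT ocustomerid, COUNT(*) AS order_count
    FROM orders WHERE ostep = 'step 5' GROUP BY ocustomerid
), cte_rewards AS (
    SELECT contactid, SUM(points) AS total_points
    FROM customer_rewards GROUP BY contactid
)
SELECT c.email, o.order_count, r.total_points
FROM customers c
JOIN cte_orders o ON o.ocustomerid = c.contactid
JOIN cte_rewards r ON r.contactid = c.contactid
""").fetchall()

print(naive)  # [('a@example.com', 9, 180)]
print(fixed)  # [('a@example.com', 3, 60)]
```

The inflated figures follow the same pattern as in the question: the order count is multiplied by the number of reward rows, and the point total by the number of orders.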

Related

How to select distinct items without having to use in group by clause?

I am trying to find sum of some columns using SQL like this:
select distinct c.customer,
c.customer_id,
sum(d.delay) as delay,
sum(d.delayed_amount) as delay_amt,
pd.product
from product pd
inner join mfg_company mfg on pd.product_id=mfg.product_id
inner join store s on mfg.store_id = s.store_id
inner join customer c on s.customer = c.customer_id
join delay_detail d on pd.product_id = d.material
where d.product_mfg_id = 466
group by c.customer,customer_id
order by c.customer,c.customer_id
The problem is that mfg_company has duplicate product_ids (multiple mappings), so when I try to find the sum it includes those duplicates too.
Using product_id in the group by clause doesn't give me the result I want to see. So how can I join only on distinct product_ids?

You can try the query below, which joins to the distinct (product_id, store_id) pairs of mfg_company instead of the raw table:
select distinct c.customer
,c.customer_id
,sum(d.delay) as delay
,sum(d.delayed_amount) as delay_amt
,pd.product
from product pd
inner join (select distinct product_id
,store_id
from mfg_company) mfg on pd.product_id=mfg.product_id
inner join store s on mfg.store_id = s.store_id
inner join customer c on s.customer = c.customer_id
join delay_detail d on pd.product_id = d.material
where d.product_mfg_id = 466
group by c.customer,customer_id
order by c.customer,c.customer_id
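As a rough check of the idea, the sketch below runs the same join twice against an in-memory SQLite database from Python: once against the raw mfg_company table and once against its distinct rows. The sample data is made up, and the product column is left out of the select list so that the GROUP BY is valid as written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER, product TEXT);
CREATE TABLE mfg_company (product_id INTEGER, store_id INTEGER);
CREATE TABLE store (store_id INTEGER, customer INTEGER);
CREATE TABLE customer (customer_id INTEGER, customer TEXT);
CREATE TABLE delay_detail (material INTEGER, delay INTEGER,
                           delayed_amount INTEGER, product_mfg_id INTEGER);
-- hypothetical sample data; the mfg_company mapping is duplicated on purpose
INSERT INTO product VALUES (1, 'widget');
INSERT INTO mfg_company VALUES (1, 10), (1, 10);
INSERT INTO store VALUES (10, 100);
INSERT INTO customer VALUES (100, 'Acme');
INSERT INTO delay_detail VALUES (1, 5, 50, 466), (1, 3, 30, 466);
""")

# {mfg} is filled with either the raw table or a de-duplicated derived table.
QUERY = """
SELECT c.customer, c.customer_id, SUM(d.delay), SUM(d.delayed_amount)
FROM product pd
JOIN {mfg} mfg ON pd.product_id = mfg.product_id
JOIN store s ON mfg.store_id = s.store_id
JOIN customer c ON s.customer = c.customer_id
JOIN delay_detail d ON pd.product_id = d.material
WHERE d.product_mfg_id = 466
GROUP BY c.customer, c.customer_id
"""

inflated = conn.execute(QUERY.format(mfg="mfg_company")).fetchall()
deduped = conn.execute(QUERY.format(
    mfg="(SELECT DISTINCT product_id, store_id FROM mfg_company)"
)).fetchall()

print(inflated)  # [('Acme', 100, 16, 160)]
print(deduped)   # [('Acme', 100, 8, 80)]
```

Note that this only removes exact duplicates of the selected columns. If one product maps to several different stores of the same customer, the sums would still be multiplied.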

I think the solution to your problem is to pre-aggregate the delays. It is entirely unclear if you want the product in the result set. Assuming you do not:
select c.customer, c.customer_id,
sum(d.delay) as delay, sum(d.delay_amt) as delay_amt
from product pd join
mfg_company mfg
on pd.product_id = mfg.product_id join
store s
on mfg.store_id = s.store_id join
customer c
on s.customer = c.customer_id join
(select d.material, sum(d.delay) as delay, sum(d.delayed_amount) as delay_amt
from delay_detail d
where d.product_mfg_id = 466
group by d.material
) d
on pd.product_id = d.material
group by c.customer, c.customer_id
order by c.customer, c.customer_id;
Note that using select distinct with group by is almost never needed.

How to query MAX(SUM(relation)) in Postgresql?

I have read several related threads on StackOverflow but none of them solves my problem.
I have a Sales database in which I need to query for the customer who spent the most.
For that, I need to find how much each customer spent, using
SELECT sum(qty*rate)
AS exp from salesdetails as s JOIN sales as ss on (ss.invno = s.invno)
JOIN customer as c ON (ss.customerno = c.custno) GROUP BY(c.name)
ORDER BY sum(qty*rate);
It returns a table with the amount each person spent, in ascending order.
What I actually need is to print only the tuple for which sum(qty*rate) is the maximum. Currently I'm getting that result with a subquery:
SELECT name, sum(qty*rate) FROM salesdetails as s JOIN sales as ss on (ss.invno=s.invno)
JOIN customer as c ON (ss.customerno = c.custno) GROUP BY(c.name)
HAVING sum(qty*rate) IN (SELECT max(exp) FROM (SELECT sum(qty*rate)
AS exp from salesdetails as s JOIN sales as ss on (ss.invno = s.invno)
JOIN customer as c ON (ss.customerno = c.custno) GROUP BY(c.name) ORDER BY sum(qty*rate)) aa);
Is there any shorter way to get to the output?

Are you looking for something like this?
select *
from (
SELECT c.Name, sum(qty*rate)
AS exp from salesdetails as s JOIN sales as ss on (ss.invno = s.invno)
JOIN customer as c ON (ss.customerno = c.custno)
GROUP BY(c.name)
) t
order by exp desc
limit 1;

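You want row_number() here. distinct on (c.name) does not help, because after GROUP BY c.name each name already appears only once:
SELECT name, exp
FROM (SELECT c.name, sum(qty*rate) AS exp,
row_number() OVER (ORDER BY sum(qty*rate) DESC) AS rn
FROM salesdetails s JOIN
sales ss
on (ss.invno = s.invno) JOIN
customer c
ON (ss.customerno = c.custno)
GROUP BY c.name
) t
WHERE rn = 1;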
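One difference between the approaches is worth checking before picking one: a "top 1" query returns a single row even when two customers tie for the maximum, whereas comparing against MAX returns all of them. The sketch below illustrates this with Python's sqlite3 module on invented data; the table layout is guessed from the question, and the CTE form is just one shorter way to write the asker's HAVING query, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (custno INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (invno INTEGER PRIMARY KEY, customerno INTEGER);
CREATE TABLE salesdetails (invno INTEGER, qty INTEGER, rate INTEGER);
-- hypothetical sample data: Ann and Bob both spend 100, Cy spends 30
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO sales VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO salesdetails VALUES (1, 2, 50), (2, 1, 100), (3, 1, 30);
""")

# Shared CTE: one row per customer with the total spent.
TOTALS = """
WITH totals AS (
    SELECT c.name, SUM(s.qty * s.rate) AS spent
    FROM salesdetails s
    JOIN sales ss ON ss.invno = s.invno
    JOIN customer c ON ss.customerno = c.custno
    GROUP BY c.name
)
"""

# Sort and keep the first row: exactly one row, ties are dropped.
top_one = conn.execute(
    TOTALS + "SELECT name, spent FROM totals ORDER BY spent DESC, name LIMIT 1"
).fetchall()

# Compare against the maximum: every customer tied for first place.
all_top = conn.execute(
    TOTALS + "SELECT name, spent FROM totals "
             "WHERE spent = (SELECT MAX(spent) FROM totals) ORDER BY name"
).fetchall()

print(top_one)  # [('Ann', 100)]
print(all_top)  # [('Ann', 100), ('Bob', 100)]
```

Both statements should also run unchanged on PostgreSQL, which supports the same WITH and LIMIT syntax.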

Finding count in multiple tables

I need to find the number of times a customer appears in the Orders and Requests tables respectively. However, this script produces the same count value in both places where COUNT is used. The values cannot possibly be the same, so what am I doing wrong?
SELECT o.CustomerID,
COUNT(o.CustomerID) as OrdersPerCustomer,
COUNT(r.CustomerID) as RequestsPerCustomer
FROM Orders o
INNER JOIN [Customers] c on c.ID = o.CustomerID
INNER JOIN [Request] r on r.CustomerID = c.ID
GROUP BY o.CustomerID

You are multiplying the number of order and request records. By joining the tables you get, for a customer with, say, 3 orders and 4 requests, 12 result rows. As the IDs are never null in a joined row, COUNT(o.CustomerID) and COUNT(r.CustomerID) are both just COUNT(*) (12 in my example, and not 3 and 4 as you expected).
The easiest approach:
select
c.id as customer_id,
(select count(*) from orders o where o.customerid = c.id) as o_count,
(select count(*) from request r where r.customerid = c.id) as r_count
from customers c;
The same with subqueries (derived tables) in the from clause:
select
c.id as customer_id,
coalesce(o.total, 0) as o_count,
coalesce(r.total, 0) as r_count
from customers c
left join (select customerid, count(*) as total from orders group by customerid) o
on o.customerid = c.id
left join (select customerid, count(*) as total from request group by customerid) r
on r.customerid = c.id;
When aggregating from multiple tables, always aggregate first and then join.
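The 3 x 4 = 12 example can be reproduced directly. This is a small sketch using Python's sqlite3 module with made-up rows; a second customer without any orders or requests is added to show what the LEFT JOIN plus COALESCE variant returns for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY);
CREATE TABLE orders (customerid INTEGER);
CREATE TABLE request (customerid INTEGER);
-- hypothetical sample data: customer 1 has 3 orders and 4 requests,
-- customer 2 has neither
INSERT INTO customers VALUES (1), (2);
INSERT INTO orders VALUES (1), (1), (1);
INSERT INTO request VALUES (1), (1), (1), (1);
""")

# Join first, count afterwards: both counts see the 12 combined rows.
joined = conn.execute("""
SELECT o.customerid, COUNT(o.customerid), COUNT(r.customerid)
FROM orders o
JOIN customers c ON c.id = o.customerid
JOIN request r ON r.customerid = c.id
GROUP BY o.customerid
""").fetchall()

# Count per table first, then join the one-row-per-customer results.
aggregated = conn.execute("""
SELECT c.id, COALESCE(o.total, 0), COALESCE(r.total, 0)
FROM customers c
LEFT JOIN (SELECT customerid, COUNT(*) AS total
           FROM orders GROUP BY customerid) o ON o.customerid = c.id
LEFT JOIN (SELECT customerid, COUNT(*) AS total
           FROM request GROUP BY customerid) r ON r.customerid = c.id
ORDER BY c.id
""").fetchall()

print(joined)      # [(1, 12, 12)]
print(aggregated)  # [(1, 3, 4), (2, 0, 0)]
```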

Oracle: How to use left outer join to get all entries from left table and satisfying the condition in Where clause

I have the tables below.
Client:
ID  clientName
--------------
1   A1
2   A2
3   A3
Order:
OrdID  clientID  status_cd
--------------------------
100    1         DONE
101    1         SENT
102    3         SENT
Status:
status_cd  status_category
--------------------------
DONE       COMPL
SENT       INPROG
I have to write a query that returns all the clients together with the count of their orders whose status category is "COMPL", whether or not the client ID exists in the Order table.
I am using the query below, but it filters out the clients that have no orders. I want to get all clients, so the expected result is as below.
Query:
select c.ID, count(distinct o.OrdID)
from client c, order o, status s
where c.ID=o.client_id(+)
and o.status_cd=s.status_cd where s.status_category='COMPL'
group by c.ID
Expected result:
C.ID count(distinct o.OrdID)
----------------------------
1 1
2 0
3 0
Can someone please help me with this? I know, in this case, left outer join is behaving like inner join when I am using where clause, but is there any other way to achieve the results above?

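This is a lot easier to deal with when using an explicit join operator:
select c.ID, count(s.status_cd)
from client c
left join orders o on o.clientid = c.id
left join status s on s.status_cd = o.status_cd and s.status_category='COMPL'
group by c.ID;
The above assumes that orders.status_cd is defined as not null and that status.status_cd is unique. It counts s.status_cd without distinct, so each order in the COMPL category is counted once; adding distinct would count status codes rather than orders.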
Another option is to move the join between orders and status in a derived table:
select c.ID, count(distinct o.ordid)
from client c
left join (
select o.ordid, o.clientid
from orders o
join status s on s.status_cd = o.status_cd
where s.status_category='COMPL'
) o on o.clientid = c.id
group by c.ID;
The above "states" more clearly (at least in my eyes) than the first solution that only orders within that status category are of interest.
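The difference between filtering in the WHERE clause and filtering in the join condition can be checked against the sample data from the question. The sketch below uses Python's sqlite3 module rather than Oracle, names the table orders because ORDER is a reserved word, and follows the column names of the sample tables; treat it as an illustration of the join semantics only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client (id INTEGER PRIMARY KEY, clientname TEXT);
CREATE TABLE orders (ordid INTEGER PRIMARY KEY, clientid INTEGER, status_cd TEXT);
CREATE TABLE status (status_cd TEXT PRIMARY KEY, status_category TEXT);
INSERT INTO client VALUES (1, 'A1'), (2, 'A2'), (3, 'A3');
INSERT INTO orders VALUES (100, 1, 'DONE'), (101, 1, 'SENT'), (102, 3, 'SENT');
INSERT INTO status VALUES ('DONE', 'COMPL'), ('SENT', 'INPROG');
""")

# Filter in WHERE: rows whose status columns are NULL after the outer
# join fail the test, so clients 2 and 3 disappear.
where_filter = conn.execute("""
SELECT c.id, COUNT(o.ordid)
FROM client c
LEFT JOIN orders o ON o.clientid = c.id
LEFT JOIN status s ON s.status_cd = o.status_cd
WHERE s.status_category = 'COMPL'
GROUP BY c.id ORDER BY c.id
""").fetchall()

# Filter in ON: every client survives; only COMPL orders find a status
# row, and COUNT(s.status_cd) ignores the NULLs of the others.
on_filter = conn.execute("""
SELECT c.id, COUNT(s.status_cd)
FROM client c
LEFT JOIN orders o ON o.clientid = c.id
LEFT JOIN status s ON s.status_cd = o.status_cd
                  AND s.status_category = 'COMPL'
GROUP BY c.id ORDER BY c.id
""").fetchall()

print(where_filter)  # [(1, 1)]
print(on_filter)     # [(1, 1), (2, 0), (3, 0)]
```

The second result matches the expected output in the question.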

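As usual, there are lots of ways to express this requirement.
The ANSI join people will hate me and vote this answer down ;) but you can keep the Oracle (+) syntax:
-- every condition on status carries (+); counting s.status_cd
-- counts only the orders that matched a COMPL status
select c.ID, count(s.status_cd)
from client c, order o, status s
where c.ID = o.client_id(+)
and o.status_cd = s.status_cd(+)
and s.status_category(+) = 'COMPL'
group by c.ID
;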
or
select c.ID
, nvl((select count(distinct o.OrdID)
from order o, status s
where c.ID = o.client_id
and o.status_cd = s.status_cd
and s.status_category='COMPL'
), 0) as order_count
from client c
group by c.ID
;
or
with ord as
(select client_id, count(distinct o.OrdID) cnt
from order o, status s
where 1=1
and o.status_cd = s.status_cd
and s.status_category='COMPL'
group by client_id
)
select c.ID
, nvl((select cnt from ord o where c.ID = o.client_id ), 0) as order_count
from client c
group by c.ID
;
or
...

The second WHERE should be an AND.
Other than that, you need the plus sign, (+), marking left outer join, in the second join condition as well, and also on the status_category filter. It is not enough to left-outer-join the first two tables. Because the outer join then keeps every order, count a column of status so that only orders in the COMPL category are counted.
Something like
select c.ID, count(s.status_cd)
from client c, order o, status s
where c.ID=o.client_id(+)
and o.status_cd=s.status_cd(+) AND s.status_category(+)='COMPL'
-- (+) on both status conditions, and AND instead of the second WHERE
group by c.ID
Of course, it would be much better if you used proper (SQL Standard) join syntax.

How to calculate a total of values from other tables in a column of the select statement

I have three tables:
CustOrder: id, CreateDate, Status
DenominationOrder: id, DenID, OrderID
Denomination: id, amount
I want to create a view based on all these tables, with an additional column, Total, that holds the sum of the amounts for each order.
e.g.
order 1 total denominations 3, total amount = 250+250+250=750
order 2 total denominations 2, total amount = 250+250=500
Is it possible?

I'm guessing at your table relations (and the data too, since you did not provide any sample):
SELECT co.id,
COUNT(do.DenID) AS `Total denominations`,
SUM(d.amount) AS `Total amount`
FROM CustOrder co
INNER JOIN DenominationOrder do ON co.id = do.OrderId
INNER JOIN Denomination d ON do.DenId = d.id
GROUP BY co.id

Try this:
SELECT o.CreateDate, COUNT(o.id), SUM(d.amount) AS 'Total Amount'
FROM CustOrder o
INNER JOIN DenominationOrder do ON o.id = do.OrderID
INNER JOIN Denomination d ON do.DenId = d.id
GROUP BY o.CreateDate
Another way to do this is with a CTE, like this:
;WITH CustomersTotalOrders
AS
(
SELECT o.id, SUM(d.amount) AS 'TotalAmount'
FROM CustOrder o
INNER JOIN DenominationOrder do ON o.id = do.OrderID
INNER JOIN Denomination d ON do.DenId = d.id
GROUP BY o.id
)
SELECT o.id, COUNT(ot.id) AS 'Orders Count', ot.TotalAmount
FROM CustOrder o
INNER JOIN CustomersTotalOrders ot on o.id = ot.id
INNER JOIN DenominationOrder do ON ot.id = do.OrderID
INNER JOIN Denomination d ON do.DenId = d.id
GROUP BY o.id, ot.TotalAmount
This will give you:
id | Orders Count | Total Amount
-------+---------------+-------------
1 3 750
2 2 500
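Since the question asks for a view, here is a minimal sketch of how the per-order total could be wrapped in one, run against an in-memory SQLite database from Python. The view name OrderTotals, the column aliases, and the sample rows are assumptions made for the example; the table relations are the same guess as in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustOrder (id INTEGER PRIMARY KEY, CreateDate TEXT, Status TEXT);
CREATE TABLE Denomination (id INTEGER PRIMARY KEY, amount INTEGER);
CREATE TABLE DenominationOrder (id INTEGER PRIMARY KEY, DenID INTEGER, OrderID INTEGER);
-- hypothetical sample data: order 1 has three 250 denominations, order 2 has two
INSERT INTO CustOrder VALUES (1, '2013-01-01', 'open'), (2, '2013-01-02', 'open');
INSERT INTO Denomination VALUES (1, 250);
INSERT INTO DenominationOrder VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 2), (5, 1, 2);

-- OrderTotals is a hypothetical name; one row per order with its count and sum
CREATE VIEW OrderTotals AS
SELECT co.id,
       co.CreateDate,
       co.Status,
       COUNT(den_ord.DenID) AS total_denominations,
       SUM(d.amount) AS total_amount
FROM CustOrder co
JOIN DenominationOrder den_ord ON co.id = den_ord.OrderID
JOIN Denomination d ON den_ord.DenID = d.id
GROUP BY co.id, co.CreateDate, co.Status;
""")

totals = conn.execute(
    "SELECT id, total_denominations, total_amount FROM OrderTotals ORDER BY id"
).fetchall()

print(totals)  # [(1, 3, 750), (2, 2, 500)]
```

Because the joins are inner joins, an order without any denominations would not appear in the view; a LEFT JOIN would keep it with a NULL total.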
