Left join returning bad values - sql

I'm not very good with SQL queries but I attempted to write this one:
SELECT DATEPART(YY,Orders.OrderDate) as Year,
DATEPART(MM,Orders.OrderDate) as Month,
(SUM(case when OrderDetails.ProductCode = 'XXX' then
OrderDetails.ProductPrice else 0 end) + SUM(Orders.Total))
AS XXX
FROM Orders
LEFT JOIN OrderDetails ON Orders.OrderID = OrderDetails.OrderID
WHERE Orders.OrderStatus = 'Shipped'
GROUP BY DATEPART(MM,Orders.OrderDate), DATEPART(YY,Orders.OrderDate)
ORDER BY DATEPART(YY,Orders.OrderDate),DATEPART(MM,Orders.OrderDate)
The OrderDetails is linked to the Orders table by the field OrderID. In this SELECT query I'm trying to get the SUM of OrderDetails.ProductPrice when the OrderDetails.ProductCode is XXX and add it to the Orders.Total to get total amounts for each month/year.
The query works except for one problem (probably either an amateur mistake or something that has been worked around many times). With the LEFT JOIN, the OrderDetails table can have multiple records linked to a single Orders row, so Orders.Total is counted once per detail row and SUM(Orders.Total) comes out too high. I've isolated that issue, but I can't seem to fix it.
Can anybody point me in the right direction?

If we assume that the XXX product only appears at most once for each order, then this should work:
SELECT year(o.OrderDate) as Year, month(o.OrderDate) as Month,
(COALESCE(SUM(od.ProductPrice), 0) + SUM(o.Total)) AS XXX
FROM Orders o LEFT JOIN
OrderDetails od
ON o.OrderID = od.OrderID AND od.ProductCode = 'XXX'
WHERE o.OrderStatus = 'Shipped'
GROUP BY year(o.OrderDate), month(o.OrderDate)
ORDER BY year(o.OrderDate), month(o.OrderDate);
If it can appear multiple times, then move that part of the aggregation to a subquery:
SELECT year(o.OrderDate) as Year, month(o.OrderDate) as Month,
(COALESCE(SUM(od.XXX), 0) + SUM(o.Total)) AS XXX
FROM Orders o LEFT JOIN
(SELECT od.OrderId, SUM(od.ProductPrice) as XXX
FROM OrderDetails od
WHERE od.ProductCode = 'XXX'
GROUP BY od.OrderId
) od
ON o.OrderID = od.OrderID
WHERE o.OrderStatus = 'Shipped'
GROUP BY year(o.OrderDate), month(o.OrderDate)
ORDER BY year(o.OrderDate), month(o.OrderDate);
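To see why the join inflates SUM(o.Total), and why pre-aggregating the details avoids it, here is a small sketch. It uses Python's sqlite3 module as a stand-in for SQL Server, so strftime replaces year()/month(); the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER, OrderDate TEXT, Total REAL, OrderStatus TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductCode TEXT, ProductPrice REAL);
-- Made-up rows: order 1 has three detail lines, order 2 has none.
INSERT INTO Orders VALUES (1, '2014-01-10', 100, 'Shipped'), (2, '2014-01-20', 50, 'Shipped');
INSERT INTO OrderDetails VALUES (1, 'XXX', 10), (1, 'XXX', 5), (1, 'YYY', 20);
""")

# Naive join: order 1 appears three times, so its Total is summed three times.
naive = conn.execute("""
SELECT SUM(CASE WHEN od.ProductCode = 'XXX' THEN od.ProductPrice ELSE 0 END) + SUM(o.Total)
FROM Orders o
LEFT JOIN OrderDetails od ON o.OrderID = od.OrderID
WHERE o.OrderStatus = 'Shipped'
GROUP BY strftime('%Y-%m', o.OrderDate)
""").fetchone()[0]

# Pre-aggregated join: the subquery yields at most one row per order,
# so each Orders row (and its Total) is counted exactly once.
fixed = conn.execute("""
SELECT COALESCE(SUM(od.XXX), 0) + SUM(o.Total)
FROM Orders o
LEFT JOIN (SELECT OrderID, SUM(ProductPrice) AS XXX
           FROM OrderDetails
           WHERE ProductCode = 'XXX'
           GROUP BY OrderID) od ON o.OrderID = od.OrderID
WHERE o.OrderStatus = 'Shipped'
GROUP BY strftime('%Y-%m', o.OrderDate)
""").fetchone()[0]

print(naive, fixed)
```

With this data the expected monthly figure is 15 (XXX prices) plus 150 (order totals); the naive query instead adds order 1's total three times.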

Related

sql use aggregate function that counts a unique value with group by using inner joins

I searched and found similar questions online, but not my particular one; they all use a WHERE or HAVING clause. If there's one similar to mine, please link it. It's a two-part question and I have the first part done. Thank you in advance.
Okay, so here's the question, part 1:
"Find by customer, the total cost and the total discounted cost for each product on the order ?".
It also asks to use inner joins to find the customer and order it a specific way. Below is the answer.
SELECT
C.companyname, O.orderid, O.orderdate, P.productname,
OD.orderid, OD.unitprice, OD.qty, OD.discount,
(OD.unitprice * OD.qty - (OD.qty * OD.discount)) AS TotalCost,
(OD.qty * OD.discount) AS TotalDiscountedCost
FROM
Sales.Customers AS C
INNER JOIN
Sales.Orders AS O ON C.custid = O.custid
INNER JOIN
Sales.OrderDetails OD ON O.orderid = OD.orderid
INNER JOIN
Production.Products as P ON OD.productid = P.productid
ORDER BY
C.companyname, O.orderdate;
The second part follows up on the first and asks to summarize it by "customer and the order date year, the total cost and the total discounted cost on the order ?". It also asks for this: "Project following columns in the select clause as:"
GroupByColumns.companyname
GroupByColumns.OrderdateYear
AggregationColumns.CountNumberOfIndividualOrders
AggregationColumns.CountNumberOfProductsOrders
AggregationColumns.TotalCost
AggregationColumns.TotalDiscountedCost
Finally, the results should be ordered by company name and OrderdateYear (the grouping columns). Where I'm stuck is how to count, as an aggregate in the SELECT clause, only the order lines whose qty equals 1. I know it has to use the aggregate function COUNT because of the GROUP BY; I just don't know how. This is what I have:
SELECT
C.companyname, YEAR(O.orderdate) AS orderyear,OD.qty,
-- Where in the count function or if theres another way do I count all the
--single orders
--COUNT(OD.qty) AS indiviualorders,
(OD.unitprice * OD.qty - (OD.qty * OD.discount)) AS TotalCost,
(OD.qty * OD.discount) AS TotalDiscountedCost
FROM
Sales.Customers AS C
INNER JOIN
Sales.Orders AS O ON C.custid = O.custid
INNER JOIN
Sales.OrderDetails OD ON O.orderid = OD.orderid
INNER JOIN
Production.Products as P ON OD.productid = P.productid
GROUP BY
C.companyname, YEAR(O.orderdate)
ORDER BY
C.companyname, O.orderdate;
You can use a CASE expression inside a SUM:
SUM(CASE WHEN <xyz> THEN 1 ELSE 0 END)
But for the count of unique orders, use COUNT(DISTINCT ...) on a key that is unique in the orders table:
SELECT COUNT(DISTINCT O.OrderID) As DistinctOrders FROM Table
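As a rough illustration of how the two suggestions combine under a GROUP BY, here is a sketch run through Python's sqlite3 module. The rows are invented, the customer is stored directly on Orders instead of joining Sales.Customers, and strftime stands in for YEAR().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (orderid INTEGER, custid TEXT, orderdate TEXT);
CREATE TABLE OrderDetails (orderid INTEGER, unitprice REAL, qty INTEGER, discount REAL);
-- Made-up rows: customer A has two orders in 2014 with three detail lines in total.
INSERT INTO Orders VALUES (1, 'A', '2014-03-01'), (2, 'A', '2014-07-15');
INSERT INTO OrderDetails VALUES (1, 10, 1, 0), (1, 4, 3, 1), (2, 20, 1, 2);
""")

row = conn.execute("""
SELECT o.custid,
       strftime('%Y', o.orderdate) AS orderyear,
       COUNT(DISTINCT o.orderid) AS distinct_orders,                   -- orders, not detail lines
       COUNT(*) AS detail_lines,                                       -- one per joined detail row
       SUM(CASE WHEN od.qty = 1 THEN 1 ELSE 0 END) AS single_qty_lines, -- conditional count
       SUM(od.unitprice * od.qty - od.qty * od.discount) AS total_cost
FROM Orders o
JOIN OrderDetails od ON o.orderid = od.orderid
GROUP BY o.custid, strftime('%Y', o.orderdate)
""").fetchone()

print(row)
```

Every non-grouped column in the SELECT list is wrapped in an aggregate, which is what the GROUP BY in the question requires.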

Creating an SQL query that eliminates duplicate months/year

Hello Stack Overflow community - hopefully I'm on the right track with this one. I'm trying to write a query for a report that shows the number of orders placed by month/year. The report currently returns a row for every day, whereas I want the days rolled up into one row per month/year. Hopefully this makes sense; I'm pretty new to this, so be gentle please ;)
select distinct month(o.orderdate) 'Month',
year(o.orderdate) 'Year', sum(od.Quantity) as Orders
from OrderDetails od
join Products p
on od.ProductID = p.ProductID
join Orders o
on od.OrderID = o.OrderID
group by o.orderdate
Order by year, month asc;
You need to group by what you want to define each row. In your case, that is the year and month:
select year(orderdate) as yyyy, month(o.orderdate) as mm,
sum(od.Quantity) as Orders
from OrderDetails od join
Products p
on od.ProductID = p.ProductID join
Orders o
on od.OrderID = o.OrderID
group by year(o.orderdate), month(o.orderdate)
Order by yyyy, mm asc;
Notes:
I changed the column names to yyyy and mm so they do not conflict with the reserved words year and month.
Don't use single quotes for column aliases. This is a bad habit that will eventually cause problems in your query.
I always use as for column aliases (to help prevent missing comma mistakes), but never for table aliases.
The product table is not needed for this query.
Edit: If you want a count of orders, which your query suggests, then this might be more appropriate:
select year(o.orderdate) as yyyy, month(o.orderdate) as mm,
count(*) as Orders
from orders o
group by year(o.orderdate), month(o.orderdate)
Order by yyyy, mm asc;
You have to group by month and year
select distinct month(o.orderdate) 'Month',
year(o.orderdate) 'Year', sum(od.Quantity) as Orders
from OrderDetails od
join Products p
on od.ProductID = p.ProductID
join Orders o
on od.OrderID = o.OrderID
group by month(o.orderdate), year(o.orderdate)
Order by [Year],[month]
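The difference between grouping by the full date and grouping by year and month can be checked on a tiny data set. This is only a sketch: it runs on SQLite through Python's sqlite3 module, so strftime replaces year()/month(), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER, orderdate TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, Quantity INTEGER);
-- Made-up rows: two orders in January 2014, one in February 2014.
INSERT INTO Orders VALUES (1, '2014-01-05'), (2, '2014-01-20'), (3, '2014-02-03');
INSERT INTO OrderDetails VALUES (1, 2), (2, 3), (3, 4);
""")

# Grouping by the full date produces one row per day.
by_day = conn.execute("""
SELECT o.orderdate, SUM(od.Quantity)
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
GROUP BY o.orderdate
ORDER BY o.orderdate
""").fetchall()

# Grouping by year and month produces one row per month.
by_month = conn.execute("""
SELECT CAST(strftime('%Y', o.orderdate) AS INTEGER) AS yyyy,
       CAST(strftime('%m', o.orderdate) AS INTEGER) AS mm,
       SUM(od.Quantity) AS Orders
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
GROUP BY yyyy, mm
ORDER BY yyyy, mm
""").fetchall()

print(by_day)
print(by_month)
```

Note that GROUP BY on the column aliases works in SQLite; in SQL Server the expressions have to be repeated in the GROUP BY, as the answers do.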

Finding out when summed values reached a certain checkpoint in SQL

First of all: I've found some possible answers to my problem in previously asked questions, but I've had trouble getting them to work properly. I know the question has already been asked, but the answers were always working code with little to no explanation of the method used.
So: I have to find out when a customer reached VIP status, which is when the value of his orders exceeds 50 000. I've got 2 tables: one with orderid, customerid and orderdate, and a second with orderid, quantity and unitprice.
The result of the query I'm writing should be 3 columns wide: one with the customerid, one with true/false named "is VIP?", and a third with the date of reaching VIP status (the date of the order that, summed with the previous ones, gave a result of over 50 000). The last one should be blank if the customer didn't reach VIP status.
select o.customerid, sum(od.quantity*od.unitprice),
case
when sum(od.quantity*od.unitprice)>50000 then 'VIP'
else 'Normal'
end as 'if vip'
from
orders o join [Order Details] od on od.orderid=o.orderid
group by o.customerid
That is as far as I got with the code. It returns the status of the customer, and now I need to get the date when that happened.
You can easily calculate a running total using a window function:
select o.customerid,
o.orderdate,
sum(od.quantity*od.unitprice) over (partition by o.customerid order by o.orderdate) as running_sum
from orders o
join [Order Details] od on od.orderid = o.orderid
order by o.customerid, o.orderdate;
Now you need to find the first row where the running total exceeds the threshold.
The following query keeps only the rows whose running total is above the threshold and numbers them by order date within each customer, which means the row with the number 1 is the first one to cross the threshold:
with totals as (
select o.customerid,
o.orderdate,
sum(od.quantity*od.unitprice) over (partition by o.customerid order by o.orderdate) as running_sum
from orders o
join [Order Details] od on od.orderid = o.orderid
), crossed as (
select customerid,
orderdate,
running_sum,
row_number() over (partition by customerid order by orderdate) as rn
from totals
where running_sum > 50000
)
select *
from crossed
where rn = 1
order by customerid;
SQLFiddle example: http://sqlfiddle.com/#!6/a7f18/3
You get the cumulative sum using an Analytic Function, SUM OVER. And then add an aggregate to find the minimum date:
with cte as
( select o.customerid,
o.orderdate,
case when sum(od.quantity*od.unitprice) -- running total
over (partition by o.customerid
order by orderdate
rows unbounded preceding) > 50000
then 'Y'
else 'N'
end as VIP
from orders o
join [Order Details] od on od.orderid = o.orderid
)
select customerid,
MAX(VIP) AS "isVIP?", -- either 'N' or 'Y'
MIN(CASE WHEN VIP = 'Y' THEN orderdate END) AS VIP_date -- only when VIP status reached
from cte
group by customerid
order by customerid;
See fiddle
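The running-total-plus-aggregate idea can be tried on a few invented rows. This is a sketch only: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, uses the hypothetical table name order_details instead of [Order Details], and the amounts are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (orderid INTEGER, customerid INTEGER, orderdate TEXT);
CREATE TABLE order_details (orderid INTEGER, quantity INTEGER, unitprice REAL);
-- Made-up rows: customer 1 passes 50 000 with the February order, customer 2 never does.
INSERT INTO orders VALUES (1, 1, '2014-01-01'), (2, 1, '2014-02-01'),
                          (3, 1, '2014-03-01'), (4, 2, '2014-01-05');
INSERT INTO order_details VALUES (1, 2, 10000), (1, 1, 10000), (2, 1, 30000),
                                 (3, 1, 10000), (4, 1, 1000);
""")

rows = conn.execute("""
WITH cte AS (
    SELECT o.customerid,
           o.orderdate,
           -- flag every detail row whose running total is already above the threshold
           CASE WHEN SUM(od.quantity * od.unitprice)
                     OVER (PARTITION BY o.customerid
                           ORDER BY o.orderdate
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) > 50000
                THEN 'Y' ELSE 'N' END AS vip
    FROM orders o
    JOIN order_details od ON od.orderid = o.orderid
)
SELECT customerid,
       MAX(vip) AS is_vip,                                      -- 'Y' sorts after 'N'
       MIN(CASE WHEN vip = 'Y' THEN orderdate END) AS vip_date  -- NULL if never reached
FROM cte
GROUP BY customerid
ORDER BY customerid
""").fetchall()

print(rows)
```

Customer 1's running total is 30 000 after January and 60 000 after February, so February is the earliest flagged date; customer 2 has no flagged row, so the date stays NULL.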
Not going to complicate the answer with logic to show 'vip' and 'vip date'. This will give you a running total for each customer order.
select o.orderid, o.customerid, o.orderdate, sum(od.quantity*od.unitprice) 'Total', (
select sum(od2.quantity * od2.unitprice) total
from orders o2
join [Order Details] od2 on od2.orderid=o2.orderid
where o2.orderID <= o.orderID
and o2.customerid = o.customerid) 'RunningTotal'
from orders o
join [Order Details] od
on od.orderid=o.orderid
group by o.orderid, o.customerid, o.orderdate
order by o.customerid
To answer your question on how to approach this, you could consider an SQL trigger that runs on each update to the tables involved and sets the status when the threshold is hit. This would record the date as and when the event happens.
Another approach would be a stored procedure in which you use a loop to iterate over the records and arrive at the date.
The choice can be made based on the volume of the data, with the former being suitable for extremely large amounts of data.

SQL: list items not sold in between time period

The software we use has two tables for orders, Orders and OrderItems.
I am trying do run a query on the database to show which items have not sold in a period of time.
This is what I have but It's not working ( it brings up all the records)
SELECT
OrderItem.Name,
OrderItem.SKU,
[Order].OrderDate
FROM
[Order]
INNER JOIN OrderItem ON [Order].OrderID = OrderItem.OrderID
WHERE
(OrderItem.SKU NOT IN
(SELECT DISTINCT OrderItem.SKU WHERE ([Order].OrderDate BETWEEN '2014-09-08' AND '2014-01-01')))
You can actually do this with a having clause:
SELECT oi.Name, oi.SKU, max(o.OrderDate) as lastOrderDate
FROM [Order] o INNER JOIN
OrderItem oi
ON o.OrderID = oi.OrderID
GROUP BY oi.Name, oi.SKU
HAVING sum(case when o.OrderDate between '2014-01-01' and '2014-09-08' then 1 else 0 end) = 0;
If you are just looking for orders before this year, it is easier to write the having clause as:
HAVING max(o.OrderDate) < '2014-01-01'
Flip your dates. It should be BETWEEN [Begin Date] AND [End Date]:
WHERE (OrderItem.SKU NOT IN
(SELECT DISTINCT OrderItem.SKU
WHERE ([Order].OrderDate BETWEEN '2014-01-01' AND '2014-09-08')))
How about the following:
SELECT oi.Name, oi.SKU, o.OrderDate
FROM [Order] o
INNER JOIN OrderItem oi ON o.OrderID = oi.OrderID
WHERE oi.SKU NOT IN
(
SELECT ois.SKU
FROM [Order] os
INNER JOIN OrderItem ois ON os.OrderID = ois.OrderID
WHERE os.OrderDate BETWEEN '2014-01-01' AND '2014-09-08'
)
You have to join to the OrderItem table in the sub query in order to get the SKU.
Looks like your query was built 'upside down'.
SELECT DISTINCT SKU FROM OrderItem oi
WHERE NOT EXISTS
(SELECT 1 FROM [Order] o
JOIN OrderItem oi2 ON o.OrderID = oi2.OrderID
WHERE oi2.SKU = oi.SKU
AND o.OrderDate BETWEEN '2014-01-01' AND '2014-09-08' );
Ideally, you should have another table containing your distinct items, so you could write the following query to see which items were not sold during a certain period (and may have never been sold at all).
select i.SKU from items i
where not exists (
select 1 from OrderItem oi
join [Order] o on o.OrderID = oi.OrderID
where oi.SKU = i.SKU
and o.OrderDate BETWEEN '2014-01-01' and '2014-09-08'
)
If you don't have such a table, you can select all products that have been ordered at some point, but not during the period in question:
select i.SKU from (
select distinct oi.SKU from OrderItem oi
) i
where not exists (
select 1 from OrderItem oi
join [Order] o on o.OrderID = oi.OrderID
where oi.SKU = i.SKU
and o.OrderDate BETWEEN '2014-01-01' and '2014-09-08'
)
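Two of the approaches from the answers, the HAVING filter and NOT EXISTS, can be compared on a handful of invented rows. This sketch runs on SQLite through Python's sqlite3 module; the SKUs, names and dates are made up, and dates are stored as ISO text so BETWEEN compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [Order] (OrderID INTEGER, OrderDate TEXT);
CREATE TABLE OrderItem (OrderID INTEGER, SKU TEXT, Name TEXT);
-- Made-up rows: A sold only in 2013, B only inside the period, C both before and inside it.
INSERT INTO [Order] VALUES (1, '2013-12-01'), (2, '2014-03-01'),
                           (3, '2013-11-15'), (4, '2014-02-10');
INSERT INTO OrderItem VALUES (1, 'A', 'Widget'), (2, 'B', 'Gadget'),
                             (3, 'C', 'Gizmo'), (4, 'C', 'Gizmo');
""")

# HAVING: keep the SKUs with zero order rows inside the period.
having_skus = [r[0] for r in conn.execute("""
SELECT oi.SKU
FROM [Order] o
JOIN OrderItem oi ON o.OrderID = oi.OrderID
GROUP BY oi.SKU
HAVING SUM(CASE WHEN o.OrderDate BETWEEN '2014-01-01' AND '2014-09-08'
                THEN 1 ELSE 0 END) = 0
ORDER BY oi.SKU
""")]

# NOT EXISTS: keep the SKUs for which no order inside the period can be found.
not_exists_skus = [r[0] for r in conn.execute("""
SELECT DISTINCT oi.SKU
FROM OrderItem oi
WHERE NOT EXISTS (SELECT 1
                  FROM [Order] o
                  JOIN OrderItem oi2 ON o.OrderID = oi2.OrderID
                  WHERE oi2.SKU = oi.SKU
                    AND o.OrderDate BETWEEN '2014-01-01' AND '2014-09-08')
ORDER BY oi.SKU
""")]

print(having_skus, not_exists_skus)
```

Both queries only see SKUs that were ordered at least once; an item that was never sold at all would need a separate items table, as one of the answers points out.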

SQL Change output of column if duplicate

I have a table which has rows for each product that a customer has purchased. I want to output a column from a SELECT query which shows the time it takes to deliver said item based on whether the customer has other items that need to be delivered. The first item takes 5 mins to deliver and all subsequent items take 2 mins to deliver e.g. 3 items would take 5+2+2=9 mins to deliver.
This is what I have at the moment(Using the Northwind sample database on w3schools to test the query):
SELECT orders.customerid,
orders.orderid,
orderdetails.productid,
CASE((SELECT Count(orders.customerid)
FROM orders
GROUP BY orders.customerid))
WHEN 1 THEN '00:05'
ELSE '00:02'
END AS DeliveryTime
FROM orders
LEFT JOIN orderdetails
ON orderdetails.orderid = orders.orderid
This outputs '00:05' for every item due to the COUNT in my subquery(I think?), any ideas on how to fix this?
Try this
SELECT orders.customerid,
orders.orderid,
orderdetails.productid,
numberorders,
2 * ( numberorders - 1 ) + 5 AS deliveryMinutes
FROM orders
INNER JOIN (SELECT orders.customerid AS countId,
Count(1) AS numberOrders
FROM orders
GROUP BY orders.customerid) t1
ON t1.countid = orders.customerid
LEFT JOIN orderdetails
ON orderdetails.orderid = orders.orderid
ORDER BY customerid
Gregory's answer works a treat, and here are my attempts:
-- Without each product line item listed
SELECT O.CustomerId,
O.OrderId,
COUNT(*) AS 'NumberOfProductsOrdered',
CASE COUNT(*)
WHEN 1 THEN 5
ELSE (COUNT(*) * 2) + 3
END AS 'MinutesToDeliverAllProducts'
FROM Orders AS O
INNER JOIN OrderDetails AS D ON D.OrderId = O.OrderId
GROUP BY O.CustomerId, O.OrderId
-- With each product line item listed
SELECT O.CustomerId,
O.OrderId,
D.ProductId,
CASE
WHEN P.ProductsInOrder = 1 THEN 5
ELSE (P.ProductsInOrder * 2) + 3
END AS 'MinutesToDeliverAllProducts'
FROM Orders AS O
INNER JOIN OrderDetails AS D ON D.OrderId = O.OrderId
INNER JOIN (
SELECT OrderId, COUNT(*) AS ProductsInOrder
FROM OrderDetails
GROUP BY OrderId
) AS P ON P.OrderId = O.OrderId
GROUP BY O.CustomerId,
O.OrderId,
D.ProductId,
P.ProductsInOrder
Final code is below for anyone interested:
SELECT O.CustomerId,
O.OrderId,
Group_Concat(D.ProductID) AS ProductID,
CASE COUNT(*)
WHEN 1 THEN 5
ELSE (COUNT(*) * 2) + 3
END AS 'MinutesToDeliverAllProducts'
FROM Orders AS O
INNER JOIN OrderDetails AS D ON D.OrderId = O.OrderId
GROUP BY O.CustomerId
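The delivery rule from the question (5 minutes for the first item, 2 for each further one) can also be written as a single expression, 5 + 2 * (n - 1), which equals the (n * 2) + 3 used in the answers. A small sketch on SQLite through Python's sqlite3 module, with invented rows and grouping per order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER, CustomerID INTEGER);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER);
-- Made-up rows: order 1 has three items, order 2 has one.
INSERT INTO Orders VALUES (1, 10), (2, 20);
INSERT INTO OrderDetails VALUES (1, 100), (1, 101), (1, 102), (2, 100);
""")

rows = conn.execute("""
SELECT O.CustomerID,
       O.OrderID,
       COUNT(*) AS items,
       5 + 2 * (COUNT(*) - 1) AS minutes  -- first item 5 min, each further item 2 min
FROM Orders O
JOIN OrderDetails D ON D.OrderID = O.OrderID
GROUP BY O.CustomerID, O.OrderID
ORDER BY O.OrderID
""").fetchall()

print(rows)
```

Because the inner join guarantees at least one item per group, the formula needs no special case for a single item.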