How to group results from a column together - sql

I want to find the top 5 customers who spent the most on the 1992 Ferrari 360 Spider red in my dataset.
This is the query I wrote for that:
select C.customerNumber, C.customerName, P.productName, sum(ODF.priceEach) from orderDetailFacts ODF
left join Customers C on ODF.customerNumber = C.customerNumber
left join Products P on ODF.productCode = P.productCode
where P.productName like '%1992 Ferrari%'
group by C.customerNumber, C.customerName, P.productName, ODF.priceEach
order by sum(ODF.priceEach) desc
limit 5
The result that I got back is: [https://i.stack.imgur.com/rowRY.png] (there are more results than the screenshot shows).
However, I am unable to group a customer such as Mini Gifts Distributors Ltd (rows 5 and 6) in the customerName column so that the priceEach column shows the sum.
In other words, rows 5 and 6 should be merged into one row, and the priceEach column should be 386.09 instead of 196.43 and 189.66 separately.
Is there a solution to this?

I think you want one row per customer/product. But you are including the price as well. So, fix the GROUP BY:
select C.customerNumber, C.customerName, P.productName, sum(ODF.priceEach)
from orderDetailFacts ODF join
Customers C
on ODF.customerNumber = C.customerNumber join
Products P
on ODF.productCode = P.productCode
where P.productName like '%1992 Ferrari%'
group by C.customerNumber, C.customerName, P.productName;
Note the changes:
ODF.priceEach is removed from the GROUP BY.
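You are requiring matches between the tables (the WHERE filter on P.productName already discards rows without a matching product), so LEFT JOIN is not appropriate; use an inner JOIN.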
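To see the effect of the GROUP BY change, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question, but the rows are invented for illustration and SQLite stands in for whatever engine the asker is using. It also keeps the ORDER BY and LIMIT 5 from the question.

```python
import sqlite3

# Invented sample data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customerNumber INTEGER PRIMARY KEY, customerName TEXT);
    CREATE TABLE Products (productCode TEXT PRIMARY KEY, productName TEXT);
    CREATE TABLE orderDetailFacts (customerNumber INTEGER, productCode TEXT, priceEach REAL);
    INSERT INTO Customers VALUES (1, 'Mini Gifts Distributors Ltd'), (2, 'Other Customer');
    INSERT INTO Products VALUES ('S1', '1992 Ferrari 360 Spider red');
    INSERT INTO orderDetailFacts VALUES (1, 'S1', 150.0), (1, 'S1', 100.0), (2, 'S1', 120.0);
""")

QUERY = """
    SELECT C.customerName, SUM(ODF.priceEach) AS total
    FROM orderDetailFacts ODF
    JOIN Customers C ON ODF.customerNumber = C.customerNumber
    JOIN Products P ON ODF.productCode = P.productCode
    WHERE P.productName LIKE '%1992 Ferrari%'
    GROUP BY {group_cols}
    ORDER BY total DESC
    LIMIT 5
"""

def top_customers(group_cols):
    """Run the query with the given GROUP BY column list."""
    return conn.execute(QUERY.format(group_cols=group_cols)).fetchall()

# Grouping by priceEach as well: every distinct price becomes its own row.
split_rows = top_customers("C.customerNumber, C.customerName, P.productName, ODF.priceEach")

# Grouping only by customer and product: one summed row per customer.
merged_rows = top_customers("C.customerNumber, C.customerName, P.productName")

print(split_rows)
print(merged_rows)
```

With priceEach in the GROUP BY, the first customer appears twice (once per distinct price); without it, the two prices are summed into a single row.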

Remove ODF.priceEach from the GROUP BY clause, since you are aggregating this column:
select C.customerNumber, C.customerName, P.productName, sum(ODF.priceEach) from orderDetailFacts ODF
left join Customers C on ODF.customerNumber = C.customerNumber
left join Products P on ODF.productCode = P.productCode
where P.productName like '%1992 Ferrari%'
group by C.customerNumber, C.customerName, P.productName

Related

sql use aggregate function that counts a unique value with group by using inner joins

I searched and found similar questions online but not my particular one; they all use a WHERE or HAVING clause. If there's one similar to mine, please link it. It's a two-part question and I have the first part done. Thank you in advance.
Okay, so here's the question, part 1:
"Find by customer, the total cost and the total discounted cost for each product on the order ?".
It also asks to use inner joins to find the customer and order it a specific way. Below is the answer.
SELECT
C.companyname, O.orderid, O.orderdate, P.productname,
OD.orderid, OD.unitprice, OD.qty, OD.discount,
(OD.unitprice * OD.qty - (OD.qty * OD.discount)) AS TotalCost,
(OD.qty * OD.discount) AS TotalDiscountedCost
FROM
Sales.Customers AS C
INNER JOIN
Sales.Orders AS O ON C.custid = O.custid
INNER JOIN
Sales.OrderDetails OD ON O.orderid = OD.orderid
INNER JOIN
Production.Products as P ON OD.productid = P.productid
ORDER BY
C.companyname, O.orderdate;
Now the second question is to follow up on the first one and summarize it by "customer and the order date year, the total cost and the total discounted cost on the order ?". It also asks for this: "Project following columns in the select clause as."
GroupByColumns.companyname
GroupByColumns.OrderdateYear
AggregationColumns.CountNumberOfIndividualOrders
AggregationColumns.CountNumberOfProductsOrders
AggregationColumns.TotalCost
AggregationColumns.TotalDiscountedCost
Finally, it asks to order by company name and orderdateYear (which are the grouping columns). Where I'm stuck is how to count the orders whose qty equals 1 as an aggregate in the SELECT clause. I know it has to use the aggregate function COUNT because of the GROUP BY, I just don't know how. This is what I have.
SELECT
C.companyname, YEAR(O.orderdate) AS orderyear,OD.qty,
-- Where in the count function or if theres another way do I count all the
--single orders
--COUNT(OD.qty) AS indiviualorders,
(OD.unitprice * OD.qty - (OD.qty * OD.discount)) AS TotalCost,
(OD.qty * OD.discount) AS TotalDiscountedCost
FROM
Sales.Customers AS C
INNER JOIN
Sales.Orders AS O ON C.custid = O.custid
INNER JOIN
Sales.OrderDetails OD ON O.orderid = OD.orderid
INNER JOIN
Production.Products as P ON OD.productid = P.productid
GROUP BY
C.companyname, YEAR(O.orderdate)
ORDER BY
C.companyname, O.orderdate;
You can use a CASE expression inside a SUM:
SUM(CASE WHEN <xyz> THEN 1 ELSE 0 END)
But for the count of unique orders, use COUNT(DISTINCT <key>) on a key that is unique in the order table:
SELECT COUNT(DISTINCT O.OrderID) As DistinctOrders FROM Table
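Both ideas can be combined in one grouped query. The following is only a sketch run through Python's sqlite3 module: the single flattened table, its custid/orderid/qty columns, and the rows are assumptions made up for the example, not the asker's real schema.

```python
import sqlite3

# Hypothetical flattened order-detail rows: (custid, orderid, qty).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OrderDetails (custid TEXT, orderid INTEGER, qty INTEGER);
    INSERT INTO OrderDetails VALUES
        ('A', 10, 1), ('A', 10, 5), ('A', 11, 1),
        ('B', 12, 1), ('B', 12, 4);
""")

rows = conn.execute("""
    SELECT custid,
           COUNT(DISTINCT orderid)                  AS DistinctOrders,  -- unique orders
           COUNT(*)                                 AS DetailLines,     -- all product lines
           SUM(CASE WHEN qty = 1 THEN 1 ELSE 0 END) AS SingleQtyLines   -- lines with qty = 1
    FROM OrderDetails
    GROUP BY custid
    ORDER BY custid
""").fetchall()

print(rows)
```

COUNT(DISTINCT orderid) counts each order once even when it has several detail lines, while the SUM(CASE ...) only adds 1 for the lines that satisfy the condition.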

How to Organize Multiple Joins SQL

In SQL, how should I join tables together when I do multiple joins in one query? Should I join on only one table (in this case the Customers table), or is it okay to do what I have done (joining on different tables as new keys are needed)?
SELECT O.OrderID, O.OrderDate, C.City, C.Country, C.PostalCode, C.ContactName, O.CustomerID, O.ShipperID, D.ProductID, COUNT(D.ProductID) ProductCount, S.SupplierID
FROM Customers C
INNER JOIN Orders O
ON O.CustomerID = C.CustomerID
INNER JOIN OrderDetails D
ON O.OrderID = D.OrderID
INNER JOIN Products P
ON D.ProductID = P.ProductID
INNER JOIN Suppliers S
ON S.SupplierID = P.SupplierID
WHERE 1 = 1
GROUP BY O.OrderID
ORDER BY OrderDate DESC
I am using W3Schools SQL TryIt editor to test this, not sure what DB engine it is!
Thanks!
Of course you can join on multiple tables in a query. That is a big part of the power of SQL.
In your particular case, you don't need the join to the Suppliers table, because the SupplierID column is already in Products.
Also, you need to be careful about your SELECT and GROUP BY clauses. In general, you should put all non-aggregated columns in the GROUP BY:
SELECT O.OrderID, O.OrderDate, C.City, C.Country, C.PostalCode, C.ContactName,
O.CustomerID, O.ShipperID, D.ProductID,
COUNT(D.ProductID) as ProductCount,
P.SupplierID
FROM Customers C INNER JOIN
Orders O
ON O.CustomerID = C.CustomerID INNER JOIN
OrderDetails D
ON O.OrderID = D.OrderID INNER JOIN
Products P
ON D.ProductID = P.ProductID
GROUP BY O.OrderID, O.OrderDate, C.City, C.Country, C.PostalCode, C.ContactName,
O.CustomerID, O.ShipperID, D.ProductID, P.SupplierId
ORDER BY OrderDate DESC;
The WHERE 1=1 is also unnecessary.
I wonder if this query really does what you want. However, you don't state what you actually want the query to do, so I'm merely speculating.
The way you have done it is fine. Don't forget that with each additional inner join, your record set may shrink, because rows without a matching key in the joined table are dropped.
You could also just write JOIN, which is shorthand for INNER JOIN.
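The point about inner joins dropping unmatched rows can be checked with a small sketch. It uses Python's sqlite3 module with two invented tables (the names echo the question, the rows are made up), and compares an INNER JOIN with a LEFT JOIN on the same data.

```python
import sqlite3

# Invented data: Bob has no orders, so he has no match in Orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, ContactName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Orders VALUES (100, 1);
""")

JOIN_QUERY = """
    SELECT C.ContactName, O.OrderID
    FROM Customers C
    {join_type} JOIN Orders O ON O.CustomerID = C.CustomerID
    ORDER BY C.CustomerID
"""

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute(JOIN_QUERY.format(join_type="INNER")).fetchall()

# LEFT JOIN keeps every customer and fills the order columns with NULL (None).
left_rows = conn.execute(JOIN_QUERY.format(join_type="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

If a customer without orders should still show up in the result, a LEFT JOIN from Customers is the usual choice; otherwise the chain of inner joins in the question is fine.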

What 'keyword' is missing from this query?

This is what I have:
SELECT c.customerFN, c.customerEmail, p.productName,
SUM(p.unitsonstock + p.unitsordered) AS "All Units"
FROM customer c
INNER JOIN order o
WHERE c.customerID=o.customerID
INNER JOIN orderDetails d
WHERE o.orderID=d.orderID
INNER JOIN product p
WHERE p.productCode=l.productCode
WHERE orderDate <= '2015-03-15'
ORDER BY productName;
When I enter this, the database throws a "missing keyword" error at the fourth line. Could you tell me what I'm missing?
Instead of WHERE in lines 5, 7 and 9 you need to use ON. You are also using the SUM function, but there is no GROUP BY. Change your query like this:
SELECT c.customerFN, c.customerEmail, p.productName,
SUM(p.unitsonstock + p.unitsordered) AS "All Units"
FROM customer c
INNER JOIN order o
ON c.customerID=o.customerID
INNER JOIN orderDetails d
ON o.orderID=d.orderID
INNER JOIN product p
ON p.productCode=d.productCode
WHERE orderDate <= '2015-03-15'
GROUP BY c.customerFN, c.customerEmail, p.productName
ORDER BY p.productName;
JOIN is performed using the ON clause, not with a WHERE:
...
FROM customer c
INNER JOIN order o ON c.customerID=o.customerID
INNER JOIN orderDetails d ON o.orderID=d.orderID
INNER JOIN product p ON p.productCode=d.productCode
WHERE orderDate <= '2015-03-15'
...
The WHERE clause that comes after the joins is fine as you have it in your query.
Apart from the problem with the JOIN there is also a problem using SUM without grouping. You probably want something like:
SELECT c.customerFN, c.customerEmail, p.productName,
SUM(p.unitsinstock + p.unitsordered) AS "All Units"
FROM customer c
INNER JOIN order o ON c.customerID=o.customerID
INNER JOIN orderDetails d ON o.orderID=d.orderID
INNER JOIN product p ON p.productCode=d.productCode
WHERE orderDate <= '2015-03-15'
GROUP BY customerFN, customerEmail, productName
ORDER BY p.productName;
Using SUM alongside non-aggregated columns requires a GROUP BY clause. Every selected column that is not part of an aggregate function like SUM must be present in the GROUP BY clause.
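As a rough illustration of what GROUP BY changes, here is a sketch using Python's sqlite3 module. The product table and its rows are invented for the example (the column names only loosely follow the question): an aggregate without GROUP BY collapses everything into one row, while GROUP BY productName produces one total per product.

```python
import sqlite3

# Hypothetical stock rows: (productName, unitsInStock, unitsOrdered).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (productName TEXT, unitsInStock INTEGER, unitsOrdered INTEGER);
    INSERT INTO product VALUES ('Widget', 3, 2), ('Widget', 1, 4), ('Gadget', 7, 0);
""")

# No GROUP BY: the aggregate runs over the whole table and returns a single row.
overall = conn.execute(
    'SELECT SUM(unitsInStock + unitsOrdered) AS "All Units" FROM product'
).fetchall()

# GROUP BY productName: the non-aggregated column is grouped, one row per product.
per_product = conn.execute("""
    SELECT productName, SUM(unitsInStock + unitsOrdered) AS "All Units"
    FROM product
    GROUP BY productName
    ORDER BY productName
""").fetchall()

print(overall)
print(per_product)
```

Note that SQLite is lenient about selecting ungrouped columns, so it will not reproduce the error a stricter engine raises; the sketch only shows the difference in the result shape.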

SQL query with w3schools db

I should have asked multiple questions in my other post. Thanks to all who have helped, I am now stuck on another one..
Using the w3schools db, List SupplierID, SupplierName and ItemSupplied (count of number of items supplied by a supplier), sort the list first by number of items supplied (descending) and then by supplier name (ascending)
SELECT supplierid,
suppliername,
p.productname,
Count(s.supplierid) AS itemssupplied
FROM [Suppliers] AS s
INNER JOIN [Products] AS p
ON p.supplierid = s.supplierid
GROUP BY p.productid,
p.productname
ORDER BY Count (p.productid, p.productname) DESC
order BY s.suppliername
It's giving me an error, then again I am ordering by multiple ones. I think there's something I am not quite understanding here.
My other question is
List customers for each category and the total of order placed by that customer in a given category. In the query show three columns: CategoryName, CustomerName, and TotalOrders (which is price * quantity for orders for a given customer in a given category). Sort this data in descending order by TotalOrders.
SELECT cg.CategoryName,
c.CustomerName,
Sum(p.Price * od.Quantity) AS TotalOrders
FROM [products] AS p
INNER JOIN [orderdetails] AS od
ON od.ProductID = p.ProductID
INNER JOIN [orders] AS o
ON o.OrderID = od.OrderID
INNER JOIN [customers] AS c
ON c.customerID = o.CustomerID
INNER JOIN [categories] AS cg
ON cg.CategoryID = p.CategoryID
GROUP BY c.CustomerName
ORDER BY TotalOrders DESC
Can someone please check if my query is correct? Thank you once again!
Question 1
You are really close but you only need to state ORDER BY once (also make sure to include all shown fields in your GROUP BY unless you are aggregating them):
SELECT s.SupplierID, s.SupplierName, p.ProductName, COUNT(s.SupplierID) AS ItemsSupplied
FROM [Suppliers] AS s
INNER JOIN [Products] AS p ON p.SupplierID = s.SupplierID
GROUP BY p.ProductID, p.ProductName, s.SupplierID, s.SupplierName -- Added SupplierID, SupplierName
ORDER BY COUNT(s.SupplierID) DESC, s.SupplierName
Notice that you just place multiple sorts on the same line with a comma separating them.
Question 2
You're almost there but you need to group by any field that is not being aggregated. So in order not to get a parsing error, I added the cg.CategoryName to the GROUP BY line.
SELECT cg.CategoryName, c.CustomerName, Sum(p.Price*od.Quantity) AS TotalOrders
FROM [Products] AS p
INNER JOIN [OrderDetails] AS od ON od.ProductID = p.ProductID
INNER JOIN [Orders] AS o ON o.OrderID = od.OrderID
INNER JOIN [Customers] AS c ON c.customerID = o.CustomerID
INNER JOIN [Categories] AS cg ON cg.CategoryID = p.CategoryID
GROUP BY c.CustomerName, cg.CategoryName --Added CategoryName
ORDER BY TotalOrders DESC
You have several problems with the first query:
You're grouping by ProductID and ProductName even though you want the number of items supplied by a supplier, which means that you want to group by SupplierID and SupplierName.
You're supplying too many arguments to the COUNT function, which takes a single column name or *.
You've included a ProductName column in your results, which is not called for.
You need to ORDER BY both the number of products supplied and the SupplierName.
With those points in mind:
SELECT
s.SupplierID,
s.SupplierName,
COUNT(p.ProductID) AS ItemsSupplied
FROM
[Suppliers] AS s
INNER JOIN [Products] AS p ON p.SupplierID = s.SupplierID
GROUP BY
s.SupplierID, s.SupplierName
ORDER BY
ItemsSupplied DESC,
s.SupplierName ASC
Your second query is quite close. You're just missing one point, which is that you're looking for total of order placed by that customer in a given category. This means that in addition to grouping by c.CustomerName, you need to group by the category:
SELECT
cg.CategoryName,
c.CustomerName,
SUM(p.Price*od.Quantity) AS TotalOrders
FROM
[Products] AS p
INNER JOIN [OrderDetails] AS od ON od.ProductID = p.ProductID
INNER JOIN [Orders] AS o ON o.OrderID = od.OrderID
INNER JOIN [Customers] AS c ON c.customerID = o.CustomerID
INNER JOIN [Categories] AS cg ON cg.CategoryID = p.CategoryID
GROUP BY
c.CustomerName, cg.CategoryID, cg.CategoryName -- CategoryName is selected, so it is grouped too
ORDER BY
TotalOrders DESC
The first query has two ORDER BY clauses:
ORDER BY COUNT (p.productID, p.ProductName) DESC
and
ORDER BY s.SupplierName
Also, some databases will complain when the ORDER BY columns of a GROUP BY query are not included in the selected columns.
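For the first question, a single ORDER BY with two comma-separated keys can be tried out like this. It is a sketch with Python's sqlite3 module; the Suppliers/Products column names follow the thread, but the rows are invented and SQLite is only a stand-in for the w3schools engine.

```python
import sqlite3

# Invented suppliers and products; two suppliers tie on the item count.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, SupplierName TEXT);
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, SupplierID INTEGER);
    INSERT INTO Suppliers VALUES (1, 'Zeta'), (2, 'Alpha'), (3, 'Mid');
    INSERT INTO Products VALUES
        (1, 'P1', 1), (2, 'P2', 1),
        (3, 'P3', 2), (4, 'P4', 2),
        (5, 'P5', 3);
""")

ranked = conn.execute("""
    SELECT s.SupplierID, s.SupplierName, COUNT(p.ProductID) AS ItemsSupplied
    FROM Suppliers AS s
    INNER JOIN Products AS p ON p.SupplierID = s.SupplierID
    GROUP BY s.SupplierID, s.SupplierName
    ORDER BY ItemsSupplied DESC,   -- first key: most items first
             s.SupplierName ASC    -- second key: breaks ties alphabetically
""").fetchall()

print(ranked)
```

The second sort key only matters for rows that tie on the first one, which is why 'Alpha' comes before 'Zeta' while 'Mid' stays last.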

Oracle: Using Sub-Queries, JOIN and distinct function together

Here is how I constructed the steps:
M1. create a sub-query that will return CustomerId and total invoiced for that customer
M2. A second subquery that will give a list of distinct ProductIDs (along with product SKUs) and CustomerIDs.
M3. The M1 and M2 subqueries will be joined to make association between customer totals and products (for the same CustomerId).
M4. The query M3 will be fed to the final query that will just find the top 5 products.
I'm stuck on creating the distinct ProductID and customerID because they would have to be in aggregate functions in order to make them distinct.
Attached is an image of the erwin diagram, which helps in understanding the system.
If you can help me with M1-M4, I will greatly appreciate it. I'm not a programmer by trade but a business analyst.
--M1--
select C.CustomerId, COUNT(I.InvoiceId) TotalNumInvoices
from Customer C
JOIN Invoice I ON (I.CustomerId = C.CustomerId)
group by C.CustomerId
--M2: Incomplete--
select P.ProductID, P.SKU, C.CustomerID
from Product P
JOIN InvoiceLine IL ON (IL.ProductId = P.ProductId)
JOIN Invoice I ON (IL.InvoiceId = I.InvoiceId)
JOIN Customer C ON (C.CustomerId = I.CustomerId)
You can also use the DISTINCT keyword in your SELECT clause to get unique values. Try this for M2:
select DISTINCT p.productID, p.sku, i.customerID
from invoice i INNER JOIN invoiceLine il
ON i.invoiceID = il.invoiceID
JOIN product p
ON il.productID = p.productID;
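To check what DISTINCT does to the joined rows in M2, here is a small sketch. It runs in SQLite through Python's sqlite3 module rather than Oracle, and the three tables contain only the columns the query needs, with invented rows: one customer buys the same product on two different invoices.

```python
import sqlite3

# Invented data: customer 7 has two invoices, and product 5 appears on both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER);
    CREATE TABLE InvoiceLine (InvoiceId INTEGER, ProductId INTEGER);
    CREATE TABLE Product (ProductId INTEGER PRIMARY KEY, SKU TEXT);
    INSERT INTO Invoice VALUES (1, 7), (2, 7);
    INSERT INTO InvoiceLine VALUES (1, 5), (2, 5), (2, 6);
    INSERT INTO Product VALUES (5, 'SKU-5'), (6, 'SKU-6');
""")

M2 = """
    SELECT {distinct} p.ProductId, p.SKU, i.CustomerId
    FROM Invoice i
    JOIN InvoiceLine il ON i.InvoiceId = il.InvoiceId
    JOIN Product p ON il.ProductId = p.ProductId
    ORDER BY p.ProductId
"""

# Without DISTINCT: one row per invoice line, so product 5 is listed twice.
all_rows = conn.execute(M2.format(distinct="")).fetchall()

# With DISTINCT: each (product, SKU, customer) combination appears once.
unique_rows = conn.execute(M2.format(distinct="DISTINCT")).fetchall()

print(all_rows)
print(unique_rows)
```

DISTINCT applies to the whole selected row, so no aggregate function is needed to deduplicate the product/customer pairs.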