Find Average Sales Amount Per Customer using AdventureWorks2012 - sql

I am following a tutorial using MS AdventureWorks2012 and wanted to write a query to find the average sales amount per customer. Below is my attempt, and it doesn't run. What am I doing wrong here?
SELECT soh.CustomerID AS 'Customer ID'
,p.FirstName + ' ' + p.LastName AS 'Customer Name'
,AVG(soh.TotalDue) AS 'Average Sales Amount Per Customer'
FROM Sales.SalesOrderHeader AS soh
INNER JOIN Sales.Customer AS c ON c.CustomerID = soh.CustomerID
INNER JOIN Person.BusinessEntityContact AS bec ON bec.PersonID = c.PersonID
INNER JOIN Person.Person AS p ON p.BusinessEntityID = bec.BusinessEntityID
GROUP BY p.FirstName , p.LastName, soh.CustomerID;

Your query runs, it just returns an empty result set.
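If you look at BusinessEntityContact, it relates a BusinessEntityID, which is a customer business, to a PersonID, who is the person that is the contact for the business. Your last join matches Person.Person against the business's ID rather than the contact's, so no rows come back. So if you change your query to this: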
SELECT soh.CustomerID AS 'Customer ID', p.FirstName + ' ' + p.LastName AS 'Customer Name',
AVG(soh.TotalDue) AS 'Average Sales Amount Per Customer'
FROM Sales.SalesOrderHeader AS soh
INNER JOIN Sales.Customer AS c ON c.CustomerID = soh.CustomerID
INNER JOIN Person.BusinessEntityContact AS bec ON bec.PersonID = c.PersonID
INNER JOIN Person.Person AS p ON p.BusinessEntityID = bec.PersonID
GROUP BY p.FirstName , p.LastName, soh.CustomerID;
(note the third inner join)
You will get 635 rows.

In SQL, a single quote ' identifies a string. If you must have spaces in your result column names, you can identify column names with double quotes " or, preferably, by wrapping the column name in square brackets.
SELECT
soh.CustomerID AS [Customer ID]
,p.FirstName + ' ' + p.LastName AS [Customer Name]
,AVG(soh.TotalDue) AS [Average Sales Amount Per Customer]
FROM Sales.SalesOrderHeader AS soh
INNER JOIN Sales.Customer AS c ON c.CustomerID = soh.CustomerID
INNER JOIN Person.BusinessEntityContact AS bec ON bec.PersonID = c.PersonID
INNER JOIN Person.Person AS p ON p.BusinessEntityID = bec.BusinessEntityID
GROUP BY p.FirstName , p.LastName, soh.CustomerID;
Otherwise your syntax looks okay to me.

Related

Top sales performer is in each month for a specified year

Using the adventure works 2017 test database I need to see who the top sales performer is in each month for a specified year. The management is only interested in the sales of bike “Components”. Create a stored procedure to get this information.
The year must be an input parameter.
Show the firstname and surname in one field.
Show the total value of the sales and the month for each top performer.
Additional marks will be allocated for using a single statement
So far I have this:
CREATE PROCEDURE getTopSalesByYear (@Year int)
AS
BEGIN
SET NOCOUNT ON
SELECT FirstName + ' ' + LastName AS SalesPerson,
sp.BusinessEntityID,
DATENAME(MONTH,SOH.OrderDate) as SalesMonth,
SUM(SOH.SubTotal) AS TotalSales FROM sales.SalesOrderHeader SOH
INNER JOIN sales.SalesOrderDetail SOD ON SOH.SalesOrderId = SOD.SalesOrderId
INNER JOIN sales.SalesPerson sp on soh.SalesPersonID = sp.BusinessEntityID
INNER JOIN Person.Person p on p.BusinessEntityID = sp.BusinessEntityID
INNER JOIN Production.Product Pr on sod.ProductID = pr.ProductID
INNER JOIN Production.ProductCategory pc on pc.ProductCategoryID = pr.ProductSubcategoryID
INNER JOIN Production.ProductSubcategory psc on pc.ProductCategoryID = pc.ProductcategoryID
WHERE psc.ProductCategoryID = 2
GROUP BY p.FirstName,p.LastName,sp.BusinessEntityID, DATENAME(MONTH,SOH.OrderDate)
ORDER BY TotalSales desc
This is what I have so far, but it needs to be a procedure that takes the year as a parameter. I also do not know where in the query the parameter should be used.
You need to use ROW_NUMBER to get the best performer per month.
To check that a date is within a particular year, instead of comparing using the YEAR function, it is best to calculate the beginning and end points. The end should be exclusive; this is called a half-open interval.
You can also group by EOMONTH (end of month), which can be a little more efficient than grouping by DATENAME(MONTH, ...). You can calculate the actual month name afterwards.
CREATE PROCEDURE getTopSalesByYear (@Year int)
AS
SET NOCOUNT ON;
SELECT
s.FirstName + ' ' + s.LastName AS SalesPerson,
DATENAME(MONTH, s.SalesMonth) AS SalesMonth,
s.TotalSales
FROM (
SELECT
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate) AS SalesMonth,
SUM(SOD.LineTotal) AS TotalSales, -- sum the detail lines; SOH.SubTotal would repeat once per joined line
ROW_NUMBER() OVER (PARTITION BY EOMONTH(SOH.OrderDate) ORDER BY SUM(SOD.LineTotal) DESC) AS rn
FROM sales.SalesOrderHeader SOH
INNER JOIN sales.SalesOrderDetail SOD ON SOH.SalesOrderId = SOD.SalesOrderId
INNER JOIN sales.SalesPerson sp on soh.SalesPersonID = sp.BusinessEntityID
INNER JOIN Person.Person p on p.BusinessEntityID = sp.BusinessEntityID
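INNER JOIN Production.Product Pr on sod.ProductID = pr.ProductID
INNER JOIN Production.ProductSubcategory psc on psc.ProductSubcategoryID = pr.ProductSubcategoryID
WHERE psc.ProductCategoryID = 2 -- the category lives on ProductSubcategory, not on Product
AND SOH.OrderDate >= DATEFROMPARTS(@Year , 1, 1)
AND SOH.OrderDate < DATEFROMPARTS(@Year + 1, 1, 1)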
GROUP BY
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate)
) s
WHERE s.rn = 1
ORDER BY s.SalesMonth; -- qualified with s so the sort uses the date, not the month-name alias
GO
Note that this will only give you results for a month if there are actually sales in that month. If there are no sales, you will not get 0, there will be no row for that month.
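The same two ideas (a half-open date range and ROW_NUMBER per month) can be tried without AdventureWorks. The following is only a sketch on made-up data: the Sales CTE, the names Ann and Bob, and the amounts are all assumptions for illustration, not part of the database.

```sql
-- Toy data only: SalesPerson names and Amount values are invented.
DECLARE @Year int = 2013;

WITH Sales AS (
    SELECT v.SalesPerson, CAST(v.OrderDate AS date) AS OrderDate, v.Amount
    FROM (VALUES
        ('Ann', '2013-01-05', 100.00),
        ('Bob', '2013-01-20', 250.00),
        ('Ann', '2013-02-11', 300.00),
        ('Bob', '2013-12-31',  50.00),
        ('Bob', '2014-01-01', 999.00)   -- excluded: sits on the exclusive upper bound
    ) AS v (SalesPerson, OrderDate, Amount)
),
Ranked AS (
    SELECT
        SalesPerson,
        EOMONTH(OrderDate) AS SalesMonth,
        SUM(Amount) AS TotalSales,
        -- rank sales people within each month, best total first
        ROW_NUMBER() OVER (PARTITION BY EOMONTH(OrderDate)
                           ORDER BY SUM(Amount) DESC) AS rn
    FROM Sales
    WHERE OrderDate >= DATEFROMPARTS(@Year, 1, 1)       -- inclusive start
      AND OrderDate <  DATEFROMPARTS(@Year + 1, 1, 1)   -- exclusive end
    GROUP BY SalesPerson, EOMONTH(OrderDate)
)
SELECT SalesPerson, DATENAME(MONTH, SalesMonth) AS SalesMonth, TotalSales
FROM Ranked
WHERE rn = 1
ORDER BY Ranked.SalesMonth;   -- sort by the date, not by the month name
```

Only months that have at least one row in the toy data appear in the result, which matches the note above about months without sales.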
Thank you for the assistance, see below finished and working query.
CREATE PROCEDURE getTopSalesByYear (@Year int)
AS
SET NOCOUNT ON;
SELECT
s.FirstName + ' ' + s.LastName AS SalesPerson,
DATENAME(MONTH, s.SalesMonth) AS SalesMonth,
s.TotalSales
FROM (
SELECT
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate) AS SalesMonth,
SUM(SOH.SubTotal) AS TotalSales,
ROW_NUMBER() OVER (PARTITION BY EOMONTH(SOH.OrderDate) ORDER BY SUM(SOH.SubTotal) DESC) AS rn
FROM sales.SalesOrderHeader SOH
INNER JOIN sales.SalesOrderDetail SOD ON SOH.SalesOrderId = SOD.SalesOrderId
INNER JOIN sales.SalesPerson sp on soh.SalesPersonID = sp.BusinessEntityID
INNER JOIN Person.Person p on p.BusinessEntityID = sp.BusinessEntityID
INNER JOIN Production.Product Pr on sod.ProductID = pr.ProductID
WHERE Pr.ProductSubcategoryID = 2
AND SOH.OrderDate >= DATEFROMPARTS(@Year , 1, 1)
AND SOH.OrderDate < DATEFROMPARTS(@Year + 1, 1, 1)
GROUP BY
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate)
) s
WHERE s.rn = 1
ORDER BY SalesMonth;
GO
--Execute
--Exec getTopSalesByYear 2011

get no value when I run the query to this question

The assignment question is...
List the order's customer name, order status, date ordered, count of items on the order, and average quantity ordered where the count of items on the order is greater than 300.
I'm using Adventure Works 2019 for the assignment.
The answer I have been able to come up with is....
SELECT
LastName + ', ' + FirstName AS 'Customer Name',
ssoh.Status AS 'Order Status',
ssoh.OrderDate AS 'Date Order',
SUM(ssod.Orderqty) AS 'Count of Items',
AVG(ssod.Orderqty) AS 'Average Quantity'
FROM
Person.Person pp
JOIN Sales.SalesOrderHeader ssoh ON pp.BusinessEntityID = ssoh.CustomerID
JOIN Sales.SalesOrderDetail ssod on ssoh.SalesOrderID = ssod.SalesOrderid
GROUP BY
LastName + ', ' + FirstName,
ssoh.OrderDate,
ssoh.Status
HAVING
SUM(ssod.OrderQty) > 300
When I change the inner join to an outer join, I get null values for the customer name. I double-checked that the "CustomerID" foreign key is the same as the primary key "BusinessEntityID", and I still get no results. Any ideas would be greatly appreciated. Thanks
Based on having a quick look at the schema online (I don't have it installed), it's my understanding that you need to join Person.Person through Sales.Customer to Sales.SalesOrderHeader. So perhaps try the following:
FROM
Person.Person pp
JOIN Sales.Customer sc ON pp.BusinessEntityID= sc.PersonID
JOIN Sales.SalesOrderHeader ssoh ON sc.CustomerID= ssoh.CustomerID
JOIN Sales.SalesOrderDetail ssod on ssoh.SalesOrderID = ssod.SalesOrderid
First, grouping by LastName + ', ' + FirstName alone will give you inaccurate results, because roughly 380 names occur more than once.
Second, Manachi is right about the join.
So, your final query should be like this:
WITH cte AS (
SELECT
BusinessEntityID AS CustomerId,
LastName + ', ' + FirstName AS 'Customer Name',
ssoh.Status AS 'Order Status',
ssoh.OrderDate AS 'Date Order',
SUM(ssod.Orderqty) AS 'Count of Items',
AVG(ssod.Orderqty) AS 'Average Quantity'
FROM
Person.Person pp
JOIN Sales.Customer c ON c.PersonID = pp.BusinessEntityID
JOIN Sales.SalesOrderHeader ssoh ON c.CustomerID = ssoh.CustomerID
JOIN Sales.SalesOrderDetail ssod on ssoh.SalesOrderID = ssod.SalesOrderid
GROUP BY
BusinessEntityID,
LastName + ', ' + FirstName,
ssoh.OrderDate,
ssoh.Status
HAVING
SUM(ssod.OrderQty) > 300
)
SELECT
[Customer Name],
[Order Status],
[Date Order],
[Count of Items],
[Average Quantity]
FROM cte
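To see why grouping by the name alone is a problem, here is a minimal sketch on made-up rows. The Orders CTE, the IDs 10 and 11, and the quantities are assumptions for illustration only.

```sql
-- Toy data: two different people who happen to share a name.
WITH Orders AS (
    SELECT v.PersonID, v.FullName, v.Qty
    FROM (VALUES
        (10, 'Smith, John', 200),
        (11, 'Smith, John', 150)
    ) AS v (PersonID, FullName, Qty)
)
-- Grouping by name merges both people into one row.
SELECT 'by name' AS GroupedBy, FullName, SUM(Qty) AS TotalQty
FROM Orders
GROUP BY FullName
UNION ALL
-- Adding the key keeps them apart: one row per person.
SELECT 'by id and name', FullName, SUM(Qty)
FROM Orders
GROUP BY PersonID, FullName;
```

With a HAVING SUM(...) > 300 filter, the merged row would pass even though neither person reaches 300 alone, which is why the query above also groups by BusinessEntityID.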

Retrieving records with inner joins

My assignment is to get the first name, middle name and last name for all customers that have had an order before '2012-09-30' and after '2013-09-30'. I'm using AdventureWorks2017 as a sample DB.
Table: Sales.SalesOrderHeader
[SalesOrderID]
,[RevisionNumber]
,[OrderDate]
,[DueDate]
,[ShipDate]
,[Status]
,[OnlineOrderFlag]
,[SalesOrderNumber]
,[PurchaseOrderNumber]
,[AccountNumber]
,[CustomerID]
,[SalesPersonID]
,[TerritoryID]
,[BillToAddressID]
,[ShipToAddressID]
,[ShipMethodID]
,[CreditCardID]
,[CreditCardApprovalCode]
,[CurrencyRateID]
,[SubTotal]
,[TaxAmt]
,[Freight]
,[TotalDue]
,[Comment]
,[rowguid]
,[ModifiedDate]
Table: Person.Person
[BusinessEntityID]
,[PersonType]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[Suffix]
,[EmailPromotion]
,[AdditionalContactInfo]
,[Demographics]
,[rowguid]
,[ModifiedDate]
Table: Sales.Customer
[CustomerID]
,[PersonID]
,[StoreID]
,[TerritoryID]
,[AccountNumber]
,[rowguid]
,[ModifiedDate]
My Query
SELECT DISTINCT person_table.FirstName,
person_table.MiddleName,
person_table.LastName
FROM Sales.SalesOrderHeader as sales_order_table
inner join Sales.Customer as sales_customer_table
on (sales_customer_table.CustomerID = sales_order_table.CustomerID
and sales_order_table.OrderDate <= '2012-09-30' )
inner join Sales.Customer as sales_customer_table2
on (sales_customer_table2.CustomerID = sales_order_table.CustomerID
and sales_order_table.OrderDate >= '2013-06-30' )
inner join Sales.Customer as match_result
on (match_result.CustomerID = sales_customer_table2.CustomerID)
inner join Person.Person as person_table
on (person_table.BusinessEntityID = match_result.PersonID)
In its current state the query returns no rows, and I'm unsure where the problem is.
[UPDATE]
I found a relatively good solution to the problem by editing Bilal Fakih's answer:
SELECT DISTINCT person_table.FirstName,
person_table.MiddleName,
person_table.LastName,
count(*) as Total_Instanses
FROM Sales.SalesOrderHeader as sales_order_table
inner join Sales.Customer as sales_customer_table
on (sales_customer_table.CustomerID = sales_order_table.CustomerID)
inner join Person.Person as person_table
on (person_table.BusinessEntityID = sales_customer_table.PersonID)
WHERE sales_order_table.OrderDate NOT BETWEEN '2012-09-30' AND '2013-06-30'
GROUP BY person_table.FirstName,
person_table.MiddleName,
person_table.LastName
HAVING count(*) >= 2
The suggestion was good, but it would return records that only had one instance. I'm running into a few corner cases now. For example, a person who has made 2 orders that are both before 2012 or both after 2013 will still be shown. The result I'm looking for is for a person to show up only when they have made orders before AND after the given dates.
Try this. I'm not sure it works, since I don't have the dataset to test, but it should:
SELECT DISTINCT person_table.FirstName,
person_table.MiddleName,
person_table.LastName
FROM Sales.SalesOrderHeader as sales_order_table
inner join Sales.Customer as sales_customer_table
on (sales_customer_table.CustomerID = sales_order_table.CustomerID)
inner join Person.Person as person_table
on (person_table.BusinessEntityID = sales_customer_table.PersonID)
WHERE sales_order_table.OrderDate NOT BETWEEN '2012-09-30' AND '2013-06-30'
You could simplify this using the query below. Also, your date filter was not correct.
SELECT DISTINCT p.FirstName,
p.MiddleName,
p.LastName
FROM Sales.SalesOrderHeader as s
INNER JOIN Sales.Customer as c
ON c.CustomerID = s.CustomerID
INNER JOIN Person.Person as p
ON p.BusinessEntityID = c.PersonID
WHERE s.OrderDate >= '2012-09-30' -- add this
AND s.OrderDate <= '2013-06-30' -- and this
My assignment is to get the first name, middle name and last name for all customers that have had an order before '2012-09-30' and after '2013-09-30'.
One method uses aggregation:
SELECT p.FirstName, p.MiddleName, p.LastName
FROM Person.Person p JOIN
Sales.Customer c
ON p.BusinessEntityID = c.PersonID JOIN
Sales.SalesOrderHeader so
ON c.CustomerID = so.CustomerID
GROUP BY p.FirstName, p.MiddleName, p.LastName
HAVING MIN(so.OrderDate) < '2012-09-30' AND
MAX(so.OrderDate) >'2013-06-30';
I will say that this condition looks suspicious:
ON p.BusinessEntityID = c.PersonID
However, that is what you use in your query. I would expect the person table to have an id called something like PersonId.
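The aggregation approach can be checked on a handful of made-up rows. This is a sketch only: the Orders CTE and the customer IDs 1 to 3 are assumptions, and the cut-off dates are the ones used in the queries above.

```sql
-- Toy data: one customer per interesting case.
WITH Orders AS (
    SELECT v.CustomerID, CAST(v.OrderDate AS date) AS OrderDate
    FROM (VALUES
        (1, '2012-05-01'), (1, '2013-08-15'),   -- before and after: qualifies
        (2, '2012-01-10'), (2, '2012-03-20'),   -- two orders, both before: excluded
        (3, '2013-07-04')                       -- only after: excluded
    ) AS v (CustomerID, OrderDate)
)
SELECT CustomerID
FROM Orders
GROUP BY CustomerID
HAVING MIN(OrderDate) < '2012-09-30'    -- earliest order is before the first date
   AND MAX(OrderDate) > '2013-06-30';   -- latest order is after the second date
```

Customer 2 is the corner case from the update: two orders outside the range would pass a HAVING count(*) >= 2 test, but fail here because both fall on the same side.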

SQL Distinct Sum

SELECT DISTINCT
E.FirstName + ' ' + E.LastName [Full Name],
P.ProductName,
OD.Quantity
FROM Employees E,
Products P,
[Order Details] OD,
Orders O
WHERE
E.EmployeeID = O.EmployeeID
AND O.OrderID = OD.OrderID
AND OD.ProductID = P.ProductID
In the Northwind database, this gives back duplicate full names and product names, because the Quantity differs from row to row (one row per order).
I want to show each name only once per ProductName, with the total quantity rather than split across rows.
You need to use GROUP BY with SUM:
SELECT
e.FirstName + ' ' + e.LastName AS [Full Name],
p.ProductName,
SUM(od.Quantity) AS [Quantity]
FROM Employees e
INNER JOIN Orders o
ON o.EmployeeID = e.EmployeeID
INNER JOIN [Order Details] od
ON od.OrderID = o.OrderID
INNER JOIN Products p
ON p.ProductID = od.ProductID
GROUP BY
e.FirstName + ' ' + e.LastName,
p.ProductName
Note, you need to stop using the old-style JOIN syntax.
I think this was a good question for discussion.
The correct query always depends on your actual requirement.
I think your tables are too normalized. In such situations, many designs also keep the EmployeeID in the order detail table.
At the same time, many keep summed values in the Orders table,
such as the sum of quantity or sum of amount per OrderID.
You can also create a view, without aggregate functions, that joins all the tables.
IMHO, using a GROUP BY clause on so many columns, and on varchar columns at that, is a bad idea.
Try something like this:
;With CTE as
(
SELECT
E.FirstName + ' ' + E.LastName [Full Name],
O.OrderID, OD.Quantity AS qty, P.ProductName
FROM Employees E
inner join Orders O on E.EmployeeID = O.EmployeeID
inner join [Order Details] OD on o.orderid=od.orderid
inner join [Products] P on p.ProductID=od.ProductID
)
,CTE1 as
(
select c.orderid, sum(c.qty) TotalQty
from CTE c
group by c.orderid
)
select c.[Full Name],c1.TotalQty, c.ProductName from cte c
inner join cte1 c1 on c.orderid=c1.orderid

SQL Left Joining four different tables

From AdventureWorks2012, I want to write a query using the Sales.SalesOrderHeader, Sales.Customer, Sales.Store, and Person.Person tables, showing the SalesOrderID, StoreName, the customer’s first and last name as CustomerName and the salesperson’s first and last names as SalesPersonName. I want to do a left join with Sales.Customer to the Sales.Store and Person.Person tables.
Here is my work so far. However, the CustomerName and SalesPersonName both have the same information when they should be different.
SELECT soh.SalesOrderID, ST.Name AS StoreName, pp.[PersonType], pp.[FirstName] + [LastName] AS CustomerName,
pp.[FirstName] + [LastName] AS SalesPersonName
FROM Sales.SalesOrderHeader soh
JOIN Sales.Customer SC ON soh.SalesOrderID = sc.CustomerID
JOIN Sales.Store ST ON sc.CustomerID = ST.BusinessEntityID
JOIN Person.Person PP ON ST.BusinessEntityID = PP.BusinessEntityID
WHERE Persontype LIKE 'SP%'
You're getting bad results because your joins are wrong. You should join on fields that represent a relation between the two tables. For example:
JOIN Sales.Customer sc ON soh.CustomerID = sc.CustomerID
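Following the same principle for the remaining tables, the full query could look roughly like the sketch below. It assumes the standard AdventureWorks2012 keys (Customer.StoreID and Customer.PersonID point to Store and Person, and SalesOrderHeader.SalesPersonID points to a person), and the aliases cp and spp are hypothetical names chosen here. Person.Person is joined twice under different aliases so that the customer and the salesperson can differ.

```sql
SELECT
    soh.SalesOrderID,
    st.Name AS StoreName,
    cp.FirstName + ' ' + cp.LastName AS CustomerName,
    spp.FirstName + ' ' + spp.LastName AS SalesPersonName
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.Customer AS sc
    ON sc.CustomerID = soh.CustomerID          -- order -> customer
LEFT JOIN Sales.Store AS st
    ON st.BusinessEntityID = sc.StoreID        -- customer -> store (may be NULL)
LEFT JOIN Person.Person AS cp
    ON cp.BusinessEntityID = sc.PersonID       -- customer -> person (may be NULL)
LEFT JOIN Person.Person AS spp
    ON spp.BusinessEntityID = soh.SalesPersonID;  -- order -> salesperson (may be NULL)
```

The left joins keep orders whose customer has no store, no person, or no salesperson; in those rows the corresponding name columns come back as NULL.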