Retrieving records with inner joins - sql

My assignment is to get the First name, Middle name and Last name for all Customers that have had an order before '2012-09-30' and after '2013-09-30'. I'm using AdventureWorks2017 as a sample DB.
Table: Sales.SalesOrderHeader
[SalesOrderID]
,[RevisionNumber]
,[OrderDate]
,[DueDate]
,[ShipDate]
,[Status]
,[OnlineOrderFlag]
,[SalesOrderNumber]
,[PurchaseOrderNumber]
,[AccountNumber]
,[CustomerID]
,[SalesPersonID]
,[TerritoryID]
,[BillToAddressID]
,[ShipToAddressID]
,[ShipMethodID]
,[CreditCardID]
,[CreditCardApprovalCode]
,[CurrencyRateID]
,[SubTotal]
,[TaxAmt]
,[Freight]
,[TotalDue]
,[Comment]
,[rowguid]
,[ModifiedDate]
Table: Person.Person
[BusinessEntityID]
,[PersonType]
,[NameStyle]
,[Title]
,[FirstName]
,[MiddleName]
,[LastName]
,[Suffix]
,[EmailPromotion]
,[AdditionalContactInfo]
,[Demographics]
,[rowguid]
,[ModifiedDate]
Table: Sales.Customer
[CustomerID]
,[PersonID]
,[StoreID]
,[TerritoryID]
,[AccountNumber]
,[rowguid]
,[ModifiedDate]
My Query
SELECT DISTINCT person_table.FirstName,
person_table.MiddleName,
person_table.LastName
FROM Sales.SalesOrderHeader as sales_order_table
inner join Sales.Customer as sales_customer_table
on (sales_customer_table.CustomerID = sales_order_table.CustomerID
and sales_order_table.OrderDate <= '2012-09-30' )
inner join Sales.Customer as sales_customer_table2
on (sales_customer_table2.CustomerID = sales_order_table.CustomerID
and sales_order_table.OrderDate >= '2013-06-30' )
inner join Sales.Customer as match_result
on (match_result.CustomerID = sales_customer_table2.CustomerID)
inner join Person.Person as person_table
on (person_table.BusinessEntityID = match_result.PersonID)
In its current state it returns no rows, and I'm unsure where the problem is.
[UPDATE]
Found a relatively good solution to the problem by editing Bilal Fakih's answer:
SELECT DISTINCT person_table.FirstName,
person_table.MiddleName,
person_table.LastName,
count(*) as Total_Instanses
FROM Sales.SalesOrderHeader as sales_order_table
inner join Sales.Customer as sales_customer_table
on (sales_customer_table.CustomerID = sales_order_table.CustomerID)
inner join Person.Person as person_table
on (person_table.BusinessEntityID = sales_customer_table.PersonID)
WHERE sales_order_table.OrderDate NOT BETWEEN '2012-09-30' AND '2013-06-30'
GROUP BY person_table.FirstName,
person_table.MiddleName,
person_table.LastName
HAVING count(*) >= 2
The suggestion was good, but it would return records that had only one instance. I'm running into a few corner cases now. For example, a person who has made two orders that are both before 2012 or both after 2013 will still be shown. The result I'm looking for is for a person to show up only when they have made orders before AND after the given dates.

Try this. I'm not sure if it works, as I don't have the dataset to test it, but it should:
SELECT DISTINCT person_table.FirstName,
person_table.MiddleName,
person_table.LastName
FROM Sales.SalesOrderHeader as sales_order_table
inner join Sales.Customer as sales_customer_table
on (sales_customer_table.CustomerID = sales_order_table.CustomerID)
inner join Person.Person as person_table
on (person_table.BusinessEntityID = sales_customer_table.PersonID)
WHERE sales_order_table.OrderDate NOT BETWEEN '2012-09-30' AND '2013-06-30'

You could simplify this using the query below. Also, your date filter was not correct.
SELECT DISTINCT p.FirstName,
p.MiddleName,
p.LastName
FROM Sales.SalesOrderHeader as s
INNER JOIN Sales.Customer as c
ON c.CustomerID = s.CustomerID
INNER JOIN Person.Person as p
ON p.BusinessEntityID = c.PersonID
WHERE s.OrderDate >= '2012-09-30' -- add this
AND s.OrderDate <= '2013-06-30' -- and this

My assignment is to get the First name, Middle name and Last name for all Customers that have had an order before '2012-09-30' and after '2013-09-30'.
One method uses aggregation:
SELECT p.FirstName, p.MiddleName, p.LastName
FROM Person.Person p JOIN
Sales.Customer c
ON p.BusinessEntityID = c.PersonID JOIN
Sales.SalesOrderHeader so
ON c.CustomerID = so.CustomerID
GROUP BY p.FirstName, p.MiddleName, p.LastName
HAVING MIN(so.OrderDate) < '2012-09-30' AND
MAX(so.OrderDate) > '2013-06-30';
I will say that this condition looks suspicious:
ON p.BusinessEntityID = c.PersonID
However, that is what you use in your query. I would expect the person table to have an id called something like PersonId.
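To see why the MIN/MAX aggregation handles the corner cases that NOT BETWEEN plus COUNT(*) >= 2 lets through, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module rather than SQL Server, and the one-table `orders` schema and the customer names are invented for illustration; only the two cutoff dates come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data standing in for the AdventureWorks tables.
conn.executescript("""
CREATE TABLE orders (customer TEXT, order_date TEXT);
INSERT INTO orders VALUES
  ('Ann', '2012-01-15'), ('Ann', '2013-08-01'),  -- one before, one after
  ('Bob', '2011-05-02'), ('Bob', '2012-03-09'),  -- two orders, both before
  ('Cy',  '2013-07-20'), ('Cy',  '2013-12-24'),  -- two orders, both after
  ('Dee', '2012-12-31');                         -- only inside the gap
""")

# A customer qualifies only if the earliest order is before the first
# cutoff AND the latest order is after the second cutoff.
both_sides = [row[0] for row in conn.execute("""
    SELECT customer
    FROM orders
    GROUP BY customer
    HAVING MIN(order_date) < '2012-09-30'
       AND MAX(order_date) > '2013-06-30'
    ORDER BY customer
""")]

# The NOT BETWEEN + COUNT(*) >= 2 variant also accepts customers whose
# orders all fall on the same side of the gap.
not_between = [row[0] for row in conn.execute("""
    SELECT customer
    FROM orders
    WHERE order_date NOT BETWEEN '2012-09-30' AND '2013-06-30'
    GROUP BY customer
    HAVING COUNT(*) >= 2
    ORDER BY customer
""")]

print(both_sides)   # ['Ann']
print(not_between)  # ['Ann', 'Bob', 'Cy']
```

The dates are stored as ISO-8601 strings, so plain string comparison orders them correctly in this sketch.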

Related

Top sales performer is in each month for a specified year

Using the adventure works 2017 test database I need to see who the top sales performer is in each month for a specified year. The management is only interested in the sales of bike “Components”. Create a stored procedure to get this information.
The year must be an input parameter.
Show the firstname and surname in one field.
Show the total value of the sales and the month for each top performer.
Additional marks will be allocated for using a single statement
So far I have this:
CREATE PROCEDURE getTopSalesByYear (@Year int)
AS
BEGIN
SET NOCOUNT ON
SELECT FirstName + ' ' + LastName AS SalesPerson,
sp.BusinessEntityID,
DATENAME(MONTH,SOH.OrderDate) as SalesMonth,
SUM(SOH.SubTotal) AS TotalSales FROM sales.SalesOrderHeader SOH
INNER JOIN sales.SalesOrderDetail SOD ON SOH.SalesOrderId = SOD.SalesOrderId
INNER JOIN sales.SalesPerson sp on soh.SalesPersonID = sp.BusinessEntityID
INNER JOIN Person.Person p on p.BusinessEntityID = sp.BusinessEntityID
INNER JOIN Production.Product Pr on sod.ProductID = pr.ProductID
INNER JOIN Production.ProductCategory pc on pc.ProductCategoryID = pr.ProductSubcategoryID
INNER JOIN Production.ProductSubcategory psc on pc.ProductCategoryID = pc.ProductcategoryID
WHERE psc.ProductCategoryID = 2
GROUP BY p.FirstName,p.LastName,sp.BusinessEntityID, DATENAME(MONTH,SOH.OrderDate)
ORDER BY TotalSales desc
This is what I have so far, but it needs to be a procedure that takes the year as a parameter. I also do not know where in the query the parameter should be used.
You need to use ROW_NUMBER to get the best performer per month.
To check that a date is within a particular year, instead of comparing using the YEAR function, it is best to calculate the beginning and end points. The end should be exclusive; this is called a half-open interval.
You can also group by EOMONTH (end of month), which can be a little more efficient than grouping by DATENAME(MONTH, ...); you can calculate the actual month name afterwards.
CREATE PROCEDURE getTopSalesByYear (@Year int)
AS
SET NOCOUNT ON;
SELECT
s.FirstName + ' ' + s.LastName AS SalesPerson,
DATENAME(MONTH, s.SalesMonth) AS SalesMonth,
s.TotalSales
FROM (
SELECT
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate) AS SalesMonth,
SUM(SOD.LineTotal) AS TotalSales, -- sum the filtered detail lines; SOH.SubTotal would repeat once per detail row
ROW_NUMBER() OVER (PARTITION BY EOMONTH(SOH.OrderDate) ORDER BY SUM(SOD.LineTotal) DESC) AS rn
FROM sales.SalesOrderHeader SOH
INNER JOIN sales.SalesOrderDetail SOD ON SOH.SalesOrderId = SOD.SalesOrderId
INNER JOIN sales.SalesPerson sp on soh.SalesPersonID = sp.BusinessEntityID
INNER JOIN Person.Person p on p.BusinessEntityID = sp.BusinessEntityID
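INNER JOIN Production.Product Pr on sod.ProductID = pr.ProductID
INNER JOIN Production.ProductSubcategory psc on psc.ProductSubcategoryID = pr.ProductSubcategoryID
WHERE psc.ProductCategoryID = 2
AND SOH.OrderDate >= DATEFROMPARTS(@Year , 1, 1)
AND SOH.OrderDate < DATEFROMPARTS(@Year + 1, 1, 1)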
GROUP BY
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate)
) s
WHERE s.rn = 1
ORDER BY SalesMonth;
GO
Note that this will only give you results for a month if there are actually sales in that month. If there are no sales, you will not get 0, there will be no row for that month.
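The two ideas above (a half-open date interval for the year, and ROW_NUMBER to keep one row per month) can be tried on a small scale. This is only a sketch under stated assumptions: it uses SQLite via Python's sqlite3 (window functions need SQLite 3.25 or later), `strftime('%Y-%m', ...)` stands in for EOMONTH, and the `sales` table, its columns, and the function name `top_seller_per_month` are hypothetical.

```python
import sqlite3

def top_seller_per_month(conn, year):
    """Return (month, person, total) for the best seller of each month."""
    # Half-open interval: start is inclusive, end is exclusive.
    start = f"{year:04d}-01-01"
    end = f"{year + 1:04d}-01-01"
    return conn.execute("""
        WITH monthly AS (
            SELECT strftime('%Y-%m', order_date) AS sales_month,
                   person,
                   SUM(amount) AS total
            FROM sales
            WHERE order_date >= ? AND order_date < ?
            GROUP BY strftime('%Y-%m', order_date), person
        ),
        ranked AS (
            SELECT sales_month, person, total,
                   ROW_NUMBER() OVER (PARTITION BY sales_month
                                      ORDER BY total DESC) AS rn
            FROM monthly
        )
        SELECT sales_month, person, total
        FROM ranked
        WHERE rn = 1
        ORDER BY sales_month
    """, (start, end)).fetchall()

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: two sales people, sales in January and March only.
conn.executescript("""
CREATE TABLE sales (person TEXT, order_date TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('Ann', '2010-12-31', 500),
  ('Ann', '2011-01-05', 100),
  ('Ann', '2011-01-20', 50),
  ('Bob', '2011-01-11', 120),
  ('Bob', '2011-03-02', 80),
  ('Ann', '2011-03-15', 30),
  ('Bob', '2012-01-01', 999);
""")

result = top_seller_per_month(conn, 2011)
print(result)  # [('2011-01', 'Ann', 150), ('2011-03', 'Bob', 80)]
```

The rows dated 2010-12-31 and 2012-01-01 fall outside the half-open interval, and February produces no row at all because it has no sales.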
Thank you for the assistance; see the finished and working query below.
CREATE PROCEDURE getTopSalesByYear (@Year int)
AS
SET NOCOUNT ON;
SELECT
s.FirstName + ' ' + s.LastName AS SalesPerson,
DATENAME(MONTH, s.SalesMonth) AS SalesMonth,
s.TotalSales
FROM (
SELECT
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate) AS SalesMonth,
SUM(SOH.SubTotal) AS TotalSales,
ROW_NUMBER() OVER (PARTITION BY EOMONTH(SOH.OrderDate) ORDER BY SUM(SOH.SubTotal) DESC) AS rn
FROM sales.SalesOrderHeader SOH
INNER JOIN sales.SalesOrderDetail SOD ON SOH.SalesOrderId = SOD.SalesOrderId
INNER JOIN sales.SalesPerson sp on soh.SalesPersonID = sp.BusinessEntityID
INNER JOIN Person.Person p on p.BusinessEntityID = sp.BusinessEntityID
INNER JOIN Production.Product Pr on sod.ProductID = pr.ProductID
WHERE Pr.ProductSubcategoryID = 2
AND SOH.OrderDate >= DATEFROMPARTS(@Year , 1, 1)
AND SOH.OrderDate < DATEFROMPARTS(@Year + 1, 1, 1)
GROUP BY
p.FirstName,
p.LastName,
p.BusinessEntityID,
EOMONTH(SOH.OrderDate)
) s
WHERE s.rn = 1
ORDER BY SalesMonth;
GO
--Execute
--Exec getTopSalesByYear 2011

SQL Server query to select the highest quantity data row

I have been trying to find a similar case. I found a lot, but I still can't figure out how to adapt them to my query.
I have a testDB in SQL Server that has 3 tables, as shown in picture below:
I created query as below:
SELECT P.FirstName,
P.LastName,
O.ProductType,
PO.ProductName,
PO.Quantity
FROM Persons AS P
INNER JOIN Orders AS O ON P.PersonID = O.PersonID
INNER JOIN ProductOrders AS PO ON PO.OrderID = O.OrderID;
The current result shows all records from ProductOrders. See picture below:
I want a result that shows, for each person, only the record with the highest quantity. My expected result is shown in the picture below:
Thanks very much for your help.
SQL Server has the TOP WITH TIES/ROW_NUMBER() trick that does this very elegantly:
SELECT TOP (1) WITH TIES P.FirstName, P.LastName, O.ProductType, PO.ProductName, PO.Quantity
FROM Persons P INNER JOIN
Orders O
ON P.PersonID = O.PersonID INNER JOIN
ProductOrders PO
ON PO.OrderID = O.OrderID
ORDER BY ROW_NUMBER() OVER (PARTITION BY P.PersonID, O.ProductType ORDER BY PO.Quantity DESC);
Use Window functions:
SELECT distinct P.FirstName
, P.LastName
, O.ProductType
, first_value(PO.ProductName) OVER (Partition By P.FirstName, P.LastName, O.ProductType Order by PO.Quantity desc) as [Productname]
, max(PO.Quantity) OVER (Partition By P.FirstName, P.LastName, O.ProductType) as [Quantity]
FROM Persons AS P
INNER JOIN Orders AS O ON P.PersonID = O.PersonID
INNER JOIN ProductOrders AS PO ON PO.OrderID = O.OrderID;

Blank Table Result

I am having a problem with my database. I have replicated it using adventure works 2014.
I want to show all results where the BusinessEntityID shows more than once. So if a user has been a member of two departments, their ID will show twice.
But this is what I get with the below query.
SELECT Person.FirstName,
Person.LastName,
HumanResources.Department.Name AS CurrentDepartment,
StartDate,
EndDate
FROM AdventureWorks2014.Person.Person
JOIN HumanResources.EmployeeDepartmentHistory
ON HumanResources.EmployeeDepartmentHistory.BusinessEntityID = Person.BusinessEntityID
JOIN HumanResources.Department
ON EmployeeDepartmentHistory.DepartmentID = HumanResources.Department.DepartmentID
GROUP BY Person.BusinessEntityID,
HumanResources.Department.DepartmentID,
Person.FirstName,
Person.LastName,
HumanResources.Department.Name,
StartDate,
EndDate
HAVING COUNT(Person.BusinessEntityID) > 1
ORDER BY Person.LastName, StartDate
If I remove the HAVING, I do get a result returned (the whole table). So I think I know where the problem is, but not what it is or how to resolve it.
I'm going to assume your query works OK and that, without the GROUP BY, it returns all the employees. So you need to join with a list of employees that have more than one department:
JOIN (SELECT P.BusinessEntityID --, COUNT(EDH.DepartmentID) for debug
FROM AdventureWorks2014.Person.Person P
JOIN HumanResources.EmployeeDepartmentHistory EDH
ON P.BusinessEntityID = EDH.BusinessEntityID
GROUP BY P.BusinessEntityID
HAVING COUNT(EDH.DepartmentID) > 1
) as list_of_employees_with_two_or_more
ON AdventureWorks2014.Person.Person.BusinessEntityID =
list_of_employees_with_two_or_more.BusinessEntityID
WITH cte AS (
SELECT Person.FirstName AS FirstName,
Person.LastName AS LastName,
Person.BusinessEntityID AS BusinessEntityID
FROM AdventureWorks2014.Person.Person
INNER JOIN HumanResources.EmployeeDepartmentHistory
ON HumanResources.EmployeeDepartmentHistory.BusinessEntityID = Person.BusinessEntityID
INNER JOIN HumanResources.Department
ON EmployeeDepartmentHistory.DepartmentID = HumanResources.Department.DepartmentID
GROUP BY Person.FirstName,
Person.LastName,
Person.BusinessEntityID
HAVING COUNT(*) > 1
)
SELECT Person.FirstName,
Person.LastName,
HumanResources.Department.Name AS CurrentDepartment,
StartDate,
EndDate
FROM AdventureWorks2014.Person.Person
INNER JOIN HumanResources.EmployeeDepartmentHistory
ON HumanResources.EmployeeDepartmentHistory.BusinessEntityID = Person.BusinessEntityID
INNER JOIN HumanResources.Department
ON EmployeeDepartmentHistory.DepartmentID = HumanResources.Department.DepartmentID
INNER JOIN cte t
ON Person.FirstName = t.FirstName AND
Person.LastName = t.LastName AND
Person.BusinessEntityID = t.BusinessEntityID
You need to do the grouping separately. What I think you really want is people with more than one record in EmployeeDepartmentHistory, e.g.
SELECT BusinessEntityID
FROM HumanResources.EmployeeDepartmentHistory
GROUP BY BusinessEntityID
HAVING COUNT(*) > 1
I think the most efficient way to integrate this into your current query is by using EXISTS:
SELECT p.FirstName,
p.LastName,
d.Name AS CurrentDepartment,
edh.StartDate,
edh.EndDate
FROM Person.Person AS p
JOIN HumanResources.EmployeeDepartmentHistory AS edh
ON edh.BusinessEntityID = p.BusinessEntityID
JOIN HumanResources.Department AS d
ON d.DepartmentID = edh.DepartmentID
WHERE EXISTS
( SELECT 1
FROM HumanResources.EmployeeDepartmentHistory AS edh2
WHERE edh2.BusinessEntityID = p.BusinessEntityID
HAVING COUNT(*) > 1
)
ORDER BY p.LastName, StartDate
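The reason the original query comes back empty, and why grouping separately fixes it, can be reproduced on a tiny dataset. This is a sketch under assumptions: SQLite via Python's sqlite3, with hypothetical `employees` and `history` tables in place of Person.Person and EmployeeDepartmentHistory. The filter is written as IN over a grouped subquery, which is equivalent in effect to the EXISTS form above and avoids relying on HAVING without GROUP BY, which older SQLite versions reject.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: Ann has worked in two departments, Bob in one.
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE history (emp_id INTEGER, dept TEXT, start_date TEXT);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO history VALUES
  (1, 'Sales', '2010-01-01'),
  (1, 'Eng',   '2012-01-01'),
  (2, 'Sales', '2011-01-01');
""")

# Grouping by every selected column (including dept and start_date) makes
# each group a single row, so COUNT(*) > 1 is never true.
grouped_by_everything = conn.execute("""
    SELECT e.name, h.dept, h.start_date
    FROM employees e
    JOIN history h ON h.emp_id = e.id
    GROUP BY e.id, e.name, h.dept, h.start_date
    HAVING COUNT(*) > 1
""").fetchall()

# Counting per employee in a separate subquery keeps the detail rows intact.
grouped_separately = conn.execute("""
    SELECT e.name, h.dept, h.start_date
    FROM employees e
    JOIN history h ON h.emp_id = e.id
    WHERE e.id IN (SELECT emp_id
                   FROM history
                   GROUP BY emp_id
                   HAVING COUNT(*) > 1)
    ORDER BY e.name, h.start_date
""").fetchall()

print(grouped_by_everything)  # []
print(grouped_separately)
# [('Ann', 'Sales', '2010-01-01'), ('Ann', 'Eng', '2012-01-01')]
```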

SQL Left Joining four different tables

From AdventureWorks2012, I want to write a query using the Sales.SalesOrderHeader, Sales.Customer, Sales.Store, and Person.Person tables, showing the SalesOrderID, StoreName, the customer’s first and last name as CustomerName and the salesperson’s first and last names as SalesPersonName. I want to do a left join with Sales.Customer to the Sales.Store and Person.Person tables.
Here is my work so far. However, the CustomerName and SalesPersonName both have the same information when they should be different.
SELECT soh.SalesOrderID, ST.Name AS StoreName, pp.[PersonType], pp.[FirstName] + [LastName] AS CustomerName,
pp.[FirstName] + [LastName] AS SalesPersonName
FROM Sales.SalesOrderHeader soh
JOIN Sales.Customer SC ON soh.SalesOrderID = sc.CustomerID
JOIN Sales.Store ST ON sc.CustomerID = ST.BusinessEntityID
JOIN Person.Person PP ON ST.BusinessEntityID = PP.BusinessEntityID
WHERE Persontype LIKE 'SP%'
You're getting bad results because your joins are wrong. You should join on fields that represent a relation between the two tables. For example:
JOIN Sales.Customer sc ON soh.CustomerID = sc.CustomerID
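Separately from the join columns, the customer name and the salesperson name can only differ if the person table is joined twice, once per role, under two aliases. The following is a rough sketch of that idea, not the AdventureWorks schema: it uses SQLite via Python's sqlite3, and the `person` and `orders` tables with their `customer_person_id` and `sales_person_id` columns are simplified, hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy schema: each order points at a customer and,
# optionally, at a salesperson, both stored in the same person table.
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, first TEXT, last TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     customer_person_id INTEGER,
                     sales_person_id INTEGER);
INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
INSERT INTO orders VALUES (1, 1, 2), (2, 3, NULL);
""")

# One alias per role: cust resolves the customer, sp the salesperson.
rows = conn.execute("""
    SELECT o.order_id,
           cust.first || ' ' || cust.last AS customer_name,
           sp.first   || ' ' || sp.last   AS sales_person_name
    FROM orders o
    LEFT JOIN person cust ON cust.id = o.customer_person_id
    LEFT JOIN person sp   ON sp.id   = o.sales_person_id
    ORDER BY o.order_id
""").fetchall()

print(rows)  # [(1, 'Ann Lee', 'Bob Ray'), (2, 'Cy Poe', None)]
```

The LEFT JOIN keeps order 2 even though it has no salesperson; its name column is NULL because concatenating with NULL yields NULL.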

Include 0 when using count() SQL Server Left Join not working

Here I have used a left join on the person table because I want to include every record in that table even if it doesn't have an associated record in the task table. How can I resolve this to include the 0's in my results?
SELECT CONVERT(NVARCHAR, COUNT(p.personID)) AS count,
CONVERT(DECIMAL(4, 2), 1.0 * COUNT(p.personID) / DATEDIFF(DAY, @startDate, @endDate)) AS average,
p.personID,
p.firstname,
p.lastname,
c.companyname
FROM Tasks t
LEFT JOIN Person p
ON p.personID = t.personID
JOIN Client c
ON c.id = p.employer
JOIN Commission m
ON m.ClientID = c.ID
WHERE t.created BETWEEN @startDate AND @endDate
AND m.owner IN ( 'John Doe' )
GROUP BY p.personID,
p.firstname,
p.lastname,
c.companyname
ORDER BY c.companyname,
count DESC
I just figured out the answer to my problem... because I put conditions in the where clause on behalf of the other tables that I joined, it filtered out what I wanted. So I changed the person table to the "driving table" and I took the conditions from the where clause and put them in the join statement as I was joining the tasks table as follows:
SELECT
convert(nvarchar, COUNT(t.personID)) AS count,
CONVERT(decimal(4, 2), 1.0*COUNT(t.personID)
/ DATEDIFF(DAY, @startDate, @endDate)
) AS average,
p.personID,
p.firstname,
p.lastname,
c.companyname
FROM Person p
LEFT JOIN Tasks t
ON t.personID = p.personID AND t.created BETWEEN @startDate AND @endDate
JOIN Client c
ON c.id = p.employer
JOIN Commission m
ON m.ClientID = c.ID AND m.owner IN ('John Doe')
GROUP BY p.personID, p.firstname, p.lastname, c.companyname
ORDER BY c.companyname, count DESC
Provide filtered input through a subquery instead of the WHERE clause:
SELECT
convert(nvarchar, COUNT(t.personID)) AS count,
CONVERT(decimal(4, 2),1.0*COUNT(t.personID) / DATEDIFF(DAY, @startDate, @endDate)) AS average,
p.personID,
p.firstname,
p.lastname,
c.companyname
FROM Person p
LEFT JOIN (
select personId from tasks
where created BETWEEN @startDate AND @endDate
) t
ON t.personID = p.personID
JOIN Client c
ON c.id = p.employer
JOIN Commission m
ON m.ClientID = c.ID AND m.owner IN ('John Doe')
GROUP BY p.personID, p.firstname, p.lastname, c.companyname
ORDER BY c.companyname, count DESC
Maybe not the most useful in this particular example, but if the WHERE clause is complex and filtering more than one table, this is the answer.
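The difference between filtering the outer-joined table in WHERE and filtering it in the ON clause is easy to check on a small example. This sketch assumes SQLite via Python's sqlite3 and hypothetical `person` and `tasks` tables; the names and dates are invented. Note that it counts `t.person_id`, not `p.id`, so that people without matching tasks get 0 rather than 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: Ann has two tasks in range, Bob has one task
# outside the range, Cy has no tasks at all.
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tasks (person_id INTEGER, created TEXT);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO tasks VALUES
  (1, '2020-01-05'), (1, '2020-01-06'), (2, '2019-12-01');
""")
date_range = ("2020-01-01", "2020-01-31")

# Filter in WHERE: rows with NULL or out-of-range t.created are discarded
# after the join, which turns the LEFT JOIN into an inner join in effect.
filter_in_where = conn.execute("""
    SELECT p.name, COUNT(t.person_id)
    FROM person p
    LEFT JOIN tasks t ON t.person_id = p.id
    WHERE t.created BETWEEN ? AND ?
    GROUP BY p.id, p.name
    ORDER BY p.name
""", date_range).fetchall()

# Filter in ON: the condition only decides which tasks match, so every
# person survives and unmatched people count as 0.
filter_in_on = conn.execute("""
    SELECT p.name, COUNT(t.person_id)
    FROM person p
    LEFT JOIN tasks t ON t.person_id = p.id
                     AND t.created BETWEEN ? AND ?
    GROUP BY p.id, p.name
    ORDER BY p.name
""", date_range).fetchall()

print(filter_in_where)  # [('Ann', 2)]
print(filter_in_on)     # [('Ann', 2), ('Bob', 0), ('Cy', 0)]
```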