SQL: Including Null values with Outer Joins

Using this query, my output includes all values where COUNT is 0, but I still get the warning that Null values have not been included. The double joins are there to link the tables so that I can count the number of orders, but I want to include all NULLs as well, not just the rows where COUNT is 0. What am I missing?
SELECT EmpNo, LastName, COUNT(CustomerOrder.OrderNo)
FROM Employee
LEFT OUTER JOIN Customer
ON Customer.AcctRepNo = EmpNo
LEFT OUTER JOIN CustomerOrder
ON Customer.CustNo = CustomerOrder.CustNo
GROUP BY EmpNo, LastName
ORDER BY COUNT(CustomerOrder.OrderNo) DESC, LastName

The results are fine; the query is including all your rows. The message only says that rows where CustomerOrder.OrderNo is NULL are skipped by COUNT(CustomerOrder.OrderNo), which is why those employees show a count of zero.
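To see this behaviour in isolation, here is a minimal sketch using Python's built-in sqlite3 module. The three tables follow the question's names, but the sample rows and the extra COUNT(*) column are assumptions added for illustration; SQLite does not print the warning, it only shows the counting rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmpNo INTEGER, LastName TEXT);
    CREATE TABLE Customer (CustNo INTEGER, AcctRepNo INTEGER);
    CREATE TABLE CustomerOrder (OrderNo INTEGER, CustNo INTEGER);
    -- Made-up sample data: Clark has no customers, customer 11 has no orders.
    INSERT INTO Employee VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO Customer VALUES (10, 1), (11, 2);
    INSERT INTO CustomerOrder VALUES (100, 10), (101, 10);
""")

rows = conn.execute("""
    SELECT e.EmpNo, e.LastName,
           COUNT(o.OrderNo) AS OrderCount,  -- skips rows where OrderNo is NULL
           COUNT(*)         AS JoinedRows   -- counts every joined row, NULL or not
    FROM Employee e
    LEFT OUTER JOIN Customer c      ON c.AcctRepNo = e.EmpNo
    LEFT OUTER JOIN CustomerOrder o ON o.CustNo = c.CustNo
    GROUP BY e.EmpNo, e.LastName
    ORDER BY OrderCount DESC, e.LastName
""").fetchall()

for row in rows:
    print(row)
```

Every employee appears in the output. Baker and Clark each keep one NULL-extended row from the outer joins (JoinedRows is 1), but OrderCount is 0 for both because COUNT on a column ignores NULLs.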

Related

Filtering SELECT TOP WITH TIES When No Records Exist for a Specific Column

Question: How can I filter my results (see below) to exclude erroneous data? I'm guessing my problem is somewhere in the WHERE clause but for the life of me, I can't figure it out.
End Goal: Return NULL values for the CDA_Orientation column where no values exist in the portfolio and e_component tables (e.g. employee has not had Orientation yet)
DB Schema:
Result Set with Errors:
NOTE: The Orientation dates for Eastman, DeLuca, and Fontano are the same date and represent the TOP 1 result from the course_startdate column of the portfolio table.
What I Want the Results to Look Like:
If I've done my JOINS correctly, the CDA_Orientation column should show NULL because there is no entry in the portfolio table (and accordingly, the e_component table) for these three individuals. The entry is only created by the front end when the Employee is assigned to a course.
Here is My Code:
SELECT TOP (1) WITH TIES
P.lastname+', '+P.firstname AS Employee,
P.person_id,
CONVERT(DATE,PC.CDAI_EXP_DATE) AS CDA_Infant,
CONVERT(DATE,PC.CDAP_EXP_DATE) AS CDA_Preschool,
CONVERT(DATE,PO.course_startdate) AS CDA_Orientation
FROM person P
JOIN person_custom PC ON PC.person_id=P.person_id
LEFT JOIN portfolio PO ON P.person_id=PO.person_id
FULL JOIN e_component EC ON PO.component_id=EC.component_id
WHERE (cdai_exp_date IS NOT NULL OR cdap_exp_date IS NOT NULL)
AND PO.course_startdate IN (SELECT course_startdate
FROM portfolio PO
LEFT JOIN e_component EC ON PO.component_id=EC.component_id
WHERE (EC.userdefined_id LIKE '000150%' AND PO.status=11))
ORDER BY ROW_NUMBER() OVER(PARTITION BY P.lastname+', '+P.firstname
ORDER BY PO.person_id)
NOTE: The TOP (1) WITH TIES has successfully pulled the most recent orientation date (employees can have more than one) from the portfolio table for Tarkin and Rust. I've cut out any and all unnecessary JOINS and caveats.
Thanks in advance!
I believe the joins are the issue. Using WITH TIES in that way is also confusing if you're just trying to get a record for each person; I would use a GROUP BY. If you wanted to do it without a sub-query you could do:
SELECT
P.lastname+', '+P.firstname AS Employee,
P.person_id,
CONVERT(DATE,PC.CDAI_EXP_DATE) AS CDA_Infant,
CONVERT(DATE,PC.CDAP_EXP_DATE) AS CDA_Preschool,
MAX(CONVERT(DATE,PO.course_startdate)) AS CDA_Orientation
FROM #person P
JOIN #person_custom PC
ON PC.person_id=P.person_id
LEFT JOIN
(#portfolio PO
JOIN #e_component EC
ON PO.component_id=EC.component_id
AND EC.userdefined_id LIKE '000150%'
AND PO.status=11)
ON P.person_id=PO.person_id
WHERE (cdai_exp_date IS NOT NULL OR cdap_exp_date IS NOT NULL)
GROUP BY P.lastname, P.firstname, P.person_id,PC.CDAI_EXP_DATE,PC.CDAP_EXP_DATE
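The key change in the query above is that the course filter moved out of the WHERE clause and into the join condition. A minimal sketch of that difference, using Python's built-in sqlite3 module: the tables are cut down to two hypothetical columns each, e_component is left out, and the dates are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER, lastname TEXT);
    CREATE TABLE portfolio (person_id INTEGER, status INTEGER, course_startdate TEXT);
    -- Made-up sample data: Eastman has no portfolio entry at all.
    INSERT INTO person VALUES (1, 'Tarkin'), (2, 'Eastman');
    INSERT INTO portfolio VALUES (1, 11, '2017-03-01'), (1, 11, '2017-06-15');
""")

# Filter in WHERE: Eastman's NULL-extended row fails "po.status = 11" and is dropped,
# so the LEFT JOIN behaves like an INNER JOIN.
where_rows = conn.execute("""
    SELECT p.lastname, MAX(po.course_startdate)
    FROM person p
    LEFT JOIN portfolio po ON po.person_id = p.person_id
    WHERE po.status = 11
    GROUP BY p.person_id, p.lastname
    ORDER BY p.lastname
""").fetchall()

# Filter in ON: the condition only decides which portfolio rows match,
# so Eastman is kept with a NULL orientation date.
on_rows = conn.execute("""
    SELECT p.lastname, MAX(po.course_startdate)
    FROM person p
    LEFT JOIN portfolio po ON po.person_id = p.person_id AND po.status = 11
    GROUP BY p.person_id, p.lastname
    ORDER BY p.lastname
""").fetchall()

print(where_rows)
print(on_rows)
```

The first query returns only Tarkin; the second returns Eastman with None (NULL) and Tarkin with the latest date, which is the shape the question asks for.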

How to obtain all the tuples after a certain date in sql?

I have to obtain the male employee with highest number of requests in the second half of April 2014.
I have these tables:
Employee (EmployeeID, firstName, LastName, gender)
Workplace (CompanyID, EmployeeID, CompanyName)
Extras (ExtraID, CompanyID, Requests, Description, Date)
Extras.Requests is a string, not numerical.
My SQL attempt looks like this:
SELECT
Employee.FirstName, Employee.LastName,
SUM(COUNT(Extras.ExtraID)
FROM
Employee
INNER JOIN
(Workplace
INNER JOIN
Extras ON Workplace.CompanyID = Extras.CompanyID)
ON Workplace.EmployeeID = Employee.EmployeeID
WHERE
Employee.Gender = "male"
AND Extras.Date BETWEEN #4/15/2014# AND #4/30/2014#
SORT BY
SUM(COUNT(Extras.ExtraID) DESC;
LIMIT 1;
I'm not sure if my query is correct or not, thanks in advance.
There are several issues with your query:
SUM(COUNT(...)): nesting aggregate functions like this isn't permitted.
You also need a GROUP BY clause to use an aggregate function alongside non-aggregated columns (Employee.FirstName and Employee.LastName in your query).
Sorting is performed by an ORDER BY clause, not SORT BY.
Your original query includes a nested inner join which is likely to produce unexpected results.
FROM Employee
INNER JOIN(Workplace
INNER JOIN Extras ON Workplace.CompanyID=Extras.CompanyID
) ON Workplace.EmployeeID=Employee.EmployeeID
While nesting joins is allowed, it is rarely used; I suggest you avoid it.
I would expect the query to look more like this
SELECT
Employee.FirstName, Employee.LastName, COUNT(Extras.ExtraID)
FROM ((Employee
INNER JOIN Workplace
ON Workplace.EmployeeID = Employee.EmployeeID)
INNER JOIN Extras
ON Workplace.CompanyID = Extras.CompanyID)
WHERE Employee.Gender = "male"
AND Extras.Date BETWEEN #4/15/2014# AND #4/30/2014#
GROUP BY
Employee.FirstName
,Employee.LastName
ORDER BY
COUNT(Extras.ExtraID) DESC
LIMIT 1;
It's been years since I used Access; I think it still wants parentheses in the FROM clause, as I've shown above. In most SQL implementations they are not required, and it is more common for string literals to use single quotes, e.g. WHERE Employee.Gender = 'male'.
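For a version you can run and experiment with, here is a sketch of the same GROUP BY / ORDER BY / LIMIT pattern in SQLite via Python's sqlite3 module. The table and column names come from the question; the rows are invented, dates are stored as ISO strings instead of Access #...# literals, and grouping by EmployeeID as well as the name is my own addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeID INTEGER, FirstName TEXT, LastName TEXT, Gender TEXT);
    CREATE TABLE Workplace (CompanyID INTEGER, EmployeeID INTEGER, CompanyName TEXT);
    CREATE TABLE Extras (ExtraID INTEGER, CompanyID INTEGER, Requests TEXT,
                         Description TEXT, Date TEXT);
    -- Made-up sample data.
    INSERT INTO Employee VALUES (1, 'Al', 'Ames', 'male'),
                                (2, 'Bo', 'Burns', 'male'),
                                (3, 'Cy', 'Cole', 'female');
    INSERT INTO Workplace VALUES (100, 1, 'Acme'), (200, 2, 'Bolt'), (300, 3, 'Core');
    INSERT INTO Extras VALUES
        (1, 100, 'x', '', '2014-04-16'),
        (2, 200, 'x', '', '2014-04-20'),
        (3, 200, 'x', '', '2014-04-28'),
        (4, 200, 'x', '', '2014-04-02'),  -- first half of April: filtered out
        (5, 300, 'x', '', '2014-04-17'),  -- female employee: filtered out
        (6, 300, 'x', '', '2014-04-18'),
        (7, 300, 'x', '', '2014-04-19');
""")

top = conn.execute("""
    SELECT e.FirstName, e.LastName, COUNT(x.ExtraID) AS RequestCount
    FROM Employee e
    INNER JOIN Workplace w ON w.EmployeeID = e.EmployeeID
    INNER JOIN Extras x    ON x.CompanyID = w.CompanyID
    WHERE e.Gender = 'male'
      AND x.Date BETWEEN '2014-04-15' AND '2014-04-30'
    GROUP BY e.EmployeeID, e.FirstName, e.LastName
    ORDER BY RequestCount DESC
    LIMIT 1
""").fetchone()

print(top)
```

Including EmployeeID in the GROUP BY keeps two different employees who happen to share a name from being merged into one group.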

How to count all null values of right join by nesting it?

SELECT COUNT(Orders.EmployeeID)
FROM Orders
WHERE (Orders.EmployeeID IS NULL)
AND (IN(SELECT Orders.EmployeeID
FROM Orders
RIGHT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID))
GROUP BY Orders.EmplyoeeID;
You need to tell us which DBMS you are using: MySQL or SQL Server? You have tagged both!
In SQL Server:
SELECT COUNT( CASE
WHEN Orders.EmployeeID IS NULL THEN 1
ELSE NULL
END
)
FROM Orders
RIGHT JOIN Employees
ON Orders.EmployeeID = Employees.EmployeeID;
If you pass a column name to the COUNT function, it won't count the NULL values. To count the NULLs, use CASE to turn each NULL into a value that COUNT will count (WHEN Orders.EmployeeID IS NULL THEN 1) and turn every non-NULL value into NULL so that COUNT skips it (ELSE NULL).
Read more about COUNT here: https://learn.microsoft.com/en-us/sql/t-sql/functions/count-transact-sql?view=sql-server-2017
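Here is a small sketch of the COUNT(CASE ...) technique that you can run with Python's built-in sqlite3 module. The sample rows are invented, and I have written the join as Employees LEFT JOIN Orders, which returns the same rows as Orders RIGHT JOIN Employees, because older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER, LastName TEXT);
    CREATE TABLE Orders (OrderID INTEGER, EmployeeID INTEGER);
    -- Made-up sample data: employees 3 and 4 have no orders.
    INSERT INTO Employees VALUES (1, 'Ames'), (2, 'Burns'), (3, 'Cole'), (4, 'Dunn');
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2);
""")

row = conn.execute("""
    SELECT COUNT(CASE WHEN o.EmployeeID IS NULL THEN 1 END) AS NoOrders,    -- NULLs only
           COUNT(o.EmployeeID)                              AS MatchedRows, -- non-NULLs only
           COUNT(*)                                         AS AllRows      -- every joined row
    FROM Employees e
    LEFT JOIN Orders o ON o.EmployeeID = e.EmployeeID
""").fetchone()

print(row)
```

A CASE with no ELSE branch yields NULL by default, so omitting ELSE NULL gives the same result as writing it out.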

SQL Server 2016 Sub Query Guidance

I am currently working on an assignment for my SQL class and I am stuck. I'm not looking for full code to answer the question, just a little nudge in the right direction. If you do provide full code would you mind a small explanation as to why you did it that way (so I can actually learn something.)
Here is the question:
Write a SELECT statement that returns three columns: EmailAddress, ShipmentId, and the order total for each Client. To do this, you can group the result set by the EmailAddress and ShipmentId columns. In addition, you must calculate the order total from the columns in the ShipItems table.
Write a second SELECT statement that uses the first SELECT statement in its FROM clause. The main query should return two columns: the Client’s email address and the largest order for that Client. To do this, you can group the result set by the EmailAddress column.
I am confused about how to pull in the EmailAddress column from the Clients table, because in order to join it I have to bring in other tables that aren't otherwise being used. I assume there is an easier way to do this using subqueries, as that is what we are working on at the moment.
Think of SQL as working with sets of data as opposed to just tables. Tables are merely a set of data. So when you view data this way you immediately see that the query below returns a set of data consisting of the entirety of another set, being a table:
SELECT * FROM MyTable1
Now, if you were to only get the first two columns from MyTable1 you would return a different set that consisted only of columns 1 and 2:
SELECT col1, col2 FROM MyTable1
Now you can treat this second set, a subset of the data, as a "table" as well and query it like this:
SELECT
*
FROM (
SELECT
col1,
col2
FROM
MyTable1
) AS t -- SQL Server requires an alias on a derived table
This will return both of the columns provided by the inner set.
So, your inner query would be a query with a GROUP BY clause and a SUM of the order value field. I won't write it for you, since you appear to be a student and it wouldn't be right for me to give you the entire answer. The key thing to understand is this set thinking: you can wrap the ENTIRE query in parentheses and treat it as a table, the way I have done above. Hopefully this helps.
You need a subquery, like this:
select emailaddress, max(OrderTotal) as MaxOrder
from
( -- Open the subquery
select Cl.emailaddress,
Sh.ShipmentID,
sum(SI.Value) as OrderTotal -- Use the line item value column in here
from Client Cl -- First table
inner join Shipments Sh -- Join the shipments
on Sh.ClientID = Cl.ClientID
inner join ShipItem SI -- Now the items
on SI.ShipmentID = Sh.ShipmentID
group by Cl.emailaddress, Sh.ShipmentID -- here's your grouping for the sum() aggregation
) as Totals -- Close subquery (SQL Server requires an alias here)
group by emailaddress -- group for the max()
For the first query you can join the Clients to Shipments (on ClientId).
And Shipments to the ShipItems table (on ShipmentId).
Then group the results, and count or sum the total you need.
Using aliases for the tables is useful, especially when you select fields from joined tables that share a column name.
select
c.EmailAddress,
i.ShipmentId,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
order by i.ShipmentId, c.EmailAddress;
Using that grouped query in a subquery, you can get the Maximum total per EmailAddress.
select EmailAddress,
-- max(TotalShipItems) as MaxTotalShipItems,
max(TotalPriceDiscounted) as MaxTotalPriceDiscounted
from (
select
c.EmailAddress,
-- i.ShipmentId,
-- count(*) as TotalShipItems,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
) q
group by EmailAddress
order by EmailAddress
Note that an ORDER BY is mostly meaningless inside a subquery if you don't use TOP.
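If you want to check the two-level grouping against known numbers, here is a sketch in SQLite via Python's sqlite3 module. The table and column names are taken from the queries above and are assumptions about the course schema; the rows are invented so the totals are easy to verify by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Clients (ClientId INTEGER, EmailAddress TEXT);
    CREATE TABLE Shipments (ShipmentId INTEGER, ClientId INTEGER);
    CREATE TABLE ShipItems (ShipmentId INTEGER, ShipItemPrice REAL,
                            ShipItemDiscountAmount REAL, Quantity INTEGER);
    -- Made-up sample data.
    INSERT INTO Clients VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO Shipments VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO ShipItems VALUES
        (10, 100, 10, 1),  -- shipment 10 total: 90
        (11, 50, 0, 2),    -- shipment 11 total: 100 + 40 = 140
        (11, 25, 5, 2),
        (20, 30, 0, 1);    -- shipment 20 total: 30
""")

rows = conn.execute("""
    SELECT q.EmailAddress, MAX(q.OrderTotal) AS LargestOrder
    FROM (
        -- Inner set: one row per client and shipment, with that shipment's total.
        SELECT c.EmailAddress, s.ShipmentId,
               SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) AS OrderTotal
        FROM Clients c
        JOIN Shipments s ON s.ClientId = c.ClientId
        JOIN ShipItems i ON i.ShipmentId = s.ShipmentId
        GROUP BY c.EmailAddress, s.ShipmentId
    ) AS q
    GROUP BY q.EmailAddress
    ORDER BY q.EmailAddress
""").fetchall()

print(rows)
```

The inner query sums per shipment; the outer query then keeps only the largest of those sums for each email address.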

NTILE Function and Using Inner Join in Oracle

I am supposed to use the given database (it's pretty huge, so I used codeshare) to list the last names and customer numbers of the top 5% of customers for each branch. To find the top 5% of customers, I decided to use the NTILE function (100/5 = 20, hence NTILE(20)). The columns are pulled from two separate tables, so I used inner joins. For the life of me, I honestly cannot figure out where I am going wrong. I keep getting "missing expression" errors but do not know what exactly I am missing. Here is the database:
Database: https://codeshare.io/5XKKBj
ERD: https://drive.google.com/file/d/0Bzum6VJXi9lUX1d2ZkhudTE3QXc/view?usp=sharing
Here is my SQL Query so far.
SELECT
Ntile(20) over
(partition by Employee.Branch_no
order by sum(ORDERS.SUBTOTAL) desc
) As Top_5,
CUSTOMER.CUSTOMER_NO,
CUSTOMER.LNAME
FROM
CUSTOMER
INNER JOIN ORDERS
ON
CUSTOMER.CUSTOMER_NO = ORDERS.CUSTOMER_NO
GROUP BY
ORDERS.SUBTOTAL,
CUSTOMER.CUSTOMER_NO,
CUSTOMER.LNAME;
You need to join Employee and the GROUP BY must include all non-aggregated expressions. You can use a subquery to generate the subtotals and get the NTILE in the outer query, e.g.:
SELECT
Ntile(20) over
(partition by BRANCH_NO
order by sum_subtotal desc
) As Top_5,
CUSTOMER_NO,
LNAME
FROM (
SELECT
EMPLOYEE.BRANCH_NO,
CUSTOMER.CUSTOMER_NO,
CUSTOMER.LNAME,
sum(ORDERS.SUBTOTAL) as sum_subtotal
FROM CUSTOMER
JOIN ORDERS
ON CUSTOMER.CUSTOMER_NO = ORDERS.CUSTOMER_NO
JOIN EMPLOYEE
ON ORDERS.EMPLOYEE_NO = EMPLOYEE.EMPLOYEE_NO
GROUP BY
EMPLOYEE.BRANCH_NO,
CUSTOMER.CUSTOMER_NO,
CUSTOMER.LNAME
);
Note: you might want to include BRANCH_NO in the select list as well, otherwise the output will look confusing with duplicate customers (if a customer has ordered from employees in multiple branches).
Now, if you want to filter the above query to just get the top 5%, you can put the whole thing in another subquery and add a predicate on the Top_5 column, e.g.:
SELECT CUSTOMER_NO, LNAME
FROM (... the query above...)
WHERE Top_5 = 1;
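To get a feel for how NTILE(20) splits each branch, here is a sketch in SQLite via Python's sqlite3 module (it needs SQLite 3.25 or later for window functions). The totals table is a hypothetical stand-in for the inner GROUP BY subquery above, filled with generated values so the top bucket is easy to predict.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated table: one row per branch and customer.
conn.execute(
    "CREATE TABLE totals (branch_no INTEGER, customer_no INTEGER, sum_subtotal INTEGER)"
)
# Branch 1 has 40 customers and branch 2 has 20; every total is distinct.
data = [(1, n, n * 10) for n in range(1, 41)]
data += [(2, n, n * 10) for n in range(101, 121)]
conn.executemany("INSERT INTO totals VALUES (?, ?, ?)", data)

rows = conn.execute("""
    SELECT branch_no, customer_no
    FROM (
        SELECT branch_no, customer_no,
               NTILE(20) OVER (PARTITION BY branch_no
                               ORDER BY sum_subtotal DESC) AS top_5
        FROM totals
    ) AS ranked
    WHERE top_5 = 1          -- bucket 1 holds the highest totals in each branch
    ORDER BY branch_no, customer_no
""").fetchall()

print(rows)
```

With 40 customers, each of the 20 buckets holds two rows, so branch 1 contributes its two biggest spenders; with 20 customers, branch 2 contributes one. When a branch has fewer than 20 customers, each customer lands in a bucket of their own, so "top 5%" then means just the single largest customer.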