SQL NULL Products - sql

I have to make it to show how many categories are available that are not currently any products in. Not sure what I am doing wrong, since I have moved stuff around and still get the result of 682 rows when there is supposed to be 0.
SELECT
Quantity,
ProductName,
CategoryID
FROM
Products,
OrderItems
WHERE NOT EXISTS (
SELECT Quantity
FROM OrderItems
WHERE Quantity IS NULL
)
Was told that "NOT EXIST" needs to be used in it.

You need a join condition between the tables. First hint: Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
I assume you want something like this:
SELECT oi.Quantity, p.ProductName, p.CategoryID
FROM Products p LEFT JOIN
OrderItems oi
ON oi.ProductId = p.ProductId
WHERE oi.quantity IS NULL;
The exact syntax is a bit of a guess, because you don't provide sample data.

Related

SQL how to count the number of relations between two tables and include zeroes?

I have a table of orders, and a table of products contained in these orders. (The products-table has order_id, a foreign key referring to orders.id).
I would like to query the number of products contained in each order. However, I also want orders to be contained in the results if they do not contain any products at all.
This means that a simple
SELECT *, COUNT(*) n_products FROM `orders` INNER JOIN `products` on `products.order_id` = `orders.id` GROUP_BY `order_id`
does not work, since orders without any products disappear.
Using a LEFT OUTER JOIN instead would add rows without product-information, but the distinction between an order with 1 product and an order with 0 products is lost.
What am I missing here?
You need a left join here, and you should be counting some column from the products table:
SELECT
o.*,
COUNT(p.order_id) AS n_products
FROM orders o
LEFT JOIN products p
ON p.order_id = o.id
GROUP BY
o.id;
Note that I assume that Postgres would allow grouping by orders.id and then selecting all columns from that table. If not, then you would only be able to select o.id in addition to the count.

Get most sold product for each country from NORTHWIND database

Good day guys, I've been struggling with this for the past day and I just can't seem to figure it out.
My task is to derive the most sold product for each country from the popular open source database called NORTHWIND: https://northwinddatabase.codeplex.com
I was able to get to this stage, here is my code in SQL Server:
--Get most sold product for each country
WITH TotalProductsSold AS
(
SELECT od.ProductID, SUM(od.Quantity) AS TotalSold
FROM [Order Details] AS od
GROUP BY od.ProductID
)
SELECT MAX(TotalProductsSold.TotalSold) AS MostSoldQuantity, s.Country --,p.ProductName
FROM Products AS p
INNER JOIN TotalProductsSold
ON TotalProductsSold.ProductID = p.ProductID
INNER JOIN Suppliers AS s
ON s.SupplierID = p.SupplierID
GROUP BY s.Country
ORDER BY MostSoldQuantity DESC
This gives me the following result:
That's all good but I wish to find out the product name for the MostSoldQuantity.
Thank you very much !
P.S I put a comment --p.ProductName where I thought it would work but it didnt and if someone could explain me why does GROUP BY not automatically allow me to derive the product name for the row that would be great
First, start with the count of products sold, per country, not just per product. Then rank them and pick only anything at RANK = 1.
Something like...
WITH
ProductQuantityByCountry AS
(
SELECT
s.CountryID,
p.ProductID,
SUM(od.Quantity) AS Quantity
FROM
[Order Details] AS od
INNER JOIN
Products AS p
ON p.ProductID = od.ProductID
INNER JOIN
Suppliers AS s
ON s.SupplierID = p.SupplierID
GROUP BY
s.CountryID,
p.ProductID
),
RankedProductQuantityByCountry
AS
(
SELECT
RANK() OVER (PARTITION BY CountryID ORDER BY Quantity DESC) AS countryRank,
*
FROM
ProductQuantityByCountry
)
SELECT
*
FROM
RankedProductQuantityByCountry
WHERE
countryRank = 1
Note, one country may supply identical quantity of different producs, and so two products could both have rank = 1. Look into ROW_NUMER() and/or DENSE_RANK() for other but similar behaviours to RANK().
EDIT:
A simple though exercise to cover why SQL doesn't let you put Product.Name in your final query is to ask a question.
What should SQL do in this case?
SELECT
MAX(TotalProductsSold.TotalSold) AS MostSoldQuantity,
MIN(TotalProductsSold.TotalSold) AS LeastSoldQuantity,
s.Country,
p.ProductName
FROM
blahblahblah
GROUP BY
s.Country
ORDER BY
MostSoldQuantity DESC
The presence of a MIN and a MAX makes things ambiguous.
You may be clear that you want to perform an operation by country and that operation to be to pick the product with the highest sales volume from that country. But it's not actually explicit, and small changes to the query could have very confusing consequences to any inferred behaviour. Instead SQL's declarative syntax provides a very clear / explicit / deterministic description of the problem to be solved.
If an expression isn't mentioned in the GROUP BY clause, you can't SELECT it, without aggregating it. This is so that there is no ambiguity as to what is meant or what the SQL engine is supposed to do.
By requiring you to stipulate get the total sales per country per product at one level of the query, you can then cleanly state and then pick the highest ranked per country at another level of the query.
This can feel like you end up with queries that are longer than "should" be necessary. But it also results in queries that are completely un-ambiguous, both for compiling the query down to an execution plan, and for other coders who will read your code in the future.

Making select query more efficient (subquery slows run speed)

The below query seems to take forever to run ever since I have added the subquery into it.
I originally tried to accomplish my goal by having two joins but the results were wrong.
Does anyone know the correct way to write this?
SELECT
c.cus_Name,
COUNT(o.orderHeader_id) AS Orders,
(select count(ol.orderLines_id) from orderlines ol where ol.orderLines_orderId = o.orderHeader_id) as linesOrderd,
MAX(o.orderHeader_dateCreated) AS lastOrdered,
SUM(o.orderHeader_totalSell) AS orderTotal,
SUM(o.orderHeader_currentSell) AS sellTotal
FROM
cus c
JOIN
orderheader o ON o.orderHeader_customer = c.cus_id
group by
c.cus_name
order by
orderTotal desc
Example data below
For the data you want, I think this is the way to go:
SELECT c.cus_Name,
COUNT(o.orderHeader_id) AS Orders,
SUM(ol.cnt) as linesOrderd,
MAX(o.orderHeader_dateCreated) AS lastOrdered,
SUM(o.orderHeader_totalSell) AS orderTotal,
SUM(o.orderHeader_currentSell) AS sellTotal
FROM cus c JOIN
orderheader o
ON o.orderHeader_customer = c.cus_id LEFT JOIN
(SELECT ol.orderLines_orderId, count(*) as cnt
FROM orderlines ol
GROUP BY ol.orderLines_orderId
) ol
ON ol.orderLines_orderId = o.orderHeader_id)
GROUP BY c.cus_name
ORDER BY orderTotal DESC;
I'm not sure if it will be much faster, but it will at least produce a sensible result -- the total number of order lines for a customer rather than the number of order lines on an arbitrary order.
Strange that subselect should not be possible since the count is only very indirectly related to the grouping. You want to count all orderlines of all orders which are related to one customer? Normally this should be done using the second join, but then the orderheader will be repeated as often as the order_lines exist. That would produce wrong results in the other aggregations.
normally this should help then, put the subselect into the joined table:
could you replace orderheader o by
(select o.*, (select count(ol.orderLines_id) from orderlines ol where ol.orderLines_orderId = o.orderHeader_id) as linesOrder from orderheader o) as o
and replace the subselect by
sum(o.linesOrder)

SQL a SELECT within a SELECT? Northwind (Microsoft)

First of all, I'm practicing with Northwind database (Microsoft creation).
The table design I'm working with is:
The question I'm trying to solve is:
Which Product is the most popular? (number of items)
Well, my query was:
SELECT DISTINCT
P.ProductName
FROM
Products P,
[Order Details] OD,
Orders O,
Customers C
WHERE
C.CustomerID = O.CustomerID
and O.OrderID = OD.OrderID
and OD.ProductID = P.ProductID
and P.UnitsInStock = (SELECT MAX(P.UnitsInStock) Items
FROM Products P)
Now, I had exactly one result as they asked:
ProductName
1 Rhönbräu Klosterbier
Yet, I doublt that my query was good. Do I really need a SELECT within a SELECT?
It feels like duplication for some reason.
Any help would be appreciated. Thanks.
To get the most popular product (bestselling product) use query
SELECT ProductName, MAX(SumQuantity)
FROM (
SELECT P.ProductName ProductName,
SUM(OD.Quantity) SumQuantity
FROM [Order Details] OD
LEFT JOIN Product P ON
P.ProductId = OD.ProductID
GROUP BY OD.ProductID
) Res;
Does the most units in stock necessarily equate to the most popular product? I don't think that is always a true statement (It could even be the opposite in fact.).
To me the question is asking, which is the most popular product sold. If you think about it that way, you'd be looking at the amount sold for each product and selecting the product with the most sold.
Does that make sense?
With regards to your specific query, the query only utilizes the Products table. You make joins, but they are not used at all in the query and should get overlooked by the query optimizer.
I would personally rewrite your query as the following:
SELECT
P.ProductName
FROM
Products P
INNER JOIN
(SELECT
MAX(P.UnitsInStock) AS Items
FROM Products P) maxProd
ON P.UnitsInStock= maxProd.Items
About your question, it is perfectly acceptable to utilize a subquery (the select in the where clause). It is even necessary at times. Most of the time I would use an Inner Join like I did above, but if the dataset is small enough, it shouldn't make much difference with query time.
In this scenario, you should rethink the question that is being asked and think about what being the most popular item means.
Rethinking the problem:
Let's look at the datasets that you've shown above. Which could be used to tell you how many products have been sold? A customer would have to order a product, right? Looking at the two tables that are potentially applicable, one contains details about number of items sold, quantity, or you could think of popularity in terms of the number of times appearing in orders. Start with that dataset and use a similar methodology to what you've done, but perhaps you'll have to use a sum and group by. Why? Perhaps more than one customer bought the item.
The problem with the dataset is it doesn't tell you the name of the product. It only gives you the ID. There is a table though that has this information. Namely, the Products table. You'll notice that both tables have the Product ID variable, and you are able to join on this.
You can find the most popular product by counting the number of orders placed on each product .And the one with most number of order will be the most popular product.
Below script will give you the most popular product based on the the number of orders placed .
;WITH cte_1
AS(
SELECT p.ProductID,ProductName, count(OrderID) CNT
FROM Products p
JOIN [Order Details] od ON p.ProductID=od.ProductID
GROUP BY p.ProductID,ProductName)
SELECT top 1 ProductName
FROM cte_1
ORDER BY CNT desc
if you are using SQL server 2012 or any higher version, use 'with ties' for fetching multiple products having same order count.
;WITH cte_1
AS(
SELECT p.ProductID,ProductName, count(OrderID) CNT
FROM Products p
JOIN [Order Details] od ON p.ProductID=od.ProductID
GROUP BY p.ProductID,ProductName)
SELECT top 1 with ties ProductName
FROM cte_1
ORDER BY CNT desc
In your sample code,you tried to pull the product with maximum stock held. since you joined with other tables (like order details etc) you are getting multiple results for the same product. if you wanted to get a product with maximum stock,you can use any of the following script.
SELECT ProductName
FROM Products P
WHERE P.UnitsInStock = (SELECT MAX(P.UnitsInStock) Items
FROM Products P)
OR
SELECT top 1 ProductName
FROM Products P
ORDER BY P.UnitsInStock desc
OR
SELECT top 1 with ties ProductName --use with ties in order to pull top products having same UnitsInStock
FROM Products P
ORDER BY P.UnitsInStock desc

Get data from multiple tables

I need to get data from multiple tables and put it into a subform.
The SubForm columns are 'product name' and 'quantity' and they need to list out the products relating to the order ID.
The tables in question are:
PRODUCTS(productID,productName)
ORDER(orderID,prodID,quantity)
Where the prodID in ORDER refers to the PRODUCTS table.
As you can see, the problem is that the name of the product is in a different table to the order. So, the data I need to get is:
Products(productName)
Order(quantity)
In relation to the orderID.
How can I use a SQL query to get this data? I am aware of joins and so on, but I just can't see how to apply it.
Thank you. I hope this makes sense.
This is a simple inner join between the two tables to return the rows you want:
SELECT P.PRODUCTNAME, O.QUANTITY
FROM PRODUCTS P INNER JOIN ORDER O ON P.PRODUCTID = O.PRODID
WHERE O.ORDERID = <order id>
SELECT
PRODUCTS.productName AS productName,
`ORDER`.quantity AS quantity
FROM
`ORDER`
INNER JOIN PRODUCTS on `ORDER`.prodID=PRODUCTS.productID
WHERE
..
You migh also want to rename the table ORDER - using reserved words as table names is not the best of styles.
Select p.productname, q.quantity from product_table p, quantity_table q where p.productId = q.productId;