SQL: a SELECT within a SELECT? Northwind (Microsoft)

First of all, I'm practicing with Northwind database (Microsoft creation).
The table design I'm working with is:
The question I'm trying to solve is:
Which Product is the most popular? (number of items)
Well, my query was:
SELECT DISTINCT
P.ProductName
FROM
Products P,
[Order Details] OD,
Orders O,
Customers C
WHERE
C.CustomerID = O.CustomerID
and O.OrderID = OD.OrderID
and OD.ProductID = P.ProductID
and P.UnitsInStock = (SELECT MAX(P.UnitsInStock) Items
FROM Products P)
Now, I had exactly one result as they asked:
ProductName
1 Rhönbräu Klosterbier
Yet, I doubt that my query was good. Do I really need a SELECT within a SELECT?
It feels like duplication for some reason.
Any help would be appreciated. Thanks.

To get the most popular (best-selling) product, use this query:
SELECT TOP 1 ProductName, SumQuantity
FROM (
SELECT P.ProductName AS ProductName,
SUM(OD.Quantity) AS SumQuantity
FROM [Order Details] OD
LEFT JOIN Products P ON
P.ProductID = OD.ProductID
GROUP BY OD.ProductID, P.ProductName
) Res
ORDER BY SumQuantity DESC;

Does the most units in stock necessarily equate to the most popular product? I don't think that is always a true statement (It could even be the opposite in fact.).
To me the question is asking, which is the most popular product sold. If you think about it that way, you'd be looking at the amount sold for each product and selecting the product with the most sold.
Does that make sense?
With regard to your specific query: it only really uses the Products table. You make joins, but no column from the joined tables appears in the output or in the filter, so all they do is multiply the rows (which is why you needed DISTINCT) and drop any product that was never ordered.
I would personally rewrite your query as the following:
SELECT
P.ProductName
FROM
Products P
INNER JOIN
(SELECT
MAX(P.UnitsInStock) AS Items
FROM Products P) maxProd
ON P.UnitsInStock= maxProd.Items
About your question, it is perfectly acceptable to utilize a subquery (the select in the where clause). It is even necessary at times. Most of the time I would use an Inner Join like I did above, but if the dataset is small enough, it shouldn't make much difference with query time.
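As a rough check of the claim that the subquery form and the join form are interchangeable here, the sketch below runs both against an in-memory SQLite database. The three product rows are invented for illustration and are not Northwind data; SQLite stands in for SQL Server only because it ships with Python.

```python
import sqlite3


def products_with_max_stock():
    """Return (subquery_rows, join_rows) for the max-stock lookup."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY,
                               ProductName TEXT,
                               UnitsInStock INTEGER);
        -- Invented rows, not real Northwind data.
        INSERT INTO Products VALUES (1, 'Tea', 39), (2, 'Beer', 125), (3, 'Syrup', 13);
    """)
    # Form 1: scalar subquery in the WHERE clause.
    subquery_rows = conn.execute("""
        SELECT P.ProductName
        FROM Products P
        WHERE P.UnitsInStock = (SELECT MAX(UnitsInStock) FROM Products)
    """).fetchall()
    # Form 2: inner join against a one-row derived table.
    join_rows = conn.execute("""
        SELECT P.ProductName
        FROM Products P
        INNER JOIN (SELECT MAX(UnitsInStock) AS Items FROM Products) maxProd
            ON P.UnitsInStock = maxProd.Items
    """).fetchall()
    conn.close()
    return subquery_rows, join_rows
```

Both forms return the same single row for this data; which one the optimizer prefers on a large table is a separate question that this sketch does not measure.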
In this scenario, you should rethink the question that is being asked and think about what being the most popular item means.
Rethinking the problem:
Let's look at the tables that you've shown above. Which could be used to tell you how many units of a product have been sold? A customer would have to order a product, right? Of the two tables that are potentially applicable, one contains the quantity of each item sold; alternatively, you could think of popularity in terms of the number of orders a product appears in. Start with that table and use a similar methodology to what you've done, but you'll probably have to use a SUM and a GROUP BY. Why? Because more than one customer may have bought the item.
The problem with the dataset is it doesn't tell you the name of the product. It only gives you the ID. There is a table though that has this information. Namely, the Products table. You'll notice that both tables have the Product ID variable, and you are able to join on this.
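The reasoning above (sum the quantities per product, then join to Products for the name) can be sketched as follows. This is a minimal illustration on an in-memory SQLite database, with an invented `OrderDetails` table standing in for `[Order Details]` and made-up rows; SQLite uses `LIMIT 1` where SQL Server uses `TOP 1`.

```python
import sqlite3


def best_selling_product():
    """Return (ProductName, TotalSold) for the product with the highest summed quantity."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
        CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
        -- Invented rows: Chai sells 5 + 7 units, Chang sells 10.
        INSERT INTO Products VALUES (1, 'Chai'), (2, 'Chang');
        INSERT INTO OrderDetails VALUES (10, 1, 5), (11, 1, 7), (11, 2, 10);
    """)
    row = conn.execute("""
        SELECT P.ProductName, SUM(OD.Quantity) AS TotalSold
        FROM OrderDetails OD
        JOIN Products P ON P.ProductID = OD.ProductID   -- the join supplies the name
        GROUP BY P.ProductID, P.ProductName             -- one row per product
        ORDER BY TotalSold DESC
        LIMIT 1
    """).fetchone()
    conn.close()
    return row
```

Chang has the largest single order line, but Chai wins once the quantities are summed, which is why the GROUP BY is needed.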

You can find the most popular product by counting the number of orders placed for each product. The one with the highest number of orders is the most popular product.
The script below gives you the most popular product based on the number of orders placed.
;WITH cte_1
AS(
SELECT p.ProductID,ProductName, count(OrderID) CNT
FROM Products p
JOIN [Order Details] od ON p.ProductID=od.ProductID
GROUP BY p.ProductID,ProductName)
SELECT top 1 ProductName
FROM cte_1
ORDER BY CNT desc
If you are using SQL Server 2012 or any higher version, use WITH TIES to fetch all the products that share the same top order count.
;WITH cte_1
AS(
SELECT p.ProductID,ProductName, count(OrderID) CNT
FROM Products p
JOIN [Order Details] od ON p.ProductID=od.ProductID
GROUP BY p.ProductID,ProductName)
SELECT top 1 with ties ProductName
FROM cte_1
ORDER BY CNT desc
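To see what keeping ties changes, here is a small sketch on an in-memory SQLite database. SQLite has no `TOP ... WITH TIES`, so the tie-keeping behaviour is emulated by comparing each count with the maximum count; that substitution, the `OrderDetails` table name, and the rows are assumptions made for illustration.

```python
import sqlite3


def most_ordered_with_ties():
    """Return every product name whose order count equals the highest order count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
        CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER);
        -- Invented rows: Chai and Chang each appear on two orders, Tofu on one.
        INSERT INTO Products VALUES (1, 'Chai'), (2, 'Chang'), (3, 'Tofu');
        INSERT INTO OrderDetails VALUES (10, 1), (11, 1), (10, 2), (12, 2), (12, 3);
    """)
    rows = conn.execute("""
        WITH cte_1 AS (
            SELECT p.ProductID, p.ProductName, COUNT(od.OrderID) AS CNT
            FROM Products p
            JOIN OrderDetails od ON p.ProductID = od.ProductID
            GROUP BY p.ProductID, p.ProductName
        )
        SELECT ProductName
        FROM cte_1
        WHERE CNT = (SELECT MAX(CNT) FROM cte_1)   -- keeps all tied products
        ORDER BY ProductName
    """).fetchall()
    conn.close()
    return [name for (name,) in rows]
```

A plain "top 1" would return only one of the two tied products, and which one would be arbitrary.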
In your sample code, you tried to pull the product with the maximum stock held. Since you joined with other tables (such as Order Details), you get multiple rows for the same product. If you want the product with the maximum stock, you can use any of the following scripts.
SELECT ProductName
FROM Products P
WHERE P.UnitsInStock = (SELECT MAX(P.UnitsInStock) Items
FROM Products P)
OR
SELECT top 1 ProductName
FROM Products P
ORDER BY P.UnitsInStock desc
OR
SELECT top 1 with ties ProductName --use with ties in order to pull top products having same UnitsInStock
FROM Products P
ORDER BY P.UnitsInStock desc

Related

Lookup with duplicate values

This is kind of a complicated question. I have three tables:
A PRODUCTS table
ProductID | ProductName
Product A | Edwardian Desk
Product B | Edwardian Lamp
And a GROUPS table
ProductGroup | ProductID
Group A | Product A
Group A | Product B
Group B | Product C
And a SALES table
Product ID | Sales
Product A | 1000
Product B | 500
And I need to show the total of Sales per Product Group.
This part I understand; I wrote the query:
SELECT Groups.ProductGroup, SUM(Sales) AS TotalSales
FROM Groups
JOIN Sales
ON Groups.ProductID=Sales.ProductID
GROUP BY Groups.ProductGroup
This is the part that confuses me though: for each group, I need to pull in one of the names of the products in the group. However, it does not matter which name is pulled. So the final data could show:
Group A, Edwardian Desk, 1500
or
Group A, Edwardian Lamp, 1500
How can I pull the name of the product into my query?
I am working in Microsoft SQL Server
There are a number of ways to bring in one of your product names; a couple of options are to use an aggregation with a correlated subquery or to use an apply.
Note, I've used aliases for your table names - doing so is good practice and makes queries more compact and easier to read. Also - presumably this is a contrived example and not your actual tables - but generally it's not a good practice to have column names identical to the table name, so if Sales on table Sales represents a quantity, then just call it Quantity!
select g.ProductGroup,
(select Min(p.ProductName) from Products p join Groups g2 on g2.ProductID=p.ProductID where g2.ProductGroup=g.ProductGroup) FirstProductAlphabetically,
Sum(s.Sales) as TotalSales
from Groups g
join Sales s on s.ProductID=g.ProductID
group by g.ProductGroup;
select t.ProductGroup,
p.ProductName as FirstProductById,
t.TotalSales
from (
select g.ProductGroup, Sum(s.Sales) as TotalSales
from Groups g
join Sales s on s.ProductID=g.ProductID
group by g.ProductGroup
) t
cross apply (
select top (1) p.ProductName
from Products p
join Groups g2 on g2.ProductID=p.ProductID
where g2.ProductGroup=t.ProductGroup
order by p.ProductID
) p;
You can add products to the JOIN and use an aggregation function:
SELECT g.ProductGroup, SUM(s.Sales) AS TotalSales,
MIN(p.ProductName)
FROM Groups g JOIN
Sales s
ON g.ProductID = s.ProductID JOIN
Products p
ON p.ProductID = s.ProductId
GROUP BY g.ProductGroup;
Note: I often add two columns, MIN() and MAX() to get two sample names.
I should add. Your sample data has ProductIds that are not in the Products. That suggests a problem with either the question (more likely) or the data model. If you actually have references to non-existent products, then use a LEFT JOIN to Products rather than an inner join.
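The effect of the join type can be sketched with the sample rows from the question, run on an in-memory SQLite database. Two things here are assumptions rather than part of the question: the Groups table is named `ProductGroups` in this sketch, and a 200-unit sale for Product C is invented so that Group B has something to sum.

```python
import sqlite3


def sales_per_group(join_type):
    """Total sales and one sample product name per group.

    join_type is "INNER" or "LEFT" and controls the join to Products.
    """
    if join_type not in ("INNER", "LEFT"):
        raise ValueError("join_type must be 'INNER' or 'LEFT'")
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductID TEXT PRIMARY KEY, ProductName TEXT);
        CREATE TABLE ProductGroups (ProductGroup TEXT, ProductID TEXT);
        CREATE TABLE Sales (ProductID TEXT, Sales INTEGER);
        INSERT INTO Products VALUES ('Product A', 'Edwardian Desk'),
                                    ('Product B', 'Edwardian Lamp');
        INSERT INTO ProductGroups VALUES ('Group A', 'Product A'),
                                         ('Group A', 'Product B'),
                                         ('Group B', 'Product C');
        -- The Product C sale is invented; the question lists no sales for it.
        INSERT INTO Sales VALUES ('Product A', 1000), ('Product B', 500),
                                 ('Product C', 200);
    """)
    rows = conn.execute(f"""
        SELECT g.ProductGroup, SUM(s.Sales) AS TotalSales, MIN(p.ProductName) AS SampleName
        FROM ProductGroups g
        JOIN Sales s ON g.ProductID = s.ProductID
        {join_type} JOIN Products p ON p.ProductID = s.ProductID
        GROUP BY g.ProductGroup
        ORDER BY g.ProductGroup
    """).fetchall()
    conn.close()
    return rows
```

With the inner join, Group B disappears because Product C has no row in Products; with the left join it is kept and its sample name comes back as NULL (`None` in Python).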

Adventureworks sqlzoo

A "Single Item Order" is a customer order where only one item is ordered. Show the SalesOrderID and the UnitPrice for every Single Item Order
Hi,
I tried this question and this is my answer below:
SELECT s.salesOrderID, s.UnitPrice
FROM SalesOrderDetail s
INNER JOIN Product p ON s.ProductID = p.ProductID
GROUP BY s.salesOrderID, s.unitPrice
HAVING count(s.OrderQty) = 1
Please let me know if this is correct, or else provide a solution.
Thank you
I would suggest:
SELECT s.SalesOrderID, MAX(s.UnitPrice) as UnitPrice
FROM SalesOrderDetail s
GROUP BY s.SalesOrderID
HAVING COUNT(*) = 1;
Based on your query, UnitPrice is in SalesOrderDetail not Product, so the JOIN is not necessary. If UnitPrice is really in Product, then you need the JOIN.
Note the logic of the query. You want to count the number of order detail records per order, so that should be the aggregation. The HAVING clause ensures that there is just one detail record for the order.
Hence, MAX() simply returns the value from that single row.
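The "one detail record per order" logic can be tried out on a few invented rows. This is only a sketch on an in-memory SQLite database; the column names follow the question, but the data is made up.

```python
import sqlite3


def single_item_orders():
    """Return (SalesOrderID, UnitPrice) for orders that have exactly one detail row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SalesOrderDetail (SalesOrderID INTEGER, ProductID INTEGER,
                                       OrderQty INTEGER, UnitPrice REAL);
        -- Invented rows: order 2 has two detail rows, orders 1 and 3 have one each.
        INSERT INTO SalesOrderDetail VALUES
            (1, 100, 1, 10.0),
            (2, 100, 2, 10.0),
            (2, 200, 1, 5.0),
            (3, 300, 4, 20.0);
    """)
    rows = conn.execute("""
        SELECT SalesOrderID, MAX(UnitPrice) AS UnitPrice
        FROM SalesOrderDetail
        GROUP BY SalesOrderID
        HAVING COUNT(*) = 1          -- keep orders with a single detail row
        ORDER BY SalesOrderID
    """).fetchall()
    conn.close()
    return rows
```

Order 3 qualifies even though its OrderQty is 4, because this version counts detail rows rather than units.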
I took the question to mean a customer order of exactly one type of item with an order quantity of 1. The JOIN is unnecessary as all the needed data is in one table. Here was my solution:
SELECT SalesOrderID, UnitPrice
FROM SalesOrderDetail
WHERE SalesOrderID IN
(SELECT SalesOrderID
FROM SalesOrderDetail
GROUP BY SalesOrderID HAVING COUNT(ProductID) = 1) AND OrderQty = 1
This eliminates orders that contain more than one product, as well as single-product orders with a quantity greater than one.
SQLZoo doesn't have smiley faces for these answers, so I'm not sure if I'm correct ;)

Get most sold product for each country from NORTHWIND database

Good day guys, I've been struggling with this for the past day and I just can't seem to figure it out.
My task is to derive the most sold product for each country from the popular open source database called NORTHWIND: https://northwinddatabase.codeplex.com
I was able to get to this stage, here is my code in SQL Server:
--Get most sold product for each country
WITH TotalProductsSold AS
(
SELECT od.ProductID, SUM(od.Quantity) AS TotalSold
FROM [Order Details] AS od
GROUP BY od.ProductID
)
SELECT MAX(TotalProductsSold.TotalSold) AS MostSoldQuantity, s.Country --,p.ProductName
FROM Products AS p
INNER JOIN TotalProductsSold
ON TotalProductsSold.ProductID = p.ProductID
INNER JOIN Suppliers AS s
ON s.SupplierID = p.SupplierID
GROUP BY s.Country
ORDER BY MostSoldQuantity DESC
This gives me the following result:
That's all good but I wish to find out the product name for the MostSoldQuantity.
Thank you very much !
P.S. I put a comment --p.ProductName where I thought it would work, but it didn't. If someone could explain why GROUP BY does not automatically let me get the product name for that row, that would be great.
First, start with the count of products sold, per country, not just per product. Then rank them and pick only anything at RANK = 1.
Something like...
WITH
ProductQuantityByCountry AS
(
SELECT
s.Country,
p.ProductID,
SUM(od.Quantity) AS Quantity
FROM
[Order Details] AS od
INNER JOIN
Products AS p
ON p.ProductID = od.ProductID
INNER JOIN
Suppliers AS s
ON s.SupplierID = p.SupplierID
GROUP BY
s.Country,
p.ProductID
),
RankedProductQuantityByCountry
AS
(
SELECT
RANK() OVER (PARTITION BY Country ORDER BY Quantity DESC) AS countryRank,
*
FROM
ProductQuantityByCountry
)
SELECT
*
FROM
RankedProductQuantityByCountry
WHERE
countryRank = 1
Note, one country may supply identical quantities of different products, so two products could both have rank = 1. Look into ROW_NUMBER() and/or DENSE_RANK() for other but similar behaviours to RANK().
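The difference between the three ranking functions shows up only when there is a tie, so here is a small sketch with one. It runs on an in-memory SQLite database (window functions need SQLite 3.25 or newer, which is an assumption about the local build), and the pre-aggregated rows are invented rather than computed from Northwind.

```python
import sqlite3


def ranked_products():
    """Return (Country, ProductName, rnk, dense_rnk, row_num) for each row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ProductQuantityByCountry (Country TEXT, ProductName TEXT, Quantity INTEGER);
        -- Invented totals: Chai and Chang tie for first place in the UK.
        INSERT INTO ProductQuantityByCountry VALUES
            ('UK', 'Chai', 50), ('UK', 'Chang', 50), ('UK', 'Tofu', 30),
            ('USA', 'Syrup', 40);
    """)
    rows = conn.execute("""
        SELECT Country, ProductName,
               RANK()       OVER (PARTITION BY Country ORDER BY Quantity DESC) AS rnk,
               DENSE_RANK() OVER (PARTITION BY Country ORDER BY Quantity DESC) AS dense_rnk,
               ROW_NUMBER() OVER (PARTITION BY Country
                                  ORDER BY Quantity DESC, ProductName) AS row_num
        FROM ProductQuantityByCountry
        ORDER BY Country, row_num
    """).fetchall()
    conn.close()
    return rows


def top_per_country(rows, column_index):
    """Keep the (Country, ProductName) pairs whose chosen ranking column equals 1."""
    return [(r[0], r[1]) for r in rows if r[column_index] == 1]
```

Filtering on `rnk = 1` keeps both tied UK products, while filtering on `row_num = 1` keeps exactly one row per country; the extra `ProductName` sort key makes that choice deterministic here.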
EDIT:
A simple thought exercise to show why SQL doesn't let you put p.ProductName in your final query is to ask a question.
What should SQL do in this case?
SELECT
MAX(TotalProductsSold.TotalSold) AS MostSoldQuantity,
MIN(TotalProductsSold.TotalSold) AS LeastSoldQuantity,
s.Country,
p.ProductName
FROM
blahblahblah
GROUP BY
s.Country
ORDER BY
MostSoldQuantity DESC
The presence of a MIN and a MAX makes things ambiguous.
You may be clear that you want to perform an operation by country and that operation to be to pick the product with the highest sales volume from that country. But it's not actually explicit, and small changes to the query could have very confusing consequences to any inferred behaviour. Instead SQL's declarative syntax provides a very clear / explicit / deterministic description of the problem to be solved.
If an expression isn't mentioned in the GROUP BY clause, you can't SELECT it, without aggregating it. This is so that there is no ambiguity as to what is meant or what the SQL engine is supposed to do.
By requiring you to stipulate "get the total sales per country per product" at one level of the query, you can then cleanly state "and then pick the highest ranked per country" at another level of the query.
This can feel like you end up with queries that are longer than "should" be necessary. But it also results in queries that are completely un-ambiguous, both for compiling the query down to an execution plan, and for other coders who will read your code in the future.

SQL based Northwind, hard time on filtering

So in a practice site there is a question:
Which Product is the most popular? (number of items)
This means that there are customers, and they want to know the product most often ordered by those customers (the TOP 1 ordered product across all orders).
I Sincerely do not know How to solve this one.
Any help?
What I've tried so far is:
SELECT TOP(1) ProductID, ProductName
FROM Products
GROUP BY ProductID, ProductName
ORDER BY COUNT(*) DESC
But that's far from what they have asked.
In this one, I just get the top 1 Product with the lowest count, but that doesn't mean anything about the customers who ordered this product.
That only means that this specific item could have been at a low quantity and still be lower than the others, while the others were at a very high quantity and are now just low (but still not low enough).
I hope I was clear enough.
If the data exists in that table, you might just need to order by something more sophisticated than count, like summing the quantity (if that column exists). Also, if ProductID and ProductName are already unique identifiers, note that you don't need the group by and sum at all.
SELECT TOP(1) ProductID, ProductName
FROM Products
GROUP BY ProductID, ProductName
ORDER BY SUM(Quantity) DESC
I don't know what your keys are, but it sounds like you actually want to be counting how many times it was ordered by customers, so you may need to join on the Customers table. I am assuming here that you have a table Orders, that has one line per order and shares the ProductID key. I also assume that ProductID is unique in Products (which may not be true based on your first query).
SELECT TOP(1) Products.ProductID, Products.ProductName
FROM Products
LEFT JOIN Orders
ON Orders.ProductID = Products.ProductID
GROUP BY Products.ProductID, Products.ProductName
ORDER BY COUNT(Orders.OrderID) DESC
This really depends on what tables and keys you have available to you.
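The two readings of "most popular" discussed above (how many times a product was ordered versus how many units were sold) can give different answers, as this sketch shows. It uses an in-memory SQLite database with an invented `OrderDetails` table and made-up rows; `LIMIT 1` plays the role of `TOP(1)`.

```python
import sqlite3

# The two candidate measures of popularity; the keys are labels used only in this sketch.
ORDERINGS = {
    "order_count": "COUNT(od.OrderID)",
    "total_quantity": "SUM(od.Quantity)",
}


def most_popular(measure):
    """Return the name of the top product under the given measure."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
        CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
        -- Invented rows: Chai is on three orders of 1 unit, Chang on one order of 10 units.
        INSERT INTO Products VALUES (1, 'Chai'), (2, 'Chang');
        INSERT INTO OrderDetails VALUES (10, 1, 1), (11, 1, 1), (12, 1, 1), (13, 2, 10);
    """)
    row = conn.execute(f"""
        SELECT p.ProductName
        FROM Products p
        JOIN OrderDetails od ON od.ProductID = p.ProductID
        GROUP BY p.ProductID, p.ProductName
        ORDER BY {ORDERINGS[measure]} DESC
        LIMIT 1
    """).fetchone()
    conn.close()
    return row[0]
```

Counting orders picks Chai and summing quantities picks Chang, so it is worth deciding which meaning the exercise intends before writing the query.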
Select top 1 P.ProductID,P.ProductName,Sum(OD.Quantity)AS Quantity
From [Order Details] OD
inner join Products P ON P.ProductID = OD.ProductID
Group By P.ProductID,P.ProductName
Order by Quantity Desc
You can work out something like this (table names/schema may differ):
with cte_product
as
(
select P.ProductID, Rank() over (order by Count(1) desc) as ProductRank from
Orders O
inner join Product P
on P.ProductID = O.ProductID
group by P.ProductID
)
select P.ProductID, P.ProductName from
cte_product ct
inner join Product P
on ct.ProductID = P.ProductID
where ct.ProductRank = 1
Crux is usage of RANK() to get most popular product. Rest you may fetch columns as per need using relevant Joins.

SQL query for showing the products which sell better, based on how many times an index is found in the table?

I have a customer table that holds products and quantities, and I need to retrieve the products that sell best in the company.
How would I accomplish this with an SQL query?
The following query is built on the common assumption of a product/order/sales table schema, so please either show us your tables or adapt the query to them.
This will give you the best product:
SELECT TOP 1 s.ProductID, p.ProductName, SUM(s.Quantity) AS TotalSales
FROM Products p, SalesOrder s
WHERE p.ProductID = s.ProductID
GROUP BY s.ProductID, p.ProductName
ORDER BY TotalSales DESC;
This will give you 10 best products:
SELECT TOP 10 s.ProductID, p.ProductName, SUM(s.Quantity) AS TotalSales
FROM Products p, SalesOrder s
WHERE p.ProductID = s.ProductID
GROUP BY s.ProductID, p.ProductName
ORDER BY TotalSales DESC;
I think a simple order by clause should do
select products, quantity
from tableName
order by quantity desc
If you need only top 5 for example, add "Top 5" between select and products word in above query
Hope that helps