Is a join necessary in this SQL query?

Suppose I have a table Orders in SQL with 4 columns (OrderId int, ProductId int, Price money, Size int), with OrderId and ProductId as a composite primary key.
If I want a query to return the most recent order (along with price and size) for every product (i.e., maximal OrderId) is there a way to do this without a join?
I was thinking of the following:
select o.OrderId, o.ProductId, o.Price, o.Size
from Orders o inner join
(select Max(OrderId) OrderId, ProductId from Orders group by ProductId) recent
on o.OrderId = recent.OrderId and o.ProductId = recent.ProductId
but this doesn't seem like the most elegant solution possible.
Also, assume that an order with multiple products generates multiple rows. They are both necessarily part of the primary key.

No, there isn't. You've got the right idea. You need to do the subquery to get the max(orderID), productID pair, and then join that to the full table to limit the query to the rows in the full table that contain the max OrderId.
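A minimal sketch of that subquery-plus-join pattern, run against SQLite through Python's sqlite3 module so it can be tried without a server. The sample rows and the REAL type for Price are assumptions for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (
        OrderId   INTEGER,
        ProductId INTEGER,
        Price     REAL,
        Size      INTEGER,
        PRIMARY KEY (OrderId, ProductId)
    );
    INSERT INTO Orders VALUES
        (1, 10, 4.5,  1),
        (2, 10, 6.5,  2),
        (2, 20, 9.25, 1),
        (3, 20, 9.5,  3);
""")

# The derived table finds the newest OrderId per product; the join then
# pulls the full row (Price, Size) that belongs to that OrderId.
latest_per_product = conn.execute("""
    SELECT o.OrderId, o.ProductId, o.Price, o.Size
    FROM Orders o
    INNER JOIN (
        SELECT MAX(OrderId) AS OrderId, ProductId
        FROM Orders
        GROUP BY ProductId
    ) recent
        ON o.OrderId = recent.OrderId
       AND o.ProductId = recent.ProductId
    ORDER BY o.ProductId
""").fetchall()

print(latest_per_product)
```

With this data, each product comes back once, carrying the price and size of its highest OrderId.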

select
MAX(o.OrderId) as NewestOrderID,
o.ProductId,
o.Price,
o.Size
from Orders o
Group by
o.ProductId,
o.Price,
o.Size
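One thing worth checking before relying on the GROUP BY version: as far as I can tell, grouping by Price and Size produces one row per distinct (ProductId, Price, Size) combination, so it only matches the join version when a product's price and size never change. A small sketch with made-up rows (the data and the use of SQLite via Python's sqlite3 are assumptions):

```python
import sqlite3

# Hypothetical sample data: each product was ordered twice at different prices.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (
        OrderId   INTEGER,
        ProductId INTEGER,
        Price     REAL,
        Size      INTEGER,
        PRIMARY KEY (OrderId, ProductId)
    );
    INSERT INTO Orders VALUES
        (1, 10, 4.5,  1),
        (2, 10, 6.5,  2),
        (2, 20, 9.25, 1),
        (3, 20, 9.5,  3);
""")

# Same shape as the GROUP BY query above: every non-aggregated column
# is part of the grouping key.
grouped = conn.execute("""
    SELECT MAX(o.OrderId) AS NewestOrderID, o.ProductId, o.Price, o.Size
    FROM Orders o
    GROUP BY o.ProductId, o.Price, o.Size
    ORDER BY o.ProductId, NewestOrderID
""").fetchall()

products = {row[1] for row in grouped}
print(len(grouped), "rows for", len(products), "products")
```

Here there are two products but four rows, because every price/size combination forms its own group.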

Related

SQL how to count the number of relations between two tables and include zeroes?

I have a table of orders, and a table of products contained in these orders. (The products-table has order_id, a foreign key referring to orders.id).
I would like to query the number of products contained in each order. However, I also want orders to be contained in the results if they do not contain any products at all.
This means that a simple
SELECT *, COUNT(*) n_products FROM `orders` INNER JOIN `products` ON `products`.`order_id` = `orders`.`id` GROUP BY `order_id`
does not work, since orders without any products disappear.
Using a LEFT OUTER JOIN instead would add rows without product-information, but the distinction between an order with 1 product and an order with 0 products is lost.
What am I missing here?
You need a left join here, and you should be counting some column from the products table:
SELECT
o.*,
COUNT(p.order_id) AS n_products
FROM orders o
LEFT JOIN products p
ON p.order_id = o.id
GROUP BY
o.id;
Note that I assume that Postgres would allow grouping by orders.id and then selecting all columns from that table. If not, then you would only be able to select o.id in addition to the count.
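To make the difference between counting a products column and counting rows concrete, here is a small sketch using SQLite through Python's sqlite3 module. The sample rows are made up, and the n_rows column is added only for comparison.

```python
import sqlite3

# Hypothetical data: order 1 has two products, order 2 has one, order 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE products (
        id       INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id)
    );
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO products VALUES (1, 1), (2, 1), (3, 2);
""")

# COUNT(p.order_id) skips the NULLs produced by the outer join,
# while COUNT(*) counts the padded row as well.
counts = conn.execute("""
    SELECT o.id,
           COUNT(p.order_id) AS n_products,
           COUNT(*)          AS n_rows
    FROM orders o
    LEFT JOIN products p
        ON p.order_id = o.id
    GROUP BY o.id
    ORDER BY o.id
""").fetchall()

print(counts)
```

Order 3 shows n_products = 0 but n_rows = 1, which is why COUNT(*) cannot tell an empty order apart from an order with one product.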

Adventureworks sqlzoo

A "Single Item Order" is a customer order where only one item is ordered. Show the SalesOrderID and the UnitPrice for every Single Item Order
Hi,
I tried this question and this is my answer below:
SELECT s.salesOrderID, s.UnitPrice
FROM SalesOrderDetail s
INNER JOIN Product p ON s.ProductID = p.ProductID
GROUP BY s.salesOrderID, s.unitPrice
HAVING count(s.OrderQty) = 1
Please let me know if this is correct, or else provide a solution.
Thank you
I would suggest:
SELECT MAX(s.salesOrderID) as salesOrderId, MAX(s.UnitPrice) as UnitPrice
FROM SalesOrderDetail s
GROUP BY s.salesOrderID
HAVING COUNT(*) = 1;
Based on your query, UnitPrice is in SalesOrderDetail not Product, so the JOIN is not necessary. If UnitPrice is really in Product, then you need the JOIN.
Note the logic of the query. You want to count the number of order detail records per order, so that should be the aggregation. The HAVING clause ensures that there is just one detail record for the order.
Hence, the MAX() functions return the values on that single row.
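A quick way to convince yourself of this is to run the same aggregation on a few made-up rows. The sketch below uses SQLite through Python's sqlite3 module; the data is hypothetical, and only the column names come from the question.

```python
import sqlite3

# Hypothetical detail rows: orders 100 and 102 have one line each, 101 has two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesOrderDetail (
        SalesOrderID INTEGER,
        ProductID    INTEGER,
        OrderQty     INTEGER,
        UnitPrice    REAL
    );
    INSERT INTO SalesOrderDetail VALUES
        (100, 1, 3, 10.5),
        (101, 1, 1, 10.5),
        (101, 2, 2, 7.75),
        (102, 3, 1, 2.25);
""")

# Group per order, keep only groups with exactly one detail row.
# With a single row in the group, MAX(UnitPrice) is just that row's price.
single_item_orders = conn.execute("""
    SELECT SalesOrderID, MAX(UnitPrice) AS UnitPrice
    FROM SalesOrderDetail
    GROUP BY SalesOrderID
    HAVING COUNT(*) = 1
    ORDER BY SalesOrderID
""").fetchall()

print(single_item_orders)
```

Order 101 drops out because it has two detail rows; note that order 100 stays even though its quantity is 3, since this reading counts detail lines rather than units.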
I took the question to mean a customer order of exactly one type of item with an order quantity of 1. The JOIN is unnecessary, as all the needed data is in one table. Here was my solution:
SELECT SalesOrderID, UnitPrice
FROM SalesOrderDetail
WHERE SalesOrderID IN
(SELECT SalesOrderID
FROM SalesOrderDetail
GROUP BY SalesOrderID HAVING COUNT(ProductID) = 1) AND OrderQty = 1
This eliminates orders that have more than one line item, as well as single-line orders with a quantity greater than 1.
SQLZoo doesn't have smiley faces for these answers, so I'm not sure if I'm correct ;)

With SqlServer, where * not exists

I got two tables, Orders with two columns as orderid and customerid, and Customers with two columns as customerid and location.
What I'd like to do is find all the customerid values in the Customers table that are not in Orders. For example, with Customers.customerid = {A, B, C, D} and Orders.customerid = {A, B, C}, I want just the ones from Customers that do not exist in Orders. To achieve that, I wrote:
select customerid from Customers where customerid not exists (select customerid from Orders)
But it returns nothing. My logic is quite simple: first get all the customerid values from the Orders table, then get the ones that don't exist among them. I can't see why this is wrong.
I tried this later, and it works. Can anyone help me, please?
select customerid from Customers as c where customerid not exists(select customerid from orders as o where c.customerid = o.customerid)
Why do I have to add c.customerid = o.customerid?
Just because you're using the same name for two columns in your database doesn't mean that any specific relationship is enforced or assumed between them.
You need to add the c.customerid = o.customerid to specify that you're interested in the specific condition that these two columns are equal.
But any other correlation condition is also allowed by the language. E.g. you could write a query:
select customerid from Customers as c where not exists(
select customerid from Customers as c2 where c2.customerid < c.customerid)
Which would find you the "first" customer, if considering the customers sorted by their customerid values (not that this is the best way of writing this query, it's just a demonstration of the flexibility)
Your first query was, in effect "give me all rows from the Customer table, provided that no rows exist in the Order table" - which is also a perfectly valid thing to ask for, but wasn't what you intended - you intended to perform some form of correlation, which is what you did in your second query.
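The difference between the two forms is easy to see on the example data from the question. This is a minimal sketch using SQLite through Python's sqlite3 module; the location values and order ids are made up.

```python
import sqlite3

# Customers A-D and orders for A-C, as in the question (other values invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customerid TEXT, location TEXT);
    CREATE TABLE Orders (orderid INTEGER, customerid TEXT);
    INSERT INTO Customers VALUES ('A', 'x'), ('B', 'x'), ('C', 'y'), ('D', 'y');
    INSERT INTO Orders VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

# Uncorrelated: the subquery is the same for every customer. Orders is not
# empty, so NOT EXISTS is false for all rows and nothing is returned.
uncorrelated = [row[0] for row in conn.execute("""
    SELECT customerid
    FROM Customers
    WHERE NOT EXISTS (SELECT customerid FROM Orders)
""")]

# Correlated: the subquery is evaluated per customer via c.customerid.
correlated = [row[0] for row in conn.execute("""
    SELECT c.customerid
    FROM Customers AS c
    WHERE NOT EXISTS (
        SELECT 1 FROM Orders AS o WHERE o.customerid = c.customerid
    )
""")]

print(uncorrelated, correlated)
```

The first list is empty and the second contains only D, which matches the behaviour described in the question.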
Maybe you need:
select customerid from Customers where customerid not in (select customerid from Orders)
Try the query below:
SELECT customerid from Customers C WHERE NOT EXISTS
(
SELECT 1 FROM orders O WHERE C.customerid = O.customerid
)
What you need is:
select c.customerid from customer c inner join order o on c.customerid = o.customerid where c.customerid not in (select od.customerid from order od)
You can't access data of 2 tables without joining them.
The syntax is a bit off. You mean to write it like this:
SELECT C.CustomerID
FROM Customers C
WHERE NOT EXISTS
(
SELECT O.CustomerID
FROM Orders O
WHERE O.CustomerID = C.CustomerID
)
;
You could also do this with NOT IN, as such:
SELECT customerid
FROM Customers
WHERE customerid NOT IN
(
SELECT customerid
FROM Orders
)
;
The two are semantically equivalent, for the most part.
Some people will probably tell you that you could do the same thing with a LEFT JOIN / IS NULL construct, but you can look at this article to see why that is a poorer choice in many circumstances.
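On the "for the most part" caveat: the case I know of where NOT IN and NOT EXISTS differ is when the subquery can return NULL. A sketch with a made-up NULL customerid in Orders, using SQLite through Python's sqlite3 module (the data is hypothetical; the NULL handling follows standard SQL three-valued logic):

```python
import sqlite3

# Same shape as the question's tables, plus one order with a NULL customerid.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customerid TEXT, location TEXT);
    CREATE TABLE Orders (orderid INTEGER, customerid TEXT);
    INSERT INTO Customers VALUES ('A', 'x'), ('B', 'x'), ('C', 'y'), ('D', 'y');
    INSERT INTO Orders VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, NULL);
""")

# 'D' NOT IN ('A', 'B', 'C', NULL) evaluates to NULL rather than true,
# so the row for D is filtered out.
not_in = [row[0] for row in conn.execute("""
    SELECT customerid
    FROM Customers
    WHERE customerid NOT IN (SELECT customerid FROM Orders)
""")]

# NOT EXISTS only asks whether a matching row exists; the NULL row never
# matches the equality, so D is still returned.
not_exists = [row[0] for row in conn.execute("""
    SELECT c.customerid
    FROM Customers AS c
    WHERE NOT EXISTS (
        SELECT 1 FROM Orders AS o WHERE o.customerid = c.customerid
    )
""")]

print(not_in, not_exists)
```

If Orders.customerid is declared NOT NULL, the two queries return the same rows.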
@Damien_The_Unbeliever gave the correct explanation. You can try it like this,
with some data I created for the 2 tables:
CREATE TABLE #Orders
(orderid varchar(10), customerid varchar(10))
insert into #Orders values
('venkat','a'),
('raj','b'),
('mahes','c')
CREATE TABLE #Customers
(customerid varchar(10), [location] varchar(10))
insert into #Customers values
('a','and'),
('b','bar'),
('c','board'),
('D','board1')
SELECT cu.customerid from #Customers CU WHERE NOT EXISTS
(
SELECT 1 FROM #orders b WHERE Cu.customerid = b.customerid
)
output
customerid
D

SQL based Northwind, hard time on filtering

So in a practice site there is a question:
Which Product is the most popular? (number of items)
This means that there are customers, and they want to know the product most often ordered by those customers (the overall orders of the top 1 ordered product).
I sincerely do not know how to solve this one.
Any help?
What I've tried so far is:
SELECT TOP(1) ProductID, ProductName
FROM Products
GROUP BY ProductID, ProductName
ORDER BY COUNT(*) DESC
But that's far from what they have asked.
In this one, I just get the top 1 Product with the lowest count, but that doesn't mean anything about the customers who ordered this product.
That only means that this specific item could have been at low quantity and still is lower than the others, while the others were at very high quantity and are now just low (but still not low enough).
I hope I was clear enough.
If the data exists in that table, you might just need to order by something more sophisticated than count, like summing the quantity (if that column exists). Also, if ProductID and ProductName are already unique identifiers, note that you don't need the group by and sum at all.
SELECT TOP(1) ProductID, ProductName
FROM Products
GROUP BY ProductID, ProductName
ORDER BY SUM(Quantity) DESC
I don't know what your keys are, but it sounds like you actually want to be counting how many times it was ordered by customers, so you may need to join on the Customers table. I am assuming here that you have a table Orders, that has one line per order and shares the ProductID key. I also assume that ProductID is unique in Products (which may not be true based on your first query).
SELECT TOP(1) Products.ProductID, Products.ProductName
FROM Products
LEFT JOIN Orders
ON Orders.ProductID = Products.ProductID
GROUP BY Products.ProductID, Products.ProductName
ORDER BY COUNT(Orders.OrderID) DESC
This really depends on what tables and keys you have available to you.
Select top 1 P.ProductID,P.ProductName,Sum(OD.Quantity)AS Quantity
From [Order Details] OD
inner join Products P ON P.ProductID = OD.ProductID
Group By P.ProductID,P.ProductName
Order by Quantity Desc
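To see why summing the quantity and counting rows can give different answers, here is a small sketch on made-up Northwind-style rows. It runs on SQLite through Python's sqlite3 module, so LIMIT 1 stands in for SQL Server's TOP 1, and the product names and quantities are assumptions.

```python
import sqlite3

# Hypothetical rows: Tofu appears on the most orders, Chang sells the most units.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE [Order Details] (
        OrderID   INTEGER,
        ProductID INTEGER,
        Quantity  INTEGER,
        PRIMARY KEY (OrderID, ProductID)
    );
    INSERT INTO Products VALUES (1, 'Chai'), (2, 'Chang'), (3, 'Tofu');
    INSERT INTO [Order Details] VALUES
        (1, 1, 10), (2, 1, 5),
        (1, 2, 40),
        (1, 3, 1), (2, 3, 1), (3, 3, 1);
""")

# Shared query shape; only the ORDER BY expression changes.
QUERY = """
    SELECT p.ProductID, p.ProductName,
           SUM(od.Quantity) AS TotalQuantity,
           COUNT(*)         AS OrderLines
    FROM [Order Details] od
    INNER JOIN Products p ON p.ProductID = od.ProductID
    GROUP BY p.ProductID, p.ProductName
    ORDER BY {ranking} DESC
    LIMIT 1
"""

top_by_quantity = conn.execute(QUERY.format(ranking="TotalQuantity")).fetchone()
top_by_lines = conn.execute(QUERY.format(ranking="OrderLines")).fetchone()

print(top_by_quantity, top_by_lines)
```

Which of the two counts as "most popular" depends on how the exercise defines it; the "(number of items)" hint in the question suggests the quantity sum.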
You can work out something like this (table names and schema may differ):
with cte_product
as
(
-- qualify ProductID: it exists in both Orders and Product
select O.ProductID,Rank() over (order by Count(1) desc) as Rank from
Orders O
inner join Product P
on P.ProductID = O.ProductID
group by O.ProductID
)
select P.productID, P.ProductName from
cte_product ct
inner join product p
on ct.productId = p.ProductID
where ct.Rank = 1
The crux is the use of RANK() to find the most popular product. You can fetch any other columns you need using the relevant joins.

Efficiency of joining subqueries in SQL Server

I have a customers and orders table in SQL Server 2008 R2. Both have indexes on the customer id (called id). I need to return details about all customers in the customers table and information from the orders table, such as details of the first order.
I currently left join my customers table on a subquery of the orders table, with the subquery returning the information I need about the orders. For example:
SELECT c.id
,c.country
,First_orders.product
,First_orders.order_id
FROM customers c
LEFT JOIN SELECT( id,
product
FROM (SELECT id
,product
,order_id
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No
FROM orders) orders
WHERE Order_no = 1) First_Orders
ON c.id = First_orders.id
I'm quite new to SQL and want to understand if I'm doing this efficiently. I end up left joining quite a few subqueries like this onto the customers table in one select query and it can take tens of minutes to run.
So am I doing this efficiently or can it be improved? For example, I'm not sure if my index on id in the orders table is of any use and maybe I could speed up the query by creating a temporary table of what is in the subquery first and creating a unique index on id in the temporary table so SQL Server knows id is now a unique column and then joining my customers table to this temporary table? I typically have one or two million rows in the customers and orders tables.
Many thanks in advance!
You can remove one of your subqueries to make it a little more efficient:
SELECT c.id
,c.country
,First_orders.product
,First_orders.order_id
FROM customers c
LEFT JOIN (SELECT id
,product
,order_id
,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Order_Date asc) as order_No
FROM orders) First_Orders
ON c.id = First_orders.id AND First_Orders.order_No = 1
In your query above, you need to be careful where you place your parentheses, as I don't think it will work as written. Also, you're returning order_id in your results but not including it in your nested subquery.
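One detail of the rewritten query that is easy to miss is that the order_No = 1 filter sits in the ON clause rather than in a WHERE clause. A sketch of the difference on made-up rows, using SQLite through Python's sqlite3 module (this assumes the bundled SQLite is 3.25 or newer, which is needed for ROW_NUMBER; the data is hypothetical):

```python
import sqlite3

# Hypothetical data: customer 1 has two orders, customer 2 has one, customer 3 none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        id         INTEGER,   -- customer id, named as in the question
        product    TEXT,
        order_date TEXT
    );
    INSERT INTO customers VALUES (1, 'UK'), (2, 'FR'), (3, 'DE');
    INSERT INTO orders VALUES
        (100, 1, 'pen', '2020-02-01'),
        (101, 1, 'ink', '2020-01-01'),
        (102, 2, 'cap', '2020-03-01');
""")

# Derived table that numbers each customer's orders by date.
NUMBERED = """
    SELECT id, product, order_id,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY order_date ASC) AS order_no
    FROM orders
"""

# Filter in ON: customers without a first order are kept, padded with NULLs.
filter_in_on = conn.execute(f"""
    SELECT c.id, c.country, f.product, f.order_id
    FROM customers c
    LEFT JOIN ({NUMBERED}) f
        ON c.id = f.id AND f.order_no = 1
    ORDER BY c.id
""").fetchall()

# Filter in WHERE: the NULL-padded rows fail the test, so the LEFT JOIN
# effectively behaves like an INNER JOIN.
filter_in_where = conn.execute(f"""
    SELECT c.id, c.country, f.product, f.order_id
    FROM customers c
    LEFT JOIN ({NUMBERED}) f
        ON c.id = f.id
    WHERE f.order_no = 1
    ORDER BY c.id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

If every customer must appear in the result, the ON-clause form is the one that preserves them.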
For someone who is just learning SQL, your query looks pretty good.
The index on customers may or may not be used for the query -- you would need to look at the execution plan. An index on orders(id, order_date) could be used quite effectively for the row_number function.
One comment is on the naming of fields. The field orders.id should not be the customer id. That should be something like 'orders.Customer_Id'. Keeping the naming system consistent across tables will help you in the future.
Try this; it's easy to understand:
;WITH cte
AS (
SELECT id
,product
,order_id
,ROW_NUMBER() OVER (
PARTITION BY id ORDER BY Order_Date ASC
) AS order_No
FROM orders
)
SELECT c.id
,c.country
,c1.Product
,c1.order_id
FROM customers c
-- LEFT JOIN with the filter in ON keeps customers that have no orders
LEFT JOIN cte c1 ON c.id = c1.id
AND c1.order_No = 1