SQL query with subquery and without subquery comparison - sql

I'd like someone to explain the logical difference between these two queries, and ideally the performance difference as well. (The database is Microsoft Northwind.)
-- Join
select distinct c.CustomerID, c.CompanyName, c.ContactName from orders as o inner join customers as c
on o.CustomerID = c.CustomerID
-- SubQuery
select customerid, companyname, contactname, country from customers
where customerid in (select distinct customerid from orders)
Thanks in advance.

The first generates an intermediate result set with all orders for all customers. It then reduces them using select distinct.
The second just selects the customers without having to reduce them later. It should be much more efficient. However, the select distinct is not needed in the subquery (it is done automatically with in).
I would write the logic as:
select c.customerid, c.companyname, c.contactname, c.country
from customers c
where exists (select 1
from orders o
where o.customerid = c.customerid
);
This can readily make use of an index on orders(customerid).
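To see that the three forms return the same customers, here is a minimal sketch using Python's built-in sqlite3 module. The tables are a tiny made-up stand-in for Northwind (the rows and the helper name customer_ids are assumptions for illustration only). It shows result equivalence and the size of the intermediate join; it says nothing about actual performance on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerID TEXT PRIMARY KEY, CompanyName TEXT);
CREATE TABLE orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
-- made-up rows: one customer has two orders, one has none
INSERT INTO customers VALUES ('ALFKI', 'Alfreds'), ('ANATR', 'Ana Trujillo'), ('PARIS', 'Paris');
INSERT INTO orders VALUES (1, 'ALFKI'), (2, 'ALFKI'), (3, 'ANATR');
""")

def customer_ids(sql):
    """Run a query and return the first column as a sorted list (hypothetical helper)."""
    return sorted(row[0] for row in conn.execute(sql))

join_ids = customer_ids(
    "SELECT DISTINCT c.CustomerID FROM orders o "
    "JOIN customers c ON o.CustomerID = c.CustomerID")
in_ids = customer_ids(
    "SELECT CustomerID FROM customers "
    "WHERE CustomerID IN (SELECT CustomerID FROM orders)")
exists_ids = customer_ids(
    "SELECT c.CustomerID FROM customers c "
    "WHERE EXISTS (SELECT 1 FROM orders o WHERE o.CustomerID = c.CustomerID)")

# Rows the join produces before DISTINCT collapses them: one per order.
join_rows_before_distinct = conn.execute(
    "SELECT COUNT(*) FROM orders o JOIN customers c ON o.CustomerID = c.CustomerID"
).fetchone()[0]

print(join_ids, in_ids, exists_ids, join_rows_before_distinct)
```

The join yields one row per order and relies on DISTINCT to collapse them, while IN and EXISTS only ever yield one row per customer.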

Related

show each customers' names, how much each person has spent in total, and how many orders they have made

I am trying to extract information from a table to answer this question:
show each customers' names, how much each person has spent in total, and how many orders they have made
Here is my attempt
select firstName, lastName, ISNULL(totalOrders, 0), ISNULL(totalSpent) from Customers as C
join (
SELECT customerID, count(orderNumber) as totalOrders from CustomerOrders
ORDER BY COUNT(orderNumber) ASC
) AS CO ON CO.customerID = C.customerID
JOIN (
select sum(orderNumber) as totalSpent from ItemsInOrder
order by C.lastName
) as IIO ON IIO.totalSpent = CO.totalOrders
Unfortunately, this did not run. I'm also trying to order the result by order count (ascending) and then by the customer's last name, but I'm having a hard time because I don't know where to place the ORDER BY.
I feel like this is an easy question, but I kept overthinking it and ended up confused.
As @HoneyBadger has pointed out in the comments, you are not using any group by clause, so that is going to error out. You also have to look at how you join your tables: it does not make sense to equate the sum of item costs to the total number of orders. You should be joining on the order number.
Here is a quick and dirty answer using outer apply. Because count on a single column ignores nulls, we don't need an isnull or a coalesce there, but we do on the sum column.
select
c.firstname,
c.lastname,
sum(coalesce(a.TotalSpend, 0)) as TotalSpend,
count(a.orderNumber) as NumOrders
from customers c
outer apply
(
select co.customerid, co.orderNumber, sum(ii.totalItemCost) as TotalSpend
from customerorders co
left join itemsinorders ii on ii.ordernumber = co.ordernumber
where co.customerid = c.customerid -- correlate inside the apply; outer apply takes no ON clause
group by co.customerid, co.orderNumber
) a
group by c.firstname, c.lastname
If outer apply doesn't work, you should be able to do it all with joins with a count(distinct). So,
select
c.firstname,
c.lastname,
count(distinct o.ordernumber) as NumOrders,
sum(coalesce(i.totalItemCost, 0)) as totalSpend
from customers c
left join customerorders o on o.customerid = c.customerid
left join itemsinorders i on i.ordernumber = o.ordernumber
group by c.firstname, c.lastname
I have not been able to check how nulls interact with count(distinct) on your data, but this should work and is easier to read.
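As a rough check of the join-based approach, here is a minimal sketch using Python's built-in sqlite3 module. The schema is an assumption pieced together from the queries above (in particular the totalItemCost column and all sample rows are made up). It also shows where the ORDER BY from the question goes: once, at the end of the outermost query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- assumed schema and made-up data, for illustration only
CREATE TABLE Customers (customerID INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
CREATE TABLE CustomerOrders (orderNumber INTEGER PRIMARY KEY, customerID INTEGER);
CREATE TABLE ItemsInOrder (orderNumber INTEGER, totalItemCost INTEGER);
INSERT INTO Customers VALUES (1, 'Ann', 'Adams'), (2, 'Bob', 'Brown');
INSERT INTO CustomerOrders VALUES (10, 1), (11, 1);        -- Bob has no orders
INSERT INTO ItemsInOrder VALUES (10, 10), (10, 5), (11, 20);
""")

rows = conn.execute("""
SELECT c.firstName, c.lastName,
       COUNT(DISTINCT o.orderNumber) AS numOrders,      -- order 10 appears twice after the item join
       COALESCE(SUM(i.totalItemCost), 0) AS totalSpent  -- NULL for customers without orders
FROM Customers c
LEFT JOIN CustomerOrders o ON o.customerID = c.customerID
LEFT JOIN ItemsInOrder i ON i.orderNumber = o.orderNumber
GROUP BY c.customerID, c.firstName, c.lastName
ORDER BY numOrders ASC, c.lastName ASC
""").fetchall()

for row in rows:
    print(row)
```

Grouping by customerID as well as the names keeps two customers who share a name from being merged into one row.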

Find customer who bought least on W3schools SQL

I'm new to SQL Server and I'm trying to do some exercises. I want to find customers who bought least on W3schools database. My solution for this case is:
Join Customers with OrderDetails via CustomerID
Select CustomerNames that have least OrderID appeared after using JOIN.
Here is my query:
SELECT COUNT(OrderID), CustomerID
FROM Orders
GROUP BY CustomerID
ORDER BY COUNT(CustomerID) ASC
HAVING COUNT(OrderID) = '1'
When I ran this query, the message said: Syntax error near "Having". What is wrong with my query?
Please help me to figure out.
My solution for this case is:
Join Customers with OrderDetails via CustomerID
Select CustomerNames that have least OrderID appeared after using JOIN.
As @thorsten-kettner lamented:
You say in your explanation that you join and then show the customer
name. Your query does neither of the two things...
Furthermore, your question has severe grammatical errors making it hard to decipher.
I want to find customers who bought least on W3schools database.
Nonetheless,
The Try-SQL Editor at w3schools.com
To get the list of customers who have at least 1 order:
SELECT C.CustomerName FROM [Customers] AS C
JOIN [Orders] AS O
ON C.CustomerID = O.CustomerID
GROUP BY C.CustomerID
ORDER BY C.CustomerName
To get the list of customers who have exactly 1 order:
SELECT C.CustomerName FROM [Customers] AS C
JOIN [Orders] AS O
ON C.CustomerID = O.CustomerID
GROUP BY C.CustomerID
HAVING COUNT(O.OrderID) = 1
ORDER BY C.CustomerName
To get the customer who made the least number of orders:
Including the ones who made no order. Use JOIN instead of LEFT JOIN if you only want to consider the ones who made at least one order.
You can remove LIMIT 1 to get the whole list sorted by the number of orders placed.
SELECT C.CustomerName, COUNT(O.OrderID) FROM [Customers] AS C
LEFT JOIN [Orders] AS O
ON C.CustomerID = O.CustomerID
GROUP BY C.CustomerID
ORDER BY COUNT(O.OrderID), C.CustomerName
LIMIT 1;
Addendum
As commented by @sticky-bit,
The ORDER BY clause has to come after the HAVING clause.
You want a TOP 1 WITH TIES query, something like this:
SELECT TOP 1 WITH TIES CustomerID
FROM Orders
GROUP BY CustomerID
ORDER BY COUNT(OrderID);
In case you are using MySQL, try the following version:
SELECT CustomerID
FROM Orders
GROUP BY CustomerID
HAVING COUNT(OrderID) = (
SELECT COUNT(OrderID)
FROM Orders
GROUP BY CustomerID
ORDER BY COUNT(OrderID)
LIMIT 1
);
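To illustrate both points (HAVING must precede ORDER BY, and ties for the lowest count should all be returned), here is a minimal sketch using Python's built-in sqlite3 module with made-up order rows. SQLite has no TOP ... WITH TIES, so the sketch uses the HAVING-equals-subquery form; the variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
-- made-up data: customer 1 has three orders, customers 2 and 3 have one each
INSERT INTO Orders (CustomerID) VALUES (1), (1), (1), (2), (3);
""")

# ORDER BY placed before HAVING is rejected by the parser.
try:
    conn.execute(
        "SELECT CustomerID FROM Orders GROUP BY CustomerID "
        "ORDER BY COUNT(OrderID) HAVING COUNT(OrderID) = 1")
    misplaced_having_fails = False
except sqlite3.OperationalError:
    misplaced_having_fails = True

# Correct clause order; the subquery finds the smallest per-customer count.
least = [row[0] for row in conn.execute("""
SELECT CustomerID
FROM Orders
GROUP BY CustomerID
HAVING COUNT(OrderID) = (
    SELECT COUNT(OrderID)
    FROM Orders
    GROUP BY CustomerID
    ORDER BY COUNT(OrderID)
    LIMIT 1
)
ORDER BY CustomerID
""")]

print(misplaced_having_fails, least)
```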

Not in aggregate function or group by clause: org.hsqldb.Expression@59bcb2b6 in statement

I'm trying to group SUM(OrderDetails.Quantity) but keep getting the error "Not in aggregate function or group by clause: org.hsqldb.Expression@59bcb2b6 in statement". Since I already have a GROUP BY part, I don't know what I'm missing.
SQL Statement:
SELECT OrderDetails.CustomerID, Customers.CompanyName, Customers.ContactName, SUM(OrderDetails.Quantity)
FROM OrderDetails INNER JOIN Customers ON OrderDetails.CustomerID = Customers.CustomerID
WHERE OrderDetails.CustomerID = Customers.CustomerID
GROUP BY OrderDetails.CustomerID
ORDER BY OrderDetails.CustomerID ASC
I'm trying to create a table that shows customers and the amount of products they ordered, while also showing their CompanyName and ContactName.
Write this:
GROUP BY OrderDetails.CustomerID, Customers.CompanyName, Customers.ContactName
Unlike in MySQL, PostgreSQL, and standard SQL, in most other SQL dialects, it is not sufficient to group only by the primary key if you also want to project functionally dependent columns in the SELECT clause, or elsewhere. You have to explicitly GROUP BY all of the columns that you want to project.
Don't take the customer id from the orders table. Take it from the customers table. If you do so, this might work in your database:
SELECT c.CustomerID, c.CompanyName, c.ContactName, SUM(od.Quantity)
FROM OrderDetails od INNER JOIN
Customers c
ON od.CustomerID = c.CustomerID
GROUP BY c.CustomerID
ORDER BY c.CustomerID ASC;
Note that the WHERE clause does not need to repeat the conditions in the ON clause.
Your version won't work in standard SQL because od.CustomerId is not unique in OrderDetails. Many databases don't support this, so in these you need the additional columns:
SELECT c.CustomerID, c.CompanyName, c.ContactName, SUM(od.Quantity)
FROM OrderDetails od INNER JOIN
Customers c
ON od.CustomerID = c.CustomerID
GROUP BY c.CustomerID, c.CompanyName, c.ContactName
ORDER BY c.CustomerID ASC;
Even so, it is much, much better to take all columns from the same table. That would allow the SQL optimizer to use indexes on Customers.
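For reference, here is a minimal sketch of the portable form (every non-aggregated column listed in GROUP BY), run through Python's built-in sqlite3 module on made-up rows. The two-table schema is an assumption based on the question. Note that SQLite itself is lenient about ungrouped columns, so this sketch cannot reproduce the HSQLDB error; it only shows the query shape that stricter engines accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- assumed schema and made-up data, for illustration only
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CompanyName TEXT, ContactName TEXT);
CREATE TABLE OrderDetails (CustomerID INTEGER, Quantity INTEGER);
INSERT INTO Customers VALUES (1, 'Acme', 'Al'), (2, 'Bolt', 'Bea');
INSERT INTO OrderDetails VALUES (1, 3), (1, 4), (2, 4);
""")

totals = conn.execute("""
SELECT c.CustomerID, c.CompanyName, c.ContactName, SUM(od.Quantity)
FROM OrderDetails od
INNER JOIN Customers c ON od.CustomerID = c.CustomerID
GROUP BY c.CustomerID, c.CompanyName, c.ContactName  -- every selected non-aggregate column
ORDER BY c.CustomerID ASC
""").fetchall()

print(totals)
```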

Sql query to display records that appear more than once in a table

I have two tables, Customer with columns CustomerID, FirstName, Address and Purchases with columns PurchaseID, Qty, CustomersID.
I want to create a query that will display FirstName(s) that have bought more than two products, product quantity is represented by Qty.
I can't seem to figure this out - I've just started with T-SQL
You could sum the purchases and use a having clause to filter those you're interested in. You can then use the in operator to query only the customer names that fit these IDs:
SELECT FirstName
FROM Customer
WHERE CustomerID IN (SELECT CustomerID
FROM Purchases
GROUP BY CustomerID
HAVING SUM(Qty) > 2)
Please try this, it should work for you, according to your question.
Select MIN(C.FirstName) FirstName from Customer C INNER JOIN Purchases P ON C.CustomerID=P.CustomersID Group by P.CustomersID Having SUM(P.Qty) >2
Please try this:
select c.FirstName,p.Qty
from Customer as c
join Purchases as p
on c.CustomerID = p.CustomerID
where c.CustomerID in (select CustomerID from Purchases group by CustomerID having count(CustomerID)>2);
SELECT
c.FirstName
FROM
Customer c
INNER JOIN Purchases p
ON c.CustomerId = p.CustomerId
GROUP BY
c.FirstName
HAVING
SUM(p.Qty) > 2
While the IN suggestions would work, they are overkill and more than likely less performant than a straightforward join with aggregation. The trick is the HAVING clause: by using it you can limit your result to the names you want. Here is a link to learn more about IN vs. EXISTS vs. JOIN (NOT IN vs. NOT EXISTS)
There are dozens of ways of doing this. To introduce you to window functions and common table expressions, which are overkill for this simplified example but invaluable in your toolset as your queries get more complex:
;WITH cte AS (
SELECT DISTINCT
c.FirstName
,SUM(p.Qty) OVER (PARTITION BY c.CustomerId) as SumOfQty
FROM
Customer c
INNER JOIN Purchases p
ON c.CustomerId = p.CustomerId
)
SELECT *
FROM
cte
WHERE
SumOfQty > 2
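To compare the IN form and the join-with-HAVING form side by side, here is a minimal sketch using Python's built-in sqlite3 module. The column names follow the question (including CustomersID on Purchases); the rows are made up. Grouping by CustomerID as well as FirstName is a deliberate deviation so that two customers sharing a first name are not summed together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, Address TEXT);
CREATE TABLE Purchases (PurchaseID INTEGER PRIMARY KEY, Qty INTEGER, CustomersID INTEGER);
-- made-up data: Ann buys 1 + 2, Bob buys 2, Cid buys 5
INSERT INTO Customer VALUES (1, 'Ann', 'A St'), (2, 'Bob', 'B St'), (3, 'Cid', 'C St');
INSERT INTO Purchases (Qty, CustomersID) VALUES (1, 1), (2, 1), (2, 2), (5, 3);
""")

# Aggregate in a subquery, then filter customers with IN.
with_in = [row[0] for row in conn.execute("""
SELECT FirstName
FROM Customer
WHERE CustomerID IN (SELECT CustomersID
                     FROM Purchases
                     GROUP BY CustomersID
                     HAVING SUM(Qty) > 2)
ORDER BY FirstName
""")]

# Join first, then aggregate and filter with HAVING.
with_join = [row[0] for row in conn.execute("""
SELECT c.FirstName
FROM Customer c
INNER JOIN Purchases p ON c.CustomerID = p.CustomersID
GROUP BY c.CustomerID, c.FirstName
HAVING SUM(p.Qty) > 2
ORDER BY c.FirstName
""")]

print(with_in, with_join)
```

Ann qualifies only because her two purchases are summed, which is why the filter is on SUM(Qty) and not on the number of purchase rows.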

Need hints on seemingly simple SQL query

I'm trying to do something like:
SELECT c.id, c.name, COUNT(orders.id)
FROM customers c
JOIN orders o ON o.customerId = c.id
However, SQL will not allow the COUNT function. The error given at execution is that c.Id is not valid in the select list because it isn't in the group by clause or isn't aggregated.
I think I know the problem, COUNT just counts all the rows in the orders table. How can I make a count for each customer?
EDIT
Full query, but it's in Dutch. This is what I tried:
select k.ID,
Naam,
Voornaam,
Adres,
Postcode,
Gemeente,
Land,
Emailadres,
Telefoonnummer,
count(*) over (partition by k.id) as 'Aantal bestellingen',
Kredietbedrag,
Gebruikersnaam,
k.LeverAdres,
k.LeverPostnummer,
k.LeverGemeente,
k.LeverLand
from klanten k
join bestellingen on bestellingen.klantId = k.id
No errors but no results either..
When using an aggregate function like that, you need to group by any columns that aren't aggregates:
SELECT c.id, c.name, COUNT(o.id)
FROM customers c
JOIN orders o ON o.customerId = c.id
GROUP BY c.id, c.name
If you really want to be able to select all of the columns in Customers without specifying the names (please read this blog post in full for reasons to avoid this, and easy workarounds), then you can do this lazy shorthand instead:
;WITH o AS
(
SELECT CustomerID, OrderCount = COUNT(*)
FROM dbo.Orders GROUP BY CustomerID
)
SELECT c.*, o.OrderCount
FROM dbo.Customers AS c
INNER JOIN o
ON c.id = o.CustomerID;
EDIT for your real query
SELECT
k.ID,
k.Naam,
k.Voornaam,
k.Adres,
k.Postcode,
k.Gemeente,
k.Land,
k.Emailadres,
k.Telefoonnummer,
[Aantal bestellingen] = o.klantCount,
k.Kredietbedrag,
k.Gebruikersnaam,
k.LeverAdres,
k.LeverPostnummer,
k.LeverGemeente,
k.LeverLand
FROM klanten AS k
INNER JOIN
(
SELECT klantId, klantCount = COUNT(*)
FROM dbo.bestellingen
GROUP BY klantId
) AS o
ON k.id = o.klantId;
I think this solution is much cleaner than grouping by all of the columns. Grouping on the orders table first and then joining once to each customer row is likely to be much more efficient than joining first and then grouping.
The following will count the orders per customer without the need to group the overall query by customer.id. But this also means that for customers with more than one order, that count will be repeated for each order.
SELECT c.id, c.name, COUNT(o.id) over (partition by c.id)
FROM customers c
JOIN orders o ON o.customerId = c.id
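The difference between the two approaches shows up in the row counts. Here is a minimal sketch using Python's built-in sqlite3 module on made-up rows; it assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The variable names derived and windowed are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customerId INTEGER);
-- made-up data: Ann has two orders, Bob has one
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders (customerId) VALUES (1), (1), (2);
""")

# Aggregate the orders first, then join once per customer.
derived = conn.execute("""
SELECT c.id, c.name, o.orderCount
FROM customers c
JOIN (SELECT customerId, COUNT(*) AS orderCount
      FROM orders
      GROUP BY customerId) o ON o.customerId = c.id
ORDER BY c.id
""").fetchall()

# Window function: no GROUP BY, so one output row per order.
windowed = conn.execute("""
SELECT c.id, c.name, COUNT(o.id) OVER (PARTITION BY c.id)
FROM customers c
JOIN orders o ON o.customerId = c.id
ORDER BY c.id
""").fetchall()

print(derived)
print(windowed)
```

The derived-table form returns one row per customer, while the windowed form repeats Ann's count on each of her order rows unless a DISTINCT is added.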