Order of Execution of Subqueries in SQL

SELECT customerid,
(SELECT COUNT(*)
FROM orders
WHERE customers.customerid = orders.customerid) as total_orders
FROM customers
Can anyone explain how this SQL code works? I expected the subquery to return the same value for every row, because the total number of rows where
customers.customerid = orders.customerid is always the same. But it displays each customer along with the total_orders made by that customer. What order of execution produces this result?
Please find the database here:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_distinct

Your query is:
SELECT c.customerid,
(SELECT COUNT(*)
FROM orders o
WHERE c.customerid = o.customerid
) as total_orders
FROM customers c;
(Note that I added table aliases and qualified all column names.)
This is a scalar, correlated subquery. It is a scalar subquery because it returns a single value (rather than a table).
It is correlated because the subquery is linked to the outer query. This is the part that confuses you.
Basically, the outer query says that the result set will have one row for each customer.
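The subquery then says that, for each of those rows, total_orders is the number of rows in orders that match the customer in that row. The reference to c.customerid inside the subquery takes its value from the current row of the outer query.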
Although writing the query with a subquery is totally fine, this would often be written as:
SELECT c.customerid, COUNT(o.customerid) as total_orders
FROM customers c LEFT JOIN
orders o
ON c.customerid = o.customerid
GROUP BY c.customerId
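The equivalence of the two forms can be checked on a tiny in-memory database. This is only a sketch: it uses SQLite through Python's sqlite3 module rather than the W3Schools environment, and the three customers and three orders are made-up sample rows.

```python
import sqlite3

# Hypothetical miniature of the customers/orders tables (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerid INTEGER PRIMARY KEY);
    CREATE TABLE orders (orderid INTEGER PRIMARY KEY, customerid INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Scalar, correlated subquery: evaluated with the current outer row's customerid.
subquery_rows = conn.execute("""
    SELECT c.customerid,
           (SELECT COUNT(*)
            FROM orders o
            WHERE o.customerid = c.customerid) AS total_orders
    FROM customers c
    ORDER BY c.customerid
""").fetchall()

# LEFT JOIN + GROUP BY: COUNT(o.customerid) skips the NULLs of unmatched customers.
join_rows = conn.execute("""
    SELECT c.customerid, COUNT(o.customerid) AS total_orders
    FROM customers c
    LEFT JOIN orders o ON c.customerid = o.customerid
    GROUP BY c.customerid
    ORDER BY c.customerid
""").fetchall()

print(subquery_rows)  # [(1, 2), (2, 1), (3, 0)]
print(join_rows)      # same rows
```

Customer 3 has no orders and still appears with a count of 0 in both results, which is why the join version needs LEFT JOIN rather than INNER JOIN.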

You are using a correlated subquery, which means the inner query is (conceptually) evaluated once for each row of the outer query.
In your case, the inner query is evaluated once per customer, because the WHERE clause customers.customerid = orders.customerid refers to the current row of the outer query. So the aggregate function COUNT(*) returns the number of orders for that one customer. Your outer query selects customerid and total_orders, which is why you get two columns.

Related

SQL dividing a count from one table by a number from a different table

I am struggling with taking a COUNT() from one table and dividing it by a corresponding number from a different table in Microsoft SQL Server.
Here is a fictional example of what I'm trying to do.
Let's say I have a table of orders. One column in it is the state.
I have a second table with a column for the state and a second column for each state's population.
I'd like to find the number of orders relative to population for each state, but I have struggled to get my query right.
Here is what I have so far:
SELECT Orders.State, Count(*)/
(SELECT StatePopulations.Population FROM Orders INNER JOIN StatePopulations
on Orders.State = StatePopulations.State
WHERE Orders.state = StatePopulations.State )
FROM Orders INNER JOIN StatePopulations
ON Orders.state = StatePopulations.State
GROUP BY Orders.state
So far I'm contending with an error that says my sub query is returning multiple results for each state, but I'm newer to SQL and don't know how to overcome it.
If you really want a correlated sub-query, then this should do it...
(You don't need to join both tables in either the inner or the outer query; the correlation in the inner query's WHERE clause does the 'join'.)
SELECT
Orders.state,
COUNT(*) / (SELECT population FROM StatePopulations WHERE state = Orders.state)
FROM
Orders
GROUP BY
Orders.state
Personally, I'd just join them and use MAX()...
SELECT
Orders.state,
COUNT(*) / MAX(StatePopulations.population)
FROM
Orders
INNER JOIN
StatePopulations
ON StatePopulations.state = Orders.state
GROUP BY
Orders.state
Or aggregate your orders before you join...
SELECT
Orders.state,
Orders.order_count / StatePopulations.population
FROM
(
SELECT
Orders.state,
COUNT(*) AS order_count
FROM
Orders
GROUP BY
Orders.state
)
Orders
INNER JOIN
StatePopulations
ON StatePopulations.state = Orders.state
(Please forgive typos and smelling pistakes, I'm doing this on a phone.)
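One thing to watch with any of these variants is integer division: if the population column is an integer type, COUNT(*) divided by it is truncated, in SQL Server as well as in SQLite. The sketch below shows the effect using Python's sqlite3 module; the two states, their populations, and the `id` column are invented for illustration, and multiplying by 1.0 is just one way to force a decimal result.

```python
import sqlite3

# Hypothetical data: state codes and populations are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE StatePopulations (state TEXT PRIMARY KEY, population INTEGER);
    INSERT INTO Orders (state) VALUES ('AA'), ('AA'), ('AA'), ('BB');
    INSERT INTO StatePopulations VALUES ('AA', 6), ('BB', 4);
""")

# Integer / integer truncates, so every ratio below 1 collapses to 0.
integer_rows = conn.execute("""
    SELECT o.state, COUNT(*) / MAX(p.population)
    FROM Orders o
    INNER JOIN StatePopulations p ON p.state = o.state
    GROUP BY o.state
    ORDER BY o.state
""").fetchall()

# Multiplying by 1.0 first makes the division happen in floating point.
decimal_rows = conn.execute("""
    SELECT o.state, 1.0 * COUNT(*) / MAX(p.population)
    FROM Orders o
    INNER JOIN StatePopulations p ON p.state = o.state
    GROUP BY o.state
    ORDER BY o.state
""").fetchall()

print(integer_rows)  # [('AA', 0), ('BB', 0)]
print(decimal_rows)  # [('AA', 0.5), ('BB', 0.25)]
```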

Count(*) with inner join and group by

I have 2 tables.
I need to count the rows per id in table 2 and get the name for each id from table 1.
I tried the following code. Didn't work!
select orders.CustomerID, customers.ContactName , count(*)
from Orders
left join customers on Customers.CustomerID= Orders.customerid
group by Orders.customerid;
Please explain what I did wrong, if possible.
When grouping, the GROUP BY clause needs to mention every column that appears in the SELECT list outside of an aggregate such as COUNT. You're missing ContactName here.
A fixed version:
select orders.CustomerID, customers.ContactName , count(*)
from Orders
left join customers on Customers.CustomerID= Orders.customerid
group by Orders.customerid, customers.ContactName;
Alternatively you can group by ID alone and then make the join like this:
With OrderCounts AS
(
select orders.CustomerID , count(*) AS OrderCount
from Orders
group by Orders.customerid
)
SELECT OrderCounts.CustomerID
, customers.ContactName
, OrderCounts.OrderCount
FROM OrderCounts
left join customers on Customers.CustomerID= OrderCounts.CustomerID
The first version is shorter and easier to type. In some scenarios the second version will run faster as the group by occurs on a single table & column.
For the second to give the same results, CustomerID must be unique in the customers table; otherwise it will produce duplicate rows (and in that case the first example would double-count orders).
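As a quick check that the two versions agree when CustomerID is unique, here is a small sketch using Python's sqlite3 module. The customer IDs, names, and orders are made-up sample rows, and short table aliases are used for brevity.

```python
import sqlite3

# Hypothetical sample rows; CustomerID is unique in Customers (primary key).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, ContactName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
    INSERT INTO Customers VALUES ('ALFKI', 'Maria'), ('ANATR', 'Ana');
    INSERT INTO Orders VALUES (1, 'ALFKI'), (2, 'ALFKI'), (3, 'ANATR');
""")

# Version 1: join first, then group by every non-aggregated column.
grouped_after_join = conn.execute("""
    SELECT o.CustomerID, c.ContactName, COUNT(*)
    FROM Orders o
    LEFT JOIN Customers c ON c.CustomerID = o.CustomerID
    GROUP BY o.CustomerID, c.ContactName
    ORDER BY o.CustomerID
""").fetchall()

# Version 2: aggregate Orders alone in a CTE, then join the names on.
grouped_before_join = conn.execute("""
    WITH OrderCounts AS (
        SELECT CustomerID, COUNT(*) AS OrderCount
        FROM Orders
        GROUP BY CustomerID
    )
    SELECT oc.CustomerID, c.ContactName, oc.OrderCount
    FROM OrderCounts oc
    LEFT JOIN Customers c ON c.CustomerID = oc.CustomerID
    ORDER BY oc.CustomerID
""").fetchall()

print(grouped_after_join)   # [('ALFKI', 'Maria', 2), ('ANATR', 'Ana', 1)]
print(grouped_before_join)  # same rows
```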

Join with a groupby operation in SQL

I am playing with W3Schools SQL environment. A pre-defined database is setup here.
Tables to be used: Customer and Orders.
To get all the info from Customer we can do:
SELECT * FROM [Customers]
To get customers who have fewer than 3 orders we do:
SELECT CustomerID, count(*) as num_orders FROM [Orders] group by customerID having num_orders<3
To get the Customers we have in London, we do:
SELECT * FROM [Customers] where city="London"
Question: How can I get, for every customer in London with fewer than 3 orders, how many orders they have?
I know it has to be a left join, as I want to keep all customers even if they have no orders at all (so, no records in "Orders"), but I am having a hard time making it work.
I tried:
SELECT * FROM [Customers] where city="London"
left join (SELECT CustomerID, count(*) as num_orders FROM [Orders] group by customerID having num_orders<3) as data
on customers.CustomerID= data.CustomerID
But the environment gives no meaningful information about the error.
The proper syntax is:
SELECT c.*, o.num_orders
FROM [Customers] c LEFT JOIN
(SELECT o.CustomerID, COUNT(*) as num_orders
FROM [Orders] o
GROUP BY o.customerID
) o
ON c.CustomerID = o.CustomerID
WHERE c.city = 'London';
Notes:
The most important difference is the order of the clauses. WHERE comes after the FROM clause.
The HAVING clause is removed, because the question is for all customers in London.
Single quotes are used to quote London. Single quotes are the standard string delimiter.
The query uses table aliases, and these are specifically chosen to be very short and related to the table names.
All columns are qualified.
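If the "fewer than 3 orders" restriction from the question is also wanted, while still keeping London customers with no orders at all, one option is to filter in the outer WHERE clause and treat a missing count as 0 with COALESCE. The following is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, and the four customers and their orders are invented sample data.

```python
import sqlite3

# Hypothetical sample: three London customers (1, 3 and 0 orders) and one in Berlin.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES
        (1, 'A', 'London'), (2, 'B', 'London'), (3, 'C', 'London'), (4, 'D', 'Berlin');
    INSERT INTO Orders (CustomerID) VALUES (1), (2), (2), (2), (4);
""")

london_rows = conn.execute("""
    SELECT c.CustomerID, COALESCE(o.num_orders, 0) AS num_orders
    FROM Customers c
    LEFT JOIN (SELECT CustomerID, COUNT(*) AS num_orders
               FROM Orders
               GROUP BY CustomerID
              ) o
           ON c.CustomerID = o.CustomerID
    WHERE c.City = 'London'
      AND COALESCE(o.num_orders, 0) < 3   -- NULL (no orders) counts as 0
    ORDER BY c.CustomerID
""").fetchall()

print(london_rows)  # [(1, 1), (3, 0)]
```

Putting HAVING num_orders < 3 inside the derived table would not be equivalent: a customer with 3 or more orders would drop out of the subquery, then reappear through the LEFT JOIN with a NULL count.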

SQL Counting the amount of times a value from one table shows up in another

I am trying to work out how to go about this one SQL query.
I have two tables Orders and Customers.
Orders has two columns CustomerNumber and Fruit
Customers has two columns as well CustomerNumber and Address
Not all customers have placed an order, but I need a query that runs through the list of Customers.CustomerNumber and lists how many times each Customers.CustomerNumber shows up in the table Orders.
It's a countif query but I'm not sure how to set it up.
-- Note: with an inner join, customers without any orders are omitted
select c.CustomerNumber, count(*)
from Customers c inner join Orders o on c.CustomerNumber = o.CustomerNumber
group by c.CustomerNumber
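select c.CustomerNumber, count(o.CustomerNumber)
from Customers as c
left join Orders as o on c.CustomerNumber = o.CustomerNumber
group by c.CustomerNumber
This will return zero for customers without any orders, because count(o.CustomerNumber) ignores the NULLs that the left join produces for them. (count(1) or count(*) would count that NULL-extended row and return 1 instead.)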
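With a LEFT JOIN it matters what exactly is counted. A small sketch with Python's sqlite3 module makes the difference visible; the two customers and two orders are made-up sample rows.

```python
import sqlite3

# Hypothetical sample: customer 1 has two orders, customer 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerNumber INTEGER PRIMARY KEY, Address TEXT);
    CREATE TABLE Orders (CustomerNumber INTEGER, Fruit TEXT);
    INSERT INTO Customers VALUES (1, 'First St'), (2, 'Second St');
    INSERT INTO Orders VALUES (1, 'apple'), (1, 'pear');
""")

query = """
    SELECT c.CustomerNumber, {count_expr}
    FROM Customers AS c
    LEFT JOIN Orders AS o ON c.CustomerNumber = o.CustomerNumber
    GROUP BY c.CustomerNumber
    ORDER BY c.CustomerNumber
"""

# COUNT(1) counts joined rows, including the NULL-extended row of customer 2.
count_rows = conn.execute(query.format(count_expr="COUNT(1)")).fetchall()

# COUNT(o.CustomerNumber) counts only non-NULL values, i.e. real orders.
count_column = conn.execute(query.format(count_expr="COUNT(o.CustomerNumber)")).fetchall()

print(count_rows)    # [(1, 2), (2, 1)]
print(count_column)  # [(1, 2), (2, 0)]
```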

Give all rows appear in another table at least specific number of times

I am using the sample database and I want to write a query on the tables Customers and Orders that returns all the customers who have made more than 2 orders. I can achieve that with this query:
Select Customers.*
From Customers
Where Customers.CustomerID IN(
Select Orders.CustomerID
From Orders
Group by Orders.CustomerID
Having count(*)>2
);
I cannot understand why the query:
SELECT Customers.*
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID
GROUP BY Customers.CustomerID
HAVING COUNT(*)>2;
cannot give the same results. The message from the database is:
"Cannot group on fields selected with '*' (Customers)."
I was under the impression that it should work, since Customers.CustomerID is among the columns requested in the SELECT statement. What is the problem, and how could I modify the second query to make it work, even if it probably executes superfluous statements?
From SQL GROUP BY Statement
The GROUP BY statement is used in conjunction with the aggregate
functions to group the result-set by one or more columns.
SQL GROUP BY Syntax
SELECT column_name, aggregate_function(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name;
So to aggregate, you need to specify which columns you are grouping by, and you cannot use * for that.
You have to list the columns explicitly in both the SELECT and GROUP BY clauses.
Specify the columns you need in SELECT statement:
SELECT Customers.CustomerID, Customers.CustomerName
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID=Customers.CustomerID
GROUP BY Customers.CustomerID, Customers.CustomerName
HAVING COUNT(*)>2;
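To see that the IN-subquery form and the JOIN with explicit GROUP BY columns select the same customers, here is a minimal sketch using Python's sqlite3 module. The three customers and their order counts (3, 2 and 0) are invented for the demo.

```python
import sqlite3

# Hypothetical sample: customer 1 has 3 orders, customer 2 has 2, customer 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO Orders (CustomerID) VALUES (1), (1), (1), (2), (2);
""")

# Form 1: pick the qualifying IDs in a subquery, then fetch the customer rows.
in_rows = conn.execute("""
    SELECT CustomerID, CustomerName
    FROM Customers
    WHERE CustomerID IN (SELECT CustomerID
                         FROM Orders
                         GROUP BY CustomerID
                         HAVING COUNT(*) > 2)
    ORDER BY CustomerID
""").fetchall()

# Form 2: join, group by every selected column, filter groups with HAVING.
join_rows = conn.execute("""
    SELECT c.CustomerID, c.CustomerName
    FROM Orders o
    INNER JOIN Customers c ON o.CustomerID = c.CustomerID
    GROUP BY c.CustomerID, c.CustomerName
    HAVING COUNT(*) > 2
    ORDER BY c.CustomerID
""").fetchall()

print(in_rows)    # [(1, 'Alpha')]
print(join_rows)  # [(1, 'Alpha')]
```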
Your first solution actually consists of two queries.
The second part determines the returning customers by listing their CustomerID values:
Select o.CustomerID
From Sales.SalesOrderHeader o
Group by o.CustomerID
Having count(*)>2
And the first part displays the customer details for the list of returning customers produced by the second query:
Select c.*
From Sales.Customer c
Where
c.CustomerID IN(
...
);
It is not possible to return every customer column while using GROUP BY on the orders table to find the customer IDs that occur more than once.
But instead of GROUP BY, an SQL aggregate function (COUNT below) with a PARTITION BY clause can be used here.
Please check the tutorial http://www.kodyaz.com/t-sql/sql-count-function-with-partition-by-clause.aspx and have a look at the following query:
select * from (
SELECT distinct
c.*,
COUNT(o.SalesOrderID) over (partition by c.CustomerId) cnt
FROM Sales.SalesOrderHeader o
INNER JOIN Sales.Customer c ON o.CustomerID = c.CustomerID
) t where cnt > 2