SQL count - first time


I am learning SQL (bit by bit!) and trying to run a query on our database that uses a count function to show the total number of orders against each customer's ID, counting inside an inner join query.
Somehow the count function is pooling all the data together onto one customer, though.
Can someone please suggest where I am going wrong?
SELECT tbl_customers.*, tbl_stateprov.stprv_Name, tbl_custstate.CustSt_Destination, COUNT(order_id) as total
FROM tbl_stateprov
INNER JOIN (tbl_customers
INNER JOIN (tbl_custstate
INNER JOIN tbl_orders ON tbl_orders.order_CustomerID = tbl_custstate.CustSt_Cust_ID)
ON tbl_customers.cst_ID = tbl_custstate.CustSt_Cust_ID)
ON tbl_stateprov.stprv_ID = tbl_custstate.CustSt_StPrv_ID
WHERE tbl_custstate.CustSt_Destination='BillTo'
AND cst_LastName LIKE '#URL.Alpha#%'

You need a GROUP BY clause in this statement in order to get what you want. You need to figure out what level you want to group it by in order to select which fields to add to the group by clause. If you just wanted to see it on a per customer basis, and the customers table had an id field, it would look like this (at the very end of your sql):
GROUP BY tbl_customers.id
Now you can certainly group by more fields, it just depends how you want to slice the results.
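To see the difference a GROUP BY makes, here is a minimal sketch using Python's built-in sqlite3 module. The two-table schema and the sample rows are made up for illustration (only the column names cst_ID, cst_LastName, order_id and order_CustomerID are taken from the question), so treat it as a demonstration of the idea rather than your actual database:

```python
import sqlite3

# Hypothetical miniature schema: only the columns needed for the count.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_customers (cst_ID INTEGER PRIMARY KEY, cst_LastName TEXT);
    CREATE TABLE tbl_orders (order_id INTEGER PRIMARY KEY, order_CustomerID INTEGER);
    INSERT INTO tbl_customers VALUES (1, 'Adams'), (2, 'Allen');
    INSERT INTO tbl_orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Without GROUP BY the aggregate collapses every joined row into a single row.
no_group = conn.execute("""
    SELECT COUNT(o.order_id)
    FROM tbl_customers c
    INNER JOIN tbl_orders o ON o.order_CustomerID = c.cst_ID
""").fetchall()

# With GROUP BY there is one count per customer.
per_customer = conn.execute("""
    SELECT c.cst_ID, c.cst_LastName, COUNT(o.order_id) AS total
    FROM tbl_customers c
    INNER JOIN tbl_orders o ON o.order_CustomerID = c.cst_ID
    GROUP BY c.cst_ID, c.cst_LastName
    ORDER BY c.cst_ID
""").fetchall()

print(no_group)      # [(3,)]
print(per_customer)  # [(1, 'Adams', 2), (2, 'Allen', 1)]
```

Every non-aggregated column in the SELECT list is also listed in the GROUP BY, which is what most database engines require.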

In your select statement you are using format like tableName.ColumnName but not for COUNT(order_id)
It should be COUNT(tableOrAlias.order_id)
Hope that helps.

As you are new to SQL it might also be worth considering the readability of your joins - the nested / bracketed joins you mentioned above are quite hard to read, and I would also personally alias your tables to make the query that bit more accessible:
SELECT
customer.cst_ID
,statep.stprv_Name
,state.CustSt_Destination
,COUNT(orders.order_id) as total
FROM tbl_stateprov statep
INNER JOIN tbl_custstate state ON statep.stprv_ID = state.CustSt_StPrv_ID
INNER JOIN tbl_customers customer ON customer.cst_ID = state.CustSt_Cust_ID
INNER JOIN tbl_orders orders ON orders.order_CustomerID = state.CustSt_Cust_ID
WHERE state.CustSt_Destination='BillTo'
AND customer.cst_LastName LIKE '#URL.Alpha#%'
GROUP BY
customer.cst_ID
,statep.stprv_Name
,state.CustSt_Destination
--And any other columns you want to include the count for

Related

Using COUNT (DISTINCT..) when also using INNER JOIN to join 3 tables but Postgres keeps erroring

I need to use INNER JOINs to get a series of information and then I need to COUNT this info. I need to be able to "View all courses and the instructor taking them, the capacity of the course, and the number of members currently booked on the course."
To get all the info I have done the following query:
SELECT
C.coursename, Instructors.fname, Instructors.lname,C.maxNo, membercourse.memno
FROM Courses AS C
INNER JOIN Instructors ON C.instructorNo = Instructors.instructorNo
INNER JOIN Membercourse ON C.courseID = Membercourse.courseID;
but no matter where I put the COUNT it always tells me that whatever is outside the COUNT should be in the GROUP BY
I have worked out how to COUNT/GROUP BY the necessary info e.g.:
SELECT courseID, COUNT (DISTINCT MC.memno)
FROM Membercourse AS MC
GROUP BY MC.courseID;
but I don't know how to combine the two!

I think what you're looking for is a subquery. I'm a SQL-Server guy (not postgresql) but the concept looks to be almost identical after some crash-course postgresql googling.
Anyway, basically, when you write a SELECT statement, you can use a subquery instead of an actual table. So your SQL would look something like:
select count(*)
from
(
select stuff from table
inner join someOtherTable
)
... hopefully that makes sense. Instead of trying to write one big query where you're doing both the inner join and count, you're writing two: an inner one that gets your inner-join'ed data, and then an outer one to actually count the rows.
EDIT: To help explain a bit more on the thought process behind subqueries.
Subqueries are a way of logically breaking down the steps/processes on the data. Instead of trying to do everything in one big step, you do it in steps.
In this case, what's step one? It's to get a combined data source for your combined, inner-join'ed data.
Step 1: Write the Inner Join query
SELECT
C.coursename, Instructors.fname, Instructors.lname,C.maxNo,
membercourse.memno
FROM Courses AS C
INNER JOIN Instructors ON C.instructorNo = Instructors.instructorNo
INNER JOIN Membercourse ON C.courseID = Membercourse.courseID;
Okay, now, what next?
Well, let's say we want to get a count of how many entries there are for each 'memno' in that result above.
Instead of trying to figure out how to modify that query above, we instead use it as a data source, like it was a table itself.
Step 2 - Make it A Subquery
select * from
(
SELECT
C.coursename, Instructors.fname, Instructors.lname,C.maxNo,
membercourse.memno
FROM Courses AS C
INNER JOIN Instructors ON C.instructorNo = Instructors.instructorNo
INNER JOIN Membercourse ON C.courseID = Membercourse.courseID
) mySubQuery
Step 3 - Modify your outer query to get the data you want.
Well, we wanted to group by 'memno', and get the count, right? So...
select memno, count(*)
from
(
-- all that same subquery stuff
) mySubQuery
group by memno
... make sense? Once you've got your subquery written out, you don't need to worry about it any more - you just treat it like a table you're working with.
This is actually incredibly important, and makes it much easier to read more intricate queries - especially since you can name your subqueries in a way that explains what the subquery represents data-wise.
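As a rough illustration of combining the two queries from the question, the grouped count can itself be the subquery and then be joined back to the course data like any other table. This is only a sketch run against SQLite through Python's sqlite3 module, with invented sample rows; the alias names booked and members are my own, and the column types are guesses:

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Instructors (instructorNo INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE Courses (courseID INTEGER PRIMARY KEY, coursename TEXT,
                          maxNo INTEGER, instructorNo INTEGER);
    CREATE TABLE Membercourse (memno INTEGER, courseID INTEGER);
    INSERT INTO Instructors VALUES (1, 'Ann', 'Lee');
    INSERT INTO Courses VALUES (100, 'Yoga', 20, 1), (101, 'Spin', 15, 1);
    INSERT INTO Membercourse VALUES (7, 100), (8, 100), (7, 101);
""")

# The grouped count is written once as a named subquery ("booked"),
# then joined to the course and instructor rows.
rows = conn.execute("""
    SELECT C.coursename, I.fname, I.lname, C.maxNo, booked.members
    FROM Courses AS C
    INNER JOIN Instructors AS I ON C.instructorNo = I.instructorNo
    INNER JOIN (
        SELECT courseID, COUNT(DISTINCT memno) AS members
        FROM Membercourse
        GROUP BY courseID
    ) AS booked ON booked.courseID = C.courseID
    ORDER BY C.courseID
""").fetchall()

for row in rows:
    print(row)
```

Because the subquery is attached with an INNER JOIN, a course with no bookings would not appear at all; a LEFT JOIN with COALESCE(booked.members, 0) would keep it with a count of 0.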

There are many ways to solve this, such as using window functions. But you can also achieve it with a simple correlated subquery:
SELECT
C.coursename,
Instructors.fname,
Instructors.lname,
C.maxNo,
(SELECT
COUNT(*)
FROM
membercourse
WHERE
C.courseID = Membercourse.courseID) AS members
FROM
Courses AS C
INNER JOIN Instructors ON C.instructorNo = Instructors.instructorNo;

Specifying SELECT, then joining with another table

I just hit a wall with my SQL query fetching data from my MS SQL Server.
To simplify, say I have one table for sales and one table for customers. They each have a corresponding userId which I can use to join the tables.
I wish to first SELECT from the sales table where, say, price is equal to 10, and then join on the userId in order to get access to the name, address, etc. from the customer table.
In which order should I structure the query? Do I need some sort of subquery, or what do I do?
I have tried something like this
SELECT *
FROM Sales
WHERE price = 10
INNER JOIN Customers
ON Sales.userId = Customers.userId;
Needless to say this is very simplified and not my database schema, yet it explains my problem simply.
Any suggestions ? I am at a loss here.

A SELECT has a certain order of its components
In the simple form this is:
What do I select: column list
From where: table name and joined tables
Are there filters: WHERE
How to sort: ORDER BY
So: most likely it was enough to change your statement to
SELECT *
FROM Sales
INNER JOIN Customers ON Sales.userId = Customers.userId
WHERE price = 10;

The WHERE clause must follow the joins:
SELECT * FROM Sales
INNER JOIN Customers
ON Sales.userId = Customers.userId
WHERE price = 10
This is simply the way SQL syntax works. You seem to be trying to put the clauses in the order that you think they should be applied, but SQL is a declarative language, not a procedural one: you are defining what you want to occur, not how it will be done.
You could also write the same thing like this:
SELECT * FROM (
SELECT * FROM Sales WHERE price = 10
) AS filteredSales
INNER JOIN Customers
ON filteredSales.userId = Customers.userId
This may seem like it indicates a different order for the operations to occur, but it is logically identical to the first query, and in either case, the database engine may determine to do the join and filtering operations in either order, as long as the result is identical.
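A quick way to convince yourself that the two forms are logically identical is to run both against the same data. The sketch below uses SQLite via Python's sqlite3 module rather than MS SQL Server, and the saleId and name columns plus the sample rows are assumptions added for the demonstration:

```python
import sqlite3

# Hypothetical sample data for the simplified Sales/Customers example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (userId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Sales (saleId INTEGER PRIMARY KEY, userId INTEGER, price INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Sales VALUES (1, 1, 10), (2, 2, 25), (3, 2, 10);
""")

# Form 1: join first in the text of the query, filter in WHERE.
join_then_filter = conn.execute("""
    SELECT Sales.saleId, Customers.name
    FROM Sales
    INNER JOIN Customers ON Sales.userId = Customers.userId
    WHERE Sales.price = 10
    ORDER BY Sales.saleId
""").fetchall()

# Form 2: filter inside a derived table, then join.
filter_then_join = conn.execute("""
    SELECT filteredSales.saleId, Customers.name
    FROM (SELECT * FROM Sales WHERE price = 10) AS filteredSales
    INNER JOIN Customers ON filteredSales.userId = Customers.userId
    ORDER BY filteredSales.saleId
""").fetchall()

print(join_then_filter)   # [(1, 'Ann'), (3, 'Bob')]
print(filter_then_join)   # [(1, 'Ann'), (3, 'Bob')]
```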

Sounds fine to me, did you run the query and check?
SELECT s.*, c.*
FROM Sales s
INNER JOIN Customers c
ON s.userId = c.userId
WHERE s.price = 10;

SQL Lookup in MS Access Query

I have an Access Database with a table [Inventory] with following fields:
[Inventory].[Warehouse]
[Inventory].[PartNumber]
I also have a query [TransactionsQry] with following fields:
[TransactionsQry].[PartNumber]
[TransactionsQry].[SumofTransactions]
Now I would like to make a new query which shows all part numbers per Warehouse (from table [Inventory]) and look up related (number of) transactions in query [TransactionsQry]. Not every part number in [Inventory] has transactions yet and if so I would like to display "0".
At first I successfully tried this with a DLookup, but the result is a very slow query for very little data.
That is why I tried the following (but unsuccessfully displaying only matched part numbers and an additional error message):
SELECT
Inventory.Warehouse AS Warehouse,
TransactionsQry.PartNumber AS PartNumber,
TransactionsQry.SumofTransactions AS SumofTransactions
FROM Inventory
INNER JOIN TransactionsQry ON Inventory.PartNumber = TransactionsQry.PartNumber;
Any help with solving this issue in SQL is highly appreciated. Thanks.

Based on what you describe, you need a LEFT JOIN, along with Nz to treat Nulls as 0. Here is the corrected code:
SELECT
Inventory.Warehouse AS Warehouse,
Inventory.PartNumber AS PartNumber,
Nz(TransactionsQry.SumofTransactions, 0) AS SumofTransactions
FROM
Inventory LEFT JOIN TransactionsQry
ON
Inventory.PartNumber = TransactionsQry.PartNumber;

You want to use left join rather than inner join for this. Also, table aliases make queries easier to read and write:
SELECT i.Warehouse AS Warehouse,
i.PartNumber AS PartNumber,
nz(tq.SumofTransactions, 0) AS SumofTransactions
FROM Inventory as i LEFT JOIN
TransactionsQry as tq
ON i.PartNumber = tq.PartNumber;
However, I'm guessing that you really want a group by:
SELECT i.Warehouse AS Warehouse,
i.PartNumber AS PartNumber,
nz(sum(tq.SumofTransactions), 0) AS SumofTransactions
FROM Inventory as i LEFT JOIN
TransactionsQry as tq
ON i.PartNumber = tq.PartNumber
GROUP BY i.Warehouse, i.PartNumber;
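Here is a small sketch of the INNER JOIN versus LEFT JOIN behaviour. It runs on SQLite through Python's sqlite3 module, not Access, so two things are stand-ins: TransactionsQry is created as a plain table instead of a saved query, and COALESCE replaces Nz (which is specific to Access). The sample rows are invented. PartNumber is read from Inventory so that parts without transactions still show their number:

```python
import sqlite3

# Hypothetical sample data; TransactionsQry is a plain table standing in
# for the saved Access query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (Warehouse TEXT, PartNumber TEXT);
    CREATE TABLE TransactionsQry (PartNumber TEXT, SumofTransactions INTEGER);
    INSERT INTO Inventory VALUES ('W1', 'P-1'), ('W1', 'P-2');
    INSERT INTO TransactionsQry VALUES ('P-1', 5);
""")

# INNER JOIN keeps only parts that have a matching transaction row.
inner_rows = conn.execute("""
    SELECT i.Warehouse, i.PartNumber, tq.SumofTransactions
    FROM Inventory AS i
    INNER JOIN TransactionsQry AS tq ON i.PartNumber = tq.PartNumber
    ORDER BY i.PartNumber
""").fetchall()

# LEFT JOIN keeps every inventory row; COALESCE turns the missing sum into 0.
left_rows = conn.execute("""
    SELECT i.Warehouse, i.PartNumber,
           COALESCE(tq.SumofTransactions, 0) AS SumofTransactions
    FROM Inventory AS i
    LEFT JOIN TransactionsQry AS tq ON i.PartNumber = tq.PartNumber
    ORDER BY i.PartNumber
""").fetchall()

print(inner_rows)  # [('W1', 'P-1', 5)]
print(left_rows)   # [('W1', 'P-1', 5), ('W1', 'P-2', 0)]
```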

SQL - Conditional Left Join Count Performance is Slow

I am using SQL Server 2012.
I have three tables. Builders, Addresses and BuilderAddresses.
I have the following query which is used to give me my total count of records during paging:
SELECT COUNT(*) FROM Builders
LEFT JOIN Addresses ON Addresses.AddressId IN
(SELECT AddressId FROM BuilderAddresses WHERE BuilderId = Builders.BuilderId AND IsPrimary = 1)
WHERE Builders.[Email] LIKE '%TEST%'
ORDER BY Builders.[Name]
This query is particularly slow when records in the table approach 100k+. Does any one have any suggestions on how to get this query to execute faster??
On a table with 120K records, it take 452ms to get the count. When it comes to returning the records used in the paging, say 100 rows, it takes 11ms. I would really like to improve this if I can.
If I need to add greater detail, please let me know and I will edit the question.

The ORDER BY is not necessary for the COUNT, and you can remove that IN validation by joining with BuilderAddresses directly.
Try something like this:
SELECT COUNT(*)
FROM Builders b
LEFT JOIN BuilderAddresses ba ON ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1
LEFT JOIN Addresses a ON a.AddressId = ba.AddressId
WHERE b.[Email] LIKE '%TEST%'

The problem is most likely to be with using IN as part of the join predicate. What it looks like you need to do is first join the junction table BuilderAddresses and then join Addresses, so something like this
SELECT COUNT(*) FROM Builders
JOIN BuilderAddresses ON BuilderAddresses.BuilderId = Builders.BuilderId AND isPrimary = 1
JOIN Addresses ON Addresses.AddressId = BuilderAddresses.AddressId
WHERE Builders.[Email] LIKE '%TEST%'
ORDER BY Builders.[Name]
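The rewrites can be sanity-checked on a tiny data set before timing them on the real tables. This sketch uses SQLite through Python's sqlite3 module, so it says nothing about SQL Server performance; it only compares the counts. The sample rows and the City column are invented, and builder 2 deliberately has no primary address:

```python
import sqlite3

# Hypothetical sample data; builder 2 has no primary address on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Builders (BuilderId INTEGER PRIMARY KEY, Name TEXT, Email TEXT);
    CREATE TABLE Addresses (AddressId INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE BuilderAddresses (BuilderId INTEGER, AddressId INTEGER, IsPrimary INTEGER);
    INSERT INTO Builders VALUES (1, 'Acme', 'a@TEST.com'),
                                (2, 'Bolt', 'b@TEST.com'),
                                (3, 'Core', 'c@other.com');
    INSERT INTO Addresses VALUES (10, 'Leeds'), (11, 'York');
    INSERT INTO BuilderAddresses VALUES (1, 10, 1), (1, 11, 0);
""")

# Original shape: IN subquery inside the join predicate.
in_based = conn.execute("""
    SELECT COUNT(*)
    FROM Builders AS b
    LEFT JOIN Addresses AS a ON a.AddressId IN (
        SELECT ba.AddressId FROM BuilderAddresses AS ba
        WHERE ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1)
    WHERE b.Email LIKE '%TEST%'
""").fetchone()[0]

# Rewrite with LEFT JOINs through the junction table.
left_joined = conn.execute("""
    SELECT COUNT(*)
    FROM Builders AS b
    LEFT JOIN BuilderAddresses AS ba ON ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1
    LEFT JOIN Addresses AS a ON a.AddressId = ba.AddressId
    WHERE b.Email LIKE '%TEST%'
""").fetchone()[0]

# Rewrite with plain (inner) JOINs.
inner_joined = conn.execute("""
    SELECT COUNT(*)
    FROM Builders AS b
    JOIN BuilderAddresses AS ba ON ba.BuilderId = b.BuilderId AND ba.IsPrimary = 1
    JOIN Addresses AS a ON a.AddressId = ba.AddressId
    WHERE b.Email LIKE '%TEST%'
""").fetchone()[0]

print(in_based, left_joined, inner_joined)  # 2 2 1
```

On this data the LEFT JOIN rewrite matches the original count, while the plain JOIN version drops the builder that has no primary address. Whether that matters depends on whether every builder in your data is guaranteed to have one.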

How to reduce scope of subquery?

I've got SQL running on MS SQL Server similar to the following:
SELECT
Cust.CustNum,
Name,
LastOrderDate
FROM
Cust
LEFT JOIN (
SELECT
CustNum, MAX(OrderDate) as LastOrderDate
FROM
Orders
GROUP BY
CustNum) as Orders
ON Orders.CustNum = Cust.CustNum
WHERE
Region = 1
It contains a subquery to find the MAX record from a child table. The concern is that these tables have a very large number of rows. It seems like the subquery would operate on all the rows of the child table, even though only a very few of them are actually needed because of the WHERE clause on the outer query.
Is there a way to reduce the scope of the inner query? Something like adding a WHERE clause to only include the records that are included in the outer query? Something like
WHERE CustomerOrders.CustomerNumber = Customers.CustomerNumber -- Customers from the outer query.
I suspect that this is not necessary, but I am getting some push back from another developer and I wanted to be sure (my SQL is a little rusty).

You are correct about the subquery. It will have to summarize all the data. You could re-write the query like this:
SELECT Cust.CustNum, Name, max(OrderDate) as LastOrderDate
FROM Cust LEFT JOIN
Orders
ON Orders.CustNum = Cust.CustNum
WHERE Region = 1
group by Cust.CustNum, Name
This would let the SQL optimizer choose the optimal path.
If you know that there are very, very few customers matching Region = 1 and you have an index on CustNum, OrderDate in Orders, you could write the query like this:
select CustNum, Name,
(select top 1 OrderDate
from Orders o
where Cust.CustNum = o.CustNum
order by OrderDate desc
) as LastOrderDate
from Cust
Where Region = 1
I think you would get a very similar effect by using cross apply.
By the way, I'm not a fan of re-writing queries for such purposes. But, I haven't found a SQL optimizer that would do anything other than summarize all the orders rows in this case.
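For comparison, here is a sketch that runs the derived-table form and the correlated-subquery form side by side and checks that they agree. It uses SQLite through Python's sqlite3 module, so TOP 1 is replaced by LIMIT 1 and nothing here reflects how SQL Server would plan either query. The OrderId column, the index name and the sample rows are assumptions:

```python
import sqlite3

# Hypothetical sample data; customer 3 is in Region 1 but has no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cust (CustNum INTEGER PRIMARY KEY, Name TEXT, Region INTEGER);
    CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, CustNum INTEGER, OrderDate TEXT);
    CREATE INDEX idx_orders_cust_date ON Orders (CustNum, OrderDate);
    INSERT INTO Cust VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 1);
    INSERT INTO Orders VALUES (1, 1, '2020-01-05'), (2, 1, '2021-03-01'),
                              (3, 2, '2022-07-09');
""")

# Derived table: summarise every customer's orders, then join and filter.
derived = conn.execute("""
    SELECT c.CustNum, c.Name, o.LastOrderDate
    FROM Cust AS c
    LEFT JOIN (
        SELECT CustNum, MAX(OrderDate) AS LastOrderDate
        FROM Orders
        GROUP BY CustNum
    ) AS o ON o.CustNum = c.CustNum
    WHERE c.Region = 1
    ORDER BY c.CustNum
""").fetchall()

# Correlated subquery: look up the latest order only for the filtered customers.
correlated = conn.execute("""
    SELECT c.CustNum, c.Name,
           (SELECT o.OrderDate
            FROM Orders AS o
            WHERE o.CustNum = c.CustNum
            ORDER BY o.OrderDate DESC
            LIMIT 1) AS LastOrderDate
    FROM Cust AS c
    WHERE c.Region = 1
    ORDER BY c.CustNum
""").fetchall()

print(derived)
print(correlated)
```

Both forms return the same rows here, including a NULL date for the customer with no orders. Which one is faster on your data is something only the execution plan on your own server can answer.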

No, it's generally not necessary if your statistics etc. are up to date. That's the job of the optimiser. You can try the CROSS APPLY operator if you think you're missing out on some shortcuts, but generally if you have all constraints and stats it will be fine.
Your proposed additional WHERE might make sense to you, but as it doesn't correlate to anything in the actual query you posted it will change the results (if it works at all). If you want comments on that you need to post tables & relations etc.
Best way is to check the execution plan and see if it's doing anything dumb.
