Specifying SELECT, then joining with another table - sql

I just hit a wall with my SQL query fetching data from my MS SQL Server.
To simplify, say i have one table for sales, and one table for customers. They each have a corresponding userId which i can use to join the tables.
I wish to first SELECT from the sales table where say price is equal to 10, and then join it on the userId, in order to get access to the name and address etc. from the customer table.
In which order should i structure the query? Do i need some sort of subquery or what do i do?
I have tried something like this
SELECT *
FROM Sales
WHERE price = 10
INNER JOIN Customers
ON Sales.userId = Customers.userId;
Needless to say this is very simplified and not my database schema, yet it explains my problem simply.
Any suggestions ? I am at a loss here.

A SELECT has a certain order of its components
In the simple form this is:
What do I select: column list
From where: table name and joined tables
Are there filters: WHERE
How to sort: ORDER BY
So: most likely it was enough to change your statement to
SELECT *
FROM Sales
INNER JOIN Customers ON Sales.userId = Customers.userId
WHERE price = 10;

The WHERE clause must follow the joins:
SELECT * FROM Sales
INNER JOIN Customers
ON Sales.userId = Customers.userId
WHERE price = 10
This is simply the way SQL syntax works. You seem to be trying to put the clauses in the order that you think they should be applied, but SQL is a declarative languages, not a procedural one - you are defining what you want to occur, not how it will be done.
You could also write the same thing like this:
SELECT * FROM (
SELECT * FROM Sales WHERE price = 10
) AS filteredSales
INNER JOIN Customers
ON filteredSales.userId = Customers.userId
This may seem like it indicates a different order for the operations to occur, but it is logically identical to the first query, and in either case, the database engine may determine to do the join and filtering operations in either order, as long as the result is identical.

Sounds fine to me, did you run the query and check?
SELECT s.*, c.*
FROM Sales s
INNER JOIN Customers c
ON s.userId = c.userId;
WHERE s.price = 10

Related

Issue with joins in a SQL query

SELECT
c.ConfigurationID AS RealflowID, c.companyname,
c.companyphone, c.ContactEmail, COUNT(k.caseid)
FROM
dbo.Configuration c
INNER JOIN
dbo.cases k ON k.SiteID = c.ConfigurationId
WHERE
EXISTS (SELECT * FROM dbo.RepairEstimates
WHERE caseid = k.caseid)
AND c.AccountStatus = 'Active'
AND c.domainid = 46
GROUP BY
c.configurationid,c.companyname, c.companyphone, c.ContactEmail
I have this query - I am using the configuration table to get the siteid of the cases in the cases table. And if the case exists in the repair estimates table pull the company details listed and get a count of how many cases are in the repair estimator table for that siteid.
I hope that is clear enough of a description.
But the issue here is the count is not correct with the data that is being pulled. Is there something I could do differently? Different join? Remove the exists add another join? I am not sure I have tried many different things.
Realized I was using the wrong table. The query was correct.

SQL - Conditional Left Join Count Performance is Slow

I am using SQL Server 2012.
I have three tables. Builders, Addresses and BuilderAddresses.
I have the following query which is used to give me my total count of records during paging:
SELECT COUNT(*) FROM Builders
LEFT JOIN Addresses ON Addresses.AddressId IN
(SELECT AddressId FROM BuilderAddresses WHERE BuilderId = Builders.BuilderId AND IsPrimary = 1)
WHERE Builders.[Email] LIKE '%TEST'%
ORDER BY Builders.[Name]
This query is particularly slow when records in the table approach 100k+. Does any one have any suggestions on how to get this query to execute faster??
On a table with 120K records, it take 452ms to get the count. When it comes to returning the records used in the paging, say 100 rows, it takes 11ms. I would really like to improve this if I can.
If I need to add greater detail, please let me know and I will edit the question.
The ORDER BY is not necessary for the COUNT, and you can remove that IN validation by joining with BuilderAdresses directly.
Try something like this:
SELECT COUNT(*)
FROM Builders b
LEFT JOIN BuilderAddresses ba ON ba.BuilderId = b.BuilderId AND isPrimary = 1
LEFT JOIN Addresses a ON a.AddressId = ba.AddressId
WHERE Builders.[Email] LIKE '%TEST' %
The problem is most likely to be with using IN as part of the join predicate. What it looks like you need to do is first join the junction table BuilderAddresses and then join Addresses, so something like this
SELECT COUNT(*) FROM Builders
JOIN BuilderAddresses ON BuilderAddresses.BuilderId = Builders.BuilderId AND isPrimary = 1
JOIN Addresses ON Addresses.AddressId = BuilderAddresses.AddressId
WHERE Builders.[Email] LIKE '%TEST%'
ORDER BY Builders.[Name]

How to reduce scope of subquery?

I've got SQL running on MS SQL Server similar to the following:
SELECT
CustNum,
Name,
FROM
Cust
LEFT JOIN (
SELECT
CustNum, MAX(OrderDate) as LastOrderDate
FROM
Orders
GROUP BY
CustNum) as Orders
ON Orders.CustNum = Cust.CustNum
WHERE
Region = 1
It contains a subquery to find the MAX record from a child table. The concern is that these tables have a very large number of rows. It seems like the subquery would operate on all the rows of the child table, even though only a very few of them are actually needed because of the WHERE clause on the outer query
Is there a way to reduce the scope of the inner query? Something like adding a WHERE clause to only include the records that are included in the outer query? Something like
WHERE CustomerOrders.CustomerNumber = Customers.CustomerNumber -- Customers from the outer query.
I suspect that this is not necessary, but I am getting some push back from another developer and I wanted to be sure (my SQL is a little rusty).
You are correct about the subquery. It will have to summarize all the data. You could re-write the query like this:
SELECT CustNum, Name, max(OrderDate) as LastOrderDate
FROM Cust LEFT JOIN
Orders
ON Orders.CustNum = Cust.CustNum
WHERE Region = 1
group by CustNum, Name
This would let the SQL optimizer choose the optimal path.
If you know that there are very, very few customers matching Region = 1 and you have an index on CustNum, OrderDate in Orders, you could write the query like this:
select CustNum, Name,
(select top 1 OrderDate
from Orders o
where Cust.CustNum = o.CustNum
order by OrderDate desc
) as LastOrderDate
from Cust
Where Region = 1
I think you would get a very similar effect by using cross apply.
By the way, I'm not a fan of re-writing queries for such purposes. But, I haven't found a SQL optimizer that would do anything other than summarize all the orders rows in this case.
No it's generally not necessary if your statistics etc are up to date. That's the job of the optimiser. You can try the CROSS APPLY operator if you think you're missing out on some shortcuts but generally if you have all constraints and stats it will be fine.
Your proposed additional WHERE might make sense to you, but as it doesn't correlate to anything in the actual query you posted it will change the results (if it works at all). If you want comments on that you need to post tables & relations etc.
Best way is to check the execution plan and see if it's doing anything dumb.

SQL count - first time

I am learning SQL (bit by bit!) trying to perform a query on our database and adding in a count function to show the total orders that appear against a customers id by counting in a inner join query.
Somehow it is pooling all the data together onto one customer with the count function though.
Can someone please suggest where I am going wrong?
SELECT tbl_customers.*, tbl_stateprov.stprv_Name, tbl_custstate.CustSt_Destination, COUNT(order_id) as total
FROM tbl_stateprov
INNER JOIN (tbl_customers
INNER JOIN (tbl_custstate
INNER JOIN tbl_orders ON tbl_orders.order_CustomerID = tbl_custstate.CustSt_Cust_ID)
ON tbl_customers.cst_ID = tbl_custstate.CustSt_Cust_ID)
ON tbl_stateprov.stprv_ID = tbl_custstate.CustSt_StPrv_ID
WHERE tbl_custstate.CustSt_Destination='BillTo'
AND cst_LastName LIKE '#URL.Alpha#%'
You need a GROUP BY clause in this statement in order to get what you want. You need to figure out what level you want to group it by in order to select which fields to add to the group by clause. If you just wanted to see it on a per customer basis, and the customers table had an id field, it would look like this (at the very end of your sql):
GROUP BY tbl_customers.id
Now you can certainly group by more fields, it just depends how you want to slice the results.
In your select statement you are using format like tableName.ColumnName but not for COUNT(order_id)
It should be COUNT(tableOrAlias.order_id)
Hope that helps.
As you are new to SQL it might also be worth considering the readability of your joins - the nested / bracketed joins you mentioned above are quite hard to read, and I would also personally alias your tables to make the query that bit more accessible:
SELECT
tbl_customers.customer_id
,tbl_stateprov.stprv_Name
,tbl_custstate.CustSt_Destination
,COUNT(order_id) as total
FROM tbl_stateprov statep
INNER JOIN tbl_custstate state ON statep.stprv_ID = state.CustSt_StPrv_ID
INNER JOIN tbl_customers customer ON customer.cst_ID = state.CustSt_Cust_ID
INNER JOIN tbl_orders orders ON orders.order_CustomerID = state.CustSt_Cust_ID
WHERE tbl_custstate.CustSt_Destination='BillTo'
AND cst_LastName LIKE '#URL.Alpha#%'
GROUP BY
tbl_customers.customer_id
,tbl_stateprov.stprv_Name
,tbl_custstate.CustSt_Destination
--And any other columns you want to include the count for

How to select all attributes in sql Join query

The following sql query below produces the specified result.
select product.product_no,product_type,salesteam.rep_name,salesteam.SUPERVISOR_NAME
from product
inner join salesteam
on product.product_rep=salesteam.rep_id
ORDER BY product.Product_No;
However my intensions are to further produce a more detailed result which will include all the attributes in the PRODUCT table. my approach is to list all the attributes in the first line of the query.
select product.product_no,product.product_date,product.product_colour,product.product_style,
product.product_age product_type,salesteam.rep_name,salesteam.SUPERVISOR_NAME
from product
inner join salesteam
on product.product_rep=salesteam.rep_id
ORDER BY product.Product_No;
Is there another way it can be done instead of listing all the attributes of PRoduct table one by one?
You can use * to select all columns from all tables, or you can use [table/alias].* to select all columns from the specified table. In your case, you can use product.*:
select product.*,salesteam.rep_name,salesteam.SUPERVISOR_NAME
from product
inner join salesteam
on product.product_rep=salesteam.rep_id
ORDER BY product.Product_No;
It is important to note that you should only do this if you are 100% sure you need every single column, and always will. There are performance implications associated with this; if you're selecting 100 columns from a table when you really only need 4 or 5 of them, you're adding a lot of overhead to the query. The DBMS has to work harder, and you're also sending more data across the wire (if your database is not on the same machine as your executing code).
If any columns are later added to the product table, those columns will also be returned by this query in the future.
select
product.*,
salesteam.rep_name,
salesteam.SUPERVISOR_NAME
from product inner join salesteam on
product.product_rep=salesteam.rep_id
ORDER BY
product.Product_No;
This should do.
You can write like this
select P.* --- all Product columns
,S.* --- all salesteam columns
from product P
inner join salesteam S
on P.product_rep=S.rep_id
ORDER BY P.Product_No;