Joining column preference in SQL Server - sql

select *
from tableA ta
inner join tableB tb on tb.columnbid = ta.columnaid
When joining 2 tables, is the joining column preference important?
Should I use
on ta.columnaid = tb.columnbid
instead?

No, the order of the equation does not play any part in execution or optimisation.

The equation tb.columnbid=ta.columnaid simply means that show only those rows where both of them are equal. So the order doesn't matter. You can write ta.columnaid=tb.columnbid and will find the same result.

Related

SQL subselect statement very slow on certain machines

I've got an sql statement where I get a list of all Ids from a table (Machines).
Then need the latest instance of another row in (Events) where the the id's match so have been doing a subselect.
I need to latest instance of quite a few fields that match the id so have these subselects after one another within this single statement so end up with results similar to this...
This works and the results are spot on, it's just becoming very slow as the Events Table has millions of records. The Machine table would have on average 100 records.
Is there a better solution that subselects? Maybe doing inner joins or a stored procedure?
Help appreciated :)
You can use apply. You don't specify how "latest instance" is defined. Let me assume it is based on the time column:
Select a.id, b.*
from TableA a outer apply
(select top(1) b.Name, b.time, b.weight
from b
where b.id = a.id
order by b.time desc
) b;
Both APPLY and the correlated subquery need an ORDER BY to do what you intend.
APPLY is a lot like a correlated query in the FROM clause -- with two convenient enhances. A lateral join -- technically what APPLY does -- can return multiple rows and multiple columns.

Getting way more results than expected in SQL left join query

My code is such:
SELECT COUNT(*)
FROM earned_dollars a
LEFT JOIN product_reference b ON a.product_code = b.product_code
WHERE a.activity_year = '2015'
I'm trying to match two tables based on their product codes. I would expect the same number of results back from this as total records in table a (with a year of 2015). But for some reason I'm getting close to 3 million.
Table a has about 40,000,000 records and table b has 2000. When I run this statement without the join I get 2,500,000 results, so I would expect this even with the left join, but somehow I'm getting 300,000,000. Any ideas? I even refered to the diagram in this post.
it means either your left join is using only part of foreign key, which causes row multiplication, or there are simply duplicate rows in the joined table.
use COUNT(DISTINCT a.product_code)
What is the question are are trying to answer with the tsql?
instead of select count(*) try select a.product_code, b.product_code. That will show you which records match and which don't.
Should also add a where b.product_code is not null. That should exclude the records that don't match.
b is the parent table and a is the child table? try a right join instead.
Or use the table's unique identifier, i.e.
SELECT COUNT(a.earned_dollars_id)
Not sure what your datamodel looks like and how it is structured, but i'm guessing you only care about earned_dollars?
SELECT COUNT(*)
FROM earned_dollars a
WHERE a.activity_year = '2015'
and exists (select 1 from product_reference b ON a.product_code = b.product_code)

Does order matter when using "ON" in joins sqlserver 2008

when writing a query using any joins does it matter which side the On is based
Example
select * from customer C
join Address ON A.CustomerId=C.CustomerID -- would it make a difference if I did ON C.CustomerId=A.CustomerID
where c.CustomerId=1
What about left or right joins?
thanks a lot
No. You are still joining the two tables. I would consider it best practice to put the joining table on the right handside.
Does it matter? No.
LEFT vs RIGHT - depends on the results you want.
SELECT * FROM A LEFT JOIN B ON A.aid=b.aid
will give you all results from A whether or not they have a matching record in B - so the fields returned from B will be NULL for records where B doesn't match.
SELECT * FROM A RIGHT JOIN B ON ...
will give you all records from B and the possibility of null values in the fields provided by A...
Equality is a commutative operation. A=B is the same as B=A, so the specific ordering in the JOIN clause is irrelevant.
If you were doing inequality tests, < and >, then the ordering would matter. B>A is not the same A>B (and ditto for A<B and B<A)

Order by in Inner Join

I am putting inner join in my query.I have got the result but didn't know that how the data is coming in output.Can anyone tell me that how the Inner join matching the data.Below I am showing a image.There are two table(One or Two Table).
According to me that first row it should be Mohit but output is different. Please tell me.
In SQL, the order of the output is not defined unless you specify it in the ORDER BY clause.
Try this:
SELECT *
FROM one
JOIN two
ON one.one_name = two.one_name
ORDER BY
one.id
You have to sort it if you want the data to come back a certain way. When you say you are expecting "Mohit" to be the first row, I am assuming you say that because "Mohit" is the first row in the [One] table. However, when SQL Server joins tables, it doesn't necessarily join in the order you think.
If you want the first row from [One] to be returned, then try sorting by [One].[ID]. Alternatively, you can order by any other column.
Avoid SELECT * in your main query.
Avoid duplicate columns: the JOIN condition ensures One.One_Name and two.One_Name will be equal therefore you don't need to return both in the SELECT clause.
Avoid duplicate column names: rename One.ID and Two.ID using 'aliases'.
Add an ORDER BY clause using the column names ('alises' where applicable) from the SELECT clause.
Suggested re-write:
SELECT T1.ID AS One_ID, T1.One_Name,
T2.ID AS Two_ID, T2.Two_name
FROM One AS T1
INNER JOIN two AS T2
ON T1.One_Name = T2.One_Name
ORDER
BY One_ID;
Add an ORDER BY ONE.ID ASC at the end of your first query.
By default there is no ordering.
SQL doesn't return any ordering by default because it's faster this way. It doesn't have to go through your data first and then decide what to do.
You need to add an order by clause, and probably order by which ever ID you expect. (There's a duplicate of names, thus I'd assume you want One.ID)
select * From one
inner join two
ON one.one_name = two.one_name
ORDER BY one.ID
I found this to be an issue when joining but you might find this blog useful in understanding how Joins work in the back.
How Joins Work..
[Edited]
#Shree Thank you for pointing that out.
On the paragraph of Merge Join. It mentions on how joins work...
Like
hash join, merge join consists of two steps. First, both tables of the
join are sorted on the join attribute. This can be done with just two
passes through each table via an external merge sort. Finally, the
result tuples are generated as the next ordered element is pulled from
each table and the join attributes are compared
.

problem with IN clause in SQL

We have observed that there seems to be a maximum number of ids/variables which one can pass in the IN clause of SQL as comma separated values. To avoid this we are storing all the ids in a table and doing a SELECT within the IN clause. This however means extra database operations to store and retrieve ids. Is there any other way to use IN without SELECT?
Regards,
Sameer
In SQL Server 2008 you can declare a table variable and pass it to your query from the client or between the procedures.
For a modest number of values I would not have thought an IN (SELECT ..) would be that expensive on any rdbms.
You could INNER JOIN you IDs table or just break down the INs:
WHERE X IN (123,456 ...)
OR X IN (789,987 ...)
...
Agree with Alex
Instead of
Select * From TableA Where ID IN (Select ID from IDTable)
Use
Select * From TableA
INNER JOIN IDTable ON TableA.ID = IDTable.ID
The join will automatically filter the IDs for you.
If you put it in atable why are you still using the in clause, why not just join to the table?