What is the difference in performance of the following 2 queries? - sql

What is the difference in performance of the following 2 queries in SQL Server 2008?
Query 1:
SELECT A.Id,A.Name,B.Class,B.Std,C.Result,D.Grade
FROM Student A
INNER JOIN Classes B ON B.ID = A.ID
INNER JOIN Results C ON C.ID = A.ID
INNER JOIN Grades D ON D.Name = A.Name
WHERE A.Name='Test' AND A.ID=3
Query 2:
SELECT A.Id,A.Name,B.Class,B.Std,C.Result,D.Grade
FROM Student A
INNER JOIN Classes B ON B.ID = A.ID AND A.Name='Test' AND A.ID=3
INNER JOIN Results C ON C.ID = A.ID
INNER JOIN Grades D ON D.Name = A.Name
Is there any way to achieve the best performance in the above 2 queries?

You can have a 100% guarantee that they execute the same, with the same plan.
The only time splitting conditions between the ON clause of an INNER JOIN and the WHERE clause matters is when options like FORCE ORDER are used.
For great performance at the expense of writes, create these indexes:
A (id, name)
B (id) includes (class, std)
C (id) includes (result)
D (name) includes (grade)
However, it still depends on your distribution of data and the selectivity of the indexes as to whether they will actually be used. E.g. if your Grades table contains only 5 entries (A, B, C, D, E), then no index will be used; it will simply scan and buffer the table in memory.
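Assuming the table and column names from the question, those indexes could be created like this (the index names are illustrative; the INCLUDE columns make them covering, so key lookups are avoided):

```sql
CREATE INDEX IX_Student_Id_Name ON Student (Id, Name);
CREATE INDEX IX_Classes_Id ON Classes (Id) INCLUDE (Class, Std);
CREATE INDEX IX_Results_Id ON Results (Id) INCLUDE (Result);
CREATE INDEX IX_Grades_Name ON Grades (Name) INCLUDE (Grade);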

If you look at the execution plans of each, you'll find they're in all likelihood going to have identical execution plans.
The best way to improve performance for these kinds of queries is to make sure that you've got the proper indexes set up. All of your ID columns should have indexes on them, and if Grades is a largish table, consider putting an index on Name for both it and Student.
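One way to check this yourself is to run both queries with I/O and timing statistics enabled and compare the output (a quick sketch; in SSMS you can also compare the graphical execution plans side by side):

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- run query 1 here, then query 2, and compare the
-- logical reads and elapsed times in the Messages tab
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```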

Related

Self join to limit the resultset before the rest of the joins are executed

I have a table A with millions of records. This table must be joined to N more tables; I will only have one row returned, as I will query by A's key.
So in a traditional way I do something like this:
Select A.field1, ... , N.fieldN
From A
Inner Join B on A.fieldFkB = B.fieldKeyB
...
Inner Join N on A.fieldFkN = N.fieldKeyN
Where A.fieldKeyA = 'keyToFind'
But it takes a long time (as I have millions of records in table A).
A partner told me to rewrite the SQL by removing the WHERE clause and doing a self join on table A like this:
Select MAIN.field1, ... , N.fieldN
From A as MAIN
Inner Join A as AUX on MAIN.fieldKeyA = 'keyToFind' AND MAIN.fieldKeyA = AUX.fieldKeyA
Inner Join B on MAIN.fieldFkB = B.fieldKeyB
...
Inner Join N on MAIN.fieldFkN = N.fieldKeyN
And it works (it is faster), but I wonder...
Are there any caveats that I am not aware of?
Is it good practice to do this in this concrete use case?
Are there other alternatives?
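On the "other alternatives" point, one option sometimes suggested is pre-filtering A in a derived table instead of a self join (a sketch using the placeholder names from the question; whether it actually beats the plain WHERE form depends on the plan and on an index on fieldKeyA):

```sql
Select MAIN.field1, ... , N.fieldN
From (Select * From A Where fieldKeyA = 'keyToFind') as MAIN
Inner Join B on MAIN.fieldFkB = B.fieldKeyB
...
Inner Join N on MAIN.fieldFkN = N.fieldKeyN
```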

What is the best approach for performance when querying across multiple DB's

We have a setup where our customers each have their own databases. We also have some shared databases that are used to hold things like module access, reports, customer server locations, etc.
We have a few queries that look like this
USE CustomerDB
SELECT
fields
FROM
CustomerTable C
INNER JOIN SharedDb.dbo.SharedtableA A ON A.Id = C.SharedAId
INNER JOIN SharedDb.dbo.SharedtableB B ON B.Id = A.SharedBId
Does it make a difference to query plans etc if we were to change the query so that it executes in separate spaces?
E.g
USE CustomerDb
DECLARE @SharedTemp TABLE (
Id int NOT NULL
)
INSERT INTO @SharedTemp
SELECT
Id
FROM
SharedDb.dbo.SharedtableA A
INNER JOIN SharedDb.dbo.SharedtableB B ON B.Id = A.SharedBId
SELECT
fields
FROM
CustomerTable C
INNER JOIN @SharedTemp A ON A.Id = C.SharedAId
Thank you in advance for your insights
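As an aside: a table variable is declared with @ rather than #, and a real #temp table (which, unlike a table variable, gets column statistics that feed the join cardinality estimates) is a third variant worth comparing. A sketch, assuming the names above:

```sql
CREATE TABLE #SharedTemp (Id int NOT NULL PRIMARY KEY);

INSERT INTO #SharedTemp (Id)
SELECT A.Id
FROM SharedDb.dbo.SharedtableA A
INNER JOIN SharedDb.dbo.SharedtableB B ON B.Id = A.SharedBId;

SELECT fields
FROM CustomerTable C
INNER JOIN #SharedTemp A ON A.Id = C.SharedAId;

DROP TABLE #SharedTemp;
```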

Simple SQL query on small tables takes too long to execute

I have a query that takes much too long to execute. It is simple and the tables are small. The simplified (but still slow) query is:
SELECT D.ID, C.Name, T.Name AS TownName
FROM Documents D, Companies C, Towns T
WHERE C.ID = D.Company AND T.ID = C.Town
ORDER BY C.Name
Primary keys and foreign keys between tables are properly set. Also, column Companies.Name is indexed.
I tried using JOINs, restarting SQL Server, rebuilding indexes, etc., but it still needs about 40 seconds to execute on my computer with an SSD. The number of records in tables Documents and Companies is only 18K (currently, they are 1:1), and there are only about 20 records in table Towns.
On the other hand, the following query returns exactly the same records, but takes practically no time to execute:
SELECT D.ID, C.Name, (SELECT Name FROM Towns WHERE ID = C.Town) AS TownName
FROM Documents D, Companies C
WHERE C.ID = D.Company
ORDER BY C.Name
In my opinion, the first query should be even faster, but I am obviously wrong. Does anybody have a clue what's happening here? It seems that indices are ignored when sorting by column in a table which is a master of one and detail of another one.
I can't explain why your subquery version is running faster, but I would try something else to see if I could eliminate the subquery.
I usually join from least to greatest when I'm not using WHERE conditions, so my query would look like:
Select t.Name TownName,
c.Name,
d.Id
From Towns t
Join Companies c ON t.Id = c.Town
Join Documents d ON c.Id = d.Company
Order By c.Name
Then I'd make sure that Companies has an index on Town, and that Documents has an index on Company. 18K records might take a little while to display in the output window, but the query should be pretty quick.
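Those indexes would look something like this (the index names are illustrative):

```sql
CREATE INDEX IX_Companies_Town ON Companies (Town);
CREATE INDEX IX_Documents_Company ON Documents (Company);
```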
What happens when you use JOIN statements?
SELECT D.ID, C.Name, T.Name AS TownName
FROM Documents D
inner join Companies C on C.ID = D.Company
inner join Towns T on T.ID = C.Town
ORDER BY C.Name
Also, try it with and without the ORDER BY.

SQL Server query performance tuning with GROUP BY and JOIN clause

We have been experiencing performance problems with a job, and I was fortunately able to find the query causing the slowness:
select name from Student a, Student_Temp b
where a.id = b.id and
a.name in (select name from Student
group by name having count(*) = #sno)
group by a.name having count(*) = #sno
OPTION (MERGE JOIN, LOOP JOIN)
This particular query is called iteratively many times, slowing down performance.
The Student table has 8 million records, and Student_Temp receives 5-20 records in each iteration.
The Student table has a composite primary key on (id, name),
and #sno = the number of records in Student_Temp.
My questions are below,
1) Why does this query show performance issues?
2) Could you give a more efficient way of writing this piece?
Thanks in advance!
It's repeating the same logic unnecessarily. You really just want:
Of the Student(s) who also exist in Student_temp
what names exist #sno times?
Try this:
SELECT
name
FROM
Student a JOIN
Student_Temp b ON a.id = b.id
GROUP BY
name
HAVING
count(*) = #sno
Your query returns the following result: give me all names that appear #sno times in the table Student and exactly once in Student_temp.
You can rewrite the query like this:
SELECT a.name
FROM Student a
INNER JOIN Student_temp b
ON a.id = b.id
GROUP BY a.name
HAVING COUNT(*) = #sno
You should omit the query hint unless you are absolutely sure that the query optimizer screws up.
EDIT: There is of course a difference between the queries: if, for instance, #sno = 2, then a name that shows up once in Student but twice in Student_temp would be included in my query but not in the original. It depends on what you really want to achieve whether that needs to be addressed or not.
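That edge case can be reproduced with a few rows (a self-contained sketch using table variables, with @sno standing in for #sno):

```sql
DECLARE @sno int = 2;

DECLARE @Student TABLE (id int, name varchar(10));
DECLARE @Student_temp TABLE (id int);

INSERT INTO @Student VALUES (1, 'Ann');      -- 'Ann' appears once
INSERT INTO @Student_temp VALUES (1), (1);   -- id 1 appears twice

-- The join produces two 'Ann' rows, so COUNT(*) = 2 = @sno and
-- 'Ann' is returned here, while the original query (which also
-- requires 'Ann' to appear @sno times in Student) would not return it.
SELECT a.name
FROM @Student a
INNER JOIN @Student_temp b ON a.id = b.id
GROUP BY a.name
HAVING COUNT(*) = @sno;
```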
Here you go
select name
from Student a
inner join Student_Temp b
on a.id = b.id
group by a.name
HAVING COUNT(*) = #sno

How to optimize this query? (T-SQL)

This query runs for about 3 minutes and returns 7279 rows:
SELECT identity(int,1,1) AS id, c.client_code, a.account_num,
c.client_short_name, u.uso, us.fio, null AS new, null AS txt
INTO #ttable
FROM accounts a
INNER JOIN Clients c ON c.id = a.client_id
INNER JOIN Uso u ON c.uso_id = u.uso_id
INNER JOIN Magazin m ON a.account_id = m.account_id
LEFT JOIN Users us ON m.user_id = us.user_id
WHERE m.status_id IN ('1','5','9') AND m.account_new_num IS NULL
AND u.branch_id = #branch_id
ORDER BY c.client_code;
The type of 'client_code' field is VARCHAR(6).
Is it possible to somehow optimize this query?
Insert the records into the temporary table without using the ORDER BY clause, and then sort them by c.client_code. Hope it helps.
Create table #temp
(
your columns...
)
and insert the records into this table without the ORDER BY clause. Then run the SELECT with the ORDER BY clause.
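Concretely, with the query from the question, that suggestion would look something like this:

```sql
-- load the temp table without sorting
SELECT identity(int,1,1) AS id, c.client_code, a.account_num,
       c.client_short_name, u.uso, us.fio, null AS new, null AS txt
INTO #ttable
FROM accounts a
INNER JOIN Clients c ON c.id = a.client_id
INNER JOIN Uso u ON c.uso_id = u.uso_id
INNER JOIN Magazin m ON a.account_id = m.account_id
LEFT JOIN Users us ON m.user_id = us.user_id
WHERE m.status_id IN ('1','5','9')
AND m.account_new_num IS NULL
AND u.branch_id = #branch_id;

-- sort only when reading the result back
SELECT * FROM #ttable ORDER BY client_code;
```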
Do you have indexes set up for your tables? An index on the foreign key columns as well as on Magazin.status_id might help.
Make sure there is an index on every field used in the JOINs and in the WHERE clause
If one or more of the tables you select from are actually views, the problem may be in the performance of these views.
Always try to list tables earlier if they are referenced in the WHERE clause - it cuts off row combinations as early as possible. In this case, the Magazin table has some predicates in the WHERE clause but is listed way down in the table list. This means that all the other joins may have to be made before the Magazin rows can be filtered - possibly millions of extra rows.
Try this (and let us know how it went):
SELECT ...
INTO #ttable
FROM accounts a
INNER JOIN Magazin m ON a.account_id = m.account_id
INNER JOIN Clients c ON c.id = a.client_id
INNER JOIN Uso u ON c.uso_id = u.uso_id
LEFT JOIN Users us ON m.user_id = us.user_id
WHERE m.status_id IN ('1','5','9')
AND m.account_new_num is null
AND u.branch_id = #branch_id
ORDER BY c.client_code;
This kind of optimization can greatly improve query performance.