SQL Server: which query runs faster? - sql

UPDATE User
SET Name = (SELECT NameSpace.NameId
FROM NameSpace
WHERE NameSpace.Name = 'BlaBlaBla')
WHERE UserId = 1453
Is this faster, or this:
int Value = Select NameSpace.NameId from NameSpace
where NameSpace.Name = 'BlaBlaBla';
UPDATE User
SET Name = "+Value +"
WHERE UserId = 1453
And which of these is faster:
Select
UserName,
UserAge,
(Select * from AdressesTable where Adresses.AdresID=User.AdresID)
from
UserTable
where
UserId='123'
or this:
Select *
from AdressesTable, UserTable
where Adresses.AdresID = User.AdresID AND UserID = '123'

There are a variety of assumptions to be made in determining which is faster.
First, if you are concerned about speed, then you want indexes on users(userid) and namespace(name).
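For example, a minimal sketch (the index names are illustrative, and UserId may already be covered by the primary key):
CREATE INDEX IX_User_UserId ON [User] (UserId);
CREATE INDEX IX_NameSpace_Name ON NameSpace (Name);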
Second, the assignment query should look like this in SQL Server:
declare @Value int;
select @Value = NameSpace.NameId
from NameSpace
where NameSpace.Name = 'BlaBlaBla';
Your variable declarations and subqueries are not correct for SQL Server.
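The second statement of the two-step approach would then use the variable, something like this (a sketch; User is bracketed because it is a reserved word):
update [User]
set Name = @Value
where UserId = 1453;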
Finally, even with everything set up correctly, it is not possible to say which is faster. If I assume that there is only one matching record for UserId, then the single update is probably faster -- although perhaps by so little that it is not noticeable. It may not be faster. The update may cause some sort of lock to be taken on NameSpace that would not otherwise be taken. I would actually expect the two to be quite comparable in speed.
However, if many users have the same userid (which is unlikely given the name of the column), then you are doing updates on multiple rows. Storing the calculated result once and using that is probably better than running the subquery multiple times. Even so, with the right indexes, I would expect the difference in performance to be negligible.

Related

Stored procedure using too many selects?

I recently started doing some performance tuning on a client's stored procedures, and I bumped into this chunk of code and couldn't find a way to make it work more efficiently.
declare @StationListCount int;
select @StationListCount = count(*) from #StationList;
declare @FleetsCnt int;
select @FleetsCnt = COUNT(*) from #FleetIds;
declare @StationCnt int;
select @StationCnt = COUNT(*) from #StationIds;
declare @VehiclesCnt int;
select @VehiclesCnt = COUNT(*) from #VehicleIds;
declare @TrIds table(VehicleId bigint, TrId bigint, InRange bit);
insert into @TrIds(VehicleId, TrId, InRange)
select t.VehicleID, t.FuelTransactionId, 1
from dbo.FuelTransaction t
join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
where t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
and (@StationListCount = 0 or exists (select ID from #StationList where t.FuelStationID = ID))
and (@FleetsCnt = 0 or exists (select ID from #FleetIds where ID = t.FleetID))
and (@StationCnt = 0 or exists (select ID from #StationIds where ID = t.FuelStationID))
and (@VehiclesCnt = 0 or exists (select ID from #VehicleIds where ID = t.VehicleID))
and t.VehicleID is not null
The insert command slows down the whole procedure and takes 99% of the resources.
I am not sure, but I think these nested loops refer to the queries inside the WHERE clause.
I would very much appreciate any help I can get on this.
Thank you!
There are a couple of things you should go over to see the performance differences. First of all, as the previous answer suggests, you should avoid the count(*)-like aggregates as much as possible. If the table is big, these counts get expensive. You can even think of storing those counts in a separate table with proper index constraints.
I also suggest you split the select statement into multiple statements, because when you combine so many NULL checks and OR/AND conditions, your indexes may be bypassed and your query cost increases a lot. Sometimes using UNIONs provides far better performance than such conditions; a sketch follows below.
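For example, a sketch of splitting on just one of the optional filters (names are taken from the question; this is illustrative, not a drop-in replacement, since the other optional filters are left out):
insert into @TrIds(VehicleId, TrId, InRange)
select t.VehicleID, t.FuelTransactionId, 1
from dbo.FuelTransaction t
join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
where @StationListCount = 0  -- branch 1: no station filter supplied
and t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
and t.VehicleID is not null
union all
select t.VehicleID, t.FuelTransactionId, 1
from dbo.FuelTransaction t
join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
where @StationListCount > 0  -- branch 2: filter by the supplied stations
and exists (select 1 from #StationList where ID = t.FuelStationID)
and t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
and t.VehicleID is not null;
Each branch has a simpler WHERE clause, so the optimizer can choose a better plan for each case.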
You should try all of these and see what fits your needs.
Hope it helps.
The insert uses only one table for the vehicle ID, so joining the other tables is not required.
I don't see the declarations of the # tables, but (assuming the IDs in them are unique) consider communicating this information to the optimizer; in other words, add primary key constraints to them.
Also, add OPTION (RECOMPILE) to the end of the query.
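A sketch of what that might look like, assuming each list table holds unique bigint IDs in a column named ID (the real declarations are not shown in the question):
create table #StationList (ID bigint not null primary key);
create table #FleetIds (ID bigint not null primary key);
create table #StationIds (ID bigint not null primary key);
create table #VehicleIds (ID bigint not null primary key);
-- ...populate the lists and declare the counters and @TrIds as before, then:
insert into @TrIds(VehicleId, TrId, InRange)
select t.VehicleID, t.FuelTransactionId, 1
from dbo.FuelTransaction t
join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
where t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
and (@StationListCount = 0 or exists (select 1 from #StationList where ID = t.FuelStationID))
and (@FleetsCnt = 0 or exists (select 1 from #FleetIds where ID = t.FleetID))
and (@StationCnt = 0 or exists (select 1 from #StationIds where ID = t.FuelStationID))
and (@VehiclesCnt = 0 or exists (select 1 from #VehicleIds where ID = t.VehicleID))
and t.VehicleID is not null
option (recompile); -- recompile so the optimizer sees the actual row counts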

Poor performance of SQL query with Table Variable or User Defined Type

I have a SELECT query on a view that contains 500,000+ rows. Let's keep it simple:
SELECT * FROM dbo.Document WHERE MemberID = 578310
The query runs fast, ~0s
Let's rewrite it to work with the set of values, which reflects my needs more:
SELECT * FROM dbo.Document WHERE MemberID IN (578310)
This is just as fast, ~0s.
But now the set of IDs needs to be variable; let's define it as:
DECLARE @AuthorizedMembers TABLE
(
MemberID BIGINT NOT NULL PRIMARY KEY, --primary key
UNIQUE NONCLUSTERED (MemberID) -- and index, as if it could help...
);
INSERT INTO @AuthorizedMembers SELECT 578310
The set contains the same single value, but it is a table variable now. The performance of such a query drops to 2s, and in more complicated queries it goes as high as 25s and more, while with a fixed ID it stays around ~0s.
SELECT *
FROM dbo.Document
WHERE MemberID IN (SELECT MemberID FROM @AuthorizedMembers)
is just as bad as:
SELECT *
FROM dbo.Document
WHERE EXISTS (SELECT MemberID
FROM @AuthorizedMembers
WHERE [@AuthorizedMembers].MemberID = Document.MemberID)
or as bad as this:
SELECT *
FROM dbo.Document
INNER JOIN @AuthorizedMembers AS AM ON AM.MemberID = Document.MemberID
The performance is the same for all of the above, and always much worse than with a fixed value.
Dynamic SQL helps easily: creating an nvarchar like (id1,id2,id3) and building a fixed query with it keeps my query times at ~0s. But I would like to avoid dynamic SQL as much as possible, and if I do use it, I would like to keep the string the same regardless of the values (using parameters, which the above method does not allow).
Any ideas how to get the performance of the table variable close to that of a fixed list of values, or how to avoid building different dynamic SQL code for each run?
P.S. I have tried the above with a user-defined table type, with the same results.
Edit:
The results with a temporary table, defined as:
CREATE TABLE #AuthorizedMembers
(
MemberID BIGINT NOT NULL PRIMARY KEY
);
INSERT INTO #AuthorizedMembers SELECT 578310
have improved the execution time up to 3 times (13s -> 4s), which is still significantly slower than dynamic SQL (<1s).
Your options:
Use a temporary table instead of a TABLE variable
If you insist on using a TABLE variable, add OPTION(RECOMPILE) at the end of your query
Explanation:
When the compiler compiles your statement, the TABLE variable has no rows in it and therefore doesn't have the proper cardinalities. This results in an inefficient execution plan. OPTION(RECOMPILE) forces the statement to be recompiled when it is run. At that point the TABLE variable has rows in it and the compiler has better cardinalities to produce an execution plan.
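For example, the query from the question with the hint appended (a sketch):
SELECT *
FROM dbo.Document
WHERE MemberID IN (SELECT MemberID FROM @AuthorizedMembers)
OPTION (RECOMPILE);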
The general rule of thumb is to use temporary tables when operating on large datasets and table variables for small datasets with frequent updates. Personally I only very rarely use TABLE variables because they generally perform poorly.
I can recommend this answer on the question "What's the difference between temporary tables and table variables in SQL Server?" if you want an in-depth analysis on the differences.

Oracle WHERE clause (AND, OR operators)

Friends
While executing a WHERE clause in Oracle SQL, suppose I have:
UPDATE schema1.TBL_SCHEMA1_PROCESS_FEED F
SET F.TBL_SCHEMA1_PROCESS_LINE_ID = V_LINE_ID,
F.TBL_SCHEMA1_PROCESS_LINE_TYPE_ID = V_LINE_TYPE_ID,
F.TBL_SCHEMA1_PROCESS_LINE_SUB_TYPE_ID = V_SUB_TYPE_ID
WHERE F.CURR_DATE = V_CURR_DATE
AND F.NEXT_DATE = V_NEXT_BUSINESS_DATE OR F.NEXT_DATE IS NULL;
How can this code be optimized for the condition
F.NEXT_DATE = V_NEXT_BUSINESS_DATE OR F.NEXT_DATE IS NULL
Is that your actual where clause? Do you mean it to be:
WHERE F.CURR_DATE = V_CURR_DATE
AND ( F.NEXT_DATE = V_NEXT_BUSINESS_DATE
OR F.NEXT_DATE IS NULL )
If so then you need an index, unique if possible, on curr_date.
If you're not satisfied that this provides a large enough improvement in the execution time then think about extending it to curr_date, next_date. Don't create a larger index if you don't need to.
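A minimal sketch of those indexes (the index names are illustrative):
CREATE INDEX TBL_SCHEMA1_PROC_FEED_CD_I ON schema1.TBL_SCHEMA1_PROCESS_FEED (CURR_DATE);
-- only if the single-column index is not enough:
-- CREATE INDEX TBL_SCHEMA1_PROC_FEED_CDND_I ON schema1.TBL_SCHEMA1_PROCESS_FEED (CURR_DATE, NEXT_DATE);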
You might also consider changing your conditions slightly, though I doubt it would make much, if any, difference.
WHERE F.CURR_DATE = V_CURR_DATE
AND NVL(F.NEXT_DATE, V_NEXT_BUSINESS_DATE) = V_NEXT_BUSINESS_DATE
The best possible option is to update using the rowid. Without a lot more information it's impossible to know if you're in a situation where this might be possible, but as the rowid is a unique address in the table it is always quicker than indexes when updating a single row. If you're collecting data from this table and then populating your variables before writing back to the table, then this would be possible.
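A sketch of that pattern, assuming exactly one matching row and that the V_* variables already exist in the surrounding PL/SQL:
DECLARE
  v_rowid ROWID;
BEGIN
  SELECT ROWID
    INTO v_rowid
    FROM schema1.TBL_SCHEMA1_PROCESS_FEED F
   WHERE F.CURR_DATE = V_CURR_DATE
     AND (F.NEXT_DATE = V_NEXT_BUSINESS_DATE OR F.NEXT_DATE IS NULL);

  UPDATE schema1.TBL_SCHEMA1_PROCESS_FEED
     SET TBL_SCHEMA1_PROCESS_LINE_ID = V_LINE_ID,
         TBL_SCHEMA1_PROCESS_LINE_TYPE_ID = V_LINE_TYPE_ID,
         TBL_SCHEMA1_PROCESS_LINE_SUB_TYPE_ID = V_SUB_TYPE_ID
   WHERE ROWID = v_rowid;
END;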
Are those your actual schema and table names? If they are, then why not think about choosing something more descriptive?

T-SQL query massive performance difference between using variables & constants

All,
I am seeing some really weird behavior in query performance between using a variable whose value is set at the beginning and using the value as a constant in the query.
What I am seeing is that
DECLARE @ID BIGINT
SET @ID = 5
SELECT * FROM tblEmployee WHERE ID = @ID
runs much faster than when I run
SELECT * FROM tblEmployee WHERE ID = 5
This is obviously a simpler version of the actual query, but does anyone know of known issues with the way SQL Server 2005 parses queries that would explain this behavior? My original query goes from 13 seconds to 8 minutes between the two approaches.
Thanks,
Ashish
Are you sure it's that way around?
Normally the parameterised query will be slower because SQL Server doesn't know in advance what the parameter will be. A constant can be optimised right away.
One thing to note here about datatypes, though... what does this do:
SELECT * FROM tblEmployee WHERE ID = CAST(5 as bigint)
Also, reverse the execution order. We saw something odd the other day and the plans changed when we changed the order.
Another way: mask ID to remove "parameter sniffing" effects on the first query. Any difference?
DECLARE @ID BIGINT
SET @ID = 5
DECLARE @MaskedID BIGINT
SET @MaskedID = @ID
SELECT * FROM tblEmployee WHERE ID = @MaskedID
Finally, add OPTION (RECOMPILE) to each query. It means the plan is discarded and not re-used so it compiles differently.
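For instance, a sketch of the same queries with the hint added:
SELECT * FROM tblEmployee WHERE ID = @ID OPTION (RECOMPILE)
SELECT * FROM tblEmployee WHERE ID = 5 OPTION (RECOMPILE)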
Have you checked the query plans for each? That's always the first thing I do when I'm trying to analyze a performance issue.
If values get cached, you could be drawing an unwarranted conclusion that one approach is faster than another. Is there always this difference?
From what I understand it's to do with cached query plans.
When you run Select * from A Where B = @C it's one query plan regardless of the value of @C, so if you run it 10x with different values for @C, it's a single query plan.
When you run:
Select * from A Where B = 1 creates a query plan
Select * from A Where B = 2 creates another
Select * from A Where B = 3 creates another
etc.
All this does is eat up memory.
Google "query plan caching and literals" and I'm sure you'll turn up detailed explanations.
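If you want to see this for yourself, a sketch using the standard plan-cache DMVs shows how many cached plans reference the table and how often each one is reused:
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%tblEmployee%';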

T-SQL query performance puzzle: Why does using a variable make a difference?

I'm trying to optimize a complex SQL query and getting wildly different results when I make seemingly inconsequential changes.
For example, this takes 336 ms to run:
Declare @InstanceID int set @InstanceID=1;
With myResults as (
Select
Row = Row_Number() Over (Order by sv.LastFirst),
ContactID
From DirectoryContactsByContact(1) sv
Join ContainsTable(_s_Contacts, SearchText, 'john') fulltext on (fulltext.[Key]=ContactID)
Where IsNull(sv.InstanceID,1) = @InstanceID
and len(sv.LastFirst)>1
) Select * From myResults Where Row between 1 and 20;
If I replace the @InstanceID with a hard-coded number, it takes over 13 seconds (13890 ms) to run:
Declare @InstanceID int set @InstanceID=1;
With myResults as (
Select
Row = Row_Number() Over (Order by sv.LastFirst),
ContactID
From DirectoryContactsByContact(1) sv
Join ContainsTable(_s_Contacts, SearchText, 'john') fulltext on (fulltext.[Key]=ContactID)
Where IsNull(sv.InstanceID,1) = 1
and len(sv.LastFirst)>1
) Select * From myResults Where Row between 1 and 20;
In other cases I get the exact opposite effect: for example, using a variable @s instead of the literal 'john' makes the query run more slowly by an order of magnitude.
Can someone help me tie this together? When does a variable make things faster, and when does it make things slower?
The cause might be that IsNull(sv.InstanceID,1) = @InstanceID is very selective for some values of @InstanceID, but not very selective for others. For example, there could be millions of rows with InstanceID = null, so for @InstanceID = 1 a scan might be quicker.
But if you explicitly provide the value of @InstanceID, SQL Server knows from the table statistics whether it's selective or not.
First, make sure your statistics are up to date:
UPDATE STATISTICS table_or_indexed_view_name
Then, if the problem still occurs, compare the query execution plan for both methods. You can then enforce the fastest method using query hints.
With hard-coded values the optimizer knows what to base its estimates on when building the execution plan.
When you use variables it tries to "guess" the value, and in many cases it does not pick the best one.
You can help it pick a value for optimization in 2 ways:
"I know better", this will force it to use the value you provide.
OPTION (OPTIMIZE FOR (@InstanceID = 1))
"See what I do", this will instruct it to sniff the values you pass and use average (or most popular for some data types) value of those supplied over time.
OPTION (OPTIMIZE FOR UNKNOWN)
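Applied to the query from the question, the hint goes at the very end of the statement (a sketch using the first option):
Declare @InstanceID int set @InstanceID=1;
With myResults as (
Select
Row = Row_Number() Over (Order by sv.LastFirst),
ContactID
From DirectoryContactsByContact(1) sv
Join ContainsTable(_s_Contacts, SearchText, 'john') fulltext on (fulltext.[Key]=ContactID)
Where IsNull(sv.InstanceID,1) = @InstanceID
and len(sv.LastFirst)>1
) Select * From myResults Where Row between 1 and 20
Option (Optimize For (@InstanceID = 1));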