I am working on tuning a stored procedure. It is a huge stored proc that joins tables with about 6-7 million records each.
My question is: how do I determine the time spent in the individual components of the proc? The proc has one big SELECT with many temp tables created on the fly (read: sub-queries).
I tried using SET STATISTICS TIME ON and SET SHOWPLAN_ALL ON.
I am looking to isolate the chunk of code that takes the most time, but I am not sure how to do it.
Please help.
PS: I did try to Google it and searched on Stack Overflow, with no luck. Here is one question that I looked at:
How to improve SQL Server query containing nested sub query
Any help is really appreciated. Thanks in advance.
I would try out SQL Sentry's Plan Explorer. It is a free tool that gives you visual help in finding the problem, highlighting the bits that cost a lot of I/O or CPU rather than just showing a generic percentage.
Here's where you can check it out:
http://www.sqlsentry.net/plan-explorer/sql-server-query-view.asp
Eric
I realize you're asking for "time" (the how long), but maybe you should focus on the "what": tuning based on the results of the execution plan. Ideally, using "Show Execution Plan" is going to give you the biggest bang, and it will tell you, via percentages, where the query costs the most resources.
If you are in SSMS 2008, you can right-click in your query window and click "Include Actual Execution Plan".
In your scenario, the best way to do this is to just run the components individually. Bear in mind that the below is relevant primarily to tuning for execution time in a low-contention/low-concurrency environment. You may have other priorities under a heavy concurrent load.
I have to do a very similar break down on a regular basis for different procedures I have to tune. As a rule the general methodology I follow is:
1 - Do a baseline run
2 - Add PRINT or RAISERROR commands between portions that output the current time, to help identify which steps take the longest (see the sketch after this list).
3 - Break down the queries individually. I normally run portions on their own (omit JOIN conditions) to see what the variance is. If it is a very long-running query you can add a TOP clause to any SELECTs to limit the returns. As long as you are consistent this will still give you a good idea.
4 - Tweak the components from step 3 that take the most time. If you have complicated subqueries, maybe make them indexed #temp tables to see if that helps. CTEs as a rule never help performance, so you may need to materialize those as well.
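As a rough sketch of step 2 (the "portion" comments are placeholders, not your actual proc), note that RAISERROR ... WITH NOWAIT flushes the message to the Messages tab immediately, whereas PRINT output is buffered:

DECLARE @msg VARCHAR(50);

SET @msg = 'Start: ' + CONVERT(VARCHAR(30), SYSDATETIME(), 121);
RAISERROR(@msg, 0, 1) WITH NOWAIT;

-- ...portion 1 of the proc (e.g. build the first temp table)...

SET @msg = 'Portion 1 done: ' + CONVERT(VARCHAR(30), SYSDATETIME(), 121);
RAISERROR(@msg, 0, 1) WITH NOWAIT;

-- ...portion 2 of the proc...

SET @msg = 'Portion 2 done: ' + CONVERT(VARCHAR(30), SYSDATETIME(), 121);
RAISERROR(@msg, 0, 1) WITH NOWAIT;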
Related
I was wondering if there is a way to estimate the total run time for a query without actually processing the query.
I have found that running particular queries might take hours, and I guess it would come in handy to know the approximate completion time, as there have been times where I was stuck at work waiting for a query to finish.
Sorry if this is a silly question; I'm a bit new to SAS.
Thanks guys.
This comes down to how well you know the data you're working with. There is no simple method of estimating this that is guaranteed to work in all situations, as there are so many factors that contribute to query performance. That said, there are a few heuristics you can use:
If you're reading every row from a large table, try reading a small proportion of it first before scaling up to get an idea of how much that read will contribute to the total query execution time.
Try running your query with proc sql inobs = 100 _method; to find out what sorts of joins the query planner is selecting (see the sketch after this list). If there are any Cartesian joins (sqxjsl in the log output), your query is going to take at least O(m*n) to run, where m and n are the numbers of rows in the tables being joined.
Check whether there are any indexes on the tables that could potentially speed up your query.
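A hedged sketch of the inobs/_method suggestion above; the dataset and column names are placeholders:

/* inobs=100 caps each input table at 100 rows so the test run finishes quickly; */
/* _method writes the chosen join methods to the log (sqxjsl = step loop,        */
/* i.e. a Cartesian-style join).                                                  */
proc sql inobs=100 _method;
    create table work.test_result as
    select a.*, b.extra_col
    from work.big_table a
    inner join work.other_table b
        on a.key_col = b.key_col;
quit;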
I have a table-valued function with quite some code inside, doing multiple join selects and calling sub-functions, and it returns a result set. During the development of this function, at some point I faced a performance degradation when executing it. Normally it shouldn't take more than 1 sec, but it started taking about 10 sec. I played a bit with joins and also with indexes, but nothing changed dramatically.
After some time of changes and research, I wanted to see the results another way. I created the same exact code with the same exact parameters as a stored procedure, then executed the sp. Boom! It takes less than 1 sec; the same exact code takes about 10 sec as a function.
I really cannot figure out what this is all about, and I have no time to do more research. I need it as a function for some reasons, but I don't know what to do at this point. I thought I could create it as a proc and then call it from within the function, but then I realized that's not possible in functions.
I wanted to hear some good views and advice here from the experts.
Thanks in advance.
PS: I did not add any code here as the code is not in a good format and is quite dirty. I would share it if anybody is interested. The server is SQL Server 2014 Enterprise 64-bit.
Edit: I saw the possible duplicate question before, but it did not satisfy me, as my question is specifically about the performance hit. The other question has many answers about general differences between procedures and functions; I want to make it more specific about possible performance-related differences.
These are the differences from my experience:
When you first start writing the function, you are likely to run it with the same parameters again and again until it works correctly. This enables page caching, in which SQL Server keeps the relevant data in memory.
Functions do not cache their execution plans. As you add more data, it takes longer to come up with a plan. Use SET STATISTICS TIME ON to see query compilation time vs. execution time (see the sketch after this list).
Functions can only use table variables, and there are no stats on those. That can make for some horrendous JOIN decisions later.
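A minimal sketch of that timing comparison; fcn_myfunc is the table-valued function (as in the query pattern further down) and usp_myfunc is a hypothetical stored-procedure version of the same code:

SET STATISTICS TIME ON;

SELECT * FROM dbo.fcn_myfunc(1);   -- compare "parse and compile time"
EXEC dbo.usp_myfunc @param = 1;    -- and "Execution Times" for each call

SET STATISTICS TIME OFF;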
Some people prefer table-valued functions because they are easier to query:
SELECT * FROM fcn_myfunc(...) WHERE <some_conditions>
Instead, create a temp table, exec the stored procedure into it, and then filter off that temp table. If your code is performance critical, turn it into a stored procedure.
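A hedged sketch of that temp-table pattern; usp_myfunc, the column list, and the filter are placeholders for your converted code:

CREATE TABLE #results (id INT, some_col VARCHAR(50));   -- shape must match the proc's result set

INSERT INTO #results (id, some_col)
EXEC dbo.usp_myfunc @param = 1;                          -- hypothetical proc version of fcn_myfunc

SELECT *
FROM #results
WHERE some_col = 'x';                                    -- the <some_conditions> filter from above

DROP TABLE #results;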
I have encountered a strange situation in which an SQL query takes several seconds to complete when run from Toad and a Jasper Report containing the same query takes over half an hour to produce results (with the same parameters). Here are some details:
I checked, and Oracle (version 11g) uses different execution plans in these two cases.
I considered using stored outlines, but the report slightly modifies the query (bind variables are renamed; in the case of multiple values, i.e. $P!{...}, the report simply inserts values into the query, and there are too many combinations of values to bypass this), so outlines won't work.
I ran the report in iReport 5.1 and via OpenReports and it takes about 35 minutes for both.
The original query is tuned with some hints; without them it takes about as long to complete as the report does.
I would appreciate any advice on how to deal with this.
First of all, don't use TOAD for query tuning. It is in TOAD's interest to present you with the first few rows of the result set as fast as possible, to make the application as responsive as possible. To do so, TOAD injects a FIRST_ROWS hint into your query. A nice feature, but not for tuning queries.
Now, to address your query that's taking too long, I suggest you first investigate where time is being spent. You can do so by tracing a query execution, as explained here. Once you have done that, and you know where time is being spent, but you still don't know how to solve it, then please post the execution plan and statistics.
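If it helps, here is a minimal sketch of tracing the session with DBMS_MONITOR from SQL*Plus or similar (it requires the appropriate privileges, and the trace file name below is a placeholder):

-- In the session that will run the slow statement:
ALTER SESSION SET tracefile_identifier = 'jasper_test';
EXEC DBMS_MONITOR.session_trace_enable(waits => TRUE, binds => TRUE);

-- ...run the report query here...

EXEC DBMS_MONITOR.session_trace_disable;

-- Then format the raw trace file on the database server, for example:
-- tkprof <your_trace_file>.trc jasper_test_report.txt sys=no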
There are probably differences in the optimizer environment. You can check this using
select *
from v$ses_optimizer_env
where sid = sys_context('userenv', 'sid')
Run this in your toad session and in a Jasper report and compare the results.
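To compare the two sessions side by side, something along these lines should work (the two SIDs are placeholders that you fill in after looking the sessions up in V$SESSION):

select t.name,
       t.value as toad_value,
       j.value as jasper_value
from   v$ses_optimizer_env t
join   v$ses_optimizer_env j on j.name = t.name
where  t.sid = :toad_sid        -- SID of the Toad session
  and  j.sid = :jasper_sid      -- SID of the Jasper/OpenReports session
  and  t.value <> j.value;      -- only show the parameters that differ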
I have a lot of records in a table. When I execute the following query, it takes a lot of time. How can I improve the performance?
SET ROWCOUNT 10
SELECT StxnID
,Sprovider.description as SProvider
,txnID
,Request
,Raw
,Status
,txnBal
,Stxn.CreatedBy
,Stxn.CreatedOn
,Stxn.ModifiedBy
,Stxn.ModifiedOn
,Stxn.isDeleted
FROM Stxn,Sprovider
WHERE Stxn.SproviderID = SProvider.Sproviderid
AND Stxn.SProviderid = ISNULL(@pSProviderID,Stxn.SProviderid)
AND Stxn.status = ISNULL(@pStatus,Stxn.status)
AND Stxn.CreatedOn BETWEEN ISNULL(@pStartDate,getdate()-1) and ISNULL(@pEndDate,getdate())
AND Stxn.CreatedBy = ISNULL(@pSellerId,Stxn.CreatedBy)
ORDER BY StxnID DESC
The stxn table has more than 100,000 records.
The query is run from a report viewer in asp.net c#.
This is my go-to article when I'm trying to do a search query that has several search conditions which might be optional.
http://www.sommarskog.se/dyn-search-2008.html
The biggest problem with your query is the column = ISNULL(@column, column) syntax. MSSQL won't use an index for that. Consider changing it to (column = @column OR @column IS NULL), which keeps the same "parameter is optional" behaviour but gives the optimizer a chance to use an index.
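A hedged sketch of the pattern on one of the columns (the parameter data type is an assumption); OPTION (RECOMPILE) is optional, and the dyn-search article above covers the trade-offs:

DECLARE @pStatus INT;                               -- data type assumed for illustration
SET @pStatus = NULL;

SELECT Stxn.StxnID, Stxn.status
FROM   Stxn
WHERE  (Stxn.status = @pStatus OR @pStatus IS NULL) -- an index on status stays usable
OPTION (RECOMPILE);                                 -- plan built for the values actually supplied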
You should consider looking at the execution plan and checking for missing indexes. Also, how long does it take to execute? What is slow for you?
Maybe you could also avoid returning so many rows, but that is just a guess. Really, we need to see your tables and indexes, plus the execution plan.
Check sql-tuning-tutorial
For one, use SELECT TOP (10) instead of SET ROWCOUNT - the optimizer will have a much better chance that way. Another suggestion is to use a proper INNER JOIN instead of the old-style table,table join syntax, which can easily end up producing a Cartesian product (that is not the case here, but it happens much more readily with the old syntax). It should be:
...
FROM Stxn INNER JOIN Sprovider
ON Stxn.SproviderID = SProvider.Sproviderid
...
And if you think 100K rows is a lot, or that this volume is a reason for slowness, you're sorely mistaken. Most likely you have really poor indexing strategies in place, possibly some parameter sniffing, possibly some implicit conversions... hard to tell without understanding the data types, indexes and seeing the plan.
There are a lot of things that could impact the performance of a query, although 100k records really isn't all that many.
Items to consider (in no particular order)
Hardware:
Is SQL Server memory constrained? In other words, does it have enough RAM to do its job? If it is swapping memory to disk, then this is a sure sign that you need an upgrade.
Is the machine disk constrained? In other words, are the drives fast enough to keep up with the queries you need to run? If it's memory constrained, then disk speed becomes a larger factor.
Is the machine processor constrained? For example, when you execute the query does the processor spike for long periods of time? Or, are there already lots of other queries running that are taking resources away from yours...
Database Structure:
Do you have indexes on the columns used in your where clause? If the tables do not have indexes, then SQL Server will have to do a full scan of both tables to determine which records match.
Eliminate the ISNULL function calls. If this is a direct query, have the calling code validate the parameters and set default values before executing. If it is in a stored procedure, do the checks at the top of the proc (see the sketch below). Unless you are executing this WITH RECOMPILE so that parameter sniffing kicks in, those functions will have to be evaluated for each row.
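A hedged sketch of doing the checks at the top of the proc; the procedure name, parameter data types, and the trimmed column list are assumptions:

CREATE PROCEDURE dbo.usp_GetStxns            -- hypothetical wrapper for the query
    @pSProviderID INT      = NULL,
    @pStatus      INT      = NULL,
    @pStartDate   DATETIME = NULL,
    @pEndDate     DATETIME = NULL,
    @pSellerId    INT      = NULL
AS
BEGIN
    -- Resolve defaults once, up front, instead of per-row ISNULL calls in the WHERE clause
    SET @pStartDate = COALESCE(@pStartDate, DATEADD(DAY, -1, GETDATE()));
    SET @pEndDate   = COALESCE(@pEndDate, GETDATE());

    SELECT TOP (10)
           Stxn.StxnID, SProvider.description AS SProvider, Stxn.CreatedOn   -- column list trimmed
    FROM   Stxn
    INNER JOIN SProvider ON SProvider.Sproviderid = Stxn.SproviderID
    WHERE  Stxn.CreatedOn BETWEEN @pStartDate AND @pEndDate
      AND (Stxn.SproviderID = @pSProviderID OR @pSProviderID IS NULL)
      AND (Stxn.status      = @pStatus      OR @pStatus      IS NULL)
      AND (Stxn.CreatedBy   = @pSellerId    OR @pSellerId    IS NULL)
    ORDER BY Stxn.StxnID DESC;
END;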
Network:
Is the network slow between you and the server? Depending on the amount of data pulled, you could be pulling GBs of data across the wire. I'm not sure what is stored in the "Raw" column. The first question you need to ask here is "how much data is going back to the client?" For example, if each record is 1MB+ in size, then you'll probably have disk and network constraints at play.
General:
I'm not sure what "slow" means in your question. Does it mean that the query is taking around 1 second to process or does it mean it's taking 5 minutes? Everything is relative here.
Basically, it is going to be impossible to give a hard answer without a lot of questions asked by you. All of these will bear out if you profile the queries, understand what and how much is going back to the client and watch the interactions amongst the various parts.
Finally, depending on the amount of data going back to the client, there might not be a way to improve performance short of hardware changes.
Make sure Stxn.SproviderID, Stxn.status, Stxn.CreatedOn, Stxn.CreatedBy, Stxn.StxnID and SProvider.Sproviderid all have indexes defined.
(NB -- you might not need all, but it can't hurt.)
I don't see much that can be done on the query itself, but I can see things being done on the schema:
Create an index / PK on Stxn.SproviderID
Create an index / PK on SProvider.Sproviderid
Create indexes on status, CreatedOn, CreatedBy, StxnID
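A hedged sketch of what that DDL could look like; the index names are mine, and you should skip any column that is already covered by a primary key:

CREATE NONCLUSTERED INDEX IX_Stxn_SproviderID ON Stxn (SproviderID);
CREATE NONCLUSTERED INDEX IX_SProvider_Sproviderid ON SProvider (Sproviderid);
-- One composite index for the filter columns; CreatedOn leads because it is always range-filtered
CREATE NONCLUSTERED INDEX IX_Stxn_CreatedOn ON Stxn (CreatedOn, status, CreatedBy, SproviderID);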
Something to consider: When ROWCOUNT or TOP are used with an ORDER BY clause, the entire result set is created and sorted first and then the top 10 results are returned.
How does this run without the Order By clause?
I know my questions will sound silly and probably nobody will have a perfect answer, but since I am at a complete dead end with this situation, it will make me feel better to post it here.
So...
I have a SQL Server Express database that's 500 MB. It contains 5 tables and maybe 30 stored procedures. This database is used to store articles and is used for the Developer It web site. Normally the web pages load quickly, let's say 2 or 3 seconds, BUT the sqlservr process uses 100% of the processor for those 2 or 3 seconds.
I tried to find which stored procedure was the problem and I could not find one. It seems to be every read from the table that contains the articles (there are about 155,000 of them, and 20 or so get added every 15 minutes).
I added a few indexes but without luck...
Is it because the table is full-text indexed?
Should I order by the primary key instead of the date? I never had any problems with ordering by dates before...
Should I use dynamic SQL?
Should I add the primary key to the URL of the articles?
Should I use multiple indexes for separate columns or one big index?
If you want more details or code bits, just ask for them.
Basically, every little hint is much appreciated.
Thanks.
If your index is not being used, then it usually indicates one of two problems:
Non-sargable predicate conditions, such as WHERE DATEPART(YY, Column) = <something>. Wrapping columns in a function will impair or eliminate the optimizer's ability to effectively use an index (see the sketch after this list).
Non-covered columns in the output list, which is very likely if you're in the habit of writing SELECT * instead of SELECT specific_columns. If the index doesn't cover your query, then SQL Server needs to perform a RID/key lookup for every row, one by one, which can slow down the query so much that the optimizer just decides to do a table scan instead.
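A minimal sketch of the sargable rewrite for the DATEPART example above; the table, columns, and year value are hypothetical:

SELECT ArticleID, Title
FROM   dbo.Articles                 -- hypothetical table
WHERE  ArticleDate >= '20130101'    -- open-ended range an index can seek on,
  AND  ArticleDate <  '20140101'    -- instead of DATEPART(YY, ArticleDate) = 2013
ORDER BY ArticleDate DESC;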
See if one of these might apply to your situation; if you're still confused, I'd recommend updating the question with more information about your schema, the data, and the queries that are slow. 500 MB is very small for a SQL database, so this shouldn't be slow. Also post what's in the execution plan.
Use SQL Profiler to capture a lot of typical queries used in your app, then run the profiler trace through the Index Tuning Wizard (the Database Engine Tuning Advisor in newer versions). That will tell you what indexes could be added to optimize things.
Then look at the worst performing queries and analyze their execution plans manually.