Stored proc timeouts - SQL

We are currently having difficulties with a SQL Server stored procedure timing out. Nine times out of ten the query will run within 5 seconds at most; however, on occasion, the proc can run in excess of 2 minutes, causing timeouts on the front end (a .NET MVC application).
They have been investigating this for over a week now, checking jobs and server performance, and all seems to be OK.
The DBAs have narrowed it down to a particular table which is being bombarded with inserts/updates from different applications. This, in combination with the complex SELECT query that joins on that table, is (I'm being told) causing the timeouts.
Are there any suggestions at all for how to get around these timeouts?
e.g.
replicate the table and query the new table?
Any additional debugging that can prove that this is actually the issue?
Perhaps cache the data on the front end and, on a timeout, serve the data from the cache?

A table being bombarded with updates is a table being bombarded with locks. And yes, this can affect performance.
First, copy the table and run the query multiple times. There are other possibilities for the performance issue.
One cause of unstable stored procedure performance in SQL Server is compilation. The code in the stored procedure is compiled the first time it is executed, and the resulting execution plan might work for some inputs and not others (parameter sniffing). This is readily fixed by using the option to recompile the queries each time, although this adds overhead.
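The recompile option can be applied either per statement or on the whole procedure. A minimal sketch (procedure, table, and parameter names are made up for illustration):

```sql
-- Per-statement: recompile just this query on every execution.
SELECT col1, col2
FROM dbo.SomeTable
WHERE SomeColumn = @param
OPTION (RECOMPILE);

-- Or for the whole procedure:
ALTER PROCEDURE dbo.usp_GetOrders
    @CustomerID int
WITH RECOMPILE
AS
BEGIN
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END;
```

OPTION (RECOMPILE) is usually the better-targeted choice, since it only pays the compilation cost on the problem statement.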
Then, think about the query. Does it need the most up-to-date data? If not, perhaps you can just copy the table once per hour or once per day.
If the most recent data is needed, you might need to rethink the architecture. An insert-only table with a clustered identity column always inserts at the end of the table, which is less likely to interfere with queries on the table.
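As a sketch of that pattern (table and column names are hypothetical):

```sql
CREATE TABLE dbo.ActivityLog (
    LogID    int IDENTITY(1,1) NOT NULL,
    LoggedAt datetime NOT NULL DEFAULT (GETDATE()),
    Payload  varchar(400) NULL,
    -- Clustering on the ever-increasing identity means new rows always land
    -- on the last page, instead of causing page splits throughout the table.
    CONSTRAINT PK_ActivityLog PRIMARY KEY CLUSTERED (LogID)
);
```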
Replication may or may not help the problem. After all, full replication will be doing the updates on the replicated copy. You don't solve the "bombardment" problem by bombarding two tables.
If your queries involve a lot of historical data, then partitioning might help. Only the most recent partition would be "bombarded", leaving the others more responsive to queries.
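A minimal partitioning sketch on a date column (names and boundary dates are illustrative, and partitioning requires an edition that supports it):

```sql
-- Monthly partitions; only the newest one receives the heavy insert traffic.
CREATE PARTITION FUNCTION pf_ByMonth (datetime)
AS RANGE RIGHT FOR VALUES ('2019-05-01', '2019-06-01', '2019-07-01');

CREATE PARTITION SCHEME ps_ByMonth
AS PARTITION pf_ByMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrdersPartitioned (
    OrderID   int NOT NULL,
    OrderDate datetime NOT NULL,
    Amount    money NULL
) ON ps_ByMonth (OrderDate);
```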

The DBAs have narrowed it down to a particular table which is being bombarded with inserts/updates from different applications. This, in combination with the complex SELECT query that joins on that table, is (I'm being told) causing the timeouts.
We used to face many timeouts and used to get a lot of escalations. This is the approach we followed to reduce them.
Some of it may be applicable in your case, some may not, but the following will not cause any harm.
Change the SQL Server settings below:
1. Remote login timeout: 60
2. Remote query timeout: 0
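Those two settings can be changed with sp_configure; the option names below are the standard ones, but verify them against your version:

```sql
EXEC sp_configure 'remote login timeout (s)', 60;
EXEC sp_configure 'remote query timeout (s)', 0;  -- 0 = no timeout
RECONFIGURE;
```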
Also, if your Windows server is set to use dynamic RAM, try changing it to static RAM.
You may also have to tune some Windows Server settings; see:
TCP Offloading/Chimney & RSS... What is it and should I disable it?
Following the above steps reduced our timeouts by 99%.
For the remaining 1%, we dealt with each case separately:
1. Update statistics for the tables involved in the query
2. Try fine-tuning the query further
This helped us eliminate timeouts entirely.
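The update-statistics step is a one-liner per table (the table name here is hypothetical):

```sql
-- Refresh statistics for one table with a full scan...
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- ...or refresh sampled statistics for every table in the database.
EXEC sp_updatestats;
```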


Is there a possibility to make my query in SQL run faster?

I am trying to run a query that should produce only 2 million rows and 12 columns. However, my query has been running for 6 hours. I would like to ask if there is anything I can do to speed it up, and if there are general tips.
I am still a beginner in SQL and your help is highly appreciated.
-- (The original post used # where T-SQL requires @ for variables;
-- declarations added so the snippet stands on its own.)
CREATE TABLE #ORSOID (soid int);
INSERT INTO #ORSOID VALUES (321); -- UK
INSERT INTO #ORSOID VALUES (368); -- DE

DECLARE @startorderdate datetime = '2019-04-01'; -- '2017-01-01' -- EDIT THESE
DECLARE @endorderdate   datetime = '2019-06-30'; -- EDIT THESE

-------------------------------------------------------------------------------
-- Step 1: create a list of relevant OpIDs for the selected time period, and
-- check the table to see if more needed columns are present, to include them.
SELECT
    op1.oporid,
    op1.opcurrentponum,
    o.orcompletedate,
    o.orsoid,
    op1.opid,
    op1.opreplacesopid,
    op1.opreplacedbyopid,
    op1.OpSplitFromOpID,
    op1.opsuid,
    op1.opprsku,
    --op1.orosid,
    op1.opdatenew,
    OPCOMPONENTMANUFACTURERPARTID
INTO csn_junk.dbo.SCOpid
FROM csn_order..vworder o WITH (NOLOCK)
INNER JOIN csn_order..vworderproduct op1 WITH (NOLOCK) ON oporid = orid
LEFT JOIN CSN_ORDER..TBLORDERPRODUCT v WITH (NOLOCK) ON op1.opid = v.OpID
WHERE op1.OpPrGiftCertificate = 0
  AND o.orcompletedate BETWEEN @startorderdate AND @endorderdate
  AND o.orsoid IN (SELECT soid FROM #ORSOID);

SELECT * FROM csn_junk.dbo.SCOpid;
First, there is no way to know why a query runs for many hours on a server we don't have access to, and without any metrics (i.e. an execution plan or CPU/memory/IO metrics). Also, without any DDL it's impossible to understand what's going on with your query.
General Guidelines for troubleshooting slow data modification:
Getting the right metrics
The first thing I'd do is run Task Manager on that server and see if you have a server issue or a query issue. Is the CPU pegged at 100%? If so, is sqlservr.exe the cause? How often do you run this query? How fast is it normally?
There are a number of native tools for collecting good metrics: execution plans, DMFs and DMVs, Extended Events, SQL Traces, Query Store. You also have great third-party tools like Brent Ozar's suite of tools and Adam Machanic's sp_whoisactive.
There's a saying in the BI World: If you can't measure it, you can't manage it. If you can't measure what's causing your queries to be slow, you won't know where to start.
Big updates like this can cause locking, blocking, lock-escalation and even deadlocks.
Understand execution plans, specifically actual execution plans.
I write my code in SSMS with "Show execution plan" turned on. I always want to know what my query is doing. You can also view the execution plans after the fact by capturing them using SQL Traces (via the SQL Profiler) or Extended Events.
This is a huge topic so I'll just mention some things off the top of my head that I look for in my plans when troubleshooting slow queries: Sorts, Key Lookups, RID lookups, Scans against large tables (e.g. you scan an entire 10,000,000 row table to retrieve 12,000 rows - for this you want a seek.) Sometimes there will be warnings in the execution plan such as a "tempdb spill" - these are bad. Sometimes the plan will call out "missing indexes" - a topic unto itself. Which brings me to...
INDEXES
This is where execution plans, DMVs and other SQL monitoring tools really come in handy. The rule of thumb is: when you are doing SELECT queries it's nice to have plenty of good indexes available for the optimizer to choose from; in a normalized data mart, for example, more are better. For INSERT/UPDATE/DELETE operations you want as few indexes as possible, because each index touching the modified data must itself be modified. For a big insert like the one you are doing, fewer indexes would be better on csn_junk.dbo.SCOpid and, as mentioned in the comments below your post, you want the correct indexes to support the tables used for the update.
CONSTRAINTS
Constraints slow data modification. Any referential integrity constraints (primary/foreign keys) and UNIQUE constraints present will impact performance. CHECK constraints can as well; CHECK constraints that use a T-SQL scalar function will destroy data modification performance more than almost anything else I can think of. A scalar UDF used as a CHECK constraint that also accesses other tables can turn an insert that should take a minute into one that takes several hours.
MDF & LDF file growth
A 2,000,000+ row, 12-column insert is going to cause the associated MDF and LDF files to grow substantially. If your data files (.MDF or .NDF) or log file (.LDF) fill up, they will auto-grow to create space. This can slow queries that normally run in seconds to minutes, especially when your auto-growth settings are bad. See: SQL Server Database Growth and Autogrowth Settings
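You can inspect the current file sizes and growth settings with a quick catalog-view query, for example:

```sql
-- size is in 8 KB pages; growth is in pages unless is_percent_growth = 1.
SELECT name,
       type_desc,
       size * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + ' %'
            ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
       END AS growth_setting
FROM sys.database_files;
```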
Whenever I have a query that normally runs in 10 seconds and now, out of nowhere, runs for minutes (assuming it's not a deadlock or server issue), I check for MDF or LDF autogrowth, as this is often the culprit. Often you have a log file that needs to be shrunk (via a log backup, or manually, depending on the recovery model). This brings me to batching:
Batching
Huge inserts chew up log space and take forever to roll back if the query fails. Making things worse, cancelling a huge insert (or trying to kill the SPID) will sometimes cause more problems. Doing data modifications in batches can circumvent this problem. See this article for more details.
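A common batching pattern for large modifications (table and column names here are made up) looks like this:

```sql
-- Work in 5,000-row chunks so each transaction commits quickly
-- and log space can be reused between batches.
WHILE 1 = 1
BEGIN
    DELETE TOP (5000)
    FROM dbo.BigTable
    WHERE ProcessedFlag = 1;

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to do
END;
```

The same shape works for UPDATE TOP (n) and for INSERT ... SELECT driven by a key range.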
Hopefully this helps get you started. I've given you plenty to Google. Please forgive any typos - I spun this up fast. Feel free to ask follow-up questions.

MS SQL Server Query caching

One of my projects has a very large database on which I can't edit indexes, etc.; I have to work with it as it is.
What I noticed when testing some queries that I will be running against their database, via a service I am writing in .NET, is that they are quite slow the first time they are run.
What they used to do before: they have two main (large) tables that are used the most. They showed me that they open SQL Server Management Studio and run
SELECT *
FROM table1
JOIN table2
a query that takes around 5 minutes the first time, but about 30 seconds if you run it again without closing SQL Server Management Studio. So they keep SQL Server Management Studio open 24/7, so that when one of their programs executes queries related to these two tables (which seems to be almost all the queries their program runs) they get the 30-second run time instead of the 5 minutes.
I assume this happens because the two tables get cached, and then there are no (or close to no) disk reads.
Is this a good idea to have a service which then runs a query to cache these 2 tables every now and then? Or is there a better solution to this, given the fact that I can't edit indexes or split the tables, etc.?
Edit:
Sorry, I was possibly unclear: the DB hopefully has indexes already; I'm just not allowed to edit them or anything.
Edit 2:
Query plan
This could be a candidate for an indexed view (if you can persuade your DBA to create it!), something like:
CREATE VIEW transhead_transdata
WITH SCHEMABINDING
AS
SELECT
<columns of interest>
FROM
transhead th
JOIN transdata td
ON th.GID = td.HeadGID;
GO
CREATE UNIQUE CLUSTERED INDEX transjoined_uci ON transhead_transdata (<something unique>);
This will "precompute" the JOIN (and keep it in sync as transhead and transdata change).
You can't create indexes? This is your biggest problem regarding performance. A better solution would be to create the proper indexes and address any performance issues by checking wait stats, resource contention, etc. I'd start with Brent Ozar's blog and open-source tools, and move forward from there.
Keeping SSMS open doesn't prevent the plan cache from being cleared. I would start with a few links.
Understanding the query plan cache
Check your current plan cache
Understanding why the cache would clear (memory pressure, too many plans to hold them all, index rebuild operations, etc.) - Brent talks about this in this answer
How to clear it manually
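For checking the current plan cache, a typical DMV query (these DMVs exist in SQL Server 2005 and later) looks like this:

```sql
-- Show cached plans with their use counts and the text they were compiled from.
SELECT TOP (20)
       cp.usecounts,
       cp.cacheobjtype,
       cp.objtype,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;
```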
Aside from that, that query is suspect. I wouldn't expect your application to use those results - that is, I wouldn't expect you to load every row and column from two tables into your application every time it was called. Understand that a different query on those same tables - selecting fewer columns, adding a predicate, etc. - could, and likely would, cause SQL Server to generate a new query plan that was more optimized. The current query, without predicates and selecting every column (and with no indexes, as you stated), would simply do two table scans. Any increase in performance going forward wouldn't be because the plan was cached, but because the data was stored in memory and subsequent reads wouldn't incur physical reads, i.e. reading from memory versus disk.
There's a lot more that could be said, but I'll stop here.
You might also consider putting this query into a stored procedure which can then be scheduled to run at a regular interval through SQL Agent that will keep the required pages cached.
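A sketch of that idea (the procedure name is invented; table1 and table2 stand in for the real tables): wrap a query that touches both tables in a procedure and schedule it from SQL Agent.

```sql
CREATE PROCEDURE dbo.usp_WarmCache
AS
BEGIN
    SET NOCOUNT ON;
    -- Reading the tables pulls their pages into the buffer pool;
    -- the results themselves are discarded.
    DECLARE @rows bigint;
    SELECT @rows = COUNT_BIG(*) FROM dbo.table1;
    SELECT @rows = COUNT_BIG(*) FROM dbo.table2;
END;
```

Note that with no usable indexes, COUNT_BIG(*) scans the whole table, which is exactly what keeps the pages resident here.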
Thanks to both @scsimon and @Branko Dimitrijevic for their answers; I think they were really useful, and they guided me in the right direction.
In the end it turns out that the two biggest issues were hardware resources (RAM, no SSD) and the Auto Close feature, which was set to True.
Other fixes that I have made (writing them here for anyone else trying to improve):
- A helper service tool will reorganize (defragment) indexes once every week and rebuild them once a month.
- Created a view which has all the columns from the two tables in question, to eliminate the JOIN cost.
- Advised that a DBA can probably help with better tables/indexes.
- Advised to improve the server hardware.
Will accept @Branko Dimitrijevic's answer as I can't accept both.

Running an SQL query in background

I'm trying to update a modest dataset of 60k records with a value which takes a little time to compute. From a small trial run of 6k records in the production environment, it took 4 minutes to complete, so the full execution should take around 40 minutes.
However, this trial run showed that SQL timeouts were occurring on user requests accessing data in related tables (though not necessarily on the actual rows being updated).
My question is: is there a way of running non-urgent queries as a background operation on the SQL server without causing timeouts or locking tables for extended periods of time? The data in the column being updated during this period is not essential to return with its new value; i.e. if a request happened to come in for such a row, returning the old value would be perfectly acceptable, rather than locking the set until the update is complete. (I'm not sure of the ins and outs of how this works; obviously I do want to prevent data corruption. There could be a way of queuing any additional changes in the background.)
This is possibly a situation where the NOLOCK hint is appropriate. You can read about SQL Server isolation levels in the documentation, and Googling "SQL Server NOLOCK" will give you plenty of material on why you should not overuse the construct.
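As a sketch (table and column names invented), the hint goes on each table reference in the reading queries:

```sql
-- Dirty reads: this query neither takes nor waits on shared locks,
-- at the cost of possibly seeing uncommitted (old or in-flight) values.
SELECT o.OrderID, o.ComputedValue
FROM dbo.Orders AS o WITH (NOLOCK)
WHERE o.CustomerID = 42;
```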
I might also investigate whether you need a SQL query to compute the values. A single query that takes 4 minutes on 6k records... well, that is a long time. You might want to consider reading the data into an application (say, using Python, R, or whatever) and doing the data manipulation there. It may also be possible to speed up the query processing itself.

Is it possible to Cache the result set of a select query in the database?

I am trying to optimize the most-used search query in our system. So far I have added some missing indexes, and that has helped slightly. But I want to further reduce the load on the DB server. One option I will use is caching the result set as a List in the ASP.NET cache so that I don't have to hit the DB as often.
However, I was wondering if there is a way to cache some portion of the select query on the DB side as well. E.g. for the search results we consider only users who have been active in the last 180 days and who have share-info set to true. So this is like a superset which the DB processes every time, before applying the other conditions that are passed in, such as the specified category, city, etc. Is it possible to somehow cache the superset so that I can run queries against it rather than against the whole table? Would creating a view help? I am a bit hesitant to create a view, as I have read that managing views can be an overhead and takes away some flexibility to modify the tables.
I am using SQL Server 2005, so I cannot create a filtered index on the table, which I think would have been helpful.
I agree with @Neville K. SQL Server is pretty smart at caching data in memory. You might see limited or no performance gains for your effort.
You could consider indexed views (Enterprise Edition only) http://technet.microsoft.com/en-us/library/cc917715.aspx for your sub-query.
It is, of course, possible to do this - but I'm not sure if it will help.
You can create a scheduled job - once a night, perhaps - which populates a table called "active_users_with_share_info" by truncating it and then repopulating it based on a select query that filters for users active in the last 180 days with "share_info = true".
Then you can join your search query to this table.
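A minimal sketch of that nightly job (the column names and the source table are assumptions for illustration):

```sql
TRUNCATE TABLE active_users_with_share_info;

INSERT INTO active_users_with_share_info (user_id, city, category, last_active)
SELECT u.user_id, u.city, u.category, u.last_active
FROM dbo.users AS u
WHERE u.share_info = 1
  AND u.last_active >= DATEADD(DAY, -180, GETDATE());
```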
However, I doubt this would do much good - SQL Server is pretty smart at caching. Unless you're dealing with huge volumes of data (hundreds of millions of records), or very limited hardware, I doubt you'd get any measurable performance improvement - but by all means try it!
Of course, the price for this would be more moving parts in your application, more interesting failure modes (what happens if the overnight batch fails silently?), and more training for any new developers you bring into the team.

Adding a clustered index to a SQL table: what dangers exist for a live production system?

I've been put in charge of a 10-year old transactional system where the majority of the business logic is implemented at the database level (triggers, stored procedures, etc). Win2000 server, MSSQL 2000 Enterprise.
No immediate plans for replacing or updating the system are being considered.
The core process is a program that executes transactions - specifically, it executes a stored procedure with various parameters; let's call it sp_ProcessTrans. The program executes the stored procedure at asynchronous intervals.
By itself, things work fine, but there are 30 instances of this program on remotely located workstations, all of them asynchronously executing sp_ProcessTrans and then retrieving data from the SQL server. Execution is pretty regular - ranging 0 to 60 times a minute, depending on what items the program instance is responsible for.
Performance of the system has dropped considerably with 10 years of data growth: the reason is the deadlocks, specifically deadlock wait times, on the Employee table.
I have discovered:
In sp_ProcessTrans's execution, it selects from an Employee table 7 times
The select is done on a field that is NOT the primary key
No index exists on this field. Thus a table scan is performed 7 times per transaction
So the reason for the deadlocks is clear. I created a non-unique ordered clustered index on the field (almost unique, NUM(7), very rarely changes). There was immediate improvement in the test environment.
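For reference, the kind of statement involved (the column name is a placeholder for the actual NUM(7) field):

```sql
-- Non-unique clustered index on the frequently searched employee number field,
-- turning the 7 table scans per transaction into seeks.
CREATE CLUSTERED INDEX IX_Employee_EmpNum
ON dbo.Employee (EmpNum);
```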
The problem is that I cannot simulate the deadlocks in a test environment: I'd need 30 workstations, and I'd need to simulate 'realistic' activity on those stations, so virtualization is out.
I need to know if I must schedule downtime.
Creating an index shouldn't be a risky operation for MSSQL, but is there any danger (data corruption, extra wait time, etc.) in creating this index on the production database while transactions are still taking place? I can pick a time when transactions are fairly quiet across the 30 stations.
Are there any hidden dangers I'm not seeing? (I'm not looking forward to restoring the DB if something goes wrong. It would take a lot of time with 10 years of data.)
Data corruption shouldn't be an issue, but if you try adding an index to a live production table you are likely to experience problems, as the table will not be responsive to queries during the index creation. Creating the index takes an exclusive table lock until it is complete, and the time this takes depends on numerous factors (especially the number of rows).
Scheduled downtime is strongly recommended, and also a good habit to get into. And obviously take a backup first, and have a plan in case you have to undo what you're intending.