I am working on a SQL Server job that involves processing around 75,000 records.
The job runs fine for the first 10,000-20,000 records at a rate of around 500 records per minute. After roughly 20,000 records, execution just dies: it loads only about 3,000 records every 30 minutes and stays at that rate.
I asked a similar question yesterday and got a few good suggestions on procedure changes.
Here's the link:
SQL SERVER Procedure Inconsistent Performance
I am still not sure how to find the real issue here. Here are a few of the questions I have:
If the problem is tempdb, how can I monitor activity on tempdb?
How can I check whether the network is the bottleneck?
Are there any other ways to find out what is different between when the job is running fast and when it slows down?
I have been the administrator for a couple of large data warehouse implementations where this type of issue was common. Although I can't be sure of it, it sounds like the performance of your server is being degraded either by growing log files or by memory usage. A great tool for reviewing these types of issues is Perfmon.
A great article on using this tool can be found here
Unless your server is seriously underpowered, 75,000 records should not be a problem for tempdb, so I really doubt that is your problem.
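If you do want to rule tempdb out, a quick check (assuming SQL Server 2005 or later) is to sum the page counts from sys.dm_db_file_space_usage; pages are 8 KB each:

SELECT  SUM(user_object_reserved_page_count)     * 8 AS user_objects_kb,
        SUM(internal_object_reserved_page_count) * 8 AS internal_objects_kb,
        SUM(version_store_reserved_page_count)   * 8 AS version_store_kb,
        SUM(unallocated_extent_page_count)       * 8 AS free_space_kb
FROM    tempdb.sys.dm_db_file_space_usage;

Run it while the job is fast and again once it slows down; if the numbers barely move, tempdb is unlikely to be your bottleneck.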
Your prior question indicated SQL Server, so I'd suggest running a trace while the proc is running. You can get statement timings etc from the trace and use that to determine where or what is slowing things down.
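A lighter-weight alternative to a full trace (assuming SQL Server 2005 or later) is to poll sys.dm_exec_requests while the job runs and watch what the session is waiting on once it slows down:

SELECT  r.session_id, r.status, r.command, r.wait_type, r.wait_time,
        r.cpu_time, r.total_elapsed_time, t.text
FROM    sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE   r.session_id <> @@SPID;

A wait_type that keeps showing up during the slow phase (for example an I/O or log-related wait) tells you where to dig next.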
You should be processing each customer in a separate transaction, or in small groups of customers. Otherwise, the working set of items that the final transaction has to write keeps getting bigger, and each addition causes a rewrite. You can end up forcing your current data to be paged out, and that really slows things down.
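A rough sketch of committing in batches (table and column names below are placeholders, not from your job):

DECLARE @rows INT;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;

    -- Process the next 1000 unprocessed rows, then commit before moving on
    UPDATE TOP (1000) dbo.CustomerQueue
    SET    Processed = 1
    WHERE  Processed = 0;

    SET @rows = @@ROWCOUNT;

    COMMIT TRANSACTION;
END;

Keeping each transaction small like this also keeps log growth and lock duration bounded, which matters more and more as the run gets longer.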
Check the memory allocated to SQL Server. If it's too small, you end up paging SQL Server's processes. If it's too large, you can leave the OS without enough memory.
I've seen this question asked in many ways all over the Internet, but despite implementing the abundance of advice (and some voodoo), I'm still struggling. I have a 100 GB+ database that is constantly inserting and updating records in very large transactions (200+ statements per transaction). After a system restart, the performance is amazing (data is written to a large SATA III SSD connected via USB 3.0). The SQL Server instance is running on a VM under VMware Workstation. The host is set to hold the entire VM in memory. The VM itself has a paging cache of 5000 MB. The SQL Server account is set to 'lock pages in memory'. I have 5 GB of RAM allocated to the VM, and the max memory of the SQL Server instance is set to half a gig.
I have played with every single one of these parameters to try to maintain consistent performance, but sure enough, the performance eventually degrades to the point where it begins to time out. Here's the kicker, though: if I stop the application that's loading the database and then execute the stored proc in Management Studio, it runs like lightning, clearly indicating it's not an issue with the query, and probably nothing to do with memory management or paging. If I then restart the loader app, it still crawls. If I reboot the VM, however, the app once again runs like lightning...for a while...
Does anybody have any other suggestions based upon the symptoms presented?
Depending on how large your hot set is, 5 GB of memory may simply be too little for a 100+ GB database.
Check your indexes and query plans. We cannot help you without them. And I bet you are missing some indexes, which is the standard performance issue people have.
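As a starting point for that check, a sketch like this against the missing-index DMVs (SQL Server 2005 and later) lists the indexes the optimizer wishes it had, ranked by a rough benefit estimate:

SELECT TOP 20
       mid.statement AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_total_user_cost * migs.avg_user_impact * migs.user_seeks AS est_benefit
FROM   sys.dm_db_missing_index_details AS mid
JOIN   sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle
JOIN   sys.dm_db_missing_index_group_stats AS migs ON migs.group_handle = mig.index_group_handle
ORDER BY est_benefit DESC;

Treat the output as suggestions, not commands; an index that helps the loader's updates can also slow the inserts down.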
Otherwise, once you have done your homework, head over to dba.stackexchange.com and ask there.
Generally, consider that 200 statements per transaction may simply indicate seriously sub-optimal programming. For example, you could bulk-load the data into a temp table and then merge it into the final one.
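A minimal sketch of that pattern, assuming SQL Server 2008 or later and placeholder table, column and file names:

CREATE TABLE #staging (Id INT PRIMARY KEY, Value NVARCHAR(100));

-- Load the whole batch in one shot instead of 200 individual statements
BULK INSERT #staging
FROM 'C:\load\batch.dat'                       -- hypothetical file path
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Then apply it to the real table in a single set-based statement
MERGE dbo.Target AS t                          -- placeholder target table
USING #staging AS s ON t.Id = s.Id
WHEN MATCHED THEN UPDATE SET t.Value = s.Value
WHEN NOT MATCHED THEN INSERT (Id, Value) VALUES (s.Id, s.Value);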
Actually, I may have a working theory. What I did was add some logic to the app so that when it times out, it sits for two minutes and then tries again, and voila! Back to full speed. I rubber-ducked it with my co-worker and came up with the theory that my perceived SSD write speeds were actually the write speed to the VMware host's virtual USB 3 buffer, and that the actual SSD write speeds were slower. I'm probably hitting the host's buffer size limit, and by forcing the app to wait 2 minutes, the host gets a chance to flush its back-buffered data to the SSD. Elementary, Watson :)
If this approach also fails to be sustainable, I'll report in.
Try executing this to determine your problem queries:
SELECT TOP 20
    qs.sql_handle,
    qs.execution_count,
    qs.total_worker_time AS Total_CPU,
    total_CPU_inSeconds =              -- converted from microseconds
        qs.total_worker_time / 1000000,
    average_CPU_inSeconds =            -- converted from microseconds
        (qs.total_worker_time / 1000000) / qs.execution_count,
    qs.total_elapsed_time,
    total_elapsed_time_inSeconds =     -- converted from microseconds
        qs.total_elapsed_time / 1000000,
    st.text,
    qp.query_plan
FROM
    sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC
Then check your estimated and actual execution plans on the queries this command helps you pinpoint.
Source: How do I find out what is hammering my SQL Server? and the bottom of the page at http://technet.microsoft.com/en-us/magazine/2007.11.sqlquery.aspx
Beyond the excellent indexing suggestions already given, be sure to read up on parameter sniffing. That could be the cause of the problem.
SQL Server - parameter sniffing
http://www.sommarskog.se/query-plan-mysteries.html#compileps
As a result you could have a bad query plan being re-used, or SQL's buffer could be getting full and writing pages out of memory to disk (maybe that's other allocated memory in your case).
You could run DBCC FREEPROCCACHE and DBCC FREESYSTEMCACHE to empty them and see if you get a performance boost.
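A hedged experiment along those lines (the procedure name below is a placeholder):

DBCC FREEPROCCACHE;                    -- drops every cached plan on the instance
DBCC FREESYSTEMCACHE ('ALL');          -- clears the other system caches as well

-- Or, more surgically, force only the suspect procedure to recompile on its next run:
EXEC sp_recompile N'dbo.LoaderProc';   -- placeholder procedure name

If performance recovers right after clearing the cache, a sniffed or stale plan is a strong suspect.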
You should give SQL Server more memory too - as much as you can while leaving room for other critical programs and the OS. You might have 5 GB of RAM on the VM, but SQL Server only gets to play with half a GB, which seems REALLY small for what you're describing.
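For example, to let SQL Server use roughly 3 GB of the VM's 5 GB (3072 is only an illustrative value; the right number is a judgment call for your workload):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 3072;
RECONFIGURE;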
If those things don't move you in the right direction, install the SQL Server Management Data Warehouse so you can see exactly what is happening when your slowdown begins. Running it takes up additional memory, but it will give the DBAs more to go on.
In the end, what I did was a combination of two things: putting in logic to recover when timeouts occurred, and setting the VM's core count to reflect only physical cores, not logical cores. For example, the host has 2 cores that are hyper-threaded; when I set my VM to use 4 cores, it occasionally gets hung in some infinite loop, but when I set it to 2 cores, it runs without fail. Still, aberrant behavior like this is difficult to mitigate reliably.
This isn't a terribly technical question, as I am just looking for theories as to why something like this would happen.
In our application, we have a few different stored procedures that read mostly the same tables. We had been monitoring SQL Server and trying to knock off the most expensive queries on our list (highest I/O, CPU time etc). We have seen quite a lot of gains by altering the SQL and/or altering the application.
Anyway, we altered stored procedure #1 and released it. As expected, it performed much better. However, stored procedure #2 (which reads similar data) all of a sudden saw its performance metrics degrade (it is consuming much more I/O).
We are still in a better place after the release but I am trying to figure out why this is happening. Thus far, I have been unable to replicate the issue (it is still performing fine for me no matter how I use the stored proc).
Also, the stored proc does not perform poorly every time in production. The majority of the times it is run, it performs just fine.
Any ideas?
We are using SQL Server 2008. We did not alter any indexes.
This may not seem at all related, but humor me and check your max degree of parallelism on the database instance:
EXEC sp_configure 'max degree of parallelism'
My suspicion is that the sporadic poor execution of your SP was always happening but was being masked by the poor performance of the db in general. Chances are this config value is set to 0, which is the default and a bad thing. You will want to change it to a value somewhere between 1 and 8 based on a variety of factors; reference this KB for guidance: http://support.microsoft.com/kb/2806535
I have a feeling you're getting a runaway parallel query from time to time, which would explain the behavior you're seeing, and changing this value will help curb that.
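If you do decide to change it, the setting is adjusted the same way it is read (4 here is only an example; pick the value the KB article suggests for your core count):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;   -- example value only
RECONFIGURE;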
P.S. My apologies for not appending a comment to the question, but my reputation isn't high enough yet...
I am currently addressing a situation where our web application receives at least a million requests every 30 seconds. These requests lead to generating 3-5 million row inserts across 5 tables. This is a pretty heavy load to handle. Currently we are using multithreading to handle this situation (which is a bit faster, but we are unable to get better CPU throughput). However, the load will definitely increase in the future and we will have to account for that too. Six months from now we are looking at double the load we are currently receiving, so I am looking for a possible new solution that is scalable and can easily accommodate any further increase in load.
Currently, multithreading is making the whole debugging scenario quite complicated, and we sometimes have problems tracing issues.
FYI, we are already utilizing the SQL Bulk Insert/Copy that is mentioned in this previous post:
Sql server 2008 - performance tuning features for insert large amount of data
However, I am looking for a more capable solution (and I think there should be one) that will address this situation.
Note: I am not looking for any code snippets or code examples. I am just looking for a big picture of a concept that I could possibly use and I am sure that I can take that further to an elegant solution :)
Also, the solution should make better use of threads and processes; I do not want my threads/processes to sit waiting on some other resource before they can execute.
Any suggestions will be deeply appreciated.
Update: Not every request will lead to an insert; however, most of them will lead to some SQL operation. The application performs different types of transactions, and these lead to a lot of bulk SQL operations. I am more concerned with inserts and updates.
These operations need not be real-time; a bit of lag is acceptable, although processing them in real time would be very helpful.
I think your problem comes down to getting better CPU throughput, which will lead to better performance. So I would probably look at something like asynchronous processing, where a thread never sits idle; you will probably have to maintain a queue in the form of a linked list or whatever other data structure suits your programming model.
The way this would work is that your threads try to perform a given job immediately; if anything stops them from doing so, they push that job onto the queue, and the queued items are then processed according to how the container/queue stores them.
In your case since you are already using bulk sql operations you should be good to go with this strategy.
lemme know if this helps you.
Can you partition the database so that the inserts are spread around? How is this data used after insert? Is there a natural partition of the data by client, geography, or some other factor?
Since you are using SQL Server, I would suggest you get several of the books on high availability and high performance for SQL Server. The internals book might help as well. Amazon has a bunch of these. This is a complex subject and requires too much depth for a simple answer on a bulletin board. But basically there are several keys to a high-performance design, including hardware choices, partitioning, correct indexing, correct queries, etc. To do this effectively, you have to understand in depth what SQL Server does under the hood and how changes can make a big difference in performance.
Since you do not need your inserts/updates to be real-time, you might consider having two databases: one for reads and one for writes, similar to having an OLTP db and an OLAP db:
Read Database:
Indexed as much as needed to maximize read performance.
Possibly denormalized if performance requires it.
Not always up to date.
Insert/Update database:
No indexes at all. This will help maximize insert/update performance
Try to normalize as much as possible.
Always up to date.
You would basically direct all insert/update actions to the insert/update db. You would then create a publication process that moves data over to the read database at certain intervals; a rough sketch of such a transfer follows the list below. When I have seen this in the past, the data is usually moved over on a nightly basis, when few people are using the site. There are a number of options for moving the data over, but I would start by looking at SSIS.
This will depend on your ability to do a few things:
have read data be up to one day out of date
complete your nightly Read db update process in a reasonable amount of time.
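A minimal sketch of the nightly transfer, assuming a LastModified column and placeholder database/table names (SSIS or replication would do the same job more robustly):

DECLARE @since DATETIME;
SELECT @since = ISNULL(MAX(LastModified), '19000101') FROM ReadDb.dbo.Orders;

-- Copy only the rows written since the last publication run
INSERT INTO ReadDb.dbo.Orders (OrderId, CustomerId, Total, LastModified)
SELECT OrderId, CustomerId, Total, LastModified
FROM   WriteDb.dbo.Orders
WHERE  LastModified > @since;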
In the final stage of development I started looking at code to try to find some bad practices. I discovered that while accessing a page, I'm querying the DB n times (n is the number of rows of an HTML table) just to get the translation (different languages) for a given record... I immediately thought that was bad and I tried a small optimization.
Running the SQL Profiler shows that those queries took 0 ms.
Since the tables I'm querying are small (20-100 records), I thought I could fetch all the data and cache it in the web server's RAM, retrieving it later using LINQ to Objects. Execution time this way is also 0 ms.
The environment where I'm running these tests is a DB and web server on the same machine with 0% load. It's only me using the application.
The question starts here. Since I see no performance difference at all, should I avoid that optimization? Or should I keep it anyway, to balance the usage of both the DB and the web server (the servers will be on two different machines in the production environment)?
In my opinion this optimization can't hurt performance; it could only improve it when the DB is heavily loaded. But something in my brain says it's wrong to optimize when there is no need...
Thanks for your opinion.
I don't think you've actually shown that there's no performance difference at all.
Try running the query a million times each way, and time how long it takes in total... I think you'll see a big difference.
The SQL profiler only shows you (as far as I'm aware) the time taken on the database to perform the query. It doesn't take into account:
The time taken to set up the connection or prepare the query
The time taken in network latency issuing the query
The time taken in network latency returning the results
The time taken converting the results into useful objects.
Of course, premature optimisation is a bad thing in general, but this does sound like a reasonable change to make - if you know that the contents of the tables won't change.
SQL Server is a bit peculiar in that way: all query execution times between 0 and 15 ms are rounded down to 0 ms, so you don't actually know from looking at the number whether your query is taking 0 ms or 15 ms. There is a big difference between doing 1000 * 1 ms queries and 1000 * 15 ms queries.
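One way around that rounding (assuming SQL Server 2008 or later; the procedure name and parameter are placeholders) is to time a large batch of calls and divide:

DECLARE @start DATETIME2 = SYSDATETIME();
DECLARE @i INT = 0;

WHILE @i < 1000
BEGIN
    EXEC dbo.GetTranslations @LanguageId = 1;   -- placeholder procedure and parameter
    SET @i = @i + 1;
END;

-- Average elapsed time per call, in milliseconds
SELECT DATEDIFF(MICROSECOND, @start, SYSDATETIME()) / 1000.0 / 1000 AS avg_ms_per_call;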
Regarding translations, I've found that the best way is using Resources, tying them to a SQL database, and then caching the translations in the web application for a reasonable amount of time. That is pretty efficient.
Other than that, what Jon Skeet says... :)
Both answers are true.
I actually measured 1 million executions as Jon suggested and, in fact, there is a huge difference! Thanks Jon.
And what Jonas says is also true: the query actually took somewhere around 15 ms (measured by the program) even though SQL Profiler says 0.
Thanks guys!
I have a problem with a large database I am working with, which resides on a single drive. The database contains around a dozen tables; the two main ones are around 1 GB each and cannot be made smaller. My problem is that the disk queue for the database drive sits at around 96% to 100% even when the website that uses the DB is idle. What optimisation could be done, or what is the source of the problem? The DB on disk is 16 GB in total and almost all the data is required: transaction data, customer information and stock details.
What are the reasons why the disk queue is always high no matter the website traffic?
What can be done to help improve performance on a database this size?
Any suggestions would be appreciated!
The database is an MS SQL Server 2000 database running on Windows Server 2003 and, as stated, 16 GB in size (data file size on disk).
Thanks
Well, how much memory do you have on the machine? If SQL Server can't keep the pages in memory, it is going to have to go to the disk to get its information. If your memory is low, you might want to consider upgrading it.
Since the database is so big, you might want to consider adding two separate physical drives and then putting the transaction log on one drive and partitioning some of the other tables onto the other drive (you have to do some analysis to see what the best split between tables is).
In doing this, you are allowing IO accesses to occur in parallel, instead of in serial, which should give you some more performance from your DB.
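On SQL Server 2000, the usual way to relocate the log to a new drive is a detach/attach (database name and file paths below are placeholders):

-- Make sure nothing is using the database, then detach it
EXEC sp_detach_db 'ShopDB';

-- Copy the .ldf to the new drive at the OS level, then re-attach
EXEC sp_attach_db 'ShopDB',
     'E:\Data\ShopDB.mdf',
     'F:\Log\ShopDB_log.ldf';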
Before buying more disks and shifting things around, you might also update statistics and check your queries - if you are doing lots of table scans and so forth you will be creating unnecessary work for the hardware.
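Refreshing statistics is cheap to try first (the table name below is a placeholder; both commands work on SQL Server 2000):

-- Refresh statistics across the whole database
EXEC sp_updatestats;

-- Or target one of the big tables with a full scan
UPDATE STATISTICS dbo.Transactions WITH FULLSCAN;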
Your database isn't that big after all - I'd first look at tuning your queries. Have you profiled what sort of queries are hitting the database?
If your disk activity is that high while your site is idle, I would look for other processes that might be running and affecting it. For example, are you sure there aren't any scheduled backups running? Especially with a large db, these could run for a long time.
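A quick way to see whether backups have been running during the slow periods is the backup history in msdb:

SELECT TOP 20 database_name, type, backup_start_date, backup_finish_date
FROM   msdb.dbo.backupset
ORDER BY backup_start_date DESC;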
As Mike W pointed out, there is usually a lot you can do with query optimization with existing hardware. Isolate your slow-running queries and find ways to optimize them first. In one of our applications, we spent literally 2 months doing this and managed to improve the performance of the application, and the hardware utilization, dramatically.