Poor SQL performance after server transfer - sql

We had a SQL Server 2005 server running FOR XML EXPLICIT queries quite happily with no performance issues. The machine (a Windows Server 2003 box) has unfortunately died, so I've had to do an emergency provision of a Windows Server 2012 box. The database files have been reattached to a 2008 R2 instance and "work". However, the queries are horrendously slow: 5 seconds per query when previously they ran in tenths of a second. This makes the websites that they power unusable.
I've rebuilt all the indexes and run DBCC FREEPROCCACHE on all machines, but this has had no noticeable effect. What else can I look at? I can't run them on the 2016 SQL instance on the box because some of the queries use non-ANSI *= joins (I said it was old!).
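For reference, the joins in question use the old-style syntax, roughly like this (table and column names are made up for illustration):

```sql
-- Legacy non-ANSI outer join (only runs under old compatibility levels):
SELECT o.OrderID, c.CustomerName
FROM Orders o, Customers c
WHERE o.CustomerID *= c.CustomerID;

-- The ANSI-92 equivalent a newer instance would require:
SELECT o.OrderID, c.CustomerName
FROM Orders o
LEFT OUTER JOIN Customers c ON o.CustomerID = c.CustomerID;
```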

If your query was running fine before, consider what else has changed; looking at the estimated and actual execution plans might help to pinpoint this.
When you say you are joining, have you considered how much you join? If the new machine has more data in the database, then a join might quickly become prohibitively expensive. You can mitigate this by reducing the data you pull in, as less data handling means less workload.
Is there something you can pre-calculate before you run your query, or otherwise change to make it run faster?
I assume you are doing a SELECT, but if you UPDATE or DELETE data, the indexes also need to be maintained, which takes a long time (in this case, disable the index, insert all the needed data, and then rebuild the index).
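A minimal sketch of that disable-then-rebuild pattern, assuming a hypothetical nonclustered index and table:

```sql
-- Disable a nonclustered index before a large data change
-- (do not disable the clustered index, that makes the table inaccessible).
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders DISABLE;

-- ... perform the large INSERT / UPDATE / DELETE here ...

-- Rebuild once at the end instead of maintaining the index row by row.
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;
```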
You don't mention any XML handling, but you have tagged the question for-xml. If your join is performed on XML data, using XQuery to get the data might also give a boost to performance.
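A rough sketch of the XQuery approach, with a made-up XML column and structure, shredding the values out so the join can run on plain relational columns:

```sql
-- Payload is assumed to be an xml column shaped like <Orders><Order>...</Order></Orders>
SELECT x.n.value('(OrderID)[1]', 'int')    AS OrderID,
       x.n.value('(CustomerID)[1]', 'int') AS CustomerID
FROM dbo.OrderXml AS o
CROSS APPLY o.Payload.nodes('/Orders/Order') AS x(n);
```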

Related

How can I find the SQL query for an execution plan?

A program generates and sends queries to SQL Server (on a high-load production system). I want to get the plan of one specific query against one specific table. I started Profiler with the "Showplan XML" event and set filters on TextData (LIKE %MyTable%) and DatabaseName. It shows rows with XML in TextData that describe execution plans (for all queries against my table). But I know that there are 5 different SQL queries for this table.
How can I match a specific query with its corresponding plan without using statistics?
Is there a reason this has to be done on the production environment? Most really bad execution plans (missing indexes causing table scans etc.) will be obvious enough on a dev environment where you can use all the diagnostics you want.
Otherwise, querying the plan cache (as in the question someone else linked) will probably have the lowest impact, since it just reads system views rather than adding diagnostics to every query.
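A sketch of that plan-cache query, with MyTable as a placeholder for your table name:

```sql
-- Pull the cached text, XML plan, and execution count for statements
-- that mention the table; filter further on the text to tell the 5 queries apart.
SELECT st.text, qp.query_plan, qs.execution_count
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%MyTable%';
```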

sql temp table join between servers

So I have a summary I need to return to the end-user application.
It should accept 3 parameters: DateType, StartDate, EndDate.
DateType will determine the date field I use to filter the data.
The way I accomplished this was putting all the IDs of the records for a date type into a TEMP table and then joining my summary to that list of IDs.
This worked fine when running the query on the SQL Server that houses the data.
However, that is a replicated server, so when I compiled it into a stored proc on the server with the rest of the application data, it slowed the query down, i.e. 2 seconds vs. 50 seconds.
I think the join from the temp table, which is created on the application SQL Server, across to the tables on the replication server is causing the slowdown.
Are there any methods or techniques that I can use to get around this and build this all in one stored procedure?
If I create 3 stored procedures with their own date range, then they are fast again. However, this means maintaining multiple stored procs for the same thing.
First off, if you are running a version of SQL Server older than 2012 SP1, one problem is that users who aren't allowed to run DBCC SHOW_STATISTICS (which is most users who aren't sysadmins, see the "Permissions" section in the documentation) don't get access to statistics on remote tables. This can severely cripple the optimizer's ability to generate a good execution plan. Upgrading SQL Server or granting more permissions can help there.
If your query involves filtering or joining on a character column, make sure the remote server is flagged in the linked server options as "collation compatible". If this option is off, SQL Server can't assume strings can be compared across the servers and it will start pumping entire tables up and down just to make sure the data ends up where the comparison has to be made.
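As a hedged sketch (REMOTESRV is a placeholder for your linked server name), the option can be turned on like this, but only do it when the remote server genuinely uses a compatible collation:

```sql
EXEC sp_serveroption @server   = 'REMOTESRV',
                     @optname  = 'collation compatible',
                     @optvalue = 'true';
```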
If the execution plan is as good as it gets and it's still not good enough, one general (lame) technique is to transfer all data locally first (SELECT * INTO #localtable FROM remote.db.schema.table), then run the query as a non-distributed query. Obviously, in order for this to work, the remote table cannot be "too big" and in some cases this actually has worse performance, depending on how many rows are involved. But it's always worth considering, because the optimizer does a better job with local tables.
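A minimal sketch of that pull-it-local technique, with placeholder names:

```sql
-- Materialize the remote table locally first...
SELECT *
INTO #localtable
FROM REMOTESRV.SomeDb.dbo.SomeTable;

-- ...then run the join as an ordinary local (non-distributed) query.
SELECT l.SomeKey, t.SomeValue
FROM #localtable AS l
JOIN dbo.LocalTable AS t ON t.SomeKey = l.SomeKey;
```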
Another approach that avoids pulling tables together across servers is packing up data in parameters to remote stored procedure calls. Entire tables can be passed as XML through an NVARCHAR(MAX) parameter, since neither XML columns nor table-valued parameters are supported in distributed queries. The basic idea is the same: avoid the need for the optimizer to figure out an efficient distributed query. The best approach greatly depends on your data and your query, obviously.
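For illustration, a rough sketch of the XML-through-NVARCHAR(MAX) idea; the procedure, table, and column names are invented:

```sql
-- Caller side: serialize the rows and ship them to the remote procedure
-- (calling a remote proc this way requires 'rpc out' enabled on the linked server).
DECLARE @payload nvarchar(max) =
    (SELECT Id FROM #ids FOR XML PATH('row'), ROOT('rows'));

EXEC REMOTESRV.SomeDb.dbo.usp_Summary @ids = @payload;

-- Remote side (inside usp_Summary): shred the XML back into rows.
-- DECLARE @x xml = @ids;
-- SELECT r.n.value('(Id)[1]', 'int') AS Id
-- FROM @x.nodes('/rows/row') AS r(n);
```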

How to optimize the downloading of data to the server in SSIS

Good day.
I need to get records from an Oracle database into a SQL Server database. The data source (ODBC type) uses a SQL command, in which I take advantage of all the indexes available for my requirement. The process runs fine; the problem is that it takes a long time and I need it to be quick. The process cannot be done with a Lookup and does not need a Merge or Merge Join; it simply loads a table from Oracle into SQL Server under certain conditions.
Thank you for your help
Check what your limiting factor is. Generally there are 3 points to check (simple test queries for the first two are sketched after this list):
1. The remote (source) server is slow. The source DB can run low on memory, read speed, or free CPU. Substitute your query with a straight SELECT statement with no WHERE clause or JOINs and see if your SSIS package runs faster.
2. The target DB. You may have indexes enabled, high write latency on the disks, or not enough CPU. Run a plain INSERT into your target table and see how long it takes.
3. The problem may be in the middle: the transfer between the two servers. The network is usually the main bottleneck. Is SSIS hosted on the same server as SQL Server? If not, you have two network connections plus a possible hardware bottleneck on the dedicated SSIS machine.
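A sketch of the two isolation tests, with placeholder table and column names:

```sql
-- 1) Source-side test: time a straight read with no WHERE clause or JOINs
--    (use it as the SSIS source query and compare the package duration).
SELECT COL_A, COL_B
FROM SOURCE_TABLE;

-- 2) Target-side test: time a plain insert into the destination table
--    from data that is already local, to measure pure write speed.
INSERT INTO dbo.TargetTable (ColA, ColB)
SELECT ColA, ColB
FROM dbo.StagingTable;
```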
Depending on the bottleneck there are different solutions.
If you have network capacity and the bottleneck is one CPU per query on Oracle, then you can partition your data horizontally (IDs 1 to 100, 101 to 200, etc.), establish multiple connections to Oracle, and load the data in several streams. The number of streams should be one less than the number of CPUs on Oracle, SSIS, or SQL Server (whichever is smallest).
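A sketch of how the per-stream source queries might look, assuming the table has a numeric ID column to range over:

```sql
-- Stream 1 source query:
SELECT * FROM SOURCE_TABLE WHERE ID BETWEEN 1 AND 100;

-- Stream 2 source query:
SELECT * FROM SOURCE_TABLE WHERE ID BETWEEN 101 AND 200;

-- ...one contiguous range per connection/data flow, so the streams
-- read from Oracle and write to SQL Server in parallel.
```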

Fastest way to insert in parallel to a single table

My company is cursed by a symbiotic partnership turned parasitic. To get our data from the parasite, we have to use a painfully slow ODBC connection. I did notice recently, though, that I can get more throughput by running queries in parallel (even on the same table).
There is a particularly large table that I want to extract data from and move it into our local table. Running queries in parallel I can get data faster, but I also imagine that this could cause issues with trying to write data from multiple queries into the same table at once.
What advice can you give me on how to best handle this situation so that I can take advantage of the increased speed of using queries in parallel?
EDIT: I've gotten some great feedback here, but I think I wasn't completely clear on the fact that I'm pulling data via a linked server (which uses the ODBC drivers). In other words, that means I can run normal INSERT statements, and I believe that would provide better performance than either SqlBulkCopy or BULK INSERT (actually, I don't believe BULK INSERT would even be an option).
Have you read Load 1TB in less than 1 hour?
Run as many load processes as you have available CPUs. If you have 32 CPUs, run 32 parallel loads. If you have 8 CPUs, run 8 parallel loads.
If you have control over the creation of your input files, make them of a size that is evenly divisible by the number of load threads you want to run in parallel. Also make sure all records belong to one partition if you want to use the switch-partition strategy.
Use BULK INSERT instead of BCP if you are running the process on the SQL Server machine.
Use table partitioning to gain another 8-10%, but only if your input files are GUARANTEED to match your partitioning function, meaning that all records in one file must be in the same partition.
Use TABLOCK to avoid row-at-a-time locking.
Use ROWS PER BATCH = 2500, or something near this, if you are importing multiple streams into one table.
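A hedged sketch of one such load, with the file path and table name as placeholders, combining the TABLOCK and ROWS_PER_BATCH advice from the list above:

```sql
BULK INSERT dbo.TargetTable
FROM 'D:\load\chunk_01.dat'
WITH (
    TABLOCK,                 -- table-level lock instead of row-at-a-time locking
    ROWS_PER_BATCH  = 2500,
    FIELDTERMINATOR = '|',   -- adjust to your file format
    ROWTERMINATOR   = '\n'
);
```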
For SQL Server 2008, there are certain circumstances where you can utilize minimal logging for a standard INSERT SELECT:
SQL Server 2008 enhances the methods that it can handle with minimal logging. It supports minimally logged regular INSERT SELECT statements. In addition, turning on trace flag 610 lets SQL Server 2008 support minimal logging against a nonempty B-tree for new key ranges that cause allocations of new pages.
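For illustration, a rough sketch with placeholder table names; whether the insert is actually minimally logged still depends on the documented conditions (recovery model, target indexes, and so on):

```sql
-- Enable trace flag 610 globally (it can also be set as a startup parameter).
DBCC TRACEON(610, -1);

-- TABLOCK on the target is one of the usual prerequisites for minimal logging.
INSERT INTO dbo.TargetTable WITH (TABLOCK)
SELECT *
FROM dbo.StagingTable;
```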
If you're looking to do this in code, i.e. C#, there is the option to use SqlBulkCopy (in the System.Data.SqlClient namespace), and as this article suggests, it's possible to do this in parallel:
http://www.adathedev.co.uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html
If by any chance you've upgraded to SQL Server 2014, you can insert in parallel (the database compatibility level must be 110 or higher). See this:
http://msdn.microsoft.com/en-us/library/bb510411%28v=sql.120%29.aspx
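A minimal sketch of what that looks like, with made-up database and table names; the SQL Server 2014 improvement is specifically that SELECT ... INTO can use a parallel insert plan:

```sql
-- Make sure the database is at compatibility level 110 or higher.
ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 110;

-- SELECT ... INTO can now run with a parallel insert plan.
SELECT *
INTO dbo.LocalCopy
FROM dbo.BigSourceTable;
```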

Mysql to SQL Server: is there something similar to query cache?

I'm trying to optimize a SQL Server instance. I have some experience with MySQL, and one of the things that usually helps is enabling the query cache, which will basically cache query results as long as you are running the same query.
Is there something similar to this on SQL Server? Could you please point what is the name of this feature?
Thanks!
SQL Server doesn't cache result sets per se, but it does cache data pages which have been read, in addition to caching query execution plans. This means that if it has to read the same data pages again to answer a query, it will be faster since there are fewer physical reads (from disk), but you should still see the same number of logical reads.
Here is an article with more details.
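One way to see this effect, sketched with a placeholder table name:

```sql
SET STATISTICS IO ON;

-- First run: expect physical reads while pages come off disk.
SELECT COUNT(*) FROM dbo.SomeLargeTable;

-- Second run: logical reads stay the same, physical reads drop
-- (often to 0) because the pages are now in the buffer pool.
SELECT COUNT(*) FROM dbo.SomeLargeTable;

SET STATISTICS IO OFF;
```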
SQL Server will cache query results, but it's a little more complicated than in MySQL's case (since SQL Server provides ACID guarantees that MySQL does not - at least, not with MyISAM). But you'll definitely find that the second time you execute a query on SQL Server, it'll be faster than the first time (as long as no other changes have happened).
There's no specific name for this feature that I'm aware of. It's more a combination of caches...