Index creation time is high on Azure Managed Instance - sql

I am working with Azure Managed Instances for hosting a data warehouse. For large table loads, the indexes are dropped and re-created rather than inserting with the indexes in place. The indexes are re-created by a stored procedure that builds them from a list kept in an admin table. Since moving from our on-prem solution to the managed instance, we have seen a considerable decrease in performance when building the indexes: the process takes roughly twice as long in Azure as it does on-prem.
The specs for the Azure Managed Instance are higher than the on-prem server's: more cores and more memory. We have looked at IO time and tried increasing file size to raise IO throughput, but it has had minimal impact.
Why would it take longer to build indexes on the same data, using the same code, in an Azure Managed Instance than it does on an on-prem SQL Server?
Is there a setting or configuration in Azure that could be changed to improve performance?

Could you please check the transaction log file for the database? Monitor log space use with sys.dm_db_log_space_usage. This DMV returns the amount of log space currently used and indicates when the transaction log needs truncation. Please see the reference link here: sys.dm_db_log_space_usage (Transact-SQL) - SQL Server | Microsoft Docs
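For example, a quick check run in the context of the warehouse database might look like this:

-- How big is the log, and how much of it is currently in use?
SELECT total_log_size_in_bytes / 1048576.0 AS total_log_size_mb,
       used_log_space_in_bytes / 1048576.0 AS used_log_space_mb,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;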
Since creating an index can easily hit the throughput limit for either the data or the log files, you might need to increase individual file sizes: Resource limits - Azure SQL Managed Instance | Microsoft Docs
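On a General Purpose instance, per-file throughput steps up with file size, so pre-growing the files before the index build can help. A minimal sketch, with hypothetical database and logical file names:

-- Pre-grow the data and log files so they land in a higher throughput tier
ALTER DATABASE [MyDW] MODIFY FILE (NAME = N'MyDW_data', SIZE = 600GB);
ALTER DATABASE [MyDW] MODIFY FILE (NAME = N'MyDW_log', SIZE = 300GB);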
You can also use this script managed-instance/MI-GP-storage-perf.sql at master · dimitri-furman/managed-instance · GitHub to determine the IOPS/throughput seen against each database file.

Related

How to increase performance of Azure Data Factory Pipeline with Integration Runtime

I would like to increase the performance of our pipelines.
The pipelines currently run from an integration runtime.
I am running a single copy activity on tables held in our source, which is a SQL database. The tables contain just under a million rows, with about 15 columns.
Currently it takes approximately 20 minutes to copy a table from the source to the sink (ADLS).
Is there a way to increase the DIUs to improve performance?
I'm thinking that if I made some changes to the copy activity settings I could improve performance, but I have never played around with these settings before, so any suggestions are most welcome. My linked service is an Azure Synapse linked service.
From the output window, we can see that almost all the wait time was "Time to first byte", which means your SQL server is slow to reply: it takes ~22 minutes for fewer than 90K rows, so changes on the ADF side will not help.
If your query is a simple "select * from table", then maybe your SQL server is low on resources. You can check that in your database portal in Azure. Try adding more resources and see if copy times improve.
If this is a query over a view or some other complicated query, maybe it needs some improvement (indexes, better code). You can test that by writing the query result to a table in your SQL database, using that table as the Data Factory source, and seeing if this improves the copy time.
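A minimal sketch of that test, with hypothetical object names:

-- Materialize the complex view once so the copy activity reads a plain table
SELECT *
INTO dbo.MyView_Staging
FROM dbo.MyComplexView;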
Quick check: are the Azure SQL database and the storage account in the same region? Also, I see that your copy activity has parallelism set to 1; you can play with that number and see if it helps (see the example below).
To set up parallelism, please read here: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance-features#parallel-copy
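For example, with the dynamic range partition option the source query carries a placeholder that ADF replaces with a range predicate for each parallel copy (the table name is hypothetical):

-- ADF substitutes the placeholder below once per parallel partition
SELECT * FROM dbo.SourceTable WHERE ?AdfDynamicRangePartitionCondition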

How to troubleshoot suspended queries in Azure Synapse?

Currently, I am encountering an issue with suspended queries in Azure Synapse when executing stored procedure calls from ADF.
I also followed the suggestions in the link below for troubleshooting the issue:
(link removed due to sensitive information)
The troubleshooting queries returned the following:
I checked whether transaction locking was the issue by killing a few suspended or running queries that had been running for more than 15 hours. I also checked the rest of the running queries, but nothing there would cause a transaction lock. I tried running the stored procedure manually from Azure Data Studio (the one that gets blocked, as mentioned above), and it took 40 seconds to complete.
Meanwhile, the same query suspended from ADF took nearly an hour to finish.
Any suggestion to troubleshoot this issue is much appreciated.
Thanks
There are a number of factors you should always consider when tuning queries in Azure Synapse Analytics dedicated SQL pools:
DWU - what DWU is your pool at? Lower DWUs mean fewer concurrent queries and lower performance, and a low setting is no basis for performance tuning. Crank it up temporarily to rule this out as the problem, bearing in mind that changing it disconnects any active queries. Also bear in mind that not all queries respond to a higher DWU.
Resource class - what resource class is associated with the user executing these queries? Remember the default is smallrc, and the admin user always has smallrc. Understand static and dynamic resource classes. The DMV sys.dm_pdw_exec_requests will give you useful information on this (see the sketch after this list). Experiment with your workload to find the sweet spot between performance and concurrency versus resource class. Encourage your dev team to use labels in their queries: OPTION ( LABEL = 'some informative label' )
Table geometry - this is the distribution (ROUND_ROBIN | HASH | REPLICATE) of your table and the indexing choice (CLUSTERED COLUMNSTORE | CLUSTERED INDEX | HEAP). Clustered columnstore and round robin are the defaults, but they are not always appropriate. Consider what is appropriate for each of your tables.
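A minimal sketch of these checks as T-SQL (pool, user, and table names are all hypothetical):

-- 1. Temporarily scale the pool up to rule out DWU as the bottleneck (run in master)
ALTER DATABASE MySynapsePool MODIFY (SERVICE_OBJECTIVE = 'DW1000c');

-- 2. Give the loading user a bigger static resource class, then watch running requests
EXEC sp_addrolemember 'largerc', 'LoaderUser';
SELECT request_id, [label], resource_class, [status], total_elapsed_time
FROM sys.dm_pdw_exec_requests
WHERE [status] NOT IN ('Completed', 'Failed', 'Cancelled')
ORDER BY submit_time DESC;

-- 3. Make distribution and indexing explicit instead of relying on the defaults
CREATE TABLE dbo.FactSales
WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM dbo.FactSales_Stage;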
If you work through those and still have an issue, you can start to look at statistics and workload classification for starters, but gathering information on the points above should give you a good idea.
If you are just doing single-value INSERTs, then don't: dedicated SQL pools are terrible at these. Convert them to a load from a file in a single INSERT / COPY INTO, as sketched below.
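A hedged sketch of such a set-based load (the storage path and table name are hypothetical):

-- Load staged files in one set-based operation instead of row-by-row INSERTs
COPY INTO dbo.FactSales_Stage
FROM 'https://myaccount.blob.core.windows.net/loads/sales/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    FIRSTROW = 2  -- skip the header row
    -- depending on how the storage account is secured, a CREDENTIAL clause may also be required
);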

How to analyze poor performance from Azure PostgreSQL PaaS

I'm experiencing poor performance from Azure PostgreSQL PaaS and need help with how to proceed.
I'm trying out Azure PostgreSQL PaaS in a project, and I'm experiencing intolerable performance from the database (or at least it seems like the database is the problem).
Our application runs in an Azure VM, and both the VM and the database are located in western Europe.
The network between the VM and the database seems to perform OK (using psping from Sysinternals against database port 5432, I get latencies between 2 ms and 4 ms).
PostgreSQL ships with a benchmark tool called pgbench. This tool runs a sequence of simple SQL statements against a test dataset and reports timings.
I ran pgbench on the VM against the Azure database; pgbench reports latencies between 800 ms and 1600 ms.
If I run the same pgbench test in-house on our local network against an in-house database, I typically get latencies below 10 ms.
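One way to separate per-statement round-trip latency from real query cost, assuming psql is available on the VM, is to time a statement that does no work, since its elapsed time is then almost pure network and gateway overhead:

\timing on
SELECT 1;  -- with \timing on, the reported time is essentially one round trip to the server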
I tried contacting Microsoft support about this, but I've basically been told that since the network seems to perform OK, this must be a PostgreSQL software problem and not related to Microsoft.
Since the database is PostgreSQL PaaS, I only have limited access to logs and metrics.
Can anyone please help or advise me on how to proceed?
Performance of the Azure PostgreSQL PaaS offering depends on several server and client configuration choices, including the provisioned SKU and storage IOPS. Microsoft engineering has published a series of performance blog posts that help customers gain measurable, empirical improvements for their workload by following these steps. Please review these blog posts:
Performance best practices for Azure PostgreSQL
Performance tuning of Azure PostgreSQL
Performance quick tips for Azure PostgreSQL
Is your in-house Postgres set up similarly to the setup in Azure?
I had the same issue. We moved from a dedicated VM (Ubuntu, size Standard B2s, 2 vCPUs, 4 GiB memory, ~35 € p.m.) running PostgreSQL to the Azure managed PostgreSQL instance (General Purpose, Single Server, 2 vCPUs, 10 GB memory, ~130 € p.m.).
I first noticed the bad performance when the main API request of our web application suddenly took 3 s instead of 1.7–2 s.
I ran some very simple timing tests on my old setup with dedicated VM:
select count(*) from mytable;
count
-------
4686
Time: 0.940 ms
And those are the timings of the new setup with Azure managed PostgreSQL:
select count(*) from mytable;
count
-------
4686
Time: 21.353 ms
I think I do not have to explain these numbers :)
I created a support ticket and got some insights:
"In Azure PostgreSQL single server, we have a gateway to manage and route connections and there are always 3 copies of the data to ensure your data is not lost, and all of this will create latency."
I also asked what the benefits are of the managed database:
A: "Being an instance running on Azure, you benefit from:
- Automatic patching; your instance is automatically upgraded.
- Crash recovery: in case our system detects the instance is not running, it tries to perform a restart/switchover to a new host. If all this fails, an on-call engineer is activated to manually restore the instance.
- Automatic backups and one-click point-in-time restore.
- Redundancy of data."
They suggested that I switch from Single Server to Flexible Server, where the gateway is dropped and the performance apparently should be better, but still not as good as on a dedicated VM:
"In several tests we’ve made, the performance comparing to single server is much better. But to setup the right expectactions, you will not get 1 to 1 performance as having PostgreSQL running in a dedicated virtual machine."
I asked for the results of those tests; I will post them here as soon as I get them.
I think you have to decide whether the benefits mentioned above are worth paying at least four times as much as for a dedicated VM, and whether you can live with the worse performance. We will now switch back to a master/slave configuration with two dedicated VMs.

Related to speed of execution of a job in Amazon Elastic MapReduce

My task is:
1) Initially, import the data from MS SQL Server into HDFS using Sqoop.
2) Process the data through Hive and generate the result in one table.
3) Export that result table from Hive back to MS SQL Server.
I want to perform all of this using Amazon Elastic MapReduce.
The data I am importing from MS SQL Server is very large (about 500,000 entries in one table, and I have 30 such tables). For this I have written a task in Hive that contains only queries (and each query uses a lot of joins). Because of this, the performance is very poor on my single local machine (it takes about 3 hours to execute completely).
I want to reduce that time as much as possible, which is why we decided to use Amazon Elastic MapReduce. Currently I am using 3 m1.large instances, yet I get the same performance as on my local machine.
How many instances do I need to use to improve performance?
Are instances configured automatically as I add them, or do I need to specify something when submitting the JAR for execution? I ask because the time is the same when I use two machines.
Also, is there any other way to improve performance besides increasing the number of instances? Or am I doing something wrong when executing the JAR?
Please guide me through this, as I don't know much about the Amazon servers.
Thanks.
You could try Ganglia, which can be installed on your EMR cluster using a bootstrap action. It will give you metrics on the performance of each node in the cluster and may help you optimize toward the right-sized cluster:
http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_Ganglia.html
If you use the EMR Ruby client on your local machine, you can set up an SSH tunnel that lets you view the Ganglia web interface in Firefox (you'll also need to set up FoxyProxy, as described here: http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/emr-connect-master-node-foxy-proxy.html)
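Separately, adding nodes only helps if the Hive job actually fans out across them. Some Hive session settings worth experimenting with (the values here are illustrative, not tuned recommendations):

-- Let independent stages of a multi-join query run concurrently
SET hive.exec.parallel=true;
SET hive.exec.parallel.thread.number=8;
-- Scale the number of reducers with cluster size instead of leaving the default
SET mapred.reduce.tasks=12;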

Is having multiple data/log files a good thing even on the same LUN?

I have read that it is a good idea to have one file per CPU core so that SQL Server can stream data to and from the disks more efficiently. OK, I can see the benefit if they are on different spindles, but what if I only have one spindle (4 drives in RAID 10) for my data files (.mdf and .ndf)? Will I still benefit from splitting the data files (from just the .mdf file into an .mdf and several .ndf files)? The same goes for the log file, although I see no benefit there, as the data has to be written serially and you're limited by the spindle's sequential write speed...
FYI, this is in regards to SQL Server 2005/2008...
Thanks.
The recommendation for multiple tempdb data files is definitely not about IOPS; it is about contention on the allocation pages (GAM, SGAM, PFS) in tempdb. SQL 2005+ doesn't put as big a load on these pages, but contention still occurs. Not all systems require a 1 file per core mapping; most systems will perform well with 1 file per 2 or 4 cores, and having too many files adds overhead for managing them. A good approach is to start with 1:4 or 1:2 and increase if contention continues. Don't go above 1:1.
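A quick way to check whether that contention is actually happening: PFS, GAM, and SGAM live at pages 1, 2, and 3 of each tempdb data file, so allocation contention shows up as PAGELATCH waits on those pages:

-- Sessions currently waiting on pages in tempdb (database_id 2)
SELECT session_id, wait_type, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH_%'
  AND resource_description LIKE '2:%';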
For other (user) databases, this is not recommended.
And yes, only one log file... always.
8 Steps to better Transaction Log throughput:
Create only ONE transaction log file. Even though you can create multiple transaction log files, you only need one... SQL Server does NOT "stripe" across multiple transaction log files. Instead, SQL Server uses the transaction log files sequentially.
Misconceptions around TF 1118:
Why is the trace flag not required so much in 2005 and 2008? In SQL Server 2005, my team changed the allocation system for tempdb to reduce the possibility of contention. There is now a cache of temp tables. When a new temp table is created on a cold system (just after startup), it uses the same mechanism as for SQL 2000. When it is dropped, though, instead of all the pages being deallocated completely, one IAM page and one data page are left allocated, and the temp table is put into a special cache. Subsequent temp table creations will look in the cache to see if they can just grab a pre-created temp table 'off the shelf'. If so, this avoids accessing the allocation bitmaps completely. The temp table cache isn't huge (I think it's 32 tables), but this can still lead to a big drop in latch contention in tempdb.
So the answer is NO to both questions. Log striping was never an issue, and one NDF per CPU is largely a myth, one that will take a very long time to die out. Multiple files IMHO make sense only if you can stripe IO (separate LUNs). Multiple filegroups, though, do make sense, but not for IO reasons: for administrative purposes, namely piecemeal restores and read-only archive filegroups.
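For instance, a read-only archive filegroup (names and paths are hypothetical) looks like this:

-- Move old data to its own filegroup, then freeze it for piecemeal restore
ALTER DATABASE Sales ADD FILEGROUP Archive2019;
ALTER DATABASE Sales ADD FILE (
    NAME = N'Sales_Archive2019',
    FILENAME = N'E:\SQLData\Sales_Archive2019.ndf'
) TO FILEGROUP Archive2019;
ALTER DATABASE Sales MODIFY FILEGROUP Archive2019 READ_ONLY;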
Still good. This is not about IOPS; it is about SQL Server BLOCKING a file for certain operations, mostly when file extents are allocated to a table / index. If you do a lot of inserts / updates, multiple files mean another thread can allocate from a second file instead of waiting on the first one.
So this is not really about IOPS load; it is about blocking behavior.