Why is running a query on SQL Azure so much slower?

I created a trial account on Azure, and I deployed my database from SmarterAsp.
When I run a pivot query on SmarterAsp\MyDatabase, the results appear in 2 seconds.
Running the same query on Azure\MyDatabase, however, takes 94 seconds.
I use SQL Server 2014 Management Studio (trial) to connect to both servers and run the queries.
Is this difference in speed because my account is a trial account?
Some related info for my question:
The query is:
ALTER procedure [dbo].[Pivot_Per_Day]
    @iyear int,
    @imonth int,
    @iddepartment int
as

declare @columnName nvarchar(max) = ''
declare @sql nvarchar(max) = ''

select @columnName += quotename(iDay) + ','
from (
    select day(idate) as iDay
    from kpivalues
    where year(idate) = @iyear and month(idate) = @imonth
    group by idate
) x

set @columnName = left(@columnName, len(@columnName) - 1)

set @sql = '
Select * from (
    select kpiname, target, ivalues, convert(decimal(18,2), day(idate)) as iDay
    from kpi
    inner join kpivalues on kpivalues.idkpi = kpi.idkpi
    inner join kpitarget on kpitarget.idkpi = kpi.idkpi
    inner join departmentbscs on departmentbscs.idkpi = kpi.idkpi
    where iddepartment = ' + convert(nvarchar(max), @iddepartment) + '
    group by kpiname, target, ivalues, idate
) x
pivot
(
    avg(ivalues)
    for iDay in (' + @columnName + ')
) p'

execute sp_executesql @sql
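(For reference, the procedure is called along these lines; the parameter values below are just example placeholders.)
exec dbo.Pivot_Per_Day @iyear = 2014, @imonth = 5, @iddepartment = 1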
Running this query on 3 different servers gave me very different elapsed times until the pivot table appeared on screen:
Azure - Elapsed time = 100.165 sec
Smarterasp.net - Elapsed time = 2.449 sec
LocalServer - Elapsed time = 1.716 sec
I created the trial account on Azure mainly to check whether I would get better speed than SmarterAsp when running stored procedures like the one above.
For the database I chose Service Tier: Basic, Performance Level: Basic (5 DTUs), and Max Size: 2 GB.
My database has 16 tables, one of which has 145,284 rows, and the database size is 11 MB. It's a test database for my app.
My questions are:
What can I do to optimize this query (stored procedure)?
Is Azure recommended for small databases (100 MB-1 GB)? I mean performance vs. cost!
Conclusions based on your inputs:
I made the suggested changes to the query and performance improved by more than 50% - thank you, Remus.
I tested the updated query on Azure S2 and the elapsed time was 11 seconds.
I tested it again on P1 and the elapsed time was 0.5 seconds :)
The same updated query on SmarterASP had an elapsed time of 0.8 seconds.
Now it's clear to me what the Azure tiers are and how important it is to have a well-written query (I even understood what an index is and its advantages/disadvantages).
Thank you all,
Lucian

This is first and foremost a question of performance. You are dealing with poorly performing code on your part and you must identify the bottleneck and address it. I'm talking about the bad 2-second performance now. Follow the guidelines at How to analyse SQL Server performance. Once you get this query to execute locally in a time acceptable for a web app (less than 5 ms), then you can ask the question of porting it to Azure SQL DB. Right now your trial account is only highlighting the existing inefficiencies.
After update
...
@iddepartment int
...
iddepartment=' + convert(nvarchar(max), @iddepartment) + '
...
So what is it? Is the iddepartment column an int or an nvarchar? And why use (max)?
Here is what you should do:
parameterize @iddepartment in the inner dynamic SQL
stop doing the nvarchar(max) conversion; make the iddepartment column and @iddepartment parameter types match
ensure there are indexes on iddepartment and all the idkpi columns
Here is how to parameterize the inner SQL:
set @sql = N'
Select * from (
    select kpiname, target, ivalues, convert(decimal(18,2), day(idate)) as iDay
    from kpi
    inner join kpivalues on kpivalues.idkpi = kpi.idkpi
    inner join kpitarget on kpitarget.idkpi = kpi.idkpi
    inner join departmentbscs on departmentbscs.idkpi = kpi.idkpi
    where iddepartment = @iddepartment
    group by kpiname, target, ivalues, idate
) x
pivot
(
    avg(ivalues)
    for iDay in (' + @columnName + N')
) p'

execute sp_executesql @sql, N'@iddepartment INT', @iddepartment;
Covering indexes are, by far, the most important fix. That obviously requires more info than is present here. Read Designing Indexes, including all sub-chapters.
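As a sketch only (the key and included columns below are my assumptions and need to be validated against the real schema and the query plans), the indexes suggested above might look something like this:
-- index names and column choices are assumptions; verify against the actual workload
CREATE INDEX IX_kpivalues_idkpi ON dbo.kpivalues (idkpi) INCLUDE (idate, ivalues);
CREATE INDEX IX_kpitarget_idkpi ON dbo.kpitarget (idkpi) INCLUDE (target);
CREATE INDEX IX_departmentbscs_iddepartment ON dbo.departmentbscs (iddepartment, idkpi);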
As a more general comment: this sort of query befits columnstores more than rowstores, although I reckon the data size is, basically, tiny. Azure SQL DB supports updateable clustered columnstore indexes, so you can experiment with them in anticipation of serious data sizes. They do require Enterprise/Developer edition on the local box, true.
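A minimal sketch of that experiment, assuming you are willing to replace whatever clustered index kpivalues currently has:
-- switch the fact table from the question to columnstore storage
CREATE CLUSTERED COLUMNSTORE INDEX cci_kpivalues ON dbo.kpivalues;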

(Update: the original question has been changed to also ask how to optimise the query, which is a good question as well. The original question was why the difference exists, which is what this answer is about.)
The performance of individual queries is heavily affected by the performance tiers. I know the documentation implies the tiers are about load; that is not strictly true.
I would re-run your test with an S2 database as a starting point and go from there.
Being on a trial subscription does not in itself affect performance, but with the free account you are probably using a B level, which isn't really usable for anything real - certainly not for a query that takes 2 seconds to run locally.
Even moving between, say, S1 and S2 will show a noticeable difference in performance of an individual query.
If you want to experiment, do remember you are charged a day for "any part of a day", which is probably okay for S level but be careful when testing P level.
For background: when Azure introduced the new tiers last year, they changed the hosting model for SQL. It used to be that many databases would run on a shared sqlserver.exe. In the new model, each database effectively gets its own sqlserver.exe that runs in a resource-constrained sandbox. That is how they control "DTU usage", but it also affects general performance.
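If you want to see whether the query is actually hitting the tier's resource caps, Azure SQL DB exposes the sys.dm_db_resource_stats view; a quick check along these lines shows recent usage as a percentage of the tier's limits (roughly the last hour, in 15-second intervals):
SELECT TOP (20) end_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;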

It has nothing to do with the fact that your account is a trial; it's due to the low performance level you have selected.
With the other service (SmarterAsp) and your local instance, you probably have size restrictions rather than performance restrictions.
At this point it's hard to pin down what a DTU actually means, or what DTU number would correspond to a SQL Server installed on your local machine or at any other hosting provider.
However, there is some good analysis of this (https://cbailiss.wordpress.com/2014/09/16/performance-in-new-azure-sql-database-performance-tiers/), though nothing official.

Related

How to fix MS SQL linked server slowness issue for MariaDB

I'm using a SQL linked server to fetch data from MariaDB.
But I'm facing a slowness issue when I query MariaDB through the linked server.
I used the scenarios below to fetch the results (the time taken by each query is noted).
Please suggest if you have any solutions.
Total number of rows in the patient table: 62,520
SELECT count(1) FROM [MariaDB]...[webimslt.Patient] -- 2.6 seconds
SELECT * FROM OPENQUERY([MariaDB], 'select count(1) from webimslt.patient') -- 47 ms
SELECT * FROM OPENQUERY([MariaDB], 'select * from webimslt.patient') -- 20 seconds
This isn’t really a fair comparison...
SELECT COUNT(1) is only returning a single number and will probably be using an index to count rows.
SELECT * is returning ALL data from the table.
Returning data is an expensive (slow) process, so it will obviously take time to return your data. Then there is the question of data transfer: are the servers connected using a high-speed connection? That is also a factor here. It will never be as fast to query over a linked server as it is to query your database directly.
How can you improve the speed? I would start by only returning the data you need, by specifying the columns and adding a WHERE clause. After that, you can probably use indexes in MariaDB to try to speed things up.
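For example (the column names and filter below are made up purely for illustration), pushing the column list and a WHERE clause into the remote query keeps the filtering on the MariaDB side and cuts down the data sent over the wire:
SELECT *
FROM OPENQUERY([MariaDB],
    'SELECT patient_id, first_name, last_name
     FROM webimslt.patient
     WHERE is_active = 1');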

Remove query from Query Store report

I've just started using the Query Store in SQL Server 2016, and it's very useful indeed.
I have a problem in that there are a number of services on the server monitoring Service Broker queues, and as a result their WAITFOR statements always appear as the longest-running queries in the reports.
This in itself is not an issue, but they run for so long that they skew the duration axis on the report so that all the other queries are hardly visible.
Is there any way to get Query Store to ignore a query so it doesn't show up on the report?
Try using sp_query_store_remove_query. It removes the query, as well as all associated plans and runtime stats from the query store.
DECLARE @QueryStoreRemoveCommand VARCHAR(MAX)

SELECT @QueryStoreRemoveCommand = COALESCE(@QueryStoreRemoveCommand +
           '; EXEC sp_query_store_remove_query ',
           'EXEC sp_query_store_remove_query ')
       + CONVERT(NVARCHAR, QueryData.query_id)
FROM
    (SELECT Qry.query_id
     FROM sys.query_store_plan AS Pl
     JOIN sys.query_store_query AS Qry
         ON Pl.query_id = Qry.query_id
     JOIN sys.query_store_query_text AS Txt
         ON Qry.query_text_id = Txt.query_text_id
     WHERE UPPER(Txt.query_sql_text) LIKE '%WAITFOR DELAY%') QueryData

PRINT @QueryStoreRemoveCommand
EXECUTE (@QueryStoreRemoveCommand)
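If you only want to drop one specific query rather than everything matching a text pattern, the same procedure can be called directly; the query_id below is a placeholder you would first look up in sys.query_store_query:
EXEC sp_query_store_remove_query @query_id = 42; -- 42 is a placeholder query_id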
Good question!
I didn't find how to remove a single query (which would be useful), but I found how to clear the cache so that you can start over. That way, if you have an old query that changed, you can reset the cache and get fresh data.
ALTER DATABASE WideWorldImporters SET QUERY_STORE CLEAR;

Transferring tables with an "insert from location" statement in Sybase IQ is very slow

I am trying to transfer several tables from a Sybase IQ database on one machine, to the same database on another machine (exact same schema and table layout etc).
To do this I'm using an insert from location statement:
insert into <local table> location <other machine> select * from mytablex
This works fine, but the problem is that it is desperately slow. I have a 1 gigabit connection between both machines, but the transfer rate is nowhere near that.
With a 1 gigabyte test file, it takes only 1 or 2 minutes to transfer it via ftp (just as a file, nothing to do with IQ).
But I am only managing 100 gigabytes over 24 hours in IQ. That works out to more like 14 or 15 minutes per gigabyte for data going through Sybase IQ.
Is there any way I can speed this up?
I saw there is an option to change the packet size, but would that make a difference? Surely if the transfer is 7 times faster for a file the packet size can't be that much of a factor?
Thanks! :)
It appears from the documentation here and here that using insert into is a row-by-row operation, not a bulk operation. This could explain the performance issues you are seeing.
You may want to look at the bulk loading LOAD TABLE operation instead.
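As a rough sketch of what a LOAD TABLE based approach might look like (file path, delimiter, and column list are placeholders, and the exact options should be checked against the LOAD TABLE documentation for your IQ version):
-- extract the data to a flat file on the source machine, copy it across, then bulk load:
LOAD TABLE mytablex
    ( col1, col2, col3 )
FROM '/data/mytablex.dat'
FORMAT ASCII
DELIMITED BY '|'
ROW DELIMITED BY '\n'
ESCAPES OFF
QUOTES OFF;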
If I recall correctly, IQ 15.x has known bugs where packetsize is effectively ignored for insert...location...select and the default 512 is always used.
insert...location...select is typically a bulk TDS operation; however, we have found it to have limited value when working with gigabytes of data, and we built a process around extract/LOAD TABLE that is significantly faster.
I know it's not the answer you want, but performance appears to degrade as the data size grows. Some tables will actually never finish, if they are large enough.
Just a thought: you might want to specify the exact columns and wrap it in an exec with dynamic SQL. Dynamic SQL is a no-no, but if you need the proc executable in dev/qa + prod environments, there really isn't another option. I'm assuming this will be called in a controlled environment anyway, but here's what I mean:
declare @cmd varchar(2500), @location varchar(255)

set @location = 'SOMEDEVHOST.database_name'
set @cmd = 'insert localtablename (col1, col2, coln...) ' +
           'location ' + '''' + trim(@location) + '''' +
           ' { select col1, col2, coln... from remotetablename }'
select @cmd
execute(@cmd)
go

Is there a "Code Coverage" equivalent for SQL databases?

I have a database with many tables that get used, and many tables that are no longer used. While I could sort through each table manually to see if it is still in use, that would be a cumbersome task. Is there any software or hidden feature that can be used on a SQL Server/Oracle database that would return information like "Tables x, y, z have not been used in the past month" or "Tables a, b, c have been used 17 times today"? Or possibly a way to sort tables by "date last modified/selected from"?
Or is there a better way to go about doing this? Thanks
edit: I found a "modify_date" column when executing "SELECT * FROM sys.tables ORDER BY modify_date desc", but this seems to only keep track of modifications to the table's structure, not its contents.
Replace spt_values with the table name you are interested in; the query will give you the last time it was used and what it was used by.
From here: Finding Out How Many Times A Table Is Being Used In Ad Hoc Or Procedure Calls In SQL Server 2005 And 2008
SELECT * FROM(SELECT COALESCE(OBJECT_NAME(s2.objectid),'Ad-Hoc') AS ProcName, execution_count,
    (SELECT TOP 1 SUBSTRING(s2.TEXT, statement_start_offset / 2 + 1,
        ((CASE WHEN statement_end_offset = -1
               THEN (LEN(CONVERT(NVARCHAR(MAX), s2.TEXT)) * 2)
               ELSE statement_end_offset END) - statement_start_offset) / 2 + 1)) AS sql_statement,
    last_execution_time
    FROM sys.dm_exec_query_stats AS s1
    CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS s2) x
WHERE sql_statement LIKE '%spt_values%' -- replace here
  AND sql_statement NOT LIKE 'SELECT * FROM(SELECT coalesce(object_name(s2.objectid)%'
ORDER BY execution_count DESC
Keep in mind that if you restart the box, this will be cleared out
In Oracle you can use ASH (Active Session History) to find info about SQL that was used. You can also perform code-coverage tests with the hierarchical profiler, which shows which parts of your stored procedures are used or not used.
If you wonder about updates to table data, you can also use DBA_TAB_MODIFICATIONS. This shows how many inserts, updates, and deletes are done on a table or table partition. As soon as new object statistics are generated, the row for the specified table is removed from DBA_TAB_MODIFICATIONS. You still have help here, since you can also take a peek at the table statistics history. This does not show anything about tables that are only queried. If you really need to know about those, you have to use ASH.
Note that for both ASH and statistics history access, you need the Diagnostics or Tuning Pack license. (Normally you would want this anyway.)
If you use triggers you can detect updates, inserts, and deletes on a table.
Detecting reads (access) is probably more difficult.
I use a combination of static analysis in the metadata to determine tables/columns which have no dependencies and runtime traces in SQL Server to see what activity is happening.
Some more queries that might be useful for you.
select * from sys.dm_db_index_usage_stats
select * from sys.dm_db_index_operational_stats(db_id(),NULL,NULL,NULL)
select * from sys.sql_expression_dependencies /*SQL Server 2008 only*/
The difference between what the first two DMVs report is explained well in this blog post.
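To answer the "which tables have not been touched" question directly, the first DMV above can be aggregated per table, roughly like this (a sketch; remember the counters reset when the instance restarts):
SELECT OBJECT_NAME(us.object_id) AS table_name,
       MAX(us.last_user_seek)    AS last_seek,
       MAX(us.last_user_scan)    AS last_scan,
       MAX(us.last_user_lookup)  AS last_lookup,
       MAX(us.last_user_update)  AS last_update
FROM sys.dm_db_index_usage_stats AS us
WHERE us.database_id = DB_ID()
GROUP BY us.object_id
ORDER BY table_name;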
Ed Elliott's open source tool, SQL Cover, is a good bet and has built-in support for the popular unit testing tool, tSQLt.

Report on SQL/SSRS 2k5 takes > 10 minutes, query < 3 mins

We have SQL Server and SSRS 2k5 on a Win 2k3 virtual server with 4 GB assigned to the virtual server. (The host running the virtual server has > 32 GB.)
When we run our comparison report, it calls a stored proc on database A. The proc pulls data from several tables, and from a view on database B.
If I run Profiler and monitor the calls, I see activity
SQL:BatchStarting
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation'),
       COLLATIONPROPERTY(CONVERT(char, DATABASEPROPERTYEX(DB_NAME(), 'collation')), 'LCID')
Then I wait several minutes until the actual call of the proc shows up:
RPC:Completed
exec sp_executesql N'exec [procGetLicenseSales_ALS_Voucher]
        @CurrentLicenseYear, @CurrentStartDate, @CurrentEndDate,
        ''Fishing License'', @PreviousLicenseYear, @OpenLicenseAccounts',
    N'@CurrentStartDate datetime, @CurrentEndDate datetime,
      @CurrentLicenseYear int, @PreviousLicenseYear int,
      @OpenLicenseAccounts nvarchar(4000)',
    @CurrentStartDate = '2010-11-01 00:00:00:000',
    @CurrentEndDate = '2010-11-30 00:00:00:000',
    @CurrentLicenseYear = 2010,
    @PreviousLicenseYear = 2009,
    @OpenLicenseAccounts = NULL
Then more time passes, and usually the report times out. It takes about 20 minutes if I let it run in Designer.
This report had been working, albeit slowly but still in under 10 minutes, for months.
If I drop the query (captured from profiler) into SQL Server Management Studio, it takes 2 minutes, 8 seconds to run.
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Something has obviously changed, but what change broke the report? How can I test to find out why the SSRS part is taking forever and timing out, but the query runs in about 2 minutes?
Added: Please note, the stored proc returns 18 rows... any time. (We only have 18 products to track.)
The report takes those 18 rows, and groups them and does some sums. No matrix, only one page, very simple.
M Kenyon II
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Ensure that all indexes survived the changes to Database B. If they still exist, check how fragmented they are and reorganize or rebuild as necessary.
Indexes can have a huge impact on performance.
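A quick way to check that on Database B is sys.dm_db_index_physical_stats; a sketch along these lines lists the most fragmented indexes (the 30% and 100-page cutoffs are just common rules of thumb):
-- list fragmented indexes in the current database; LIMITED mode is the cheapest scan
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name                    AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.avg_fragmentation_in_percent > 30
  AND ps.page_count > 100
ORDER BY ps.avg_fragmentation_in_percent DESC;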
As far as the report taking far longer to run than your query, there can be many reasons for this. Some tricks for getting SSRS to run faster can be found here:
http://www.sqlservercentral.com/Forums/Topic859015-150-1.aspx
Edit:
Here's the relevant information from the link above.
AshMc
I recall some time ago we had the same issue, where we were passing parameters from SSRS into a SQL dataset and it would slow it all down compared to doing it in SSMS (minutes compared to seconds, like your issue). It appeared that when SSRS passed in the parameter it was possibly recalculating the value each time rather than storing it once.
What I did was declare a new T-SQL variable first within the dataset, set it equal to the SSRS parameter, and then use the new variable like I would in SSMS.
e.g.:
DECLARE @X as int
SET @X = @SSRSParameter
janavarr
Thanks AshMc, this one worked for me. However my issue now is that it will only work with a single parameter and the query won’t run if I want to pass multiple parameter values.
...
AshMc
I was able to find how I did this previously. I created a temp table, placed the values that we wanted to filter on in it, and then did an inner join from the main query to it. We only use the SSRS parameters as a filter on what to put in the temp table.
Doing it this way saved a lot of report run time.
DECLARE @ParameterList TABLE (ValueA varchar(20))

INSERT INTO @ParameterList
select ValueA
from TableA
where ValueA = @ValueB

-- then, in the main query, join to the table variable:
INNER JOIN @ParameterList
    ON ValueC = ValueA
Hope this helps,
--Dubs
Could be parameter sniffing. If you've changed some data or some of the tables, then the cached plan that satisfied the sp for the old data model may no longer be valid.
Answered a very similar thing here:
stored procedure performance issue
Quote:
If you are sure that the SQL is exactly the same and that the params are the same, then you could be experiencing a parameter sniffing problem.
It's a pretty uncommon problem. I've only had it happen to me once and since then I've always coded away the problem.
Start here for a quick overview of the problem:
http://blogs.msdn.com/b/queryoptteam/archive/2006/03/31/565991.aspx
http://elegantcode.com/2008/05/17/sql-parameter-sniffing-and-what-to-do-about-it/
Try declaring some local variables inside the sp and assigning the values of the parameters to them. Then use the local variables in place of the params.
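A minimal sketch of that pattern (procedure, table, and column names below are made up for illustration):
ALTER PROCEDURE dbo.SomeProc -- placeholder names throughout
    @StartDate datetime,
    @EndDate   datetime
AS
BEGIN
    -- copy the parameters into local variables; the optimizer cannot sniff
    -- local variables, so the plan is compiled for average value distributions
    DECLARE @LocalStart datetime, @LocalEnd datetime
    SET @LocalStart = @StartDate
    SET @LocalEnd = @EndDate

    SELECT COUNT(*)
    FROM dbo.SomeTable
    WHERE SaleDate BETWEEN @LocalStart AND @LocalEnd
END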
It's a feature not a bug but it makes you go #"$#