I've just started using Query Store in SQL Server 2016, and it's very useful indeed.
I have a problem in that on the server there are a number of services monitoring Service Broker queues, and as a result their WAITFOR statements always appear as the longest-running queries in the reports.
This in itself is not an issue, but they run for so long that they skew the duration axis of the report, so that all the other queries are hardly visible.
Is there any way to get Query Store to ignore a query so it doesn't show up on the report?
Try using sp_query_store_remove_query. It removes the query, as well as all associated plans and runtime stats from the query store.
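For a single query it's just one call, e.g. (the query_id value here is a placeholder you would look up in sys.query_store_query):
EXEC sp_query_store_remove_query @query_id = 42;
The script below builds and runs that call for every query whose text contains WAITFOR DELAY: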
DECLARE @QueryStoreRemoveCommand NVARCHAR(MAX);

SELECT @QueryStoreRemoveCommand = COALESCE(@QueryStoreRemoveCommand
           + '; EXEC sp_query_store_remove_query ',
           'EXEC sp_query_store_remove_query ')
       + CONVERT(NVARCHAR(20), QueryData.query_id)
FROM
    (SELECT Qry.query_id
     FROM sys.query_store_plan AS Pl
     JOIN sys.query_store_query AS Qry
         ON Pl.query_id = Qry.query_id
     JOIN sys.query_store_query_text AS Txt
         ON Qry.query_text_id = Txt.query_text_id
     WHERE UPPER(Txt.query_sql_text) LIKE '%WAITFOR DELAY%') AS QueryData;

PRINT @QueryStoreRemoveCommand;
EXECUTE (@QueryStoreRemoveCommand);
Good question!
I didn't find a way to remove a single query (which would be useful), but I did find how to clear the cache so that you can start over. That way, if you have an old query that changed, you can reset the cache and get fresh data.
ALTER DATABASE WideWorldImporters SET QUERY_STORE CLEAR;
I am working on building an application to assist with dynamically taking data from different sources: files and emails (usually CSV and Excel), APIs, and other SQL databases. The data is processed and moved to a central SQL server. All the tables are uploaded to the main SQL server and processed to insert new rows into the destination table and update rows where changed data is available. The main SQL server is a Microsoft SQL Server.
When the data is uploaded to the main server for comparison, it is stored in a temporary table, which is dropped after the comparison is done. The statement I am using is created dynamically by the program so that it can adapt to different datasets. What I have been using is a NOT EXISTS comparison, which, when I run it on a table with 380k+ rows of data, has been taking 2+ hours to process. I have also tried EXCEPT; however, I am unable to use it because some of the tables contain text columns, which can't be used in an EXCEPT statement. The datasets uploaded to the server are written to and read from at different intervals, based on the schedules built into the program.
I am looking for a more efficient approach, or any improvements I could make, to bring down the run time for this table. The program that manages the process runs on a different server from the SQL instance, which runs on part of the organization's SQL farm. I am not very experienced with SQL, so I appreciate all the help I can get. Below I added links to the code and an example statement produced by the system when it runs the comparison.
C# Code: https://pastebin.com/8PeUvekG
SQL Statement: https://pastebin.com/zc9kshJw
INSERT INTO vewCovid19_SP
    (Street_Number, Street_Dir, Street_Name, Street_Type, Apt, Municipality, County, x_cord, y_cord, address, Event_Number, latitude, longitude, Test_Type, Created_On_Date, msg)
SELECT A.Street_Number, A.Street_Dir, A.Street_Name, A.Street_Type, A.Apt, A.Municipality, A.County, A.x_cord, A.y_cord, A.address, A.Event_Number, A.latitude, A.longitude, A.Test_Type, A.Created_On_Date, A.msg
FROM #TEMP_UPLOAD A
WHERE NOT EXISTS
    (SELECT * FROM vewCovid19_SP B
     WHERE ISNULL(CONVERT(VARCHAR, A.Street_Number), 'NULL') = ISNULL(CONVERT(VARCHAR, B.Street_Number), 'NULL')
       AND ISNULL(CONVERT(VARCHAR, A.Street_Dir), 'NULL') = ISNULL(CONVERT(VARCHAR, B.Street_Dir), 'NULL')
       AND ISNULL(CONVERT(VARCHAR, A.Apt), 'NULL') = ISNULL(CONVERT(VARCHAR, B.Apt), 'NULL')
       AND ISNULL(CONVERT(VARCHAR, A.Street_Name), 'NULL') = ISNULL(CONVERT(VARCHAR, B.Street_Name), 'NULL')
       AND ISNULL(CONVERT(VARCHAR, A.Street_Type), 'NULL') = ISNULL(CONVERT(VARCHAR, B.Street_Type), 'NULL'));

DROP TABLE #TEMP_UPLOAD;
One simple approach would be to load a new table from a UNION (which also de-duplicates), e.g.:
INSERT INTO vewCovid19_SP_new
SELECT *
FROM vewCovid19_SP
UNION
SELECT *
FROM #TEMP_UPLOAD
then swap the tables with ALTER TABLE ... SWITCH, or drop the old table and use sp_rename (see the sketch below).
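For illustration, a minimal sketch of the rename-based swap, assuming vewCovid19_SP is a plain table despite the "vew" prefix and that vewCovid19_SP_new has just been loaded as above:

BEGIN TRANSACTION;

-- Keep the old table around until the new data is verified, then drop it.
EXEC sp_rename 'vewCovid19_SP', 'vewCovid19_SP_old';

-- Promote the freshly loaded table to the production name.
EXEC sp_rename 'vewCovid19_SP_new', 'vewCovid19_SP';

COMMIT TRANSACTION;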
I created a trial account on Azure, and I deployed my database from SmarterAsp.
When I run a pivot query on SmarterAsp\MyDatabase, the results appeared in 2 seconds.
However, running the same query on Azure\MyDatabase took 94 seconds.
I use SQL Server 2014 Management Studio (trial) to connect to the servers and run the query.
Is this difference in speed because my account is a trial account?
Some related info for my question:
The query is:
ALTER procedure [dbo].[Pivot_Per_Day]
    @iyear int,
    @imonth int,
    @iddepartment int
as

declare @columnName nvarchar(max) = ''
declare @sql nvarchar(max) = ''

select @columnName += quotename(iDay) + ','
from (
    select day(idate) as iDay
    from kpivalues
    where year(idate) = @iyear and month(idate) = @imonth
    group by idate
) x

set @columnName = left(@columnName, len(@columnName) - 1)

set @sql = '
Select * from (
    select kpiname, target, ivalues, convert(decimal(18,2),day(idate)) as iDay
    from kpi
    inner join kpivalues on kpivalues.idkpi=kpi.idkpi
    inner join kpitarget on kpitarget.idkpi=kpi.idkpi
    inner join departmentbscs on departmentbscs.idkpi=kpi.idkpi
    where iddepartment=' + convert(nvarchar(max), @iddepartment) + '
    group by kpiname, target, ivalues, idate) x
pivot
(
    avg(ivalues)
    for iDay in (' + @columnName + ')
) p'

execute sp_executesql @sql
Running this query on 3 different servers gave me different results in terms of Elapsed time till my pivot table appear on the screen:
Azure - Elapsed time = 100.165 sec
Smarterasp.net - Elapsed time = 2.449 sec
LocalServer - Elapsed time = 1.716 sec
Regarding my trial account on Azure, I created it mainly to check whether I would get better speed than SmarterAsp when running stored procedures like the one above.
For my database I chose Service Tier: Basic, Performance Level: Basic (5 DTUs), and Max Size: 2 GB.
My database has 16 tables, one table has 145,284 rows, and the database size is 11 MB. It's a test database for my app.
My questions are:
What can I do to optimize this query (stored procedure)?
Is Azure recommended for small databases (100 MB-1 GB)? I mean performance vs. cost!
Conclusions based on your inputs:
I made the suggested changes to the query and the performance improved by more than 50% - thank you Remus.
I tested my query on Azure S2 and the elapsed time for the updated query was 11 seconds.
I tested the query again on P1 and the elapsed time was 0.5 seconds :)
The same updated query on SmarterASP had an elapsed time of 0.8 seconds.
Now it's clear to me what the tiers in Azure are and how important it is to have a well-written query (I even learned what an index is and its advantages/disadvantages).
Thank you all,
Lucian
This is first and foremost a question of performance. You are dealing with poorly performing code on your part, and you must identify the bottleneck and address it. I'm talking about the bad 2-second performance now. Follow the guidelines at How to analyse SQL Server performance. Once you get this query to execute locally in a time acceptable for a web app (less than 5 ms), then you can ask the question of porting it to Azure SQL DB. Right now your trial account is only highlighting the existing inefficiencies.
After update
...
@iddepartment int
...
where iddepartment='+convert(nvarchar(max),@iddepartment)+'
...
So what is it? Is the iddepartment column an int or an nvarchar? And why use (max)?
Here is what you should do:
parameterize @iddepartment in the inner dynamic SQL
stop doing the nvarchar(max) conversion. Make the iddepartment and @iddepartment types match
ensure indexes on iddepartment and all the idkpi columns
Here is how to parameterize the inner SQL:
set @sql = N'
Select * from (
    select kpiname, target, ivalues, convert(decimal(18,2),day(idate)) as iDay
    from kpi
    inner join kpivalues on kpivalues.idkpi=kpi.idkpi
    inner join kpitarget on kpitarget.idkpi=kpi.idkpi
    inner join departmentbscs on departmentbscs.idkpi=kpi.idkpi
    where iddepartment=@iddepartment
    group by kpiname, target, ivalues, idate) x
pivot
(
    avg(ivalues)
    for iDay in (' + @columnName + N')
) p'

execute sp_executesql @sql, N'@iddepartment INT', @iddepartment;
The covering indexes are, by far, the most important fix. That obviously requires more info than is present here. Read Designing Indexes, including all sub-chapters. A sketch of what such indexes might look like follows below.
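For illustration only, a sketch of the kind of indexes meant here, assuming iddepartment lives on departmentbscs and using the column names visible in the query above; the key and INCLUDE column choices are guesses that would need to be validated against the real schema and the actual plan:

-- Hypothetical covering indexes; adjust to the real schema.
CREATE NONCLUSTERED INDEX IX_departmentbscs_iddepartment
    ON departmentbscs (iddepartment)
    INCLUDE (idkpi);

CREATE NONCLUSTERED INDEX IX_kpivalues_idkpi
    ON kpivalues (idkpi)
    INCLUDE (idate, ivalues);

CREATE NONCLUSTERED INDEX IX_kpitarget_idkpi
    ON kpitarget (idkpi)
    INCLUDE (target);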
As a more general comment: this sort of query befits columnstores more than rowstores, although I reckon the data size is, basically, tiny. Azure SQL DB supports updateable clustered columnstore indexes; you can experiment with them in anticipation of serious data sizes. They do require Enterprise/Developer edition on the local box, true.
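If you do experiment with that, a minimal sketch, assuming kpivalues is the large fact-style table and has no conflicting clustered index (the index name is made up):

-- Hypothetical: store kpivalues as a clustered columnstore.
-- Any existing clustered index or clustered primary key must be dropped first.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_kpivalues
    ON kpivalues;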
(Update: the original question has been changed to also ask how to optimise the query, which is a good question as well. The original question was about why the difference exists, which is what this answer addresses.)
The performance of individual queries is heavily affected by the performance tiers. I know the documentation implies the tiers are about load, but that is not strictly true.
I would re-run your test with an S2 database as a starting point and go from there.
Being on a trial subscription does not in itself affect performance, but with the free account you are probably using a Basic (B) level, which isn't really usable for anything real - certainly not for a query that takes 2 seconds to run locally.
Even moving between, say, S1 and S2 will show a noticeable difference in performance of an individual query.
If you want to experiment, do remember you are charged a day for "any part of a day", which is probably okay for S level but be careful when testing P level.
For background: when Azure introduced the new tiers last year, they changed the hosting model for SQL. It used to be that many databases would run on a shared sqlserver.exe. In the new model, each database effectively gets its own sqlserver.exe that runs in a resource-constrained sandbox. That is how they control "DTU usage", but it also affects general performance.
It has nothing to do with the fact that your account is a trial; it's due to the low performance level you have selected.
With the other service (SmarterAsp) and your local instance, you probably have size restrictions rather than performance restrictions.
At this point it's impossible to pin down what a DTU actually means, or what DTU number corresponds to a SQL Server installed on your local machine or at any other hosting provider.
However, there is some good analysis of this (https://cbailiss.wordpress.com/2014/09/16/performance-in-new-azure-sql-database-performance-tiers/), but nothing official.
I'm kicking the tires on BI tools, including, of course, Tableau. Part of my evaluation includes correlating the SQL generated by the BI tool with my actions in the tool.
Tableau has me mystified. My database has 2 billion things; however, no matter what I do in Tableau, the query Redshift reports as having been run is "Fetch 10000 in SQL_CURxyz", i.e. a cursor operation. In the screenshot below, you can see the cursor IDs change, indicating new queries are being run -- but you don't see the original queries.
Is this a Redshift or Tableau quirk? Any idea how to see what's actually running under the hood? And why is Tableau always operating on 10000 records at a time?
I just ran into the same problem and wrote this simple query to get all queries for currently active cursors:
SELECT
usr.usename AS username
, min(cur.starttime) AS start_time
, DATEDIFF(second, min(cur.starttime), getdate()) AS run_time
, min(cur.row_count) AS row_count
, min(cur.fetched_rows) AS fetched_rows
, listagg(util_text.text)
WITHIN GROUP (ORDER BY sequence) AS query
FROM STV_ACTIVE_CURSORS cur
JOIN stl_utilitytext util_text
ON cur.pid = util_text.pid AND cur.xid = util_text.xid
JOIN pg_user usr
ON usr.usesysid = cur.userid
GROUP BY usr.usename, util_text.xid;
Ah, this has already been asked on the AWS forums.
https://forums.aws.amazon.com/thread.jspa?threadID=152473
Redshift's console apparently doesn't display the query behind cursors. To get that, you can query STV_ACTIVE_CURSORS: http://docs.aws.amazon.com/redshift/latest/dg/r_STV_ACTIVE_CURSORS.html
Also, you can alter your .TWB file (which is really just an XML file) and add the following parameters to the odbc-connect-string-extras property.
UseDeclareFetch=0;
FETCH=0;
You would end up with something like:
<connection class='redshift' dbname='yourdb' odbc-connect-string-extras='UseDeclareFetch=0;FETCH=0' port='0000' schema='schm' server='any.redshift.amazonaws.com' [...] >
Unfortunately there's no way of changing this behavior through the application; you must edit the file directly.
You should be aware of the performance implications of doing so. While this greatly helps with debugging, there is presumably a reason why Tableau chose not to allow these parameters to be modified through the application.
My project is in Visual FoxPro and I use MS SQL Server 2008. When I fire SQL queries in a batch, some of the queries don't execute. However, no error is thrown. I haven't used BEGIN TRAN and ROLLBACK yet. What should be done?
That all depends... You haven't posted any sample of your queries to give us an indication of the possible failure. However, one approach I've had good results with from VFP to SQL is to build the whole batch into a string (I prefer using TEXT/ENDTEXT for readability), then send that entire value to SQL. If there are any parameter-based values that come from VFP locally, you can use "?" to indicate that the value will come from a VFP variable. Then you can send everything in a single batch instead of multiple individual queries...
vfpField = 28
vfpString = 'Smith'
text to lcSqlCmd noshow
select
YT.blah,
YT.blah2
into
#tempSqlResult
from
yourTable YT
where
YT.SomeKey = ?vfpField
select
ost.Xblah,
t.blah,
t.blah2
from
OtherSQLTable ost
join #tempSqlResult t
on ost.Xblah = t.blahKey;
drop table #tempSqlResult;
endtext
nHandle = sqlconnect( "your connection string" )
nAns = sqlexec( nHandle, lcSqlCmd, "LocalVFPCursorName" )
No, I don't have error trapping in here; this is just to show the principle and readability. I know the sample query could easily have been done via a single join, but if you are working with pre-aggregations and want to put them into temporary work areas to be used in your next step, this works via #tempSqlResult, since "#" indicates a temporary table on SQL Server for whatever the current connection handle is.
If you want to return MULTIPLE RESULT SETs from a single SQL call, you can do that too: just add another query that doesn't have an "into #tmpSQLblah" clause. Then all of those result cursors will be brought back down to VFP based on the "LocalVFPCursorName" prefix. If you are returning 3 result sets, then VFP will have 3 cursors open, called
LocalVFPCursorName
LocalVFPCursorName1
LocalVFPCursorName2
and the numbering follows the sequence of the queries in the SqlExec() call. But if you can provide more on what you ARE trying to do, with some samples, we can offer more specific help too.
We have SQL Server and SSRS 2005 on a Windows 2003 virtual server with 4 GB allocated to the virtual server. (The host running the virtual server has > 32 GB.)
When we run our comparison report, it calls a stored proc on database A. The proc pulls data from several tables, and from a view on database B.
If I run Profiler and monitor the calls, I see activity
SQL:BatchStarting
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation'),
       COLLATIONPROPERTY(CONVERT(char, DATABASEPROPERTYEX(DB_NAME(), 'collation')), 'LCID')
then wait several minutes until the actual call to the proc shows up:
RPC:Completed
exec sp_executesql N'exec [procGetLicenseSales_ALS_Voucher]
    @CurrentLicenseYear, @CurrentStartDate, @CurrentEndDate,
    ''Fishing License'', @PreviousLicenseYear, @OpenLicenseAccounts',
    N'@CurrentStartDate datetime, @CurrentEndDate datetime,
      @CurrentLicenseYear int, @PreviousLicenseYear int,
      @OpenLicenseAccounts nvarchar(4000)',
    @CurrentStartDate='2010-11-01 00:00:00:000',
    @CurrentEndDate='2010-11-30 00:00:00:000',
    @CurrentLicenseYear=2010,
    @PreviousLicenseYear=2009,
    @OpenLicenseAccounts=NULL
Then more time passes, and usually the report times out. It takes about 20 minutes if I let it run in the Designer.
This report had been working, albeit slowly (but still under 10 minutes), for months.
If I drop the query (captured from profiler) into SQL Server Management Studio, it takes 2 minutes, 8 seconds to run.
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Something has obviously changed, but what change broke the report? How can I test to find out why the SSRS part is taking forever and timing out, but the query runs in about 2 minutes?
Added: Please note, the stored proc returns 18 rows... any time. (We only have 18 products to track.)
The report takes those 18 rows, and groups them and does some sums. No matrix, only one page, very simple.
M Kenyon II
Database B just had some changes and data replicated to it (we only read from the data, all new data comes from nightly replication).
Ensure that all indexes survived the changes to Database B. If they still exist, check how fragmented they are and reorganize or rebuild as necessary.
Indexes can have a huge impact on performance.
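For illustration, a minimal sketch of checking fragmentation in Database B and then acting on it (the index and table names in the comments are placeholders):

-- List indexes in the current database with noticeable fragmentation.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
    ON ips.object_id = i.object_id
   AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 10
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Then reorganize (lighter) or rebuild (heavier) as appropriate, for example:
-- ALTER INDEX IX_SomeIndex ON dbo.SomeTable REORGANIZE;  -- roughly 10-30% fragmentation
-- ALTER INDEX IX_SomeIndex ON dbo.SomeTable REBUILD;     -- above roughly 30% fragmentation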
As far as the report taking far longer to run than your query, there can be many reasons for this. Some tricks for getting SSRS to run faster can be found here:
http://www.sqlservercentral.com/Forums/Topic859015-150-1.aspx
Edit:
Here's the relevant information from the link above.
AshMc
I recall some time ago we had the same issue, where we were passing parameters from SSRS into a SQL dataset and it would slow everything down compared to doing it in SSMS (minutes compared to seconds, like your issue). It appeared that when SSRS passed in the parameter, it was possibly recalculating the value rather than storing it once, and that was it.
What I did was declare a new T-SQL parameter first within the dataset, set it equal to the SSRS parameter, and then use the new parameter as I would in SSMS.
e.g.:
DECLARE @X AS int
SET @X = @SSRSParameter
janavarr
Thanks AshMc, this one worked for me. However my issue now is that it will only work with a single parameter and the query won’t run if I want to pass multiple parameter values.
...
AshMc
I was able to find how I did this previously. I created a temp table, placed the values that we wanted to filter on in it, then did an inner join from the main query to it. We only use the SSRS parameters as a filter on what to put in the temp table.
Doing it this way saved a lot of report run time.
DECLARE @ParameterList TABLE (ValueA varchar(20))

INSERT INTO @ParameterList
SELECT ValueA
FROM TableA
WHERE ValueA = @ValueB

-- then, in the main query, join to the table variable:
INNER JOIN @ParameterList p
    ON ValueC = p.ValueA
Hope this helps,
--Dubs
Could be parameter sniffing. If you've changed some data or some of the tables, then the cached plan that suited the sp for the old data may no longer be a good fit.
Answered a very similar thing here:
stored procedure performance issue
Quote:
If you are sure that the SQL is exactly the same and that the params are the same, then you could be experiencing a parameter sniffing problem.
It's a pretty uncommon problem. I've only had it happen to me once and since then I've always coded away the problem.
Start here for a quick overview of the problem:
http://blogs.msdn.com/b/queryoptteam/archive/2006/03/31/565991.aspx
http://elegantcode.com/2008/05/17/sql-parameter-sniffing-and-what-to-do-about-it/
Try declaring some local variables inside the sp and assigning the values of the parameters to them, then use the local variables in place of the params (a minimal sketch follows below).
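For illustration, a minimal sketch of that workaround on a hypothetical procedure (the procedure, table, and column names are made up, not taken from the question):

CREATE PROCEDURE dbo.GetSalesByYear
    @LicenseYear int
AS
BEGIN
    -- Copy the incoming parameter into a local variable so the cached plan
    -- is not built around the specific sniffed value of @LicenseYear.
    DECLARE @LocalLicenseYear int;
    SET @LocalLicenseYear = @LicenseYear;

    SELECT LicenseType, COUNT(*) AS SalesCount
    FROM dbo.LicenseSales
    WHERE LicenseYear = @LocalLicenseYear
    GROUP BY LicenseType;
END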
It's a feature not a bug but it makes you go #"$#