SQL Server table-valued function executed code - sql

Based on Row level security I have created a table-valued function:
CREATE FUNCTION Security.userAccessPredicate(@ValueId int)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
    SELECT 1 AS accessResult
    WHERE @ValueId =
    (
        SELECT Value
        FROM dbo.[Values]
        WHERE UserId = CAST(SESSION_CONTEXT(N'UserId') AS NVARCHAR(50))
    ) OR NULLIF(CAST(SESSION_CONTEXT(N'UserId') AS nvarchar(50)),'') IS NULL
);
GO
CREATE SECURITY POLICY Security.userSecurityPolicy
ADD FILTER PREDICATE Security.userAccessPredicate(ValueId) ON dbo.MainTable
Let's say MainTable contains millions of rows. Does userAccessPredicate evaluate SELECT Value FROM dbo.Values for every row independently? If so, I guess it is inefficient. How can I check exactly what code is generated when the table-valued function is executed? SQL Server Profiler isn't an option because I am using Azure DB.
I am using SQL Server 2016 Management Studio.

The best way is to look at the execution plan with the policy turned off and then turned on. You'll see the extra work it's doing as a consequence. You're adding another table to the query, so it's similar to doing a join, but probably more efficient.
To answer your question: if you see a Nested Loops operator added to the plan when the policy is on, then yes, it's going row by row.
Also compare the output of SET STATISTICS IO ON and SET STATISTICS TIME ON in both cases to get a look at the resource hits too. With smaller tables (< 100,000 rows in a similar implementation) I never saw any noticeable performance hit.
I found this link useful when getting into this before.
https://www.mssqltips.com/sqlservertip/4005/sql-server-2016-row-level-security-limitations-performance-and-troubleshooting/
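A minimal sketch of that off/on comparison, assuming the objects from the question (Security.userSecurityPolicy, dbo.MainTable) exist and that you are allowed to alter the policy. The session value '42' is a made-up placeholder. Run it in SSMS with "Include Actual Execution Plan" enabled; nothing here needs Profiler, so it should also work against Azure SQL Database. Disabling the policy exposes all rows, so try this on a test copy.

```sql
-- Assumption: '42' is a hypothetical user id; use one that exists in dbo.[Values].
EXEC sp_set_session_context @key = N'UserId', @value = N'42';
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- 1) Baseline: policy disabled, so no predicate is applied.
ALTER SECURITY POLICY Security.userSecurityPolicy WITH (STATE = OFF);
GO
SELECT COUNT(*) FROM dbo.MainTable;
GO

-- 2) Policy enabled: the predicate function is inlined into the plan.
ALTER SECURITY POLICY Security.userSecurityPolicy WITH (STATE = ON);
GO
SELECT COUNT(*) FROM dbo.MainTable;
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

Comparing the two plans and the logical reads reported in the Messages tab shows what the predicate costs and how dbo.[Values] is accessed.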

Related

Parameter Sniffing causing slowdown for text-base query, how to remove execution plan?

I have a sql query, the exact code of which is generated in C#, and passed through ADO.Net as a text-based SqlCommand.
The query looks something like this:
SELECT TOP (@n)
    a.ID,
    a.Event_Type_ID as EventType,
    a.Date_Created,
    a.Meta_Data
FROM net.Activity a
LEFT JOIN net.vu_Network_Activity na WITH (NOEXPAND)
    ON na.Member_ID = @memberId AND na.Activity_ID = a.ID
LEFT JOIN net.Member_Activity_Xref ma
    ON ma.Member_ID = @memberId AND ma.Activity_ID = a.ID
WHERE
    a.ID < @LatestId
    AND (
        Event_Type_ID IN (1,2,3)
        OR
        (
            (na.Activity_ID IS NOT NULL OR ma.Activity_ID IS NOT NULL)
            AND
            Event_Type_ID IN (4,5,6)
        )
    )
ORDER BY a.ID DESC
This query has been working well for quite some time. It takes advantage of some indexes we have on these tables.
In any event, all of a sudden this query started running really slow, but ran almost instantaneously in SSMS.
Eventually, after reading several resources, I was able to verify that the slowdown we were getting was from poor parameter sniffing.
By copying all of the parameters to local variables, I was able to successfully reduce the problem. The thing is, this just feels like all kind of wrong to me.
I'm assuming that what happened was the statistics of one of these tables was updated, and then by some crappy luck, the very first time this query was recompiled, it was called with parameter values that cause the execution plan to differ?
I was able to track down the query in the Activity Monitor, and the execution plan resulting in the query to run in ~13 seconds was:
Running in SSMS results in the following execution plan (and only takes ~100ms):
So what is the question?
I guess my question is this: How can I fix this problem, without copying the parameters to local variables, which could lead to a large number of cached execution plans?
Quote from the linked comment / Jes Borland:
You can use local variables in stored procedures to “avoid” parameter sniffing. Understand, though, that this can lead to many plans stored in the cache. That can have its own performance implications. There isn’t a one-size-fits-all solution to the problem!
My thinking is that if there is some way for me to manually remove the current execution plan from the temp db, that might just be good enough... but everything I have found online only shows me how to do this for an actual named stored procedure.
This is a text-based SqlCommand coming from C#, so I do not know how to find the cached execution plan, with the sniffed parameter values, and remove it?
Note: the somewhat obvious solution of "just create a proper stored procedure" is difficult to do because this query can get generated in a number of different ways... and would require a somewhat unpleasant refactor.
If you want to remove a specific plan from the cache then it is really a two step process: first obtain the plan handle for that specific plan; and then use DBCC FREEPROCCACHE to remove that plan from the cache.
To get the plan handle, you need to look in the execution plan cache. The T-SQL below is an example of how you could search for the plan and get the handle (you may need to play with the filter clause a bit to home in on your particular plan):
SELECT TOP (10)
    qs.last_execution_time,
    qs.creation_time,
    cp.objtype,
    -- offsets are in bytes and zero-based, hence the /2 and the +1
    SUBSTRING(qt.[text], (qs.statement_start_offset / 2) + 1, (
        CASE
            WHEN qs.statement_end_offset = -1
            THEN DATALENGTH(qt.[text])
            ELSE qs.statement_end_offset
        END - qs.statement_start_offset) / 2 + 1
    ) AS query_text,
    qt.text AS full_query_text,
    tp.query_plan,
    qs.sql_handle,
    qs.plan_handle
FROM
    sys.dm_exec_query_stats qs
    LEFT JOIN sys.dm_exec_cached_plans cp ON cp.plan_handle = qs.plan_handle
    CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
    OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp
WHERE qt.text LIKE '%vu_Network_Activity%'
Once you have the plan handle, call DBCC FREEPROCCACHE as below:
DBCC FREEPROCCACHE(<plan_handle>)
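The two steps can also be combined into one batch. The following is a sketch only: it assumes the LIKE filter matches exactly one interesting plan (it simply takes the most recently executed match), and it excludes its own text from the search. Check what the SELECT returns before letting it evict anything on a busy server.

```sql
DECLARE @plan_handle varbinary(64);

-- Pick the most recently executed plan whose text mentions the view.
SELECT TOP (1) @plan_handle = qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.text LIKE '%vu_Network_Activity%'
  AND qt.text NOT LIKE '%dm_exec_query_stats%'   -- skip this lookup query itself
ORDER BY qs.last_execution_time DESC;

-- Evict only that plan; the next execution compiles a fresh one.
IF @plan_handle IS NOT NULL
    DBCC FREEPROCCACHE (@plan_handle);
```

The next call from the application then sniffs whatever parameter values it happens to be run with, so the problem can return if those values are atypical.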
There are many ways to delete/invalidate a query plan:
DBCC FREEPROCCACHE(plan_handle)
or
EXEC sp_recompile 'net.Activity'
or
adding OPTION (RECOMPILE) query hint at the end of your query
or
using optimize for ad hoc workloads server settings
or
updating statistics
If you have a crappy product from a crappy vendor, the best way to handle parameter sniffing is to create your own plan using EXEC sp_create_plan_guide.

Build temporary table with dynamic sql in SQL Server 2008

To make a long story short...
I'm building a web app in which the user can select any combination of about 40 parameters. However, for one of the results they want(investment experience), I have to extract information from a different table and compare the values in six different columns(stock exp, mutual funds exp, etc) and return only the highest value of the six for that specific record.
This is not the issue. The issue is that at runtime, my query to find the investment exp doesn't necessarily know the account id. Considering a table scan would bring well over half a million clients, this is not an option. So what I'm trying to do is edit a copy of my main dynamically built query, but instead of returning 30+ columns, it'll just return 2, the accountid and experienceid (which is the PK for the experience table) so I can do the filtering deal.
Some of you may define dynamic SQL a little different than myself. My query is a string that depending on the arguments sent to my procedure, portions of the where clause will be turned on or off by switches. In the end I execute, it's all done on the server side, all the web app does is send an array of arguments to my proc.
My over simplified code looks essentially like this:
declare @sql varchar(8000)
set @sql =
'select [columns]
into #tempTable
from [table]
[table joins]' + @dynamicallyBuiltWhereClause
exec(@sql)
After this part I try to use #tempTable for the investment experience filtering process, but I get an error telling me #tempTable doesn't exist.
Any and all help would be greatly appreciated.
The problem is that the scope of your temp table only exists within the exec() statement. You can transform your temp table into a "global" temp table by using 2 hash signs -> ##tempTable. However, I wonder why you are using a variable @dynamicallyBuiltWhereClause to generate your SQL statement.
I have done what you are doing in the past, but have had better success generating SQL from the application (using C# to generate my SQL).
Also, you may want to look into Table Variables. I have seen some strange instances using temp tables where an application re-uses a connection and the temp table from the last query is still there.
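Another way around the scoping problem, sketched here under assumptions: create the temp table in the calling scope first and let the dynamic SQL fill it with INSERT ... SELECT instead of SELECT ... INTO. A temp table created by the caller is visible inside the dynamic batch and is still there after it finishes. The table dbo.Accounts and the sample filter are hypothetical stand-ins for the real query; accountid and experienceid are the two columns named in the question.

```sql
-- Created in the outer scope, so it outlives the dynamic batch.
CREATE TABLE #tempTable (accountid INT, experienceid INT);

DECLARE @sql NVARCHAR(MAX);
DECLARE @dynamicallyBuiltWhereClause NVARCHAR(MAX);

-- Hypothetical filter; the real procedure builds this from its arguments.
SET @dynamicallyBuiltWhereClause = N' WHERE a.accountid > 0';

-- dbo.Accounts is a placeholder for the real tables and joins.
SET @sql = N'INSERT INTO #tempTable (accountid, experienceid)
SELECT a.accountid, a.experienceid
FROM dbo.Accounts AS a' + @dynamicallyBuiltWhereClause;

EXEC sp_executesql @sql;

-- Back in the outer scope the rows are available for further filtering.
SELECT accountid, experienceid FROM #tempTable;

DROP TABLE #tempTable;
```

Unlike ##tempTable, this keeps the table private to the session, so two users running the procedure at the same time do not collide.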

Need help with SQL query on SQL Server 2005

We're seeing strange behavior when running two versions of a query on SQL Server 2005:
version A:
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = 1234
ORDER BY name ASC
version B:
DECLARE @Id AS INT;
SET @Id = 1234;
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
Both queries return 1000 rows; version A takes on average 15s; version B on average takes 4s.
Could anyone help us understand the difference in execution times of these two versions of SQL?
If we invoke this query via named parameters using NHibernate, we see the following query via SQL Server profiler:
EXEC sp_executesql N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @id ORDER BY name ASC',
N'@id INT',
@id=1234;
...and this tends to perform as badly as version A.
Try taking a look at the execution plan for your query. This should give you some more explanation of how your query is executed.
I've not seen the execution plans, but I strongly suspect that they are different in these two cases. In version A the optimiser knows the value that you are using for the list id (1234) and, using a combination of the distribution statistics and the indexes, chooses a plan tailored to that specific value.
In version B the optimiser is not able to sniff the value of the local variable and so produces a plan that would be acceptable for any passed-in list id. And where I say acceptable I do not mean optimal.
So what can you do to improve the scenario? There are a couple of alternatives here:
1) Create a stored procedure to perform the query as below:
CREATE PROCEDURE Foo
@Id INT
AS
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @Id
ORDER BY name ASC
GO
This will allow the optimiser to sniff the value of the input parameter when passed in and produce an appropriate execution plan for the first execution. Unfortunately it will cache that plan for reuse later, so unless you generally call the sproc with similarly selective values this may not help you too much.
2) Create a stored procedure as above, but specify it to be WITH RECOMPILE. This will ensure that the stored procedure is recompiled each time it is executed and hence produce a new plan optimised for this input value
3) Add OPTION (RECOMPILE) to the end of the SQL Statement. Forces recompilation of this statement, and is able to optimise for the input value
4) Add OPTION (OPTIMIZE FOR (@Id = 1234)) to the end of the SQL statement. This will cause the plan that gets cached to be optimised for this specific input value. Great if this is a highly common value, or most common values are similarly selective, but not so great if the distribution of selectivity is more widely spread.
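For illustration, options 3 and 4 applied to the parameterised call from the question might look like the sketch below. It assumes the hint can be appended to the statement text that NHibernate sends, which the question does not confirm; the tables and the value 1234 are the ones from the question.

```sql
-- Option 3: compile a fresh plan for each execution, using the actual @id.
EXEC sp_executesql
    N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes
      ON listcontacts.contactId = otherattributes.contactId
      WHERE listcontacts.listid = @id
      ORDER BY name ASC
      OPTION (RECOMPILE)',
    N'@id INT',
    @id = 1234;

-- Option 4: cache one plan, but always optimise it as if @id were 1234.
EXEC sp_executesql
    N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes
      ON listcontacts.contactId = otherattributes.contactId
      WHERE listcontacts.listid = @id
      ORDER BY name ASC
      OPTION (OPTIMIZE FOR (@id = 1234))',
    N'@id INT',
    @id = 1234;
```

Option 3 pays a compile on every call; option 4 compiles once but is only as good as the representative value you pick.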
It's possible that instead of casting 1234 to be the same type as listcontacts.listid and then doing the comparison with each row, it might be casting the value in each row to be the same as 1234. The first requires just one cast, the second needs a cast per row (and that's probably on far more than 1000 rows, it may be for every row in the table). I'm not sure what type that constant will be interpreted as but it may be 'numeric' rather than 'int'.
If this is the cause, the second version is faster because it's forcing 1234 to be interpreted as an int and thus removing the need to cast the value in every row.
However, as the previous poster suggests, the query plan shown in SQL Server Management Studio may indicate an alternative explanation.
The best way to see what is happening is to compare the execution plans, everything else is speculation based on the limited details presented in the question.
To see the execution plan, go into SQL Server Management Studio and run SET SHOWPLAN_XML ON then run query version A, the query will not run but the execution plan will be displayed in XML. Then run query version B and see its execution plan. If you still can't tell the difference or solve the problem, post both execution plans and someone here will explain it.
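As a sketch, the comparison described above could be scripted like this, using the two query versions from the question. SET SHOWPLAN_XML has to be the only statement in its batch, hence the GO separators.

```sql
SET SHOWPLAN_XML ON;
GO

-- Version A: literal value. The query is compiled but not executed.
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = 1234
ORDER BY name ASC;
GO

-- Version B: local variable.
DECLARE @Id AS INT;
SET @Id = 1234;
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC;
GO

SET SHOWPLAN_XML OFF;
GO
```

In the XML for each version, the estimated row counts on listcontacts are a good first place to look for the difference.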

Functions in SQL Server 2008

Does sql server cache the execution plan of functions?
Yes, see rexem's Tibor link and Andrew's answer.
However... a simple table value function is unnested/expanded into the outer query anyway. Like a view. And my answer (with links) here
That is, this type:
CREATE FUNCTION dbo.Foo ()
RETURNS TABLE
AS
RETURN (SELECT ...)
GO
According to the dmv yes, http://msdn.microsoft.com/en-us/library/ms189747.aspx but I'd have to run a test to confirm.
Object ID in the output is "ID of the object (for example, stored procedure or user-defined function) for this query plan".
Tested it and yes it does look like they are getting a separate plan cache entry.
Test Script:
create function dbo.foo (@a int)
returns int
as
begin
return @a
end
The most basic of functions created.
-- clear out the plan cache
dbcc freeproccache
dbcc dropcleanbuffers
go
-- use the function
select dbo.foo(5)
go
-- inspect the plan cache
select * from sys.dm_exec_cached_plans
go
The plan cache then has 4 entries, the one listed as objtype = Proc is the function plan cache, grab the handle and crack it open.
select * from sys.dm_exec_query_plan(<insertplanhandlehere>)
The first adhoc on my test was the actual query, the 2nd ad-hoc was the query asking for the plan cache. So it definitely received a separate entry under a different proc type to the adhoc query being issued. The plan handle was also different, and when extracted using the plan handle it provides an object id back to the original function, whilst an adhoc query provides no object ID.
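Instead of copying the plan handle by hand, the lookup can be done in a single query. This is a sketch that assumes the scalar function dbo.foo from the test script above and relies on the observation from that test that its plan is cached with objtype = 'Proc'.

```sql
-- Find the cached plan that belongs to dbo.foo and show it as XML.
SELECT
    cp.objtype,
    cp.usecounts,
    OBJECT_NAME(st.objectid, st.dbid) AS object_name,
    st.text       AS module_text,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE cp.objtype = 'Proc'
  AND st.dbid = DB_ID()
  AND st.objectid = OBJECT_ID('dbo.foo');
```

Running select dbo.foo(5) a few more times and re-running this query should show usecounts going up if the cached plan is being reused.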

Select Fails With Nonexistent Columns

Executing the following statement with SQL Server 2005 (My tests are through SSMS) results in success upon first execution and failure upon subsequent executions.
IF OBJECT_ID('tempdb..#test') IS NULL
CREATE TABLE #test ( GoodColumn INT )
IF 1 = 0
SELECT BadColumn
FROM #test
What this means is that something is comparing the columns I am accessing in my select statement against the columns that exist on a table when the script is "compiled". For my purposes this is undesirable functionality. My question is whether there is anything that can be done so that this code would execute successfully on every run or, if that is not possible, whether someone could explain why the demonstrated functionality is desirable. The only solutions I currently have are to wrap the select with EXEC or to use select *, but I don't like either of those solutions.
Thanks
If you put:
IF OBJECT_ID('tempdb..#test') IS NOT NULL
DROP TABLE #test
GO
At the start, then the problem will go away, as the batch will get parsed before the #test table exists.
What you're asking is for the system to recognise that "1=0" will always evaluate to false. If it were ever true (which could potentially be the case for most real-life conditions), then you'd probably want to know that you were about to run something that would cause failure.
If you drop the temporary table and then create a stored procedure that does the same:
CREATE PROC dbo.test
AS
BEGIN
IF OBJECT_ID('tempdb..#test') IS NULL
CREATE TABLE #test ( GoodColumn INT )
IF 1 = 0
SELECT BadColumn
FROM #test
END
Then this will happily be created, and you can run it as many times as you like.
Rob
Whether or not this behaviour is "desirable" from a programmer's point of view is debatable of course -- it basically comes down to the difference between statically typed and dynamically typed languages. From a performance point of view, it's desirable because SQL Server needs complete information in order to compile and optimize the execution plan (and also cache execution plans).
In a word, T-SQL is not an interpreted or dynamically typed language, and so you cannot write code like this. Your options are either to use EXEC, or to use another language and embed the SQL queries within it.
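A minimal sketch of the EXEC option, using the table and column names from the question. The string passed to EXEC is only compiled if and when it is actually executed, so the reference to BadColumn is never checked while the condition is false.

```sql
IF OBJECT_ID('tempdb..#test') IS NULL
    CREATE TABLE #test ( GoodColumn INT );

IF 1 = 0
    -- Compiled as its own batch at run time, and only if this branch is taken.
    EXEC('SELECT BadColumn FROM #test');
```

This batch can be run repeatedly in the same session without the invalid-column error; the trade-off is that a genuine typo in the dynamic string also stays hidden until that branch runs.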
This problem is also visible in these situations:
IF 1 = 1
select dummy = GETDATE() into #tmp
ELSE
select dummy = GETDATE() into #tmp
Although the second statement is never executed the same error occurs.
It seems the query engine's first-level validation ignores all conditional statements.
You say you have problems with subsequent requests, and that is because the object already exists. It is recommended that you drop your temporary tables as soon as you are done with them.
Read more about temporary table performance at:
SQL Server performance.com