Summary
Is there an efficient way to run a large number of dynamic SQL statements (on SQL Server 2005)?
Details
Our system allows users to create "email alert" subscriptions - where new matches on the system are emailed to them on a daily basis.
The subscription allows for multiple options, including the use of search keywords. A parser I wrote outputs the appropriate SQL code, taking into account "and", "or" and brackets (). The parser will not allow anything through that could be used for SQL injection.
For example, the keywords might be entered by the user as this (that or other) and the resultant query would end up roughly as...
SELECT *
FROM [VW_EMAIL_ALERT]
WHERE ([SEARCH] LIKE '%this%' AND ([SEARCH] LIKE '%that%' OR [SEARCH] LIKE '%other%'))
Each night, all those subscriptions are processed individually, because each one is potentially unique. The result is that the batch processing has to run a cursor over every subscription and run the SQL through sp_executesql.
Obviously this is highly inefficient, and it can cause serious overloading, leading in some cases to timeouts. The stored procedure that runs this processing is coded to split the subscriptions into blocks, so they're not all being called at once.
Is there a better/more efficient way to do this?
Note: Unfortunately we are currently stuck supporting a minimum of SQL Server 2005, as some of our clients still use that version.
If you are searching for keywords, that is about the least efficient way you could do it.
A LIKE '%anything%' pattern with a leading wildcard cannot use an index seek, so every row has to be scanned.
Use a full-text search to index the words.
Or write your own parser to index the unique words.
You would build up a keywords table
and index the keyword.
This is a very efficient query:
select id
from keywords
where keyword = 'this'
intersect
select id
from keywords
where keyword in ( 'that','other')
Even with wildcards in the keywords it is still much more efficient than searching the entire text
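As a rough illustration of the keywords table described above, here is a minimal sketch. The table layout, the constraint name and the sample rows are assumptions made for the example, not part of the original system; the syntax is kept compatible with SQL Server 2005.

```sql
-- Hypothetical schema: one row per (item, distinct word) pair.
CREATE TABLE dbo.keywords
(
    id      INT           NOT NULL,  -- key of the row in the searched view/table
    keyword NVARCHAR(100) NOT NULL,
    CONSTRAINT PK_keywords PRIMARY KEY CLUSTERED (keyword, id)
);

-- Sample data (SQL Server 2005 has no multi-row VALUES, hence UNION ALL).
INSERT INTO dbo.keywords (id, keyword)
SELECT 1, N'this'  UNION ALL
SELECT 1, N'that'  UNION ALL
SELECT 2, N'this'  UNION ALL
SELECT 3, N'other';

-- "this (that or other)": ids that have 'this' AND at least one of the others.
SELECT id FROM dbo.keywords WHERE keyword = N'this'
INTERSECT
SELECT id FROM dbo.keywords WHERE keyword IN (N'that', N'other');
```

With the sample rows above, only id 1 qualifies, since id 2 has 'this' alone and id 3 has 'other' alone. Leading the key with keyword lets each branch of the query seek on the word instead of scanning text.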
I hope this will help. At my work, we replaced cursor with this kind of implementation.
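DECLARE
@strSQL NVARCHAR(MAX)
SET
@strSQL = '' -- separate SET: SQL Server 2005 does not allow an initial value in DECLARE
CREATE TABLE #tmp
(
Result_Query VARCHAR(MAX)
)
-- One row per generated subscription query
INSERT INTO
#tmp
SELECT
'SELECT * FROM [VW_EMAIL_ALERT] WHERE ([SEARCH] LIKE ''%this%'' AND ([SEARCH] LIKE ''%that%'' OR [SEARCH] LIKE ''%other%''))'
UNION
SELECT
'SELECT * FROM [VW_EMAIL_ALERT] WHERE ([SEARCH] LIKE ''%this1%'' AND ([SEARCH] LIKE ''%that1%'' OR [SEARCH] LIKE ''%other1%''))'
-- Concatenate all queries into a single batch, separated by semicolons
SELECT
@strSQL = @strSQL + Result_Query + ';'
FROM
#tmp
-- Drop the trailing semicolon
SET
@strSQL = LEFT(@strSQL, LEN(@strSQL) - 1)
PRINT @strSQL
EXEC(@strSQL)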
Related
One of the product teams I manage has an abundance of queries that use dynamic SQL, injecting the table name via string concatenation. While the input is sanitised, I am trying to completely remove dynamic SQL.
Is there a way to parameterise the table name?
I am trying to think of how I can query a table something like:
SELECT * FROM (SELECT DISTINCT Table_Name FROM INFORMATION_SCHEMA.Tables WHERE Table_Name = :queryParam)
Is this possible?
There is no way to "properly" prevent SQL injection completely from within SQL; the calling application layer should do this prior to executing any SQL statement.
Solve the problem by using an ORM, or by building the code to protect yourself from SQL injection when you generate the SQL in the application code.
This feels like a classic XY problem. Try to take a step back and consider that you need to protect access to the SQL Server itself, rather than sanitise everything from within, "after" your SQL Server has already been accessed.
"I am trying to completely remove dynamic sql."
You can't do it without dynamic SQL.
"Is this possible?"
No, it's not possible. You cannot parameterize identifiers in SQL queries.
Why?
From the Books Online page for Variables: "Variables can be used only in expressions, not in place of object names or keywords. To construct dynamic SQL statements, use EXECUTE."
This is the only way to do it:
DECLARE @Column SysName = N'Table_Name',
@Param NVARCHAR(128) = N'ParamValue';
-- QUOTENAME protects the identifier; the value is passed as a real parameter
DECLARE @SQL NVARCHAR(MAX)=N'SELECT *
FROM(
SELECT '+ QUOTENAME(@Column) +
' FROM INFORMATION_SCHEMA.Tables
WHERE ' + QUOTENAME(@Column) + ' = @Param
) T';
EXECUTE sp_executesql @SQL,
N'@Param NVARCHAR(128)',
@Param;
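When the dynamic part is a table name rather than a column, one common precaution is to check the supplied name against the catalogue before building the statement. The following is only a sketch under assumptions: the variable names, the dbo schema and the sample value 'SomeTable' are hypothetical.

```sql
DECLARE @TableName sysname, @SQL nvarchar(max);
SET @TableName = N'SomeTable';  -- hypothetical caller-supplied name

-- Only proceed if the name matches an existing table in the dbo schema.
IF EXISTS (SELECT 1
           FROM INFORMATION_SCHEMA.TABLES
           WHERE TABLE_SCHEMA = N'dbo'
             AND TABLE_NAME = @TableName)
BEGIN
    -- QUOTENAME brackets the identifier so it cannot break out of the statement.
    SET @SQL = N'SELECT * FROM dbo.' + QUOTENAME(@TableName) + N';';
    EXEC sp_executesql @SQL;
END
ELSE
    RAISERROR(N'Unknown table name.', 16, 1);
```

This is still dynamic SQL, but the identifier can only ever be the name of a table that actually exists.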
I have a single table that contains questions with corresponding references to another table and field that contain the answers. Something like:
I would like to query the questions table and return QID, QuestionText and the value contained in the [ResponseTable].[ResponseField] for each QID. The design seemed flexible at the time. However, the app developer is expecting a stored procedure, and the SQL developer was counting on an in-app solution for this issue.
I am at the end of my rope trying to build this query. How would you suggest accomplishing this task?
I don't think you'll like hearing this answer because it will likely mean some major rework, but I think it's the right answer. Get rid of the questions table and put the questions into new Question fields in the Client1, Client9, and Jobs tables; one for each response.
For example the Client1 table will have these fields:
ColorPref
ColorPrefQuestion
Rating
RatingQuestion
...and so on
Working around that design will be manageable, whereas working around the design you have now will be a headache.
It sounds like a redesign should be considered (storing all responses in one table, for example), but if that's not a possibility then dynamic SQL (using sp_executesql) can be used. However, it can be dangerous to use as it is vulnerable to SQL injection. There are some precautions that can be taken, such as using QUOTENAME on table and column names. This is also a good read before using dynamic SQL: The Curse and Blessings of Dynamic SQL.
DECLARE @tableName NVARCHAR(50)
DECLARE @columnName NVARCHAR(50)
DECLARE @query NVARCHAR(MAX)
SET @tableName = 'Client1'
SET @columnName = 'ColorPref'
-- QUOTENAME wraps each identifier in brackets before it is concatenated
SET @query = 'SELECT ' + QUOTENAME(@columnName) + ' FROM ' + QUOTENAME(@tableName)
EXEC sp_executesql @query
Until you get to the rewrite you mentioned, consider the idea of using a view to bring these response tables together.
CREATE VIEW ClientResponses AS
SELECT QID, ResponseField FROM [Client1]
UNION
SELECT QID, ResponseField FROM [Jobs]
UNION
SELECT QID, ResponseField FROM [Client9]
-- ..... add the new tables as they are created
This will
Avoid dynamic SQL
Give you a single place to maintain querying
Provide a pretty simple, readable way to cobble this together
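With such a view in place, the query the question asks for might look roughly like the sketch below. The table name dbo.Questions is an assumption (the question only says "questions table"), and it relies on the view exposing QID and ResponseField exactly as defined above.

```sql
-- dbo.Questions is a hypothetical name for the questions table.
-- LEFT JOIN keeps questions that have no response row yet.
SELECT q.QID,
       q.QuestionText,
       r.ResponseField
FROM dbo.Questions AS q
LEFT JOIN ClientResponses AS r
       ON r.QID = q.QID;
```

This could be wrapped in the stored procedure the app developer is expecting without any dynamic SQL.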
(1) Is there a good/reliable way to query the system catalogue in order
to find all stored procedures which create some temporary tables in their
source code bodies but which don't drop them at the end of their bodies?
(2) In general, can creating temp tables in a SP and not dropping
them in the same SP cause some problems and if so, what problems?
I am asking this question in the contexts of
SQL Server 2008 R2 and SQL Server 2012 mostly.
Many thanks in advance.
Not 100% sure if this is accurate as I don't have a good set of test data to work with. First you need a function to count occurrences of a string (shamelessly stolen from here):
CREATE FUNCTION dbo.CountOccurancesOfString
(
@searchString nvarchar(max),
@searchTerm nvarchar(max)
)
RETURNS INT
AS
BEGIN
-- Characters removed by deleting every match, divided by the term length
return (LEN(@searchString)-LEN(REPLACE(@searchString,@searchTerm,'')))/LEN(@searchTerm)
END
Next make use of the function like this. It searches the procedure text for the strings and reports when the number of creates doesn't match the number of drops:
WITH CreatesAndDrops AS (
SELECT procedures.name,
dbo.CountOccurancesOfString(UPPER(syscomments.text), 'CREATE TABLE #') AS Creates,
dbo.CountOccurancesOfString(UPPER(syscomments.text), 'DROP TABLE #') AS Drops
FROM sys.procedures
JOIN sys.syscomments
ON procedures.object_id = syscomments.id
)
SELECT * FROM CreatesAndDrops
WHERE Creates <> Drops
1) Probably there is no good or reliable way, though you can extract the text of stored procedures using some arcane methods that you can find in other places.
2) In general, no, this causes no problems. Temp tables (#tables) are scope-limited and will be flagged for removal when their scope disappears.
Table variables behave likewise.
An exception is global temp tables (##tables), which are cleaned up when no scope holds a reference to them. Avoid those guys; there are usually (read: almost always) better ways to do something than with a global temp table.
Sigh. If you want to go down the (1) path, be aware that there are lots of pitfalls in looking at code inside SQL Server: many of the helper functions and information tables truncate the actual code to an NVARCHAR(4000).
If you look at the code of sp_helptext you'll see a really horrible cursor that pulls the actual text.
I wrote this a long time ago to look for strings in code. You could run it on your database, looking for 'CREATE TABLE #' and 'DROP TABLE #', and compare the outputs:
DECLARE @SearchString VARCHAR(255) = 'DELETE FROM'
SELECT
[ObjectName]
, [ObjectText]
FROM
(
SELECT
so.[name] AS [ObjectName]
-- FOR XML encodes carriage returns as &#x0D;, so strip them out again
, REPLACE(comments.[c], '&#x0D;', '') AS [ObjectText]
FROM
sys.objects AS so
CROSS APPLY (
-- Stitch the 4000-character chunks of each object back together
SELECT CAST([text] AS NVARCHAR(MAX))
FROM syscomments AS sc
WHERE sc.[id] = so.[object_id]
FOR XML PATH('')
)
AS comments ([c])
WHERE
so.[is_ms_shipped] = 0
AND so.[type] = 'P'
)
AS spText
WHERE
spText.[ObjectText] LIKE '%' + @SearchString + '%'
Or much better - use whatever tool of choice you like on your codebase - you've got all your sp's etc scripted out into source control somewhere, right.....?
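If the truncation issue mentioned above is a concern, the sys.sql_modules catalogue view keeps each module's full text in a single nvarchar(max) column, so a plain LIKE can be used. This is only a text-matching heuristic and a sketch, not a parser: it will miss temp tables created with SELECT ... INTO, unusual whitespace between the keywords, and encrypted procedures (whose definition is NULL).

```sql
-- Procedures whose source mentions CREATE TABLE # but never DROP TABLE #.
SELECT SCHEMA_NAME(p.schema_id) AS SchemaName,
       p.name                   AS ProcedureName
FROM sys.procedures AS p
JOIN sys.sql_modules AS m
  ON m.object_id = p.object_id
WHERE m.definition LIKE N'%CREATE TABLE #%'
  AND m.definition NOT LIKE N'%DROP TABLE #%';
```

The # character is not a LIKE wildcard, so it needs no escaping in these patterns.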
I think the SQL Search tool from Redgate would come in handy in this case. You can download from here. This tool will find SQL text within stored procedures, functions, views, etc.
Just install this plugin and you can easily find SQL text from within SSMS.
I believe what I am attempting to achieve may only be done through the use of Dynamic SQL. However, I have tried a couple of things without success.
I have a table in database DB1 (let's say DB1.dbo.table1, in a MS SQL server) that contains the names of other databases in the server (DB2, DB3, etc.). Now, all the DBs listed in that table contain a particular table (let's call it desiredTable) which I want to query. So what I'm looking for is a way of creating a stored procedure/script/whatever that queries DB1.dbo.table1 for the other DBs and then runs a statement on each of the DBs retrieved, something like:
@DBNAME = select dbName from DB1.dbo.table1
select value1 from @DBNAME.dbo.desiredTable
Is that possible? I'm planning on running the sp/script in various systems DB1.dbo.table1 being a constant.
You need to build a query dynamically and then execute it. Something like this:
DECLARE @MyDynamicQuery VARCHAR(MAX)
DECLARE @MyDynamicDBName VARCHAR(20)
-- Note: if table1 holds several rows, this assignment keeps only one dbName
SELECT @MyDynamicDBName = dbName
FROM DB1.dbo.table1
SET @MyDynamicQuery = 'SELECT value1 FROM ' + QUOTENAME(@MyDynamicDBName) + '.dbo.desiredTable'
EXEC(@MyDynamicQuery)
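Since DB1.dbo.table1 is expected to list several databases, the same idea can be repeated for every row. A minimal sketch using a cursor follows; the cursor and variable names are made up for the example, while the table and column names are the ones from the question.

```sql
DECLARE @DbName sysname, @Sql nvarchar(max);

-- Walk over every database name stored in DB1.dbo.table1.
DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT dbName FROM DB1.dbo.table1;

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @DbName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME guards against odd characters in the stored database name.
    SET @Sql = N'SELECT value1 FROM ' + QUOTENAME(@DbName) + N'.dbo.desiredTable;';
    EXEC sp_executesql @Sql;

    FETCH NEXT FROM db_cursor INTO @DbName;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;
```

Each database produces its own result set; inserting into a temp table inside the loop would be one way to combine them.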
You can use the undocumented stored procedure, sp_MSForEachDB. The usual warnings about using an undocumented stored procedure apply though. Here's an example of how you might use it in your case:
EXEC sp_MSForEachDB 'SELECT value1 FROM ?.dbo.desiredTable'
Notice the use of ? in place of the DB name.
I'm not sure how you would limit it to only DBs in your own table. If I come up with something, then I'll post it here.
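One possible way to restrict the run to the databases listed in DB1.dbo.table1 is to put the check inside the command itself, since ? is substituted before each execution. This is an untested sketch and carries the same caveats about relying on an undocumented procedure.

```sql
-- The IF is evaluated once per database; ? is replaced with the database name.
EXEC sp_MSForEachDB N'IF EXISTS (SELECT 1 FROM DB1.dbo.table1 WHERE dbName = N''?'')
    SELECT value1 FROM [?].dbo.desiredTable';
```

Databases that are not listed simply skip the SELECT.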
I've used dynamic SQL for many tasks and continuously run into the same problem: Printing values of variables used inside the Dynamic T-SQL statement.
EG:
Declare @SQL nvarchar(max), @Params nvarchar(max), @DebugMode bit, @Foobar int
select @DebugMode=1,@Foobar=364556423
set @SQL=N'Select @Foobar'
set @Params=N'@Foobar int'
if @DebugMode=1 print @SQL
exec sp_executeSQL @SQL,@Params
,@Foobar=@Foobar
The print results of the above code are simply "Select @Foobar". Is there any way to dynamically print the values and variable names of the SQL being executed? Or, when doing the print, to replace parameters with their actual values so the SQL is re-runnable?
I have played with creating a function or two to accomplish something similar, but ran into data type conversions, pattern matching and truncation issues, and non-dynamic solutions. I'm curious how other developers solve this issue without manually printing each and every variable.
I don't believe the evaluated statement is available, meaning your example query 'Select @Foobar' is never persisted anywhere as 'Select 364556423'.
Even in a profiler trace you would see the statement hit the cache as '(@Foobar int)select @foobar'.
This makes sense, since a big benefit of using sp_executesql is that it can cache the statement in a reusable form without the variables evaluated. If it instead replaced the variables and executed the resulting statement, we would just see the plan cache bloat.
Updated: here's a step in the right direction.
All of this could be cleaned up and wrapped in a nice function, with inputs (@Statement, @ParamDef, @ParamVal), that would return the "prepared" statement. I'll leave some of that as an exercise for you, but please post back when you improve it!
It relies on a dbo.Split function (linked in the original post and not shown here).
set nocount on;
declare @Statement varchar(100), -- the raw sql statement
@ParamDef varchar(100), -- the raw param definition
@ParamVal xml -- the ParamName -to- ParamValue mapping as xml
-- the internal params:
declare @YakId int,
@Date datetime
select @YakId = 99,
@Date = getdate();
select @Statement = 'Select * from dbo.Yak where YakId = @YakId and CreatedOn > @Date;',
@ParamDef = '@YakId int, @Date datetime';
-- you need to construct this xml manually... maybe use a table var to clean this up
set @ParamVal = ( select *
from ( select '@YakId', cast(@YakId as varchar(max)) union all
select '@Date', cast(@Date as varchar(max))
) d (Name, Val)
for xml path('Parameter'), root('root')
)
-- do the work
declare @pStage table (pName varchar(100), pType varchar(25), pVal varchar(100));
;with
c_p (p)
as ( select replace(ltrim(rtrim(s)), ' ', '.')
from dbo.Split(',', @ParamDef)d
),
c_s (pName, pType)
as ( select parsename(p, 2), parsename(p, 1)
from c_p
),
c_v (pName, pVal)
as ( select p.n.value('Name[1]', 'varchar(100)'),
p.n.value('Val[1]', 'varchar(100)')
from @ParamVal.nodes('root/Parameter')p(n)
)
insert into @pStage
select s.pName, s.pType, case when s.pType = 'datetime' then quotename(v.pVal, '''') else v.pVal end -- expand this case to deal with other types
from c_s s
join c_v v on
s.pName = v.pName
-- replace pName with pValue in statement
select @Statement = replace(@Statement, pName, isnull(pVal, 'null'))
from @pStage
where charindex(pName, @Statement) > 0;
print @Statement;
On the topic of how most people do it, I will only speak to what I do:
Create a test script that will run the procedure using a wide range of valid and invalid input. If the parameter is an integer, I will send it '4' (instead of 4), but I'll only try 1 oddball string value like 'agd'.
Run the values against a data set of representative size and data value distribution for what I'm doing. Use your favorite data generation tool (there are several good ones on the market) to speed this up.
I'm generally debugging like this on a more ad hoc basis, so collecting the results from the SSMS results window is as far as I need to take it.
The best way I can think of is to capture the query as it comes across the wire using a SQL Trace. If you place something unique in your query string (as a comment), it is very easy to apply a filter for it in the trace so that you don't capture more than you need.
However, it isn't all peaches & cream.
This is only suitable for a Dev environment, maybe QA, depending on how rigid your shop is.
If the query takes a long time to run, you can mitigate that by adding "TOP 1", "WHERE 1=2", or a similar limiting clause to the query string if @DebugMode = 1. Otherwise, you could end up waiting a while for it to finish each time.
For long queries where you can't add something to the query string only for debug mode, you could capture the command text in a StmtStarted event, then cancel the query as soon as you have the command.
If the query is an INSERT/UPDATE/DELETE, you will need to force a rollback if @DebugMode = 1 and you don't want the change to occur. If you're not currently using an explicit transaction, doing that would be extra overhead.
Should you go this route, there is some automation you can achieve to make life easier. You can create a template for the trace creation and start/stop actions. You can log the results to a file or table and process the command text from there programmatically.