I know this question has been asked and debated before. The answers I have read are mostly opinion, so I am looking for something more technical. More "under the hood".
I'm using SQL Server 2008. I am about to write a query that will need to accept at least 11 inputs. The user could specify 1 to 11(+) of these. They are of various types (IDs, text matching, geography, full text matches, record sets, etc.). It's as dynamic as you can get with SQL.
I know how to do this without using dynamic SQL:
/*for simple match*/ WHERE #Input IS NOT NULL OR t.col = #Input
This has drawbacks because of joins etc. I don't ever use dynamic SQL. But from what I've read, it seems this may be a case for dynamic SQL.
Can anyone give some technical pointers please?
http://www.sommarskog.se/index.html
Look at the curse and blessing of Dynamic SQl and the Dynamic Search Conditions links
You need to remember that if you don't want to use dynamic SQL (and I wouldn't either) then the DRY principle is sorta out-the-window.
Make effective use of TVP's and accept the fact that you will have somewhat high cyclomatic complexity (for a SQL statement)
If you want everything cacheable and sargable, you should consider how to design it in multiple procedures and then break out your conditions as needed.
Another very common approach is to include conditions such as:
WHERE (#pParam IS NULL OR ~condition~)
If your parameter is NULL then SQL will not bother resolving the other half of the condition.
EDIT: This last part only applies in SQL2008R2 and only if using the OPTION(RECOMPILE) command
Related
A specification essentially is a text string representing a "where" clause created by an end user.
I have stored procedures that copy a set of related tables and records to other places. The operation is always the same, but dependent on some crazy user requirements like "products that are frozen and blue and on sale on Tuesday".
What if we fed the user specification (or string parameter) to a scalar function that returned true/false which executed the specification as dynamic SQL or just exec (#variable).
It could tell us whether those records exist. We could add the result of the function to our copy products where clause.
It would keep us from recompiling the copy script each time our where clauses changed. Plus it would isolate the product selection in to a single function.
Anyone ever do anything like this or have examples? What bad things could come of it?
EDIT:
This is the specification I simply added to the end of each insert/select statement:
and exists (
select null as nothing
from SameTableAsOutsideTable inside
where inside.ID = outside.id and -- Join operations to outside table
inside.page in (6, 7) and -- Criteria 1
inside.dept in (7, 6, 2, 4) -- Criteria 2
)
It would be great to feed a parameter into a function that produces records based on the user criteria, so all that above could be something like:
and dbo.UserCriteria( #page="6,7", #dept="7,6,2,4")
Dynamic Search Conditions in T-SQL
When optimizing SQL the important thing is optimizing the access path to data (ie. index usage). This trumps code reuse, maintainability, nice formatting and just about every other development perk you can think of. This is because a bad access path will cause the query to perform hundreds of times slower than it should. The article linked sums up very well all the options you have, and your envisioned function is nowhere on the radar. Your options will gravitate around dynamic SQL or very complicated static queries. I'm afraid there is no free lunch on this topic.
It doesn't sound like a very good idea to me. Even supposing that you had proper defensive coding to avoid SQL injection attacks it's not going to really buy you anything. The code still needs to be "compiled" each time.
Also, it's pretty much always a bad idea to let users create free-form WHERE clauses. Users are pretty good at finding new and innovative ways to bring a server to a grinding halt.
If you or your users or someone else in the business can't come up with some concrete search requirements then it's likely that someone isn't thinking about it hard enough and doesn't really know what they want. You can have pretty versatile search capabilities without letting the users completely loose on the system. Alternatively, look at some of the BI tools out there and consider creating a data mart where they can do these kinds of ad hoc searches.
How about this:
You create another store procedure (instead of function) and pass the right condition to it.
Based on that condition it dumps the record ids to a temp table.
Next you move procedure will read ids from that table and do the needful things?
Or you could create a user function that returns a table which is nothing but the ids of the records that matches your criteria (dynamic)
If I am totally off, then please clarify me.
Hope this helps.
If you are forced to use dynamic queries and you don't have any solid and predefined search requirements, it is strongly recommended to use sp_executesql instead of EXEC . It provides parametrized queries to prevent SQL Injection attacks (to some extent) and It makes use of execution plans to speed up performance. (More info)
Is there any hints in Oracle that works the same way as these SQL Server hints?
Recompile: That a query is recompiled every time it's run (if execution plans should vary greatly depending on parameters). Would this be best compared to cursor_sharing in Oracle?
Optimize for: When you want the plan to get optimized for a certain parameter even if a different one is used the first time the SQL is run? I guess maybe could be helped with cursor_sharing as well?
Since you're using 11g, Oracle should use adaptive cursor sharing by default. If you have a query that uses bind variables and the histogram on the column with skewed data indicates that different bind variable values should use different query plans, Oracle will maintain multiple query plans for the same SQL statement automatically. There would be no need to specifically hint the queries to get this behavior, it's already baked in to the optimizer.
I didn't know, but found a discussion with some solutions here on forums.oracle.com
I am working on a problem that I'm certain someone has seen before, but all I found across the net was how not to do it.
Fake table example and dynamic searching.
(Due to my low rating I cannot post images. I know I should be ashamed!!)
Clicking the add button automatically creates another row for adding more criteria choices.
(Note: My table is most definitely more complex)
Now to my issue, I thought I knew how to handle the SQL for this task, but I really don't. The only examples of what I should do are not meant for this sort of dynamic table querying. The examples didn't have the ability to create as as many search filters as a user pleases (or perhaps my understanding was lacking).
Please let me know if my uploaded image is not of good enough quality or if I have not given enough information.
I'm really curious about the best practice for this situation. Thank you in advance.
I had a similar question. You can use dynamic sql with the sp_executesql stored proc where you actually build your select statement as a string and pass it in.
Or you might be able to write a stored proc kinda like the one I created where you have all of the conditions in the where clause but the NULL values are ignored.
Here's the stored proc I came up with for my scenario:
How do I avoid dynamic SQL when using an undetermined number of parameters?
The advantage with the parameterized stored proc I wrote is that I'm able to avoid the SQL injection risks associated with dynamic SQL.
Two main choices:
Linq to Sql allows you to compose a query, add to it, add to it again, and it won't actually compile and execute a SQL statement until you iterate the results.
Or you can use dynamic SQL. The trick to making this easy is the "WHERE (1=1)" technique, but you do have to be careful to use parameters (to avoid SQL injection attacks) and build your sql statements carefully.
The original post:
Write a sql for searching with multiple conditions
select * from thetable
where (#name='' or [name]=#name) and (#age=0 or age=#age)
However, the above query forces table scan. For better performance and more complex scenario (I guess you simplified the question in you original post), consider use dynamic sql. By the way, Linq to SQL can help you build dynamic SQL very easily, like the following:
IQueryable<Person> persons = db.Persons;
if (!string.IsNullOrEmpty(name)) persons = persons.Where(p=>p.Name==name);
if (age != 0) persons = persons.Where(p=>p.Age=age);
Check out SqlBuilder, a utility for Dynamic SQL.
a) What do we consider as a dynamic sql statement?
Any sql statement that dynamically adds a clause(s) or even just a part of a clause to a SQL string?
b)Aren’t then parameterized strings that use placeholders for dynamically supplied values also considered dynamic sql statements?
thanx
A dynamic SQL statement is a statement that is built at execution time. The emphasis lies on statement. So, it isn't dynamic SQL if you just supply a value at execution time.
Dynamic SQL statements generally refers those that are constructed using string concatenation.
"SELECT name FROM names WHERE id=" + this.id;
"SELECT name FROM names WHERE id=" + this.id + " AND age=" this.age;
Parameterized queries are also dynamic but not in terms of construct. You can only change parameters but you can't change the structure of the statement i.e add WHERE clauses.
Parameterized queries are often at the database level so the database can cache the execution plan of the query and use it over and over. Not quite possible in the first case since a simple change in the text or the order of the where clauses can cause the database to not recognize the previously cached execution plan and start over.
The first construct is also vulnerable to SQL injection since it is hard to validate input values for attempts to inject rogue SQL.
Certainly anything involving EXEC (#sql) or EXEC sp_ExecuteSQL #sql, ... (i.e. dynamic at the database itself) would qualify, but I guess you could argue that any SQL generated at runtime (rather than fixed at build / install) would qualify.
So yes, you could argue that a runtime-generated, yet correctly parameterized query is "dynamic" (for example, LINQ-to-SQL generated queries), but to be honest as long as it doesn't expose you to injection attacks I don't care about the name ;-p
Dynamic sql is basically just any sql that is not fully constructed until runtime. Its generated on-the-fly by concatenating runtime values into a statement. Could be any part of an sql statement
A. Anything that will cause the DB server to evaluate strings as SQL.
B. No, as they still go through the DB driver/provider and get cleaned up.
For point b) You already know the statement and you pass in known parameters (hopefully type safe and not string literals).
I consider a dynamic SQL statement to be one that accepts new values at runtime in order to return a different result. "New values", by my reckoning, can be a different ORDER BY, new WHERE criteria, different field selections, etc.
a) What do we consider as a dynamic sql statement?
Any sql statement that dynamically adds a clause(s) or even just a part of a clause to a SQL string?
Both - any query altered/tailored prior to execution.
b)Aren’t then parameterized strings that use placeholders for dynamically supplied values also considered dynamic sql statements?
Parameterized queries, AKA using bind variables, are supplying different filter criteria to the query. You can use (my_variable IS NULL OR ...), but OR performance is generally terrible on any db & the approach destroys sargability.
Dynamic SQL generally deals with tailoring the query to include other logic, like JOINs that only need to be included if a specific parameter is set. However, there are limitations like the IN clause not supporting converting a comma delimited string as it's list of options - for this you would have to use dynamic SQL, or handle the comma delimited list in another fashion (CLR, pipelined into a temp table, etc).
I see where you're going with this, but one somewhat objective criteria which defines a particular situation as Dynamic SQL vs. say a prepared statement is...
...the fact that dynamic statements cause the SQL server to fully evaluate the query, to define a query plan etc.
With prepared statements, SQL can (and does unless explicitly asked) cache the query plan (and in some cases, even gather statistics about the returns etc.).
[Edit: effectively, even dynamic SQL statements are cached, but such cached plans have a much smaller chance of being reused because the exact same query would need to be received anew for this to happen (unlike with parametrized queries where the plan is reused even with distinct parameter values and of course, unless "WITH RECOMPILE")]
So in the cases from the question, both a) and b) would be considered dynamic SQL, because, in the case of b), the substitution takes place outside of SQL. This is not a prepared statement. SQL will see it as a totally novel statement, as it doesn't know that you are merely changing the search values.
Now... beyond the SQL-centric definition of dynamic SQL, it may be useful to distinguish between various forms of dynamic SQL, such as say your a) and b) cases. Dynamic SQL has had a bad rep for some time, some of it related to SQL injection awareness. I've seen applications where dynamic SQL is produced, and executed, within a Stored Procedure, and because it is technically "SQL-side" some people tended to accept it (even though SQL-injecting in particular may not have been addressed)...
The point I'm trying to make here is that building a query, dynamically, from various contextual elements is often needed, and one may be better advised to implement this in "application" layers (well, call it application-side layers, for indeed, it can and should be separate from application per-se), where the programming language and associated data structures are typically easier and more expressive than say T-SQL and such. So jamming this into SQL for the sake of calling it "Data-side" isn't a good thing, in my opinion.
A "static" query is one that remains the same at all times. For example, the "Tags" button on Stackoverflow, or the "7 days" button on Digg. In short, they always map to a specific database query, so you can create them at design time.
But I am trying to figure out how to do "dynamic" queries where the user basically dictates how the database query will be created at runtime. For example, on Stackoverflow, you can combine tags and filter the posts in ways you choose. That's a dynamic query albeit a very simple one since what you can combine is within the world of tags. A more complicated example is if you could combine tags and users.
First of all, when you have a dynamic query, it sounds like you can no longer use the substitution api to avoid sql injection since the query elements will depend on what the user decided to include in the query. I can't see how else to build this query other than using string append.
Secondly, the query could potentially span multiple tables. For example, if SO allows users to filter based on Users and Tags, and these probably live in two different tables, building the query gets a bit more complicated than just appending columns and WHERE clauses.
How do I go about implementing something like this?
The first rule is that users are allowed to specify values in SQL expressions, but not SQL syntax. All query syntax should be literally specified by your code, not user input. The values that the user specifies can be provided to the SQL as query parameters. This is the most effective way to limit the risk of SQL injection.
Many applications need to "build" SQL queries through code, because as you point out, some expressions, table joins, order by criteria, and so on depend on the user's choices. When you build a SQL query piece by piece, it's sometimes difficult to ensure that the result is valid SQL syntax.
I worked on a PHP class called Zend_Db_Select that provides an API to help with this. If you like PHP, you could look at that code for ideas. It doesn't handle any query imaginable, but it does a lot.
Some other PHP database frameworks have similar solutions.
Though not a general solution, here are some steps that you can take to mitigate the dynamic yet safe query issue.
Criteria in which a column value belongs in a set of values whose cardinality is arbitrary does not need to be dynamic. Consider using either the instr function or the use of a special filtering table in which you join against. This approach can be easily extended to multiple columns as long as the number of columns is known. Filtering on users and tags could easily be handled with this approach.
When the number of columns in the filtering criteria is arbitrary yet small, consider using different static queries for each possibility.
Only when the number of columns in the filtering criteria is arbitrary and potentially large should you consider using dynamic queries. In which case...
To be safe from SQL injection, either build or obtain a library that defends against that attack. Though more difficult, this is not an impossible task. This is mostly about escaping SQL string delimiters in the values to filter for.
To be safe from expensive queries, consider using views that are specially crafted for this purpose and some up front logic to limit how those views will get invoked. This is the most challenging in terms of developer time and effort.
If you were using python to access your database, I would suggest you use the Django model system. There are many similar apis both for python and for other languages (notably in ruby on rails). I am saving so much time by avoiding the need to talk directly to the database with SQL.
From the example link:
#Model definition
class Blog(models.Model):
name = models.CharField(max_length=100)
tagline = models.TextField()
def __unicode__(self):
return self.name
Model usage (this is effectively an insert statement)
from mysite.blog.models import Blog
b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.')
b.save()
The queries get much more complex - you pass around a query object and you can add filters / sort elements to it. When you finally are ready to use the query, Django creates an SQL statment that reflects all the ways you adjusted the query object. I think that it is very cute.
Other advantages of this abstraction
Your models can be created as database tables with foreign keys and constraints by Django
Many databases are supported (Postgresql, Mysql, sql lite, etc)
DJango analyses your templates and creates an automatic admin site out of them.
Well the options have to map to something.
A SQL query string CONCAT isn't a problem if you still use parameters for the options.