Dynamic sql - Is this a good approach? - sql

I have a table that has a column named category. The column category has values like "6,7,9,22,44". I would like to know if this approach is an efficient approach for SQL Server 2005 or 2008. I basically pass in the "6,7,9,22,44" value to the stored procedure. I am open links/suggestions for a better approach.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Note: I should clarify that I am maintaining an app written by someone else and the category column already has comma separated values. I would have placed the data in a separate table but I won't be able to do that in the short term. With regards to SQL Injection - The column is populated by values checked from a checkbox list. Users cannot type in or enter any data that would affect that specific stored proc. Is SQL injection still possible with that scenario? Please explain how. Thanks.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CREATE PROCEDURE [dbo].[GetCategories]
-- Add the parameter that contain the string that is passed in
#catstring varchar(300)
AS
DECLARE #SQLQuery nvarchar(500);
BEGIN
/* Build Transact-SQL String with parameter value */
SET #SQLQuery = 'select id,category from categoryTable where id in ('+ #catstring +')'
/* Execute Transact-SQL String */
EXECUTE(#SQLQuery)
END
GO

This is a very bad idea because it leaves your system open to SQL injection. A malicious user might be able to pass in a string like "') DROP TABLE categoryTable --" (notice the single quote at the beginning of the string), or something worse.
There are plenty of ways to protect yourself against SQL injection, and there are plenty of resources out there. I would suggest you research it on your own since it's too large an area to cover in a single question.
But for your particular problem, I would suggest this website to learn how to create a Table Valued Parameter to pass in a list of integer IDs: http://www.sommarskog.se/arrays-in-sql-2008.html
It's only valid for SQL Server 2008, you should follow the links in that article for solutions for SQL Server 2005 if you need to.

As Jack said, this creates a possibility of SQL injection. Sommarskog has an excellent look at different ways of handling this at: The Curse and Blessing of Dynamic SQL.

Related

Create a generic procedure, which inserts data into any table

I'm currently working on a .NET application and want to make it as modular as possible. I've already created a basic SELECT procedure, which returns data by checking inputted parameters on SQL Server side.
I want to create a procedure that parses structured data as string and inserts its' contents to corresponding table in database.
For example, I have a table as
CREATE TABLE ExampleTable (
id_exampleTable int IDENTITY (1, 1) NOT NULL,
exampleColumn1 nvarchar(200) NOT NULL,
exampleColumn2 int NULL,
exampleColumn3 int NOT NULL,
CONSTRAINT pk_exampleTable PRIMARY KEY ( id_exampleTable )
)
And my procedure starts as
CREATE PROCEDURE InsertDataIntoCorrespondingTable
#dataTable nvarchar(max), --name of Table in my DB
#data nvarchar(max) --normalized string parameter as 'column1, column2, column3, etc.'
AS
BEGIN
IF #dataTable = 'table'
BEGIN
/**Parse this string and execute insert command**/
END
ELSE IF /**Other statements**/
END
TL;DR
So basically, I'm looking for a solution that can help me achieve something like this
EXEC InsertDataIntoCorrespondingTableByID(
#dataTable = 'ExampleTable',
#data = '''exampleColumn1'', 2, 3'
)
Which should be equal to just
INSERT INTO ExampleTable SELECT 'exampleColumn1', 2, 3
Sure, I can push data as INSERT statements (for each and every 14 tables inside DB...), generated inside an app, but I want to conquer T-SQL :)
This might be reasonable (to some degree) on an RDBMS that supports structured data like JSON or XML natively, but doing this the way you are planning is going to cause some real pain-in-the-rear support and, more importantly, a sql injection attack vector. I would leave this to the realm of the web backend server where it belongs.
You are likely going to invent your own structured data markup language and parser to solve this as sql server. That's a wheel that doesn't need to be reinvented. If you do end up building this, highly consider going with JSON to avoid all the issues that structured data inherently bring with it, assuming your version of sql server supports json parsing/packaging.
Your front end that packages your data into your SDML is going to have to assume column ordinals, but column ordinal is not something that one should rely on in a database. SQL Amateurs often do, I know from years in the industry and dealing with end users that are upset when a new column is introduced in a position they don't want it. Adding a column to a table shouldn't break an application. If it does, that application has bad code.
Regarding the sql injection attack vector, your SP code is going to get ugly. You'll need to parse out each item in #data into a variable of its own in order to properly parameterize your dynamic sql that is being built. See here under the "working with parameters" section for what that will look like. Failure to add this to your SP code means that values passed in that #data SDML could become executable SQL instead of literals and that would be very bad. This is not easy to solve in SP language. Where it IS easy to solve though is in the backend server code. Every database library on the planet supports parameterized query building/execution natively.
Once you have this built you will be dynamically generating an INSERT statement and dynamically generating variables or an array or some data structure to pass in parameters to the INSERT statement to avoid sql injection attacks. It's going to be dynamic, on top of dynamic, on top of dynamic which leads to:
From a support context, imagine that your application just totally throws up one day. You have to dive into investigate. You track the SDML that your front end created that caused the failure, and you open up your SP code to troubleshoot. Imagine what this code ends up looking like
It has to determine if the table exists
It has to parse the SDML to get each literal
It has to read DB metadata to get the column list
It has to dynamically write the insert statement, listing the columns from metadata and dynamically creating sql parameters for the VALUES() list.
It has to execute sending a dynamic number of variables into the dynamically generated sql.
My support staff would hang me out to dry if they had to deal with that, and I'm the one paying them.
All of this is solved by using a proper backend to handle communication, deeper validation, sql parameter binding, error catching and handling, and all the other things that backend servers are meant to do.
I believe that your back end web server should be VERY aware of the underlying data model. It should be the connection between your view, your data, and your model. Leave the database to the things it's good at (reading and writing data). Leave your front end to the things that it's good at (presenting a UI for the end user).
I suppose you could do something like this (may need a little extra work)
declare #columns varchar(max);
select #columns = string_agg(name, ', ') WITHIN GROUP ( ORDER BY column_id )
from sys.all_columns
where object_id = object_id(#dataTable);
declare #sql varchar(max) = select concat('INSERT INTO ',#dataTable,' (',#columns,') VALUES (', #data, ')')
exec sp_executesql #sql
But please don't. If this were a good idea, there would be tons of examples of how to do it. There aren't so it's probably not a good idea.
There are however tons of examples of using ORMs or auto-generated code in stead - because that way your code is maintainable, debugable and performant.

Can parameterized statement stop all SQL injection?

If yes, why are there still so many successful SQL injections? Just because some developers are too dumb to use parameterized statements?
When articles talk about parameterized queries stopping SQL attacks they don't really explain why, it's often a case of "It does, so don't ask why" -- possibly because they don't know themselves. A sure sign of a bad educator is one that can't admit they don't know something. But I digress.
When I say I found it totally understandable to be confused is simple. Imagine a dynamic SQL query
sqlQuery='SELECT * FROM custTable WHERE User=' + Username + ' AND Pass=' + password
so a simple sql injection would be just to put the Username in as ' OR 1=1--
This would effectively make the sql query:
sqlQuery='SELECT * FROM custTable WHERE User='' OR 1=1-- ' AND PASS=' + password
This says select all customers where they're username is blank ('') or 1=1, which is a boolean, equating to true. Then it uses -- to comment out the rest of the query. So this will just print out all the customer table, or do whatever you want with it, if logging in, it will log in with the first user's privileges, which can often be the administrator.
Now parameterized queries do it differently, with code like:
sqlQuery='SELECT * FROM custTable WHERE User=? AND Pass=?'
parameters.add("User", username)
parameters.add("Pass", password)
where username and password are variables pointing to the associated inputted username and password
Now at this point, you may be thinking, this doesn't change anything at all. Surely you could still just put into the username field something like Nobody OR 1=1'--, effectively making the query:
sqlQuery='SELECT * FROM custTable WHERE User=Nobody OR 1=1'-- AND Pass=?'
And this would seem like a valid argument. But, you would be wrong.
The way parameterized queries work, is that the sqlQuery is sent as a query, and the database knows exactly what this query will do, and only then will it insert the username and passwords merely as values. This means they cannot effect the query, because the database already knows what the query will do. So in this case it would look for a username of "Nobody OR 1=1'--" and a blank password, which should come up false.
This isn't a complete solution though, and input validation will still need to be done, since this won't effect other problems, such as XSS attacks, as you could still put javascript into the database. Then if this is read out onto a page, it would display it as normal javascript, depending on any output validation. So really the best thing to do is still use input validation, but using parameterized queries or stored procedures to stop any SQL attacks.
The links that I have posted in my comments to the question explain the problem very well. I've summarised my feelings on why the problem persists, below:
Those just starting out may have no awareness of SQL injection.
Some are aware of SQL injection, but think that escaping is the (only?) solution. If you do a quick Google search for php mysql query, the first page that appears is the mysql_query page, on which there is an example that shows interpolating escaped user input into a query. There's no mention (at least not that I can see) of using prepared statements instead. As others have said, there are so many tutorials out there that use parameter interpolation, that it's not really surprising how often it is still used.
A lack of understanding of how parameterized statements work. Some think that it is just a fancy means of escaping values.
Others are aware of parameterized statements, but don't use them because they have heard that they are too slow. I suspect that many people have heard how incredibly slow paramterized statements are, but have not actually done any testing of their own. As Bill Karwin pointed out in his talk, the difference in performance should rarely be used as a factor when considering the use of prepared statements. The benefits of prepare once, execute many, often appear to be forgotten, as do the improvements in security and code maintainability.
Some use parameterized statements everywhere, but with interpolation of unchecked values such as table and columns names, keywords and conditional operators. Dynamic searches, such as those that allow users to specify a number of different search fields, comparison conditions and sort order, are prime examples of this.
False sense of security when using an ORM. ORMs still allow interpolation of SQL statement parts - see 5.
Programming is a big and complex subject, database management is a big and complex subject, security is a big and complex subject. Developing a secure database application is not easy - even experienced developers can get caught out.
Many of the answers on stackoverflow don't help. When people write questions that use dynamic SQL and parameter interpolation, there is often a lack of responses that suggest using parameterized statements instead. On a few occasions, I've had people rebut my suggestion to use prepared statements - usually because of the perceived unacceptable performance overhead. I seriously doubt that those asking most of these questions are in a position where the extra few milliseconds taken to prepare a parameterized statement will have a catastrophic effect on their application.
Well good question.
The answer is more stochastic than deterministic and I will try to explain my view, using a small example.
There many references on the net that suggest us to use parameters in our queries or to use stored procedure with parameters in order to avoid SQL Injection (SQLi). I will show you that stored procedures (for instance) is not the magic stick against SQLi. The responsibility still remains on the programmer.
Consider the following SQL Server Stored Procedure that will get the user row from a table 'Users':
create procedure getUser
#name varchar(20)
,#pass varchar(20)
as
declare #sql as nvarchar(512)
set #sql = 'select usrID, usrUName, usrFullName, usrRoleID '+
'from Users '+
'where usrUName = '''+#name+''' and usrPass = '''+#pass+''''
execute(#sql)
You can get the results by passing as parameters the username and the password. Supposing the password is in free text (just for simplicity of this example) a normal call would be:
DECLARE #RC int
DECLARE #name varchar(20)
DECLARE #pass varchar(20)
EXECUTE #RC = [dbo].[getUser]
#name = 'admin'
,#pass = '!#Th1siSTheP#ssw0rd!!'
GO
But here we have a bad programming technique used by the programmer inside the stored procedure, so an attacker can execute the following:
DECLARE #RC int
DECLARE #name varchar(20)
DECLARE #pass varchar(20)
EXECUTE #RC = [TestDB].[dbo].[getUser]
#name = 'admin'
,#pass = 'any'' OR 1=1 --'
GO
The above parameters will be passed as arguments to the stored procedure and the SQL command that finally will be executed is:
select usrID, usrUName, usrFullName, usrRoleID
from Users
where usrUName = 'admin' and usrPass = 'any' OR 1=1 --'
..which will get all rows back from users
The problem here is that even we follow the principle "Create a stored procedure and pass the fields to search as parameters" the SQLi is still performed. This is because we just copy our bad programming practice inside the stored procedure. The solution to the problem is to rewrite our Stored Procedure as follows:
alter procedure getUser
#name varchar(20)
,#pass varchar(20)
as
select usrID, usrUName, usrFullName, usrRoleID
from Users
where usrUName = #name and usrPass = #pass
What I am trying to say is that the developers must learn first what an SQLi attack is and how can be performed and then to safeguard their code accordingly. Blindly following 'best practices' is not always the safer way... and maybe this is why we have so many 'best practices'- failures!
Yes, the use of prepared statements stops all SQL injections, at least in theory. In practice, parameterized statements may not be real prepared statements, e.g. PDO in PHP emulates them by default so it's open to an edge case attack.
If you're using real prepared statements, everything is safe. Well, at least as long as you don't concatenate unsafe SQL into your query as reaction to not being able to prepare table names for example.
If yes, why are there still so many successful SQL injections? Just because some developers are too dumb to use parameterized statements?
Yes, education is the main point here, and legacy code bases. Many tutorials use escaping and those can't be easily removed from the web, unfortunately.
I avoid absolutes in programming; there is always an exception. I highly recommend stored procedures and command objects. A majority of my back ground is with SQL Server, but I do play with MySql from time to time. There are many advantages to stored procedures including cached query plans; yes, this can be accomplished with parameters and inline SQL, but that opens up more possibilities for injection attacks and doesn't help with separation of concerns. For me it's also much easier to secure a database as my applications generally only have execute permission for said stored procedures. Without direct table/view access it's much more difficult to inject anything. If the applications user is compromised one only has permission to execute exactly what was pre-defined.
My two cents.
I wouldn't say "dumb".
I think the tutorials are the problem. Most SQL tutorials, books, whatever explain SQL with inlined values, not mentioning bind parameters at all. People learning from these tutorials don't have a chance to learn it right.
Because most code isn't written with security in mind, and management, given a choice between adding features (especially something visible that can be sold) and security/stability/reliability (which is a much harder sell) they will almost invariably choose the former. Security is only a concern when it becomes a problem.
Can parameterized statement stop all SQL injection?
Yes, as long as your database driver offers a placeholder for the every possible SQL literal. Most prepared statement drivers don't. Say, you'd never find a placeholder for a field name or for an array of values. Which will make a developer to fall back into tailoring a query by hand, using concatenation and manual formatting. With predicted outcome.
That's why I made my Mysql wrapper for PHP that supports most of literals that can be added to the query dynamically, including arrays and identifiers.
If yes, why are there still so many successful SQL injections? Just because some developers are too dumb to use parameterized statements?
As you can see, in reality it's just impossible to have all your queries parameterized, even if you're not dumb.
First my answer to your first question: Yes, as far as I know, by using parameterized queries, SQL injections will not be possible anymore. As to your following questions, I am not sure and can only give you my opinion on the reasons:
I think it's easier to "just" write the SQL query string by concatenate some different parts (maybe even dependent on some logical checks) together with the values to be inserted.
It's just creating the query and executing it.
Another advantage is that you can print (echo, output or whatever) the sql query string and then use this string for a manual query to the database engine.
When working with prepared statements, you always have at least one step more:
You have to build your query (including the parameters, of course)
You have to prepare the query on the server
You have to bind the parameters to the actual values you want to use for your query
You have to execute the query.
That's somewhat more work (and not so straightforward to program) especially for some "quick and dirty" jobs which often prove to be very long-lived...
Best regards,
Box
SQL injection is a subset of the larger problem of code injection, where data and code are provided over the same channel and data is mistaken for code. Parameterized queries prevent this from occurring by forming the query using context about what is data and what is code.
In some specific cases, this is not sufficient. In many DBMSes, it's possible to dynamically execute SQL with stored procedures, introducing a SQL injection flaw at the DBMS level. Calling such a stored procedure using parameterized queries will not prevent the SQL injection in the procedure from being exploited. Another example can be seen in this blog post.
More commonly, developers use the functionality incorrectly. Commonly the code looks something like this when done correctly:
db.parameterize_query("select foo from bar where baz = '?'", user_input)
Some developers will concatenate strings together and then use a parameterized query, which doesn't actually make the aforementioned data/code distinction that provides the security guarantees we're looking for:
db.parameterize_query("select foo from bar where baz = '" + user_input + "'")
Correct usage of parameterized queries provides very strong, but not impenetrable, protection against SQL injection attacks.
To protect your application from SQL injection, perform the following steps:
Step 1. Constrain input.
Step 2. Use parameters with stored procedures.
Step 3. Use parameters with dynamic SQL.
Refer to http://msdn.microsoft.com/en-us/library/ff648339.aspx
even if
prepared statements are properly used throughout the web application’s own
code, SQL injection flaws may still exist if database code components construct
queries from user input in an unsafe manner.
The following is an example of a stored procedure that is vulnerable to SQL
injection in the #name parameter:
CREATE PROCEDURE show_current_orders
(#name varchar(400) = NULL)
AS
DECLARE #sql nvarchar(4000)
SELECT #sql = ‘SELECT id_num, searchstring FROM searchorders WHERE ‘ +
‘searchstring = ‘’’ + #name + ‘’’’;
EXEC (#sql)
GO
Even if the application passes the user-supplied name value to the stored
procedure in a safe manner, the procedure itself concatenates this directly into
a dynamic query and therefore is vulnerable.

SQL Table "Pointer"?

Using SQl Server 2000 I have a stored procedure that joins 2 tables and then returns the data. I want this sp to be able to do this for whatever table name I pass into it, otherwise I'll have the exact same code with the exception of the table name 20 or so times in a giant if statement. Basically, how do I use a variable to point to a table, or is that allowed? Thanks.
You need dynamic SQL, start here The Curse and Blessings of Dynamic SQL to learn how to do it correctly so that nobody drops your tables or does anything else possible with SQL Injection
Try building the SELECT as a string and then calling EXEC and passing the string.
e.g.
declare #sql varchar(500)
set #sql = 'select whatever from ' + #tableName
exec #sql
One proc to do anything is usually a bad bad idea. 20 possible tables I might need to go to depending on the circumstances almost always indicates a database design that is bad. Read the article Denis posted on the curses and blessings of dynamic SQL.

Stored procedure, pass table name as a parameter

I have about half a dozen generic, but fairly complex stored procedures and functions that I would like to use in a more generic fashion.
Ideally I'd like to be able to pass the table name as a parameter to the procedure, as currently it is hard coded.
The research I have done suggests I need to convert all existing SQL within my procedures to use dynamic SQL in order to splice in the dynamic table name from the parameter, however I was wondering if there is a easier way by referencing the table in another way?
For example:
SELECT * FROM #MyTable WHERE...
If so, how do I set the #MyTable variable from the table name?
I am using SQL Server 2005.
Dynamic SQL is the only way to do this, but I'd reconsider the architecture of your application if it requires this. SQL isn't very good at "generalized" code. It works best when it's designed and coded to do individual tasks.
Selecting from TableA is not the same as selecting from TableB, even if the select statements look the same. There may be different indexes, different table sizes, data distribution, etc.
You could generate your individual stored procedures, which is a common approach. Have a code generator that creates the various select stored procedures for the tables that you need. Each table would have its own SP(s), which you could then link into your application.
I've written these kinds of generators in T-SQL, but you could easily do it with most programming languages. It's pretty basic stuff.
Just to add one more thing since Scott E brought up ORMs... you should also be able to use these stored procedures with most sophisticated ORMs.
You'd have to use dynamic sql. But don't do that! You're better off using an ORM.
EXEC(N'SELECT * from ' + #MyTable + N' WHERE ... ')
You can use dynamic Sql, but check that the object exists first unless you can 100% trust the source of that parameter. It's likely that there will be a performance hit as SQL server won't be able to re-use the same execution plan for different parameters.
IF OBJECT_ID(#tablename, N'U') IS NOT NULL
BEGIN
--dynamic sql
END
ALTER procedure [dbo].[test](#table_name varchar(max))
AS
BEGIN
declare #tablename varchar(max)=#table_name;
declare #statement varchar(max);
set #statement = 'Select * from ' + #tablename;
execute (#statement);
END

How do I supply the FROM clause of a SELECT statement from a UDF parameter

In the application I'm working on porting to the web, we currently dynamically access different tables at runtime from run to run, based on a "template" string that is specified. I would like to move the burden of doing that back to the database now that we are moving to SQL server, so I don't have to mess with a dynamic GridView. I thought of writing a Table-valued UDF with a parameter for the table name and one for the query WHERE clause.
I entered the following for my UDF but obviously it doesn't work. Is there any way to take a varchar or string of some kind and get a table reference that can work in the FROM clause?
CREATE FUNCTION TemplateSelector
(
#template varchar(40),
#code varchar(80)
)
RETURNS TABLE
AS
RETURN
(
SELECT * FROM #template WHERE ProductionCode = #code
)
Or some other way of getting a result set similar in concept to this. Basically all records in the table indicated by the varchar #template with the matching ProductionCode of the #code.
I get the error "Must declare the table variable "#template"", so SQL server probably things I'm trying to select from a table variable.
On Edit: Yeah I don't need to do it in a function, I can run Stored Procs, I've just not written any of them before.
CREATE PROCEDURE TemplateSelector
(
#template varchar(40),
#code varchar(80)
)
AS
EXEC('SELECT * FROM ' + #template + ' WHERE ProductionCode = ' + #code)
This works, though it's not a UDF.
The only way to do this is with the exec command.
Also, you have to move it out to a stored proc instead of a function. Apparently functions can't execute dynamic sql.
The only way that this would be possible is with dynamic SQL, however, dynamic SQL is not supported by SqlServer within a function.
I'm sorry to say that I'm quite sure that it is NOT possible to do this within a function.
If you were working with stored procedures it would be possible.
Also, it should be noted that, be replacing the table name in the query, you've destroyed SQL Server's ability to cache the execution plan for the query. This pretty much reduces the advantage of using a UDF or SP to nil. You might as well just call the SQL query directly.
I have a finite number of tables that I want to be able to address, so I could writing something using IF, that tests #template for matches with a number of values and for each match runs
SELECT * FROM TEMPLATENAME WHERE ProductionCode = #code
It sounds like that is a better option
If you have numerous tables with identical structure, it usually means you haven't designed your database in a normal form. You should unify these into one table. You may need to give this table one more attribute column to distinguish the data sets.