It seems like such a simple solution to the many problems that can arise from insecure services and applications. But I'm not sure if it's possible, or maybe nobody's thought of this idea yet...
Instead of leaving it up to programmers/developers to ensure that their applications use stored procedures/parameterised queries/escape strings etc to help prevent sql injection/other attacks - why don't the people who make the databases just build these security features into the databases so that when an update or insert query is performed on the database, the database secures/sanitizes the string before it is inserted into the database?
The database would not necessarily know the context of what is going on. What is malicious for one application is not malicious for another. Sometimes the intent IS to
drop table users--
It is much better to let the database do what it does best, arranging data, and let the developers worry about the security implementations.
The problem is that the database cannot readily tell whether the command it is requested to execute is legitimate or not - it is syntactically valid and there could be a valid reason for the user to request that it be executed.
There are heuristics that the DBMS could apply. For example, if a single request combined both a SELECT operation and a DELETE operation, it might be possible to infer that this is more likely to be illegitimate than legitimate - and the DBMS could reject that combined operation. But it is hard to deal with a query where the WHERE condition has been weakened to the point that it shows more data than it was supposed to. A UNION query can deliberately select from multiple tables. It is not sufficient to show that there is a weak condition and a strong condition OR'd together - that could be legitimate.
Overall, then, the problem is that the DBMS is supposed to be able to execute a vast range of queries - so it is essentially impossible to be sure that any query it is given to execute is, or is not, legitimate.
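To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module (an assumption for illustration; the table, the helper names and the payload are made up, and the thread is not about SQLite specifically). The database runs both statements without complaint, because both are syntactically valid. What changes the outcome is whether the application passes the input as SQL text or as a bound value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")


def find_unsafe(name):
    # String concatenation: the input becomes part of the SQL text itself.
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()


def find_bound(name):
    # Placeholder: the value is bound to the statement and never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


# Hypothetical attacker input that weakens the WHERE condition.
payload = "x' OR '1'='1"
print(find_unsafe(payload))  # the weakened condition matches every row
print(find_bound(payload))   # no user has this literal name
```

The database has no way to know that the first result was unintended, which is why the binding has to happen on the application side.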
The proper way to access the database is with stored procedures. If you were using SQL Server and C#/VB.NET you could use LINQ to SQL, which allows you to build the query in the language, which then gets turned into a parameterized query. Good stuff.
Related
Here in the company I work, we have a support tool that, among other things, provides a page that allows the user to run SELECT queries. It should prevent the user from running UPDATE, INSERT, DELETE, DROP, etc. Besides that, every select statement is accepted.
The way it works is by executing
SELECT * FROM (<query>)
so any statement besides a SELECT should fail due to a syntax error.
In my opinion, this approach is not enough to prevent an attack, since anything that changes the outer query could break the security. I argued that, along with that solution, the tool should also check the syntax of the inner query. My colleagues asked me to prove that the current solution is unsafe.
To test it, I tried to write something like
SELECT * from dual); DROP table users --
But it failed because of the ; character that is not accepted by the SQL connector.
So, is there any way to append a modification statement in a query like that?
By the way, it is Oracle SQL.
EDIT:
Just to make it clearer: I know this is not a good approach. But I must prove it to my colleagues to justify a code modification. Theoretical answers are good, but I think a real injection would be more effective.
The protection is based on the idea/assumption that "update queries" are never going to produce a result table (which is what it would take to make it a valid sub-expression to your SELECT FROM (...) ).
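The assumption can be seen in a small sketch. It uses Python's sqlite3 module rather than Oracle (an assumption made only so the example is runnable), and `run_wrapped` is a hypothetical stand-in for the support tool's wrapper. It shows the assumption holding in one engine; it says nothing about Oracle-specific extensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")


def run_wrapped(user_sql):
    # Hypothetical helper mirroring the tool: the user's text becomes a subquery.
    return conn.execute("SELECT * FROM (" + user_sql + ")").fetchall()


print(run_wrapped("SELECT name FROM users"))

try:
    run_wrapped("DELETE FROM users")
except sqlite3.OperationalError as exc:
    # A DELETE is not a valid table expression here, so parsing fails.
    print("rejected:", exc)
```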
Proprietary engines with proprietary language extensions might undermine that assumption. And although admittedly it still seems unlikely, in the world of proprietary extensions there really is some crazy stuff flying around so don't assume too lightly.
Maybe also beware of expression compilers that coerce "does not return a table" into "an empty table of some kind". You know. Because any system must do anything it can to make the user action succeed instead of fail/crash/...
And maybe also consider that if "query whatever you like" is really the function that is needed, then your DBMS most likely already has some tool or component that actually allows that ... (and is even designed specifically for the purpose).
I'm going to assume that it's deemed acceptable for users to see any data accessible from that account (as that is what this seems designed to do).
It's also fairly trivial to perform a Denial of Service with this, either with an inefficient query, or with select for update, which could be used to lock critical tables.
Oracle is a feature-rich DB, and that means there is likely a variety of ways to run DML from within a query. You would need to find an inline PL/SQL function that allows you to perform DML or have other side effects. It will depend on the specific schema as to what packages are available - the XML DB packages have some mechanisms to run arbitrary SQL, the UTL_HTTP packages can often be used to launch network attacks, and the Java functionality is quite powerful.
The correct way to protect against this is to use the DB security mechanisms - run this against a read-only schema (one with query privs only on the tables).
I'm making a simple web interface to allow execution of queries against a database (yeah, I know it's a really bad practice, but it's a private website used only by a few trusted users who currently use a DB manager directly to execute these queries, so the web interface is only there to automate the process).
The thing is that, for safety, whenever an UPDATE query is detected I want to first execute a SELECT statement "equivalent" to the update (keeping the WHERE clause) to retrieve how many records are going to be affected prior to executing the UPDATE.
The idea is to replace "UPDATE" by "SELECT * FROM" and remove the whole "SET" clause without removing the "WHERE".
I'm trying to replace UPDATE\s*(.*?)\s*SET.*(\s*WHERE\s*.*) with SELECT * FROM \1 \2 and similar, but I'm having trouble when there is no "WHERE" clause (uncommon, but possible).
edit: It's pretty hard to explain why I need this to be done like this, but I do, I know about stored procedures, query builders, transactions, etc... but for my case it's not what I need to be able to do.
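For what it's worth, the missing-WHERE case can be handled by making the WHERE group optional. Below is a minimal Python sketch under loud assumptions: `update_to_count` is a hypothetical helper, it returns a COUNT(*) query rather than SELECT *, and it only copes with simple single-table UPDATE statements. A regex cannot parse SQL in general, so a WHERE inside a string literal or a subquery in the SET clause will fool it.

```python
import re

# The SET part is matched lazily; the WHERE part is optional.
UPDATE_RE = re.compile(
    r"^\s*UPDATE\s+(?P<table>\S+)\s+SET\s+.*?"
    r"(?:\s+WHERE\s+(?P<where>.*?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def update_to_count(sql):
    """Turn a simple UPDATE into a SELECT COUNT(*) with the same WHERE clause."""
    match = UPDATE_RE.match(sql)
    if match is None:
        raise ValueError("not a simple UPDATE statement")
    query = "SELECT COUNT(*) FROM " + match.group("table")
    if match.group("where"):
        query += " WHERE " + match.group("where")
    return query


print(update_to_count("UPDATE users SET name = 'x' WHERE id = 1"))
print(update_to_count("update users set active = 0;"))
```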
You should fix your design. There is nothing wrong with users updating data in a database. The question is how they do it. My strong suggestion is to wrap the update statements in stored procedures. Then only allow updates through the stored procedures.
There are several main reasons why I prefer this approach:
I think a well-designed API produces more stable and maintainable code.
The stored procedures control security.
The stored procedures allow better logging of what is happening in the database.
The stored procedures provide control over what users can do.
In your case, though, they offer another advantage. Because all the update code is on the database-side, you know what the update statements look like. So, you can then decide how you want to get the "pre-counts" (which is what I assume you are looking for).
EDIT:
There is also an important flaw in your design (as you describe it). The data might change between the update and the select. If you use stored procedures, there are ways to address this. For instance, you can wrap the operations in a transaction. Or, you use a SELECT to get the rows to be updated, lock those rows (depending on the database), and only do the update on those rows.
I have read many strong views (both for and against) SPs or DS.
I am writing a query engine in C++ (MySQL backend for now, though I may decide to go with a C++ ORM). I can't decide whether to write an SP, or to dynamically create the SQL and send the query to the db engine.
Any tips on how to decide?
Here's the simple answer:
If your programmers do both database and coding work, keep the SQL with the app. It's easier to maintain that way. Otherwise, let the DB guys handle it in SPs.
You have more control over the mechanisms outside the database. The biggest win for taking care of this outside the database is simply maintenance (in my mind). It'd be slightly hard to version control the SP vs the code you generate outside the database. One more thing to keep track of.
While we're on the topic, it's similar to handling data/schema migrations. It's annoyingly complex to version/handle schema migrations, if you don't already have a mechanism for this, you will have yet another thing you'll need to manage. It comes down to simply being easier to manage/version these things outside the database.
Consider the scenario where you have a bug in your SP. Now it needs to be changed, but then you hop over to another developer's database/sandbox. What version is the sandbox and the SP? Now you have to track multiple versions.
One of the main differentiators is whether you are writing the "one true front end" or whether the database is the central piece of your application.
If you are going to have multiple front ends stored procedures make a lot of sense because you reduce your maintenance overhead. If you are writing only one interface, stored procedures are a pain, because you lose a lot of flexibility in changing your data set as your front end needs change, plus you now have to do code maintenance, version control, etc. in two places. Databases are a real pain to keep in sync with code repositories.
Finally, if you are coding for multiple databases (Oracle and SQL compatible code, for example), I'd avoid stored procedures completely.
You may in certain rare circumstances, after profiling, determine that some limited stored procedures are useful to you. This situation comes up way less than people think it does.
The main scenarios when you MUST have the SP are:
1) When you have very complex set of queries with heavy compile overhead and data drift low enough that recompiling is not needed on a regular basis.
2) When the "Only True" logic for accessing the specific data set is VERY complicated, needs to be accessed from several different codebases on different platforms (so writing multiple APIs in code is much more expensive).
Any other scenario, it's debatable, and can be decided one way or another.
I must also say that the other posters' arguments about versioning are not really such a big deal in my experience - having your SPs in version control is as easy as creating a "sql/db_name" directory structure and having easy basic "database release" script which releases the SP code from the version control location to the database. Every company I worked for had some kind of setup like this, central one run by DBAs or departmental one run by developers.
The one thing you want to avoid is to have your business logic spread across multiple tiers of your application. Database DDL and DML are difficult enough to keep in sync with an application code base as it is.
My recommendation is to create a good relational schema, and put in all your constraints and triggers so that the data retains integrity even if somebody goes to the database and tries to do something through some command line SQL.
Put all your business logic in an application or service that calls (static/dynamic) SQL and then wraps the business functionality you are trying to expose.
Stored-procedures have two purposes that I can think of.
An aid to simplifying data access. The stored procedure does not have any business logic in it; it just knows about the structure of the data and exposes an interface to isolate accessing three tables and a view just to get a single piece of information.
Mapping the Domain Model to the Data Model. Stored procedures can assist in making the Data Model look like a given Domain Model.
After the program has been completed and has been profiled there are often performance issues with the pre 1.0 release. Stored procedures do offer batching of SQL without traffic needing to go back and forth between the DBMS and the Application. That being said in rare and extreme cases due to performance a few business rules might need to be migrated to the Stored-Procedure side. Make sure to document any exceptions to the architectural philosophy in multiple prominent places.
Stored Procedures are ideal for:
Creating reusable abstractions over complex queries;
Enforcing specific types of insertions/updates to tables (if you also deny permissions to the table);
Performing privileged operations that the logged-in user wouldn't normally be allowed to do;
Guaranteeing a consistent execution plan;
Extending the capabilities of an ORM (batch updates, hierarchy queries, etc.)
Dynamic SQL is ideal for:
Variable search arguments or output columns:
Optional search conditions
Pivot tables
IN clauses with user-specified values
ORM implementations (most can use SPs, but can't be built entirely on them);
DDL and administrative scripts.
They solve different problems, really. Use whichever one is more appropriate to the task at hand, and don't restrict yourself to just one or the other. After you work on database code for a while you'll start to get a more intuitive feel for these things; you'll find yourself banging together some rat's nest of strings for a query and think, "this should really go in a stored procedure."
Final note: Because this question implies a certain level of inexperience with SQL, I feel obliged to say, don't forget that you still need to parameterize your queries when you write dynamic SQL. Parameters aren't just for stored procedures.
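As an illustration of that final note, here is a minimal Python sketch (sqlite3 is an assumption, and the `customers` table and `build_search` helper are made up). It builds a query with optional search conditions and a user-specified IN list. The SQL text varies, but every value is still passed as a parameter.

```python
import sqlite3


def build_search(name=None, cities=()):
    """Assemble optional conditions; values travel as parameters, never as SQL text."""
    clauses, params = [], []
    if name is not None:
        clauses.append("name = ?")
        params.append(name)
    if cities:
        # One placeholder per value in the IN list.
        placeholders = ", ".join("?" for _ in cities)
        clauses.append("city IN (" + placeholders + ")")
        params.extend(cities)
    sql = "SELECT name, city FROM customers"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY name", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ann", "Oslo"), ("Bob", "Rome"), ("Cy", "Lima")],
)

sql, params = build_search(cities=["Oslo", "Rome"])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```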
DS is more flexible. SP approach makes your system more manageable.
If I have a valid SQL string, is there any way I can execute it in my PL/SQL - but guarantee that it is a SELECT statement only... without doing complex parsing to ensure it doesn't have any escape characters/nested commands or any of that jazz?
EDIT:
What I'm really trying to accomplish is a generic querying tool built into my application. It has a friendly, domain-specific GUI and lets a very non-tech user create reasonably complex queries. The tool handles versioning of the searches, adds inner joins where needed, and does some other application-specific stuff you wouldn't find in a typical SQL DEV type tool.
The application successfully creates a SQL Query. The problem is that I also allow users to directly enter their own SQL. I'm worried about potential SQL injection type issues.
I'm not sure if this is the appropriate place; but, in addition to the question - if anyone could recommend a good Oracle book that would get me up to speed on things of this nature - I'd very much appreciate it.
One solution is to GRANT your user only SELECT privilege if that's the only thing the user is authorized to do.
See "Oracle Database Security Guide: Introduction to Privileges"
However, I don't think that your application is necessarily secure just because you restrict the queries to SELECT. There are examples of mischief that can be perpetrated when you allow unsafe use of SELECT queries.
Re your clarified question: I've studied SQL injection and written about it quite a bit. What I can advise as a general rule is: Never execute user input as code. That's how SQL injection occurs.
You can design a domain-specific language and map user input to SQL operations, but make sure there's a layer that translates user choices to the database schema. If you separate user input from your SQL code by introducing a mapping layer, then you should be all right.
See also my answer to "How do I protect this function from sql injection."
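A mapping layer of the kind described above could be sketched like this in Python. Everything here is hypothetical: the `FIELDS` and `OPERATORS` tables, the `orders` schema and the `translate` helper are invented for illustration. User choices are looked up in fixed dictionaries, so only known column names and operators ever reach the SQL text, and the value is bound as a parameter.

```python
# Hypothetical mapping from user-facing choices to schema names and operators.
FIELDS = {"customer": "cust_name", "total": "order_total"}
OPERATORS = {"equals": "=", "at least": ">="}


def translate(field, operator, value):
    """Translate one user choice into SQL; unknown choices raise KeyError."""
    column = FIELDS[field]    # only whitelisted column names reach the SQL text
    op = OPERATORS[operator]  # only whitelisted operators reach the SQL text
    sql = "SELECT cust_name, order_total FROM orders WHERE " + column + " " + op + " ?"
    return sql, (value,)


print(translate("total", "at least", 100))
```

Anything the user types that is not a key in the dictionaries is rejected before any SQL is built.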
Oracle comes with a lot of execute privileges granted to public. As such even a user with no explicit insert/update/delete/execute privileges can do mischief.
Speaking of mischief, even with a SELECT a user could cause trouble. A "SELECT * FROM table FOR UPDATE of column" would lock the entire table. SELECT...FOR UPDATE only requires SELECT privileges.
Dumb queries (e.g. Cartesian joins) could bring a database to its knees (though Resource Manager should be able to block most of them by only allowing queries that would do less than a specified amount of I/O or CPU).
How about giving them a list of approved SQLs to execute and a process for them to nominate SQLs for inclusion ?
If you're giving the user a text area so they can type whatever they want, hey, SQL injection is what you want.
I wouldn't leave the door so open like that, but if I was forced to do it, then I'd run an explain plan on whatever the user wants to do. The optimizer will parse the query and put all the information about the SQL statement in the plan_table table, which you can then query to check if it's really a select operation, which tables/indexes from which schemas are being accessed, if the where clause is something you approve of, if there's any "bad" operations, such as Cartesian joins or full table scans, etc.
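The same idea can be tried out in a small Python sketch. It uses SQLite's EXPLAIN QUERY PLAN as a stand-in for Oracle's explain plan and plan_table (an assumption; the output format and the available details differ between engines and versions), and `full_scans` is a hypothetical helper that flags plan steps scanning a whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY, v TEXT)")


def full_scans(sql):
    """Return plan steps that scan a whole table; the statement itself is not run."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable step description is the last column of each plan row.
    return [row[-1] for row in plan if row[-1].startswith("SCAN")]


print(full_scans("SELECT * FROM a, b"))            # Cartesian join: both tables scanned
print(full_scans("SELECT * FROM a WHERE id = 1"))  # primary-key lookup: no scan
```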
Take a look at Oracle's paper on writing injection proof pl/sql. The DBMS_ASSERT built-in package should help you test your SQL for appropriateness.
Even with those tests, I'd be extremely reluctant to give people an open text window for building their queries especially on the public net or in a large organization where you don't know everybody. There are very creative people just looking for opportunities like that.
In Oracle, you can just check that the first word is "select" or "with". This is due to PL/SQL's Ada heritage, which requires compound statements to be in begin/end blocks, so that the usual SQL injection techniques just cause syntax errors.
Of course, the best answer is to do this by granting permissions and avoiding if possible directly evaluating unknown input. But it is interesting that the begin/end syntax eliminates a lot of SQL injection attack vectors.
Here's an argument for SPs that I haven't heard. Flamers, be gentle with the downvotes.
Since there is overhead associated with each trip to the database server, I would suggest that a POSSIBLE reason for placing your SQL in SPs over embedded code is that you are more insulated to change without taking a performance hit.
For example. Let's say you need to perform Query A that returns a scalar integer.
Then, later, the requirements change and you decide that if the result of the scalar query is > x then, and only then, you need to perform another query. If you performed the first query in an SP, you could easily check the result of the first query and conditionally execute the second SQL statement in the same SP.
How would you do this efficiently in embedded SQL without performing a separate query or an unnecessary query?
Here's an example:
-- This SP may run one or two queries.
DECLARE @CustCount INT
SELECT @CustCount = COUNT(*) FROM CUSTOMER
IF @CustCount > 10
    SELECT * FROM PRODUCT
Can this be done in embedded SQL, and if so, what is the best way to do it?
A very persuasive article
SQL and stored procedures will be there for the duration of your data.
Client languages come and go, and you'll have to re-implement your embedded SQL every time.
In the example you provide, the time saved is sending a single scalar value and a single follow-up query over the wire. This is insignificant in any reasonable scenario. That's not to say there might not be other valid performance reasons to use SPs; just that this isn't such a reason.
I would generally never put business logic in SPs; I like it to be in my native language of choice outside the database. The only time I agree SPs are better is when there is a lot of data movement that doesn't need to come out of the db.
So to answer your question, I'd rather have two queries in my code than embed that in an SP. In my view I am trading a small performance hit for something a lot clearer.
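In application code, the two-query version might look like the sketch below. It is written in Python with sqlite3 purely as an assumption for illustration; the `customer` and `product` tables follow the question's example, and `products_if_many_customers` is a made-up name.

```python
import sqlite3


def products_if_many_customers(conn, threshold):
    """First query: the scalar count. Second query: only if the count is high enough."""
    (count,) = conn.execute("SELECT COUNT(*) FROM customer").fetchone()
    if count <= threshold:
        return []
    return conn.execute("SELECT name FROM product ORDER BY name").fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE product (name TEXT)")
conn.executemany("INSERT INTO customer (id) VALUES (?)", [(i,) for i in range(1, 12)])
conn.executemany("INSERT INTO product VALUES (?)", [("widget",), ("gadget",)])

print(products_if_many_customers(conn, 20))  # 11 customers, threshold not reached
print(products_if_many_customers(conn, 10))  # threshold exceeded, products fetched
```

The cost is one extra round trip when the threshold is exceeded, in exchange for keeping the rule visible in the calling code.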
How would you do this efficiently in embedded SQL without performing a separate query or an unnecessary query?
Depends on the database you are using. In SQL Server, this is a simple CASE statement.
Perhaps include the WHERE clause in that sproc:
WHERE (all your regular conditions)
AND myScalar > myThreshold
Lately I prefer not to use SPs (except when extreme complexity arises and a proc, or the CLR, would simply be better). I have been using the Repository pattern with LINQ to SQL, where my query is written in my data layer as a strongly typed LINQ expression. The key here is that the query is strongly typed, which means that when I refactor I am refactoring properties of a class that is directly generated from the database table (which makes carrying changes forward from the DB super easy and accurate). While my SQL is generated for me and sent to the server, I still have the option of sticking to DRY principles, as the repository pattern allows me to break things down into their smallest components.
I do have the issue that I might make a trip to the server and, based on the results of that query, find that I need to make another trip to the server. I don't worry about this up front. If I find later that it becomes an issue, then I may refactor that code into something more performant. The overall key here is that there is no one magic bullet. I tend to work on greenfield applications, which allows this method of development to be most efficient for me.
Benefits of SPs:
Performance (they are precompiled)
Easy to change (without compiling the application)
SQL's set-based features make really difficult data tasks very easy
Drawbacks:
Depend heavily on the database engine used
Makes deployment of upgrades a little harder (you have to deploy the App + the scripts)
My 2 cents...
About your example, it can be done like this:
select * from products where (select count(*) from customers) > 10
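A quick way to check the single-statement idea is the sketch below, in Python with sqlite3 (an assumption; the tables and the `QUERY` constant are made up to match the example). The count subquery sits in its own parentheses and the comparison is applied outside them, so the outer query returns rows only once the count passes the threshold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE products (name TEXT)")
conn.execute("INSERT INTO products VALUES ('widget')")

# The scalar subquery is evaluated and compared against the bound threshold.
QUERY = "SELECT name FROM products WHERE (SELECT COUNT(*) FROM customers) > ?"

before = conn.execute(QUERY, (10,)).fetchall()  # no customers yet
conn.executemany("INSERT INTO customers (id) VALUES (?)", [(i,) for i in range(1, 12)])
after = conn.execute(QUERY, (10,)).fetchall()   # 11 customers now

print(before)
print(after)
```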