Is this a good approach for avoiding SQL injection?

Here at the company where I work, we have a support tool that, among other things, provides a page that allows the user to run SELECT queries. It should prevent the user from running UPDATE, INSERT, DELETE, DROP, etc. Other than that, any SELECT statement is accepted.
The way it works is by executing
SELECT * FROM (<query>)
so any statement besides a SELECT should fail due to a syntax error.
In my opinion, this approach is not enough to prevent an attack, since the input could alter the outer query and break the security. I argued that, along with this solution, the tool should also check the syntax of the inner query. My colleagues asked me to prove that the current solution is unsafe.
To test it, I tried to write something like
SELECT * from dual); DROP table users --
But it failed because of the ; character, which is not accepted by the SQL connector.
So, is there any way to append a modification statement in a query like that?
By the way, it is Oracle SQL.
EDIT:
Just to make it clearer: I know this is not a good approach, but I must prove that to my colleagues to justify a code change. Theoretical answers are good, but I think a working injection would be more convincing.

The protection is based on the idea/assumption that "update queries" are never going to produce a result table (which is what it would take to make them a valid sub-expression of your SELECT * FROM (...)).
Proprietary engines with proprietary language extensions might undermine that assumption. And although admittedly it still seems unlikely, in the world of proprietary extensions there really is some crazy stuff flying around, so don't assume too lightly.
Maybe also beware of expression compilers that coerce "does not return a table" into "an empty table of some kind". You know. Because any system must do anything it can to make the user action succeed instead of fail/crash/...
And maybe also consider that if "query whatever you like" is really the function that is needed, then your DBMS most likely already has some tool or component that actually allows that ... (and is even designed specifically for the purpose).

I'm going to assume that it's deemed acceptable for users to see any data accessible from that account (as that is what this seems designed to do).
It's also fairly trivial to perform a Denial of Service with this, either with an inefficient query, or with SELECT ... FOR UPDATE, which could be used to lock critical tables.
Oracle is a feature-rich DB, which means there are likely a variety of ways to run DML from within a query. You would need to find an inline PL/SQL function that allows you to perform DML or has other side effects. What packages are available will depend on the specific schema: the XML DB packages have some mechanisms to run arbitrary SQL, the UTL_HTTP package can often be used to launch network attacks, and the Java functionality is quite powerful.
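To make that concrete: a PL/SQL function declared with PRAGMA AUTONOMOUS_TRANSACTION may perform DML even when invoked from a query, so if an attacker can create such a function, or find an existing one with side effects, the SELECT-only wrapper is defeated. A minimal sketch, with hypothetical names:

-- Hypothetical function; any accessible function with side effects works too.
CREATE OR REPLACE FUNCTION run_dml RETURN NUMBER AS
  PRAGMA AUTONOMOUS_TRANSACTION;  -- allows DML from inside a SELECT
BEGIN
  DELETE FROM users;              -- any DML (or DDL via EXECUTE IMMEDIATE)
  COMMIT;
  RETURN 1;
END;
/
-- The wrapper then accepts the call as an ordinary subquery:
SELECT * FROM (SELECT run_dml() FROM dual);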
The correct way to protect against this is to use the DB security mechanisms: run this against a read-only schema (one with SELECT privileges only on the tables).

Database Security Question

Well, it seems like such a simple solution to the many problems that can arise from insecure services and applications. But I'm not sure if it's possible, or maybe nobody's thought of this idea yet...
Instead of leaving it up to programmers/developers to ensure that their applications use stored procedures, parameterised queries, escaped strings, etc. to help prevent SQL injection and other attacks, why don't the people who make the databases just build these security features into the databases, so that when an UPDATE or INSERT query is performed, the database sanitizes the string before it is stored?
The database would not necessarily know the context of what is going on. What is malicious for one application is not malicious for another. Sometimes the intent IS to
drop table users--
It is much better to let the database do what it does best, arranging data, and let the developers worry about the security implementation.
The problem is that the database cannot readily tell whether the command it is requested to execute is legitimate or not - it is syntactically valid and there could be a valid reason for the user to request that it be executed.
There are heuristics that the DBMS could apply. For example, if a single request combined both a SELECT operation and a DELETE operation, it might be possible to infer that this is more likely to be illegitimate than legitimate - and the DBMS could reject that combined operation. But it is hard to deal with a query where the WHERE condition has been weakened to the point that it shows more data than it was supposed to. A UNION query can deliberately select from multiple tables. It is not sufficient to show that there is a weak condition and a strong condition OR'd together - that could be legitimate.
Overall, then, the problem is that the DBMS is supposed to be able to execute a vast range of queries - so it is essentially impossible to be sure that any query it is given to execute is, or is not, legitimate.
The proper way to access the database is with stored procedures. If you were using SQL Server and C#/VB.NET you could use LINQ to SQL, which lets you build the query in the language; it then gets turned into a parameterized query. Good stuff.

Dynamic sql vs stored procedures - pros and cons?

I have read many strong views, both for and against, on stored procedures (SPs) versus dynamic SQL (DS).
I am writing a query engine in C++ (MySQL backend for now, though I may decide to go with a C++ ORM). I can't decide whether to write SPs, or to dynamically create the SQL and send the query to the DB engine.
Any tips on how to decide?
Here's the simple answer:
If your programmers do both database and coding work, keep the SQL with the app. It's easier to maintain that way. Otherwise, let the DB guys handle it in SPs.
You have more control over the mechanisms outside the database. The biggest win for taking care of this outside the database is simply maintenance (in my mind). It'd be slightly harder to version-control the SPs than the code you generate outside the database; it's one more thing to keep track of.
While we're on the topic, it's similar to handling data/schema migrations. It's annoyingly complex to version and handle schema migrations; if you don't already have a mechanism for this, you will have yet another thing you'll need to manage. It comes down to simply being easier to manage/version these things outside the database.
Consider the scenario where you have a bug in your SP. Now it needs to be changed, but then you hop over to another developer's database/sandbox. What version are the sandbox and the SP at? Now you have to track multiple versions.
One of the main differentiators is whether you are writing the "one true front end" or whether the database is the central piece of your application.
If you are going to have multiple front ends, stored procedures make a lot of sense because you reduce your maintenance overhead. If you are writing only one interface, stored procedures are a pain, because you lose a lot of flexibility in changing your data set as your front end needs change, plus you now have to do code maintenance, version control, etc. in two places. Databases are a real pain to keep in sync with code repositories.
Finally, if you are coding for multiple databases (Oracle and SQL compatible code, for example), I'd avoid stored procedures completely.
You may in certain rare circumstances, after profiling, determine that some limited stored procedures are useful to you. This situation comes up way less than people think it does.
The main scenarios in which you MUST have SPs are:
1) When you have a very complex set of queries with heavy compile overhead, and data drift low enough that recompiling is not needed on a regular basis.
2) When the "Only True" logic for accessing the specific data set is VERY complicated and needs to be accessed from several different codebases on different platforms (so writing multiple APIs in code is much more expensive).
Any other scenario, it's debatable, and can be decided one way or another.
I must also say that the other posters' arguments about versioning are not really such a big deal in my experience - having your SPs in version control is as easy as creating a "sql/db_name" directory structure and having a basic "database release" script which deploys the SP code from the version-control location to the database. Every company I have worked for had some kind of setup like this, either a central one run by DBAs or a departmental one run by developers.
The one thing you want to avoid is to have your business logic spread across multiple tiers of your application. Database DDL and DML are difficult enough to keep in sync with an application code base as it is.
My recommendation is to create a good relational schema, plus all your constraints and triggers, so that the data retains its integrity even if somebody goes to the database and tries to do something through command-line SQL.
Put all your business logic in an application or service that calls (static or dynamic) SQL and then wraps the business functionality you are trying to expose.
Stored procedures have two purposes that I can think of:
1) An aid to simplifying data access. The stored procedure does not have any business logic in it; it just knows about the structure of the data and exposes an interface to isolate accessing three tables and a view just to get a single piece of information.
2) Mapping the Domain Model to the Data Model. Stored procedures can assist in making the Data Model look like a given Domain Model.
After the program has been completed and profiled, there are often performance issues with the pre-1.0 release. Stored procedures do offer batching of SQL without traffic needing to go back and forth between the DBMS and the application. That being said, in rare and extreme cases a few business rules might, for performance, need to be migrated to the stored-procedure side. Make sure to document any such exceptions to the architectural philosophy in multiple prominent places.
Stored Procedures are ideal for:
Creating reusable abstractions over complex queries;
Enforcing specific types of insertions/updates to tables (if you also deny permissions to the table);
Performing privileged operations that the logged-in user wouldn't normally be allowed to do;
Guaranteeing a consistent execution plan;
Extending the capabilities of an ORM (batch updates, hierarchy queries, etc.)
Dynamic SQL is ideal for:
Variable search arguments or output columns:
Optional search conditions
Pivot tables
IN clauses with user-specified values
ORM implementations (most can use SPs, but can't be built entirely on them);
DDL and administrative scripts.
They solve different problems, really. Use whichever one is more appropriate to the task at hand, and don't restrict yourself to just one or the other. After you work on database code for a while you'll start to get a more intuitive feel for these things; you'll find yourself banging together some rat's nest of strings for a query and think, "this should really go in a stored procedure."
Final note: Because this question implies a certain level of inexperience with SQL, I feel obliged to say, don't forget that you still need to parameterize your queries when you write dynamic SQL. Parameters aren't just for stored procedures.
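To illustrate both points, here is a minimal Oracle-flavoured sketch of the "optional search conditions" case that still binds every value; the ORDERS table and column names are hypothetical, and each bind is referenced in both branches so the USING list stays fixed:

CREATE OR REPLACE PROCEDURE find_orders(
    p_cust   IN  NUMBER,
    p_status IN  VARCHAR2,
    p_result OUT SYS_REFCURSOR)
AS
  v_sql VARCHAR2(4000) := 'SELECT order_id FROM orders WHERE 1 = 1';
BEGIN
  -- Append each condition only when its argument was supplied.
  IF p_cust IS NOT NULL THEN
    v_sql := v_sql || ' AND customer_id = :cust';
  ELSE
    v_sql := v_sql || ' AND :cust IS NULL';
  END IF;
  IF p_status IS NOT NULL THEN
    v_sql := v_sql || ' AND status = :stat';
  ELSE
    v_sql := v_sql || ' AND :stat IS NULL';
  END IF;
  OPEN p_result FOR v_sql USING p_cust, p_status;  -- values bound, never concatenated
END;
/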
DS is more flexible; the SP approach makes your system more manageable.

Executing Dynamic SQL in Oracle (PL/SQL) and Ensuring Security

If I have a valid SQL string, is there any way I can execute it in my PL/SQL but guarantee that it is a SELECT statement only, without doing complex parsing to ensure it doesn't have any escape characters, nested commands, or any of that jazz?
EDIT:
What I'm really trying to accomplish is a generic querying tool built into my application. It has a friendly, domain-specific GUI and lets a very non-technical user create reasonably complex queries. The tool handles versioning of the searches, adds inner joins where needed, and does some other application-specific stuff you wouldn't find in a typical SQL-developer-type tool.
The application successfully creates a SQL Query. The problem is that I also allow users to directly enter their own SQL. I'm worried about potential SQL injection type issues.
I'm not sure if this is the appropriate place, but in addition to the question, if anyone could recommend a good Oracle book that would get me up to speed on things of this nature, I'd very much appreciate it.
One solution is to GRANT your user only SELECT privilege if that's the only thing the user is authorized to do.
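For example, a sketch of such a query-only account (the account and table names are hypothetical):

-- The account can connect and read the listed objects, nothing else.
CREATE USER report_ro IDENTIFIED BY "choose_a_strong_password";
GRANT CREATE SESSION TO report_ro;
GRANT SELECT ON app_schema.users TO report_ro;  -- repeat per table or view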
See "Oracle Database Security Guide: Introduction to Privileges"
However, I don't think that your application is necessarily secure just because you restrict the queries to SELECT. There are examples of mischief that can be perpetrated when you allow unsafe use of SELECT queries.
Re your clarified question: I've studied SQL injection and written about it quite a bit. What I can advise as a general rule is: Never execute user input as code. That's how SQL injection occurs.
You can design a domain-specific language and map user input to SQL operations, but make sure there's a layer that translates user choices to the database schema. If you separate user input from your SQL code by introducing a mapping layer, then you should be all right.
See also my answer to "How do I protect this function from sql injection."
Oracle comes with a lot of execute privileges granted to PUBLIC. As such, even a user with no explicit insert/update/delete/execute privileges can do mischief.
Speaking of mischief, even with a SELECT a user could cause trouble. A "SELECT * FROM table FOR UPDATE OF column" would lock the entire table. SELECT ... FOR UPDATE only requires SELECT privileges.
Dumb queries (e.g. Cartesian joins) could bring a database to its knees (though Resource Manager should be able to block most of them by only allowing queries that would do less than a specified amount of I/O or CPU).
How about giving them a list of approved SQL statements to execute, and a process for them to nominate statements for inclusion?
If you're giving the user a text area so they can type whatever they want, hey, SQL injection is what you're asking for.
I wouldn't leave the door so open like that, but if I were forced to do it, then I'd run an explain plan on whatever the user wants to do. The optimizer will parse the query and put all the information about the SQL statement in the plan_table table, which you can then query to check whether it's really a select operation, which tables/indexes from which schemas are being accessed, whether the WHERE clause is something you approve of, and whether there are any "bad" operations, such as Cartesian joins or full table scans.
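A rough sketch of that vetting step, assuming the standard PLAN_TABLE is installed:

-- Explain the user's statement without executing it...
EXPLAIN PLAN SET STATEMENT_ID = 'vet_1' FOR
SELECT * FROM dual;  -- the user's SQL would be substituted here

-- ...then inspect the plan and reject anything suspicious,
-- e.g. non-query operations or MERGE JOIN CARTESIAN.
SELECT operation, options, object_owner, object_name
FROM plan_table
WHERE statement_id = 'vet_1';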
Take a look at Oracle's paper on writing injection-proof PL/SQL. The DBMS_ASSERT built-in package should help you test your SQL for appropriateness.
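For instance, when only an identifier (rather than a whole statement) comes from the user, DBMS_ASSERT can vet it before it is concatenated into dynamic SQL; a small sketch:

DECLARE
  p_table VARCHAR2(128) := 'EMPLOYEES';  -- pretend this came from the user
  v_count NUMBER;
BEGIN
  -- SIMPLE_SQL_NAME raises ORA-44003 if the value is not a plain identifier.
  EXECUTE IMMEDIATE
    'SELECT COUNT(*) FROM ' || SYS.DBMS_ASSERT.SIMPLE_SQL_NAME(p_table)
    INTO v_count;
END;
/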
Even with those tests, I'd be extremely reluctant to give people an open text window for building their queries, especially on the public net or in a large organization where you don't know everybody. There are very creative people just looking for opportunities like that.
In Oracle, you can just check that the first word is "select" or "with". This works because of PL/SQL's Ada heritage, which requires compound statements to be in BEGIN/END blocks, so the usual SQL injection techniques just cause syntax errors.
Of course, the best answer is to do this by granting permissions and avoiding if possible directly evaluating unknown input. But it is interesting that the begin/end syntax eliminates a lot of SQL injection attack vectors.
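If you do rely on that first-word check, a naive sketch looks something like this (the procedure name is hypothetical, and leading comments or other tricks make the check fragile, which is why the grants matter more):

CREATE OR REPLACE PROCEDURE assert_is_select(p_sql IN VARCHAR2) AS
BEGIN
  -- Reject anything whose first token is not SELECT or WITH.
  IF NOT REGEXP_LIKE(LTRIM(p_sql), '^(SELECT|WITH)(\s|\()', 'i') THEN
    RAISE_APPLICATION_ERROR(-20001, 'Only SELECT statements are allowed');
  END IF;
END;
/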

SQL With A Safety Net

My firm have a talented and smart operations staff who are working very hard. I'd like to give them a SQL-execution tool that helps them avoid common, easily-detected SQL mistakes that are easy to make when they are in a hurry. Can anyone suggest such a tool? Details follow.
Part of the operations team remit is writing very complex ad-hoc SQL queries. Not surprisingly, operators sometimes make mistakes in the queries they write because they are so busy.
Luckily, their queries are all SELECTs not data-changing SQL, and they are running on a copy of the database anyway. Still, we'd like to prevent errors in the SQL they run. For instance, sometimes the mistakes lead to long-running queries that slow down the duplicate system they're using and inconvenience others until we find the culprit query and kill it. Worse, occasionally the mistakes lead to apparently-correct answers that we don't catch until much later, with consequent embarrassment.
Our developers also make mistakes in complex code that they write, but they have Eclipse and various plugins (such as FindBugs) that catch errors as they type. I'd like to give operators something similar - ideally it would see
SELECT U.NAME, C.NAME FROM USER U, COMPANY C WHERE U.NAME = 'ibell';
and before you executed, it would say "Hey, did you realise that's a Cartesian product? Are you sure you want to do that?" It doesn't have to be very smart - finding obviously missing join conditions and similar evident errors would be fine.
It looks like TOAD should do this but I can't seem to find anything about such a feature. Are there other tools like TOAD that can provide this kind of semi-intelligent error correction?
Update: I forgot to mention that we're using MySQL.
If your people are using the mysql(1) program to run queries, you can use the safe-updates option (aka i-am-a-dummy) to get you part of what you need. Its name is somewhat misleading; it not only prevents UPDATE and DELETE without a WHERE (which you're not worried about), but also adds an implicit LIMIT 1000 to SELECT statements, and aborts SELECTs that have joins and are estimated to consider over 1,000,000 tuples - perfect for discouraging Cartesian joins.
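If the client flag is inconvenient, the same guards can be set as MySQL session variables; a small sketch:

SET SESSION sql_safe_updates = 1;     -- refuse UPDATE/DELETE without a keyed WHERE
SET SESSION sql_select_limit = 1000;  -- implicit LIMIT on SELECTs
SET SESSION max_join_size = 1000000;  -- abort joins estimated to examine more rows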
..."writing very complex ad-hoc SQL queries.... they are so busy"
Danger Will Robinson!
Automate Automate Automate.
Ideally, the ops team should not be put into a position where they have to write queries on the fly in a high-stress situation - it's a recipe for disaster! Better for them to build up a library of pre-written scripts that have undergone the appropriate testing to make sure each one a) does what you want, b) provides an audit trail, and c) has a possible 'undo'-type function.
Failing that, giving them a user ID that only has SELECT permissions might help :-)
You might find SQL Prompt from Red Gate useful. I'm not sure what database engine you're using; it's only for MS SQL Server, though.
I'm not expecting anything like this to exist. The tool would have to first implement everything that the SQL parser in your database implements, and then it would have to do a data model analysis to predict "bad" queries.
Your best bet might be to write a plugin for a text editor that did some basic checking for suspicious patterns and highlighted them differently than the standard .sql mode. But even that would be quite difficult.
I would be happy with a tool that set off alarm bells whenever I typed in an update statement without a where clause. And perhaps administered a mild electric shock, since it's usually about 1 in the morning after a long day when mistakes like that happen.
It would be pretty easy to build this by setting up a sample database with an extremely small amount of dummy data, which would receive the query first. A couple of things will happen:
You might get a SQL syntax error, which would not load the database much since it's a small database.
You might get back a response which could clearly be shown to contain every row in one or more tables, which is probably not what they want.
Things which pass the above conditions are likely to be okay, so you can run them against the copy of the production database.
Assuming your schema doesn't change much and is not particularly weird, writing the above is likely the quickest solution to your problem.
I'd start with some coding standards - for instance, never use the type of join in your example; it often leads to bad results (especially in SQL Server: if you try to do an outer join that way, you will get bad results). Require them to use explicit joins.
If you have complex relationships, you might consider putting them in views and then writing the ad-hoc queries against the views. Then at least they will never make the mistake of getting the joins wrong.
Can't you just limit the amount of time a query can run for? I'm not sure about MySQL, but for SQL Server, even just the default query analyzer can restrict how long queries will run before they time out. Couple that with limited rights so they can only run SELECT queries, and you should be pretty much covered.
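For the MySQL side (since version 5.7), SELECT execution time can be capped directly; a small sketch, with the timeout value chosen arbitrarily:

-- Cap SELECT execution time for the session, in milliseconds.
SET SESSION max_execution_time = 30000;
-- Or per statement, via an optimizer hint:
SELECT /*+ MAX_EXECUTION_TIME(30000) */ u.name FROM user u WHERE u.name = 'ibell';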

Languages other than SQL in postgres

I've been using PostgreSQL a little bit lately, and one of the things that I think is cool is that you can use languages other than SQL for scripting functions and whatnot. But when is this actually useful?
For example, the documentation says that the main use for PL/Perl is that it's pretty good at text manipulation. But isn't that more of something that should be programmed into the application?
Secondly, is there any valid reason to use an untrusted language? It seems like making it so that any user can execute any operation would be a bad idea on a production system.
PS. Bonus points if someone can make PL/LOLCODE seem useful.
@Mike: this kind of thinking makes me nervous. I've heard too many times "this should be infinitely portable", but when the question is asked - do you actually foresee that there will be any porting? - the answer is: no.
Sticking to the lowest common denominator can really hurt performance, as can the introduction of abstraction layers (ORMs, PHP PDO, etc.). My opinion is:
Evaluate realistically whether there is a need to support multiple RDBMSs. For example, if you are writing an open-source web application, chances are that you need to support MySQL and PostgreSQL at least (if not MSSQL and Oracle).
After the evaluation, make the most of the platform you decided upon
And BTW: you are mixing relational with non-relational databases (CouchDB is not an RDBMS comparable with Oracle, for example), further exemplifying the point that the perceived need for portability is very often greatly overestimated.
"isn't that [text manipulation] more of something that should be programmed into the application?"
Usually, yes. The generally accepted "three-tier" application design for databases says that your logic should be in the middle tier, between the client and the database. However, sometimes you need some logic in a trigger or need to index on a function, requiring that some code be placed into the database. In that case all the usual "which language should I use?" questions come up.
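For example, the "index on a function" case necessarily puts code in the database; a minimal PostgreSQL sketch with a hypothetical users table:

-- An expression index; the expression must be IMMUTABLE.
CREATE INDEX users_email_lower_idx ON users (lower(email));

-- A query written against the same expression can use the index:
SELECT * FROM users WHERE lower(email) = 'alice@example.com';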
If you only need a little logic, the most portable language should probably be used (PL/pgSQL). If you need to do some serious programming, though, you might be better off using a more expressive language (maybe PL/Ruby). This will always be a judgment call.
"is there any valid reason to use an untrusted language?"
As above, yes. Again, putting direct file access (for example) into your middle tier is best when possible, but if you need to fire things off based on triggers (that might need access to data not available directly to your middle tier), then you need untrusted languages. It's not ideal, and should generally be avoided. And you definitely need to guard access to it.
These days, any "unique" or "cool" feature in a DBMS makes me incredibly nervous. I break out in a rash and have to stop work until the itching goes away.
I just hate to be locked in to a platform unnecessarily. Suppose you build a big chunk of your system in PL/Perl inside the database, or in C# within SQL Server, or PL/SQL within Oracle; there are plenty of examples*.
Now you suddenly discover that your chosen platform doesn't scale. Or isn't fast enough. Or something. Worse, there's a new kid on the database block (something like MonetDB, CouchDB, or Caché, say, but much cooler) that would solve all your problems (even if your only problem, like mine, is having an uncool database platform). And you can't switch to it without recoding half your application.
(*Admittedly, the paid-for products are to some extent seeking to lock you in by persuading you to use their unique features, which is not an accusation that can directly be levelled at the free providers, but the effect is the same).
So that's a rant on the first part of the question. Heart-felt, though.
is there any valid reason to use an untrusted language? It seems like making it so that any user can execute any operation would be a bad idea
My goodness, yes it does! A sort of "Perl injection attack"? Almost worth doing it just to see what happens, I'd have thought.
For philosophical reasons outlined above I think I'll pass on the PL/LOLCODE challenge. Although I was somewhat amazed to discover it was a link to something extant.
From my perspective, I guess the answer is 'it depends'.
There is an argument that manipulation of the data belongs in the database layer, so that the business logic does not need to be overly concerned about how the manipulation happens; it just knows that it has.
Another very good reason to process data on the DB layer is when the volume of data being crunched means that network bandwidth will become an issue. I once had to categorise very large amounts of data. Processing this in the application layer was severely restricted by the time required to transfer all the data across the network for processing.
I then wrote a binning algorithm in PL/pgSQL and it worked much faster.
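As a flavour of what server-side binning looks like, a minimal sketch against a hypothetical payments table (the original algorithm was more involved):

-- Only the histogram crosses the network, not the raw rows.
SELECT width_bucket(amount, 0, 1000, 10) AS bin,
       count(*) AS n
FROM payments
GROUP BY bin
ORDER BY bin;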
Regarding untrusted languages: I heard a podcast from Josh Berkus (a Postgres advocate) who discussed an application of PostgreSQL that brought in data from MySQL as part of its processing, so that the communication itself was handled by the Postgres server. I don't remember the full details; I think it was on the FLOSS Weekly podcast, which has quite an interesting discussion of the history of PostgreSQL and some of the uses it is put to.
The untrusted versions of the procedural languages allow you to access I/O on the system.
This can come in handy if you need a trigger or something to send an email or connect to a socket server to send a popup notification. There are tons of uses for this type of thing, and because of PostgreSQL's isolation levels you can safely do things like this.
You can put checkpoints in the function so that if the transaction fails, the email or whatever won't go out. The nice thing about doing this is that it removes the logic from the client and puts it on the server.
I think most additional languages are offered so that if you develop in that language on a regular basis, you can feel comfortable writing DB functions, triggers, etc. The usefulness of these features is to provide control over the data as close to the data as possible.
An example of a useful stored procedure I recently wrote in an external language, which would not have been possible in PL/pgSQL, is a version of 'df' which allowed SQL table generators to pick the tablespace with the most free space available at runtime.
I used plperlu, and it was relatively simple, although I had to be careful with data typing.
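A sketch of what such a function might look like in plperlu; the name and the exact df parsing are illustrative only, and the shell access is exactly what makes the language untrusted:

CREATE OR REPLACE FUNCTION fs_free_kb(path text) RETURNS bigint AS $$
  my ($path) = @_;
  my @out = `df -Pk $path`;         # backticks are only possible in plperlu
  my @fields = split ' ', $out[1];  # line 0 is the header
  return $fields[3];                # the "Available" column, in KB
$$ LANGUAGE plperlu;

-- e.g. SELECT fs_free_kb('/var/lib/postgresql');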