I would like to improve my knowledge about the possible SQL injection attacks that exist. I know that parameterization completely avoids SQL injection risk and should therefore be applied everywhere. However, when someone asks me how it can be exploited, I like to have an answer.
I know how a basic SQL injection attack works. For example a website has a page website.com/users/{id} where id is the primary key of the user. If we trust the input completely and simply pass the id parameter to the query being executed, this can have dire consequences. In the case of website.com/users/1 the query becomes SELECT * FROM [User] WHERE [Id] = 1. However, in the case of website.com/users/1;DROP TABLE User the query becomes SELECT * FROM [User] WHERE [Id] = 1;DROP TABLE User, leading to the nasty result.
But, pretty much all SQL injection attacks I read about count on the WHERE clause being present right before the injection. Almost always, the injection works in some form of ;Injected statement--.
My question is whether it is also possible to perform a SQL injection attack against a query like the one below. Or, in a broader sense: does the entire statement have to compile for a SQL injection attack to be possible, or will any error in the statement cause the attack to fail? If the answer differs per DBMS, please specify the DBMS.
In the query below, the injection is supposed to happen in the CHARINDEX('input', [Name]) > 0 where input is copied from a user's input.
SELECT
*
FROM (
SELECT TOP 10
*
FROM
[User]
WHERE
CHARINDEX('input', [Name]) > 0
) AS [User]
LEFT JOIN
[Setting] ON [Setting].[UserId] = [User].[Id]
The furthest I got myself was with the query below, but the error it returns, Missing end comment mark '*/', seems to be completely blocking any attack.
SELECT
*
FROM (
SELECT TOP 10
*
FROM
[User]
WHERE
CHARINDEX('input', '') > 0) AS [User];DROP TABLE [NonExistentTable]/*, [Name]) > 0
) AS [User]
LEFT JOIN
[Setting] ON [Setting].[UserId] = [User].[Id]
The resulting SQL has to be accepted by the particular DBMS for injection to occur, which generally means it needs to be valid SQL, but there are usually ways of crafting the input to make it valid regardless of the SQL in question.
If a line comment isn't enough, an extra statement can be added; if multiple statements aren't allowed, a UNION can be used; and so on.
The exact details vary, but with enough knowledge of the query (e.g. through error details leaking to the user) or lucky guesses, something can usually be crafted that is to the attacker's advantage.
In your example, consider this input, which simply repeats parts of the existing query:
nonsense', [Name]) > 0
) AS [User];
Drop Table [User];
SELECT
*
FROM (
SELECT TOP 10
*
FROM
[User]
WHERE
CHARINDEX('nonsense
Which results in the following SQL:
SELECT
*
FROM (
SELECT TOP 10
*
FROM
[User]
WHERE
CHARINDEX('nonsense', [Name]) > 0
) AS [User];
Drop Table [User];
SELECT
*
FROM (
SELECT TOP 10
*
FROM
[User]
WHERE
CHARINDEX('nonsense', [Name]) > 0
) AS [User]
LEFT JOIN
[Setting] ON [Setting].[UserId] = [User].[Id]
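To see why this works, it can help to reproduce the string building outside the database. The following is a minimal Python sketch, assuming the application inserts the input into the query text with plain string formatting. QUERY_TEMPLATE and build_query are hypothetical names, not code from the question, and the sketch only builds and inspects text; it never contacts a DBMS.

```python
# Hypothetical reconstruction of how an application might build the query
# from the question by pasting user input straight into the SQL text.
QUERY_TEMPLATE = """SELECT *
FROM (
    SELECT TOP 10 *
    FROM [User]
    WHERE CHARINDEX('{user_input}', [Name]) > 0
) AS [User]
LEFT JOIN [Setting] ON [Setting].[UserId] = [User].[Id]"""


def build_query(user_input: str) -> str:
    """Vulnerable on purpose: the input is inserted without any escaping."""
    return QUERY_TEMPLATE.format(user_input=user_input)


# Input that completes the original statement, adds its own statement, and
# then re-opens a copy of the query so the tail of the template stays valid.
payload = """nonsense', [Name]) > 0
) AS [User];
DROP TABLE [User];
SELECT *
FROM (
    SELECT TOP 10 *
    FROM [User]
    WHERE CHARINDEX('nonsense"""

injected = build_query(payload)

# Splitting on ';' is only a rough way to count statements in this sketch.
statements = [part.strip() for part in injected.split(";")]
print(len(statements))   # 3
print(statements[1])     # DROP TABLE [User]
```

Parentheses and quotes stay balanced in the combined text, which is what lets the whole batch parse instead of failing with a syntax error.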
SQL injection normally happens wherever some kind of string concatenation or insertion is involved. It does not have to be the WHERE clause. Also, generally speaking, the attacker is not interested in dropping tables; they want information. What if input is replaced by this:
', '') > 0 UNION ALL SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE COLUMN_NAME = 'password' --
Assuming that the results of the select are displayed somehow and error messages are also shown, it will take the attacker only a few minutes to determine the number and position of the , NULL entries to add before the query actually returns the names of the table and column to probe in the next stage.
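The probing described above can be sketched in a few lines. This is an illustration only, assuming an in-memory SQLite database as a stand-in for SQL Server (so instr() replaces CHARINDEX and the argument order differs). The tables, the search_unsafe and search_safe functions, and the payloads are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE secret (token TEXT);
    INSERT INTO user VALUES (1, 'alice', 'a@example.com');
    INSERT INTO secret VALUES ('s3cr3t');
""")


def search_unsafe(term):
    # Vulnerable: the search term is concatenated into the SQL text.
    sql = ("SELECT id, name, email FROM user "
           "WHERE instr(name, '" + term + "') > 0")
    return conn.execute(sql).fetchall()


def search_safe(term):
    # Parameterized: the term is passed as data and never parsed as SQL.
    sql = "SELECT id, name, email FROM user WHERE instr(name, ?) > 0"
    return conn.execute(sql, (term,)).fetchall()


def probe_column_count(max_columns=10):
    """Add NULLs to the UNION until the column counts match and no error occurs."""
    for n in range(1, max_columns + 1):
        nulls = ", ".join(["NULL"] * n)
        try:
            search_unsafe("zzz') > 0 UNION ALL SELECT " + nulls + " --")
            return n
        except sqlite3.OperationalError:
            continue  # wrong number of columns, try one more NULL
    return None


print(probe_column_count())  # 3

payload = "zzz') > 0 UNION ALL SELECT NULL, token, NULL FROM secret --"
print(search_unsafe(payload))  # [(None, 's3cr3t', None)]
print(search_safe(payload))    # []
```

The parameterized version treats the whole payload as a search string, so it simply finds no matching name.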
Related
So I'm creating a query for a report that could have several optional filters. I've only included client and station here to keep it simple. Each of these options could be an include or an exclude and could contain NULL, 1, or multiple values. So I split the varchar into a table before joining it to the query.
This test takes about 15 minutes to execute, which... just won't do :p Is there a better way? We have similar queries written with dynamic sql, and I was trying to avoid that, but maybe there's no way around it for this?
DECLARE
@ClientsInc VARCHAR(10) = 'ABCD, EFGH',
@ClientsExc VARCHAR(10) = NULL,
@StationsInc VARCHAR(10) = NULL,
@StationsExc VARCHAR(10) = 'SomeStation'
SELECT *
INTO #ClientsInc
FROM dbo.StringSplit(@ClientsInc, ',')
SELECT *
INTO #ClientsExc
FROM dbo.StringSplit(@ClientsExc, ',')
SELECT *
INTO #StationsInc
FROM dbo.StringSplit(@StationsInc, ',')
SELECT *
INTO #StationsExc
FROM dbo.StringSplit(@StationsExc, ',')
SELECT [some stuff]
FROM media_order mo
LEFT JOIN #ClientsInc cInc WITH(NOLOCK) ON cInc.Value = mo.client_code
LEFT JOIN #ClientsExc cExc WITH(NOLOCK) ON cExc.Value = mo.client_code
LEFT JOIN #StationsInc sInc WITH(NOLOCK) ON sInc.Value = mo.station_name
LEFT JOIN #StationsExc sExc WITH(NOLOCK) ON sExc.Value = mo.station_name
WHERE ((@ClientsInc IS NOT NULL AND cInc.Value IS NOT NULL)
OR (@ClientsExc IS NOT NULL AND cExc.Value IS NULL)
)
AND ((@StationsInc IS NOT NULL AND sInc.Value IS NOT NULL)
OR (@StationsExc IS NOT NULL AND sExc.Value IS NULL)
)
First of all, I always tend to mention Erland Sommarskog's Dynamic Search Conditions in such cases.
However, you already seem to be aware of the two options. One is dynamic SQL. The other is usually the old trick and (@var is null or @var = respective_column). This trick, however, works only for one value per variable.
Your solution indeed seems to work for multiple values. But in my opinion, you are trying too hard to avoid dynamic SQL. Your requirements are complex enough to warrant it. And remember: dynamic SQL is usually harder for you to code, but easier for the server in complex cases, and this one certainly is. Making a performance guess is always risky, but I would guess an improvement in this case.
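As a rough illustration of the dynamic SQL idea, here is a sketch written on the application side in Python, assuming that a filter left empty means "no restriction" and using an in-memory SQLite table as a stand-in for media_order. build_media_order_query is a hypothetical helper. Only hard-coded column names go into the SQL text, and every value is passed as a parameter.

```python
import sqlite3


def build_media_order_query(clients_inc=None, clients_exc=None,
                            stations_inc=None, stations_exc=None):
    """Return (sql, params) containing only the filters that were supplied."""
    clauses = []
    params = []

    def add_filter(column, values, exclude):
        # column is always one of the hard-coded names below, never user input.
        if not values:
            return
        placeholders = ", ".join("?" for _ in values)
        operator = "NOT IN" if exclude else "IN"
        clauses.append(f"{column} {operator} ({placeholders})")
        params.extend(values)

    add_filter("client_code", clients_inc, exclude=False)
    add_filter("client_code", clients_exc, exclude=True)
    add_filter("station_name", stations_inc, exclude=False)
    add_filter("station_name", stations_exc, exclude=True)

    sql = "SELECT client_code, station_name FROM media_order"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media_order (client_code TEXT, station_name TEXT)")
conn.executemany(
    "INSERT INTO media_order VALUES (?, ?)",
    [("ABCD", "SomeStation"), ("ABCD", "Other"),
     ("EFGH", "Other"), ("ZZZZ", "Other")],
)

sql, params = build_media_order_query(
    clients_inc=["ABCD", "EFGH"], stations_exc=["SomeStation"]
)
rows = conn.execute(sql + " ORDER BY client_code", params).fetchall()
print(sql)
print(rows)  # [('ABCD', 'Other'), ('EFGH', 'Other')]
```

Because unused filters never reach the statement, the server sees a simple query for each combination. One caveat of this sketch: NOT IN drops rows whose column is NULL, which may or may not match the intended report logic.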
I would use exists and not exists:
select ...
from media_order mo
where
(
@ClientsInc is null
or exists (
select 1
from string_split(@ClientsInc, ',')
where value = mo.client_code
)
)
and not exists (
select 1
from string_split(@ClientsExc, ',')
where value = mo.client_code
)
and (
@StationsInc is null
or exists (
select 1
from string_split(@StationsInc, ',')
where value = mo.station_name
)
)
and not exists (
select 1
from string_split(@StationsExc, ',')
where value = mo.station_name
)
Notes:
I used the built-in function string_split() rather than the custom splitter that you seem to be using. It is available in SQL Server 2016 and higher, and returns a single column called value. You can change that back to your custom function if you are running an earlier version.
As I understand the logic you want, "include" parameters need to be checked for nullness before using exists, while this is unnecessary for "exclude" parameters.
Please disregard the obvious problems with the manipulation of data in the where clause. I know! I'm working on it. While working on it, though, I discovered that this query runs:
SELECT *
FROM PatientDistribution
WHERE InvoiceNumber LIKE'PEX%'
AND ISNUMERIC(CheckNumber) = 1
AND CONVERT(BIGINT,CheckNumber) <> TransactionId
And this one does not:
SELECT *
FROM PatientDistribution
WHERE InvoiceNumber LIKE'PEX%'
AND CONVERT(BIGINT,CheckNumber) <> TransactionId
AND ISNUMERIC(CheckNumber) = 1
The only difference between the two queries is the order of the conditions in the WHERE clause. I was under the impression that the SQL Server query optimizer would save me from having to worry about that.
The error returned is: Error converting data type varchar to bigint.
You are right, the order of the conditions shouldn't matter.
If AND ISNUMERIC(CheckNumber) = 1 is checked first and non-matching rows thus dismissed, then AND CONVERT(BIGINT,CheckNumber) <> TransactionId will work (for exceptions see scsimon's answer).
If AND CONVERT(BIGINT,CheckNumber) <> TransactionId is processed before AND ISNUMERIC(CheckNumber) = 1 then you may get an error.
That your first query worked and the second not was a matter of luck. It could just as well have been vice versa.
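You can try to force one condition to be evaluated before the other with a derived table, although the optimizer is free to merge the two WHERE clauses, so this is not guaranteed: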
SELECT *
FROM
(
SELECT *
FROM PatientDistribution
WHERE InvoiceNumber LIKE 'PEX%'
AND ISNUMERIC(CheckNumber) = 1
) num_only
WHERE CONVERT(BIGINT,CheckNumber) <> TransactionId;
You just got lucky that the first one worked, since you are correct that the order of the conditions in the where clause does not matter. SQL is a declarative language, meaning that you tell the engine what should happen, not how. So I suspect your two queries weren't executed with the same query plan. Granted, you can affect what the optimizer does to a certain extent. You'll also notice this type of issue when using a CTE. For example:
declare @table table(columnName varchar(64))
insert into @table
values
('1')
,('1e4')
;with cte as(
select columnName
from @table
where isnumeric(columnName) = 1)
select
cast(columnName as decimal(32,16))
from cte
With the snippet above, you would assume that the outer SELECT runs only on the rows returned by the CTE. However, you can't ensure this will happen, and you could still get a type conversion error in the outer SELECT.
More importantly, you should know that ISNUMERIC() is widely misused. People often think that if it returns 1, the value can be converted to a decimal or int. But this isn't the case. It only checks that the value can be converted to at least one numeric type, which includes money and float. For example:
select
isnumeric('1e4')
,isnumeric('$')
,isnumeric('1,123,456')
As you can see, these all return 1, but they would fail the conversion you put in your post.
Side note: your indexes are likely the reason why the first query didn't error out.
There is dirty data in the input.
We are trying to clean up the dataset and then make some calculations on the cleaned data.
declare @t table (str varchar(10))
insert into @t select '12345' union all select 'ABCDE' union all select '111aa'
;with prep as
(
select *, cast(substring(str, 1, 3) as int) as str_int
from @t
where isnumeric(substring(str, 1, 3)) = 1
)
select *
from prep
where 1=1
and case when str_int > 0 then 'Y' else 'N' end = 'Y'
--and str_int > 0
The last 2 lines do the same thing. The first one works, but if you uncomment the second one it crashes with Conversion failed when converting the varchar value 'ABC' to data type int.
Obviously, SQL Server is rewriting the query, mixing all the conditions together.
My guess is that it considers 'case' a heavy operation and performs it as a last step. That's why the workaround with case works.
Is this behavior documented in any way? Or is it a bug?
This is a known issue with SQL Server, and Microsoft does not consider it a bug although users do. The difference between the two queries is the execution path. One is doing the conversion before the filtering, the other after.
SQL Server reserves the right to re-order the processing. The documentation does specify the logical processing of clauses as:
FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
With (presumably but not explicitly documented here) CTEs being logically processed first. What does logically processed mean? Well, it doesn't mean that run-time errors are caught. It really determines the scope of identifiers during the compile phase.
When SQL Server reads from a data source, it can add new variables in. This is a convenient time to do this, because everything is in memory. However, this might occur before the filtering, which is what is causing the error when it occurs.
The fix to this problem is to use a case expression. So, the following CTE will usually work:
with prep as (
select *, (case when isnumeric(substring(str, 1, 3)) = 1 and str not like '%.%'
then cast(substring(str, 1, 3) as int)
end) as str_int
from @t
where isnumeric(substring(str, 1, 3)) = 1
)
Looks weird. And I think Redmond thinks so too. SQL Server 2012 introduced try_convert(), which returns NULL if the conversion fails.
It would also help if you could instruct SQL Server to materialize CTEs. That would also solve the problem in this case. You can vote on adding such an option to SQL Server here.
I'm writing a stored procedure that will have multiple input parameters. The parameters may not always have values and could be empty. But since each parameter may contain values, I have to include criteria that use those parameters in the query.
My query looks something like this:
SELECT DISTINCT COUNT(*) AS SRM
FROM table p
WHERE p.gender IN (SELECT * FROM Fn_SplitParms(@gender)) AND
p.ethnicity IN (SELECT * FROM Fn_SplitParms(@race)) AND
p.marital_status IN (SELECT * FROM Fn_SplitParms(@maritalstatus))
So my problem is that if @gender is empty (' '), the query will return data where the gender field is empty, when I really want to just ignore p.gender altogether. I don't want to accomplish this using IF/ELSE conditional statements because they would be too numerous.
Is there any way to use CASE with IN for this scenario? OR
Is there other logic that I'm just not comprehending that will solve this?
Having trouble finding something that works well...
Thanks!
Use or:
SELECT DISTINCT COUNT(*) AS SRM
FROM table p
WHERE
(p.gender IN (SELECT * FROM Fn_SplitParms(@gender)) OR @gender = '')
AND (p.ethnicity IN (SELECT * FROM Fn_SplitParms(@race)) OR @race = '')
AND (p.marital_status IN (SELECT * FROM Fn_SplitParms(@maritalstatus)) OR @maritalstatus = '')
You might also want to consider table-valued parameters (if using SQL Server 2008 and up) - these can sometimes make the code simpler, since they are treated as tables (which in your case, may be empty) and you can join - plus no awkward split function required.
I'm debugging an app which queries SQL Server 2005. I can't change the query but need to optimise things.
Running all the selects separately is quick (<1 sec), e.g. select * from acscard, select id from employee... When joined together it takes 50 seconds.
Is it better to set uninteresting accesscardid fields to null or to '' when using EXISTS?
SELECT * FROM ACSCard
WHERE NOT EXISTS
( SELECT Id FROM Employee
WHERE Employee.AccessCardId = ACSCard.acs_card_number )
AND NOT EXISTS
( SELECT Id FROM Visit
WHERE Visit.AccessCardId = ACSCard.acs_card_number )
ORDER by acs_card_id
Do you have indexes on Employee.AccessCardId, Visit.AccessCardId, and ACSCard.acs_card_number?
The SELECT clause is not evaluated in an EXISTS clause. This:
WHERE EXISTS(SELECT 1/0
FROM EMPLOYEE)
...should raise an error for dividing by zero, but it won't. But you need to put something in the SELECT clause for it to be a valid query - it doesn't matter if it's NULL or a zero length string.
In SQL Server, NOT EXISTS (and NOT IN) are better than the LEFT JOIN/IS NULL approach if the columns being compared are not nullable (the values on either side can not be NULL). The columns compared should be indexed, if they aren't already.
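The nullability caveat is easy to check on a toy data set. The sketch below assumes an in-memory SQLite database as a stand-in for SQL Server, with made-up rows for the tables from the question. It shows that NOT EXISTS and LEFT JOIN/IS NULL agree, while NOT IN returns nothing once the subquery produces a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ACSCard (acs_card_id INTEGER PRIMARY KEY, acs_card_number TEXT);
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, AccessCardId TEXT);
    INSERT INTO ACSCard VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- The second employee has no card, so AccessCardId is NULL.
    INSERT INTO Employee VALUES (1, 'A'), (2, NULL);
""")

not_exists = conn.execute("""
    SELECT c.acs_card_id FROM ACSCard c
    WHERE NOT EXISTS (SELECT 1 FROM Employee e
                      WHERE e.AccessCardId = c.acs_card_number)
    ORDER BY c.acs_card_id
""").fetchall()

left_join = conn.execute("""
    SELECT c.acs_card_id FROM ACSCard c
    LEFT JOIN Employee e ON e.AccessCardId = c.acs_card_number
    WHERE e.Id IS NULL
    ORDER BY c.acs_card_id
""").fetchall()

not_in = conn.execute("""
    SELECT acs_card_id FROM ACSCard
    WHERE acs_card_number NOT IN (SELECT AccessCardId FROM Employee)
    ORDER BY acs_card_id
""").fetchall()

print(not_exists)  # [(2,), (3,)]
print(left_join)   # [(2,), (3,)]
print(not_in)      # []
```

The empty NOT IN result follows from three-valued logic: comparing against a NULL yields unknown, so no row can pass the filter.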