Can 2 character length variables cause SQL injection vulnerability?

I am taking a text input from the user, then converting it into 2 character length strings (2-Grams)
For example
RX480 becomes
"rx","x4","48","80"
Now if I directly query server like below can they somehow make SQL injection?
select *
from myTable
where myVariable in ('rx', 'x4', '48', '80')

SQL injection is not a matter of length of anything.
It happens when someone adds code to your existing query. They do this by sending the malicious extra code in through a form submission (or something similar). When your SQL code executes, it doesn't realize that there is more than one thing to do; it just executes what it's told.
You could start with a simple query like:
select *
from thisTable
where something=$something
So if the user submits 1; DROP TABLE employees; as the value, you end up with a query that looks like:
select *
from thisTable
where something=1; DROP TABLE employees;
This is a contrived example, but it more or less shows why it's dangerous. The first statement is harmless on its own, but who cares? The second one will actually run. And if you have a table named "employees", well, you don't anymore.

Two characters are sufficient in this case to cause an error in the query and possibly reveal some information about it. For example, try the string ')480 and watch how your application behaves.
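To make that concrete (assuming the pairs are concatenated into the IN list without any escaping): the 2-grams of ')480 are "')", ")4", "48", "80", so the generated statement comes out something like:
select *
from myTable
where myVariable in ('')', ')4', '48', '80')
which is no longer the query you wrote; depending on the engine it throws a syntax error, and that error message itself can leak information about the query.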

Although not much of an answer, this really doesn't fit in a comment.
Your code scans a table checking to see if a column value matches any pair of consecutive characters from a user supplied string. Expressed in another way:
declare @SearchString as VarChar(10) = 'Voot';
select Buffer, case
    when DataLength( Buffer ) != 2 then 0 -- NB: Len() right trims.
    when PatIndex( '%' + Buffer + '%', @SearchString ) != 0 then 1
    else 0 end as Match
from ( values
    ( 'vo' ), ( 'go' ), ( 'n ' ), ( 'po' ), ( 'et' ), ( 'ry' ),
    ( 'oo' ) ) as Samples( Buffer );
In this case you could simply pass the value of @SearchString as a parameter and avoid the IN clause entirely.
Alternatively, the character pairs could be passed as a table-valued parameter and used with IN: where Buffer in ( select CharacterPair from @CharacterPairs ).
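A minimal sketch of that table-valued-parameter route, assuming SQL Server; the type and procedure names (dbo.CharacterPairList, dbo.SearchByPairs) are made up for the example, while myTable / myVariable come from the question:
-- hypothetical helper objects; only myTable / myVariable are from the question
create type dbo.CharacterPairList as table ( CharacterPair char(2) primary key );
go
create procedure dbo.SearchByPairs
    @CharacterPairs dbo.CharacterPairList readonly
as
    select *
    from myTable
    where myVariable in ( select CharacterPair from @CharacterPairs );
go
-- caller side: fill the TVP and pass it; no user text is ever spliced into the SQL string
declare @Pairs as dbo.CharacterPairList;
insert into @Pairs ( CharacterPair ) values ( 'rx' ), ( 'x4' ), ( '48' ), ( '80' );
exec dbo.SearchByPairs @CharacterPairs = @Pairs;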
As far as SQL injection goes, limiting the text to character pairs does preclude adding complete statements. It does, as others have noted, allow for corrupting the query and causing it to fail. That, in my mind, constitutes a problem.
I'm still trying to imagine a use-case for this rather odd pattern matching. It won't match a column value longer (or shorter) than two characters against a search string.

There definitely should be a canonical answer to all these innumerable "if I have [some special kind of data treatment], will my query still be vulnerable?" questions.
First of all, ask yourself: why are you looking to buy yourself such an indulgence? What is the reason? Why do you want to add an exception to your data processing? Why separate your data into the sheep and the goats, telling yourself "this data is safe, I won't process it properly, and that data is unsafe, I'll have to do something about it"?
The only reason such a question can even appear is your application architecture, or rather the lack of one. Only in spaghetti code, where user input is added directly to the query, can such a question ever occur. Otherwise, your database layer should be able to process any kind of data, remaining totally ignorant of its nature, origin or alleged "safety".

Query to ignore rows which have non hex values within field

Initial situation
I have a relatively large table (ca. 0.7 million records) where an nvarchar field "MediaID" largely contains media IDs in proper hexadecimal notation (as they should).
Within my "sequential" query (each query depends on the output of the query before, this is all in pure T-SQL) I have to convert these hexadecimal values into decimal bigint values in order to do further calculations and filtering on these calculated values for the subsequent queries.
--> So far, no problem. The "sequential" query works fine.
Problem
Unfortunately, some of these media IDs contain non-hex characters - most probably because of typing errors by the people who added them, or import errors from the previous business system.
Because of these non-hex characters, the whole query fails (of course) when the conversion hits an error.
For my current purpose such rows must be skipped/ignored, as they are clearly wrong and cannot be used (no media / data carriers with non-hex character IDs are in use with the current business system).
Manual editing of the data is not an option, as there are too many errors and it is not clear what the data should be replaced with.
Challenge
To create a query which only returns records which have valid hex values within the media ID field.
(Unfortunately, my SQL skills are not enough to create the above query. Your help is highly appreciated.)
The relevant section of the larger query looks like this (xxxx is where your help comes in :-))
select
    pureMediaID
    , mediaID
    , CUSTOMERID
    , CONTRACT_CUSTOMERID
from
(
    select concat('0x', Replace(Ltrim(Replace(mediaID, '0', ' ')), ' ', '0')) AS pureMediaID
    --, CUSTOMERID
    , *
    from M_T_CONTRACT_CUSTOMERS
    where mediaID is not null
    and mediaID like '0%'
    and xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
) as inner1
EDIT: As per request I have added here some good and some bad data:
Good:
4335463357
4335459809
1426427996
4335463509
4335515039
4335465134
4427370396
4335415661
4427369036
4335419089
004BB03433
004e7cf9c6
00BD23133
00EE13D8C1
00CCB5522C
00C46522C
00dbbe3433
Bad:
4564589+
AB6B8BFC.8
7B498DFCnm
DB218DFChb
d<tgfh8CFC
CB9E8AFCzj
B458DFCjhl
rytzju8DFC
BFCtdsjshj
DB9888FCgf
9BC08CFCyx
EB198DFCzj
4B628CFChj
7B2B8DFCgg
After I upgraded the compatibility level of the SQL instance to SQL Server 2016 (it was below 2012 before), I could use TRY_CONVERT with the same syntax as the original CONVERT function, as donPablo pointed out. With that, the query runs all the way through and every MediaID which is not a correct hex value gets nicely converted into a NULL value - really, really nice.
Exactly what I needed.
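For anyone wondering what that looks like, here is a minimal illustration of the TRY_CONVERT behaviour (this is not the OP's exact conversion expression, which isn't shown in the question); style 1 interprets a '0x...' string as hex, and TRY_CONVERT returns NULL instead of raising an error when the input is invalid:
-- illustrative only; values taken from the good/bad samples above
select try_convert(varbinary(8), '0x4BB03433', 1)   as good_hex  -- 0x4BB03433
     , try_convert(varbinary(8), '0x7B498DFCnm', 1) as bad_hex;  -- NULL, no error raised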
Unfortunately, the solution of ALICE... didn't work out for me, as it was also (strangely) returning records which had the "+" character in them.
Edit: The added comment of Alice..., where you create a calculated field like this:
CASE WHEN "KEY" LIKE '%[^0-9A-F]%' THEN 0 ELSE 1 end as xyz
and then filter in the next query like this:
where xyz = 1
also works with SQL instances with a compatibility level below SQL 2012.
A great addition for people who still have to work with older SQL instances.
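Slotted into the query from the question, that approach might look roughly like this (a sketch only; the isHex alias is added for the example and the LIKE pattern comes from the comment above):
select
    pureMediaID
    , mediaID
    , CUSTOMERID
    , CONTRACT_CUSTOMERID
from
(
    select concat('0x', Replace(Ltrim(Replace(mediaID, '0', ' ')), ' ', '0')) AS pureMediaID
    , case when mediaID like '%[^0-9A-F]%' then 0 else 1 end as isHex
    , *
    from M_T_CONTRACT_CUSTOMERS
    where mediaID is not null
    and mediaID like '0%'
) as inner1
where isHex = 1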
An option (although not ideal in terms of performance) is to check the characters in the MediaID with a CASE expression and a LIKE pattern.
Hexadecimal values cannot contain characters other than the digits 0-9 and the letters A-F.
CASE WHEN MediaID LIKE '%[0-9A-F]%' THEN 1 ELSE 0 END
I would recommend writing a function that first evaluates MediaID and checks whether it is hexadecimal, and then running the query for the conversion.
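Such a helper might look something like this (a sketch only; the name dbo.IsHex and the nvarchar(50) length are assumptions, and the LIKE check mirrors the pattern discussed above):
create function dbo.IsHex ( @value nvarchar(50) )
returns bit
as
begin
    -- 1 = non-empty and contains only hex digits, 0 otherwise
    return case
        when @value is null or @value = N'' then 0
        when @value like N'%[^0-9A-F]%' then 0
        else 1
    end;
end;
-- usage: ... where dbo.IsHex(mediaID) = 1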

Why do I get different results depending on the function I use? (SQL Server)

I've been tasked with creating a report for my company. The report is generated from the results returned by the Stored Procedure spGenerateReport, which has multiple filters.
Inside the SP, this is how the filter is expected to work:
SELECT * FROM MyTable WHERE column1 IN (
'filters', 'for', 'this', 'report'
)
Entering the code above yields ~30000 rows in 9s. However, I want to be able to change my SP's filter by passing it a single argument (since I may use 1 or 2 or n filters), like so:
spGenerateReport 'Filters,for,this,report'
For this I have the user-defined function fnSplitString (yes, I do know that there is a STRING_SPLIT function, but I can't use it due to the lower compatibility level of my database), which splits a single string into a table, like so:
SELECT splitData FROM fnSplitString('Filters,for,this,report')
Returns:
splitData
------
Filters
for
this
report
Thus the final code in my SP is:
SELECT * FROM MyTable WHERE column1 IN (
SELECT * FROM fnSplitString('Filters,for,this,report')
)
However, this instead yields ~10000 rows in 60s. The time taken to complete the SP is odd but isn't too much of a problem; however, roughly two thirds of my rows disappearing into the void certainly is. The results only contain rows matching the first couple of filters (for example, 'Filters' and 'for'); if I change the order of the arguments (e.g. fnSplitString('report,for,Filters,this')), I get a different number of rows, and only from the filters 'report', 'for' and 'Filters'! I don't understand why using the function returns different results than those obtained with the literal strings. Is there some inside gimmick that I'm not aware of?
PS - I'm sorry in advance for being bad at explaining myself, and for any grammar mistakes
You should definitely be getting the same results with both techniques. Something is wrong.
You haven't posted the fnSplitString code, but I suspect fnSplitString is not outputting the last string in the list, or maybe the last string in the list is being truncated before it reaches fnSplitString, so that no matches are found.
E.g. if the parameter going into your spGenerateReport stored procedure is varchar(20), then what reaches the function is 'Filters,for,this,rep', with the last bit truncated.
SSRS, for example, will truncate strings that are passed into an SP instead of warning you with an error message.
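A quick way to test the truncation theory (the varchar(20) length here is just an assumption chosen to reproduce the symptom):
declare @filter varchar(20) = 'Filters,for,this,report'; -- silently truncated on assignment
select @filter as received_value;                        -- 'Filters,for,this,rep'
select splitData from fnSplitString(@filter);            -- 'report' never comes out intact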

SQL not finding results

This query is currently returning no results, and it should return some. Can you see anything wrong with it?
The field names are NEED_2_TARGET, ID and CARD:
NEED_2_TARGET = integer
CARD = string
ID = integer
The value of name is 'Ash Imp'.
{this will check if a second target is needed}
//**************************************************************************
function TFGame.checkIf2ndTargetIsNeeded(name: string): integer;
//**************************************************************************
var
  targetType: integer; // 1 is TCard, 2 is TMana, 0 is no second target needed.
begin
  targetType := 0;
  Result := targetType;
  with adoquery2 do
  begin
    Close;
    SQL.Clear;
    SQL.Add('SELECT * FROM Spells WHERE CARD = ''' + name + ''' and NEED_2_TARGET = 1');
    Open;
  end;
  if adoquery2.RecordCount < 1 then
    Result := 0
  else
  begin
    adoquery2.First;
    targetType := adoquery2.FieldByName(FIELD_TARGET_TYPE).AsInteger;
    Result := targetType;
  end;
end;
sql db looks like below
ID CARD TRIGGER_NUMBER CATEGORY_NUMBER QUANTITY TARGET_NUMBER TYPE_NUMBER PLUS_NUMBER PERCENT STAT_TARGET_NUMBER REPLACEMENT_CARD_NUMBER MAX_RANDOM LIFE_TO_ADD REPLACED_DAMAGE NEED_2_TARGET TYPE_OF_TARGET
27 Ash Imp 2 2 15 14 1 1
There are a number of things that could be going wrong.
First and most important in your troubleshooting: take your query and run it directly against your database. I.e. first confirm the query itself is correct by eliminating other things that could be going wrong. The more things you confirm are working, the less "noise" there is to distract you from solving the problem.
As others having pointed out if you're not clearing your SQL statement, you could be returning zero rows in your first result set.
Yes I know, you've since commented that you are clearing your previous query. The point is: if you're having trouble solving your problem, how can you be sure where the problem lies? So, don't leave out potentially relevant information!
Which brings us neatly to the second possibility. I can't see the rest of your code, so I have to ask: are you refreshing your data after changing your query? If you don't Close and Open your query, you may be looking at a previous execution's result set.
I'm unsure whether you're even allowed to change your query text while the component is Active, or even whether that depends on exactly which data access component you're using. The point is, it's worth checking.
Is your application connecting to the correct database? Since you're using Access, it's very easy to be connected to a different database file without realising it.
You can check this by changing your query to return all rows (i.e. delete the WHERE clause).
You may want to change the quotes used in your SQL query. Instead of: ...CARD = "'+name+'" ORDER... rather use ...CARD = '''+name+''' ORDER...
As far as I'm aware single quotes is the ANSI standard. Even if some databases permit double quotes, using them limits portability, and may produce unexpected results when passed through certain data access drivers.
Check the datatype of your CARD column. If it's a fixed length string, then the data values will be padded. E.g. if CARD is char(10), then you might actually need to look for 'Ash Imp '.
Similarly, the actual value may contain spaces before / after the words. Use select without WHERE and check the actual value of the column. You could also check whether SELECT * FROM Spells WHERE CARD LIKE '%Ash Imp%' works.
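A quick way to check for padding or stray spaces (a sketch against the Spells table from the question; Len/LTrim/RTrim work in Access SQL and most other engines):
SELECT ID, CARD, Len(CARD) AS CardLen
FROM Spells
WHERE LTrim(RTrim(CARD)) = 'Ash Imp'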
Finally, as others have suggested, you're better off using a parameterised query rather than dynamically building the query up yourself.
Your code will be more readable and flexible.
You can make your code strongly typed; and so avoid converting things like numbers and dates into strings.
You won't need to worry about the peculiarities of date formatting.
You eliminate some security concerns.
@GordonLinoff all fields in db are all caps
If that is true then that is your problem. SQL usually performs case-sensitive comparisons of character/string values unless you tell it not to do so, such as with STRCMP() (MySQL 4+), LOWER() or UPPER() (SQL Server, Firebird), etc. I would also go as far as wrapping the conditions in parentheses as well:
sql.Text := 'SELECT * FROM Spells WHERE (NEED_2_TARGET = 1) AND (STRCMP(CARD, "'+name+'") = 0) ORDER by ID';
sql.Text := 'SELECT * FROM Spells WHERE (NEED_2_TARGET = 1) AND (LOWER(CARD) = "'+LowerCase(name)+'") ORDER by ID';
sql.Text := 'SELECT * FROM Spells WHERE (NEED_2_TARGET = 1) AND (UPPER(CARD) = "'+UpperCase(name)+'") ORDER by ID';
This was actually an issue with the
with Adoquery2 do
begin
...
end
block: when using name in the SQL, it was really resolving to Adoquery2.Name, not the local variable name. I fixed this by renaming the variable to Cname and had no more issues after that.

Sorting '£' (pound symbol) in sql

I am trying to sort £ along with other special characters, but it's not sorting properly.
I want that string to be sorted along with other strings starting with special characters. For example I have four strings:
&!##
££$$
abcd
&#$%.
Now its sorting in the order: &!##, &#$%, abcd, ££$$.
I want it in the order: &!##, &#$%, ££$$, abcd.
I have used order by replace(column,'£','*') so that it sorts along with strings starting with *. Although this seems to work when querying the DB directly, once the code is deployed the £ gets replaced by �, i.e. replace(column,'�','*') ends up in the query, and it doesn't sort as expected.
How to resolve this issue? Is there any other solution to sort the pound symbol/£? Any help would be greatly appreciated.
You seem to have two problems: performing the actual sort, and (possibly) how the £ symbol appears in the results in your code. Without knowing anything about your code, client or environment it's rather hard to guess what you might need to change, but I'd start by looking at your NLS_LANG and other NLS settings at the client end. @amccausl's link might be useful, but it depends what you're doing. I suspect you'll find different values in nls_session_parameters when queried from SQL*Plus and from your code, which may give you some pointers.
The sorting itself is slightly clearer now. Have a look at the docs for Linguistic Sorting and String Searching and NLSSORT.
You can do something like this (with a CTE to generate your data):
with tmp_tab as (
select '&!##' as value from dual
union all select '££$$' from dual
union all select 'abcd' from dual
union all select '&#$%' from dual
)
select * from tmp_tab
order by nlssort(value, 'NLS_SORT = WEST_EUROPEAN')
VALUE
------
&!##
&#$%
££$$
abcd
4 rows selected.
You can get the sort values supported by your configuration with select value from v$nls_valid_values where parameter = 'SORT', but WEST_EUROPEAN seems to do what you want, for this sample data anyway.
You can see the default sorting in your current session with select value from nls_session_parameters where parameter = 'NLS_SORT'. (You can change that with an ALTER SESSION, but it's only letting me do that with some values, so that may not be helpful here).
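For completeness, the session-level variant mentioned above looks like this (assuming an Oracle session, and that WEST_EUROPEAN is accepted in your environment):
alter session set nls_sort = WEST_EUROPEAN;
-- plain ORDER BY on character columns now uses that linguistic sort
select value from nls_session_parameters where parameter = 'NLS_SORT';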
You need to make sure your application code is all proper UTF-8 (see http://htmlpurifier.org/docs/enduser-utf8.html for more details)
Seems like your issue is with db characterset, or difference in charactersets between the app and db. For Oracle side, you can check by doing:
select value from sys.nls_database_parameters where parameter='NLS_CHARACTERSET';
If this comes up ascii (like US7ASCII), then you may have issues storing the data properly. Even if this is the charset, you should be able to insert and retrieve sorted (binary sort) by using nvarchar2 and unistr (assuming they conform to your NLS_NCHAR_CHARACTERSET, see above query but change parameter), like:
create table test1(val nvarchar2(100));
insert into test1(val) values (unistr('\00a3')); -- pound currency
insert into test1(val) values (unistr('\00a5')); -- yen currency
insert into test1(val) values ('$'); -- dollar currency
commit;
select * from test1
order by val asc;
-- will give symbols in order: dollar('\0024'), pound ('\00a3'), yen ('\00a5')
I will say that I would not resort to using the national character set; I would probably change the DB character set to fit the needs of my data, as supporting two different character sets isn't ideal, but it's available anyway.
If you have no issues storing/retrieving on the data side, then your app/client characterset is probably different than your db.
Use nchar(168). It will work.
select nchar(168)

SQL REPLACE function, how to replace single letter

My code is as follows:
REPLACE(REPLACE(cc.contype,'x','y'),'y','z') as ContractType,
This REPLACEs correctly what I would like at first, but unfortunately the new "y"s produced by the first REPLACE then get changed to "z"s by the second one, when what I would like is
x > y
y > z
Does this make sense? I would not like all of the new Y's to then change again in my second REPLACE function. In Microsoft Access, I would do this with the following
IIf(cc.contype = 'x', 'y', IIf(cc.contype = 'y', 'z', cc.contype))
But I am not sure how to articulate this in SQL; would it be best to do this kind of thing in the client-side language?
Many thanks.
EDIT: Have also tried with no luck:
CASE WHEN SUBSTRING(cc.contype, 1, 1) = 'C'
THEN REPLACE(cc.contype, 'C', 'Signed')
CASE WHEN SUBSTRING(cc.contype, 1, 1) = 'E'
THEN REPLACE(cc.contype, 'E', 'Estimate') as ContractType,
Try doing it the other way round if you don't want the new "y"'s to become "z"'s:
REPLACE(REPLACE(cc.contype,'y','z'),'x','y') as ContractType
Not that I'm a big fan of the performance killing process of handling sub-columns, but it appears to me you can do that just by reversing the order:
replace(replace(cc.contype,'y','z'),'x','y') as ContractType,
This will transmute all the y characters to z before transmuting the x characters to y.
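A quick illustration with a made-up value 'xxyy' shows the difference the order makes:
select replace(replace('xxyy', 'x', 'y'), 'y', 'z') as wrong_order -- 'zzzz': the new y's get hit too
     , replace(replace('xxyy', 'y', 'z'), 'x', 'y') as right_order -- 'yyzz': x > y, y > z
from dual -- drop "from dual" if you're not on Oracle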
If you're after a more general solution, you can do unioned queries like:
select 'Signed: ' || cc.contype as ContractType
  from wherever cc
  where cc.contype like 'C%'
union all
select 'Estimate: ' || cc.contype as ContractType
  from wherever cc
  where cc.contype like 'E%'
without having to mess about with substrings at all (at the slight cost of prefixing the string rather than modifying it, and adding any other required conditions as well, of course). This will usually be much more efficient than per-row functions.
Some DBMS' will actually run these sub-queries in parallel for efficiency.
Of course, the ideal solution is to change your schema so that you don't have to handle sub-columns. Separate the contype column into two, storing the first character into contype_first and contype_rest.
Then whenever you want the full contype:
select contype_first || contype_rest ...
For your present query, you could then use a lookup table:
lookup_table:
first char(1) primary key
description varchar(20)
containing:
first description
----- -----------
C Signed:
E Estimate:
and the query:
select lkp.description || cc.contype_rest
from lookup_table lkp, real_table cc
where lkp.first = cc.first ...
Both these queries are likely to be blazingly fast compared to one that does repeated string substitutions on each row.
Even if you can't replace the single column with two independent columns, you can at least create the two new ones and use an insert/update trigger to keep them in sync. This gives you the old way and a new improved way for accessing the contype information.
And while this technically violates 3NF, that's often acceptable for performance reasons, provided you understand and mitigate the risks (with the triggers).
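A minimal sketch of that trigger idea, using Oracle-style syntax and the hypothetical names real_table / contype / contype_first / contype_rest from above:
create or replace trigger trg_contype_sync
before insert or update of contype on real_table
for each row
begin
  -- keep the split columns in step with the original contype value
  :new.contype_first := substr(:new.contype, 1, 1);
  :new.contype_rest  := substr(:new.contype, 2);
end;
/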
How about
REPLACE(REPLACE(REPLACE(cc.contype,'x','ahhhgh'),'y','z'),'ahhhgh','y') as ContractType,
ahhhgh can be replaced with any placeholder string that can't otherwise appear in the data.