Why is '[0] so commonly (and recently) appended to query strings? - websecurity

I use a WAF that monitors suspicious query string and form POST values. Over the last several months I've noticed a dramatic uptick in query strings that have this sequence of 4 characters appended to the usual (i.e. 'normal') values:
'[0]
Why is this so common? I know why stuff like SQL injection attempts and JS snippets appear all the time, but is this string a known attack vector?

Related

No space before ORDER BY - Why does this work?

I came across some SQL in an application which had no space before the "ORDER BY" clause. I was surprised that this even works.
Given a table of numbers called [counter] with a single column, counter_id, holding an incrementing list of integers, this SQL works fine in Microsoft SQL Server 2012:
select
*
FROM [counter] c
where c.counter_id = 1000ORDER by counter_id
This also works with strings, e.g.:
WHERE some_string = 'test'ORDER BY something
My question is: are there any potential pitfalls or dangers with this query? Conversely, are there any benefits, other than saving 8 bits of network traffic for that whitespace (which may well be a consideration in some applications)?
Let me explain the reason why this works with numbers and strings.
The reason is that numbers cannot start identifiers, unless the name is escaped. Basically, the first thing that happens to a SQL query is tokenization. That is, the components of the query are broken into tokens such as identifiers, keywords, and literals, which are then analyzed.
In SQL Server, keywords and identifiers and function names (and so on) cannot start with a digit (unless the name is escaped, of course). So, when the tokenizer encounters a digit, it knows that it has a number. The number ends when a non-digit character is encountered. So, a sequence of characters such as 1000ORDER BY is easily turned into three tokens, 1000, ORDER, and BY.
Similarly, the first time that a single quote is encountered, it always represents a string literal. The string literal ends when the final single quote is encountered. The next set of characters represents another token.
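To make the tokenization argument concrete, here is a minimal sketch in Python. It is a toy tokenizer written for illustration, not SQL Server's actual lexer: the rule set (integers, single-quoted strings, words, single-character symbols) and the names TOKEN_RE and tokenize are assumptions introduced here.

```python
import re

# Toy tokenizer, NOT SQL Server's real lexer: it only knows integer
# literals, single-quoted strings, words, and single-character symbols.
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<number>\d+)                     # a run of digits ends at the first non-digit
      | (?P<string>'(?:[^']|'')*')          # a string literal ends at its closing quote
      | (?P<word>[A-Za-z_][A-Za-z0-9_]*)    # keywords and identifiers cannot start with a digit
      | (?P<symbol>[^\sA-Za-z0-9_'])        # anything else, one character at a time
    )
""", re.VERBOSE)


def tokenize(sql):
    """Split sql into (kind, text) pairs using the toy rules above."""
    tokens = []
    pos = 0
    sql = sql.rstrip()
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


print(tokenize("where c.counter_id = 1000ORDER by counter_id"))
print(tokenize("WHERE some_string = 'test'ORDER BY something"))
```

Under these toy rules, 1000ORDER splits into a number token followed by a word token, and 'test'ORDER splits into a string token followed by a word token, with no whitespace needed in either case.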
Let me add that there is exactly zero reason to ever use these nuances. First, these rules are properties of SQL Server's tokenization and do not necessarily apply to other databases. Second, the purpose of SQL is for humans to be able to express queries. It is way, way more important that we read them.
As jarlh mentioned, there might be a difference during scanning and parsing of the tokens, but the execution plan is created correctly either way, so there is unlikely to be any significant advantage or disadvantage.
When the parser examines the characters, it checks for keywords, identifiers, and string constants, and matches them against the overall syntactic and semantic structure of the language. Since ORDER BY is a keyword and the SQL parser knows where it can legally appear in a query, it interprets it accordingly. This is why your ORDER BY does not throw an error.
Parsing sql query
Parsing SQL

data sanitization/clean-up

Just wondering…
We have a table where the data in certain fields is alphanumeric, comprising 1-2 letters followed by a 1-2 digit number, e.g. x2, x53, yz1, yz95.
The number of letters added before the number can be determined by the field so that certain fields will always have the same 1 letter added before the number while others will always have the same 2 letters.
For each field, the actual letters and number of letters added (1 or 2) are always the same, thus, we can always tell which letters appear before the numbers just via the field names.
For the purposes of all downstream data analysis, it is only ever the numeric value from the string which is important.
SQL queries are constructed dynamically behind a user form, where the final SQL can take many forms depending on which selections and switches the user has chosen. As a result, the VBA generating the SQL is fairly involved, with many conditions and variable pathways to the final SQL.
With this, it would make the VBA and sql much easier to write, read, debug, and perhaps increase the sql execution speed, etc. – if we were only dealing with a numeric datatype e.g. I wouldn’t need to accommodate the many apostrophes within the numerous lines of “strSQL = strSQL & …”
Given that the data being analysed is a copy that’s imported via regular .csv extracts from a live source, would it be acceptable to pre-sanitize/clean up these fields around the import stage by converting the data to numeric values and numeric field datatypes?
- perhaps either by modifying the SQL used to generate the extract, or by modifying the schema/VBA process used to import the extract into the analysis table, e.g. using a Replace function such as Replace(OriginalField, "yz", "") to strip out the yz characters.
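Yes, link the CSV "as is", and for each linked table create a straight select query that does the sanitization. Mid is 1-based, so start at position 2 for a field with a one-letter prefix and at position 3 for a field with a two-letter prefix (here assuming Field2 carries a two-letter prefix), like:
Select
Val(Mid([Field1], 2)) As NumField1,
Val(Mid([Field2], 3)) As NumField2,
etc.
Val(Mid([FieldN], 2)) As NumFieldN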
From
YourLinkedCsvTable
then use this query throughout your application when you need the data.
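If the clean-up were done at the import stage instead, as the question proposes, the idea could be sketched like this. It is written in Python rather than VBA purely for illustration; the field names, the FIELD_PREFIXES mapping, and both helper functions are hypothetical, and a real import would read the .csv extract from disk.

```python
import csv
import io

# Hypothetical mapping: field name -> the fixed letter prefix it always carries.
FIELD_PREFIXES = {"Field1": "x", "Field2": "yz"}


def strip_prefix(value, prefix):
    """Return the numeric part of value, e.g. ('yz95', 'yz') -> 95."""
    if not value.startswith(prefix):
        raise ValueError(f"{value!r} does not start with {prefix!r}")
    digits = value[len(prefix):]
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"{value!r} has no numeric part after {prefix!r}")
    return int(digits)


def clean_rows(csv_text):
    """Parse CSV text and convert every prefixed field to an int."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for field, prefix in FIELD_PREFIXES.items():
            row[field] = strip_prefix(row[field], prefix)
        rows.append(row)
    return rows


print(clean_rows("Field1,Field2\nx2,yz95\nx53,yz1\n"))
```

Raising on an unexpected prefix, rather than silently returning 0 the way Val does on non-numeric text, is a design choice in this sketch: it surfaces rows that break the "same letters every time" assumption.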

Is there a way to make the Sql Server FullText engine to stop recognizing numbers as numbers?

I have a user input processor that transforms an input such as this search into "this" AND "search". My processor works great: it breaks the user input into pieces according to what I believe to be word breakers, which include punctuation.
The problem is that, while indexing, SQL Server applies a special approach when it comes to numbers.
If you index the following string:
12,20.1231,213.23,45,345.234.324.556,234.534.345
it will find the following index keys:
12
1231
20
213
23,45 (detected decimal separator)
234.534.345 (detected thousand separator)
345.234.324.556 (detected thousand separator)
Now, if the user searches for 324, he/she won't find anything, because 324 is contained within the last entry rather than at its beginning. I'd like it to stop treating numbers as numbers and just index them the way it does words.
Is there a way to alter this behavior without writing too much code?
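The mismatch described above can be illustrated with a small Python sketch. It does not call or emulate SQL Server; it simply takes the index keys listed in the question as given, models the lookup as whole-key or prefix matching (an assumption based on the description above), and compares that with splitting on every punctuation character, which is what the input processor does. The function names are hypothetical.

```python
import re

# Index keys exactly as listed in the question for the sample string.
INDEX_KEYS = ["12", "1231", "20", "213", "23,45", "234.534.345", "345.234.324.556"]

SAMPLE = "12,20.1231,213.23,45,345.234.324.556,234.534.345"


def prefix_hits(term, keys):
    """Keys that equal or start with term (whole-key / prefix matching)."""
    return [key for key in keys if key.startswith(term)]


def split_on_punctuation(text):
    """Break text on every non-alphanumeric character, numbers included."""
    return [piece for piece in re.split(r"[^0-9A-Za-z]+", text) if piece]


print(prefix_hits("324", INDEX_KEYS))          # no key starts with 324
print("324" in split_on_punctuation(SAMPLE))   # 324 is its own piece when split on punctuation
```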
Not sure if I understand the question correctly, but if you are saying SQL Server is finding numbers where it should find strings, then casting them as strings seems to be the solution.
select cast(345234324556 as nvarchar)

does removing all non-numeric characters effectively escape data?

I use this function to strip all non-numeric characters from a field before writing to a MySQL DB:
function remove_non_numeric($inputtext) { return preg_replace("/[^0-9]/", "", $inputtext); }
Does this effectively escape the input data to prevent SQL Injection? I could wrap this function in mysql_real_escape_string, but thought that might be redundant.
Assumption is the mother of all bleep when it comes to sql injection. Wrap it in mysql_real_escape_string anyway.
It does not escape the data, but it is indeed an example of an OWASP recommended approach.
By removing all but numerics from the input you are effectively protecting against SQL-Injection by implementing a White list. There is no amount of paranoia that can make the resulting string (in this specific case) into an effective SQL Injection payload.
However, code ages, changes, and is misunderstood as it's inherited by new developers. So the bottom line, the correct advice, the be-all and end-all, is to actively prevent SQL Injection with one or more of the following 3 steps. In this order. Every. Single. Time.
1. Use a safe database API (prepared statements or parametrized queries, for example).
2. Use DB-specific escaping routines (mysql_real_escape_string falls into this category).
3. White list the domain of acceptable input values (your proposed numeric solution falls into this category).
mysql_real_escape_string is not the answer for all anti-SQL-injection. It's not even the most robust method, but it works. Stripping all but numerics is white listing the safe values and is also a sound idea. However, neither is as good as using a safe API.
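As a rough illustration of combining the first and third steps, here is a sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the contacts table, phone column, and insert_phone helper are all made up for the example; the point is only that the value travels as a bound parameter and is never concatenated into the SQL text.

```python
import re
import sqlite3


def remove_non_numeric(text):
    """White-list step: keep ASCII digits only, like the PHP function in the question."""
    return re.sub(r"[^0-9]", "", text)


def insert_phone(conn, raw_value):
    """Store the cleaned value using a bound parameter, never string concatenation."""
    cleaned = remove_non_numeric(raw_value)
    # The "?" placeholder makes the driver pass the value separately from the SQL text.
    conn.execute("INSERT INTO contacts (phone) VALUES (?)", (cleaned,))
    return cleaned


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("CREATE TABLE contacts (phone TEXT)")
insert_phone(conn, "555-0100'; DROP TABLE contacts; --")
print(conn.execute("SELECT phone FROM contacts").fetchall())
```

Even if the white-list step were later removed or misunderstood by a new developer, the bound parameter would still keep the input out of the SQL text, which is why the safe API comes first in the list.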

How can you query a SQL database for malicious or suspicious data?

Lately I have been doing a security pass on a PHP application and I've already found and fixed one XSS vulnerability (both in validating input and encoding the output).
How can I query the database to make sure there isn't any malicious data still residing in it? The fields in question should be text with allowable symbols (-, #, spaces) but shouldn't have any special html characters (<, ", ', >, etc).
I assume I should use regular expressions in the query; does anyone have prebuilt regexes especially for this purpose?
If you only care about non-alphanumerics and it's SQL Server you can use:
SELECT *
FROM MyTable
WHERE MyField LIKE '%[^a-z0-9]%'
This will show you any row where MyField has anything except a-z and 0-9.
EDIT:
Updated pattern would be: LIKE '%[^a-z0-9!-# ]%' ESCAPE '!'
I had to add the ESCAPE char since you want to allow dashes -.
For the same reason that you shouldn't be validating input against a black-list (i.e. list of illegal characters), I'd try to avoid doing the same in your search. I'm commenting without knowing the intent of the fields holding the data (i.e. name, address, "about me", etc.), but my suggestion would be to construct your query to identify what you do want in your database then identify the exceptions.
The reason is that there are simply so many different character patterns used in XSS. Take a look at the XSS Cheat Sheet and you'll start to get an idea. Particularly when you get into character encoding, just looking for things like angle brackets and quotes is not going to get you too far.
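The "identify what you do want, then find the exceptions" idea could look something like the following Python sketch, run over rows already fetched from the database. The allowed set (letters, digits, -, # and space) is taken from the question; the ALLOWED pattern, the find_exceptions helper, and the sample rows are assumptions for illustration.

```python
import re

# White list: letters, digits, and the symbols the question permits (-, #, space).
ALLOWED = re.compile(r"[A-Za-z0-9#\- ]*")


def find_exceptions(rows):
    """Return (row_id, value) pairs whose value has any character outside the white list."""
    return [(row_id, value) for row_id, value in rows if not ALLOWED.fullmatch(value)]


sample_rows = [
    (1, "Apt #4 - Main"),
    (2, "<script>alert(1)</script>"),
    (3, "&#x3C;img src=x"),   # encoded angle bracket, no literal < or >
    (4, ""),
]
print(find_exceptions(sample_rows))
```

Row 3 contains no angle brackets or quotes at all, so a search for those characters alone would miss it, while the white-list check flags it because of the & and ; characters.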