CAT.NET "Sanitize the file path prior to passing it to file system routines" message - code-analysis

I'm analyzing my code (C#, desktop application) with CAT.NET Code Analysis and getting "Sanitize the file path prior to passing it to file system routines" message when dealing with file names.
What I don't understand is that to ensure the file name is valid, I use:
void SomeMethod(String filename)
{
    filename = System.IO.Path.GetFullPath(filename);
    // ... Do stuff
}
Isn't it a "magic solution" to solve problems with invalid file names ? I've read something similar here (first answer), but in my case I'm dealing only with local files, well, something very basic, so...
So why I'm getting this message and how to do to avoid getting it?

I know this is an old question, but I've just come across something that may be helpful specifically related to the CAT.Net error message.
In a blog post about the CAT.Net Data Flow Rules, they have this to say about the FileCanonicalizationRule:
Description
User input used in the file handling routines can potentially lead to
File Canonicalization vulnerability. Code is particularly susceptible
to canonicalization issues if it makes any decisions based on the name
of a resource that is passed to the program as input. Files, paths,
and URLs are resource types that are vulnerable to canonicalization
because in each case there are many different ways to represent the
same name.
Resolution
Sanitize the file path prior to passing it to file handling routines.
Use Path.GetInvalidFileNameChars or Path.GetInvalidPathChars to get
the invalid characters and remove them from the input. More
information can be found at
http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidfilenamechars.aspx.
So, they suggest that you use Path.GetInvalidFileNameChars and Path.GetInvalidPathChars to validate your paths.
Note that their suggestion is to remove the invalid characters. While this will indeed make the path valid, it may cause unexpected behaviour for the user. As the comments on this question/answer suggest, it's probably better to quit early and tell the user that their path is invalid, rather than doing something unexpected with their input (like removing the bad characters and using the modified version).
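For example, a minimal sketch of that reject-early approach (the PathValidator class and IsValidPath helper are illustrative names, not part of CAT.NET):

using System;
using System.IO;

static class PathValidator
{
    // Reject, rather than silently strip, characters that cannot
    // legally appear in a path.
    public static bool IsValidPath(string candidate)
    {
        if (string.IsNullOrWhiteSpace(candidate))
            return false;
        return candidate.IndexOfAny(Path.GetInvalidPathChars()) < 0;
    }
}

Calling code can then fail early - for example, throw an ArgumentException with a clear message - before ever passing the input to System.IO.Path.GetFullPath or other file system routines.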

If the filename comes from a user, it could be something like "../../../../etc/passwd" - the error message is telling you that you need to sanitize it so that it can't reach directories it's not supposed to access.
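A common way to enforce that is to canonicalize the path and then verify it stays under an allowed root. A C# sketch (baseDirectory is an assumed application-specific root, not something from the question):

using System;
using System.IO;

static bool IsUnderBase(string baseDirectory, string userPath)
{
    // Canonicalize both paths, then compare against the permitted
    // root. The trailing separator ensures "c:\data-evil" does not
    // pass a check for "c:\data".
    string root = Path.GetFullPath(baseDirectory)
        .TrimEnd(Path.DirectorySeparatorChar) + Path.DirectorySeparatorChar;
    string resolved = Path.GetFullPath(Path.Combine(root, userPath));
    return resolved.StartsWith(root, StringComparison.OrdinalIgnoreCase);
}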

Response.BinaryWrite(TEXT) is causing Fortify/Checkmarx XSS error

This is a pretty straightforward situation. We have a database table that has a VARBINARY(MAX) field, this field contains a text file. On the .NET side the user can download the text file from the database. It's just plain text and coming from a trusted source. However, fortify/checkmarx complains about Stored XSS. The code is pretty straightforward.
Response.Clear()
Response.ContentType = "text/plain"
Response.AddHeader("content-disposition, $"attachment;filename=FileToDownload.txt")
Response.BinaryWrite(datafromDB)
Response.[End]()
The vulnerability scan points to the Response.BinaryWrite() call and complains of Stored XSS. Of course this seems silly considering the data is coming from a trusted source, but I want to find a way to remediate it. Is there a way to filter or sanitize the "datafromDB" object before it hits Response.BinaryWrite?
If the response is text, escaping it should not change the content to a form that would be potentially exploitable. You should escape data before it is sent to the client, and here is why:
"Trusted data" does not exist. You are not considering:
A disgruntled employee may change the data in the DB and inject exploitable content.
There may be more than one path for data to get into the database. Not all of those paths may perform sanitization on the data before it is written to the DB. It is often common to have applications where data is imported into the back end database via an off-line mechanism.
The vulnerability analysis is not going to consider the content type, since the effect of setting the content type is essentially a "control flow" concern. The current vulnerability analysis technique (data flow analysis) looks at the path of the data from source (DB) to sink (response stream) to recognize the exploit pattern. Setting the content type is not on that data flow path; even if it were, the static string content is not evaluated for how it affects the state of the response object, because that is a control flow change.
If you mark it as "Not Exploitable" as-is, it is true for the code in its current state that it is not exploitable. However, if someone comes along later and changes the content type or encoding value without changing the code involved in the data flow, the NE marking is maintained. You will therefore have an undetected "False Negative", because a control flow change has now made the code exploitable.
Reliance on "compensating controls" such as setting response headers or relying on deployment configuration/environment controls works until it doesn't. Eventually, someone may decide to make a change that removes a compensating control (e.g. changing the response content type in this case) that was used to justify marking something Not Exploitable rather than remediating the issue properly. It then becomes a security hole that may be sitting in the open waiting for someone to find and exploit.
I'm all for relying on compensating controls, but with that comes the need to ensure changes to compensating controls are detected. It is often easier to implement the proper remediation rather than add more automation around detecting changes to compensating controls. The software implementation is the last line of defense when all else fails.
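If you do choose to remediate at the sink as described above, here is a C# sketch (assuming the VARBINARY column holds UTF-8 text; WriteFileSafely and GetFileBytes would be your own names, not an existing API):

using System.Text;
using System.Web;

void WriteFileSafely(HttpResponse response, byte[] dataFromDb)
{
    // Decode the stored bytes, then HTML-encode at the sink so any
    // markup in the database is inert even if the response is ever
    // rendered as HTML rather than downloaded as plain text.
    string text = Encoding.UTF8.GetString(dataFromDb);
    string safe = HttpUtility.HtmlEncode(text);

    response.Clear();
    response.ContentType = "text/plain";
    response.AddHeader("content-disposition", "attachment;filename=FileToDownload.txt");
    response.Write(safe);
    response.End();
}

Note the trade-off: ampersands and angle brackets in the file arrive encoded (& becomes &amp;), so whether this is acceptable depends on how the downloaded file is consumed.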

If there's a typo in my program's settings file, what is the proper way to handle it?

I have a program that reads the settings from a settings.ini file. If it comes across a typo or similar issue, what should it do?
Given a settings.ini-file's syntax is like this:
DefaultFolder = C:\Settings
Programversion = 0.52
If the program comes across, say, a spelling error, like this:
DefauFolder = C:\Settings
What should it do?
The answer to this is, inevitably, 'Fail gracefully'. However, in the case of a settings file the 'fail' part is what is in question.
Typically, when a piece of code requires a setting, it will fail if it doesn't find that setting, throwing an appropriate error or exception. In that case, it is the code that needs the setting that should fail gracefully.
Also typical is to see a settings file that allows settings beyond what is strictly required. All additional settings are ignored. Whatever ingests the settings simply skips by them (or puts them in a registry, or what have you). But it still assumes the user knows what they are doing.
Anything more complex requires knowing a lot about user intent.
Given your specific example, this means:
The DefauFolder setting is consumed by the setting ingester and placed in a registry or hashmap, where no one looks at it.
When code needs the DefaultFolder setting, it looks in the settings registry and doesn't find it. It fails with an error message that says something like, "The DefaultFolder setting is required in the ini file to complete this operation, but could not be found."
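A minimal C# sketch of that ingest-then-fail-at-use pattern (SettingsRegistry and GetRequired are illustrative names, not from the answer):

using System;
using System.Collections.Generic;
using System.IO;

class SettingsRegistry
{
    private readonly Dictionary<string, string> _values =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Ingest every "key = value" line; misspelled keys such as
    // DefauFolder are stored but simply never looked at.
    public void Load(string path)
    {
        foreach (string line in File.ReadAllLines(path))
        {
            int eq = line.IndexOf('=');
            if (eq > 0)
                _values[line.Substring(0, eq).Trim()] = line.Substring(eq + 1).Trim();
        }
    }

    // Fail gracefully at the point of use, naming the missing setting.
    public string GetRequired(string key)
    {
        if (!_values.TryGetValue(key, out string value))
            throw new KeyNotFoundException(
                $"{key} setting is required in the ini file but could not be found.");
        return value;
    }
}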

Modsecurity create config file with rules for specific URL

I'm starting to learn about ModSecurity and rule creation, so say I know a page in a web app is vulnerable to cross-site scripting. For argument's sake, let's say page /blah.html is prone to XSS.
Would the rule in question look something like this?
SecRule ARGS|REQUEST_URI "blah" REQUEST_HEADERS "@rx <script>" id:101,msg:'XSS Attack',severity:ERROR,deny,status:404
Is it possible to create a config file for that particular page (or is it even wise to do so?), or better said, is it possible to create rules for particular URLs?
Not quite right; there are a few things wrong with this rule as written, but I think you get the general concept.
To explain what's wrong with the rule as it currently stands takes a fair bit of explanation:
First up, ModSecurity syntax for defining rules is made up of several parts:
SecRule
field or fields to check
values to check those fields for
actions to take
You have two sets of parts 2 and 3, which is not valid. If you want to check two things, you need to chain two rules together, which I'll show an example of later.
Next up, you are checking the Request Headers for the script tag. This is not where most XSS attacks exist; you should be checking the arguments instead - though see below for further discussion on XSS.
Also, checking for <script> is not very thorough. It would easily be defeated by <script type="text/javascript">, for example. Or <SCRIPT> (you should add a t:lowercase transform to ignore case). Or by escaping characters in ways that parts of your application might still process.
Moving on, there is no need to specify the @rx operator, as that's implied when no other operator is given. While there's no harm in being explicit, it's a bit odd that you didn't give it for blah but did for the <script> bit.
It's also advisable to specify a phase rather than rely on the default (phase 2). In this case you'd want phase 2, which is when all Request Headers and Request Body pieces are available for processing, so the default happens to be correct, but it's best to be explicit.
And finally, 404 is the "page not found" response code. A 500 or 503 might be a better response here.
So your rule would be better rewritten as:
SecRule REQUEST_URI "/blah.html" id:101,phase:2,msg: ‘XSS Attack’,severity:ERROR,deny,status:500,chain
SecRule ARGS "<script" "t:lowercase"
Though this still doesn't address all the ways that the basic check you are doing for a script tag can be worked around, as mentioned above. The OWASP Core Rule Set is a free set of ModSecurity rules that has been built up over time and has some XSS rules in it that you can check out. Though be warned, some of their regexes get quite complex to follow!
ModSecurity also works better as a more generic check, so it's unusual to want to protect just one page like this (though not completely unheard of). If you know one page is vulnerable to a particular attack, then it's often better to fix that page, or how it's processed, rather than using ModSecurity to handle it. Fix the problem at source rather than patching round it is always a good mantra to follow where possible. You would do that by sanitising user inputs and escaping HTML in them, for example.
That said, it is often a good short-term solution to use a ModSecurity rule to get a quick fix in while you work on the more correct longer-term fix - this is called "virtual patching". Though sometimes these have a tendency to become the long-term solution, as no one gets time to fix the core problem.
If, however, you wanted a more generic check for "script" in any arguments for any page, then that's what ModSecurity is more often used for. This adds extra protection on top of what should already be there in a properly coded app, as a backup and extra line of defence - for example, in case someone forgets to protect one page out of many by sanitising user inputs.
So it might be best to drop the first part of this rule and just have the following check all pages:
SecRule ARGS "<script" id:101,phase:2,msg: ‘XSS Attack’,severity:ERROR,deny,status:500,"t:lowercase"
Finally, XSS is quite a complex subject. Here I've assumed you want to check parameters sent when requesting a page. So if the page uses user input to construct what it displays, then you want to protect that - known as "reflected XSS". There are other XSS vulnerabilities though. For example:
Bad data stored in a database and later used to construct a page - known as "stored XSS". To address this you might want to check the results returned from the page via the RESPONSE_BODY variable in phase 4, rather than the inputs sent to the page in the ARGS variable in phase 2. Though processing response pages is typically slower and more resource-intensive than processing requests, which are usually much smaller.
You might be able to execute an XSS attack without going through your server at all, e.g. when loading external content like a third-party commenting system, or when the page is served over http and manipulated between server and client. This is known as "DOM-based XSS", and as ModSecurity sits on your server it cannot protect against these types of XSS attacks.
I've gone into quite a lot of detail there, but I hope that helps explain things a little more. I found the ModSecurity Handbook the best resource for teaching yourself ModSecurity, despite its age.

Stopping invalid file type or file name submissions in coldfusion

So, I'm having this lovely issue where people like to submit invalid file types or funky-named files... (like... hey_i_like_"quotes".docx). Sometimes they will even try to upload a .html link...
How should I check for something like this? It seems to create an error every time someone submits a poorly named item.
Should I create a cfscript that checks it before submission? Or is there an easier way?
If it were before submission, it would be JavaScript, not cfscript. JavaScript can always be got around, so I'd say you'd be better off doing it server-side with ColdFusion. Personally I'd just wrap the whole thing in a try/catch (you should do this anyway as a matter of course with all file-upload handling), and throw an error back at them if their filename is no good.
When you say submit, are you using cffile to allow your users to upload files?
If so, use the "accept" attribute with a try/catch around it. For example:
<cftry>
    <cffile action="upload"
        fileField="FileContents"
        destination="c:\files\upload\"
        accept="image/jpeg, application/msword">
    <cfcatch type="Any">
        <p>Sorry, we could not upload your file!</p>
    </cfcatch>
</cftry>
I personally would not use "just" JavaScript as this could be disabled and you are back in the same boat.
Hope this helps.
On the server, as part of validation, use reFindNoCase() along with an appropriate regex to check for a properly formatted file path. You can find lots of example regexes for file paths on the internet, such as this one. Hope that helps.
As @Duncan pointed out, client-side validation would most likely be in JavaScript. Personally, if I had the time/resources, I would do this as a convenience for the end user. If they upload an enormous PDF when a DOCX is required by the system, it would be annoying for them not to receive a message until the upload is complete.
As far as filenames go, it seems to me that the simplest solution (and one I've used in the past) is to assume all filenames are bad, and rename them. There are several ways to do this. If you need to preserve the original filename, I would just use urlEncodedFormat() to clean the filename into something that is web-friendly. If you need to preserve all versions, you can append a date/time stamp, so bob.docx becomes bob_201104051129.docx or somesuch. If you must keep the original filename without any changes, I would recommend setting up a DB table as a pointer system, keeping the original name, timestamp, and other metadata there and referring to the file by renaming it to the ID.
But urlEncodedFormat() is probably enough for what you've outlined.
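The timestamp-rename idea in sketch form (shown in C# for brevity; TimestampedName is an illustrative helper, not a CFML built-in):

using System;
using System.IO;

// e.g. bob.docx -> bob_201104051129.docx
static string TimestampedName(string originalFileName)
{
    string stem = Path.GetFileNameWithoutExtension(originalFileName);
    string ext = Path.GetExtension(originalFileName); // includes the dot
    return $"{stem}_{DateTime.Now:yyyyMMddHHmm}{ext}";
}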
For user experience it's best to validate client-side, but you should double-check server-side too.
For the client-side part, I recommend the jQuery validation plugin; it's easy to use.

Why would Apache be URL decoding my query string?

My Web host has refused to help me with this, so I'm coming to the wise folks here for some help "black-box debugging". Here's an edited version of what I sent to them:
I have two (among other) domains at dreamhost:
1) thefigtrees.net
2) shouldivoteformccain.com
I noticed today that when I host a CGI script on #1, by the time the
CGI script runs, the HTTP GET query string passed to it as the QUERY_STRING
environment variable has already been URL decoded. This is a problem because
it then means that a standard CGI library (such as perl's CGI.pm) will try to
split on ampersands and then decode the string itself. There are two
potential problems with this:
1) the string is doubly-decoded, so if a value is submitted to the script
such as "%2525", it will end up being treated as just "%" (decoded twice)
rather than "%25" (decoded once)
2) (more common) if there is an ampersand in a value submitted, then it
will get (properly) submitted as %26, but the QUERY_STRING env. variable will
have it already decoded into an "&" and then the CGI library will improperly
split the query string at that ampersand. This is a big problem!
The script at http://thefigtrees.net/test.cgi demonstrates this. It echoes back the
environment variables it is called with. Navigating in a browser to:
http://thefigtrees.net/lee/test.cgi?x=y%26z
You can see that REQUEST_URI properly contains x=y%26z (still encoded) but that
QUERY_STRING already has it decoded to x=y&z.
If I repeat the test at domain #2 (
http://www.shouldivoteformccain.com/test.cgi?x=y%26z ) I see that the
QUERY_STRING remains undecoded, so that CGI.pm then splits and decodes
correctly.
I tried disabling my .htaccess files on both to make sure that was not the
problem, and saw no difference.
Could anyone speculate on potential causes of this, since my Web host seems unwilling to help me?
thanks,
Lee
I have the same behavior in Apache.
I believe mod_rewrite will automatically decode the URL if it is installed; however, I have seen the auto-decode behavior even without it, and I haven't tracked down the other culprit.
A common workaround is to double encode the input parameter (taking advantage of URL decoding being safe when called on an unencoded URL).
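To illustrate the round trip (a C# sketch, not from the original answer):

using System;

class DoubleEncodeDemo
{
    static void Main()
    {
        string value = "y&z";
        string once = Uri.EscapeDataString(value);  // "y%26z"
        string twice = Uri.EscapeDataString(once);  // "y%2526z"

        // Send "twice" on the query string. The server's unwanted
        // decode yields "y%26z"; the CGI library's own decode then
        // recovers the original "y&z" intact.
        Console.WriteLine(twice);
    }
}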
Curious. Nothing I can see from here gives a clue why this would happen... I can only confirm that it is an environment bug, and suspect configuration differences, maybe rewrite rules.
Per CGI 1.1, this decoding should only happen to SCRIPT_NAME and PATH_INFO, not QUERY_STRING. It's pointless and annoying that it happens at all, but that's the spec. Using REQUEST_URI instead of those variables where available (i.e. on Apache) is a common workaround for places where you want to put out-of-bounds and Unicode characters in path parts, so it might be reasonable to do the same for query strings until some sort of resolution is available from the host.
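A sketch of that workaround (C# for illustration; a Perl CGI would do the equivalent split-then-unescape on REQUEST_URI):

using System;

static class QueryParser
{
    // Parse the still-encoded query portion of REQUEST_URI ourselves,
    // instead of trusting a pre-decoded QUERY_STRING.
    public static void Parse(string requestUri)
    {
        int q = requestUri.IndexOf('?');
        string rawQuery = q >= 0 ? requestUri.Substring(q + 1) : "";
        foreach (string pair in rawQuery.Split('&'))
        {
            string[] kv = pair.Split(new[] { '=' }, 2);
            string key = Uri.UnescapeDataString(kv[0]);
            string val = kv.Length > 1 ? Uri.UnescapeDataString(kv[1]) : "";
            Console.WriteLine($"{key} = {val}");  // x = y&z for ?x=y%26z
        }
    }
}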
VPSs are cheap these days...