This is a pretty straightforward situation. We have a database table with a VARBINARY(MAX) field that contains a text file. On the .NET side the user can download the text file from the database. It's just plain text and comes from a trusted source. However, Fortify/Checkmarx complains about Stored XSS. The code is pretty straightforward.
' Stream the stored file back to the client as a plain-text attachment.
Response.Clear()
Response.ContentType = "text/plain"
Response.AddHeader("content-disposition, $"attachment;filename=FileToDownload.txt")
Response.BinaryWrite(datafromDB)
Response.[End]()
The vulnerability scan points to the Response.BinaryWrite() and complains of Stored XSS. Of course this seems silly considering the data comes from a trusted source, but I want to find a way to remediate it. Is there a way to filter or sanitize the "datafromDB" object before it hits Response.BinaryWrite?
If the response is text, escaping it should not change the content to a form that would be potentially exploitable. You should escape data before it is sent to the client; here is why:
"Trusted data" does not exist. You are not considering:
A disgruntled employee may change the data in the DB and inject exploitable content.
There may be more than one path for data to get into the database, and not all of those paths may sanitize the data before it is written. It is common for data to be imported into the back-end database via an offline mechanism.
The vulnerability analysis is not going to consider the content type, because setting the content type is essentially a control-flow concern. The current vulnerability analysis technique (data flow analysis) looks at the path of the data from source (DB) to sink (response stream) to recognize the exploit pattern. Setting the content type is not on that data flow path; even if it were, the static string content is not evaluated for how it affects the state of the response object, because that is a control-flow change.
If you mark it as "Not Exploitable" as-is, that is true for the code in its current state. However, if someone comes along later and changes the content type or encoding without changing the code involved in the data flow, the NE marking is maintained. You will therefore have an undetected false negative, because a control-flow change has made the code exploitable.
Reliance on "compensating controls", such as setting response headers or relying on deployment configuration/environment controls, works until it doesn't. Eventually, someone may decide to make a change that removes a compensating control (e.g. changing the response content type in this case) that was used to justify marking something Not Exploitable rather than remediating the issue properly. It then becomes a security hole sitting in the open, waiting for someone to find and exploit it.
I'm all for relying on compensating controls, but with that comes the need to ensure changes to compensating controls are detected. It is often easier to implement the proper remediation rather than add more automation around detecting changes to compensating controls. The software implementation is the last line of defense when all else fails.
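To make that concrete, here is a minimal sketch of the escape-before-send approach in TypeScript/Express; the endpoint, the loadFileFromDb stub, and the escaping rules are illustrative assumptions standing in for the ASP.NET code in the question, not a drop-in replacement for it.

import express from "express";

const app = express();

// Minimal HTML escaper: neutralizes the characters an injected payload
// would need in order to be interpreted as markup if the response were
// ever rendered as HTML.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Stub standing in for the VARBINARY(MAX) read in the question.
async function loadFileFromDb(): Promise<Buffer> {
  return Buffer.from("example file contents");
}

app.get("/download", async (_req, res) => {
  const escaped = escapeHtml((await loadFileFromDb()).toString("utf8"));
  // The headers mirror the original code; the escaping happens on the
  // data flow path itself, which is what the scanner analyzes.
  res.setHeader("Content-Type", "text/plain");
  res.setHeader("Content-Disposition", 'attachment; filename="FileToDownload.txt"');
  res.send(escaped);
});

app.listen(3000);

Because the escaping now sits directly on the source-to-sink path, a data flow analyzer can recognize it as sanitization regardless of which headers are set around it.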
I like to use the correct HTTP methods when I'm creating an API. And usually it's very straightforward. POST for creating entities, PUT for updating them, GET for retrieving etc.
But I have a use-case here where I will create an endpoint that updates the status of multiple objects given 1 identifier.
e.g.:
/api/v1/entity/update-status
But note that I mentioned multiple objects. My team's initial thought was to map it as POST, but it won't actually be creating anything; plus, if you were to call the same endpoint multiple times with the same identifier, nothing would change after the first time, making it idempotent.
With this in mind, my idea was to create it as a PUT or even PATCH endpoint.
What do you smart people think?
I imagine PATCH would be the most correct way, although using PUT would not be incorrect either. Quoting RFC 5789:
The difference between the PUT and PATCH requests is reflected in the way the server processes the enclosed entity to modify the resource identified by the Request-URI. In a PUT request, the enclosed entity is considered to be a modified version of the resource stored on the origin server, and the client is requesting that the stored version be replaced. With PATCH, however, the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version. The PATCH method affects the resource identified by the Request-URI, and it also MAY have side effects on other resources; i.e., new resources may be created, or existing ones modified, by the application of a PATCH.
Whilst it is a convention in REST APIs that POST is used to create a resource, it doesn't necessarily have to be constrained to this purpose.
Referring back to the definition of POST in RFC 7231:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others):
Providing a block of data, such as the fields entered into an HTML form, to a data-handling process;
Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles;
Creating a new resource that has yet to be identified by the origin server; and
Appending data to a resource's existing representation(s).
Clearly, creation is only one of those purposes, and updating existing resources is also legitimate.
The PUT operation is not appropriate for your intended operation because, again per the RFC, a PUT is supposed to replace the content of the target resource (URL). The same also applies to PATCH, but since it is intended for partial updates of the target resource, you can target it at the URL of the collection.
So I think your viable options are:
POST /api/v1/entity/update-status
PATCH /api/v1/entity
Personally, I would go with PATCH, as I find it semantically more pleasing, but POST is not wrong. Note that using PATCH doesn't gain you anything in terms of communicating an idempotency guarantee to a consumer: per RFC 5789, "PATCH is neither safe nor idempotent", the same as POST.
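For illustration, a minimal client-side sketch of the PATCH option in TypeScript; the request body shape (a list of ids plus the new status) is an assumption, since the question doesn't specify one.

// Bulk status update against the collection URL, as discussed above.
// The body shape is hypothetical.
async function updateStatus(ids: string[], status: string): Promise<void> {
  const response = await fetch("/api/v1/entity", {
    method: "PATCH",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ ids, status }),
  });
  if (!response.ok) {
    throw new Error(`Status update failed: ${response.status}`);
  }
}

Calling it twice with the same arguments leaves the entities in the same state, matching the idempotent behaviour described in the question, even though PATCH itself makes no such guarantee.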
I am trying to design a REST API that will return detailed information about an identifier that I pass in. For the sake of an example, I am passing in an identifier and returning information about a specific vehicle. The problem that I am facing is that there are many different kinds of vehicles, each with different unique properties. I am wondering if there is a way so that I can only return the relevant details with the REST API.
Currently I plan on having one endpoint /vehicles and passing in the identifier as a parameter.
My current request will consist of something like this: GET /vehicles?id=123456
My current response structure will be something like this:
{
  "vehicleDetails": {
    "color": "someColor",
    "make": "someMake",
    "model": "someModel",
    "year": "someYear",
    "carDetails": {
      // some unique car fields
    },
    "motorcycleDetails": {
      // some unique motorcycle fields
    },
    "boatDetails": {
      // some unique boat fields
    }
  }
}
As you can see, there are some fields that are common to all vehicle types, but there are also fields that are unique to a certain type of vehicle, for example boatDetails. As far as I understand, I will have to return the entire resource, which will have many empty fields. For example, when I request information about a car, I will still have boat and motorcycle details returned as part of the JSON response, even though they will all be empty. My concern with this is that the response payloads will be rather large when only a small subset of the fields will actually be used by the consumer. Would it make sense to add another parameter to filter the fields that come back? Something like /vehicles?id=123456&type=Car? Then in my code I could manipulate the response structure based on the type parameter. I feel that this violates REST best practices. Any advice on how I could change the design of my API would be appreciated.
Note: I cannot use GraphQL for this and would appreciate input about how I can improve this REST API design
Sure, query parameters (as well as matrix and path parameters) are fine from a REST standpoint. You'll end up with a unique URI that identifies a resource, and responses will be cacheable regardless of what type of parameters you use. It is questionable, though, whether exposing the identifier as a query parameter has any advantage over exposing it directly as a path parameter, i.e. /vehicles/123456 in that case.
What is IMO more important than the actual form of the URI is the representation format you return. The proposed response format bears the potential of being treated as a typed resource instead of utilizing a general, multi-purpose format and proper content-type negotiation. Think of HTML, for example: it defines the syntax and semantics of certain hypermedia controls, and any arbitrary Web client is able to present the response and operate on it. With custom formats, however, you miss out on that feature.
If you only have one client, which moreover is under your control, the additional overhead is probably not worth aiming for a REST environment in general, as it is easier to change the client whenever the server changes. If you aim for a long-lived environment that may be used by clients not under your control, though, this is certainly something to consider.
Can I use parameters to change the response structure?
Yes.
Example: https://www.merriam-webster.com/dictionary/JPEG
Notice that even though the URI says JPEG, the server is actually returning a text/html document, and the browser knows that from the Content-Type header in the HTTP response.
Identifiers are semantically opaque, in the same way that variable names are. We tend to choose spellings that make things easy for humans, but the machines don't care what spelling conventions we use.
So there's no constraint that says that /vehicles?id=1 and /vehicles?id=2 have to have the same "structure".
So you could have application/prs.rbanders-car, application/prs.rbanders-boat and application/prs.rbanders-rocketship, each with its own specific processing rules.
More likely, though, you'll want to piggyback on some other, more common representation; so it's common to see a structured syntax suffix: application/prs.rbanders-car+json, etc. Effectively, you are promising a JSON document, but with a more specific schema.
Of course, there's nothing stopping you from creating an application/prs.rbanders-vehicle+json schema and describing some fields as optional, appearing only if some condition is met.
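To make the media-type option concrete, here is a minimal sketch in TypeScript/Express; the route, the record shapes, and the prs.rbanders-* type names (borrowed from above) are all illustrative assumptions, not a prescribed design.

import express from "express";

const app = express();

// Hypothetical data; each record carries only the fields its type has.
const vehicles: Record<string, { type: string } & Record<string, unknown>> = {
  "123456": { type: "car", color: "someColor", make: "someMake", trunkVolume: 420 },
  "654321": { type: "boat", color: "someColor", make: "someMake", draft: 1.2 },
};

app.get("/vehicles/:id", (req, res) => {
  const vehicle = vehicles[req.params.id];
  if (!vehicle) {
    res.sendStatus(404);
    return;
  }
  // Advertise a type-specific media type built on the +json structured
  // syntax suffix; the payload contains no empty placeholder objects.
  res.set("Content-Type", `application/prs.rbanders-${vehicle.type}+json`);
  res.send(JSON.stringify(vehicle));
});

app.listen(3000);

Because each record only carries the fields its type actually has, a car response never includes empty boatDetails or motorcycleDetails blocks.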
Different options, different trade-offs.
I feel that this violates REST best practices.
Not really - the important ideas are to handle metadata in a standard way (so that general purpose components can understand what is going on) and to use common formats where it makes sense to do so (to leverage the libraries that are already available).
I am designing my first REST API.
Suppose I have a (SOAP) web service that takes MyData1 and returns MyData2.
It is a pure function with no side effects, for example:
MyData2 myData2 = transform(myData1);
transform() does not change the state of the server. My question is: what REST call do I use? MyData1 can be large, so I will need to put it in the body of the request, which seems to require POST. However, POST seems to be used only to change server state, and transform() does not do that, so POST might not be correct. Is there a specific REST technique for pure functions that take and return something, or should I just use POST, read the response body, and not worry about it?
I think POST is the way to go here, because of the sheer fact that you need to pass data in the body. The GET method is used when you need to retrieve information (in the form of an entity), identified by the Request-URI. In short, that means that when processing a GET request, a server is only required to examine the Request-URI and Host header field, and nothing else.
See the pertinent section of the HTTP specification for details.
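As an illustration, a minimal TypeScript sketch of such a call; the /transform URL and the placeholder shapes standing in for MyData1/MyData2 are assumptions.

// Placeholder shapes standing in for MyData1 and MyData2.
interface MyData1 { payload: string }
interface MyData2 { transformed: string }

// POST because the (potentially large) input travels in the request
// body; the server computes the result without changing any state.
async function transform(input: MyData1): Promise<MyData2> {
  const response = await fetch("/transform", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(input),
  });
  if (!response.ok) {
    throw new Error(`transform failed: ${response.status}`);
  }
  return (await response.json()) as MyData2;
}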
It is okay to use POST
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
It's not a great answer, but it's the right answer. The real issue here is that HTTP, which is a protocol for the transfer of documents over a network, isn't a great fit for document transformation.
If you imagine this idea on the web, how would it work? Well, you'd click through a bunch of links to get to some web form, that form would let you specify the source data (perhaps including attaching a file), and submitting the form would send everything to the server; you'd get the transformed representation back as the response.
But because of the payload, you would end up using POST, which means that general-purpose components wouldn't have the information available to tell them that the request was safe.
You could look into the WebDAV specifications to see whether SEARCH or REPORT is a satisfactory fit; every time I've looked into them myself I've decided against using them (no, I don't want an HTTP file server).
The DataTables draw parameter documentation says it
is strongly recommended for security reasons that you cast this parameter to an integer, rather than simply echoing back to the client what it sent in the draw parameter, in order to prevent Cross Site Scripting (XSS) attacks
How can casting a parameter to an int help to prevent Cross Site Scripting?
You aren't supposed to do anything with it :-). On the client side it is dealt with automatically by DataTables. On the server side all you do is cast it to an int and send it back: whatever string an attacker might have put into draw, the cast reduces it to plain digits, so no markup or script can survive the round trip. This example shows basic initialisation of server-side processing:
http://datatables.net/examples/data_sources/server_side.html
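The linked example's server script is PHP, but the cast is the same idea in any backend. A minimal sketch in TypeScript/Express (the route is made up; the draw/recordsTotal/recordsFiltered/data fields follow the DataTables server-side protocol):

import express from "express";

const app = express();

app.get("/api/data", (req, res) => {
  // Whatever string the client sent in draw, parseInt reduces it to
  // digits (or 0), so no script fragment can be echoed back.
  const draw = Number.parseInt(String(req.query.draw ?? "0"), 10) || 0;
  res.json({
    draw,
    recordsTotal: 0,
    recordsFiltered: 0,
    data: [],
  });
});

app.listen(3000);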
For other attacks, the DataTables security manual describes two prevention options:
Prevention
There are two options to stop this type of attack from being successful in your application:
Disallow any harmful data from being submitted
Encode all untrusted output using a rendering function.
For the first option your server-side script would actively block all data writes (i.e. input) that contain harmful data. You could elect to simply disallow all data that contains any HTML, or use an HTML parser to allow "safe" tags. It is strongly recommended that you use a known and proven security library if you take this approach - do not write your own!
The second option to use a rendering function will protect against attacks when displaying the data (i.e. output). DataTables has two built in rendering functions that can be used to prevent against XSS attacks; $.fn.dataTable.render.text and $.fn.dataTable.render.number.
More Information: https://www.datatables.net/manual/security
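As a sketch of the second option, a minimal initialisation using those built-in renderers (assumes jQuery and DataTables are already loaded; the table id and column names are made up):

// Render functions encode values on output, so stored HTML in "name"
// is displayed as text rather than interpreted as markup.
$("#example").DataTable({
  columns: [
    { data: "name", render: $.fn.dataTable.render.text() },
    { data: "salary", render: $.fn.dataTable.render.number(",", ".", 0) },
  ],
});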
I came to think about this question a few days ago when I designed an HTML form that submits data via PHP to an SQL database. I solved my problem, but I am asking here a computer-theoretical question, which might help me (or others) in the future.
I want to protect myself from SQL injection, and I thought that instead of validating the data with PHP on the server side, I could have JavaScript validate the data on the client side (I am much more fluent in JS than in PHP) and then send it.
However, a sophisticated user might inspect the JavaScript (or the HTTP request) and then alter it to send his own malicious request to the server.
My question:
Is it theoretically possible to do a computation on the client side, where the code is visible to the user, and have the result sent in some way that ensures the data came from my program and not from altered code?
Can this be done by an RSA-scheme with public and private keys?
I want to protect myself from SQL injection, and I thought that instead of validating the data
Don't validate data to protect yourself from SQL Injection. Validate data to make sure it is in the format you want.
Escape data to protect yourself from SQL Injection (and do that escaping via prepared statements and parameterized queries).
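For example, a minimal sketch of a parameterized query; it assumes a PostgreSQL database accessed via node-postgres, and the users table and lookup function are illustrative.

import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the environment

// The driver ships the SQL text and the value separately, so the input
// can never be parsed as SQL syntax, whatever it contains.
async function findUserByEmail(email: string) {
  const result = await pool.query(
    "SELECT id, name FROM users WHERE email = $1",
    [email],
  );
  return result.rows;
}

The same pattern exists in PHP, the stack described in the question, via PDO prepared statements.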
Is it theoretically possible to do a computation on the client side, where the code is visible to the user, and have the result sent in some way that ensures the data came from my program and not from altered code?
No. The client-side code can be bypassed entirely. In this arena, it is useful only to quickly tell the user that their data would be rejected if it were submitted to the server.
Can this be done by an RSA-scheme with public and private keys?
No. You have to give one of the keys to the client. It can then be extracted and used independently of your code.