SOQL Injection in SFDC - api

What is the best way to avoid SOQL injection when querying Salesforce through the APIs?
The two main APIs I am interested in are the SOAP and REST APIs.
My current methods are either to never use any input from the user (which is impractical if they are searching for a Company Name) or to encode certain characters within the string.
However, I saw that Apex supports parameterisation, so I was wondering if there is a similar way of doing it through the APIs.

I think all you really need to do is make sure that the input, in this case the company name, is escaped properly. I am not aware of a parameterized way of building a query object for either of the APIs.
However, if you needed to, you could expose a custom web service method from within Salesforce so that you can pass the value in. Then, from within Apex, you can parameterize the value using a syntax similar to the one below:
public Account[] queryCompany(String companyName) {
    // :companyName is a bind variable, so the value is never spliced into the query text
    return [SELECT Id FROM Account WHERE Name = :companyName];
}
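For the plain escaping route on the client side, here is a minimal Python sketch of what "escaped properly" could mean for a value placed inside a single-quoted SOQL string literal. The function names are made up for illustration, and it only handles the backslash and the single quote; a LIKE filter would additionally need the % and _ wildcards escaped, which this sketch does not do.

```python
def escape_soql_literal(value: str) -> str:
    """Escape a value for use inside a single-quoted SOQL string literal."""
    # Escape the backslash first so the backslash added for the quote is not doubled.
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_company_query(company_name: str) -> str:
    # Hypothetical helper: builds the query text that would be sent to the API.
    escaped = escape_soql_literal(company_name)
    return f"SELECT Id FROM Account WHERE Name = '{escaped}'"
```

With this, a name such as O'Brien stays inside the literal instead of terminating it early.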

Philosophical rant
What are you after really :)
If your application should work the same way when accessed from different sources (Salesforce UI, PHP connector, some mobile applications), then it probably makes most sense to think of Apex as stored procedures that will be reused. This means you'd be passing safe parameters to them.
If you plan to hand-craft queries and not rely on Apex too heavily, maybe what you need is something like database.com or another cloud-based DB solution?
Actual answer
I'm not aware of an out-of-the-box way to pass the query command and its params separately (like bind variables/prepared statements) through the APIs. Both the REST and SOAP APIs give you what's essentially Database.query() within Apex. Sure, there are some differences, like the retrieve() command or queryMore(), but that's the baseline.
What you could do is to either expose some commonly used searches with methods similar to what John proposed (bonus points for extra performance - they're precompiled) or build something generic?
List<sObject> runQuery(String query, List<List<String>> params){...}
If the query passed to runQuery contains bind variables like params[0], it should work. It looks crazy and I didn't test it, though ;) I'd say that bind variables are the best method. The alternative would be to escape the user's input, but SQL and XSS injections can become amazingly creative. Check "Examples of XSS that I can use to test my page input?" for a start (yes, I'm aware you asked about SOQL only).
As for actual SOQL injection: http://wiki.developerforce.com/page/Secure_Coding_SQL_Injection. Since "worst that can happen" is that users will search for more than they were supposed to (no way to convert SELECT into INSERT) escaping should be safe-ish...

Related

Do I need to sanitize express route parameters?

If I have a route in Express with route parameters which are used to query my database, do I need to sanitize this parameter before using it?
What you do and don't need to sanitize is entirely dependent upon what you're doing with it.
The content in a route parameter comes entirely from the user, so it can be anything that is allowed in a URL and matches your route parameter. That means something harmful could be injected through it. But, again, whether harm is actually possible depends on the exact code you're using. If you were injecting this user content into a SQL statement, then there are all sorts of bad things it could do. If you were just passing it as a bound query argument to a specific database API, there may be no harm.
So, there is no general purpose answer that applies to all possible uses of the data. It depends on the exact code you're using it in.
If in doubt, sanitize and validate the user input before using it.
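To make the difference concrete, here is a small sketch in Python with the standard sqlite3 module rather than Express, just to show the shape of "validate, then pass as a query argument". The users table and the find_user name are assumptions made up for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, raw_id: str):
    """Look up a user by an id taken straight from a route parameter."""
    # Validate: the parameter must be a plain ASCII decimal number.
    if not (raw_id.isascii() and raw_id.isdigit()):
        raise ValueError("id must be a non-negative integer")
    # Bind the value with a placeholder instead of building the SQL string.
    cur = conn.execute("SELECT id, name FROM users WHERE id = ?", (int(raw_id),))
    return cur.fetchone()
```

The validation rejects obviously malformed input early, and the placeholder means that even a value that slipped through would be treated as data, not as SQL.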

Prevent SQL injection when SQL is supplied from the request

The company I am working at uses a REST API for database accesses. So basically, you just provide a SQL statement string and the REST API returns a Datatable. Now I am unsure as to how to prevent an SQL injection as I cannot generate the SQL command using parameters (as I normally would) since I have to provide a SQL statement string for the REST API.
The only way to make this safe is to whitelist a specific list of SQL queries that are pre-vetted. The REST API would compare the input to the whitelist. If the SQL query is one of the known queries in the whitelist, then it can run. Otherwise, the API returns an error status (I'd use 400 BAD REQUEST).
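A rough Python sketch of that whitelist check is below. The two statements in ALLOWED_QUERIES and the check_query name are placeholders invented for the example; a real list would come out of whatever vetting process you have.

```python
from http import HTTPStatus

# Hypothetical pre-vetted statements, for illustration only.
ALLOWED_QUERIES = frozenset({
    "SELECT id, name FROM customers",
    "SELECT id, total FROM orders WHERE status = 'open'",
})


def check_query(sql: str) -> HTTPStatus:
    """Return 200 if the statement is on the whitelist, otherwise 400."""
    # Exact match only (after trimming whitespace): no parsing, no partial matches.
    if sql.strip() in ALLOWED_QUERIES:
        return HTTPStatus.OK
    return HTTPStatus.BAD_REQUEST
```

The comparison is deliberately an exact string match. Anything cleverer, such as pattern matching on the SQL text, tends to reopen the hole the whitelist is meant to close.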
But I suppose the purpose of the API is to run any SQL statement the client inputs, verbatim. This is literally an SQL injection vulnerability by design. There is no way to make that not SQL injection.
Besides that, the API really goes against the conventions of a RESTful web service.
The URI of the request doesn't identify the resource.
I assume every request is a POST with the SQL query as a payload. You probably don't use HTTP methods like PUT, PATCH, or DELETE.
The message in response, being only a datatable, isn't self-descriptive; it doesn't contain metadata the client can use to manipulate the resource.
SQL, being a generative grammar, allows an unlimited variety of queries. This doesn't fit the HATEOAS principle that the REST server should be able to describe the valid actions on request. The client must have implicit knowledge of your database schema.
You don't have a REST API. You have a web service with no specific interface.
The presence of a "query anything" API should be a huge red flag. It's probably a sign that the project isn't specified well in other ways.

Is it ok for a REST api to be exposed via two HTTP methods?

The problem is that we have a complex query string for a search API and want to give users the convenience of using the body instead. So we want to allow both GET and POST (or PUT).
I understand that there will be a debate about search being a read-only operation that should be GET only as per REST standards. Also, PUT is not cache friendly, as I understand it.
But I also know that it's OK to deviate from the REST standards at times. Does it make sense to have two methods for the client's convenience?
Using POST directly to query data is not a good thing, precisely for the reasons that you mentioned. If your search string is complex, perhaps you could simplify things by splitting the querying process in two steps - one involving a POST, and another one involving straight GETs.
The first step creates a query template using the POST. The query string is sent via message body, and becomes a new resource that users can query through GET. Query string allows for parameters, in a way similar to SQL queries. Taking a wild guess at how your query might look, here is an example:
(userName = $name) || (createdBefore > $asOf && deleted=false)
Your users would POST this in a message body, and get a new resource identifier back. This resource identifies a parameterized "view" into your data. Let's say the resource id for this view is aabb02kjh. Now your users can query it like this:
https://app.yourserver.net/aabb02kjh?name=airboss&asof=20140101
This adds some complexity to your API, but it lets users define and reuse query templates with very simple and standard query strings.
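As a rough illustration of the two steps, here is a Python sketch with an in-memory dictionary standing in for real storage. The function names, the $name placeholder syntax handling, and the use of a random hex string as the resource id are all assumptions for the example, not a prescribed design.

```python
import re
import uuid

_TEMPLATES = {}  # in-memory stand-in for real storage: id -> expression
_PLACEHOLDER = re.compile(r"\$(\w+)")


def create_template(expression: str) -> str:
    """POST step: store the expression and return its new resource id."""
    template_id = uuid.uuid4().hex
    _TEMPLATES[template_id] = expression
    return template_id


def bind_template(template_id: str, query_params: dict):
    """GET step: pair the stored expression with the values from the query string."""
    expression = _TEMPLATES[template_id]  # a KeyError would map to a 404
    names = set(_PLACEHOLDER.findall(expression))
    missing = names - set(query_params)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    # Values stay separate from the expression; they are never spliced into it.
    return expression, {name: query_params[name] for name in names}
```

Note that this sketch matches placeholder names case-sensitively, so a $asOf placeholder would need a query parameter spelled asOf.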
Interesting question. For sending PUT or DELETE through a POST, there are common workarounds for overriding the HTTP method:
sending a _method hidden input field with the post data
sending a _method query param in the URL
sending an X-HTTP-Method-Override header with the post
etc. So if they are valid (I am not sure about that), then you could use the same approach for GET as well.
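For what it's worth, resolving those overrides on the server could look roughly like the Python sketch below. The function name and the plain-mapping arguments are assumptions for the example, and GET appears in the allowed set only to illustrate the idea floated above, not because it is established practice.

```python
from typing import Mapping

# GET is included only to illustrate the speculation above.
ALLOWED_OVERRIDES = {"GET", "PUT", "PATCH", "DELETE"}


def effective_method(method: str, headers: Mapping[str, str],
                     query: Mapping[str, str], form: Mapping[str, str]) -> str:
    """Return the method the client intended, honoring overrides on POST only."""
    if method.upper() != "POST":
        return method.upper()
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    override = (lowered.get("x-http-method-override")
                or query.get("_method")
                or form.get("_method"))
    if override and override.upper() in ALLOWED_OVERRIDES:
        return override.upper()
    return "POST"
```

Restricting overrides to requests that arrive as POST keeps a plain GET from being turned into something with side effects.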
According to the REST constraints (cache and the uniform interface) and the HTTP method definitions, you have to use GET for retrieval requests. There are only a few URL query languages that keep URLs readable, for example RQL, but you can always pick your favorite query language and serialize it for URL usage...
Another interesting approach is to add link descriptions for the URL. (But that is very new to me too.)

Backend database used in the API

By going through this API documentation page, is it possible to tell which database is being used in the backend?
Zomato API
MySQL would require a PHP file on the server to handle the requests, make queries, pack the data in JSON format, and then send it back to the device. But in this case parameters are passed to .json files. Please advise.
There is no way to "see through" to what the backend service actually used to provide you with the information you may query for. Are you sure you want to continue using this product? The site notes that Zomato will no longer be available to individuals, and that your API key will be disabled if you don't use it monthly.
I haven't read the specs for that particular API. But in general, is it possible to tell what database is being used on the back end by studying an API? No. That's the whole point of an API: It's supposed to shield the API-user from implementation details.
It's probably true that in many cases you could make reasonable guesses about what tools are being used on the back end. Like if you see that the API gives you a syntax for doing comparisons that looks exactly like the proprietary compare function used in Foobar SQL and not found in any other database product, that would be a strong clue. But even something like that wouldn't be proof. Maybe originally they were using Foobar SQL, then they switched to another database, but to maintain compatibility they wrote code to translate the Foobar SQL compare to standard SQL syntax.

RESTful API: Why not use only one base URL for both collections and items?

I was discussing RESTful APIs with a friend, and he asked why it uses two base URLs for collections and items (/dogs and /dogs/1234) instead of a single URL with query parameters like everything else (/dogs and /dogs?id=1234).
After some further discussion, I realized I couldn't come up with an argument that wasn't based around aesthetic reasons (meaning the URL looks better as /dogs/1234 instead of /dogs?id=1234).
You could have one base URL that handles both collections and single items for a resource, and it does seem strange that there is this one special case where you use a non-query parameter (/1234 instead of ?id=1234) to reference a resource.
Which leads me to ask, is there a specific, non-aesthetic reason to use two base URLs for a resource instead of one in a RESTful API?
One thing I considered was that nested resources like /dogs/1/fleas/10 seem awkward, but they are still doable with a single base URL (/dogs?id=1&flea_id=10).
URI design is only one very small part of a REST API, although you would think it is the only thing about being RESTful given the amount of time spent talking about it. Authentication, content types, response codes, method types (GET, POST, PUT, DELETE, OPTIONS), discoverability and caching strategies are much more important things to consider.
However, when thinking about whether or not query strings are appropriate, first make a determination of whether or not they can accurately represent a resource's state without changing it. Can the same resource (your dog) be identified at that location using that URI (presumably always)? Will that dog change in some way because you chose a query string with ID instead of representing the ID in the path? No, it won't, which is why a query string in this case is just fine. As a matter of fact either of those will do.
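To show that both forms can identify the same dog, here is a small Python sketch that resolves either URL shape to one id. The dog_id_from_url name is made up for the example, and the sketch only covers the /dogs collection.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def dog_id_from_url(url: str) -> Optional[str]:
    """Return the dog id from /dogs/<id> or /dogs?id=<id>, or None for the collection."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    if not segments or segments[0] != "dogs" or len(segments) > 2:
        raise ValueError("not a /dogs collection or item URL")
    if len(segments) == 2:
        return segments[1]  # path form: /dogs/1234
    ids = parse_qs(parts.query).get("id")
    return ids[0] if ids else None  # query form: /dogs?id=1234
```

Either way the handler ends up with the same identifier, which is the point: the choice between the two forms does not change the resource.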