I've recently been tasked with mitigating issues from a ColdFusion 2018 Server vulnerability report. In particular, this one...
Attack Type: SessionStrength
Session tokens that exhibit low entropy ("randomness") are often susceptible to prediction attacks. Insecure tokens can be due to an inadequate pseudo-random number generator, time-based values, static values, or values based on user attributes (username or user ID)... Session tokens should be created with a strong random number generator and drawn from a large pool of numbers. For example, an operating system's rand() function can usually be sufficient if it can produce 32-bit values with a statistically uniform distribution.
And the recommendation is:
Make sure that the token values are at least 32 bits in size, especially for applications with large numbers of concurrent users and high volumes of daily page requests.
My question is, how can I increase the randomness? The server uses J2EE session variables. Is there a way, perhaps on the Java side, that I can improve the randomness? Thank you.
Do you have either of these boxes checked in your ColdFusion Administrator?
Without this box checked, ColdFusion will not use a decent value for the CFToken. Instead it will use an incrementing 8-digit value. It's pretty poor default behavior.
Even if you do have it checked, it is possible that the incrementing behavior of the CFID (which is paired with the CFToken to identify a session) will raise a false flag with your vulnerability scanner. This is where the next checkbox can help.
With this box checked, you are telling ColdFusion to use a JEE session variable instead of the CFID/CFToken combination. This gives you a single session identifier backed by a strong token, which should satisfy your vulnerability scanner.
I don't recall offhand whether ColdFusion still writes the CFID/CFToken cookies when the JEE option is checked (despite the fact that it does not use them). If so, your scanner may still flag them. If that happens, I believe you can use the following code (in your Application.cfc) to tell ColdFusion not to create those cookies.
this.setClientCookies = false;
I have an application where different users may connect to different databases (either MySQL or Postgres). What would be the best way to cache those connections across the different databases? I have looked at some connection pools, but they seem designed for many connections to one database rather than for connections to many databases.
PS:
To add more context: I am designing a multi-tenant architecture where each tenant connects to one or more databases. I have the option of using map[string]*sql.DB where the key is the URL of the database, but that can hardly scale when we have a large number of databases. Or should we have a sharding layer that routes each incoming request by connection URL, so each machine holds just the right number of database connections in the form of map[string]*sql.DB?
An example of the kind of software I want to build is https://www.sigmacomputing.com/, where a user can connect to multiple databases to work with different tables.
Neither MySQL nor Postgres allows sharing a connection between multiple database users; a single database user is specified in the connection credentials. If you mean that your different users have their own database credentials, then it is not possible to share connections between them.
If by "different users" you mean your application users and if they share single database user to access DB deeper in the app, then you don't need to do anything particular to "cache" connections. sql.DB keeps and reuses open connections in its pool by default.
Go automatically opens, closes and reuses DB connections with a *database/sql.DB. By default it keeps up to 2 connections open (idle) and opens unlimited number of new connections under concurrency when all opened connections are already busy.
If you need some fine tuning on pool efficiency vs database load, you may want to alter sql.DB config with .Set* methods, for example SetMaxOpenConns.
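A minimal sketch of that tuning, assuming a Postgres DSN and the lib/pq driver (both are just placeholders; use whichever database/sql driver you actually have):

package dbutil

import (
    "database/sql"
    "time"

    _ "github.com/lib/pq" // illustrative driver choice; any database/sql driver works
)

// OpenTuned opens a pool and applies explicit limits instead of the defaults.
func OpenTuned(dsn string) (*sql.DB, error) {
    db, err := sql.Open("postgres", dsn)
    if err != nil {
        return nil, err
    }
    db.SetMaxOpenConns(10)                  // cap total connections this process holds
    db.SetMaxIdleConns(5)                   // keep a few idle connections for reuse (default is 2)
    db.SetConnMaxLifetime(30 * time.Minute) // recycle connections periodically
    db.SetConnMaxIdleTime(5 * time.Minute)  // close connections that sit idle too long
    return db, nil
}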
You seem to have too many unknowns. In cases like this I would apply good old agile: start with a prototype of what you want to achieve, using tools you already know, and then benchmark the performance. I think you might be surprised how much Go can handle.
Since you understand how to use map[string]*sql.DB for this purpose, I would go with that. You reach some limits? Add another machine behind HAProxy. Solving a scaling problem doesn't necessarily mean writing a new DB pool in Go. Obviously, if you need that kind of power you can always do it - the pgx Postgres driver has its own pool implementation, so you can draw inspiration from there. Doing this right now, however, seems to be premature optimization - solving a problem you don't have yet. Building a prototype with map[string]*sql.DB is easy; test it and benchmark it, and you will see whether you need more.
P.S. You will most likely hit the file descriptor limit before you are able to exhaust memory.
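For the prototype, a minimal sketch of such a registry might look like this (the Registry name and the small pool limit are just illustrative choices; the map itself needs a mutex because Go maps are not safe for concurrent writes, while each *sql.DB is already safe for concurrent use):

package dbreg

import (
    "database/sql"
    "sync"
)

// Registry lazily opens and caches one *sql.DB pool per connection URL.
type Registry struct {
    mu    sync.RWMutex
    pools map[string]*sql.DB
}

func New() *Registry {
    return &Registry{pools: make(map[string]*sql.DB)}
}

// Get returns the cached pool for dsn, opening it on first use.
func (r *Registry) Get(driver, dsn string) (*sql.DB, error) {
    r.mu.RLock()
    db, ok := r.pools[dsn]
    r.mu.RUnlock()
    if ok {
        return db, nil
    }

    r.mu.Lock()
    defer r.mu.Unlock()
    if db, ok := r.pools[dsn]; ok { // lost the race: another goroutine opened it already
        return db, nil
    }
    db, err := sql.Open(driver, dsn) // does not dial yet; connections are created on demand
    if err != nil {
        return nil, err
    }
    db.SetMaxOpenConns(5) // keep each tenant's pool small so many tenants fit on one machine
    r.pools[dsn] = db
    return db, nil
}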
Assuming you have multiple users and multiple databases in an N-to-N relation, you could keep a map from database URL to database details (explained below).
Which users have access to which databases should be handled separately anyway, for example via a configmap or a core database. For the database details, we could have a struct like this:
type DBDetail struct {
    sync.RWMutex
    connection *sql.DB
}
The map would be from database URL to that database's details (DBDetail), and when a user writes it calls this:
dbDetail.Lock()
defer dbDetail.Unlock()
and for reads it uses RLock/RUnlock instead.
As said by vearutop, the connections could be a pain, but with this approach you could keep a single connection per database, or enforce a limit by incrementing and decrementing a counter after taking the Lock.
There isn’t necessarily a correct architectural answer here. It depends on some of the constraints of the system.
I have the option of using map[string]*sql.DB where the key is the URL of the database, but that can hardly scale when we have a large number of databases.
Whether this will scale sufficiently depends on how numerous the databases are expected to be. If there will be tens or hundreds of concurrent users in the near future, a map is probably sufficient. Often a good next step after a map is to transition to a more full-featured cache (for example https://github.com/dgraph-io/ristretto).
A factor in the decision between a map and a cache is how you imagine the lifecycle of a database connection. Once a connection is opened, can it remain open for the remainder of the process's lifetime, or do connections need to be closed after some minutes of no use to free up resources?
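If connections do need to be reclaimed, one sketch using only the standard library (all names here are made up) is to track a last-used time per entry and sweep the map in the background:

package dbcache

import (
    "database/sql"
    "sync"
    "time"
)

type entry struct {
    db       *sql.DB
    lastUsed time.Time
}

type Cache struct {
    mu      sync.Mutex
    entries map[string]*entry
}

// Touch records that a pool was just used; call it on every request.
func (c *Cache) Touch(dsn string, db *sql.DB) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if c.entries == nil {
        c.entries = make(map[string]*entry)
    }
    c.entries[dsn] = &entry{db: db, lastUsed: time.Now()}
}

// Sweep closes pools that have been idle longer than maxIdle.
// Run it from a goroutine, e.g. once a minute.
func (c *Cache) Sweep(maxIdle time.Duration) {
    c.mu.Lock()
    defer c.mu.Unlock()
    for dsn, e := range c.entries {
        if time.Since(e.lastUsed) > maxIdle {
            e.db.Close() // Close waits for in-flight queries to finish
            delete(c.entries, dsn)
        }
    }
}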
Should we have a sharding layer that routes each incoming request by connection URL, so each machine holds just the right number of database connections in the form of map[string]*sql.DB?
The right answer here depends on how many processing nodes are expected and whether there are additional benefits to be gained from routing requests to specific machines. For example, row-level caching and isolating users from each other's requests are advantages gained by sharding users across the pool. A disadvantage is that you might end up with "hot" nodes, because a single user might generate the majority of the traffic.
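If you do go the sharding route, the routing itself can stay very simple. A sketch that hashes the connection URL to pick a node (the function and node list are hypothetical):

package shard

import "hash/fnv"

// PickNode routes a database URL to one of the processing nodes.
// Every frontend uses the same function, so a given database always
// lands on the same node and its connection pool stays warm there.
func PickNode(dbURL string, nodes []string) string {
    h := fnv.New32a()
    h.Write([]byte(dbURL))
    return nodes[h.Sum32()%uint32(len(nodes))]
}

Note that plain modulo hashing reshuffles most keys when the node count changes; consistent hashing avoids that, but for a prototype this is enough.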
Usually, a good strategy in situations like this is to be really explicit about the constraints of the problem. Jeff Dean coined a rule of thumb for exactly this:
Ensure your design works if scale changes by 10X or 20X but the right solution for X [is] often not optimal for 100X
https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf
So, if in the near future the system only needs to support tens of concurrent users, build the simplest thing that supports tens to hundreds of concurrent users (a map or cache with no user sharding is probably sufficient). That design will have to change before the system can support thousands of concurrent users, but scaling a system is often a good problem to have, because it usually indicates a successful project.
We have an Aerospike cluster of 8 nodes. We saw that during peak hours one of the nodes has a significantly higher load average than the other nodes. Also, in the AMC dashboard we saw that this node has only 30% read success. After following a few similar issues posted on the Aerospike community forum, we suspect that the presence of hotkeys might be the culprit.
Following https://discuss.aerospike.com/t/how-to-identify-read-hotkeys/4193, we identified a few hotkey digests with tcpdump in real time. Among the top 10 digests, the interesting thing is that one key is present 90% of the time.
We then followed https://discuss.aerospike.com/t/faq-how-keys-and-digests-are-used-in-aerospike/4663 to find the user key/record for those digests. We were able to map the user key for all of them except the one that is present 90% of the time.
Is there any way we can identify that hotkey?
Depending on your version of Aerospike, you can also change the logging level for the rw-client module, which will also print the digest in the logs. That may remove any false positives from the tcpdump method.
Turn on detail-level logging for the rw-client context:
asinfo -v "set-log:id=0;rw-client=detail"
Turn it back to info:
asinfo -v "set-log:id=0;rw-client=info"
Also, did you try the UDF from the above article to determine the set and key? (The original key is only stored if the client has explicitly enabled the SendKey policy.) Were there any corresponding record write failures, like record too big? Or possibly attempts to read a non-existent record (read not found)? Write failures from a record-too-big error would have the most impact on your network infrastructure. In both of these cases the digest and record would never make it to storage, so the digest would not match an existing record.
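As a side note on SendKey: if your writers happen to use the Go client (github.com/aerospike/aerospike-client-go), a minimal sketch of enabling it so that future writes store the original key alongside the digest (the namespace, set, and key values below are made up):

package main

import (
    "log"

    aero "github.com/aerospike/aerospike-client-go"
)

func main() {
    client, err := aero.NewClient("127.0.0.1", 3000)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    wp := aero.NewWritePolicy(0, 0)
    wp.SendKey = true // store the original user key with the record, not just the digest

    key, err := aero.NewKey("test", "myset", "user-123")
    if err != nil {
        log.Fatal(err)
    }
    if err := client.Put(wp, key, aero.BinMap{"visits": 1}); err != nil {
        log.Fatal(err)
    }
}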
It is possible that the frequent read requests for the rogue digest are failing with a 'not found' error (and hence only 30% read success). Aerospike will still spend CPU resources searching for this digest in the index tree. If that is the case, there will be no record in the database corresponding to the digest you found via tcpdump, so you will not get any details about it from the database. How did you identify the keys of the other digests, and what issue are you facing in finding the key corresponding to the rogue digest?
Another option is to track it back to the application. One approach is to check in the tcpdump whether all the requests for this rogue digest are coming from a single machine. That will narrow down your search greatly. We have seen bots creating this kind of mess in the past.
The system my company sells is software for a multi-machine solution. In some cases, there is a UI on one of the machines and a backend/API on another. These systems communicate and both use their own clocks for various operations and storage values.
When the UI's system clock gets ahead of the backend's by 30 seconds or more, the queries start to misbehave, because the UI's timestamp is sent as key information in the REST request. There is a "what has been updated by me" query that runs every 30 seconds, and the desync causes updated data to be missed because it falls outside the timing window.
Since I do not have any control over the systems that my software is installed on, I need a solution on my code's side. I can't force customers to keep their clocks in sync.
Possible solutions I have considered:
The UI can query the backend for its system time and cache that.
The backend/API can reach back further in time when looking for updates. This will give the clocks some room to slip around, but will cause a much heavier query load on systems with large sets of data.
Any ideas?
Your best bet is to restructure your API somewhat.
First, even though NTP is a good idea, you can't actually guarantee it's in use. Additionally, even when it is enabled, OSs (Windows at least) may reject packets that are too far out of sync, to prevent certain attacks (on the order of minutes, though).
When dealing with distributed services like this, the mantra is "do not trust the client". This applies even when you actually control the client, too, and doesn't necessarily mean the client is attempting anything malicious - it just means that the client isn't the authoritative source.
This should include timestamps.
Consider: the timestamps are a problem here because you're trying to use the client's time to query the server - except we shouldn't trust the client. Instead, the server should return a timestamp of when the request was processed, or the update stamp of the latest entry in the database, which can then be used in subsequent queries to retrieve new updates (how far back you go on the initial query is up to you).
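A minimal sketch of that pattern in Go (the endpoint and field names are invented): the server stamps every response with its own notion of "now", and the client echoes that value back as the since parameter on its next poll, so the client's clock never enters the picture:

package main

import (
    "encoding/json"
    "net/http"
    "time"
)

type updatesResponse struct {
    Updates    []string  `json:"updates"`
    ServerTime time.Time `json:"serverTime"` // client sends this back verbatim as ?since=...
}

func updatesHandler(w http.ResponseWriter, r *http.Request) {
    now := time.Now().UTC()

    since := now.Add(-30 * time.Second) // default window for a first poll
    if s := r.URL.Query().Get("since"); s != "" {
        if t, err := time.Parse(time.RFC3339Nano, s); err == nil {
            since = t // trust the value we handed out earlier, not the client's clock
        }
    }

    updates := loadUpdatesSince(since) // hypothetical data-access function

    json.NewEncoder(w).Encode(updatesResponse{Updates: updates, ServerTime: now})
}

func loadUpdatesSince(t time.Time) []string { return nil } // placeholder

func main() {
    http.HandleFunc("/updates", updatesHandler)
    http.ListenAndServe(":8080", nil)
}

Using the maximum updated-at value of the rows actually returned is an equally valid watermark; either way the only clock involved is the server's.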
Dealing with concurrent updates safely is a little harder, and depends on what is supposed to happen on collision. There's nothing really different here from most of the questions and answers dealing with database-centric versions of the problem, I'm just mentioning it to note you may need to add extra fields to your API to correctly handle or detect the situation, if you haven't already.
Would it be useful for a hacker in any way to publicly display current server stats, such as average load times and memory usage?
The only issue I can foresee is that someone attempting to DDoS the server would have a visible indication of success, or would be able to examine patterns to choose an optimal time to attack. Is this much of an issue if I'm confident in the host's anti-DDoS setup? Are there any other problems I'm not seeing? (I have a bad tendency to miss wide-open security holes sometimes...)
Also useful for staging a MITM attack at the busiest time.
So the attacker can acquire the most targets before possible detection.
Another thing I can think of is log file 'obfuscation', where an attacker's requests get lost among all the other logged traffic.
Maybe a long shot, but it can also be used to infer where your visitors are coming from (based on the times they access the website), which can be used to target your visitors in other ways.
Also, to expand on the possibility of attackers DDoSing the site: they can calculate the average response time at different times of day (when that doesn't happen automatically), because they can put load on the server themselves and see when the load gets lighter.
Yes it's useful.
It will help him know when he can download a big chunk of data, like a backup, without being noticed in the traffic statistics ;)
He will also know when he can attack, run a penetration test, brute-force, or whatever, with a better chance of hiding his tracks in the logs.
Furthermore, if he gains access he will know when he can collect the most credit cards or user passwords - for example, if he had no luck with the database or it's an XSS attack, etc.
DDoS is another point, which you mentioned already: memory and average load will tell him whether the attack is succeeding.
I always take precautions against SQL injection attacks when data is saved between someone's iPhone and a remote database in the cloud.
But is it also necessary to do the same... when just saving data (using sqlite) from someone's cell phone to a database that's only on their own phone?
What's the worst they can do? Delete their own data (or tables) on their own phone?
(If they really try hard enough.)
Thanks.
Is it necessary? - Yes, it's "necessary", i.e. it's probably worth it. Even if you don't care much about security in this context (which may be valid), you should worry about correctness (at the very least, it's a matter of pride).
What's the worst that could happen?
User #1 Patty O'Brian enters her name into a field that gums up the SQL call and it fails. The program either doesn't handle it well or the user gets an ambiguous error message as to why it failed.
User #2 enters a name that gums up the SQL call and it succeeds! The program is now in an unknown state.
Either way, now the user contacts support and eats up time and energy (user #2 never admitting what they did, making it even more difficult to debug) and/or demands their money back.
Yes, it is necessary, IMHO.
The majority of injection attacks can be prevented by adherence to correctness
SQL placeholders and bound variables, for example, handle both unexpectedly formed input (e.g., the innocent apostrophe in "5 o'clock") and malicious input (e.g., "' OR 1=1 --").
So, be scrupulously correct in your data handling, and don't worry about most injections.
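For example, with Go's database/sql and a SQLite driver (mattn/go-sqlite3 here, purely as an illustration; the same idea applies to whatever SQLite binding you use on the phone), placeholders keep both the innocent apostrophe and the injection attempt inert:

package main

import (
    "database/sql"
    "log"

    _ "github.com/mattn/go-sqlite3" // illustrative driver choice
)

func main() {
    db, err := sql.Open("sqlite3", "app.db")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS notes (body TEXT)`); err != nil {
        log.Fatal(err)
    }

    // Both values are bound as data; neither is ever spliced into the SQL text.
    for _, body := range []string{"5 o'clock", "' OR 1=1 --"} {
        if _, err := db.Exec(`INSERT INTO notes (body) VALUES (?)`, body); err != nil {
            log.Fatal(err)
        }
    }
}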
Injections might subvert application logic
SQLite has triggers, I think, but in any case the application might make decisions based on data pulled from the local DB, attack other facets of the environment, and so on. If today's application isn't complex enough for this, tomorrow's revision will be.
Someone else might be using (attacking) the phone, not just an authorized user
True, this is a generic risk of, say, a desktop authenticated to Stack Overflow. However, I find that smartphone apps are more at risk of unintended operators: many phones have no passcode, many apps require no frequent re-authentication, and users may freely hand their phones to people who just need to make a quick call.
If you are syncing an iPhone database with a remote database, do not trust the content. It doesn't take SQL injection to change the database: a jailbroken iPhone gives the user full access to the entire file system, which includes the sqlite database file, and this can then be modified however the attacker wants. That isn't SQL injection; it's a "client-side trust" vulnerability.
SQL injection under sqlite is still useful to an attacker. Unlike MySQL, sqlite allows you to stack queries, so the attacker can always create/drop/insert/update/delete/select/etc. no matter which query is affected by the injection. Under MySQL it's common to inject sub-selects or union selects to obtain specific data, but, for instance, you cannot turn a select statement into an insert under normal conditions.