I have been trying to solve a big problem for the last two weeks with one of our servers (Apache 2.2, Windows, PHP).
The client using our system is a contact center firm.
They have about 120 operators, all connecting to our web server from the same IP (their shared outgoing IP).
We have been suffering DoS attacks from some of these operators.
These are simple browser attacks: 5 or 10 operators will just hold down the F5 key and bombard the server with requests when they shouldn't.
There is very little we can do to improve the performance of the specific URLs the attackers are using. This is an application, not a public portal, so many screens involve a good amount of processing and real-time querying.
We did manage to build a PHP protection that recognizes the multiple requests and blacklists the user in PHP after login, using a control mechanism based on a cookie containing the user ID from our software.
This works to some extent, but it's a little "too late", since the requests have already been sent to and processed by the web server.
Even if the response is now minimal and causes no more trouble for the server, ideally we would like something EXACTLY like mod_evasive, but rejecting single requests instead of blocking the IP.
For example: if a user calls the same URL 5 times within a 3-second span, we reject every subsequent request for 30 seconds, but only the requests by that user (identified by some cookie).
Well, this is more of an administrative problem than a technical problem - tell your customers to educate their users.
But if you really want to do that in software, some scheme like the following should work:
Give each of your users a different cookie (you're already doing that?).
Create a database table that has cookies and "request counts".
Whenever your software starts processing a request, if the request count for that session is > 3 (or a number you think is appropriate), abort immediately, set the request count to 30, and give an error message to the user. Especially don't do all that expensive processing stuff.
If the request count is less than 3, increase it by one and process the request normally.
Have some cron job decrease by one all request counts that are >1. Depending on the typical use case of your software, run this once per minute, or once every few seconds in a loop that queries the database, then sleeps for a while.
Tune the parameters to your software, of course.
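A minimal sketch of the scheme above, in Python with SQLite standing in for the database (the question is about PHP, so treat this as runnable pseudocode; the table and column names are made up):

```python
import sqlite3

# Illustrative numbers matching the answer: reject when the counter
# exceeds 3, and set it to 30 on abuse so the periodic decrement job
# enforces roughly a 30-tick lockout.
THRESHOLD = 3
PENALTY = 30

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE request_counts (cookie TEXT PRIMARY KEY, count INTEGER)")

def allow_request(cookie):
    """Return True if the request should be processed."""
    row = db.execute("SELECT count FROM request_counts WHERE cookie = ?",
                     (cookie,)).fetchone()
    count = row[0] if row else 0
    if count > THRESHOLD:
        # Abuse: set the penalty and reject before any expensive work.
        db.execute("UPDATE request_counts SET count = ? WHERE cookie = ?",
                   (PENALTY, cookie))
        return False
    db.execute(
        "INSERT INTO request_counts (cookie, count) VALUES (?, 1) "
        "ON CONFLICT(cookie) DO UPDATE SET count = count + 1", (cookie,))
    return True

def decrement_all():
    """Run periodically (e.g. from cron): decay every counter by one."""
    db.execute("UPDATE request_counts SET count = count - 1 WHERE count > 1")
```

The key property is that the rejection happens before any of the expensive processing, and the decay job is the only other moving part.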
ASP.NET Core has protection against brute-force guessing of passwords, by locking the account after a fixed number of login attempts.
But is there some protection against credential stuffing, where the attacker tries a lot of logins, but always with different usernames? Locking the account would not help, since the account changes on every attempt.
But maybe there is a way to lock an IP against multiple login attempts, or some other good idea to prevent credential stuffing?
I'd recommend using velocity checks with redis; this is basically just throttling certain IPs. Some fraudsters will also rotate IPs, so you could additionally detect when logins are happening more frequently than usual (say 10x the norm) and block all logins for that short window. I wrote a blog post detailing some of the above. The code is all in node, but I give some high-level examples of how we stop fraud at Precognitive (my current gig). I will continue to build upon the code over the next couple of months as I post more in my Account Takeover series.
https://medium.com/precognitive/a-naive-demo-on-how-to-stop-credential-stuffing-attacks-2c8b8111286a
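For illustration, here is a rough sketch of the velocity-check idea. The post's code uses redis and node; below, a plain Python dict stands in for redis INCR/EXPIRE counters, and the window and limit values are made-up examples:

```python
import time
from collections import defaultdict

WINDOW = 60        # seconds per counting window (illustrative)
MAX_ATTEMPTS = 10  # login attempts allowed per IP per window (illustrative)

_counters = defaultdict(lambda: [0, 0.0])  # ip -> [count, window_start]

def login_allowed(ip, now=None):
    """Fixed-window velocity check: throttle an IP that logs in too fast."""
    now = time.time() if now is None else now
    count, start = _counters[ip]
    if now - start >= WINDOW:
        # Window expired: start a new one (redis would do this via EXPIRE).
        _counters[ip] = [1, now]
        return True
    if count >= MAX_ATTEMPTS:
        return False  # over the velocity limit: block this attempt
    _counters[ip][0] = count + 1
    return True
```

With redis you would get the same effect with an `INCR` on a per-IP key plus an `EXPIRE`, which also shares the counters across application servers.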
IP throttling is a good first step, but unfortunately it won't block a large number of modern credential stuffing attacks. Tactics for savvy credential stuffers have evolved over the past few years to become more and more complex.
To be really effective at blocking credential stuffing, you need to be looking at more than IP. Attackers will rotate IP and User-Agent, and they'll also spoof the User-Agent value. An effective defense strategy identifies and blocks attacks based on real-time analysis of out-of-norm IP and User-Agent activity, plus additional components to enhance specificity (such as browser-based or mobile-app-based fingerprinting).
I wrote a blog post looking at two credential stuffing attacks in 2020, which I call Attack A (low-complexity) and Attack B (high-complexity).
Attack A, the low-complexity example, had the following characteristics:
~150,000 login attempts
1 distinct User-Agent (a widely used version of Chrome)
~1,500 distinct IP addresses (85% from the USA, where most of the app users reside)
Attack B, the high-complexity example, had the following characteristics:
~60,000 login attempts
~20,000 distinct User-Agents
~5,000 distinct IP addresses (>95% from USA, >99% from USA & Canada)
You can see that with Attack A, there were about 100 login attempts per IP address. IP rate-limiting may be effective here, depending on the limit.
However, with Attack B, there were only 12 login attempts per IP address. It would be really hard to argue that IP rate-limiting would be effective for this scenario.
Here's my post with more in-depth data: https://blog.castle.io/how-effective-is-castle-against-credential-stuffing/
Full disclosure: I work for Castle and we offer an API-based product to guard against credential stuffing.
We have built a free, globally distributed mobility analytics REST API. That means we have servers all over the world running different versions (USA, Europe, etc.) of the same application. The services are behind a load balancer, so I can't guarantee that the same user always gets the same application/server for requests made today versus tomorrow. The API is public, but users have to provide an API key so we can match them to their paid request quota.
Since we do heavy number crunching on every request, we want to minimize request times as far as possible, in particular for authentication/authorization and quota monitoring. Since we currently use only one user database (which has to be located in a single data center), there are cases where users in the US make a request to an application/server in the US which then authenticates the user in Europe. So we are looking for a solution where the user database interaction:
happens on the same application server
gets synchronized between all application servers
is easily integrable into a Java application
is fast (changes happen on every request)
Things we have done so far:
a single database on each server > not synchronized, a nightmare
a single database for all servers > OK when used with a slave as fallback, but American users have to authenticate across the Atlantic
started installing BDR but gave up along the way (no time, too complex, hard to migrate)
looked at redis.io
Since this is my first globally distributed REST API, I wonder how other companies do this (Yelp, Google, etc.).
Any feedback is very kindly appreciated,
Cheers,
Daniel
There is no right answer, there are several ways to perform this. I'll describe the one I'm most familiar with (and which probably is one of the simplest ways to do it, although probably not the most robust one).
1. Separate authentication and authorization
First of all, separate authentication and authorization. A user authenticating across the Atlantic is fine; every user request that requires authorization going across the Atlantic is not. Main user credentials (e.g. a password hash) are a centralized resource; there is no way around that. But you do not need the main credentials every time a request needs to be authorized.
This is how a company I worked at did it (although this was postgres/python/django, nothing to do with java):
Each server has a cache of database queries in memcached but it does not cache user authentication (memcached is very similar to redis, which you mention).
The authentication is performed in the main data centre, always.
A successful authentication produces a user session that expires after a day.
The session is cached.
This may produce a situation in which a user's action can be authorized by more than one session at a given time. The algorithm is: if at least one user session has not expired, the user is authorized. Since the cache is local, there is no need for the costly operation of fetching data across the Atlantic on every request.
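The per-server session check above can be sketched like this (the original stack was python/django, so Python fits; all names are illustrative, and a plain dict stands in for memcached):

```python
import time

SESSION_LIFETIME = 24 * 3600  # one day, in seconds

session_cache = {}  # user_id -> list of session expiry timestamps

def authorize(user_id, now=None):
    """Authorized iff at least one cached session has not expired.

    Purely local: no round trip to the main data centre.
    """
    now = time.time() if now is None else now
    live = [exp for exp in session_cache.get(user_id, []) if exp > now]
    session_cache[user_id] = live  # drop expired sessions while we're here
    return bool(live)

def on_successful_authentication(user_id, now=None):
    """Called after the main data centre authenticates; cache the session."""
    now = time.time() if now is None else now
    session_cache.setdefault(user_id, []).append(now + SESSION_LIFETIME)
```

Note that multiple live sessions for the same user coexist harmlessly: any one of them is enough to authorize.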
2. Session revival
Sessions expiring after exactly a day may be annoying for the user, since he can never be sure that his session will not expire in the next couple of seconds. Each request the user makes that is authorized by a session shall extend the session lifetime to a full day again.
This is easy to implement: You need to keep in the session the timestamp of the last request made. And count the session lifetime based on that timestamp. If more than a single session may authorize a request the algorithm shall select the younger session (with the longest lifetime remaining) to update.
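The revival rule can be sketched as follows (illustrative names; the lifetime is counted from the timestamp of the last request, and the youngest matching session is the one extended):

```python
SESSION_LIFETIME = 24 * 3600  # one day, in seconds

sessions = {}  # session_id -> timestamp of the last authorized request

def authorize_and_revive(user_session_ids, now):
    """Authorize against the youngest live session and extend it."""
    live = [sid for sid in user_session_ids
            if sid in sessions and now - sessions[sid] < SESSION_LIFETIME]
    if not live:
        return False
    # If more than one session could authorize, pick the youngest one.
    youngest = max(live, key=lambda sid: sessions[sid])
    sessions[youngest] = now  # lifetime counts from this request again
    return True
```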
3. Request logging
This is as far as the live system I worked with went. The remainder of the answer is about how I would extend such a system to make space for logging the requests and verifying request quota.
Assumptions
Let's start by making a couple of design assumptions and arguing why they are good assumptions. You're now running a distributed system, since not all data on the system is in a single place. A distributed system shall give priority to response speed and be as horizontal (not hierarchical) as possible; to achieve this, consistency is sacrificed.
The session mechanism above already sacrifices some consistency. For example, if a user logs in and talks continuously to server A for a day and a half, and then a load balancer points the user to server B, the user may be surprised that he needs to log in again. That is an unlikely situation: in most cases a load balancer would divide the user's requests over both server A and server B over the course of that day and a half, and both servers will have live sessions at that point.
In your question you wonder how Google deals with request counting, and I'll argue here that it deals with it in an inconsistent way. By that I mean that you cannot enforce a strict upper limit on requests if you're sacrificing consistency. Companies like Google or Yelp simply say:
"You have 12000 requests per month but if you do more than that you will pay $0.05 for every 10 requests above that".
That allows for an easier design: you can count requests at any time after they happened, the counting does not need to happen in real time.
One last assumption: a distributed system will have problems with duplicated internal data. This happens because parts of the system run in real time while other parts do batch processing, without stopping or timestamping the real-time system, so you cannot be 100% sure about the state of the real-time system at any given point. It is therefore mandatory that every request coming from a customer carry a unique identifier of some sort; it can be as simple as customer number + sequence number, but it needs to exist in every request. You may also add such a unique identifier the moment you receive the request.
Design
Now we extend our user session that is cached on every server (often cached in a different state on different servers that are unaware of each other). For every customer request we store the request's unique identifier as part of the session cache. That's right, the request counter in the central database is not updated in real time.
At a point in time (end of day processing, for example) each server performs a batch processing of request identifiers:
Live sessions are duplicated, and the copies are expired.
All request identifiers in expired sessions are concatenated together, and the central database is written to.
All expired sessions are purged.
This produces a race condition, where a live session may receive requests whilst the server is talking to the central database. For that reason we do not purge request identifiers from live sessions. This, in turn, causes a problem where a request identifier may be logged to the central database twice. And it is for that reason that we need the identifiers to be unique, the central database shall ignore request updates with identifiers already logged.
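The batch step and its deduplication guarantee can be sketched as follows (a Python set stands in for the central database; its uniqueness property is exactly what makes the double-logging race harmless; all names are illustrative):

```python
central_db = set()   # logged request identifiers; a set ignores duplicates
live_sessions = {}   # session_id -> {"requests": [...], "expired": bool}

def flush_sessions():
    """End-of-day batch processing on one application server."""
    # 1. Duplicate live sessions and expire the copies.
    expired = {sid: dict(s, expired=True) for sid, s in live_sessions.items()}
    # 2. Concatenate request ids from the expired copies and write them
    #    to the central database; already-logged ids are silently ignored.
    for s in expired.values():
        for request_id in s["requests"]:
            central_db.add(request_id)
    # 3. The expired copies are purged (dropped here); live sessions keep
    #    their identifiers, so an id flushed twice is deduplicated above.

def count_requests(customer):
    """Invoicing: just count the customer's ids in the central database."""
    return sum(1 for rid in central_db if rid.startswith(customer + ":"))
```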
Advantages
99.9% Uptime, the batch processing does not interrupt the real time system.
Reduced writes to the database, and reduced communication with the database in general.
Easy horizontal growth.
Disadvantages
If a server goes down, recovering the requests performed may be tricky.
There is no way to stop a customer from doing more requests than he is allowed to (that's a peculiarity of distributed systems).
Need to store unique identifiers for all requests, a counter is not enough to measure the number of requests.
Invoicing the user does not change, you just query the central database and see how many requests a customer performed.
I need to do some security validation on my program, and one of the things I need to answer, related to authentication, is: "Verify that all authentication decisions are logged, including linear back-offs and soft-locks."
Does anyone know what linear back-off and soft-lock mean?
Thank you in advance,
Thais.
I am doing my research on OWASP ASVS. Linear back-off and soft-lock are authentication controls used to prevent brute-force attacks, and they can also help against DoS.
Linear back-off can be implemented by blocking the user/IP for a particular time, with that time increasing after every failed login attempt: e.g. block for 5 minutes after the first failed login, 25 minutes after the second, 125 minutes after the third, and so on.
Soft-lock, as I understand it from some articles and from implementations in applications like Oracle WebLogic, is much easier to implement: the IP address or user name is logged in the database for every failed login attempt (which I think also helps protect against DoS and brute force by automated tools), and when a certain threshold of failed attempts is reached (e.g. 5), the IP address is blocked. Once the account has been soft-locked at application runtime, the application does not try to validate the account's credentials against the backend system, thus preventing the backend account from being permanently locked.
The ASVS verification requirement is quite clear on this, though:
"Verify that a resource governor is in place to protect against vertical (a single account tested against all possible passwords) and horizontal brute forcing (all accounts tested with the same password e.g. “Password1”). A correct credential entry should incur no delay. For example, if an attacker tries to brute force all accounts with the single password “Password1”, each incorrect attempt incurs a linear back off (say 5, 25, 125, 625 seconds) with a soft lock of say 15 minutes for that IP address before being allowed to proceed. A similar control should also be in place to protect each account, with a linear back off configurable with a soft lock against the user account of say 15 minutes before being allowed to try again, regardless of source IP address. Both these governor mechanisms should be active simultaneously to protect against diagonal and distributed attacks."
I'm about to implement a RESTful API for our website (based on WCF Data Services, but that probably does not matter).
All data offered via this API belongs to certain users of my server, so I need to make sure only those users have access to my resources. For this reason, all requests have to be performed with a login/password combination as part of the request.
What's the recommended approach for preventing brute force attacks in this scenario?
I was thinking of logging failed requests denied due to wrong credentials and ignoring requests originating from the same IP after a certain threshold of failed requests has been exceeded. Is this the standard approach, or am I missing something important?
IP-based blocking on its own is risky due to the number of NAT gateways out there.
You might slow down (tar pit) a client if it makes too many requests quickly; that is, deliberately insert a delay of a couple of seconds before responding. Humans are unlikely to complain, but you've slowed down the bots.
I would use the same approach as I would with a web site. Keep track of the number of failed login attempts within a certain window -- say allow 3 (or 5 or 15) within some reasonable span, say 15 minutes. If the threshold is exceeded lock the account out and mark the time that the lock out occurred. You might log this event as well. After another suitable period has passed, say an hour, unlock the account (on the next login attempt). Successful logins reset the counters and last lockout time. Note that you never actually attempt a login on a locked out account, you simply return login failed.
This will effectively rate-limit any brute-force attack, rendering an attack against a reasonable password very unlikely to succeed. An attacker, using my numbers above, would only be able to try 3 (or 5 or 15) passwords per 1.25 hours. Using your logs, you could detect when such an attack might be occurring simply by looking for multiple lockouts for the same account on the same day. Since your service is intended to be used by programs, once the program accessing the service has its credentials set properly, it will never experience a login failure unless there is an attack in progress. This would be another indication that an attack might be occurring. Once you know an attack is in progress, you can take further measures to limit access from the offending IPs or involve the authorities, if appropriate, and get the attack stopped.
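The lockout scheme above can be sketched as follows (Python for illustration; the concrete numbers follow the answer's examples, and all names are made up):

```python
import time

MAX_FAILURES = 3      # failures allowed within the window
WINDOW = 15 * 60      # failures counted within 15 minutes
LOCKOUT = 60 * 60     # locked accounts unlock after an hour

accounts = {}  # username -> {"failures": [timestamps], "locked_at": float|None}

def attempt_login(username, password_ok, now=None):
    """Return True only on a successful, non-locked-out login."""
    now = time.time() if now is None else now
    acct = accounts.setdefault(username, {"failures": [], "locked_at": None})
    if acct["locked_at"] is not None:
        if now - acct["locked_at"] < LOCKOUT:
            # Locked out: never check credentials, just report failure.
            return False
        acct["locked_at"] = None      # lockout period over: unlock
        acct["failures"] = []
    if password_ok:
        acct["failures"] = []         # success resets the counters
        return True
    # Count only failures inside the sliding window, then record this one.
    acct["failures"] = [t for t in acct["failures"] if now - t < WINDOW]
    acct["failures"].append(now)
    if len(acct["failures"]) >= MAX_FAILURES:
        acct["locked_at"] = now       # threshold reached: lock the account
    return False
```

Note the locked-out branch returns plain failure even for a correct password, matching the advice that a locked account is never actually tested.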
I'm wondering when login timeouts should be used, specifically within the same browser session. On a number of sites I have completed recently I added 60-minute timeouts, and they seem to be causing problems, such as users not being able to fill out larger forms (like a resume submission; people don't think of copying their resume from another program or saving partway through). On one site, I implemented a div/popup forcing the user to enter their password to continue the current session, without having to log in again.
But on other sites, such as Facebook, it seems you are never logged out as long as you are using the same browser window, even without "remembering" your password.
The main reason I usually use timeouts is to ensure the data is secure, such that another party can't sit down at the computer a few hours later and use the system as the original user.
I'm wondering how you decide when a site should time out users because of inactivity?
I'm thinking the answer would be language agnostic.
IMO, they're valid when:
security is critical (e.g. banking)
the likelihood of seat-swapping is high (e.g. public terminals)
Regardless, there may be instances like your resume system, where you want people on public terminals to be able to carry out an act that may leave them inactive for longer than your desired or necessary timeout.
I suppose you just have to handle that in a smart fashion - either figure out a way they can get the data in quicker (which would be ace, spending an hour filling out a form is not fun - can they just upload a file?), or ensuring they can continue without any data loss after being prompted to log in again.
Even though 60 minutes seems like a long time to fill out a single form (perhaps the forms should be divided into multiple pages?), you can probably use SlidingExpiration to solve the problem where your users get logged out even though the browser session is alive.
I think the timeout for an auth cookie is a Security level decision. If your site is SSL secured, you would probably have minimal timeout values (user session would expire within a matter of minutes). On the other hand, for sites with non-critical security, you could set a medium timeout value.
When I sign on to online banking, for example, it asks me whether or not I am using a "public terminal": and if I say yes then it enforces stricter security, or if no then laxer.