Does load distribution with consistent hashing provide an advantage over standard hashing for stateless servers while scaling a system? - load-balancing

I understand that using consistent hashing for load distribution across cache servers or (sharded) database servers offers a significant advantage over the usual key-based hashing: when servers are added or removed, the data movement required between servers due to rehashing is minimized.
However, if we consider application servers or web servers, which are often designed to be stateless and hence not storing any user/session-related data, does consistent hashing offer any advantage here? If yes, what is the data being considered here or am I missing something?

If the server is truly stateless, then you're right: it doesn't matter. You then optimize for other parameters, such as distance to the client.
But a server that processes business logic has implicit state in its cache. The server must have some persistent storage (let's call it a database), local or remote; otherwise the client wouldn't need to make the request, since it would already have all the information.
The database's or the app server's cache will already be warm, and would have to be re-warmed each time the system scales up or down.
Even if the database is distributed too, the app server's connection to a specific shard of the database could (or could not) also count as state.

Related

How to cache connections to different Postgres/MySQL databases in Golang?

I have an application where different users may connect to different databases (either MySQL or Postgres). What might be the best way to cache those connections across different databases? I saw some connection pools, but it seems they are more for one DB with multiple connections than for multiple DBs with multiple connections.
PS:
For more context: I am designing a multi-tenant architecture where each tenant connects to one or more databases. One option is a map[string]*sql.DB where the key is the URL of the database, but that can hardly scale once we have a large number of databases. Or should we have a sharding layer that shards each incoming request by connection URL, so that each machine holds just the right subset of database connections in its map[string]*sql.DB?
An example of the kind of software I want to build is https://www.sigmacomputing.com/, where the user can connect to multiple databases to work with different tables.
Neither MySQL nor Postgres allows connection sharing between multiple database users; a single database user is specified in the connection credentials. If you mean that your different users have their own database credentials, then it is not possible to share connections between them.
If by "different users" you mean your application users, and they share a single database user deeper in the app, then you don't need to do anything special to "cache" connections: sql.DB keeps and reuses open connections in its pool by default.
Go automatically opens, closes, and reuses DB connections through a *database/sql.DB. By default it keeps up to 2 idle connections open and opens an unlimited number of new connections under concurrency when all open connections are busy.
If you need to fine-tune pool efficiency versus database load, you can adjust the sql.DB configuration with its .Set* methods, for example SetMaxOpenConns.
You seem to have too many unknowns. In cases like this I would apply good old agile: start with a prototype of what you want to achieve using tools you already know, then benchmark its performance. I think you might be surprised how much Go can handle.
Since you already understand how to use map[string]*sql.DB for this purpose, I would go with that. You hit some limit? Add another machine behind HAProxy. Solving a scaling problem doesn't necessarily mean writing a new DB pool in Go. Obviously, if you need that kind of power you can always do it; the pgx Postgres driver has its own pool implementation, so you can draw inspiration from there. Doing it right now, however, seems like premature optimization: solving a problem you don't have yet. Building a prototype with map[string]*sql.DB is easy; test it, benchmark it, and you will see whether you need more.
P.S. You will most likely hit the file descriptor limit before you manage to exhaust memory.
Assuming you have multiple users and multiple databases in an N-to-N relation, you could keep a map from database URL to database details (explained below).
Which users have access to which databases should be handled separately anyway, e.g. via a configmap or a core database. For the database details, we could have a struct like this:
type DBDetail struct {
    sync.RWMutex
    connection *sql.DB
}
The map would be from database URL to the database's details (dbDetail), and when a user writes, it calls:
dbDetail.Lock()
defer dbDetail.Unlock()
For reads, use RLock/RUnlock instead.
As vearutop said, the connections could be a pain, but with this approach you could keep a single connection, or enforce a limit by incrementing and decrementing a counter after taking the Lock.
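Putting the pieces together, here is a self-contained sketch of such a lazily populated map. One *sql.DB per URL is enough, since sql.DB is itself a connection pool. The "stub" driver exists only so the sketch runs without a real database; swap in your actual driver name in practice.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync"
)

// pool lazily opens one *sql.DB per database URL.
type pool struct {
	mu  sync.RWMutex
	dbs map[string]*sql.DB
}

func newPool() *pool { return &pool{dbs: map[string]*sql.DB{}} }

// get returns the cached handle for url, opening it on first use.
func (p *pool) get(driverName, url string) (*sql.DB, error) {
	p.mu.RLock()
	db, ok := p.dbs[url]
	p.mu.RUnlock()
	if ok {
		return db, nil
	}

	p.mu.Lock()
	defer p.mu.Unlock()
	if db, ok := p.dbs[url]; ok { // another goroutine may have won the race
		return db, nil
	}
	db, err := sql.Open(driverName, url) // does not dial; it only validates arguments
	if err != nil {
		return nil, err
	}
	p.dbs[url] = db
	return db, nil
}

// stubDriver lets the sketch run without a real database driver.
type stubDriver struct{}

func (stubDriver) Open(name string) (driver.Conn, error) { return nil, errors.New("stub") }

func init() { sql.Register("stub", stubDriver{}) }

func main() {
	p := newPool()
	a, _ := p.get("stub", "postgres://tenant1.example/db")
	b, _ := p.get("stub", "postgres://tenant1.example/db")
	fmt.Println("same handle reused:", a == b) // prints true
}
```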
There isn’t necessarily a correct architectural answer here. It depends on some of the constraints of the system.
I have an option for using map[string]*sql.DB where the key is the URL of the database, but it can hardly scale when we have a large number of databases.
Whether this will scale sufficiently depends on how numerous the databases are expected to be. If there will be tens or hundreds of concurrent users in the near future, a map is probably sufficient. A good next step after a map is often to transition to a more fully featured cache (for example https://github.com/dgraph-io/ristretto).
A factor in choosing between a map and a cache is how you imagine the lifecycle of a database connection: once opened, can a connection remain open for the remainder of the process's lifetime, or do connections need to be closed after minutes of disuse to free up resources?
Should we have a sharding layer for each incoming request, sharded by connection URL, so each machine contains just the right number of database connections in the form of map[string]*sql.DB?
The right answer here depends on how many processing nodes are expected and whether routing requests to specific machines yields additional benefits. For example, row-level caching and isolating users from each other's requests are advantages you would gain by sharding users across the pool. A disadvantage is that you might end up with "hot" nodes, because a single user might generate the majority of the traffic.
Usually, a good strategy in situations like this is to be really explicit about the constraints of the problem. Jeff Dean coined a rule of thumb for this:
Ensure your design works if scale changes by 10X or 20X but the right solution for X [is] often not optimal for 100X
https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf
So if, in the near future, the system needs to support tens of concurrent users, build the simplest thing that will support tens to hundreds of concurrent users (a map or cache with no user sharding is probably sufficient). That design will have to change before the system can support thousands of concurrent users, but scaling a system is often a good problem to have, since it usually indicates a successful project.

Scalability issues with server based authentication

I was reading up on problems with server based authentication. I need help with elaboration on the following point.
Scalability: Since sessions are stored in memory, this provides problems with scalability. As our cloud providers start replicating servers to handle application load, having vital information in session memory will limit our ability to scale.
I don't quite understand why "... having vital information in session memory will limit our ability to scale". Is it just because the information is being replicated, so it's to do with redundancy? I don't think so. Would anyone be kind enough to explain this further? Much appreciated.
What's being referred to is the difference between stateless and stateful server-side ops. Stateful servers keep part of their resources (main memory, mostly) occupied for retaining state pertaining to some client, even when the server is actually doing nothing at all for the client and just waiting for the client to come back. Such systems' performance profile is "linear" only up to the point where all available memory has been filled with state, and beyond that point the server seems to essentially stall. Stateless servers only keep resources occupied when they're actually doing something, and once finished doing stuff, those resources are immediately freed and available for other clients. Such servers are essentially not capped by memory limits and therefore "scale easier".
Also, the explanation given seems to refer to scenarios where a set of distinct machines presents itself to the outside world as one machine, when actually it is not (this is often called a "cluster" of machines/servers). In such scenarios, a client that has connected to the "big single virtual machine" is actually connected to just one of the actual machines in the cluster. If state is kept there, subsequent visits by that same client must either be routed to the same physical machine, or that piece of state must be shipped to whichever machine the next visit lands on. The former implies management functions that consume their own resources, plus limits on the cluster's freedom to distribute load (the opposite of why you cluster in the first place); the latter implies additional network traffic that caps scalability in essentially the same way available memory does.
Server-based authentication makes use of sessions, which in turn use a local session id. In the cloud, when servers are replicated to handle application load, it becomes difficult for one server to know which sessions are active on other servers. To overcome this, extra steps must be performed, for instance persisting the session id to a database. But as the servers are replicated further, all of this becomes harder and harder to handle. Therefore server-based (session-based) authentication can be problematic for scalability.
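The core problem can be shown in a few lines. A per-process session store is invisible to other replicas, which is exactly why in-memory sessions force sticky routing; the names here are illustrative, not from any particular framework.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryStore is per-process session state: the thing that limits scaling.
// A shared store (Redis, a database) behind the same interface would let
// any replica serve any request.
type memoryStore struct {
	mu       sync.RWMutex
	sessions map[string]map[string]string
}

func newMemoryStore() *memoryStore {
	return &memoryStore{sessions: map[string]map[string]string{}}
}

func (m *memoryStore) Get(id string) (map[string]string, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	s, ok := m.sessions[id]
	return s, ok
}

func (m *memoryStore) Set(id string, data map[string]string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.sessions[id] = data
}

func main() {
	// Two "servers", each with its own memory: a session written on one
	// is invisible on the other, which is exactly the scaling problem.
	server1, server2 := newMemoryStore(), newMemoryStore()
	server1.Set("sess-42", map[string]string{"user": "alice"})
	_, ok := server2.Get("sess-42")
	fmt.Println("session visible on server2:", ok) // prints false
}
```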

Scaling laravel app horizontally?

As my Laravel app will be deployed to Heroku, I am wondering how to avoid session affinity so that any node can handle the user's requests.
As I understand it, the server that handled the initial authentication stores an auth token in a session to identify the user later. But when new nodes are added to scale the app, must the user be served by the same server that holds the auth token? How do I avoid that scenario in Laravel?
If you want to scale horizontally, you first need to make your web app stateless, which means storing user session and auth info centrally somewhere instead of locally on each server. A Redis server would be the best choice, as mentioned by @Amir Bar: it's a data-structure server (commonly used for caching) whose data lives in common data structures (lists, hash tables, ...) in RAM, so its latency is exceptionally low.
Once your web app is stateless, just use a load balancer to distribute the load, and then add as much web server nodes as needed behind the load balancer. That would be enough.
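In Laravel this is mostly configuration rather than code. A hedged .env sketch, assuming a reachable Redis instance and the phpredis extension or predis package installed (variable names match Laravel's stock config/session.php and config/database.php; exact keys can vary by Laravel version):

```
SESSION_DRIVER=redis
CACHE_DRIVER=redis
REDIS_HOST=your-redis-host
REDIS_PORT=6379
```

With sessions in Redis, every web node reads and writes the same session data, so the load balancer no longer needs sticky sessions.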
Your next challenge after scaling the web servers will be database scalability. You can add as many web server nodes as you want behind the load balancer, but scaling the database is another beast. If you're using NoSQL, then congrats: NoSQL databases are generally easy to scale, and horizontal scaling is built into almost every one of them.
Scaling a relational database is harder than scaling a NoSQL database. If you're scaling a high-read system, the master-slave replication model is appropriate and easy. If you're scaling a system that is both high-read and high-write, have fun with your solution research; the answer will depend on your current design.
Anyway, when you reach database read/write bottlenecks, first try to optimize your queries and database access; the N+1 query problem is a very common issue that greatly slows down database access.

redis as app's system configuration storage

In my Node-based app I plan to use Redis for a number of purposes, basically interprocess pub/sub communication and an online-users cache. That part is clear.
I am now thinking about where to store the app's basic system configuration. Critical elements like the main TCP port, default message channel, database name, admin password, etc. would go there.
The traditional choice for the implementation of this kind of thing would be a conf-file, maybe a JSON-structure in a plain text file.
I am wondering, however, whether it would make sense to use Redis here. The major concern is data reliability and the risk of data loss.
What are the pros/cons?
The claims against Redis reliability are because of the async nature of its data persistence and replication mechanisms. Being async, these can't guarantee that all updates will endure failures - depending on how you configure Redis, there's a (small) chance that the most recent updates will be lost if there's a failure.
That said, in the context of storing configuration settings, this isn't really an issue. Configuration data is immutable most of the time, and if by some bizarre coincidence you lose your most recent updates to it, recovering the changes manually (i.e. reconfiguring) is usually trivial.
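If you do go this route, Redis's durability can be tightened with a couple of standard redis.conf directives; an illustrative excerpt (the thresholds are examples, not recommendations):

```
# Enable the append-only file (AOF) for durable writes
appendonly yes
# fsync at most once per second: at most ~1s of updates lost on a crash
appendfsync everysec
# Also take an RDB snapshot every 900s if at least 1 key changed
save 900 1
```

For config data that rarely changes, even the default snapshot-only persistence is usually acceptable, for the reasons given above.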

LDAP - write concern / guaranteed write to replicas prior to return

Is OpenLDAP (or any of LDAP's flavors) capable of providing write concern? I know it's an eventually consistent model, but more than a few databases have eventual consistency plus write concern.
After doing some research, I'm still not able to figure out whether or not it's a thing.
The UnboundID Directory Server provides support for an assured replication mode in which you can request that the server delay the response to an operation until it has been replicated in a manner that satisfies your desired constraints. This can be controlled on a per-operation basis by including a special control in the add/delete/modify/modify DN request, or by configuring the server with criteria that can be used to identify which operations should use this assured replication mode (e.g., you can configure the server so that operations targeting a particular set of attributes are subjected to a greater level of assurance than others).
Our assured replication implementation allows you to define separate requirements for local servers (servers in the same data center as the one that received the request from the client) and nonlocal servers (servers in other data centers). This allows you to tune the server to achieve a balance between performance and behavior.
For local servers, the possible assurance levels are:
Do not perform any special assurance processing. The server will send the response to the client as soon as it's processed locally, and the change will be replicated to other servers as soon as possible. It is possible (although highly unlikely) that a permanent failure that occurs immediately after the server sends the response to the client but before it gets replicated could cause the change to be lost.
Delay the response to the client until the change has been replicated to at least one other server in the local data center. This ensures that the change will not be lost even in the event of the loss of the instance that the client was communicating with, but the change may not yet be visible on all instances in the local data center by the time the client receives the response.
Delay the response to the client until the result of the change is visible in all servers in the local data center. This ensures that no client accessing local servers will see out-of-date information.
The assurance options available for nonlocal servers are:
Do not perform any special assurance processing. The server will not delay the response to the client based on any communication with nonlocal servers, but a change could be lost or delayed if an entire data center is lost (e.g., by a massive natural disaster) or becomes unavailable (e.g., because it loses network connectivity).
Delay the response to the client until the change has been replicated to at least one other server in at least one other data center. This ensures that the change will not be lost even if a full data center is lost, but does not guarantee that the updated information will be visible everywhere by the time the client receives the response.
Delay the response to the client until the change has been replicated to at least one server in every other data center. This ensures that the change will be processed in every data center even if a network partition makes a data center unavailable for a period of time immediately after the change is processed. But again this does not guarantee that the updated information will be visible everywhere by the time the client receives the response.
Delay the response to the client until the change is visible in all available servers in all other data centers. This ensures that no client will see out-of-date information regardless of the location of the server they are using.
The UnboundID Directory Server also provides features to help ensure that clients are not exposed to out-of-date information under normal circumstances. Our replication mechanism is very fast so that changes generally appear everywhere in a matter of milliseconds. Each server is constantly monitoring its own replication backlog and can take action if the backlog becomes too great (e.g., mild action like alerting administrators or more drastic measures like rejecting client requests until replication has caught up). And because most replication backlogs are encountered when the server is taken offline for some reason, the server also has the ability to delay accepting connections from clients at startup until it has caught up with all changes processed in the environment while it was offline. And if you further combine this with the advanced load-balancing and health checking capabilities of the UnboundID Directory Proxy Server, you can ensure that client requests are only forwarded to servers that don't have a replication backlog or any other undesirable condition that may cause the operation to fail, take an unusually long time to complete, or encounter out-of-date information.
From reviewing RFC 3384's discussion of replication requirements for LDAP, it appears that LDAP only requires eventual consistency, not transactional consistency. Any products that support this feature are therefore likely to do so with vendor-specific implementations.
CA Directory does support a proprietary replication model called MULTI-WRITE which guarantees that the client obtains write confirmation only after all replicated instances have been updated. In addition it supports the standard X.525 Shadowing Protocol which provides lesser consistency guarantees and better performance.
With typical LDAP implementations, an update request will normally return as soon as the DSA handling the request has been updated, not when the replica instances have been updated. This is the case with OpenLDAP, I believe. The benefit is speed; the downside is the lack of any guarantee that an update has been applied to all replicas.
CA's directory product uses a memory-mapped system, and writes are fast enough that this is not a concern.