How to solve the hot partition problem in consistent hashing load balancing

I am trying to understand how different load balancing strategies work.
One way is to use a consistent hashing algorithm, where we divide the entire hash space into multiple virtual nodes (vnodes) and each physical node takes a set of vnodes.
I am not able to understand how the hot partition problem would be solved. Couldn't one particular node still end up busier than the rest?
Can someone share their experience handling similar use cases? Any pointers to the right documentation or literature would be helpful.
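For reference, the setup described in the question can be sketched roughly as follows. This is a minimal illustration rather than any particular system's implementation: the `HashRing` class, the `node#i` naming of vnodes, and the use of MD5 to place vnodes on the ring are all assumptions made for the example.

```python
import bisect
import hashlib
from collections import Counter


class HashRing:
    """Hypothetical consistent hash ring with virtual nodes (vnodes)."""

    def __init__(self, nodes, vnodes_per_node=100):
        # Each physical node owns several points (vnodes) on the ring.
        ring = []
        for node in nodes:
            for i in range(vnodes_per_node):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._positions = [position for position, _ in ring]

    @staticmethod
    def _hash(value):
        # Assumption: MD5 is used only as a well-mixed hash, not for security.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # A key belongs to the first vnode clockwise from its hash position.
        index = bisect.bisect_right(self._positions, self._hash(key))
        return self._ring[index % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    # Many distinct keys spread across the physical nodes.
    print(Counter(ring.node_for(f"key-{i}") for i in range(10_000)))
    # One very popular key always maps to the same single node.
    print(ring.node_for("hot-key"))
```

Vnodes even out how many keys each physical node owns, but every request for one given key still lands on the same node, which is the hot partition case being asked about.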

Related

Redis > Isolate keys with large values?

It's my understanding that best practice for redis involves many keys with small values.
However, we have dozens of keys in which we'd like to store a few MB each. When traffic is low, this works out most of the time, but in high traffic situations, timeout errors start to stack up. This causes issues for all of our tiny requests to redis, which were previously reliable.
The large values are for optimizing a key part of our site's functionality, and they are a real performance boost when things are going well.
Is there a good way to isolate these large values so that they don't interfere with the network I/O of our best practice-sized values?
Note, we don't need to dynamically discover whether a value is >100KB or in the MBs. We have a specific method that we could point at a separate redis server/instance/database/node/shard/partition (I'm not a hardware guy).
Just install/configure as many instances as needed (2 in this case), each independently managing a logical subset of the keys (e.g. big and small), with routing done by the application. Simple and effective: divide and conquer.
The correct solution would be to have 2 separate redis clusters, one for big keys and another for small keys. These 2 clusters could run on the same set of physical or virtual machines, i.e. multitenancy (you would want to do that to fully utilize the underlying cores on your machines, as the redis server is single threaded). This way you can scale the two clusters separately, and the problem of small requests timing out because they queue behind the bigger ones will be alleviated.
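Application-side routing between the two instances could look something like the sketch below. It assumes the big values can be recognized by a key prefix; the `SizeAwareRouter` class, the `report:` prefix, and the `DictClient` stand-in are hypothetical names for illustration. In a real deployment the two clients would be connections to the two redis instances, for example redis-py clients, which expose `get` and `set`.

```python
class DictClient:
    """In-memory stand-in for a redis client, used only for this example."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        return True


class SizeAwareRouter:
    """Hypothetical router: big-value keys go to one instance, the rest to another."""

    def __init__(self, small_client, big_client, big_key_prefixes):
        self.small_client = small_client
        self.big_client = big_client
        # Assumption: keys holding big values share known prefixes.
        self.big_key_prefixes = tuple(big_key_prefixes)

    def client_for(self, key):
        if key.startswith(self.big_key_prefixes):
            return self.big_client
        return self.small_client

    def get(self, key):
        return self.client_for(key).get(key)

    def set(self, key, value):
        return self.client_for(key).set(key, value)


if __name__ == "__main__":
    small, big = DictClient(), DictClient()
    router = SizeAwareRouter(small, big, big_key_prefixes=["report:"])
    router.set("session:42", "tiny value")
    router.set("report:homepage", "x" * 2_000_000)
    print(sorted(small.data), sorted(big.data))
```

Since the question says one specific method handles the large values, that method could also simply hold its own client for the second instance, with no prefix check at all.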

Why are hotspots on my redis cluster bad?

I have a redis cluster and I am planning to add keys which I know will have a much heavier read/update frequency than other keys. I assume this might cause hotspots on my cluster. Why is this bad and how can I avoid it ?
Hot keys are fine as long as they are sharded across different redis nodes. A hotspot on particular redis nodes/machines is bad, though: the memory/CPU load on those machines will be quite heavy while the other nodes are not used efficiently.
If you know exactly what these keys are, you can calculate their slots yourself up front: the slot is the CRC16 of the key modulo 16384.
Then you can distribute those slots across different redis nodes.
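The slot calculation can be sketched in a few lines. The function names `crc16_xmodem` and `key_slot` are made up for this example. The CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), and the sketch also applies the hash tag rule, where only the text between the first `{` and the following `}` is hashed if that text is non-empty. On a live cluster, `CLUSTER KEYSLOT <key>` gives the same answer and is a good cross-check.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 (XMODEM variant: polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Map a key to one of the 16384 cluster slots, honoring hash tags."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384


if __name__ == "__main__":
    for name in (b"foo", b"hello", b"{user1}.cart", b"{user1}.profile"):
        print(name, key_slot(name))
```

Hash tags work in the opposite direction from what is wanted for hot keys: they force keys into the same slot, so known hot keys should not share a tag.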
Whether or not items will cause a hot spot on a particular node or nodes depends on a bunch of factors. As already mentioned, hotspotting on a single key isn't necessarily a problem if the overall cluster traffic is still relatively even and the node that key is on isn't taxed. If each of your cluster nodes is handling 1000 commands/sec and on one of those nodes all of the commands relate to one key, it's not really going to matter: all of the commands are processed serially on a single thread, so it's all the same.
However, if you have 5 nodes, all of which are handling 1000 commands/sec, but you add a new key to one node which makes that single node incur another 3000 commands/sec, one of your 5 nodes is now handling 50% of the processing. This means that it's going to take longer for that node to handle all of its normal 1000 commands/sec, and 1/5 of your traffic is now arbitrarily much slower.
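The 50% figure follows directly from the numbers in the example, as this small calculation shows (the `load_shares` helper is just a name made up for the illustration):

```python
def load_shares(commands_per_sec):
    """Return each node's fraction of the total cluster traffic."""
    total = sum(commands_per_sec)
    return [rate / total for rate in commands_per_sec]


if __name__ == "__main__":
    balanced = [1000] * 5                  # 5 nodes at 1000 commands/sec each
    with_hot_key = [1000 + 3000] + [1000] * 4  # one node takes 3000 extra
    print(load_shares(balanced))      # every node carries 20%
    print(load_shares(with_hot_key))  # hot node: 4000 / 8000 = 50%
```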
Part of the general idea of distributing/sharding a database is not only to increase storage capacity but to balance work as well. Unbalancing that work will end up unbalancing or screwing up your application performance. It'll either cause 1/Nth of your request load to be arbitrarily slower due to accessing your bogged down node, or it'll increase processing time across the board if your requests potentially access multiple nodes. Evenly spreading load gives an application better capacity to handle higher load without adversely affecting performance.
But there's also the practical consideration of whether the access to your new keys is large in proportion to your ongoing traffic. If your cluster is handling 1000+ commands/sec/node and a single key will add 10 req/sec to a single particular node, you'll probably be just fine either way.

Neo4J PathFinder Optimization

I have a very large (several million nodes and many more relationships) embedded Neo4J graph database. I'm using version 2.1.5 of Neo4J. I often need to see how/if two nodes are connected. I use the GraphAlgoFactory to build a PathFinder that I then call findSinglePath on. If I build a Dijkstra PathFinder, it runs about an order of magnitude slower than a ShortestPath PathFinder when the nodes are in fact connected. However, when they are not connected, ShortestPath runs slower than Dijkstra. Anybody know why it might behave like this?
Also, how does one optimize these calls? When two nodes aren't connected, it takes 60-120 seconds to figure that out. For my purposes, that is too slow.
What is the degree-distribution of your network?
Can you filter more strictly on rel-types, directions, attributes or labels of the nodes in between? Just to reduce the number of paths looked at?
It might also help to use a different uniqueness, e.g. Node-Global.
You should probably provide an upper limit of your expected length.
Both Dijkstra and shortest path are actually bi-directional.
You can also use the bi-directional traverser yourself.
See this blog post: http://maxdemarzi.com/2015/11/20/bidirectional-traversals-in-space/
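To illustrate why a depth limit and a bi-directional search matter, here is a generic sketch over a plain adjacency dict. It is not Neo4j's API or its internal algorithm; the function name `bidirectional_hops` and the dict-based graph are assumptions for the example, and the graph is treated as undirected.

```python
def bidirectional_hops(graph, source, target, max_depth):
    """Shortest hop count between two nodes, or None if none exists within max_depth.

    `graph` maps each node to an iterable of neighbours (undirected).
    """
    if source == target:
        return 0
    dist = [{source: 0}, {target: 0}]  # distances seen from each end
    frontier = [[source], [target]]    # current BFS level on each side
    depth = [0, 0]                     # levels fully expanded per side
    while frontier[0] and frontier[1] and depth[0] + depth[1] < max_depth:
        # Expand the smaller frontier to keep the explored set small.
        side = 0 if len(frontier[0]) <= len(frontier[1]) else 1
        other = 1 - side
        best = None
        next_frontier = []
        for u in frontier[side]:
            for v in graph.get(u, ()):
                if v in dist[other]:
                    # The two searches meet: record the candidate path length.
                    candidate = dist[side][u] + 1 + dist[other][v]
                    if best is None or candidate < best:
                        best = candidate
                elif v not in dist[side]:
                    dist[side][v] = dist[side][u] + 1
                    next_frontier.append(v)
        if best is not None:
            return best
        frontier[side] = next_frontier
        depth[side] += 1
    return None


if __name__ == "__main__":
    chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
             "d": ["c", "e"], "e": ["d"], "x": []}
    print(bidirectional_hops(chain, "a", "e", max_depth=10))
    print(bidirectional_hops(chain, "a", "x", max_depth=10))
```

Without `max_depth`, the unconnected case has to exhaust an entire connected component before giving up, which is likely where the 60-120 seconds go.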

Concurrent page request comparisons

I have been hoping to find out what different server setups equate to in theory for concurrent page requests, and the answer always seems to be soaked in voodoo and sorcery. What is the approximation of max concurrent page requests for the following setups?
apache+php+mysql(1 server)
apache+php+mysql+caching(like memcached or similar (still one server))
apache+php+mysql+caching+dedicated Database Server (2 servers)
apache+php+mysql+caching+dedicatedDB+loadbalancing(multi webserver/single dbserver)
apache+php+mysql+caching+dedicatedDB+loadbalancing(multi webserver/multi dbserver)
+distributed (amazon cloud elastic) -- I know this one is "as much as you can afford" but it would be nice to know when to move to it.
I appreciate any constructive criticism. I am just trying to figure out when it's time to move from one implementation to the next, because they each come with their own implementation effort, either programming-wise or setup-wise.
In your question you talk about caching, and this is probably one of the most important factors in a web architecture regarding performance and capacity.
Memcache is useful, but actually, before that, you should be ensuring proper HTTP cache directives on your server responses. This does 2 things; it reduces the number of requests and speeds up server response times (if you have Apache configured correctly). This can also be improved by using an HTTP accelerator like Varnish and a CDN.
Another factor to consider is whether your system is stateless. By stateless, it usually means that it doesn't store sessions on the server and reference them with every request. A good systems architecture relies on state as little as possible. The less state the more horizontally scalable a system. Most people introduce state when confronted with issues of personalisation - i.e serving up different content for different users. In such cases you should first investigate using the HTML5 session storage (i.e store the complete user data in javascript on the client, obviously over https) or if the data set is smaller, secure javascript cookies. That way you can still serve up cached resources and then personalise with javascript on the client.
Finally, your stack includes a database tier, another potential bottleneck for performance and capacity. If you are only reading data from the system then again it should be quite easy to scale horizontally. If there are reads and writes, it's typically better to keep the read-write datasets in one database and the read-only data in another. You can then use more relevant methods to scale each.
These setups do not spit out a single answer that you can then compare to each other. The answer will vary on way more factors than you have listed.
Even if they did spit out a single answer, then it is just one metric out of dozens. What makes this the most important metric?
Even worse, each of these alternatives is not free. There is engineering effort and maintenance overhead in each of them, which cannot be analysed without understanding your organisation, your app and your cost/revenue structures.
Options like AWS not only involve development effort but may "lock you in" to a solution so you also need to be aware of that.
I know this response is not complete, but I am pointing out that this question touches on a large complicated area that cannot be reduced to a single metric.
I suspect you are approaching this from exactly the wrong end. Do not go looking for technologies and then figure out how to use them. Instead profile your app (measure, measure, measure), figure out the actual problem you are having, and then solve that problem and that problem only.
If you understand the problem and you understand the technology options then you should have an answer.
If you have already done this and the problem is concurrent page requests then I apologise in advance, but I suspect not.

Redis mimic MASTER/MASTER? or something else?

I have been reading a lot of the posts on here and surfing the web, but maybe I am not asking the right question. I know that Redis is currently Master/slave until Cluster becomes available. However, I was wondering if someone can tell me how I would want to configure Redis logistically to meet my needs (or if it's not the right tool).
Scenario:
We have 2 sites on opposite ends of the US. We want clients to be able to write at each site at a high volume. We then want each client to be able to perform reads at their site as well. However, we want the data from a write at the sister site to be available in < 50ms, given that we have plenty of bandwidth. Is there a way to configure redis to meet our needs? Our maximum write size would be on the order of 5k, usually much less. The main point is: how can I have 2 masters that sync to one another, even if it is not supported by default?
The catch with Tom's answer is that you are not running any sort of cluster, you are just writing to two servers. This is a problem if you want to ensure consistency between them. Consider what happens when your client fails a write to the remote server. Do you undo the write to local? What happens to the application when you can't write to the remote server? What happens when you can't read from the local?
The second catch is the fundamental physics issue Joshua raises. For a round trip you are talking a theoretical minimum of 38ms leaving a theoretical maximum processing time on both ends (of three systems) of 12ms. I'd say that expectation is a bit too much and bandwidth has nothing to do with latency in this case. You could have a 10GB pipe and those timings are still extant. That said, transferring 5k across the continent in 12ms is asking a lot as well. Are you sure you've got the connection capacity to transfer 5k of data in 50ms, let alone 12? I've been on private no-utilization circuits across the continent and seen ping times exceeding 50ms - and ping isn't transferring 5k of data.
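The budget arithmetic behind those numbers is simple enough to write down. The 19 ms one-way figure is the coast-to-coast estimate quoted elsewhere in this thread, not a measurement, and the helper name is made up for the example:

```python
ONE_WAY_MS = 19  # coast-to-coast one-way estimate quoted in this thread
BUDGET_MS = 50   # the stated requirement


def remaining_budget_ms(budget_ms, one_way_ms):
    """Time left for all processing once a full round trip is paid for."""
    return budget_ms - 2 * one_way_ms


if __name__ == "__main__":
    print(2 * ONE_WAY_MS)                              # round trip: 38 ms
    print(remaining_budget_ms(BUDGET_MS, ONE_WAY_MS))  # left over: 12 ms
```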
How will you keep the two unrelated servers in-sync? If you truly need sub-50ms latency across the continent, the above theoretical best-case means you have 12ms to run synchronization algorithms. Even one query to check the data on the other server means you are outside the 50ms window. If the data is out of sync, how will you fix it? Given the above timings, I don't see how it is possible to synchronize in under 50ms.
I would recommend revisiting the fundamental design requirements. Specifically, why this requirement? Latency requirements of 50ms round trip across the continent are usually the sign of marketing or lack of attention to detail. I'd wager that if you analyze the requirements you'll find that this 50ms window is excessive and unnecessary. If it isn't, and data synchronization is actually important (likely), then someone will need to determine if the significant extra effort to write synchronization code is worth it, or even possible to keep within the 50ms window. Cross-continent sub-50ms latency data sync is not a simple issue.
If you have no need for synchronization, why not simply run one server? You could use a slave on the other side of the continent for recovery-only purposes. Of course, that still means that best-case you have 12ms to get the data over there and back. I would not count on 50ms round trip operations+latency+5k/10k data transfer across the continent.
It's about 19ms at the speed of light to cross the US. <50ms is going to be hard to achieve.
http://www.wolframalpha.com/input/?i=new+york+to+los+angeles
This is probably best handled as part of your client - just have the client write to both nodes. Writes generally don't need to be synchronous, so sending the extra command shouldn't affect the performance you get from having a local node.
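A rough sketch of that client-side dual write is below. The `DualWriter` and `DictClient` names are hypothetical, and the clients only need `get` and `set` methods (as redis-py clients have). The local write is synchronous and the remote write runs in a background thread. Nothing here retries or reconciles a failed remote write, so it does not address the consistency concerns raised in the other answers.

```python
from concurrent.futures import ThreadPoolExecutor


class DictClient:
    """In-memory stand-in for a redis client, used only for this example."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value
        return True


class DualWriter:
    """Hypothetical client wrapper: write locally now, remotely in the background."""

    def __init__(self, local, remote, max_workers=4):
        self.local = local
        self.remote = remote
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def set(self, key, value):
        self.local.set(key, value)
        # The returned future lets callers check or log remote failures.
        return self._pool.submit(self.remote.set, key, value)

    def get(self, key):
        # Reads are always served by the local node.
        return self.local.get(key)

    def close(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    local, remote = DictClient(), DictClient()
    writer = DualWriter(local, remote)
    pending = writer.set("greeting", "hello")
    pending.result(timeout=1)  # wait only to make the demo deterministic
    print(writer.get("greeting"), remote.get("greeting"))
    writer.close()
```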