In my node-based app I plan to use Redis for a number of purposes, basically interprocess pub/sub communication and an online-users cache. That part is clear.
I am now thinking about where to store the app's basic system configuration. Critical elements like the main TCP port, default message channel, database name, admin-user password, etc. would go here.
The traditional choice for implementing this kind of thing would be a conf-file, perhaps a JSON structure in a plain-text file.
I am wondering, however, whether it would make sense to use Redis here instead. The major concern is data reliability and the risk of data loss.
What are the pros/cons?
The claims against Redis' reliability stem from the async nature of its data persistence and replication mechanisms. Being async, these can't guarantee that all updates will endure failures: depending on how you configure Redis, there's a (small) chance that the most recent updates will be lost if there's a failure.
That said, in the context of configuration-settings storage, this isn't really an issue. Configuration data is immutable most of the time, and if you, by some bizarre coincidence, lose your most recent updates to it, recovering the changes manually (i.e. reconfiguring) is usually trivial.
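For illustration, here is a minimal sketch of such a setup in a Node app, using the node-redis client; the key name app:config and the config.json fallback file are assumptions, not anything prescribed:

    import { createClient } from "redis";
    import { readFileSync } from "fs";

    // Try Redis first; fall back to the traditional conf-file.
    async function loadConfig(): Promise<any> {
      try {
        const client = createClient({ url: "redis://localhost:6379" });
        await client.connect();
        const raw = await client.get("app:config"); // config stored as one JSON string
        await client.quit();
        if (raw) return JSON.parse(raw);
      } catch {
        // Redis unreachable -- fall through to the file.
      }
      return JSON.parse(readFileSync("config.json", "utf8"));
    }

Keeping the plain-text file as the source of truth or fallback gives you conf-file reliability while still letting processes read live configuration out of Redis.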
I understand that using consistent hashing for load distribution across cache servers or (sharded) database servers offers a significant advantage over the usual key-based hashing, since when adding or removing servers the data movement required between servers due to rehashing is minimized.
However, if we consider application servers or web servers, which are often designed to be stateless and hence don't store any user/session-related data, does consistent hashing offer any advantage here? If yes, what data is being considered, or am I missing something?
If the server is truly stateless, then yes, it doesn't matter. You then optimize for other parameters, like the distance to the client.
But for a server that processes some business logic, there is implicit state in its cache. The server has to have some persistent storage (let's call it a database), local or remote; otherwise the client wouldn't need to make the request, as it would already have all the information.
The database's or the appserver's cache will already be warmed up for the requests it has been handling, and without consistent hashing it would have to be re-initialized each time the system scales up or down.
Even if the database is distributed too, the appserver's connection to a specific shard of the database could (or could not) also be a state.
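To make the cache-server case concrete, here is a rough sketch of a consistent hash ring in TypeScript (illustrative only; a production version would add virtual nodes per server for smoother balancing):

    import { createHash } from "crypto";

    // Hash a string to a position on a 32-bit ring.
    function hashOf(s: string): number {
      return createHash("md5").update(s).digest().readUInt32BE(0);
    }

    class ConsistentHashRing {
      private ring: { pos: number; node: string }[] = [];

      addNode(node: string): void {
        this.ring.push({ pos: hashOf(node), node });
        this.ring.sort((a, b) => a.pos - b.pos);
      }

      // The first node clockwise from the key's position owns the key.
      nodeFor(key: string): string {
        const h = hashOf(key);
        const entry = this.ring.find(e => e.pos >= h) ?? this.ring[0];
        return entry.node;
      }
    }

Adding or removing one node only remaps the keys lying between that node and its predecessor on the ring, which is exactly the cache-warmth advantage described above; for truly stateless servers that advantage evaporates.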
We are in the process of creating a piece of software to back up a storage account (blobs & tables, no queues), and while researching how to do this we came across the possibility of using Storage Logging. We would like to use this feature to do smart incremental backups after an initial full backup. However, in the introductory post for this feature here, the following caveat is mentioned:
During normal operation all requests are logged; but it is important to note that logging is provided on a best effort basis. This means we do not guarantee that every message will be logged due to the fact that the log data is buffered in memory at the storage front-ends before being written out, and if a role is restarted then its buffer of logs would be lost.
As this is a backup solution, this behavior makes the feature unusable: we can't afford to miss a file. However, I wonder whether this has changed in the meantime, as Microsoft has built a number of features on top of it, like blob function triggers and, very recently, the new Azure Event Grid.
My question is whether this behavior has changed in the meantime, or are the logs still on a best-effort basis, meaning we should stick to our 'scanning' strategy?
The behavior of Azure Storage logs is still the same. For your case, you might be better off using the Event Grid notifications for Blob storage: https://azure.microsoft.com/en-us/blog/introducing-azure-event-grid-an-event-service-for-modern-applications/
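For instance, the backup service could subscribe to BlobCreated events through an Event Grid webhook. A rough TypeScript/Express sketch (the event-type strings and the data.url field follow the published Event Grid blob-event schema, but verify against the current docs; enqueueForBackup is a hypothetical hook):

    import express from "express";

    const app = express();
    app.use(express.json());

    function enqueueForBackup(blobUrl: string): void {
      console.log("backup:", blobUrl); // placeholder for the real backup logic
    }

    app.post("/events", (req, res) => {
      for (const event of req.body) {
        // Event Grid's subscription-validation handshake.
        if (event.eventType === "Microsoft.EventGrid.SubscriptionValidationEvent") {
          res.json({ validationResponse: event.data.validationCode });
          return;
        }
        if (event.eventType === "Microsoft.Storage.BlobCreated") {
          enqueueForBackup(event.data.url);
        }
      }
      res.sendStatus(200);
    });

    app.listen(3000);

Note that Event Grid gives you at-least-once delivery of events rather than a replayable log, so an occasional full scan as a safety net may still be worthwhile.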
I was reading up on problems with server based authentication. I need help with elaboration on the following point.
Scalability: Since sessions are stored in memory, this provides problems with scalability. As our cloud providers start replicating servers to handle application load, having vital information in session memory will limit our ability to scale.
I don't quite understand why "... having vital information in session memory will limit our ability to scale". Is it just because the information is being replicated, so it's to do with redundancy? I don't think so. Anyway, would anyone be kind enough to explain this further? Much appreciated.
What's being referred to is the difference between stateless and stateful server-side ops. Stateful servers keep part of their resources (main memory, mostly) occupied for retaining state pertaining to some client, even when the server is actually doing nothing at all for the client and just waiting for the client to come back. Such systems' performance profile is "linear" only up to the point where all available memory has been filled with state, and beyond that point the server seems to essentially stall. Stateless servers only keep resources occupied when they're actually doing something, and once finished doing stuff, those resources are immediately freed and available for other clients. Such servers are essentially not capped by memory limits and therefore "scale easier".
Also, the explanation given seems to refer to scenarios where a set of distinct machines presents itself to the outside world as one, when actually they are not (this is often called a "cluster" of machines/servers). In such scenarios, if a client has connected to the "big single virtual machine", it is actually connected to just one of the real machines in the cluster. If state is kept there, subsequent visits by that same client must either be routed to the same physical machine, or that piece of state must be shuttled around to whatever machine the next visit happens to hit. The former implies the implementation of management functions that take their own set of resources, plus limitations on the freedom the cluster has to distribute the load (the opposite of why you want clustering); the latter implies additional network traffic that will cap scalability in essentially the same way as available memory does.
Server-based authentication makes use of sessions, which in turn make use of a local session id. In the cloud, when servers are replicated to handle application load, it becomes difficult for one server to know which sessions are active on other servers. To overcome this problem, extra steps must be performed, for instance persisting the session id to a database. However, as the servers are increasingly replicated, handling all this becomes more and more difficult. Therefore, server-based or session-based authentication can be problematic for scalability.
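A common way out is to move the session state off the individual server into a shared store. A minimal sketch with express-session backed by Redis (connect-redis v7-style wiring; earlier versions are wired up differently):

    import express from "express";
    import session from "express-session";
    import RedisStore from "connect-redis";
    import { createClient } from "redis";

    const redisClient = createClient({ url: "redis://localhost:6379" });
    redisClient.connect().catch(console.error);

    const app = express();
    app.use(session({
      store: new RedisStore({ client: redisClient }), // sessions live in Redis, not in-process memory
      secret: "change-me",
      resave: false,
      saveUninitialized: false,
    }));
    app.listen(3000);

With the sessions externalized, any replica can serve any request, so the load balancer no longer needs sticky routing.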
I understand that Redis serves all data from memory, but does it also persist data across server reboots, so that when the server reboots it reads all the data from disk back into memory? Or is it always a blank store that only holds data while apps are running, with no persistence?
I suggest you read about this at http://redis.io/topics/persistence. Basically, you give up guaranteed persistence in exchange for the performance of serving everything from memory. Imagine a scenario where you INSERT into memory but lose power before the data gets persisted to disk: there will be data loss.
Redis supports so-called "snapshots". This means that it will make a complete copy of what's in memory at certain points in time (e.g. every full hour). When you lose power between two snapshots, you will lose the data from the time between the last snapshot and the crash (it doesn't have to be a power outage...). Redis trades data safety for performance, like most NoSQL DBs do.
Most NoSQL databases follow a concept of replication among multiple nodes to minimize this risk. Redis is considered more of a speedy cache than a database that guarantees data consistency. Therefore its typical use cases differ from those of real databases:
You can, for example, store sessions, performance counters or whatever in it with unmatched performance and no real loss in case of a crash. But processing orders/purchase histories and so on is considered a job for traditional databases.
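As a concrete example of the counter case (node-redis client; the key name is made up):

    import { createClient } from "redis";

    const client = createClient();
    await client.connect(); // top-level await: run as an ES module

    // INCR is atomic across all connected processes; losing the last
    // few increments in a crash is acceptable for a metric like this.
    await client.incr("counters:page_views");
    console.log(await client.get("counters:page_views"));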
The Redis server saves all its data to disk from time to time, thus providing some level of persistence.
It saves data in one of the following cases:
automatically from time to time (per the save rules shown below)
when you manually call BGSAVE command
when redis is shutting down
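The automatic case is driven by save rules in redis.conf; these are the long-standing defaults:

    save 900 1      # snapshot if at least 1 key changed within 900 s
    save 300 10     # snapshot if at least 10 keys changed within 300 s
    save 60 10000   # snapshot if at least 10000 keys changed within 60 s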
But data in redis is not really persistent, because:
a crash of the redis process means losing all changes since the last save
the BGSAVE operation can only be performed if you have enough free RAM (in the worst case, the extra RAM needed equals the size of the redis DB)
N.B.: the BGSAVE RAM requirement is a real problem, because redis continues serving requests until there is no RAM left at all, but it stops saving data to disk much earlier (at approx. 50% of RAM).
For more information see Redis Persistence.
It is a matter of configuration. You can have no, partial or full persistence of your data in Redis. The best decision will be driven by the project's technical and business needs.
According to the Redis documentation on persistence, you can set up your instance to save data to disk from time to time or on each query, in a nutshell. It provides two strategies/methods, AOF and RDB (see the documentation for details about them), which you can use alone or together.
If you want a "SQL like persistence", they have said:
The general indication is that you should use both persistence methods if you want a degree of data safety comparable to what PostgreSQL can provide you.
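As an illustration, enabling both methods in redis.conf might look like this (directive names are from the persistence docs; the values are just examples):

    appendonly yes        # enable the AOF log
    appendfsync everysec  # fsync policy: always | everysec | no
    save 900 1            # keep periodic RDB snapshots as well

appendfsync always gives the strongest durability at the largest performance cost; everysec bounds the loss window to about one second.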
The answer is generally yes; however, a fuller answer really depends on what type of data you're trying to store. The more complete short answer is:
Redis isn't the best fit for persistent storage, as it's mainly performance-focused
Redis is really more suitable for reliable in-memory storage/caching of current-state data, particularly for allowing scalability by providing a central source for data used across multiple clients/servers
Having said this, by default Redis persists data snapshots at a periodic interval (apparently every minute, but I haven't verified this; it is described in the article below, which is a good basic intro):
http://qnimate.com/redis-permanent-storage/
TL;DR
From the official docs:
RDB persistence [the default] performs point-in-time snapshots of your dataset at specified intervals.
AOF persistence [needs to be explicitly configured] logs every write operation received by the server, that will be played again at server startup, reconstructing the original dataset.
Redis must be explicitly configured for AOF persistence, if this is required, and this will result in a performance penalty as well as growing logs. It may suffice for relatively reliable persistence of a limited amount of data flow.
You can also choose no persistence at all: better performance, but all data is lost when Redis shuts down.
Redis has two persistence mechanisms: RDB and AOF. RDB takes scheduled global snapshots, while AOF writes every update to an append-only log file, similar to MySQL's binlog.
You can use either one or both. When Redis reboots, it reconstructs the data by reading the RDB file or the AOF file.
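You can inspect or trigger either mechanism from redis-cli, for example:

    redis-cli CONFIG GET save        # current RDB snapshot rules
    redis-cli CONFIG GET appendonly  # whether AOF is enabled
    redis-cli BGSAVE                 # force a background snapshot now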
All the answers in this thread talk about the possibility of Redis persisting the data: https://redis.io/topics/persistence (using AOF + fsync on every write/change).
It's a great link to get you started, but it is definitely not showing you the full picture.
Can/Should You Really Persist Unrecoverable Data/State On Redis?
The Redis docs do not talk about:
Which Redis providers support this (AOF + fsync on every write) option:
Almost none of them - Redis Labs on the cloud does NOT provide this option. You may need to buy the on-premise version of Redis Labs to get it. As not all companies are willing to go on-premise, they will have a problem.
Other Redis providers do not specify whether they support this option at all (AWS Cache, Aiven, ...).
AOF + fsync on every write is slow; you will have to test it yourself on your production hardware to see whether it fits your requirements.
Redis Enterprise provides this option; from this link: https://redislabs.com/blog/your-cloud-cant-do-that-0-5m-ops-acid-1msec-latency/ let's look at some benchmarks:
On a 1x x1.16xlarge instance on AWS they could not achieve less than 2ms latency:
where latency was measured from the time the first byte of the request arrived at the cluster until the first byte of the ‘write’ response was sent back to the client
They did additional benchmarking on a much better hard disk (Dell-EMC VMAX), which resulted in < 1ms operation latency (!!) and anywhere from 70K ops/sec (write-intensive test) to 660K ops/sec (read-intensive test). Pretty impressive!!!
But it definitely requires a (very) skilled devops person to help you create this infrastructure and maintain it over time.
One could (falsely) argue that if you have a cluster of Redis nodes (with replicas), you now have full persistence. This is a false claim:
All DBs (SQL, NoSQL, Redis, ...) have the same problem - for example, after running set x 1 on node1, how much time does it take for this (or any) change to reach all the other nodes, so that subsequent reads return the same output? Well, it depends on a lot of factors and configurations.
It is a nightmare to deal with inconsistency of a key's value across multiple nodes (in any DB type). You can read more about it from the Redis author (antirez): http://antirez.com/news/66. Here is a short example of the actual nightmare of storing state in Redis (+ a solution - the WAIT command, to know how many other Redis nodes received the latest change):
def save_payment(payment_id)
  redis.rpush(payment_id, "in progress")  # returns false on exception
  if redis.wait(3, 1000) >= 3 then        # wait until 3 replicas ack, up to 1000 ms
    redis.rpush(payment_id, "confirmed")  # returns false on exception
    if redis.wait(3, 1000) >= 3 then
      return true
    else
      redis.rpush(payment_id, "cancelled")
      return false
    end
  else
    return false
  end
end
The above example is not sufficient and has a real problem: you need to know in advance how many nodes actually exist (and are alive) at every moment.
Other DBs will have the same problem as well. Maybe they have better APIs but the problem still exists.
As far as I know, a lot of applications are not even aware of this problem.
All in all, adding more DB nodes is not a one-click configuration. It involves a lot more.
To conclude this research, what to do depends on:
How many devs does your team have (so this task won't slow you down)?
Do you have a skilled devops?
What distributed-systems skills does your team have?
Money to buy hardware?
Time to invest in the solution?
And probably more...
Many not-so-well-informed and relatively new users think that Redis is a cache only and NOT an ideal choice for reliable persistence.
The reality is that the lines between DB, Cache (and many more types) are blurred nowadays.
It's all configurable and as users/engineers we have choices to configure it as a cache, as a DB (and even as a hybrid).
Each choice comes with benefits and costs, and this is NOT specific to Redis: all well-known distributed systems provide options to configure different aspects (persistence, availability, consistency, etc.). So, if you configure Redis in its default mode hoping that it will magically give you highly reliable persistence, that is the team's/engineer's fault (and NOT Redis').
I have discussed these aspects in more detail on my blog here.
Also, here is a link from Redis itself.
I've been trying to find ways to improve our NServiceBus code's performance. I searched and stumbled upon the profiles that you can set when running/installing the NServiceBus host.
Currently we're running the NServiceBus host as-is, and I read that by default we are using the "Lite" profile. I've also learnt from this link:
http://docs.particular.net/nservicebus/hosting/nservicebus-host/profiles
that there are Integration and Production profiles. The documentation does not say much - has anyone tried the Production profile and noticed an improvement in NServiceBus performance? Specifically in the speed of consuming messages from the queues?
One major difference between the NSB profiles is how they handle storage of subscriptions.
The Lite, Integration and Production profiles configure how reliable NSB is. For example, the Lite profile uses in-memory subscription storage for all pub/sub registrations. This is a concern because, in order to register a subscriber under the Lite profile, the publisher has to already be running (so that the publisher can store the subscriber list in memory). This means that if the publisher crashes for any reason (or is taken offline), all the subscription information is lost (until each subscriber is restarted).
So the Lite profile is good if you are running on a developer machine and want to quickly test how your services interact. However, it is just not suitable for other environments.
The integration profile stores subscription information on a local queue. This can be good for simple environments (like QA etc.). However, in a highly distributed environment holding the subscription information in a database is best, hence the production profile.
So, to answer your question, I don't think that by changing profiles you will see a performance gain. If anything, changing from the lite profile to one of the other profiles is likely to decrease performance (because you incur the cost of accessing queue or database storage).
Unless you have tuned the logging yourself, we've seen large improvements just from reducing logging. The performance of reading off the queues is the same all around; since the queues are local, you won't gain much from the transport. I would take a look at tuning your handlers and the underlying infrastructure: check out tuning MSMQ, look at the disk you are using, etc. Another spot to examine is how distributed transactions are behaving, assuming you are using a remote database that requires them.
Another option for increasing throughput is to increase the number of threads consuming the queue. This will require a license. If a license is not an option, you can run multiple instances of a single-threaded endpoint, but this requires you to shard your work based on message type or something else.
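For reference, the consuming-thread count is set in the endpoint's app.config; depending on your NServiceBus version this is the older MsmqTransportConfig section or the newer TransportConfig one (attribute values below are examples, so check the docs for your version):

    <!-- NServiceBus v3-era -->
    <MsmqTransportConfig NumberOfWorkerThreads="4" MaxRetries="5" />

    <!-- NServiceBus v4-era -->
    <TransportConfig MaximumConcurrencyLevel="4" MaxRetries="5" />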
Continuing up the scale, you can then get into using the Distributor to load-balance work. Again, this will require a license, but you'll be able to add more nodes as necessary. All of the options above also apply to this topology.