I'm trying to use Redis pub/sub between two systems, one of which is not ours (another company maintains it). I would like to have a timestamp for when I publish something to a Redis channel. Can someone help me with this?
I have already tried the Redis log at the debug level of information (the most verbose one), but there is no time information for pub/sub messages in the log.
I tested the Redis monitor (redis-cli monitor). It's exactly what I want, but it decreases the performance of the system by 50%.
The only way left seems to be implementing the time log myself: maybe SET some time info right before the PUBLISH command? That would put the local time into Redis slightly before the publish.
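In redis-py, that idea might look like the sketch below; the key and channel names are made up, and the recorded timestamp is the publisher's local clock, not the server's.

import time
import redis

r = redis.Redis(host='localhost', port=6379)

# MULTI ... EXEC: record a timestamp key atomically with the publish.
pipe = r.pipeline(transaction=True)
pipe.set('events:last_publish_ts', time.time())
pipe.publish('events', 'payload')
pipe.execute()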
You cannot achieve this goal with pub/sub.
You might want to try Redis Streams. For each stream entry, the first part of the automatically generated ID is the Unix timestamp (in milliseconds) of when the ID was generated, i.e. when the message was received by Redis.
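For example, with redis-py (stream name and payload are made up):

import redis

r = redis.Redis(host='localhost', port=6379)

# Passing '*' as the ID (the redis-py default) lets Redis generate it; the
# first part is the server's Unix time in milliseconds when the entry was stored.
entry_id = r.xadd('events', {'payload': 'hello'})  # e.g. b'1526919030474-0'
ms, seq = entry_id.decode().split('-')
print('received at', int(ms) / 1000.0)             # Unix timestamp in seconds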
The Redis team introduced the new Streams data type in Redis 5.0. Since Streams look like Kafka topics at first glance, it seems difficult to find real-world examples of using them.
In the Streams intro we have a comparison with Kafka streams:
Runtime consumer groups handling. For example, if one of three consumers fails permanently, Redis will continue to serve the first and second, because now we would have just two logical partitions (consumers).
Redis Streams are much faster. They are stored in and operated on from memory, so that much is true.
We have projects with Kafka, RabbitMQ and NATS. Now we are looking deeply into Redis Streams, trying to use them as a "pre-Kafka cache" and in some cases as a Kafka/NATS alternative. The most critical point right now is replication:
All data is stored in memory, with AOF persistence.
By default, asynchronous replication will not guarantee that XADD commands or consumer group state changes are replicated: after a failover, something can be missing depending on the ability of the followers to receive the data from the master. This alone looks like a point that kills any interest in trying Streams under high load.
The Redis failover process, as operated by Sentinel or Redis Cluster, performs only a best-effort check to fail over to the follower which is the most updated, and under certain specific failures may promote a follower that lacks some data.
And the capping strategy. The real "capped resource" with Redis Streams is memory, so it doesn't really matter how many items you want to store or which capping strategy you use: every time a consumer fails, you get either peak memory consumption or lost messages due to the cap.
We use Kafka as an RTB bidder frontend which handles ~1,100,000 messages per second with ~120-byte payloads. With Redis we see ~170 MB/sec of memory consumption on write, so a server with 512 GB of RAM gives us a write "reserve" of ~50 minutes of data (512 GB / 170 MB/sec ≈ 3,000 seconds). So if the processing system were offline for that long, we would crash.
Could you tell us more about real-world Redis Streams usage, and perhaps some cases where you have tried it yourself? Or can Redis Streams only be used with small amounts of data?
Long time no see. This feels like a discussion that belongs on the redis-db mailing list, but the use case sounds fascinating.
Note that Redis Streams are not intended to be a Kafka replacement - they provide different properties and capabilities despite the similarities. You are of course correct with regards to the asynchronous nature of replication. As for scaling the amount of RAM available, you should consider using a cluster and partitioning your streams across period-based key names, as sketched below.
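A minimal sketch of that partitioning idea, assuming daily granularity and a made-up events: key prefix; in a cluster, the different key names hash to different slots, spreading memory across nodes:

import datetime
import redis

r = redis.Redis(host='localhost', port=6379)

def stream_key_for_today():
    # One stream per day, e.g. 'events:2018-06-01'; whole days can be
    # deleted (or trimmed) once consumed, to bound memory usage.
    return 'events:' + datetime.date.today().isoformat()

r.xadd(stream_key_for_today(), {'payload': 'hello'})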
I'm using Redis to store simple key-value pairs, where the value is also a string. In my Redis cluster, I have a master and two slaves. I want to propagate any changes to the data from one of the slaves to another store (an Oracle database, in fact). How can I do that reliably? The sink database only needs to be eventually consistent; some delay is allowed.
Strategies I can think of:
a) Read the AOF file written by the slave machine and propagate the changes. (Requires parsing the AOF file and getting notified of every change to the file.)
b) Use RPOPLPUSH (the reliable queue pattern). But how do I make the slave insert into that queue whenever it gets a SET event from the master?
Any other possibility?
This is a very common problem faced by Redis developers. In a nutshell, you want to:
Know all changes since the last sync
Keep this change data atomic
I believe that any solution will revolve around these two issues. So yes, AOF is one of the best choices in this case, but there are no production-ready tools for it. It is not a very complex solution in the case of one server, but with master/slave or cluster setups it can become very complex.
Using Keyspace notifications
It looks like the Keyspace Notifications feature may be an alternative. Keyspace notifications have been available since 2.8.0, and are available in Redis Cluster too. From the original documentation:
Keyspace notifications allow clients to subscribe to Pub/Sub channels in order to receive events affecting the Redis data set in some way. Examples of the events that it is possible to receive are the following:
All the commands affecting a given key.
All the keys receiving an LPUSH operation.
All the keys expiring in the database 0.
Events are delivered using the normal Pub/Sub layer of Redis, so clients implementing Pub/Sub are able to use this feature without modifications.
Because Redis Pub/Sub is fire and forget, there is currently no way to use this feature if your application demands reliable notification of events: if your Pub/Sub client disconnects and reconnects later, all the events delivered while the client was disconnected are lost. This can be mitigated by duplicating the workers that serve this Pub/Sub channel:
A group of N workers subscribes to the notifications and puts the touched key names into a SET-based "sync" list. This lets us control the overhead and avoid writing the same key to the sync list twice.
Another group of workers pops records with SPOP and writes them to the other store. A sketch of both worker groups follows.
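A minimal sketch of this two-group setup in Python with redis-py, under assumptions: database 0, a sync set named sync:keys, and a hypothetical write_to_oracle() sink writer.

import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# One-time setup: 'K' enables keyspace events, '$' covers string commands (SET etc.).
r.config_set('notify-keyspace-events', 'K$')

def notification_worker():
    # Group 1: record the name of every key touched by SET into a Redis SET,
    # which de-duplicates repeated writes to the same key.
    p = r.pubsub(ignore_subscribe_messages=True)
    p.psubscribe('__keyspace@0__:*')
    for message in p.listen():
        if message['data'] == b'set':
            # channel looks like '__keyspace@0__:<key>'
            key = message['channel'].decode().split(':', 1)[1]
            r.sadd('sync:keys', key)

def sync_worker():
    # Group 2: drain the sync set and push values to the sink store.
    while True:
        key = r.spop('sync:keys')
        if key is None:
            continue  # nothing pending; a real worker would sleep or back off
        value = r.get(key)
        write_to_oracle(key.decode(), value)  # hypothetical sink writer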
Using manual update list
The other way is to maintain a special SET-based "sync" list, updated with every write operation (SET/HSET in your case, as I understand it). Something like:
MULTI
SET myKey value
SADD sync:keys myKey
EXEC
Each time you modify a key, you add its name to the SET. Then, in another process or worker, you can SPOP a key name, read its value, and update the sink store.
Also, instead of SPOP you can use RPOPLPUSH with some kind of "in progress" list, to protect against losing a key if a worker fails. (Note that RPOPLPUSH operates on lists, so the sync structure would need to be a list rather than a set in that case.) Each worker first RPOPLPUSHes the key name from the sync list to an in-progress list, pushes the data to storage, and then removes the key name from the in-progress list.
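A sketch of that reliable variant with redis-py, assuming the sync structure is a list named sync:keys and the same hypothetical write_to_oracle() sink writer:

import redis

r = redis.Redis(host='localhost', port=6379, db=0)

def reliable_sync_worker():
    while True:
        # Atomically move the key name to an in-progress list, so it
        # survives a crash between popping and writing to the sink.
        key = r.rpoplpush('sync:keys', 'sync:inprogress')
        if key is None:
            continue  # queue empty; BRPOPLPUSH would block instead of spinning
        value = r.get(key)
        write_to_oracle(key.decode(), value)  # hypothetical sink writer
        r.lrem('sync:inprogress', 1, key)     # done: drop from in-progress list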
The Redis INFO command returns a server_load metric; for us this is a value like 0.45. The question is what this value represents. A percentage? A fraction of 1?
Our monitoring shows that the load is very low.
Is 0.45 good or bad?
From https://azure.microsoft.com/en-gb/blog/investigating-timeout-exceptions-in-stackexchange-redis-for-azure-redis-cache/
"Is there a high Redis-server server load ? Using the Redis-cli client tool, you can connect to your Redis endpoint and run "INFO CPU" to check the value of server_load. A server load of 100 (maximum value) signifies that the redis server has been busy all the time (has not been idle) processing the requests.Run Slowlog from redis-cli to see if there are any requests that are taking more time to process causing server load to max out."
I think server_load only exists in Azure Redis; I haven't seen it in the standalone builds. And I'm pretty sure Microsoft would never use Unix-style load indication in ... anything.
No such thing as server_load in the source code either:
https://github.com/antirez/redis/search?utf8=%E2%9C%93&q=server_load
MSOpenTech GitHub Issue link: https://github.com/MSOpenTech/redis/issues/385
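For what it's worth, a quick way to check whether your server exposes the metric (connection details assumed):

import redis

r = redis.Redis(host='localhost', port=6379)

# On open-source Redis this prints the fallback text; on Azure Cache for
# Redis the CPU section reportedly includes server_load as a percentage.
info = r.info('cpu')
print(info.get('server_load', 'server_load not present in this build'))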
I am using Logstash to collect the logs from my service. The volume of data is so large (20 GB/day) that I am afraid some of it will be dropped at peak times.
So I asked a question here and decided to add Redis as a buffer between the ELB and Logstash to prevent data loss.
However, I am curious: when will Logstash exceed its queue capacity and drop messages?
I ask because I've done some experiments, and the results show that Logstash can process all the data without any loss, e.g., local file --> Logstash --> local file, netcat --> Logstash --> local file.
Can someone give me a concrete example of when Logstash will drop messages? Then I'd have a better understanding of why we need a buffer in front of it.
As far as I know, the Logstash queue is very small. From the documentation:
Logstash sets each queue size to 20. This means only 20 events can be pending into the next phase.
This helps reduce any data loss and in general avoids logstash trying to act as a data storage
system. These internal queues are not for storing messages long-term.
As you say, your daily log volume is 20 GB, which is quite a large amount. So it is recommended to install a Redis in front of Logstash. The other advantage of installing a Redis is that when your Logstash process errors out and shuts down, Redis can buffer the logs for you; otherwise all of those logs would be dropped.
The maximum queue size is configurable and the queue can be stored on-disk or in-memory. (Strongly advise in-memory due to high volume).
When the queue is full, logstash will stop reading log messages and drop incoming logs.
For log files, Logstash will stop reading further when it can't keep up and can resume reading later. It keeps track of the active log files and the last read position. The files basically act like an enormous buffer, so it's really unlikely to lose data (unless the files are deleted).
For TCP/UDP input, messages can be lost if the queue is full.
For other inputs/outputs, you have to check the doc, whether it can support back pressure, whether it can replay missed messages if a network connection was lost.
Generally speaking, 20 GB a day is pretty low (even in 2014, when this was originally posted); we're talking about 1,000 messages a second. Logstash really doesn't need a Redis in front.
For very large deployments (multiple TB per day), it's common to encounter Kafka somewhere in the chain to buffer messages. At that stage there are typically many clients with different types of messages, flowing over a variety of protocols.
I'm developing a project with Redis. My Redis configuration is the default setup.
I don't know how I should configure Redis. Master-slave? Cluster?
Do you have any suggestions for a production Redis configuration?
The standard approach would be to have one master and at least one slave. Depending on your I/O requirements and ops/sec, you can always add multiple read-only slaves. Slaves can be read from but not written to, so you'll want to design your application to round-robin read requests across the slaves and send writes only to the single master.
Depending on your data storage/backup requirements, you can set fsync for append-only mode to every second. While this means you can lose up to one second's worth of data, in practice it's much less than that, because your slaves serve as hot backups and will have the data within milliseconds.
You'll at least want to do a BGSAVE every hour to get a dump.rdb produced. You can then copy this file while the server is still running and store it in some off-site backup facility.
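A minimal sketch of both suggestions with redis-py (connection details assumed); the same settings can live in redis.conf as appendonly yes / appendfsync everysec:

import redis

r = redis.Redis(host='localhost', port=6379)

# fsync the append-only file once per second, bounding loss to ~1s of writes.
r.config_set('appendonly', 'yes')
r.config_set('appendfsync', 'everysec')

# Kick off a background RDB snapshot (produces dump.rdb); schedule this
# hourly and copy the file off-site while the server keeps running.
r.bgsave()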
But if you're just using Redis as a standard memcache replacement and don't care about the data, then you can ignore all of this. Much of it will change with Redis Cluster in the 3.0 release.
It depends on what your read/write requirements are. Could you give us more information on that?
I think about 10,000 people use my application at any instant. I persist member login tokens in Redis, and this is important for me: if I can't write to Redis, members can't log in to the application.
Even a single Redis instance will be enough to handle 10K users (run redis-benchmark to see the throughput available), but just to be safe, use a master/slave configuration with automatic promotion of the slave if the master goes down.
Since you want persistence, use RDB (maybe along with AOF); see the persistence topic on redis.io.
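If you go the master/slave-with-autopromotion route, Redis Sentinel handles the promotion, and redis-py can discover the current master through it. A sketch, with made-up Sentinel endpoint and service name:

from redis.sentinel import Sentinel

sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.5)

master = sentinel.master_for('mymaster')   # writes go to the current master
replica = sentinel.slave_for('mymaster')   # reads can go to a replica

master.set('session:token', 'abc123')
print(replica.get('session:token'))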