Is there a way to replicate AWS managed Redis to Azure managed Redis in real time? - redis

I have tried below approach .
Step 1) Took a snapshot of AWS Redis and copied it in Azure
Step 2) Listened to the key space notifications from AWS Redis and put it in a Queue and read these events from the queue and applied it on Azure Redis.
Problems faced:
1)Every operation which effects AWS Redis data, needs one query to AWS Redis to get the effected data as it is not present in notification.
2) Some operations like 'rename' is being delivered in two notifications, handling these two notifications to construct the actual operation will be bit difficult even in Queue.

Related

Replaying Kafka events stored in S3

I might be thinking of this incorrectly, but we're looking to set up a connection between Kafka and S3. We are using Kafka as the backbone of our microservice event sourcing system and may occasionally need to replay events from the beginning of time in certain scenarios (i.e. building a new service, rebuilding a corrupted database view).
Instead of storing events indefinitely in AWS EBS storage ($0.10/GB/mo.), we'd like to shift them to S3 ($0.023/Gb/mo. or less) after seven days using the S3 Sink Connector and eventually continually move them down the chain of S3 storage levels.
However, I don't understand that if I need to replay a topic from the beginning to restore a service, how would Kafka get that data back on demand from S3? I know I can utilize a source connector, but it seems that is only for setting up a new topic. Not for pulling data back from an existing topic.
The Confluent S3 Source Connector doesn't dictate where the data is written back into. But you may want to refer the storage configuration properties regarding topics.dir and topic relationship.
Alternatively, write some code to read your S3 events and send them into a Kafka producer client.
Keep in mind, for your recovery payment calculations that reads from different tiers of S3 cost more and more.
You may also want to follow the developments of Kafka native tiered storage support (or similarly, look at Apache Pulsar as an alternative)

How to push AWS ECS Fargate cloudwatch logs to UI for users to see their long running task real time logs

I am creating an app where the long running tasks are getting executed in ECS Fargate and logs are getting pushed to CloudWatch. Now, am looking for a way to give the users an ability in the UI to see those realtime live logs while their tasks are running.
I am thinking of the below approach..
Saving logs temporarily in DynamoDB
DynamoDB stream with batch will trigger a lambda.
Lambda will trigger an AWS Appsync mutation with None data source.
In UI client will subscribed to that mutation to get real time updates. (depends on the batch size, example 5 batch means 5 logs lines )
https://aws.amazon.com/premiumsupport/knowledge-center/appsync-notify-subscribers-real-time/
Is there any other techniques or methods that i can think of?
Why not use Cloudwatch default save in S3 bucket ability and add SNS to let clients choose which topic they want to trail the log. Removing extra DynamoDB.

How to transfer better sqs queue messages into redshift?

I have a sqs queue, that my application constantly sends messages to (about 5-15 messages per second).
I need to take the messages data and put it in redshift.
Right now, I have background service which gets X messages from the queue every Y minutes, then the service put them in an s3 file, and transfer the data into redshift using the COPY command.
This implementation have some problems:
In my service, I get X messages at a time, and because of the sqs limits, amazon allow to receive only 10 messages at max at a time (meaning that if I want to get 1000 messages, I will need to make 100 network calls)
My service doesn't scale as the application scales -> when there will be 30 (or 300) messages per second, my service won't be able to handle all the messages.
Using aws firehose is a little inconvenient the way I see it, because SHARDS are not scalable (I will need to configure manually to add shards) but maybe I'm wrong here
...
A a result of those things, I need something that will be scalable and efficient as possible. any ideas?
For the purpose you have described, I think AWS would say that Kinesis Data Streams plus Kinesis Data Firehose is a more appropriate service than SQS.
Yes, like you said, you do have to configure the shards. But just one shard can handle 1000 incoming records/sec. Also there are ways to automate the scaling, for example like AWS have documented here
One further advantage of using Kinesis Data Firehose is you can create a delivery stream which pushes the data straight into Redshift if you wish.

Propagating data from Redis slave to a SQL database

I'm using Redis for storing simple key, value pairs; where, value is also of string type. In my Redis cluster, I've a master and two slaves. I want to propagate any of the changes to the data from one of the slaves to any other store (actually, oracle database). How can I do that reliably? The sink database only needs to be eventually consistent. Some delay is allowed.
Strategies I can think of:
a) Read the AOF file written by the slave machine and propagate the changes. (Requires parsing the AOF file and getting notified of every change to the file.)
b) Use rpoplpush. The reliable queue pattern provided. But, how to make the slave insert to that queue whenever it gets some set event from the master?
Any other possibility?
This is a very common problem faced by Redis developers. In a nutshell, it is the fact that:
Want to know all changes sinse last
Keep this change data atomic
I believe that any decision one way or another will be around these issues. So, yes AOF is one of best choises in this case, but here is not any production ready instruments for that. Yes, it is not very complex solution in case of one server but then using master/slave or cluster it can be very complex.
Using Keyspace notifications
Look's like Keyspace Notifications feature may be alternative. Keyspace notifications is a feature available since 2.8.0 and available in Redis cluster too. From original documentation:
Keyspace notifications allows clients to subscribe to Pub/Sub channels in order to receive events affecting the Redis data set in some way.Examples of the events that is possible to receive are the following:
All the commands affecting a given key.
All the keys receiving an LPUSH operation.
All the keys expiring in the database 0.
Events are delivered using the normal Pub/Sub layer of Redis, so clients implementing Pub/Sub are able to use this feature without modifications.
Because Redis Pub/Sub is fire and forget currently there is no way to use this feature if you application demands reliable notification of events, that is, if your Pub/Sub client disconnects, and reconnects later, all the events delivered during the time the client was disconnected are lost. This can be improved by duplicating the employees who serve this Pub/Sub channel:
The group of N workers subscribe to notification and put data to SET based "sync" list. This allow us control overhead and do not write same data to our sync list.
The other group of workers pop record with SPOP and write it other store.
Using manual update list
The other way is using special "sync" SET based list with every write operation (as i understand SET/HSET in your case). Something like:
MULTI
SET myKey value
SADD myKey
EXEC
Each time you modify your key you add key name to SET. So in other process or worker you can SPOP that key, read value and update source.
Also you can use RPOPLPUSH/LPOPRPUSH besides of SPOP in some kind of in progress list to protect your key would missed if worker failed. In this case each worker first RPOPLPUSH/LPOPRPUSH from sync set to in progress set, push data to storage and remove key from in progress set.

How to build a simplified redis cluster (support data sharding and load balance)?

Since the redis cluster is still a work in progress, I want to build a simplied one by myselfin the current stage. The system should support data sharding,load balance and master-slave backup. A preliminary plan is as follows:
Master-slave: use multiple master-slave pairs in different locations to enhance the data security. Matsters are responsible for the write operation, while both masters and slaves can provide the read service. Datas are sent to all the masters during one write operation. Use Keepalived between the master and the slave to detect failures and switch master-slave automatically.
Data sharding: write a consistant hash on the client side to support data sharding during write/read in case the memory is not enougth in single machine.
Load balance: use LVS to redirect the read request to the corresponding server for the load balance.
My question is how to combine the LVS and the data sharding together?
For example, because of data sharding, all keys are splited and stored in server A,B and C without overlap. Considering the slave backup and other master-slave pairs, the system will contain 1(A,B,C), 2(A,B,C) , 3(A,B,C) and so on, where each one has three servers. How to configure the LVS to support the redirection in such a situation when a read request comes? Or is there other approachs in redis to achieve the same goal?
Thanks:)
You can get really close to what you need by using:
twemproxy shard data across multiple redis nodes (it also supports node ejection and connection pooling)
redis slave master/slave replication
redis sentinel to handle master failover
depending on your needs you probably need some script listening to fail overs (see sentinel docs) and clean things up when a master goes down