Cloud Memorystore redis scale down - redis

I want to know: if we scale down a Cloud Memorystore Redis Standard Tier instance, will it impact the data present in the cache?
I referred to the URL below, but didn't find the exact information.
https://cloud.google.com/memorystore/docs/redis/scaling-behavior
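For reference, the scale operation itself is a single gcloud call; a minimal sketch, where the instance name, size and region are placeholders:

# Resize (here: scale down) an existing Memorystore for Redis instance
gcloud redis instances update my-instance --size=2 --region=us-central1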

Related

Reduce Redis cluster to single GCP memorystore

I have 3 Redis instances. One is the master and the other two are slaves. I have connected to the master node and fetched info with the redis-cli INFO command. I can see the parameter cluster_enabled:0 and
#Replication
role:master
connected_slaves:2
slave0:ip=xxxxx,port=6379,state=online,offset=15924636776,lag=1
slave1:ip=xxxxx,port=6379,state=online,offset=15924636776,lag=0
As for the keyspace, each node has different DBs. I need to migrate all the data to a single Memorystore instance in GCP, but I don't know how. Can anyone help me?
Since the two nodes are slaves and clustering is not enabled, you only need to replicate the master node. RIOT is a great tool for migrating data in and out of Redis.
However, when you say each node has a different DB, do you mean the Redis databases that you access with SELECT? In that case you'll need to prefix the keys, as there may be overlap between the keysets of the DBs.
I think setting up another Redis cluster in a single node configuration is the least of your worries.
The real challenge for you would be migrating all your records over to the new setup. This is not a simple question to answer and would depend heavily on multiple factors:
The total size of the data being migrated
Whether this is a live production database
Whether you want to keep the two DB schemas separate in your new configuration
OK, I believe your Redis instances are currently hosted on Google Compute Engine, and you are looking to migrate to Memorystore for Redis.
As mentioned here, you can leverage Redis snapshots for this. The documentation provides step-by-step instructions on how to achieve this, using GCS buckets as transient storage.
import data into Cloud Memorystore instances using RDB (Redis Database Backup) snapshots, as well as back up data from existing Redis instances.
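In practice the flow is roughly as follows. This is only a sketch; the master host, bucket name, instance name and region are placeholders:

# 1. Take an RDB snapshot from the current master (running on Compute Engine)
redis-cli -h <master-ip> --rdb /tmp/dump.rdb
# 2. Upload the snapshot to a GCS bucket used as transient storage
gsutil cp /tmp/dump.rdb gs://my-transfer-bucket/dump.rdb
# 3. Import it into the target Memorystore instance
gcloud redis instances import gs://my-transfer-bucket/dump.rdb my-instance --region=us-central1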

A solution to increase my EMR master node capacity without shutting down the cluster

My EMR master node has become full and I need to attach an EBS volume to it. Is there any way to do this without terminating the cluster?
You can add additional EBS volumes and also resize existing ones.
How to do it is explained here:
https://superuser.com/questions/1409373/how-to-add-an-ebs-volume-by-snapshot-id-to-amazon-emr
https://github.com/qyjohn/AWS_Tutorials/wiki/Grow-EBS-volumes-on-EMR-clusters
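As a rough sketch of what those guides walk through (volume ID, instance ID, zone, device and mount point are all placeholders; on Nitro instances the device will show up as /dev/nvme1n1 rather than /dev/xvdf):

# Create a new EBS volume in the master node's availability zone
aws ec2 create-volume --availability-zone us-east-1a --size 200 --volume-type gp2
# Attach it to the master node's EC2 instance
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf
# On the master node: create a filesystem and mount it
sudo mkfs -t xfs /dev/xvdf
sudo mkdir -p /mnt/extra
sudo mount /dev/xvdf /mnt/extra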
I don't think so. This is because you set up Amazon Elastic Block Store (Amazon EBS) volumes and configure mount points when the cluster is launched, so it’s difficult to modify the storage capacity after the cluster is running.
The feasible solutions usually involve adding more nodes to your cluster, backing up your data to a data lake, and then launching a new cluster with a higher storage capacity. Or, if the data that occupies the storage is expendable, removing the excess data is usually the way to go.
For more details, have a look at: https://aws.amazon.com/blogs/big-data/dynamically-scale-up-storage-on-amazon-emr-clusters/
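If adding nodes is the route you take, it can be done on a running cluster with a single CLI call; a sketch, with the cluster ID, instance type and count as placeholders:

# Add a task instance group to a running EMR cluster
aws emr add-instance-groups --cluster-id j-0123456789ABC --instance-groups InstanceCount=2,InstanceGroupType=TASK,InstanceType=m5.xlarge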

How do I distribute data into multiple nodes of redis cluster?

I have a large number of key-value pairs of different types to be stored in the Redis cache. Currently I use a single Redis node. When my app server starts, it reads a lot of this data in bulk (using MGET) to cache it in memory.
To scale Redis further, I want to set up a cluster. I understand that in cluster mode, I cannot use MGET or MSET if the keys are stored in different slots.
How can I distribute data into different nodes/slots and still be able to read/write in bulk?
This is handled in the Redis client library. You need to find out whether a library with this feature exists in the language of your choice. For example, if you are using Go, then per the docs redis-go-cluster provides this feature.
https://redis.io/topics/cluster-tutorial
redis-go-cluster is an implementation of Redis Cluster for the Go language using the Redigo library client as the base client. Implements MGET/MSET via result aggregation.
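Independently of the client library, Redis Cluster itself allows multi-key commands when all the keys hash to the same slot, which you can force with hash tags; a small sketch against a hypothetical cluster node on port 7000:

# Keys sharing the hash tag {user:1000} map to the same slot,
# so MSET/MGET work on them even in cluster mode (-c follows redirects)
redis-cli -c -p 7000 MSET '{user:1000}:name' alice '{user:1000}:email' alice@example.com
redis-cli -c -p 7000 MGET '{user:1000}:name' '{user:1000}:email'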

Prometheus export / import data for backup

How do you export and import data in Prometheus? How do you make sure the data is backed up if the instance goes down?
It does not seem that such a feature exists yet, so how do you do it?
Since Prometheus version 2.1 it is possible to ask the server for a snapshot. The documentation provides more details - https://web.archive.org/web/20200101000000/https://prometheus.io/docs/prometheus/2.1/querying/api/#snapshot
Once a snapshot is created, it can be copied somewhere for safekeeping, and if required a new server can be created using this snapshot as its database.
The documentation website constantly changes its URLs; this links to fairly recent documentation on the subject:
https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis
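A minimal sketch of taking and copying a snapshot (the data directory, snapshot name and backup destination are illustrative; the server must be started with --web.enable-admin-api):

# Ask the running server to create a snapshot
curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot
# The response contains the snapshot name, e.g.
# {"status":"success","data":{"name":"20240101T000000Z-0123456789abcdef"}}
# Snapshots live under <data-dir>/snapshots/<name>; copy that directory off-host
cp -r /prometheus/data/snapshots/20240101T000000Z-0123456789abcdef /backup/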
There is no export and especially no import feature for Prometheus.
If you need to keep the data collected by Prometheus for some reason, consider using the remote write interface to write it somewhere suitable for archival, such as InfluxDB (configured as a time-series database).
Prometheus isn't long-term storage: if the database is lost, the user is expected to shrug, mumble "oh well", and restart Prometheus.
credits and many thanks to amorken from IRC #prometheus.
There is an option to enable Prometheus data replication to a remote storage backend. Later, the data collected from multiple Prometheus instances can be backed up in one place on the remote storage backend. See, for example, how the VictoriaMetrics remote storage can save time and network bandwidth when creating backups to S3 or GCS with the vmbackup utility.
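For reference, pointing Prometheus at such a backend is a one-stanza config change via remote write; a sketch, where the VictoriaMetrics hostname and config path are assumptions and no remote_write section exists yet:

# Append a remote_write stanza to the Prometheus config, then reload Prometheus
cat >> /etc/prometheus/prometheus.yml <<'EOF'
remote_write:
  - url: "http://victoriametrics:8428/api/v1/write"
EOF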

AWS S3 alternatives for private cloud

Right now we have a requirement to migrate from AWS to a private data center. We need to find a potential alternative storage to replace AWS S3.
Currently S3 is used in the following way:
Overall storage size is 10TB;
Min/Avg/Max object size is 0.5/2/100 MB;
We have N app instances that simultaneously write/read objects at approximately 50 writes/sec and 30 reads/sec;
This storage should be redundant (Highly Available), Fault Tolerant, Scalable;
The naive implementation could be to store this data on:
Simple NFS storage with some replication functionality added;
A NoSQL DB (for example Cassandra), just storing the mentioned objects there. However, Cassandra will require a number of instances to support this storage (it's not recommended to store > 1 TB on one Cassandra node; see Cassandra capacity planning).
What solution would you recommend for such a scenario?
Using MinIO is your best bet if you want to have private cloud storage. It is AWS S3 compatible, meaning that applications that use AWS S3 can be migrated to MinIO seamlessly. They have a tutorial on how to connect the MinIO server with the AWS CLI. You can test it against the publicly hosted MinIO server https://play.min.io:9000. Please refer to AWS CLI with MinIO Server.
You can have a highly available storage system using a MinIO distributed setup. Be aware that dynamic expansion is not a feature of the MinIO distributed setup. If you want to expand your cluster, you end up spinning up a new cluster with the desired number of servers/disks and then migrating your data from the old one to the new one.
I find it much easier to use than HDFS. In addition, a lot of technologies outside the Hadoop ecosystem lack HDFS integration. For example, Docker Registry lacks a built-in HDFS storage driver; however, it has an S3 driver, so you can use MinIO as its object storage.
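As a rough sketch of a distributed deployment and S3-compatible access (hostnames, paths and credentials are placeholders; older MinIO releases use MINIO_ACCESS_KEY/MINIO_SECRET_KEY instead of the ROOT variables):

# Start a 4-node, erasure-coded MinIO cluster (run the same command on every node)
export MINIO_ROOT_USER=admin MINIO_ROOT_PASSWORD=change-this-long-secret
minio server http://minio{1...4}.example.com/data{1...2}
# Existing S3 clients only need the endpoint switched, e.g. with the AWS CLI:
aws --endpoint-url http://minio1.example.com:9000 s3 ls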
There are a bunch of options for an S3-compatible private cloud service. If you like open-source solutions, the OpenStack Swift and Cassandra options mentioned in other answers are good ones. Note that, no matter what you use, you will probably end up setting up a cloud with multiple nodes; that is the inevitable price of redundancy and availability. There are some good commercial and economical products as well, such as the one from Cloudian.
If you need an object store, I could recommend Elliptics (in English).
As far as I know, it doesn't have limits on disk storage.
In our Cassandra case we use SSD disks (for better performance) of < 200-500 GB. The ring size would depend on your requirements (read/write latency, replication factor, time to live).
50 writes/sec, 30 reads/sec
This is really quite easy for Cassandra, comparing with our setup.
In that case it depends more on the time to live of your objects.
Generally, for a distributed setup you could also look at GlusterFS.
You can use OpenStack Swift
Swift is a highly available, distributed, eventually consistent object/blob store. Organizations can use Swift to store lots of data efficiently, safely, and cheaply.
Learn more at: https://docs.openstack.org/swift/latest/
And https://oldhenhut.com/2016/05/31/s3-vs-swift/
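For a feel of the client side, a small sketch with the python-swiftclient CLI, where the Keystone endpoint, credentials, container and file names are placeholders:

# Credentials via the usual OS_* environment variables
export OS_AUTH_URL=https://keystone.example.com:5000/v3
export OS_USERNAME=demo OS_PASSWORD=secret OS_PROJECT_NAME=demo
swift --auth-version 3 upload backups archive.tar.gz   # upload an object into the 'backups' container
swift --auth-version 3 list backups                    # list the container's contents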