Which S3-compatible blob storage?

I want to deploy an S3-compatible blob storage service in my Kubernetes cluster. I already use GlusterFS for volumes such as MongoDB's, and I tried to set up MinIO with the Helm chart https://github.com/helm/charts/tree/master/stable/minio. I just realized I can't scale MinIO up easily because of erasure coding.
So I have some questions about blob storage solutions:
Is the GlusterFS blob storage service stable and reliable (https://github.com/gluster/gluster-kubernetes/tree/master/docs/examples/gluster-s3-storage-template)?
Must I use OpenShift to deploy GlusterFS blob storage, as I have read on the web? I think not, because I can see plain Kubernetes manifests in the GlusterFS repo, such as this one: https://github.com/gluster/gluster-kubernetes/blob/master/deploy/kube-templates/gluster-s3-template.yaml.
Is it easy to use MinIO federation in Kubernetes? Is it easily scalable with a "helm upgrade --set replicas=X", or do I need to update the MinIO configuration manually?
As you can see, I feel lost with this S3 storage, so if you have more information or other solutions, do not hesitate to share.
Thanks in advance!

Regarding reliability, you should read more about user experiences, for example:
An end user review of GlusterFS
Community Survey Feedback, 2019
Why OpenShift with GlusterFS:
For standalone Red Hat Gluster Storage, there is no component installation required to use it with OpenShift Container Platform. OpenShift Container Platform comes with a built-in GlusterFS volume driver, allowing it to make use of existing volumes on existing clusters. Note, however, that Red Hat Gluster Storage is a commercial storage software product based on Gluster.
How to deploy it in AWS
For MinIO, please follow the official docs:
A ConfigMap allows injecting containers with configuration data even while a Helm release is deployed.
To update your MinIO server configuration while it is deployed in a release, you need to (see the sketch after these steps):
Check all the configurable values in the MinIO chart using helm inspect values stable/minio.
Override the minio_server_config settings in a YAML-formatted file, and then pass that file like this: helm upgrade -f config.yaml stable/minio.
Restart the MinIO server(s) for the changes to take effect.
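As a rough sketch of those steps (the release name and the setting overridden below are hypothetical, and the exact keys under minio_server_config depend on the chart version, so check helm inspect values first):

    # Dump the chart's configurable values for reference
    helm inspect values stable/minio > minio-defaults.yaml

    # config.yaml: override only the server settings you need (example key)
    cat > config.yaml <<EOF
    minio_server_config:
      region: "us-east-1"
    EOF

    # Apply the override to the existing release (here called my-minio)
    helm upgrade -f config.yaml my-minio stable/minio

    # Force the MinIO pods to restart so they pick up the new config
    # (the label selector may differ between chart versions)
    kubectl delete pod -l release=my-minio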
I didn't try it, but as per the documentation:
For federation, I can see additional environment variables in the values.yaml.
In addition, you should run MinIO in federated mode; see the Federation Quickstart Guide.
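For illustration only, a hedged sketch of what those federation knobs could look like when passed through the chart; the environment variable names come from the Federation Quickstart Guide, while the environment key and the etcd endpoints, domain and IPs below are assumptions to verify against your chart's values.yaml:

    # Hypothetical etcd endpoints, domain and public IPs for a federated setup
    helm upgrade my-minio stable/minio \
      --set environment.MINIO_ETCD_ENDPOINTS="http://etcd-0:2379\,http://etcd-1:2379" \
      --set environment.MINIO_DOMAIN=minio.example.com \
      --set environment.MINIO_PUBLIC_IPS="10.0.0.10\,10.0.0.11"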
Here you can find the differences between Google and Amazon S3 storage,
or Cloud Storage interoperability from the Google Cloud perspective.
Hope this helps.

Related

HDFS over S3 / Google storage bucket translation layer - how?

I'd love to expose a Google storage bucket over HDFS to a service.
The service in question is a cluster (SOLR) that can speak only to HDFS. Given that I have no Hadoop (nor any need for it), ideally I'd like to have a Docker container that would use a Google storage bucket as a backend and expose its contents via HDFS.
If possible I'd like to avoid mounts (like FUSE gcsfs); has anyone done such a thing?
I think I could just mount gcsfs and set up a single-node cluster with HDFS, but is there a simpler / more robust way?
Any hints / directions are appreciated.
The Cloud Storage Connector for Hadoop is the tool you might need.
It is not a Docker image but rather an install. Further instructions can be found in the GitHub repository under README.md and INSTALL.md.
If it is accessed from AWS (rather than from within GCP), you'll need a Service Account with access to Cloud Storage and to set the environment variable GOOGLE_APPLICATION_CREDENTIALS to /path/to/keyfile.
To use SOLR with GCS you do indeed need a Hadoop cluster; you can do that in GCP by creating a Dataproc cluster and then using the connector mentioned above to connect your SOLR solution to GCS. For more info, check this SOLR link.
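For illustration, a minimal sketch of driving the connector from a Hadoop client outside GCP (the bucket name and key-file path are placeholders, and the exact property names can vary between connector versions, so check INSTALL.md; on a Dataproc cluster the connector is already installed and configured, so the last command alone is enough):

    # Credentials for the service account that can read the bucket
    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/keyfile.json

    # List a bucket through the Hadoop filesystem layer using the gs:// scheme
    hadoop fs \
      -D fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem \
      -D google.cloud.auth.service.account.json.keyfile="$GOOGLE_APPLICATION_CREDENTIALS" \
      -ls gs://my-bucket/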

HSQLDB on S3 Compatible Service

We use HSQLDB as a file-based database, as our application's requirements for an RDBMS are minimal. We would now like to move this application to Pivotal Cloud Foundry. S3-compatible storage (in the cloud) is the only service comparable to a "filesystem" on physical machines.
So if we move our current HSQLDB to S3, we would not be able to make a direct JDBC connection to the HSQLDB "file" (as accessing S3 objects needs authentication, etc.).
Has anyone faced such a situation before? Are there ways to access HSQLDB with S3 as a storage medium?
Thanks,
Midhun
Pivotal Cloud Foundry allows you to bind volume mounts to your cf push-ed applications. Thanks to the NFS volume service (see cf marketplace), you can attach such mounts with the usual cf create-service and cf bind-service commands, and then write your HSQLDB files under the filesystem directory where the NFS volume is mounted.
This could be a handy solution for running your app in Cloud Foundry with persistent filesystem storage for your HSQLDB database.
Default PCF installations provide such a mount from an NFS server. Here is the NFS volumes documentation and, especially for your PCF operator, how to enable this feature.
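A minimal sketch of that flow; the service name and plan come from the standard nfs volume service, while the app name, NFS export and mount path below are placeholders (run cf marketplace to see what your foundation actually offers):

    # Create an NFS volume service instance pointing at an existing export
    cf create-service nfs Existing hsqldb-volume \
      -c '{"share": "nfs.example.com/export/hsqldb"}'

    # Bind it to the app and choose where it is mounted inside the container
    cf bind-service my-hsqldb-app hsqldb-volume \
      -c '{"mount": "/var/hsqldb", "uid": "1000", "gid": "1000"}'

    # Restage so the app starts with the volume mounted; HSQLDB then writes
    # its database files under /var/hsqldb
    cf restage my-hsqldb-app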

Can Spinnaker use local storage such as a MySQL database?

I want to deploy Spinnaker for my team, but I have encountered a problem. The Spinnaker documentation says:
Before you can deploy Spinnaker, you must configure it to use one of the supported storage types.
Azure Storage
Google Cloud Storage
Redis
S3
Can Spinnaker use local storage, such as a MySQL database?
The Spinnaker microservice responsible for persisting your pipeline configs and application metadata, front50, has support for the storage systems you listed. One could add support for additional systems like MySQL by extending front50, but that support does not exist today.
Some folks have had success configuring front50 to use S3 and pointing it at a MinIO installation.
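For example, with Halyard that setup looks roughly like this; the MinIO endpoint, credentials and bucket name below are placeholders:

    # Point Spinnaker's S3 storage settings at the MinIO endpoint
    hal config storage s3 edit \
      --endpoint http://minio.example.com:9000 \
      --access-key-id MINIO_ACCESS_KEY \
      --secret-access-key \
      --bucket spinnaker          # --secret-access-key prompts for the secret

    # Select s3 as front50's persistent storage type, then redeploy
    hal config storage edit --type s3
    hal deploy apply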

Kubernetes Custom Volume Plugin with Dynamic Provisioning

I have a proprietary file-system and I would like to use it to provide file storage to my K8S pods. I am currently running K8S v1.5.1, but am open to upgrading to 1.6 if need be.
I would like to make use of dynamic provisioning so that volumes are created on an as-needed basis. I went through the official documentation on kubernetes.io, and this is what I have understood so far:
I need to write a custom Kubernetes volume plugin for my proprietary file-system.
I need to create a StorageClass which makes use of a provisioner that provisions volumes from my proprietary filesystem.
I then create a PVC that refers to my StorageClass.
I then create my Pods referring to my storage class by name.
What I am not able to make out is:
Are the provisioner referred to by the StorageClass and the K8S volume plugin one and the same? If they are different, how?
There is mention of an External Provisioner in the K8S documentation. Does this mean I can write the K8S volume plugin for my filesystem out-of-tree (outside the K8S code)?
My filesystem provides REST APIs to create filesystem volumes. Can I invoke them in my provisioner/volume plugin?
If I write an out-of-tree plugin, how do I load it in my K8S cluster so that it can be used to provision volumes using the Storage Class?
Appreciate any help in answering any or all of the above.
Thanks!
Are the provisioner referred to by the StorageClass and the K8S volume plugin one and the same? If they are different, how?
They should be the same if you want to provision the storage using that plugin.
There is mention of an External Provisioner in the K8S documentation. Does this mean I can write the K8S volume plugin for my filesystem out-of-tree (outside the K8S code)?
Yes, that's correct.
My filesystem provides REST APIs to create filesystem volumes. Can I invoke them in my provisioner/volume plugin?
Yes, as long as the client is part of the provisioner code.
If I write an out-of-tree plugin, how do I load it in my K8S cluster so that it can be used to provision volumes using the StorageClass?
It can run as a container, or you can invoke it as a binary.
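To make that wiring concrete, here is a rough sketch of the StorageClass and PVC side, assuming a hypothetical provisioner name example.com/my-fs that your external provisioner watches for (the Pod then references the PVC by name, and the PVC references the StorageClass):

    # StorageClass bound to the external provisioner, plus a PVC that
    # triggers dynamic provisioning through it
    cat <<EOF | kubectl apply -f -
    apiVersion: storage.k8s.io/v1        # use v1beta1 on K8S 1.5.x
    kind: StorageClass
    metadata:
      name: my-fs
    provisioner: example.com/my-fs       # must match the name your provisioner registers
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-fs-claim
    spec:
      storageClassName: my-fs            # on 1.5.x use the volume.beta.kubernetes.io/storage-class annotation instead
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
    EOF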

Retrieve application config from secure location during task start

I want to make sure I'm not storing sensitive keys and credentials in source or in Docker images. Specifically, I'd like to store my MySQL RDS application credentials and copy them when the container/task starts. The documentation provides an example of retrieving the ecs.config file from S3, and I'd like to do something similar.
I'm using the Amazon ECS-optimized AMI with an auto scaling group that registers with my ECS cluster. I'm using the Ghost Docker image without any customization. Is there a way to configure what I'm trying to do?
You can define a volume on the host and map it to the container with read-only privileges.
Please refer to the following documentation for configuring volumes for an ECS task:
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html
Even though the container does not have the config at build time, it will read the config as if it were available in its own filesystem.
There are many ways to secure the config on the host OS.
In my past projects, I have achieved the same by disabling SSH into the host and injecting the config at boot using cloud-init.
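As an illustration of that pattern (the bucket, paths and container details below are hypothetical), the host can pull the credentials once at boot and the task definition can expose them to the container read-only:

    # In the instance user data / cloud-init: fetch the config from S3 onto the host
    aws s3 cp s3://my-config-bucket/ghost-config.json /etc/ghost/config.json
    chmod 600 /etc/ghost/config.json

    # Register a task definition that maps the host path into the container read-only
    aws ecs register-task-definition --cli-input-json '{
      "family": "ghost",
      "volumes": [{"name": "ghost-config", "host": {"sourcePath": "/etc/ghost"}}],
      "containerDefinitions": [{
        "name": "ghost",
        "image": "ghost",
        "memory": 512,
        "mountPoints": [{"sourceVolume": "ghost-config",
                         "containerPath": "/var/lib/ghost/config",
                         "readOnly": true}]
      }]
    }'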