GCP - CDN Server - load-balancing

I'm trying to architect a system on GCP for scalable web/app servers. My initial intention was to have one disk per web server group hosting the OS, and another hosting the source code, imagery, etc. My idea was to mount the OS disk on multiple VM instances so as to have exact clones of the servers, with one place to store PHP session files (so moving between different servers would be transparent and not cause problems).
The second idea was to mount a 2nd disk, containing the source code and media files, which would then be shared with 2 web servers, one configured as a CDN server and one with the main website and backend. The backend would modify/add/delete media files, and the CDN server would supply them to the browser when requested.
My problem arises when reading that a Persistent Disk is only mountable on a single VM instance with read/write access; if it's needed on multiple instances, it can be mounted only in read-only mode. I need one of the instances to have read/write access while the others (possibly many) have read-only access.
Could you suggest ways or methods to implement such a system on GCP, or tell me if it's not possible at all?

Unfortunately, it's not possible.
However, you can create a Single-Node File Server and mount it as a read/write disk on the other VMs.
GCP has documentation on how to create a Single-Node File Server.
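For illustration, a minimal sketch of mounting such a file server's NFS export on a web server VM; the server name and export path are assumptions, not values from the documentation:
# On each web server VM: install the NFS client (Debian/Ubuntu image assumed)
sudo apt-get update && sudo apt-get install -y nfs-common

# Mount the file server's export; "singlefs-1" and "/data" are hypothetical names
sudo mkdir -p /mnt/shared
sudo mount -t nfs singlefs-1:/data /mnt/shared

# Optionally persist the mount across reboots
echo 'singlefs-1:/data /mnt/shared nfs defaults 0 0' | sudo tee -a /etc/fstab
The file server exports the directory read/write, so the backend can modify media files while the CDN-facing server reads the same tree.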

An alternative to using persistent disks (which, as you said, only allow a single read/write mount or many read-only mounts) is to use Cloud Storage, which can be mounted through FUSE.
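As a rough sketch, assuming the Cloud Storage FUSE adapter (gcsfuse) is installed and using a hypothetical bucket name:
# Mount a Cloud Storage bucket as a local directory via gcsfuse
mkdir -p /mnt/media
gcsfuse example-media-bucket /mnt/media

# The backend VM writes files here...
cp /tmp/new-image.jpg /mnt/media/

# ...and any other VM mounting the same bucket sees them (subject to gcsfuse caching behaviour).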

Related

Cloud and local application sync ideas

I have a situation where my central MySQL DB runs on an EC2 instance and my file storage is on S3.
But one of my applications runs locally at my client's site on a Raspberry Pi 3 device, and it needs to look up data and files from both the DB and the file storage in the cloud. The application in turn generates transactional records, which need to be uploaded to the DB and file storage (maybe at day end).
The catch is that sometimes the cloud may not be reachable due to connectivity issues (the site is in a remote area).
What could be the best strategies to accommodate this kind of scenario?
Can AWS Greengrass help here?
How to keep the lookup data (DB and FS) in sync with the local devices?
How to update/sync the transactional data generated by the local devices?
And finally, what could be the risks in such a deployment model?
Appreciate some help/suggestions.
How to keep the lookup data (DB and FS) in sync with the local devices?
You can have a Greengrass group and include all of the devices in that group. Make the devices subscribe to a topic, e.g. DB/Cloud/update. Once a device receives a message on that topic, it triggers an on-demand Lambda to download the latest information from the cloud. To make sure a device does not miss any updates while it is offline, you can use a persistent session; it ensures the device receives all the missed messages when it comes back online.
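As a minimal sketch of the persistent-session idea only (not a full Greengrass configuration), a device-side MQTT subscription with a persistent session could look like this with mosquitto_sub; the endpoint, certificate paths, and client ID are hypothetical:
# QoS 1 plus clean-session disabled (-c) means messages published to
# DB/Cloud/update while the device is offline are queued by the broker
# and delivered once the device reconnects.
mosquitto_sub \
  -h example-ats.iot.us-east-1.amazonaws.com -p 8883 \
  --cafile AmazonRootCA1.pem --cert device.pem.crt --key private.pem.key \
  -i pi3-site-01 -q 1 -c \
  -t 'DB/Cloud/update'
In the actual Greengrass setup, the subscription is defined in the group and the download logic lives in the on-demand Lambda.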
How to update/sync the transactional data generated by the local devices?
You may try the Stream Manager: https://docs.aws.amazon.com/greengrass/latest/developerguide/stream-manager.html
Right now, it allows you to add a local Lambda function to pre-process the data and sync it up with the cloud.

Using Kubernetes Persistent Volume for Data Protection

To resolve a few issues we are running into with Docker and running multiple instances of some services, we need to be able to share values between running instances of the same Docker image. The original solution I found was to create a storage account in Azure (where we are running our Kubernetes instance that houses the containers) and a Key Vault in Azure, accessing both via the well-defined APIs that Microsoft has provided for Data Protection (detailed here).
Our architect instead wants to use Kubernetes Persistent Volumes, but he has not provided information on how to accomplish this (he just wants to save money on the Azure subscription by not having an additional storage account or key storage). I'm very new to Kubernetes and have no real idea how to accomplish this, and my searches so far have not turned up much that's useful.
Is there an extension method that should be used for Persistent Volumes? Would this just act like a shared file location and be accessible with the PersistKeysToFileSystem API for Data Protection? Any resources that you could point me to would be greatly appreciated.
A PersistentVolume with Kubernetes in Azure will not give you the same exact functionality as Key Vault in Azure.
PersistentVolume:
Store locally on a mounted volume on a server
Volume can be encrypted
The volume moves with the pod: if the pod starts on a different server, the volume moves with it.
Accessing the volume from other pods is not that easy.
You can control performance by assigning guaranteed IOPS to the volume (from the cloud provider).
Key Vault:
Store keys in a centralized location managed by Azure
Data is encrypted at rest and in transit.
You rely on a remote API rather than a local file system.
There might be a performance hit from going to an external service, though I assume this is not a major problem within Azure.
Kubernetes pods can access the service from anywhere as long as they have network connectivity to the service.
Less maintenance time, since it's already maintained by Azure.
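If you do go the PersistentVolume route, the rough idea is a volume that every replica can mount read/write and that PersistKeysToFileSystem points at. A minimal sketch, assuming an AKS cluster with the built-in azurefile storage class (Azure Files supports ReadWriteMany); the names and sizes are hypothetical:
# Create a claim that several pods can mount read/write (backed by Azure Files)
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dataprotection-keys
spec:
  accessModes:
    - ReadWriteMany            # multiple replicas must share the key ring
  storageClassName: azurefile  # assumption: AKS built-in Azure Files class
  resources:
    requests:
      storage: 1Gi
EOF
In the deployment you would mount this claim at some path such as /var/dataprotection-keys and pass that path to PersistKeysToFileSystem, so all replicas read and write the same key ring.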

Just how volatile is a Bluemix Virtual Server's own storage?

The Bluemix documentation leads a reader to believe that the only persistent storage for a virtual server is Bluemix Block Storage. It also leads you to believe that the virtual server's own storage will not persist across restarts or failures. However, in practice this doesn't seem to be the case, at least as far as restarts are concerned; we haven't suffered any virtual server outages yet.
So we want a clearer understanding of the rationale for separating the virtual server's own storage from its attached Block Storage.
Use case: I am moving our Git server and a couple of small LAMP-based assets to a Bluemix Virtual Server as we simultaneously develop new mobile apps using Cloud Foundry. In our case, we don't anticipate scaling up the work that the virtual server does any time soon. We just want a reliable new home for an existing website.
Even if you separate application files and databases out into block storage, re-provisioning the virtual server in the event of its loss is not trivial, even when the provisioning is automated with Ansible or the like. So we are not expecting to have to regularly re-provision the non-persistent storage of a Bluemix Virtual Server.
The Bluemix doc you reference is a bit misleading and is being corrected. The virtual server's storage on local disk does persist across restart, reboot, suspend/resume, and VM failure. If that were not the case, the OS image would be lost during any such event.
One of the key advantages of storing application data in a block storage volume is that the data will persist beyond the VM's lifecycle. That is, even if the VM is deleted, the block storage volume can be left intact to persist the data. As you mentioned, block storage volumes are often used to back DB servers so that the user data is isolated, which lends itself well to providing a higher class of storage specifically for application data, backup, recovery, etc.
In use cases where VM migration is desired the VMs can be set up to boot from a block storage volume, which enables one to more easily move the VM to a different hypervisor and simply point to the same block storage boot volume.
Based on your use case description you should be fine using VM local storage.

Sharing files across weblogic clusters

I have a WebLogic 12 cluster. Files get pushed to it both through HTTP forms and through scp to a single machine in the cluster. However, I need the files on all the nodes of the cluster. I can run scp myself and copy to all parts of the cluster, but I was hoping that WebLogic supported this functionality in some manner. I don't have a disk shared between the machines that would make this easier, nor can I create a shared disk.
Does anybody know?
No, there is no way for WLS to ensure that a file copied to one WLS instance is copied to the others, especially when you are copying it over yourself using scp.
Please use a shared storage mount so that all managed servers can refer to the same location without the need to do scp.
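If a shared mount truly isn't an option, a common workaround is a small script on the receiving node that pushes new uploads to the rest of the cluster. This is only a sketch with hypothetical hostnames and paths (it uses rsync rather than plain scp so repeated runs only transfer changes), not a WebLogic feature:
#!/bin/sh
# Push newly uploaded files from the receiving node to the other cluster nodes.
# Run this from cron (or an inotify watcher) on the node that receives uploads.
SRC=/u01/app/uploads/
NODES="wls-node2 wls-node3 wls-node4"   # hypothetical hostnames

for node in $NODES; do
  # -a preserves permissions/timestamps; --delete mirrors removals as well
  rsync -az --delete "$SRC" "oracle@${node}:${SRC}"
done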

How to replicate Amazon EBS to S3?

We have a site where users upload files, some of them quite large. We've got multiple EC2 instances and would like to load balance them. Currently, we store the files on an EBS volume for fast access. What's the best way to replicate the files so they can be available on more than one instance?
My thought is that some automatic replication process that uploads the files to S3, and then automatically downloads them to other EC2 instances would be ideal.
EBS snapshots won't work because they replicate the entire volume, and we need to be able to replicate the directories of individual customers on demand.
You could write a shell script that spawns s3cmd to sync your local filesystem with an S3 bucket whenever a new file is uploaded (or deleted). It would look something like:
s3cmd sync ./ s3://your-bucket/
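To cover the "whenever a new file is uploaded" part, one option is to wrap that command in an inotify watch. A rough sketch, assuming inotify-tools is installed and using a hypothetical upload directory and bucket:
#!/bin/sh
# Watch the upload directory and push changes to S3 as they happen.
UPLOAD_DIR=/var/www/uploads
BUCKET=s3://your-bucket/uploads/

while inotifywait -r -e create -e delete -e moved_to -e close_write "$UPLOAD_DIR"; do
  # sync is idempotent, so re-running it after every event is safe (if a bit chatty)
  s3cmd sync --delete-removed "$UPLOAD_DIR/" "$BUCKET"
done
The other instances could then run the reverse sync (s3cmd sync s3://your-bucket/uploads/ /var/www/uploads/) on a schedule to pull new files down.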
It depends on what OS you are running on your EC2 instances:
If you are running *nix, the classic choice might be to run rsync and just sync between instances.
On Windows you could still use rsync, or SyncToy from Microsoft is a simple free option. Otherwise there are probably hundreds of commercial applications in this space...
If you are running Windows it's also worth considering DFS (Distributed File System), which provides replication and is part of Windows Server...
There isn't really any need to add S3 to the mix unless you want to store the files there for some other reason (like backup). If you do want to sync to S3, then I would suggest one of the S3 client apps like CloudBerry or JungleDisk, which both have sync functionality...
The best way is to use the Amazon CloudFront service. All of the replication is managed as part of AWS. Content is served from several different availability zones, but this does not require you to have EBS volumes in those zones.
Amazon CloudFront delivers your static and streaming content using a global network of edge locations. Requests for your objects are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
http://aws.amazon.com/cloudfront/
Two ways:
Forget EBS: transfer the files to S3 and use S3 as your file store rather than EBS, add CloudFront, and use the common URL everywhere.
Mount the S3 bucket on all machines.
1. Amazon CloudFront is a web service for content delivery. It delivers your static and streaming content using a global network of edge locations.
http://aws.amazon.com/cloudfront/
2. You can mount the S3 bucket on your Linux machine. See below:
s3fs (http://code.google.com/p/s3fs/wiki/InstallationNotes) - this did work for me. It uses a FUSE file system + rsync to sync the files in S3. It keeps a copy of all filenames in the local system and makes it look like a regular file/folder.
That way you can share the S3 bucket on different machines.
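For reference, a minimal sketch of such a mount, assuming s3fs is installed; the bucket name, credentials, and mount point are placeholders:
# Store the S3 credentials where s3fs expects them (format: ACCESS_KEY_ID:SECRET_ACCESS_KEY)
echo 'ACCESS_KEY_ID:SECRET_ACCESS_KEY' > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs

# Mount the bucket; repeat the same mount on every EC2 instance that needs the files
mkdir -p /mnt/s3-uploads
s3fs your-bucket /mnt/s3-uploads -o passwd_file=~/.passwd-s3fs -o allow_other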