How do I set up a rolling deployment in Spinnaker?

I just started trying out Spinnaker. I have gone through the tutorial, https://www.spinnaker.io/guides/tutorials/codelabs/gcp-kubernetes-source-to-prod/, and got it working without issues.
Now I want to go a bit more advanced and do a rolling release or a canary deployment (https://www.spinnaker.io/concepts/#deployment-strategies), where it is possible, for instance, to only expose a new release to 5% of the customers.
I cannot find any guide on spinnaker.io (or Google) on how to set that up. Can anyone point me in the right direction?

I have been experimenting with Spinnaker and canary deployments and building PoCs myself lately, and here is what I have found so far.
To implement a rolling release, just create a Deploy stage in Spinnaker, and set the Deployment Strategy to RollingUpdate in your Server Group config. You will need to make sure that the Deployment checkbox is checked before you can change the Deployment Strategy.
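For context on what RollingUpdate does once the Deployment is handed to Kubernetes: the rollout is bounded by the Deployment's maxSurge and maxUnavailable settings. Here is a minimal sketch of that arithmetic, assuming both are left at 25% (the function name is made up for illustration, it is not a Spinnaker or Kubernetes API):

```python
def rolling_update_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Return (min_available, max_total) pod counts during a rolling update.

    Kubernetes rounds a percentage maxSurge up and a percentage
    maxUnavailable down when converting them to absolute pod counts.
    """
    surge = (replicas * max_surge_pct + 99) // 100       # round up
    unavailable = replicas * max_unavailable_pct // 100  # round down
    return replicas - unavailable, replicas + surge


min_available, max_total = rolling_update_bounds(10)
print(min_available, max_total)  # 8 13
```

So with 10 replicas, at least 8 pods stay available and at most 13 exist at any point while old pods are replaced by new ones.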
The Canary Deployment is a little more involved. I don't think that the Canary Stage currently supports Kubernetes Deployments (yet), but apparently you can manually deploy a canary (e.g. 1 replica) into the same Kubernetes LoadBalancer where your app is running. This is done using a separate Spinnaker Server Group.
Then you can add a Manual Judgement to your Spinnaker pipeline that will pause until you test/validate the canary. Once the canary has been validated, you "Continue" the Manual Judgement. The new Server Group then gets deployed, the old Server Group gets disabled, and the canary is destroyed.
If you don't want to use a Manual Judgement and want this fully automated, you can add an ACA (Automated Canary Analysis) Stage. This involves setting up a judge that Spinnaker can connect to, which gathers various metrics and provides an ACA score. You can then use that score to decide whether to proceed with a deployment or stop it.
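To illustrate the last step, here is a rough sketch of turning canary scores into a proceed/stop decision. The threshold of 75 and the function name are assumptions for the example; this is not Spinnaker's own judge logic, just the shape of the decision:

```python
def canary_decision(scores, pass_threshold=75.0):
    """Proceed only if every analysis interval scored at or above the threshold.

    `scores` is a list of canary scores (0-100), one per analysis interval.
    The threshold value is an assumption chosen for this example.
    """
    if not scores:
        raise ValueError("at least one canary score is required")
    return "proceed" if min(scores) >= pass_threshold else "stop"


print(canary_decision([92.0, 88.5, 95.0]))  # proceed
print(canary_decision([92.0, 61.0, 95.0]))  # stop
```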

Related

How to build a development and production environment in Apache NiFi

I have 2 Apache NiFi servers, development and production, hosted on AWS. Currently the migration between development and production is done manually. Is it possible to automate this process and ensure that people do not develop in production?
I thought about uploading the entire NiFi to GitHub and having it deploy the new NiFi on the production server, but I don't know if that would be the correct thing to do.
One option is to use NiFi Registry: store the flows in the registry and share the registry between the Development and Production environments. You can then promote the latest version of the flow from dev to prod.
As you say, another option is to use Git to share the flow.xml.gz between environments, together with a deploy script. The flow.xml.gz stores the data flow configuration/canvas. You can use parameterized flows (https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Parameters) to point NiFi at different external dev/prod services (e.g. the NiFi dev processor uses a dev database URL, while NiFi prod points to the prod database URL).
One more option is to export all or part of the NiFi flow as a template and upload the template to your production NiFi; however, Registry is probably a better way of handling this. More info on templates here: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#templates.
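To make the parameter idea concrete, here is a small illustration of how one flow property can resolve to different values per environment. NiFi does this substitution itself; the Python below only mimics it, and the parameter name, URLs, and function are hypothetical:

```python
import re

# NiFi references parameters as #{name}; this pattern mimics that syntax.
PARAM_REF = re.compile(r"#\{([^}]+)\}")

# Hypothetical parameter contexts, one per environment.
PARAMETER_CONTEXTS = {
    "dev": {"db.url": "jdbc:postgresql://dev-db.example.internal:5432/app"},
    "prod": {"db.url": "jdbc:postgresql://prod-db.example.internal:5432/app"},
}


def resolve(property_value, environment):
    """Replace each #{name} reference with the value from that environment."""
    context = PARAMETER_CONTEXTS[environment]
    return PARAM_REF.sub(lambda match: context[match.group(1)], property_value)


# The same flow definition yields a different connection URL per environment.
print(resolve("#{db.url}", "dev"))
print(resolve("#{db.url}", "prod"))
```

The point is that the flow you version and promote stays identical, and only the parameter context differs between dev and prod.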
I believe the original design plan behind NiFi was not necessarily to have different environments, and to allow live changes in production. I guess you would build your initial data flow using some test data in production and then once it's ready start the live data flow. But I think it's reasonable to want to have separate environments.

Mirror canary deployment in RTL?

I am new to canary deployments. We are going to start doing canary deployments via Istio.
I was assuming this would just be a deployment mechanism, probably with some Istio routing testing in a pre-prod environment, while in earlier test environments we'd ring-fence to the version being tested, as we do today.
It's been suggested the canary concept is applied to all test environments so we effectively run all versions we expect to canary test in prod in the Route To Live.
Wondering what approach others are taking?
Mirroring
As mentioned here
Using Istio, you can use traffic mirroring to duplicate traffic to another service. You can incorporate a traffic mirroring rule as part of a canary deployment pipeline, allowing you to analyze a service's behavior before sending live traffic to it.
If you're looking for best practices, I would recommend starting with this tutorial on Medium, because it explains the topic very well.
How Traffic Mirroring Works
Traffic mirroring works using the steps below:
You deploy a new version of the application and switch on traffic mirroring.
The old version responds to requests like before but also sends an asynchronous copy to the new version.
The new version processes the traffic but does not respond to the user.
The operations team monitor the new version and report any issues to the development team.
As the application processes live traffic, it helps the team uncover issues that they would typically not find in a pre-production environment. You can use monitoring tools, such as Prometheus and Grafana, for recording and monitoring your test results.
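The steps above can be sketched as a toy simulation. This is not how Istio implements mirroring (the proxy sends the mirrored copy asynchronously, and this sketch is synchronous for simplicity); all names here are made up for illustration:

```python
def mirror_request(request, primary, shadow, shadow_log):
    """Serve `request` from the old version and mirror a copy to the new one.

    Only the primary's response is returned to the caller. Whatever the
    shadow returns (or raises) is recorded for the operations team.
    """
    response = primary(request)
    try:
        shadow_log.append((request, shadow(request)))
    except Exception as exc:  # a failing shadow must never affect the user
        shadow_log.append((request, exc))
    return response


def old_version(request):
    return "v1 handled " + request


def new_version(request):
    if request == "/checkout":
        raise RuntimeError("bug that only shows up under live traffic")
    return "v2 handled " + request


log = []
print(mirror_request("/home", old_version, new_version, log))      # v1 handled /home
print(mirror_request("/checkout", old_version, new_version, log))  # v1 handled /checkout
print(log)
```

The user always gets the v1 response, while the log shows that v2 failed on /checkout, which is the kind of issue mirroring is meant to surface.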
Additionally, there is an example with nginx that shows clearly how it should work.
Canary deployment
As mentioned here
One of the benefits of the Istio project is that it provides the control needed to deploy canary services. The idea behind canary deployment (or rollout) is to introduce a new version of a service by first testing it using a small percentage of user traffic, and then if all goes well, increase, possibly gradually in increments, the percentage while simultaneously phasing out the old version. If anything goes wrong along the way, we abort and rollback to the previous version. In its simplest form, the traffic sent to the canary version is a randomly selected percentage of requests, but in more sophisticated schemes it can be based on the region, user, or other properties of the request.
Depending on your level of expertise in this area, you may wonder why Istio’s support for canary deployment is even needed, given that platforms like Kubernetes already provide a way to do version rollout and canary deployment. Problem solved, right? Well, not exactly. Although doing a rollout this way works in simple cases, it’s very limited, especially in large scale cloud environments receiving lots of (and especially varying amounts of) traffic, where autoscaling is needed.
These are the differences between a Kubernetes canary deployment and an Istio canary deployment.
k8s
As an example, let’s say we have a deployed service, helloworld version v1, for which we would like to test (or simply rollout) a new version, v2. Using Kubernetes, you can rollout a new version of the helloworld service by simply updating the image in the service’s corresponding Deployment and letting the rollout happen automatically. If we take particular care to ensure that there are enough v1 replicas running when we start and pause the rollout after only one or two v2 replicas have been started, we can keep the canary’s effect on the system very small. We can then observe the effect before deciding to proceed or, if necessary, rollback. Best of all, we can even attach a horizontal pod autoscaler to the Deployment and it will keep the replica ratios consistent if, during the rollout process, it also needs to scale replicas up or down to handle traffic load.
Although fine for what it does, this approach is only useful when we have a properly tested version that we want to deploy, i.e., more of a blue/green, a.k.a. red/black, kind of upgrade than a “dip your feet in the water” kind of canary deployment. In fact, for the latter (for example, testing a canary version that may not even be ready or intended for wider exposure), the canary deployment in Kubernetes would be done using two Deployments with common pod labels. In this case, we can’t use autoscaling anymore because it’s now being done by two independent autoscalers, one for each Deployment, so the replica ratios (percentages) may vary from the desired ratio, depending purely on load.
Whether we use one deployment or two, canary management using deployment features of container orchestration platforms like Docker, Mesos/Marathon, or Kubernetes has a fundamental problem: the use of instance scaling to manage the traffic; traffic version distribution and replica deployment are not independent in these systems. All replica pods, regardless of version, are treated the same in the kube-proxy round-robin pool, so the only way to manage the amount of traffic that a particular version receives is by controlling the replica ratio. Maintaining canary traffic at small percentages requires many replicas (e.g., 1% would require a minimum of 100 replicas). Even if we ignore this problem, the deployment approach is still very limited in that it only supports the simple (random percentage) canary approach. If, instead, we wanted to limit the visibility of the canary to requests based on some specific criteria, we still need another solution.
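The replica arithmetic in the quote can be worked out directly. When traffic is spread evenly across pods, the smallest setup that gives the canary an exact share is the reduced fraction of that share. A small sketch (the function name is mine):

```python
from fractions import Fraction


def min_replicas_for_share(percent):
    """Return (canary_pods, total_pods) for an exact canary traffic share.

    Assumes every pod receives an equal share of requests, so the traffic
    split equals the replica ratio. Pass `percent` as an int or a string
    such as "0.5" to keep the arithmetic exact.
    """
    share = Fraction(percent) / 100
    if not 0 < share < 1:
        raise ValueError("percent must be strictly between 0 and 100")
    return share.numerator, share.denominator


print(min_replicas_for_share(1))   # (1, 100)
print(min_replicas_for_share(5))   # (1, 20)
print(min_replicas_for_share(25))  # (1, 4)
```

So exposing 5% of customers through replica ratios alone already needs 20 pods, and 1% needs 100, which matches the figure in the quote.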
istio
With Istio, traffic routing and replica deployment are two completely independent functions. The number of pods implementing services are free to scale up and down based on traffic load, completely orthogonal to the control of version traffic routing. This makes managing a canary version in the presence of autoscaling a much simpler problem. Autoscalers may, in fact, respond to load variations resulting from traffic routing changes, but they are nevertheless functioning independently and no differently than when loads change for other reasons.
Istio’s routing rules also provide other important advantages; you can easily control fine-grained traffic percentages (e.g., route 1% of traffic without requiring 100 pods) and you can control traffic using other criteria (e.g., route traffic for specific users to the canary version). To illustrate, let’s look at deploying the helloworld service and see how simple the problem becomes.
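As a rough sketch of what such a weight-based rule looks like, the snippet below builds a VirtualService manifest as a Python dict and prints it as JSON. The host and subset names are placeholders, and it assumes a DestinationRule defining the v1 and v2 subsets already exists; check the field names against the Istio reference for your version before using it:

```python
import json


def weighted_virtual_service(host, weights):
    """Build a VirtualService that splits traffic between subsets by weight.

    `weights` maps subset name -> integer percentage and must sum to 100.
    The subsets are assumed to be defined in a separate DestinationRule.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    routes = [
        {"destination": {"host": host, "subset": subset}, "weight": weight}
        for subset, weight in weights.items()
    ]
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {"hosts": [host], "http": [{"route": routes}]},
    }


# 1% of requests go to the canary, regardless of how many pods each version runs.
manifest = weighted_virtual_service("helloworld", {"v1": 99, "v2": 1})
print(json.dumps(manifest, indent=2))
```

Shifting more traffic to the canary is then a matter of changing the weights, not the replica counts.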
There is an example.
Here are additional resources you may want to check about traffic mirroring and traffic shifting in Istio:
https://istio.io/latest/docs/tasks/traffic-management/mirroring/
https://itnext.io/use-istio-traffic-mirroring-for-quicker-debugging-a341d95d63f8
https://dev.to/peterj/mirroring-traffic-with-istio-service-mesh-2cm4
https://livebook.manning.com/book/istio-in-action/chapter-5/v-7/130
https://istio.io/latest/docs/tasks/traffic-management/traffic-shifting/#apply-weight-based-routing

Redis-cluster Helm chart unable to complete job when using istio

When using the Bitnami Helm chart for Redis-Cluster, there is a redis-cluster-cluster-create job. However, when enabling istio-injection, this job never ends. If I disable istio-injection, the job quickly ends. Any solutions or reason why this phenomenon is happening?
Answering the main question
there is a redis-cluster-cluster-create job. However, when enabling istio-injection, this job never ends. If I disable istio-injection, the job quickly ends. Any solutions or reason why this phenomenon is happening?
The main issue here is that a Job is not considered complete until all of its containers have stopped running, and the Istio sidecar runs indefinitely. So while your task may have completed, the Job as a whole will not appear as completed in Kubernetes.
There is a GitHub issue about that.
Here is one of the workarounds, and you can find more workarounds here.
I can change the podAnnotations in the Redis-Cluster Helm chart, and when istio-injection is disabled, the Job doesn't spin up istio-proxy. However, the main 'cluster-create' job never ends, and eventually fails the deploy
As mentioned here
So as a temporary workaround adding sidecar.istio.io/inject: “false” is possible but this disables Istio for any traffic to/from the annotated Pod. As mentioned, we leveraged Kubernetes Jobs for Integration Testing, which meant some tests may need to access the service mesh. Disabling Istio essentially means breaking routing — a show stopper.
So it might actually not work here. I suggest trying the quitquitquit approach, as it's the most recommended workaround.
It is also worth checking these GitHub issues:
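For reference, the quitquitquit idea is: run the Job's real command, then ask the sidecar to shut down so that every container in the pod exits. A minimal sketch of a wrapper, assuming the sidecar agent accepts POST /quitquitquit on localhost port 15020 (verify the port and path for your Istio version; the function name is mine):

```python
import subprocess
import sys
import urllib.request

# Assumption: the sidecar agent accepts POST /quitquitquit on this local port.
QUIT_URL = "http://127.0.0.1:15020/quitquitquit"


def run_then_stop_sidecar(command, opener=urllib.request.urlopen, quit_url=QUIT_URL):
    """Run the Job's command, then ask the Istio sidecar to exit.

    Returns the command's exit code so the Job still fails if the task failed.
    `opener` is injectable so the function can be exercised without a sidecar.
    """
    result = subprocess.run(command)
    request = urllib.request.Request(quit_url, data=b"", method="POST")
    try:
        opener(request, timeout=5)
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        print(f"could not reach the sidecar: {exc}", file=sys.stderr)
    return result.returncode
```

In a Job you would make this wrapper the container command and pass the original command through, for example `sys.exit(run_then_stop_sidecar(sys.argv[1:]))`. Whether you can override the command depends on what the Helm chart exposes.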
List of applications incompatible with Istio
helm stable/redis does not work with istio sidecar
Using Istio with CronJobs
Sidecar Containers

Creating a kubernetes cluster on GCP using Spinnaker

For end-to-end DevOps automation I want to have an environment on demand. For this I need to spin up an environment on Kubernetes, which is ultimately hosted on GCP.
My Use case
1. Developer checks in the code on a feature branch
2. Environment is spun up on Google Cloud with Kubernetes
3. Application gets deployed on Kubernetes
4. Application gets tested and then the environment gets destroyed.
I am able to do everything with Spinnaker except #2, i.e. creating a Kubernetes cluster on GCP using Spinnaker.
Any help please
Thanks,
Amol
I'm not sure Spinnaker was meant for doing what the second point in your list describes. Spinnaker assumes an existing collection of resources (VMs or a Kubernetes cluster) and then works with that. So instead of spinning up a new GKE cluster, Spinnaker makes use of existing clusters. I think it'd be better (for your costs as well ;) if you separate the environments using Kubernetes namespaces.
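If you go the namespace route, each feature branch needs a valid namespace name. Kubernetes namespace names must be DNS labels: lowercase alphanumerics and '-', starting and ending with an alphanumeric, at most 63 characters. A small sketch of deriving one from a branch name (the "env" prefix and function name are just assumptions for the example):

```python
import re


def namespace_for_branch(branch, prefix="env"):
    """Turn a Git branch name into a valid Kubernetes namespace name."""
    # Collapse every run of disallowed characters into a single '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    name = f"{prefix}-{slug}" if slug else prefix
    # Enforce the 63-character limit without leaving a trailing '-'.
    return name[:63].rstrip("-")


print(namespace_for_branch("feature/JIRA-123_login"))  # env-feature-jira-123-login
```

The pipeline can then create that namespace on deploy and delete it after the tests, which tears down everything inside it.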

Automatic cluster setup and app deployment on GCE Kubernetes

We are looking for a solid, declarative (YAML-based) procedure to automate the setup of our Kubernetes cluster and application deployments on Google Container Engine.
As our last resort in a serious failure we want to be able to:
Create a new GCE cluster
Execute all our deployments to their latest versions
Execute all the steps in the correct order
What solutions are people currently using? Doing this manually takes us about an hour and is error prone. It could really take 15-20 minutes if automated.
You should take a look at Google Cloud Deployment Manager. It "automates the creation and management of your Google Cloud Platform resources for you" meaning that it can create a Google Container Engine cluster as well as create your deployments.
Looking through the GKE deployment manager example should help get you started.
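To give an idea of the shape, Deployment Manager templates can be written in Python. Below is a minimal sketch of a template that declares a cluster. The resource type and property names are written from memory of the GKE sample, and the "-cluster" naming is my own, so treat them as assumptions and verify against the linked example:

```python
def GenerateConfig(context):
    """Deployment Manager template entry point: declare one GKE cluster.

    `context.env` and `context.properties` are supplied by Deployment
    Manager; `zone` and `initialNodeCount` come from the config that
    imports this template.
    """
    cluster_name = context.env["deployment"] + "-cluster"
    return {
        "resources": [
            {
                "name": cluster_name,
                "type": "container.v1.cluster",
                "properties": {
                    "zone": context.properties["zone"],
                    "cluster": {
                        "name": cluster_name,
                        "initialNodeCount": context.properties["initialNodeCount"],
                    },
                },
            }
        ]
    }
```

Because the template is declarative, recreating the cluster after a failure is a matter of creating the deployment again with the same config.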