How to delete Kubernetes provider accounts attached to a cluster (the cluster has been deleted) - Spinnaker

I have many Kubernetes provider accounts. Two of them, 1) my-k8s-account and 2) my-test-account, are attached to the same Kubernetes cluster. I had to delete that cluster, and now when I run sudo hal deploy apply it shows a "namespaces not found" error for my-k8s-account. So I tried to run hal config kubernetes provider delete my-k8s-account, and it throws an error about issues with my-test-account. I tried to delete both with the same command, but even that did not work.
Can anyone help me?
Issue link on GitHub:
https://github.com/spinnaker/spinnaker.github.io/issues/996

You can use the --no-validate flag so that Halyard skips validating the cluster's config details:
hal config provider kubernetes account delete ACCOUNT --no-validate
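For example, to remove both accounts from the question and then re-apply the deployment (account names taken from the question above; --no-validate is a global Halyard flag, so it should also work on deploy apply), something like this should do it:
hal config provider kubernetes account delete my-k8s-account --no-validate
hal config provider kubernetes account delete my-test-account --no-validate
sudo hal deploy apply --no-validate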

Related

Getting AccessDenied error while upgrading EKS cluster

I am trying to upgrade my EKS cluster from 1.15 to 1.16 using the same CI pipeline which created the cluster, so the credentials have no issue. However, I am receiving an AccessDenied error. I am using the eksctl upgrade cluster command to upgrade the cluster.
info: cluster test-cluster exists, will upgrade it
[ℹ] eksctl version 0.33.0
[ℹ] using region us-east-1
[!] NOTE: cluster VPC (subnets, routing & NAT Gateway) configuration changes are not yet implemented
[ℹ] will upgrade cluster "test-cluster" control plane from current version "1.15" to "1.16"
Error: AccessDeniedException:
status code: 403, request id: 1a02b0fd-dca5-4e54-9950-da29cac2cea9
My eksctl version is 0.33.0.
I am not sure why the same CI pipeline which created the cluster is now throwing an Access Denied error when trying to upgrade it. Is there any permission I need to add to the IAM policy for the user? I don't find anything in the prerequisites document, so please let me know what I am missing here.
I figured out the error was due to a missing IAM permission.
I used --verbose 5 to diagnose the issue.
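For reference, this is roughly how the upgrade can be re-run with maximum logging to surface the exact API call being denied (cluster name and region taken from the log output above; in cases like this the denied action is typically an EKS one such as eks:UpdateClusterVersion, which then has to be added to the pipeline user's IAM policy):
eksctl upgrade cluster --name test-cluster --region us-east-1 --approve --verbose 5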

AWS EKS node group migration stopped sending logs to Kibana

I have encountered a problem while using EKS with fluent-bit, and I would be grateful for the community's help. First I'll describe the cluster.
We are running an EKS cluster in a VPC that had an unmanaged node group.
The EKS cluster network configuration is marked as "public and private", and we use fluent-bit with the Elasticsearch service to show logs in Kibana.
We decided that we want to move to a managed node group in that cluster, and therefore migrated from the unmanaged node group to a managed node group successfully.
Since our migration we cannot see any logs in Kibana; when getting the logs manually from the fluent-bit pods there are no errors.
I toggled debug-level logs for fluent-bit to get a better look at it.
I can see that fluent-bit gathers all the log files, and then we get these messages:
[debug] [out_es] HTTP Status=403 URI=/_bulk
[debug] [retry] re-using retry for task_id=63 attemps=3
[debug] [sched] retry=0x7ff56260a8e8 63 in 321 seconds
Furthermore, we have managed node groups in other EKS clusters, but we did not migrate to those; they were created with managed node groups from the start.
The new managed node group was created from the same template as a working managed node group; the only difference is the compute power.
The template has nothing special in it except autoscaling.
I compared the node group IAM role of the working node group and my non-working node group, and the roles seem to be the same.
As for my fluent-bit configuration, I have the same configuration in a few other EKS clusters and it works there, so I don't think that is the root cause, but I can add it if requested.
Has anyone had this kind of problem? Why could a node group migration cause such an issue?
Thanks in advance!
Lesson learned: always look at the access policy of the resource you are having issues with; it may not match your node group role.
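A quick way to check this (the domain, cluster, and node group names below are placeholders, not taken from the question) is to compare the Elasticsearch domain's access policy with the role attached to the new managed node group:
aws es describe-elasticsearch-domain-config --domain-name my-log-domain --query 'DomainConfig.AccessPolicies.Options' --output text
aws eks describe-nodegroup --cluster-name my-cluster --nodegroup-name my-managed-nodegroup --query 'nodegroup.nodeRole'
If the role ARN returned by the second command is not allowed by the access policy returned by the first, a 403 on /_bulk is exactly what you would expect.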

How to update existing deployment in AWS EKS Cluster?

I have my application deployed in an AWS EKS cluster, and now I want to update the deployment with the new image that I created from the recent Git commit.
I did try to use:
kubectl set image deployment/mydeploy mydeploy=ECR:2.0
error: unable to find container named "stag-simpleui-deployment"
I also tried:
kubectl rolling-update mydeploy mydeploy.2.0 --image=ECR:2.0
Command "rolling-update" is deprecated, use "rollout" instead
Error from server (NotFound): replicationcontrollers "stag-simpleui-deployment" not found
It is confusing with so many articles saying different things, but none of them works.
I was able to crack it. In the command below, the name before the "=" must match the container name as it appears under spec.template.spec.containers when you run kubectl edit deployment mydeploy (not the deployment name):
kubectl set image deployment/mydeploy mydeploy=ECR:2.0
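A minimal sketch of the full flow, assuming the deployment is called mydeploy (the ECR image URI here is a made-up placeholder):
kubectl get deployment mydeploy -o jsonpath='{.spec.template.spec.containers[*].name}'
kubectl set image deployment/mydeploy CONTAINER_NAME=123456789012.dkr.ecr.us-east-1.amazonaws.com/simpleui:2.0
kubectl rollout status deployment/mydeploy
The first command prints the container name(s) to put on the left side of the "=", and the last one waits for the rollout to finish.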

Trouble setting up cert-manager without Helm or RBAC on GKE

I believe I have followed this guide:
https://medium.com/@hobochild/installing-cert-manager-on-a-gcloud-k8s-cluster-d379223f43ff
which has me install the without-rbac version of cert-manager from this repo:
https://github.com/jetstack/cert-manager
however when the cert-manager pod boots up it starts spamming this error:
leaderelection.go:224] error retrieving resource lock cert-manager/cert-manager-controller: configmaps "cert-manager-controller" is forbidden: User "system:serviceaccount:cert-manager:default" cannot get configmaps in the namespace "cert-manager": Unknown user "system:serviceaccount:cert-manager:default"
Hoping someone has some ideas.
The errors seem to be coming from RBAC. If you're running this in minikube you can grant the default service account in the cert-manager namespace the proper rights by running:
kubectl create clusterrolebinding cert-manager-cluster-admin --clusterrole=cluster-admin --serviceaccount=cert-manager:default
After creating the role binding, cert-manager should complete its startup.
You should use the 'with-rbac.yaml' variant if you are installing in GKE, unless you have explicitly disabled RBAC on the GKE cluster!
This should resolve the issues you're seeing here, as by the looks of your error message, you do have RBAC enabled!
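If you want to confirm that before reinstalling, a quick check (a plain kubectl query, nothing cert-manager specific) is:
kubectl api-versions | grep rbac.authorization.k8s.io
If this prints the rbac.authorization.k8s.io API group, the cluster is serving the RBAC APIs, which on GKE strongly suggests RBAC is enforced and the with-rbac.yaml manifest is the one to apply.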

ECONNREFUSED on Redis, what to do?

I have been working on this for days now, and I can't figure out what is wrong.
Everything else is working, but I get "ECONNREFUSED" on Redis.
I have the following instances running:
app01 ROLE: app
web01 ROLE: web
db01 ROLE:db:primary
redis01 ROLE:redis_master
redis02 ROLE:redis_slave
sidekiq01 ROLE:redis
Here is the error from the production log:
Redis::CannotConnectError (Error connecting to Redis on localhost:6379 (ECONNREFUSED)):
app/models/user.rb:63:in `send_password_reset'
app/controllers/password_resets_controller.rb:10:in `create'
Everything is set up using the rubber gem.
I have tried to remove all instances and start over from scratch twice. I have also tried to make a custom security rule, but I'm not sure if I did it right.
Please help me!
Bringing this post back from the dead because I found it when I was struggling with the same problem today. I resolved my problem by doing the following:
I added redis_slave or redis_master roles to the servers using cap rubber:add_role. I found this will add both the specified role and the generic "redis" role. Assuming that you want redis01 to be the only redis_master after adding roles, I'd expect your environment to have:
app01 ROLE: app
web01 ROLE: web
db01 ROLE:db:primary
redis01 ROLE:redis_master
redis01 ROLE:redis
redis02 ROLE:redis_slave
redis02 ROLE:redis
sidekiq01 ROLE:redis_slave
sidekiq01 ROLE:redis
After setting up roles, I updated the servers with cap rubber:bootstrap
In my environment, I'm deploying code from git, so I had to commit these changes and run cap -s branch="branch_name_or_sha" deploy to get rubber/deploy-redis.rb on the servers with the new roles and execute it.
After doing all this, redis runs on all my nodes without throwing Redis::CannotConnectError (Error connecting to Redis on localhost:6379 (ECONNREFUSED)) error on any of them.
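As a rough recap of that sequence (host and role names from this thread; the exact prompts from the rubber tasks may differ between rubber versions):
cap rubber:add_role          # assign redis_master to redis01 and redis_slave to redis02/sidekiq01
cap rubber:bootstrap         # re-run bootstrap so the new roles are configured on the servers
git commit -am "add redis roles"
cap -s branch="branch_name_or_sha" deploy   # deploy so rubber/deploy-redis.rb runs with the new roles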
Good Luck!