ERROR: The overall deployment failed because too many individual instances failed deployment - amazon-s3

I'm trying to deploy using CircleCI -> S3 -> CodeDeploy -> EC2.
I was able to upload deploy image onto S3 from CircleCI, but unable to deploy S3 to EC2 instance. Here's the error.
The overall deployment failed because too many individual instances
failed deployment, too few healthy instances are available for
deployment, or some instances in your deployment group are
experiencing problems. (Error code: HEALTH_CONSTRAINTS)
The error was provided from CodeDeploy. I can't figure out why and how.
I'd appreciate if you give some advise.

If you are running on Ubuntu there might be plenty of reasons, here is a checklist can verify
Check code-deploy agent is installed on your EC2 Instance. Please refer this document to install code deploy agent.
https://docs.aws.amazon.com/codedeploy/latest/userguide/codedeploy-agent-operations-install-ubuntu.html
$ sudo service codedeploy-agent status
In case if you are running Ubuntu release 20.x and you get this error
./install:22:in block in method_missing': undefined method path' for
#<IO:> (NoMethodError)
try running the install file via this script
sudo ./install auto > /tmp/logfile
Check you have EC2 Instance Code Deploy Role -> Create a code deployment role and assign it to the Instance, https://docs.aws.amazon.com/codedeploy/latest/userguide/getting-started-create-service-role.html.
In case if you assign the EC2 Role after initiate, restart the server.
Check your appsec.yml file placement as per the top answer, try to avoid any long timeout in it.
Log into your instance check your error log
$ tail -f /var/log/aws/codedeploy-agent/codedeploy-agent.log

You should be able to figure out what caused the individual instances to fail by digging into the deployment instance details:
http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-view-instance-details.html
These should contain more detailed information about why your application was unable to be deployed.

This error is commonly due to problems in the configuration of the appSpec.yml or appSpec.json file (It depends on the format you are using).
"If you have any Hook I recommend that you remove them, check if it works, then you can add one by one (the Hooks) and so you can identify the error"
The appspec.yml file should be located at the root of your project:
│-- appspec.yml
│-- index.html
└-- scripts
│-- install_dependencies
│-- start_server
└-- stop_server
In the scripts folder you will have to place the processes that you want to be executed according to the Hook
Here is an example of the appspec.yml file
version: 0.0
os: linux
files:
- source: /index.html
destination: /var/www/html/
hooks:
BeforeInstall:
- location: scripts/install_dependencies
timeout: 300
runas: root
- location: scripts/start_server
timeout: 300
runas: root
ApplicationStop:
- location: scripts/stop_server
timeout: 300
runas: root
I hope I can help you 😃👻🕺🏾

Make sure the CodeDeploy Host Agent Service is running in your target EC2 instance.

The error you are facing is a generic error message thrown on any of the event failure which could be beforeblockTraffic, blockTraffic, ApplicationStop etc.
The first step in this case would be check whether code deploy agent is running or not if first event i.e. BeforeBlockTraffic event is failed.
As you can see in the screenshot below, the event failure message would tell you the exact error behind.

From the failed deployments, I can see all lifecycle events were skipped. Instance i-0bcc36e73851297f2 is currently in Stopped state but I can see the IAM instance profile is missing. Your Amazon EC2 instances need permission to access the Amazon S3 buckets or GitHub repositories where the applications that will be deployed by AWS CodeDeploy are stored. To launch Amazon EC2 instances that are compatible with AWS CodeDeploy, you must create an additional IAM role, an instance profile. 1
For such failures, you can always begin with a general troubleshooting checklist for a failed deployment 2 and then look for troubleshooting guides on Deployment Issues and Instance issues3.
1[http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-create-iam-instance-profile.html]1
2 [http://docs.aws.amazon.com/codedeploy/latest/userguide/troubleshooting-general.html]2
3 [http://docs.aws.amazon.com/codedeploy/latest/userguide/troubleshooting.html]3

Check the status of the Code Deploy Agent. In my case, the agent wasn't up.

Please check the role given to the ec2 machine(where the agent is running). It should have s3 access as well. This resolved my issue.

"The CodeDeploy agent did not find an AppSpec file within the unpacked revision directory at revision-relative path 'appspec.yml'"
Please place your appspec.yml file in your root folder to solve this error
To access your after script and before script

The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems.

Related

Calling a Jenkins job from a Codefresh pipeline fails with: x509: failed to load system roots and no roots provided

I have a Jenkins job which I would like to invoke from my Codefresh pipeline.
Using the following example from the Codefresh docs, I have my Codefresh pipeline configured and ready:
https://codefresh.io/docs/docs/integrations/jenkins-integration/#calling-jenkins-jobs-from-codefresh-pipelines
The resulting build runs with the following output:
Pulling image codefresh/cf-run-jenkins-job:latest
Pulled layer '1160f4abea84'
Pulled layer '6df1582e0e0e'
Digest: sha256:a95b23c24b51d5fc1705731f7d18c5134590b4bc61b91dcf5a878faf2aec60b3
Status: Downloaded newer image for codefresh/cf-run-jenkins-job:latest
INFO[0000] Going to trigger <jenkins_job_name> job on https://<jenkins_host>:8443
ERRO[0000] Post https://<jenkins_host>:8443/job/<jenkins_job_name>/build: x509: failed to load system roots and no roots provided
Successfully ran freestyle step: Triggering Jenkins Job
Reading environment variable exporting file contents.
Reading environment variable exporting file contents.
As you can see, the build fails to successfully trigger the Jenkins job.
After some research in the Internet I came to conclusion that this is an SSL certificate issue.
But I have no idea how to proceed from here on. What exactly is missing and where it should be configured. I would really appreciate any help here.
Do you know that kind of SSL configuration your Jenkins server has? Is it mutual authentication or just a server-side certificate? Is it self-signed or not?
Have you tried to call the Jenkins API on your own (outside of Codefresh) and SSL works fine?
Also I would suggest you open a support ticket (from the top right menu in the Codefresh UI) and make sure to mention the URL of the build that has this issue.

ActiveMQ 5.15 HTTP ERROR: 503

Run environment :linux (CentOS 7), JDK 1.8, & ActiveMQ 5.15
I started Activemq then visit the management page with Chrome,when I try to log in with the default username & password I get the following error;
HTTP ERROR: 503
Problem accessing /admin/. Reason:
Service Unavailable Powered by Jetty://
How can I resolve this problem?
I was getting this same error. It turns out that I had run it as root user originally, then later I stopped it and ran it as a non-root user. Certain data files that had been created and owned by the original root instance were not accessible to the non-root user.
Check the ownership of the files, and change them if necessary to match the user that the broker is running as.
Had the same issue.
Maybe something went wrong the extraction of the package.
I downloaded this:
wget https://archive.apache.org/dist/activemq/5.15.0/apache-activemq-5.15.0-bin.tar.gz
and extracted it with:
sudo tar -zxvf apache-activemq-5.15.0-bin.tar.gz -C /opt
then it worked for me.
My two cents:
I start with the activemq in Ubuntu Repo, but then later change to binary package from official website.
In my case, the repo version left an /etc/default/activemq config file, which runs activemq with user "activemq". It turns out in previous experiments, I did not kill the old processes running under "activemq" when I start activemq under my own user name. There are two activemq processes running under different user names, and when connecting to admin console, I have a 503.
I delete the /etc/default/activemq file, and kill all activemq processes running under "activemq", then restart activemq with my user name, the 503 is gone.

Build spinnaker with docker-compose, redirect to localhost

i build spinnaker using docker-compose follow here
but it always redirect to localhost, how can i fix this.
e.g.
http://localhost:8084/auth/redirect?to=http%3A%2F%2F192.168.99.100%3A9000%2F%23%2Finfrastructure
i set the host:0.0.0.0 in spinnaker-local.yml and configured deck apache2 with proxyPreserve=On, it's not working.
where is the configuration about 'redirect'?
All containers running well but fiat gets error mesages, like this:
WARN 1 --- [ecutionAction-1] c.n.s.fiat.roles.UserRolesSyncer : [] User permission sync failed. Server status is DOWN. Trying again in 10000 ms. Cause:(Provider: DefaultServiceAccountProvider) retrofit.RetrofitError: unexpected url: front50/serviceAccounts
i'm sure set fiat false, is this matter?
thanks.
The docker-compose link project is not available anymore. That deployment type is not supported anymore.
The easiest way i suggest for people to get started quick is by using Armory Open source Minnaker. It runs on top of a K3S small cluster and contains a functional spinnaker deployment.
Great way to get started.
I tried the debian local deployment and it failed all the time.
Enjoy your CD operations.

I am trying OpenShift origin, I cannot create application

I am trying OO on a RHEL Atomic Host. I spun up OO master as a container following this guide https://docs.openshift.org/latest/getting_started/administrators.html
After attaching a shell to the Master Container, I cannot deploy an app.
# oc new-app openshift/deployment-example
error: can't look up Docker image "openshift/deployment-example": Internal error occurred: Get https://registry-1.docker.io/v2/: net/htt p: request canceled while waiting for connection error: no match for "openshift/deployment-example"
The 'oc new-app' command will match arguments to the following types:
1. Images tagged into image streams in the current project or the 'openshift' project
- if you don't specify a tag, we'll add ':latest'
2. Images in the Docker Hub, on remote registries, or on the local Docker engine
3. Templates in the current project or the 'openshift' project
4. Git repository URLs or local paths that point to Git repositories
--allow-missing-images can be used to point to an image that does not exist yet.
See 'oc new-app -h' for examples.
The host needs proxy to access Internet. I have configured proxy in /etc/sysconfig/docker and that is how I could pull the origin image in the same place.
I have tried setting proxy for master and node with luck
https://docs.openshift.org/latest/install_config/http_proxies.html
It is possible that your proxy is terminating the connection. you can test by creating an internal registry, push image to that and then use
"oc new-app your.internal.registry/openshift/deployment-example"

Could not connect to ActiveMQ Server - activemq for mcollective failing

We are continuously getting this error:
2014-11-06 07:05:34,460 [main ] INFO SharedFileLocker - Database activemq-data/localhost/KahaDB/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: Failed to create directory 'activemq-data/localhost/KahaDB'
We have verified that activemq is running as activemq, we have verified that the owner of the directories are activemq. It will not create the directories automatically, and if we create them ourselves, it still gives the same error. The service starts fine, but it will just continuously spit out the same error. There is no lock file as it will not generate any files or directories.
Another way to fix this problem, in one step, is to create the missing symbolic link in /usr/share/activemq/. The permissions are already set properly on /var/cache/activemq/data/, but it seems the activemq RPM is not creating the symbolic link to that location as it should. The symbolic link should be as follows: /usr/share/activemq/activemq-data -> /var/cache/activemq/data/. After creating the symbolic link, restart the activemq service and the issue will be resolved.
I was able to resolve this by the following:
ensure activemq is owner and has access to /var/log/activemq and all sub dirs.
ensure /etc/init.d/activemq has: ACTIVEMQ_CONFIGS="/etc/sysconfig/activemq"
create file activemq in /etc/sysconfig if it doesnt exist.
add this line: ACTIVEMQ_DATA="/var/log/activemq/activemq-data/localhost/KahaDB"
The problem was that activeMQ 5.9.x was using /usr/share/activemq as its KahaDB location.