AWS EC2 Instance Profile for S3 permissions inconsistent - amazon-s3

Background: I'm writing an automated deployment script to deploy a ruby on rails application to AWS on an EC2 instance using S3 as the storage for ActiveStorage. My script creates an instance profile/role and attaches it to the EC2 instance on creation. My script uses the ruby sdk for AWS.
Sometimes when my script runs, it works great (which tells me my configuration is correct). Sometimes it throws the following exception:
/home/ubuntu/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/aws-sigv4-1.2.1/lib/aws-sigv4/signer.rb:613:in `extract_credentials_provider': missing credentials, provide credentials with one of the following options: (Aws::Sigv4::Errors::MissingCredentialsError)
- :access_key_id and :secret_access_key
- :credentials
- :credentials_provider
I generally have success about 9 times out of 10 using a t3a.micro or t3.micro instance. I usually have a failure 9 times out of 10 using a t3a.nano or t3.nano instance.
It sure seems like there is something eventually consistent about these instance profiles, but I can't find anything in the documentation. What's going on, and what can I do to make this succeed consistently?
Thank you.

Related

AKS API Load testing error: Premature end of Content-Length delimited message body

While load testing, after some successful responses from the API, JMeter records errors:
'Premature end of Content-Length delimited message body'.
From logs inside the code the response seems to complete normally.
The APP is deployed on AKS with ingress nginx/1.15.10 controllers. The APP consists of 4 separate APIs (one master calling the 3 others). The APIs are created in FLASK with CONNEXION and run in a WSGIContainer on a Tornado HTTPServer.
Another confusing factor is that the APP is deployed on two AKS instances on the same cluster. The one deployment does not return errors and the other does.
What could be causing the error?
I would suggest to limit your testing scope.
1) target the application directly (bypassing the k8s svc and ingress controller). ensure you target each app running on the two different nodes. Do you still see the issue ?
2) target the app service directly (bypassing ingress controller), ensure you target each app running on the two different nodes. Do you still see the issue ?
3) target the app using its ingress, ensure you target each app running on the two different nodes. Do you still see the issue ?
Based on those results, we should be able to pinpoint better the source of your issue.

ERROR: The overall deployment failed because too many individual instances failed deployment

I'm trying to deploy using CircleCI -> S3 -> CodeDeploy -> EC2.
I was able to upload deploy image onto S3 from CircleCI, but unable to deploy S3 to EC2 instance. Here's the error.
The overall deployment failed because too many individual instances
failed deployment, too few healthy instances are available for
deployment, or some instances in your deployment group are
experiencing problems. (Error code: HEALTH_CONSTRAINTS)
The error was provided from CodeDeploy. I can't figure out why and how.
I'd appreciate if you give some advise.
If you are running on Ubuntu there might be plenty of reasons, here is a checklist can verify
Check code-deploy agent is installed on your EC2 Instance. Please refer this document to install code deploy agent.
https://docs.aws.amazon.com/codedeploy/latest/userguide/codedeploy-agent-operations-install-ubuntu.html
$ sudo service codedeploy-agent status
In case if you are running Ubuntu release 20.x and you get this error
./install:22:in block in method_missing': undefined method path' for
#<IO:> (NoMethodError)
try running the install file via this script
sudo ./install auto > /tmp/logfile
Check you have EC2 Instance Code Deploy Role -> Create a code deployment role and assign it to the Instance, https://docs.aws.amazon.com/codedeploy/latest/userguide/getting-started-create-service-role.html.
In case if you assign the EC2 Role after initiate, restart the server.
Check your appsec.yml file placement as per the top answer, try to avoid any long timeout in it.
Log into your instance check your error log
$ tail -f /var/log/aws/codedeploy-agent/codedeploy-agent.log
You should be able to figure out what caused the individual instances to fail by digging into the deployment instance details:
http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-view-instance-details.html
These should contain more detailed information about why your application was unable to be deployed.
This error is commonly due to problems in the configuration of the appSpec.yml or appSpec.json file (It depends on the format you are using).
"If you have any Hook I recommend that you remove them, check if it works, then you can add one by one (the Hooks) and so you can identify the error"
The appspec.yml file should be located at the root of your project:
│-- appspec.yml
│-- index.html
└-- scripts
│-- install_dependencies
│-- start_server
└-- stop_server
In the scripts folder you will have to place the processes that you want to be executed according to the Hook
Here is an example of the appspec.yml file
version: 0.0
os: linux
files:
- source: /index.html
destination: /var/www/html/
hooks:
BeforeInstall:
- location: scripts/install_dependencies
timeout: 300
runas: root
- location: scripts/start_server
timeout: 300
runas: root
ApplicationStop:
- location: scripts/stop_server
timeout: 300
runas: root
I hope I can help you 😃👻🕺🏾
Make sure the CodeDeploy Host Agent Service is running in your target EC2 instance.
The error you are facing is a generic error message thrown on any of the event failure which could be beforeblockTraffic, blockTraffic, ApplicationStop etc.
The first step in this case would be check whether code deploy agent is running or not if first event i.e. BeforeBlockTraffic event is failed.
As you can see in the screenshot below, the event failure message would tell you the exact error behind.
From the failed deployments, I can see all lifecycle events were skipped. Instance i-0bcc36e73851297f2 is currently in Stopped state but I can see the IAM instance profile is missing. Your Amazon EC2 instances need permission to access the Amazon S3 buckets or GitHub repositories where the applications that will be deployed by AWS CodeDeploy are stored. To launch Amazon EC2 instances that are compatible with AWS CodeDeploy, you must create an additional IAM role, an instance profile. 1
For such failures, you can always begin with a general troubleshooting checklist for a failed deployment 2 and then look for troubleshooting guides on Deployment Issues and Instance issues3.
1[http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-create-iam-instance-profile.html]1
2 [http://docs.aws.amazon.com/codedeploy/latest/userguide/troubleshooting-general.html]2
3 [http://docs.aws.amazon.com/codedeploy/latest/userguide/troubleshooting.html]3
Check the status of the Code Deploy Agent. In my case, the agent wasn't up.
Please check the role given to the ec2 machine(where the agent is running). It should have s3 access as well. This resolved my issue.
"The CodeDeploy agent did not find an AppSpec file within the unpacked revision directory at revision-relative path 'appspec.yml'"
Please place your appspec.yml file in your root folder to solve this error
To access your after script and before script
The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems.

Read-only web console access in ActiveMQ

I'm using ActiveMQ 5.10 and would like to create a user that has read-only access through the web console.
Red Hat published this article, mentioning that it's not really read only due to a bug in ActiveMQ.
According to the bug report AMQ-4567, the bug is fixed as of ActiveMQ 5.9. However, I'm not seeing it work appropriately.
I have tried a number of different configurations, with the most recent being two separate JAAS implementations, one for Jetty and one for ActiveMQ. The relevant property files are excerpted below.
I can mostly log in to the web console using the "system" user. But the guest user doesn't work at all. The application user (appuser) doesn't need access to the web console at all.
My authN/authZ needs are pretty trivial: one admin user, one application account, and one read-only monitoring account.
Is there any good way to get this working with a recent version of ActiveMQ (>= 5.9.0)?
groups.properties
admins=system
users=appuser,admin
guests=guest
users.properties
system={password redacted}
appuser=appuser
guest=guest
jetty-realm.properties
system: MD5:46cf1b5451345f5176cd70713e0c9e07,user,admin
guest: guest,guest
As an aside, I used the Jetty tutorial and the Rundeck instructions to figure out the jetty-realm.properties file and chapter 6 of ActiveMQ in Action to work out the ActiveMQ JAAS.
I was finally able to get to what I wanted by deploying the web console to an external Tomcat instance. I assume that when it runs out of process, it can't bypass security and so has to use whatever credentials you provide. In this case, I gave the Tomcat instance the read-only JMX user credentials.
It's not great, as there is no security trimmed UI. You can still attempt to create new destinations, delete destinations, etc. When you try with a read-only user, you get an error. That gets a "D" for UX, but a "B" for security.

On Google Cloud, what is the API method to stop an instance (not the gcutil or manual method)

Mysteriously, there appears to be no documented API call to stop a google cloud instance. In these docs:
https://developers.google.com/compute/docs/instances#stop_job
both the prior and following commands describe API calls to accomplishing, but not the very common task of shutting down an instance.
When I hacked the URL for getting GCE help on 'reseting' an instance, assuming the "delete" command probably existed, I go this valid page:
https://developers.google.com/compute/docs/reference/latest/instances/delete
but this talks about deleting "instance resources" rather than instances themselves. Confusing (to me).
So, is there, or is there not, an API call to shut down a google cloud VM instance?
Would this be what you are looking for ?
https://developers.google.com/compute/docs/api/python-guide#stoppinganinstance
Python API to stop Google Compute Engine Instance.
This would 'stop' the instance. As deleting an instance is the recommended way to stop an instance.

RUN#Cloud consistently throws me out during a heavy operation

I'm using a large app instance to run a basic java web application (GWT + Spring). There's an expensive operation within my application (report) which takes a long time to execute.
I've tried running it with the cloudbees SDK on my local machine with similar settings as it would be on the cloud and it seems to function just fine. It runs in about 3-4 minutes.
On the cloud, it seems to be taking longer. The problem isn't the fact that it takes long. What happens in that cloudbees terminates the session after 5 minutes and gives me an error in my browser saying 'Unable to connect to server. Please contact your administrator'. A report which doesn't take as long runs just fine. My application has a session timeout of 30 minutes, so that isn't a problem either.
What could possibly be going wrong? Is it something to do with cloudbees?
This may be due to proxy buffering of your request through the routing layer (revproxy) - so it most likely isn't a session timeout - but the http connection getting cut.
You can either set proxyBuffering=false via the bees CLI command (eg when you deploy the app) - this will ensure longer running connections can work.
Ideally, however, you could change the app slightly to return to the browser with some token which you can poll with to get completion status, as even with a connection that lasts that long, over the internet it may provide a bad experience vs locally.