EKS Anywhere cluster creation stuck - amazon-eks

I am trying to create an EKS Anywhere local cluster on a CentOS machine. However, the cluster creation is stuck and I don't see any more output on the screen. I have enabled debug logging to get more output. Please see the screenshot of the logs and advise if I am missing anything. I have been following the link below to create the EKS Anywhere local cluster.
Link: https://aws.amazon.com/blogs/aws/amazon-eks-anywhere-now-generally-available-to-create-and-manage-kubernetes-clusters-on-premises/
Here is the screenshot of the logs:
[logs screenshot]
Here is the screenshot of the cluster create YAML file:
[cluster create yaml file]

Related

Permission denied error for Bitnami Redis cluster TLS connection using Helm chart?

I have recently tried to deploy redis-cluster on a Kubernetes cluster using a Helm chart. I am following the link below:
https://github.com/bitnami/charts/tree/master/bitnami/redis-cluster
For the Helm deployment I have used values-production.yaml. The default deployment was successful and I was able to create a Redis cluster with three masters and three replicas.
I am checking on two things currently:
How to enable container logs: as per the official docs they should be written to "/opt/bitnami/redis/logs", but I haven't seen any logs there.
From the official docs I learned that a log file name should be mentioned in redis.conf, but currently it is an empty string (""). I am not sure how and where to pass the log file name so that it ends up in redis.conf.
I have tried to enable TLS as well. I generated the certificates as described in the redis.io TLS official docs. After that I created the secret mentioned in the bitnami/tls section and passed the certificates into the secret.
Then I passed the secret name and all the certificates in values-production.yaml and deployed the Helm chart, but it gave me a permission denied error message for libfile.sh at line number 37.
When I checked the pod status, three of the six pods were in the Running 2/2 state and three were in the 1/2 CrashLoopBackOff state.
After logging in to a running pod I was able to verify that the certificates were placed at "/opt/bitnami/redis/certs/", and the certificate changes were also reflected in redis.conf.
Please let me know how to make configuration changes in redis.conf using the Bitnami Redis Helm chart and how to resolve the above two issues.
My understanding is that for any redis.conf-related change I have to pass values in the values-production.yaml file. Please confirm. Thank you.
Bitnami developer here
My first recommendation for you is to open an issue at https://github.com/bitnami/charts/issues if you are struggling with the Redis Cluster chart.
Regarding the logs, as it's mentioned at https://github.com/bitnami/bitnami-docker-redis-cluster#logging:
The Bitnami Redis-Cluster Docker image sends the container logs to stdout
Therefore, you can simply access the logs by running the following (substitute "POD_NAME" with the actual name of any of your Redis pods):
kubectl logs POD_NAME
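If you prefer to pull those same stdout logs programmatically rather than through kubectl, here is a minimal sketch using the official Kubernetes Python client; the pod name and namespace are placeholders you would substitute with your own:
# Minimal sketch: read the stdout logs of a Redis pod through the Kubernetes API.
# "redis-cluster-0" and "default" are placeholder values, not taken from the chart.
from kubernetes import client, config

config.load_kube_config()   # uses your local ~/.kube/config

v1 = client.CoreV1Api()
logs = v1.read_namespaced_pod_log(
    name="redis-cluster-0",  # placeholder pod name
    namespace="default",     # placeholder namespace
    tail_lines=100,          # only fetch the last 100 lines
)
print(logs)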
Finally, with respect to the TLS configuration, I guess you're following this guide, right?

AWS EMR - how to copy files to all the nodes?

Is there a way to copy a file to all the nodes in an EMR cluster through the EMR command line? I am working with Presto and have created my custom plugin. The problem is that I have to install this plugin on all the nodes, and I don't want to log in to each node and copy it.
You can add it as a bootstrap script to let this happen during the launch of the cluster.
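For illustration, a bootstrap action is attached when the cluster is launched; here is a rough boto3 sketch of that approach. The bucket, script path, release label, and instance settings are placeholders, not values taken from the question.
# Rough sketch: launch an EMR cluster with a bootstrap action that installs a
# Presto plugin on every node. All names and paths below are placeholders.
import boto3

emr = boto3.client('emr')

response = emr.run_job_flow(
    Name='presto-cluster',
    ReleaseLabel='emr-6.9.0',  # placeholder release label
    Applications=[{'Name': 'Presto'}],
    Instances={
        'InstanceGroups': [
            {'InstanceRole': 'MASTER', 'InstanceType': 'm5.xlarge', 'InstanceCount': 1},
            {'InstanceRole': 'CORE', 'InstanceType': 'm5.xlarge', 'InstanceCount': 2},
        ],
        'KeepJobFlowAliveWhenNoSteps': True,
    },
    BootstrapActions=[
        {
            'Name': 'install-presto-plugin',
            'ScriptBootstrapAction': {
                # Placeholder script that copies the plugin from S3 onto each node
                'Path': 's3://my-bucket/bootstrap/install_plugin.sh',
            },
        }
    ],
    JobFlowRole='EMR_EC2_DefaultRole',
    ServiceRole='EMR_DefaultRole',
)
print(response['JobFlowId'])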
If you have the control to bring up a new EMR cluster, you should consider using an EMR bootstrap script.
But in case you want to do it on an existing EMR cluster (bootstrap actions are only available at launch time),
you can do it with the help of AWS Systems Manager (SSM) and the built-in EMR client.
Something like this (Python):
import boto3

emr_client = boto3.client('emr')
ssm_client = boto3.client('ssm')
You can get the list of core instances using emr_client.list_instances,
and finally send a command to each of these instances using ssm_client.send_command, as shown in the sketch below.
Ref: check the last detailed example, "Example: Installing Libraries on Core Nodes of a Running Cluster", at https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyterhub-install-kernels-libs.html#emr-jupyterhub-install-libs
Note: if you are going with SSM, you need to have the proper SSM IAM policy attached to the IAM role of your master node.
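Putting those pieces together, a rough sketch of the SSM approach could look like the following. The cluster ID, S3 path, and plugin directory are placeholders, and the commands themselves are only an assumption of what "install the plugin" means for your setup.
# Rough sketch: push a Presto plugin to every core node of a running EMR
# cluster by sending a shell command through SSM. All IDs and paths are placeholders.
import boto3

emr_client = boto3.client('emr')
ssm_client = boto3.client('ssm')

cluster_id = 'j-XXXXXXXXXXXX'  # placeholder cluster ID

# 1. List the EC2 instance IDs of the core nodes.
core_instances = emr_client.list_instances(
    ClusterId=cluster_id,
    InstanceGroupTypes=['CORE'],
)
instance_ids = [i['Ec2InstanceId'] for i in core_instances['Instances']]

# 2. Send a shell command to each of those instances via SSM.
response = ssm_client.send_command(
    InstanceIds=instance_ids,
    DocumentName='AWS-RunShellScript',
    Parameters={
        'commands': [
            'aws s3 cp s3://my-bucket/plugins/my-plugin.zip /tmp/',
            'sudo unzip -o /tmp/my-plugin.zip -d /usr/lib/presto/plugin/',
        ]
    },
)
print(response['Command']['CommandId'])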

Failure to start a Neptune notebook

I can't seem to create a Neptune notebook; every time I try, I get the following error:
Notebook Instance Lifecycle Config 'arn:aws:sagemaker:us-west-2:XXXXXXXX:notebook-instance-lifecycle-config/aws-neptune-tutorial-lc'
for Notebook Instance 'arn:aws:sagemaker:us-west-2:XXXXXXXXX:notebook-instance/aws-neptune-tutorial'
took longer than 5 minutes.
Please check your CloudWatch logs for more details if your Notebook Instance has Internet access.
Note that the CloudWatch logs it suggests looking at don't exist.
The Neptune database was created using this CloudFormation template: https://github.com/awslabs/aws-cloudformation-templates/blob/master/aws/services/NeptuneDB/Neptune.yaml
which created the Neptune cluster in the default VPC.
The notebook instance was created using this CloudFormation template: https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/cloudformation-templates/neptune-sagemaker/neptune-sagemaker-nested-stack.json
passing in the relevant values from the created Neptune stack.
Has anyone seen this type of error and know how to get past it?
I had to go in and modify the predefined install script used by Neptune and add a nohup command to the final section of the install, as described here: https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-lifecycle-script-timeout/
Probably what is happening is that your notebook instance does not have access to the internet. Check the NAT configuration for your VPC, and make sure its security groups allow all outbound traffic.

Is the Heptio Authenticator deployed automatically when creating an EKS cluster?

I have done the following steps:
Created an EKS Cluster
Installed aws-iam-authenticator client binary
Execute "aws eks update-kubeconfig --name <cluster_name>"
Execute "kubectl get svc"
I am able to view the services available in my cluster. When I look at the ~/.kube/config file, it uses an external command called "aws-iam-authenticator".
My understanding is that "aws-iam-authenticator" uses my ~/.aws/credentials, retrieves a token from AWS (aws-iam-authenticator token -i cluster-1), and uses that token for the "kubectl get svc" command. Is my understanding correct?
If my understanding is correct, where does Heptio come into the picture in this flow? Is the Heptio Authenticator deployed automatically when creating the EKS cluster?
Basically, Heptio Authenticator = aws-iam-authenticator.
You can check the details here. If your aws-iam-authenticator is working fine, then you don't need to worry about Heptio separately; they just renamed it.
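If you are curious what that token exchange actually looks like, here is a hedged sketch of roughly what aws-iam-authenticator does under the hood: it presigns an STS GetCallerIdentity request with the cluster name bound in a header and base64-encodes the resulting URL as the bearer token. The cluster name below is a placeholder, and this is an illustration of the mechanism rather than the tool's exact code.
# Rough illustration: build an EKS bearer token from local AWS credentials by
# presigning STS GetCallerIdentity with the cluster name in a header.
# "cluster-1" is a placeholder cluster name.
import base64
import boto3
from botocore.signers import RequestSigner

cluster_name = 'cluster-1'
session = boto3.session.Session()
region = session.region_name

sts = session.client('sts', region_name=region)
signer = RequestSigner(
    sts.meta.service_model.service_id,
    region,
    'sts',
    'v4',
    session.get_credentials(),
    session.events,
)

request = {
    'method': 'GET',
    'url': f'https://sts.{region}.amazonaws.com/?Action=GetCallerIdentity&Version=2011-06-15',
    'body': {},
    'headers': {'x-k8s-aws-id': cluster_name},  # binds the token to the cluster
    'context': {},
}

# kubectl fetches a fresh, short-lived token on every call via the exec
# plugin configured in ~/.kube/config.
signed_url = signer.generate_presigned_url(
    request, region_name=region, expires_in=60, operation_name=''
)
token = 'k8s-aws-v1.' + base64.urlsafe_b64encode(
    signed_url.encode('utf-8')
).decode('utf-8').rstrip('=')
print(token)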

Amazon EMR Spark Cluster: output/result not visible

I am running a Spark cluster on Amazon EMR and running the PageRank example programs on the cluster.
While running the programs on my local machine, I am able to see the output properly, but the same doesn't work on EMR: the S3 folder only shows empty files.
The commands I am using:
For starting the cluster:
aws emr create-cluster --name SparkCluster --ami-version 3.2 --instance-type m3.xlarge --instance-count 2 \
--ec2-attributes KeyName=sparkproj --applications Name=Hive \
--bootstrap-actions Path=s3://support.elasticmapreduce/spark/install-spark \
--log-uri s3://sampleapp-amahajan/output/ \
--steps Name=SparkHistoryServer,Jar=s3://elasticmapreduce/libs/script-runner/script-runner.jar,Args=s3://support.elasticmapreduce/spark/start-history-server
For adding the job:
aws emr add-steps --cluster-id j-9AWEFYP835GI --steps \
Name=PageRank,Jar=s3://elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn-cluster,--class,SparkPageRank,s3://sampleapp-amahajan/pagerank_2.10-1.0.jar,s3://sampleapp-amahajan/web-Google.txt,2],ActionOnFailure=CONTINUE
After a few unsuccessful attempts, I made the job write its output to a text file, and the file is successfully created when I run on my local machine. But I am unable to see it when I SSH into the cluster. I tried FoxyProxy to view the logs for the instances, and nothing shows up there either.
Could you please let me know where I am going wrong?
Thanks!
How are you writing the text file locally? Generally, EMR jobs save their output to S3, so you could use something like outputRDD.saveToTextFile("s3n://<MY_BUCKET>"). You could also save the output to HDFS, but storing the results in S3 is useful for "ephemeral" clusters, where you provision an EMR cluster, submit a job, and terminate it upon completion.
"While running the programs on my local machine, I am able to see the
output properly. But the same doesn't work on EMR. The S3 folder only
shows empty files"
For the benefit of newbies:
If you are printing output to the console, it will be displayed in local mode, but when you execute on an EMR cluster the reduce operation is performed on the worker nodes, and they can't write to the console of the master/driver node!
With the proper path you should be able to write the results to S3.
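As a concrete illustration of that last point, here is a minimal PySpark sketch that writes the results to S3 instead of printing them on the driver; the bucket placeholder and output prefix are assumptions, not paths from the question.
# Minimal sketch: write (page, rank) results to S3 so they survive the cluster.
# The bucket and output prefix are placeholders.
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName('PageRankOutputToS3')
sc = SparkContext(conf=conf)

# Pretend these are the final (page, rank) pairs computed by the job.
ranks = sc.parallelize([('pageA', 1.0), ('pageB', 0.5)])

# Each worker writes its own partition; nothing is printed on the driver console.
ranks.saveAsTextFile('s3n://<MY_BUCKET>/pagerank-output/')

sc.stop()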