How can I mount an EFS share to AWS Fargate? - aws-fargate

I have an AWS EFS share where I store container logs.
I would like to mount this NFS share (AWS EFS) in AWS Fargate. Is it possible?
Any supporting documentation link would be appreciated.

You have been able to do this since April 2020! It's a little tricky but it works.
The biggest gotcha I ran into was that you need to set the "Platform version" to 1.4.0 - it will default to "Latest" which is 1.3.0.
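For example, if you create the service with the AWS CLI you can pin the platform version explicitly; a minimal sketch where the cluster, service, task definition, subnet and security group IDs are placeholders:
# platform version 1.4.0 (or later) is required for EFS volumes on Fargate
aws ecs create-service \
  --cluster my-cluster \
  --service-name my-service \
  --task-definition my-efs-task:1 \
  --desired-count 1 \
  --launch-type FARGATE \
  --platform-version 1.4.0 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxx],securityGroups=[sg-xxxxxxx],assignPublicIp=DISABLED}"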
In the task definition you need to define a volume, and in the container definition a mount point for where you want the EFS share mounted inside the container:
Volume:
"volumes": [
{
"efsVolumeConfiguration": {
"transitEncryptionPort": null,
"fileSystemId": "fs-xxxxxxx",
"authorizationConfig": {
"iam": "DISABLED",
"accessPointId": "fsap-xxxxxxxx"
},
"transitEncryption": "ENABLED",
"rootDirectory": "/"
},
"name": "efs volume name",
"host": null,
"dockerVolumeConfiguration": null
}
]
Mount volume in container:
"mountPoints": [
{
"readOnly": null,
"containerPath": "/opt/your-app",
"sourceVolume": "efs volume name"
}
These posts helped me although they're missing a few details:
Tutorial: Using Amazon EFS file systems with Amazon ECS
EFSVolumeConfiguration

EFS support for Fargate is now available!
https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-ecs-aws-fargate-support-amazon-efs-filesystems-generally-available/

EDIT: Since April 2020 this answer is no longer accurate. It describes the situation before Fargate platform version 1.4.0; if you are on an earlier platform version it is still relevant, otherwise see the newer answers.
Unfortunately it is not currently possible to use persistent storage with AWS Fargate. Progress on this feature can be tracked on the newly launched public roadmap [1] for AWS container services [2].
Your use case seems to suggest logs. Have you considered using the awslogs driver [3] and shipping your application logs to CloudWatch Logs?
[1] https://github.com/aws/containers-roadmap/projects/1
[2] https://github.com/aws/containers-roadmap/issues/53
[3] https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
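If you go the CloudWatch Logs route, each container definition needs a logConfiguration block roughly like the sketch below; the log group name and region are placeholders, and the log group must already exist (or you can add "awslogs-create-group": "true"):
"logConfiguration": {
    "logDriver": "awslogs",
    "options": {
        "awslogs-group": "/ecs/my-app",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "ecs"
    }
}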

Wow, you do indeed need platform version 1.4.0, as @TheFiddlerWins suggested.

Related

How do I add Linux capabilities SYS_NICE and DAC_READ_SEARCH to container in AWS Fargate?

I'm trying to set up a task definition in ECS Fargate for running Koha containers, but Fargate won't accept
--cap-add=SYS_NICE --cap-add=DAC_READ_SEARCH
(or any kernel capabilities other than SYS_PTRACE) in the task definition JSON file. I tried adding "linuxParameters": {"capabilities": {"add": ["SYS_NICE", "DAC_READ_SEARCH"]}} to the task definition JSON file, but Fargate simply deletes it.
The mpm_itk module fails without these capabilities, and my container throws a 500 error with the following warning/error in the logs:
[mpm_itk:warn] [pid 17146] (itkmpm: pid=17146 uid=33, gid=33) itk_post_perdir_config(): setgid(1000): Operation not permitted
How do I work around this? Is there a way to pass on these capabilities after the container has started up? Any help will be appreciated, thanks!
According to the AWS documentation, Fargate only allows you to add the SYS_PTRACE kernel capability. It is not possible to add any other capabilities at the moment. The only viable workaround I can see is to use ECS on EC2.
The container created by Docker's runc is bounded by a default capability set, i.e.
0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
The container can only be granted capabilities from this set.
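For comparison, on the EC2 launch type the capabilities from the question can be added in the container definition; a sketch where the container name and image are placeholders:
"containerDefinitions": [
    {
        "name": "koha",
        "image": "your-image:latest",
        "linuxParameters": {
            "capabilities": {
                "add": ["SYS_NICE", "DAC_READ_SEARCH"]
            }
        }
    }
]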

VM import on AWS gave me 'ClientError: Boot disk is not using MBR partitioning.' error

After a 10-hour upload to AWS S3, I tried to import the VM using this command:
aws ec2 import-image --description "My server VM" --disk-containers "file://C:\import\containers.json"
but I got this response while AWS was processing the import:
{
    "ImportImageTasks": [
        {
            "Description": "myownVM",
            "ImportTaskId": "import-ami-guid",
            "Platform": "Windows",
            "SnapshotDetails": [
                {
                    "DiskImageSize": 28333778432.0,
                    "Format": "VMDK",
                    "Status": "completed",
                    "UserBucket": {
                        "S3Bucket": "my",
                        "S3Key": "Windows 10 x64.ova"
                    }
                }
            ],
            "Status": "deleted",
            "StatusMessage": "ClientError: Boot disk is not using MBR partitioning.",
            "Tags": []
        }
    ]
}
The VM was created with VMware Workstation 16 Pro, then exported to OVA... what have I done wrong?
I've tried googling the error but found nothing that matches it.
Thanks in advance.
The Windows 10 boot disk is probably using GPT rather than MBR partitioning, and GPT is not supported for VMDK disk images.
From the VMIE docs: https://docs.aws.amazon.com/vm-import/latest/userguide/vmie_prereqs.html#limitations-image
UEFI/EFI boot partitions are supported only for Windows boot volumes with VHDX as the image format. Otherwise, a VM's boot volume must use Master Boot Record (MBR) partitions.
You can check from inside the VM using the following commands:
diskpart
list disk
If it shows an asterisk in the GPT column, it's using GPT. If not, it's using MBR.
https://www.top-password.com/blog/tag/how-to-check-gpt-or-mbr-windows-10 has screenshots if you'd like to check via a GUI.
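If you prefer a scriptable check, PowerShell inside the guest reports the same thing (assuming a reasonably recent Windows build):
# PartitionStyle shows GPT or MBR for each disk
Get-Disk | Select-Object Number, FriendlyName, PartitionStyle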
I'm not aware of any way to convert from GPT to MBR without wiping the drive and reinstalling Windows.
If you do reinstall, make sure you disable UEFI and secure boot if you have those options in the VM BIOS in VMware Workstation. That should allow you to choose the "custom" install during the Windows setup and then delete the default partitions and recreate them, as described here:
https://answers.microsoft.com/en-us/windows/forum/windows_10-windows_install-winpc/why-the-latest-w10-cant-install-mbr-disk/04351813-f7f5-46b8-b045-7d3b43094d36
Another option would be to use VHD(x) disk images using VirtualBox instead of VMware Workstation. Theoretically, this should allow you to continue using GPT. A walkthrough for this route is here:
https://gist.github.com/peterforgacs/abebc777fcd6f4b67c07b2283cd31777
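If you do end up re-exporting to VHDX, the import command from the question stays the same and only containers.json changes; a sketch with a placeholder bucket and key (per the quoted limitation, UEFI boot volumes are only supported with the VHDX format):
[
    {
        "Description": "My server VM",
        "Format": "VHDX",
        "UserBucket": {
            "S3Bucket": "my-bucket",
            "S3Key": "Windows 10 x64.vhdx"
        }
    }
]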

Spinnaker - SQL backend for front50

I am trying to set up the SQL backend for Front50 using the document below:
https://www.spinnaker.io/setup/productionize/persistence/front50-sql/
I have a front50-local.yaml with the MySQL config.
But I'm not sure how to disable persistent storage in the halyard config. I cannot completely remove persistentStorage, and persistentStoreType must be one of a3, azs, gcs, redis, s3, oracle.
There is no option to disable persistent storage here:
persistentStorage:
  persistentStoreType: s3
  azs: {}
  gcs:
    rootFolder: front50
  redis: {}
  s3:
    bucket: spinnaker
    rootFolder: front50
    maxKeys: 1000
  oracle: {}
So within your front50-local.yaml you will want to disable the storage backend you used previously, e.g.:
spinnaker:
  gcs:
    enabled: false
  s3:
    enabled: false
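Alongside that, the SQL backend itself is configured in the same front50-local.yaml. A rough sketch based on the linked persistence guide - the JDBC URL, user names and passwords are placeholders, and the exact keys may differ between Spinnaker versions:
sql:
  enabled: true
  connectionPools:
    default:
      default: true
      jdbcUrl: jdbc:mysql://your-db-host:3306/front50
      user: front50_service
      password: <front50-service-password>
  migration:
    jdbcUrl: jdbc:mysql://your-db-host:3306/front50
    user: front50_migrate
    password: <front50-migrate-password>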
You may need/want to remove the section from your halconfig and run your apply with
hal deploy apply --no-validate
There are a number of users dealing with these same issues and some more help might be found on the Slack: https://join.spinnaker.io/
I've noticed the same issue just recently. Maybe this is because, for example, Kayenta (which is an optional component to enable) is still missing non-object-storage persistence support, or...
I've created a GitHub issue on this here: https://github.com/spinnaker/spinnaker/issues/5447

JupyterHub server is unable start in Terraformed EMR cluster running in private subnet

I'm creating an EMR cluster (emr-5.24.0) with Terraform, deployed into a private subnet, that includes Spark, Hive and JupyterHub.
I've added an additional configuration JSON to the deployment, which should add persistence for the Jupyter notebooks in S3 (instead of locally on disk).
The overall architecture includes a VPC endpoint to S3 and I'm able to access the bucket I'm trying to write the notebooks to.
When the cluster is provisioned, the JupyterHub server is unable to start.
Logging into the master node and trying to start/restart the Docker container for JupyterHub does not help.
The configuration for this persistence looks like this:
[
    {
        "Classification": "jupyter-s3-conf",
        "Properties": {
            "s3.persistence.enabled": "true",
            "s3.persistence.bucket": "${project}-${suffix}"
        }
    },
    {
        "Classification": "spark-env",
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {
                    "PYSPARK_PYTHON": "/usr/bin/python3"
                }
            }
        ]
    }
]
In the terraform EMR resource definition, this is then referenced:
configurations = "${data.template_file.configuration.rendered}"
This is read from:
data "template_file" "configuration" {
template = "${file("${path.module}/templates/cluster_configuration.json.tpl")}"
vars = {
project = "${var.project_name}"
suffix = "bucket"
}
}
When I don't use persistence for the notebooks, everything works fine and I am able to log into JupyterHub.
I'm fairly certain it's not an IAM policy issue, since the EMR cluster role policy's Allow action is defined as "s3:*".
Are there any additional steps that need to be taken in order for this to function ?
/K
It seems that Jupyter on EMR uses S3ContentsManager to connect to S3:
https://github.com/danielfrg/s3contents
I dug into S3ContentsManager a bit and found that it talks to the public S3 endpoint (as expected). Since that endpoint is public, Jupyter needs internet access, but you are running EMR in a private subnet, so I guess it cannot reach the endpoint.
You might need to use a NAT gateway in a public subnet or create an S3 endpoint for your VPC.
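Since the cluster is already managed with Terraform, the gateway-type S3 endpoint can be declared there as well; a sketch where the VPC ID and route table ID are placeholders:
# adjust the region in service_name to match your VPC's region
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = "${var.vpc_id}"
  service_name      = "com.amazonaws.eu-west-1.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = ["${var.private_route_table_id}"]
}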
Yup, we ran into this too. Add an S3 VPC endpoint, then (per AWS Support)
add a JupyterHub notebook config:
{
    "Classification": "jupyter-notebook-conf",
    "Properties": {
        "config.S3ContentsManager.endpoint_url": "\"https://s3.${aws_region}.amazonaws.com\"",
        "config.S3ContentsManager.region_name": "\"${aws_region}\""
    }
},
hth

How YARN does check health of hadoop nodes in YARN web console

I would like to know how the YARN web UI running at port 8088 consolidates the health status of the DataNodes, NameNodes and other cluster components.
For example, this is what I see when I open the web UI.
Hi, all of your DataNodes are healthy.
The ResourceManager REST APIs allow the user to get information about the cluster - status of the cluster, metrics on the cluster, scheduler information, information about nodes in the cluster, and information about applications on the cluster.
The example below is taken from the official documentation for the Cluster Nodes API.
Request:
GET http://<rm http address:port>/ws/v1/cluster/nodes
Response:
{
  "nodes": {
    "node": [
      {
        "rack": "\/default-rack",
        "state": "NEW",
        "id": "h2:1235",
        "nodeHostName": "h2",
        "nodeHTTPAddress": "h2:2",
        "healthStatus": "Healthy",
        "lastHealthUpdate": 1324056895432,
        "healthReport": "Healthy",
        "numContainers": 0,
        "usedMemoryMB": 0,
        "availMemoryMB": 8192,
        "usedVirtualCores": 0,
        "availableVirtualCores": 8
      },
      {
        "rack": "\/default-rack",
        "state": "NEW",
        "id": "h1:1234",
        "nodeHostName": "h1",
        "nodeHTTPAddress": "h1:2",
        "healthStatus": "Healthy",
        "lastHealthUpdate": 1324056895092,
        "healthReport": "Healthy",
        "numContainers": 0,
        "usedMemoryMB": 0,
        "availMemoryMB": 8192,
        "usedVirtualCores": 0,
        "availableVirtualCores": 8
      }
    ]
  }
}
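You can hit the same endpoint yourself with curl against the ResourceManager (replace the host placeholder with your own):
curl -s "http://<rm-host>:8088/ws/v1/cluster/nodes"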
More information can be found at the link below:
https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
I hope this helps.