JupyterHub server is unable to start in Terraformed EMR cluster running in a private subnet

I'm creating an EMR cluster (emr-5.24.0) with Terraform, deployed into a private subnet, that includes Spark, Hive and JupyterHub.
I've added an additional configuration JSON to the deployment, which should persist the Jupyter notebooks to S3 (instead of locally on disk).
The overall architecture includes a VPC endpoint to S3, and I'm able to access the bucket I'm trying to write the notebooks to.
When the cluster is provisioned, the JupyterHub server is unable to start.
Logging into the master node and trying to start/restart the Docker container for JupyterHub does not help.
The configuration for this persistence looks like this:
[
  {
    "Classification": "jupyter-s3-conf",
    "Properties": {
      "s3.persistence.enabled": "true",
      "s3.persistence.bucket": "${project}-${suffix}"
    }
  },
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "PYSPARK_PYTHON": "/usr/bin/python3"
        }
      }
    ]
  }
]
In the Terraform EMR resource definition, this is then referenced:
configurations = "${data.template_file.configuration.rendered}"
This is read from:
data "template_file" "configuration" {
template = "${file("${path.module}/templates/cluster_configuration.json.tpl")}"
vars = {
project = "${var.project_name}"
suffix = "bucket"
}
}
When I don't use persistence for the notebooks, everything works fine and I am able to log into JupyterHub.
I'm fairly certain it's not an IAM policy issue, since the EMR cluster role policy's Allow action is defined as "s3:*".
Are there any additional steps that need to be taken in order for this to function?
/K

It seems that JupyterHub on EMR uses S3ContentsManager to connect to S3.
https://github.com/danielfrg/s3contents
I dug into S3ContentsManager a bit and found that it talks to the public S3 endpoint (as expected). Since the S3 endpoint is public, Jupyter needs access to the internet, but you are running EMR in a private subnet, so I guess it cannot reach that endpoint.
You might need to use a NAT gateway in a public subnet or create an S3 endpoint for your VPC.
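If you go the VPC endpoint route, a minimal sketch with the AWS CLI (the VPC ID, route table ID and region are placeholders, and a gateway endpoint is assumed):
# gateway endpoint for S3, attached to the private subnet's route table
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.eu-west-1.s3 \
  --route-table-ids rtb-0123456789abcdef0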

Yup, we ran into this too. Add an S3 VPC endpoint, then, per AWS support, add a JupyterHub notebook config:
{
  "Classification": "jupyter-notebook-conf",
  "Properties": {
    "config.S3ContentsManager.endpoint_url": "\"https://s3.${aws_region}.amazonaws.com\"",
    "config.S3ContentsManager.region_name": "\"${aws_region}\""
  }
},
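If you render this through the same template_file data source as the rest of the configuration, ${aws_region} has to be supplied in its vars just like project and suffix. To check whether JupyterHub actually comes up after the change, a minimal sketch from the master node, assuming the container is named jupyterhub as on a stock EMR install:
# check container state and scan its logs for S3ContentsManager / S3 errors
sudo docker ps -a
sudo docker logs jupyterhub 2>&1 | tail -n 50
sudo docker restart jupyterhub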
Hope this helps.

Related

How can I mount an EFS share to AWS Fargate?

I have an AWS EFS share where I store container logs.
I would like to mount this NFS share (AWS EFS) in AWS Fargate. Is it possible?
Any supporting documentation link would be appreciated.
You can do this since April 2020! It's a little tricky but works.
The biggest gotcha I ran into was that you need to set the "Platform version" to 1.4.0 - it will default to "Latest", which is 1.3.0 (a CLI sketch for this follows the container definition below).
In your task definition you need to define a volume and, in your container definition, a mount point where you want the EFS share mounted inside the container:
Volume:
"volumes": [
  {
    "efsVolumeConfiguration": {
      "transitEncryptionPort": null,
      "fileSystemId": "fs-xxxxxxx",
      "authorizationConfig": {
        "iam": "DISABLED",
        "accessPointId": "fsap-xxxxxxxx"
      },
      "transitEncryption": "ENABLED",
      "rootDirectory": "/"
    },
    "name": "efs volume name",
    "host": null,
    "dockerVolumeConfiguration": null
  }
]
Mount the volume in the container:
"mountPoints": [
  {
    "readOnly": null,
    "containerPath": "/opt/your-app",
    "sourceVolume": "efs volume name"
  }
]
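To pin the platform version mentioned above, a hedged sketch using the AWS CLI (the cluster, service, task definition, subnet and security group names are placeholders):
# create the Fargate service explicitly on platform version 1.4.0
aws ecs create-service \
  --cluster my-cluster \
  --service-name my-service \
  --task-definition my-efs-task:1 \
  --desired-count 1 \
  --launch-type FARGATE \
  --platform-version 1.4.0 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],securityGroups=[sg-0123456789abcdef0]}"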
These posts helped me although they're missing a few details:
Tutorial: Using Amazon EFS file systems with Amazon ECS
EFSVolumeConfiguration
EFS support for Fargate is now available!
https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-ecs-aws-fargate-support-amazon-efs-filesystems-generally-available/
EDIT: Since April 2020 this answer is no longer accurate. This was the situation until Fargate platform version 1.4.0. If you are using earlier versions of Fargate this is still relevant; otherwise see the newer answers.
Unfortunately it's not currently possible to use persistent storage with AWS Fargate; however, progress on this feature can be tracked using the newly launched public roadmap [1] for AWS container services [2].
Your use case seems to suggest logs. Have you considered using the awslogs driver [3] and shipping your application logs to CloudWatch Logs?
[1] https://github.com/aws/containers-roadmap/projects/1
[2] https://github.com/aws/containers-roadmap/issues/53
[3] https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
Wow, you do need platform version 1.4.0, as @TheFiddlerWins suggested.

Consul - External service registration with more than 1 service

I want to know if I'm doing something wrong, or if support for Consul external services is actually kind of limited (or maybe designed that way).
I can't use ESM because I cannot install anything else, not even in containers :(.
Case:
- I have several hosts where MySQL has at least 4 processes running.
- I installed exporters on those hosts for each MySQL process, which are already exposing the metrics for Prometheus.
- I want those exporters to be registered in Consul as external services, since I can't install the Consul agent.
I already checked the Consul documentation, and it seems that I can't register an external node with several services, just one service per node.
{
  "Node": "ltmysqldb01-1.com",
  "Address": "ltmysqldb01-1.com",
  "NodeMeta": {
    "external-node": "true",
    "external-probe": "true"
  },
  "Service": {
    "ID": "ltmysqldb01-1-node_exporter",
    "Service": "node_exporter",
    "Port": 9100
  },
  "Checks": [{
    "Name": "http-check",
    "status": "passing",
    "Definition": {
      "http": "ltmysqldb01-1.com",
      "interval": "30s"
    }
  }]
}
curl --request PUT --data @external_mysql_ltmysqldb01-1.json https://consul-instance.com/v1/catalog/register
Multiple services can easily be defined per single node (agent): you basically set the node up once and register several external services against it.
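As an illustration, a hedged sketch using the same catalog API as above: a second payload for the same node that registers a hypothetical mysqld_exporter on port 9104 (SkipNodeUpdate avoids rewriting the node entry), followed by another PUT. Each additional service gets its own register call.
# hypothetical second service on the same external node
cat > external_mysql_exporter_ltmysqldb01-1.json <<'EOF'
{
  "Node": "ltmysqldb01-1.com",
  "Address": "ltmysqldb01-1.com",
  "SkipNodeUpdate": true,
  "Service": {
    "ID": "ltmysqldb01-1-mysqld_exporter",
    "Service": "mysqld_exporter",
    "Port": 9104
  }
}
EOF
curl --request PUT --data @external_mysql_exporter_ltmysqldb01-1.json https://consul-instance.com/v1/catalog/register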

How does YARN check the health of Hadoop nodes in the YARN web console

I would like to know how the YARN Web UI running at port 8088 consolidates the health status of the DataNodes, NameNodes and other cluster components.
For example, this is what I see when I open the Web UI.
Hi, all your DataNodes are healthy.
The ResourceManager REST APIs allow the user to get information about the cluster: status, metrics, scheduler information, information about nodes in the cluster, and information about applications on the cluster.
The example below is taken from the official documentation.
Request:
GET http://<rm http address:port>/ws/v1/cluster/nodes
Response:
{
  "nodes": {
    "node": [
      {
        "rack": "\/default-rack",
        "state": "NEW",
        "id": "h2:1235",
        "nodeHostName": "h2",
        "nodeHTTPAddress": "h2:2",
        "healthStatus": "Healthy",
        "lastHealthUpdate": 1324056895432,
        "healthReport": "Healthy",
        "numContainers": 0,
        "usedMemoryMB": 0,
        "availMemoryMB": 8192,
        "usedVirtualCores": 0,
        "availableVirtualCores": 8
      },
      {
        "rack": "\/default-rack",
        "state": "NEW",
        "id": "h1:1234",
        "nodeHostName": "h1",
        "nodeHTTPAddress": "h1:2",
        "healthStatus": "Healthy",
        "lastHealthUpdate": 1324056895092,
        "healthReport": "Healthy",
        "numContainers": 0,
        "usedMemoryMB": 0,
        "availMemoryMB": 8192,
        "usedVirtualCores": 0,
        "availableVirtualCores": 8
      }
    ]
  }
}
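To check node health against a live cluster from the command line, a minimal sketch (the ResourceManager host is a placeholder; jq is optional and only trims the output):
curl -s http://resourcemanager.example.com:8088/ws/v1/cluster/nodes \
  | jq '.nodes.node[] | {id, healthReport, lastHealthUpdate}'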
More information can be found at the link below:
https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
I hope this helps.

'Unauthorized' to push images into SSL Artifactory Docker Registry

I'm sorry if this topic is a duplicate; I was not able to find anything similar to this problem.
Our Docker clients v17.X+ (Docker for Mac & Docker for Linux) are unable to push images to an SSL V2 registry, but are successfully authenticated for pushes to an insecure V2 registry (CNAME) that serves the same machine. The output is always the same: unauthorized, even if I docker login correctly.
The weird thing is: our old Docker clients (v1.6) are able to log in and push Docker images to the secure V2 Docker registry without any problem, using the credentials file stored at ~/.dockercfg. My Nginx appears to be working just fine. Any ideas about what I'm missing here?
I'm attaching both credentials configuration files, in case anyone wants to check:
Docker client: v.17
~/.docker/config.json
{
  "auths" : {
    "https://secure-docker-registry.intranet": {
      "auth": "someAuth",
      "email": "somemail@gmail.com"
    }
  },
  "credsStore" : "osxkeychain"
}
Note: in Docker for Mac's case I tried both with 'credsStore' and without it.
Note 2: even when allowing anonymous users to push images, I still get unauthorized for this registry.
Note 3: the logs are not very clear about this problem.
Note 4: Artifactory is configured using an LDAP group.
Docker client: v.1.6.2
~/.dockercfg
{
  "secure-docker-registry.intranet": {
    "auth": "someAuth",
    "email": "somemail@gmail.com"
  },
  "insecure-docker-registry.intranet": {
    "auth": "someAuth",
    "email": "somemail@gmail.com"
  }
}
Artifactory Pro's version: 5.4.2

How to create a user-provided Redis service which the Spring auto-configuration cloud connectors pick up?

I have created a user-provided service for Redis as below:
cf cups p-redis -p "{\"host\":\"xx.xx.xxx.xxx\",\"password\":\"xxxxxxxx\",\"port\":6379}"
This is not getting picked up automatically by the Redis auto-reconfiguration or the service connectors, and I'm getting a Jedis connection pool exception.
When I bind to the Redis service created from the marketplace, it works fine with the Spring Boot application. This confirms there is no issue with the code or configuration. I want a custom service for Redis to work with the Spring Boot app. How can I create such a service? What am I missing here? Is this possible?
System-Provided:
{
  "VCAP_SERVICES": {
    "user-provided": [
      {
        "credentials": {
          "host": "xx.xx.xxx.xxx",
          "password": "xxxxxxxx",
          "port": 6379
        },
        "label": "user-provided",
        "name": "p-redis",
        "syslog_drain_url": "",
        "tags": []
      }
    ]
  }
}
I could extend the abstract cloud connector and create the Redis factory myself, but I want it to work out of the box with a custom service and auto-configuration.
All routes to mapping this service automatically lead to the spring-cloud-connectors project. If you look at the implementation, services must either be tagged with redis or expose a uri with a redis scheme in their credential keys (based on a permutation of uri).
If you'd like additional detection behavior, I'd recommend opening an issue in the GitHub repo.
What worked for me:
cf cups redis -p '{"uri":"redis://:PASSWORD@HOSTNAME:PORT"}' -t "redis"
Thanks to earlier answers that led me to this solution.
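As a quick follow-up, a hedged sketch for binding and verifying the service (my-app is a placeholder application name):
cf bind-service my-app redis
cf restage my-app
cf env my-app   # the user-provided entry should now show "tags": ["redis"] and the redis:// uri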