How does YARN check the health of Hadoop nodes in the YARN web console - hadoop-yarn

I would like to know how the YARN Web UI running at port 8088 consolidates the health status of the DataNodes, NameNodes and other cluster components.
For example, this is what I see when I open the Web UI:
Hi guy, your all datanodes are healthy.

The ResourceManager REST APIs allow the user to get information about the cluster - status on the cluster, metrics on the cluster, scheduler information, information about nodes in the cluster, and information about applications on the cluster.
The below example is taken from the official documentation.
Request:
GET http://<rm http address:port>/ws/v1/cluster/nodes
Response:
{
  "nodes": {
    "node": [
      {
        "rack": "\/default-rack",
        "state": "NEW",
        "id": "h2:1235",
        "nodeHostName": "h2",
        "nodeHTTPAddress": "h2:2",
        "healthStatus": "Healthy",
        "lastHealthUpdate": 1324056895432,
        "healthReport": "Healthy",
        "numContainers": 0,
        "usedMemoryMB": 0,
        "availMemoryMB": 8192,
        "usedVirtualCores": 0,
        "availableVirtualCores": 8
      },
      {
        "rack": "\/default-rack",
        "state": "NEW",
        "id": "h1:1234",
        "nodeHostName": "h1",
        "nodeHTTPAddress": "h1:2",
        "healthStatus": "Healthy",
        "lastHealthUpdate": 1324056895092,
        "healthReport": "Healthy",
        "numContainers": 0,
        "usedMemoryMB": 0,
        "availMemoryMB": 8192,
        "usedVirtualCores": 0,
        "availableVirtualCores": 8
      }
    ]
  }
}
More information can be found at the link below:
https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
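If you just want to script a quick health check against that endpoint, here is a minimal Python sketch (my own illustration, not taken from the docs) that queries the Cluster Nodes API and prints each node's health; the ResourceManager address rm-host:8088 is a placeholder for your own cluster:

# Minimal sketch: query the ResourceManager Cluster Nodes API and print node health.
# Assumes the `requests` package is installed; "rm-host:8088" is a placeholder address.
import requests

RM_NODES_URL = "http://rm-host:8088/ws/v1/cluster/nodes"

response = requests.get(RM_NODES_URL, timeout=10)
response.raise_for_status()

for node in response.json()["nodes"]["node"]:
    # healthStatus/healthReport come straight from the NodeManager health checks
    status = node.get("healthStatus", node.get("state"))
    print(f'{node["nodeHostName"]}: {status} - {node.get("healthReport") or "no report"}')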
I hope this helps.

Related

JupyterHub server is unable start in Terraformed EMR cluster running in private subnet

I'm creating an EMR cluster (emr-5.24.0) with Terraform, deployed into a private subnet, that includes Spark, Hive and JupyterHub.
I've added an additional configuration JSON to the deployment, which should add persistence for the Jupyter notebooks into S3 (instead of locally on disk).
The overall architecture includes a VPC endpoint to S3 and I'm able to access the bucket I'm trying to write the notebooks to.
When the cluster is provisioned, the JupyterHub server is unable to start.
Logging into the master node and trying to start/restart the docker container for the jupyterhub does not help.
The configuration for this persistency looks like this:
[
  {
    "Classification": "jupyter-s3-conf",
    "Properties": {
      "s3.persistence.enabled": "true",
      "s3.persistence.bucket": "${project}-${suffix}"
    }
  },
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "PYSPARK_PYTHON": "/usr/bin/python3"
        }
      }
    ]
  }
]
In the terraform EMR resource definition, this is then referenced:
configurations = "${data.template_file.configuration.rendered}"
This is read from:
data "template_file" "configuration" {
template = "${file("${path.module}/templates/cluster_configuration.json.tpl")}"
vars = {
project = "${var.project_name}"
suffix = "bucket"
}
}
When I don't use persistency on the notebooks, everything works fine and I am able to log into JupyterHub.
I'm fairly certain it's not an IAM policy issue, since the EMR cluster role policy's Allow action is defined as "s3:*".
Are there any additional steps that need to be taken in order for this to function?
/K
It seems that JupyterHub on EMR uses S3ContentsManager to connect to S3.
https://github.com/danielfrg/s3contents
I dug into S3ContentsManager a bit and found that it talks to the public S3 endpoints (as expected). Since the S3 endpoint is public, Jupyter needs internet access, but you are running EMR in a private subnet, so I guess it cannot reach the endpoint.
You might need to use a NAT gateway in a public subnet or create an S3 endpoint for your VPC.
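To quickly confirm that connectivity assumption from the master node, here is a minimal sketch using only the Python standard library; the region below is a placeholder, and this only proves TCP reachability of the regional S3 endpoint:

# Minimal sketch: check whether the regional S3 endpoint is reachable from this host.
# "eu-west-1" is a placeholder region; replace it with your own.
import socket

S3_ENDPOINT = "s3.eu-west-1.amazonaws.com"

try:
    with socket.create_connection((S3_ENDPOINT, 443), timeout=5):
        print(f"{S3_ENDPOINT}:443 is reachable")
except OSError as exc:
    print(f"{S3_ENDPOINT}:443 is NOT reachable: {exc}")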
Yup, we ran into this too. Add an S3 VPC endpoint, then, following advice from AWS support, add a JupyterHub notebook config:
{
  "Classification": "jupyter-notebook-conf",
  "Properties": {
    "config.S3ContentsManager.endpoint_url": "\"https://s3.${aws_region}.amazonaws.com\"",
    "config.S3ContentsManager.region_name": "\"${aws_region}\""
  }
},
hth

RabbitMQ - ACCESS_REFUSED - Login was refused

I'm using rabbitmq-server and fetch messages from it using a consumer written in Scala. This has been working like a charm, but since I migrated my RabbitMQ server from one server to another, I get the following error when trying to connect to it:
com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
In addition, the rabbitmq-server logs:
=INFO REPORT==== 18-Jul-2018::15:28:05 ===
accepting AMQP connection <0.7107.0> (127.0.0.1:42632 -> 127.0.0.1:5672)
=ERROR REPORT==== 18-Jul-2018::15:28:05 ===
Error on AMQP connection <0.7107.0> (127.0.0.1:42632 -> 127.0.0.1:5672, state: starting):
PLAIN login refused: user 'my_personal_user' - invalid credentials
=INFO REPORT==== 18-Jul-2018::15:28:05 ===
closing AMQP connection <0.7107.0> (127.0.0.1:42632 -> 127.0.0.1:5672)
I went through every SO question about authentication problems and found the following leads:
My credentials are wrong
I'm trying to connect with guest from remote
My RabbitMQ version is not compatible with the consumer
None of those leads helped me. My credentials are good, I'm not using guest but a privileged user I created with full access and admin rights, and my RabbitMQ version did not change through the migration.
NB: I migrated my RabbitMQ server from a separate server to the same one as my consumer, so now the consumer is fetching from localhost. I don't know the consequences, but I figured it might help you help me.
So I just had a similar problem and googled for solutions, which is how I found this page. I didn't find a direct answer to my question, but I ended up discovering that RabbitMQ has two different sets of rights to configure that don't exactly overlap with each other; in my case I had zero rights for one set and admin rights for the other. I wonder if you could be running into a similar scenario.
Seeing code will make the two sets of rights easier to follow, but first some background context:
My RMQ is hosted on Kubernetes, where stuff is ephemeral, and I needed some usernames and passwords to ship preloaded with a fresh RabbitMQ instance. In Kubernetes there's an option to inject a preconfigured broker definition on first startup. (When I say broker definition I'm referring to that spot in the management web GUI where there's an option to import and export broker definitions, i.e. back up or replace your live RMQ configuration.)
Here's a shortened version of my config with sensitive stuff removed:
{
  "vhosts": [
    {"name": "/"}
  ],
  "policies": [
    {
      "name": "ha",
      "vhost": "/",
      "pattern": ".*",
      "definition": {
        "ha-mode": "all",
        "ha-sync-mode": "automatic",
        "ha-sync-batch-size": 2
      }
    }
  ],
  "users": [
    {
      "name": "guest",
      "password": "guest",
      "tags": "management"
    },
    {
      "name": "admin",
      "password": "PASSWORD",
      "tags": "administrator"
    }
  ],
  "permissions": [
    {
      "user": "guest",
      "vhost": "/",
      "configure": "^$",
      "write": "^$",
      "read": "^$"
    },
    {
      "user": "admin",
      "vhost": "/",
      "configure": ".*",
      "write": ".*",
      "read": ".*"
    }
  ]
}
Ok, so when I originally saw that tags attribute, I assumed it was an arbitrary value and put a self-documenting tag there. That turned out to be equivalent to "", which left me with zero rights to the web management GUI/REST API, while below it I had ".*" everywhere, so that part had full admin rights. It was really confusing because I was getting a misleading error message saying I was supplying invalid credentials; the credentials were correct, I just didn't have access.
If it's not that, then there's also the configuration where guest is limited to localhost access only by default, but you can override it.
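If you want to rule out a plain credentials problem independently of the Scala consumer, a minimal Python sketch with the pika client (my choice here; any AMQP client works) exercises the same PLAIN login path:

# Minimal sketch: attempt a PLAIN login with the same credentials the consumer uses.
# Requires the `pika` package; host, vhost and credentials below are placeholders.
import pika

credentials = pika.PlainCredentials("my_personal_user", "my_password")
params = pika.ConnectionParameters(host="localhost", port=5672,
                                   virtual_host="/", credentials=credentials)

try:
    connection = pika.BlockingConnection(params)
    print("Login OK")
    connection.close()
except pika.exceptions.ProbableAuthenticationError as exc:
    print(f"Login refused: {exc}")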
We were facing a similar problem with a different tech stack. In our case the stack was:
RabbitMQ deployed in Kubernetes (AKS) using the Bitnami package in HA mode
Consumer and producer created in a microservice built with Java 8 and the Spring Boot framework using Apache Camel, also running in the same Kubernetes cluster
We verified below points:
User and password are correct
User associated with required VHOST
Required permission given (administrator tag)
User was able to login from RabbitMQ Web Console
Connectivity on host and port was there from microservice Pod to RabbitMQ service (checked with various tools like telnet)
All code and configuration were exactly the same (the same configuration works correctly in a lower environment)
We were getting this issue:
com.rabbitmq.client.AuthenticationFailureException: ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.
After much investigation and troubleshooting we found that the username was longer than the consumer API supported.
For example, we used the username 'productionappuser'. This user was able to log in to the management web console but was failing from the microservice.
We just changed to a new username with 8 characters and it started working.
This looks very weird, as the same user was able to log in, so I'm sharing the findings.

'Unauthorized' to push images into SSL Artifactory Docker Registry

I'm sorry if this topic is duplicated; I was not able to find anything similar to this problem.
Our Docker clients v17.X+ (Docker for Mac & Docker for Linux) are unable to push images to an SSL V2 registry, but are successfully authenticated for pushes to an insecure V2 registry (CNAME) that serves the same machine. The output is always the same: unauthorized, even though I docker login correctly.
The weird thing is: with our old Docker clients (v1.6) we are able to log in and push Docker images to the secure V2 Docker registry without any problem, using the credentials file stored at ~/.dockercfg. My Nginx appears to be working just fine. Any ideas about what I'm missing here?
I'm attaching both credentials configuration files, in case anyone wants to check:
Docker client: v.17
~/.docker/config.json
{
  "auths": {
    "https://secure-docker-registry.intranet": {
      "auth": "someAuth",
      "email": "somemail#gmail.com"
    }
  },
  "credsStore": "osxkeychain"
}
Obs: In Docker for Mac's case I tried both with 'credsStore' and without it.
Obs2: Even when allowing anonymous users to push images, I'm still getting unauthorized for this registry.
Obs3: The logs are not very clear about this problem.
Obs4: Artifactory is configured using an LDAP group.
Docker client: v.1.6.2
~/.dockercfg
{
  "secure-docker-registry.intranet": {
    "auth": "someAuth",
    "email": "somemail#gmail.com"
  },
  "insecure-docker-registry.intranet": {
    "auth": "someAuth",
    "email": "somemail#gmail.com"
  }
}
Artifactory Pro's version: 5.4.2

How to create a user-provided redis service which the Spring auto-configuration cloud connectors pick up?

I have created a user provided service for redis as below
cf cups p-redis -p "{\"host\":\"xx.xx.xxx.xxx\",\"password\":\"xxxxxxxx\",\"port\":6379}"
This is not getting picked up automatically by the Redis auto-reconfiguration or the service connectors, and I get a Jedis connection pool exception.
When I bind to the Redis service created from the marketplace, it works fine with the Spring Boot application. This confirms there is no issue with the code or configuration. I wanted a custom service for Redis to work with the Spring Boot app. How can I create such a service? What am I missing here? Is this possible?
System-Provided:
{
  "VCAP_SERVICES": {
    "user-provided": [
      {
        "credentials": {
          "host": "xx.xx.xxx.xxx",
          "password": "xxxxxxxx",
          "port": 6379
        },
        "label": "user-provided",
        "name": "p-redis",
        "syslog_drain_url": "",
        "tags": []
      }
    ]
  }
}
I could extend the abstract cloud connector and create the Redis connection factory myself, but I want to make it work out of the box with a custom service and auto-configuration.
All routes to mapping this service automatically lead to the spring-cloud-connectors project. If you look at the implementation, services must either be tagged with redis or expose a uri with a redis scheme under one of the credential keys the connectors check (permutations of uri).
If you'd like additional detection behavior, I'd recommend opening an issue in the GitHub repo.
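To make those detection rules concrete, here is an illustrative Python sketch of the logic applied to VCAP_SERVICES entries; this is not the actual spring-cloud-connectors code, just the tag/uri rules described above with the key matching approximated:

# Illustrative sketch of the detection rules above, applied to VCAP_SERVICES entries.
# Not the real connector implementation; credential-key matching is approximated.
import json, os

def looks_like_redis(service: dict) -> bool:
    # Rule 1: the service is tagged with "redis"
    if "redis" in (service.get("tags") or []):
        return True
    # Rule 2: a uri-like credential key carries a redis:// scheme
    credentials = service.get("credentials") or {}
    for key, value in credentials.items():
        if "uri" in key.lower() or "url" in key.lower():
            if str(value).startswith("redis://"):
                return True
    return False

vcap = json.loads(os.environ.get("VCAP_SERVICES", "{}"))
for services in vcap.values():
    for service in services:
        detected = "detected as redis" if looks_like_redis(service) else "not detected"
        print(service.get("name"), "->", detected)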
What worked for me:
cf cups redis -p '{"uri":"redis://:PASSWORD@HOSTNAME:PORT"}' -t "redis"
Thanks to earlier answers that led me to this solution.

How to remove analytics logs in MobileFirst 6.3

We are working on a MobileFirst 6.3 project, and our .war is installed on a Liberty profile server.
We didn't configure the TTL on the analytics before. Is there any way (tool, REST service or file system) that I can remove the analytics logs in MobileFirst?
MobileFirst Platform Foundation Analytics uses ElasticSearch and Lucene at its core - there is nothing special to be done from a MobileFirst perspective.
If you want to remove everything, the whole Analytics store:
Stop the Analytics server
Delete the "analyticsData" folder which is under servers/<server-name>/ in the Liberty installation
Restart the server
Otherwise, using either CURL or Postman you can invoke the DELETE query.
You can find the ElasticSearch API here: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
Some additional questions about this topic in Stack Overflow:
Removing Data From ElasticSearch
Delete all documents from index/type without deleting type
http://www.tekkie.ro/quick-n-dirty/howto-quickly-erase-all-documents-from-an-elasticsearch-index/
Example steps:
Open the ES port - MobileFirst uses port 9500.
On the Analytics server, set the JNDI property http.enabled=true and restart the Analytics server (if it's a cluster, you still only need to open the port on one of the cluster members).
The default "index" to use in your query is "worklight"; the mappings are documented in the user documentation and shown on the admin tab in the Analytics console.
The endpoint for your delete query needs to be the Analytics server.
Postman example query:
DELETE
http://your-analytics-server:9500/worklight/network_transactions/_query
{
  "query": {
    "range": {
      "worklight_data.timestamp": {
        "to": 1432313605000
      }
    }
  }
}
CURL example query:
curl -X DELETE 'http://server:9500/worklight/network_transactions/_query' -d '{ "query" : { "range" : { "timestamp" : { "lte" : "1432222333424" } } } }'
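For completeness, the same delete-by-query can be sent with a short Python sketch, assuming the `requests` package and the same placeholder server, index and timestamp as in the curl example:

# Minimal sketch: send the same range-based delete-by-query with Python requests.
# Server, index and timestamp are the placeholders used in the curl example above.
import requests

url = "http://server:9500/worklight/network_transactions/_query"
query = {"query": {"range": {"timestamp": {"lte": "1432222333424"}}}}

response = requests.delete(url, json=query, timeout=30)
print(response.status_code, response.text)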