When starting the ResourceManager from Ambari it fails to start, even though services like the App Timeline Server, NodeManager, and YARN client are up. The NodeManagers status widget shows:
n/a active / n/a lost / n/a unhealthy / n/a rebooted / n/a decommissioned
The ResourceManager start fails with:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/resourcemanager.py", line 304, in <module>
Resourcemanager().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 314, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/resourcemanager.py", line 124, in start
self.wait_for_dfs_directories_created(params.entity_groupfs_store_dir, params.entity_groupfs_active_dir)
File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/resourcemanager.py", line 261, in wait_for_dfs_directories_created
self.wait_for_dfs_directory_created(dir_path, ignored_dfs_dirs)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 55, in wrapper
return function(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/resourcemanager.py", line 291, in wait_for_dfs_directory_created
raise Fail("DFS directory '" + dir_path + "' does not exist !")
resource_management.core.exceptions.Fail: DFS directory '/ats/done/' does not exist !
The service start script is looking for the HDFS path /ats/done; check whether that path exists with the proper ownership and permissions, as shown below.
[hdfs@vp-solr2 ~]$ hdfs dfs -ls / | grep ats
drwxr-xr-x - yarn hadoop 0 2017-03-27 15:12 /ats
[hdfs@vp-solr2 ~]$ hdfs dfs -ls /ats | grep done
drwx------ - yarn hadoop 0 2017-06-19 08:33 /ats/done
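If the directory really were missing (or owned by the wrong user), a minimal fix-up sketch run as the HDFS superuser could look like this; the yarn:hadoop owner and 0700 mode simply mirror the listing above, so adjust them to whatever your ATS configuration expects:
# recreate the ATS done dir and hand it back to yarn (owner/mode mirrored from the ls output above)
sudo -u hdfs hdfs dfs -mkdir -p /ats/done
sudo -u hdfs hdfs dfs -chown -R yarn:hadoop /ats/done
sudo -u hdfs hdfs dfs -chmod 700 /ats/done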
I'm having an issue with my CKAN 2.9.5 instance when doing anything (creating groups or organizations, uploading files, ...). Every time I try to do something I get Permission denied: '/var/lib/ckan/storage/uploads/group', even as a sysadmin user.
I tried giving full permissions to /var/lib/ckan/storage, but nothing changes.
These are the permissions of the folder
And this is the error log:
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/app.py", line 2449, in wsgi_app
response = self.handle_exception(e)
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/app.py", line 1866, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/app.py", line 2446, in wsgi_app
response = self.full_dispatch_request()
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/app.py", line 1951, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/app.py", line 1820, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask_debugtoolbar/__init__.py", line 125, in dispatch_request
return view_func(**req.view_args)
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/views.py", line 89, in view
return self.dispatch_request(*args, **kwargs)
File "/usr/lib/ckan/venv/lib/python3.8/site-packages/flask/views.py", line 163, in dispatch_request
return meth(*args, **kwargs)
File "/usr/lib/ckan/venv/src/ckan/ckan/views/group.py", line 859, in post
group = _action(u'group_create')(context, data_dict)
File "/usr/lib/ckan/venv/src/ckan/ckan/logic/__init__.py", line 504, in wrapped
result = _action(context, data_dict, **kw)
File "/usr/lib/ckan/venv/src/ckan/ckan/logic/action/create.py", line 871, in group_create
return _group_or_org_create(context, data_dict)
File "/usr/lib/ckan/venv/src/ckan/ckan/logic/action/create.py", line 701, in _group_or_org_create
upload = uploader.get_uploader('group')
File "/usr/lib/ckan/venv/src/ckan/ckan/lib/uploader.py", line 60, in get_uploader
upload = Upload(upload_to, old_filename)
File "/usr/lib/ckan/venv/src/ckan/ckan/lib/uploader.py", line 126, in __init__
os.makedirs(self.storage_path)
File "/usr/lib/python3.8/os.py", line 223, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/var/lib/ckan/storage/uploads/group'
Thanks for any help.
If you're doing this as the 'ckan' user, I think you're getting this error because the storage folder is probably owned by 'root'. You should give ownership of that folder to the 'ckan' user.
I had the same problem with a Docker instance of ckan.
Solution (do this in the ckan container as root user):
$ cd /var/lib/ckan
$ ls -l
total 8
drwxr-xr-x 3 root root 4096 Sep 7 17:17 storage
drwxr-xr-x 5 ckan ckan 4096 Sep 7 17:18 webassets
$ chown -R ckan.ckan storage
$ ls -l
total 8
drwxr-xr-x 3 ckan ckan 4096 Sep 7 17:17 storage
drwxr-xr-x 5 ckan ckan 4096 Sep 7 17:18 webassets
Now CKAN works smoothly.
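If you're on the stock Docker setup, one way to get that root shell inside the container first (the container name ckan below is just a placeholder; use whatever docker ps reports) is:
docker exec -u root -it ckan /bin/bash
# then, inside the container, run the chown shown above:
chown -R ckan:ckan /var/lib/ckan/storage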
I am learning to use Spark on a personal computer with hardware capable of running Hadoop. Here's the config:
Cloudera CDH 5.5.0 w/ Cloudera Quickstart, Spark 2.4.7, JDK1.8.0_181, Hadoop 2.6.0, Python 3.6.9.
When running a Python script (copied from a Udemy video on YouTube), I ran into and fixed several errors, but I could not find any solution for the following one:
java.io.IOException: Incomplete HDFS URI, no host: hdfs:/user/cloudera/Spark/ml-100k/u.data
Traceback (most recent call last):
File "/home/cloudera/Spark/LowestRatedMovieDataFrame.py", line 75, in < module >
movieDataset = spark.createDataFrame(movies)
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 746, in createDataFrame
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 390, in _createFromRDD
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 361, in _inferSchema
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1378, in first
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1327, in take
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 2517, in getNumPartitions
File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/usr/local/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "/usr/local/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o27.partitions.
: java.io.IOException: Incomplete HDFS URI, no host: hdfs:/user/cloudera/Spark/ml-100k/u.data
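The exception itself points at the cause: the path is written as hdfs:/user/... with no NameNode host, so the HDFS client cannot resolve an authority. One way to find the correct prefix (a sketch; the quickstart.cloudera:8020 value below is only the usual Cloudera Quickstart default and may differ on your machine) is:
hdfs getconf -confKey fs.defaultFS
# typically prints something like hdfs://quickstart.cloudera:8020 on the Quickstart VM;
# the script then needs that full prefix, e.g. hdfs://quickstart.cloudera:8020/user/cloudera/Spark/ml-100k/u.data,
# or simply /user/cloudera/Spark/ml-100k/u.data so that fs.defaultFS supplies the scheme and host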
Attempting to add a client node to the cluster via Ambari (v2.7.3.0, HDP 3.1.0.0-78) and seeing an odd error.
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 38, in <module>
BeforeAnyHook().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
method(env)
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 31, in hook
setup_users()
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 51, in setup_users
fetch_nonlocal_groups = params.fetch_nonlocal_groups,
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/accounts.py", line 90, in action_create
shell.checked_call(command, sudo=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
self.save_component_version_to_structured_out(self.command_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
stack_select_package_name = stack_select.get_package_name()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
supported_packages = get_supported_packages()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
stdout:
2019-11-25 13:07:57,644 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=None -> 3.1
2019-11-25 13:07:57,651 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2019-11-25 13:07:57,652 - Group['livy'] {}
2019-11-25 13:07:57,654 - Group['spark'] {}
2019-11-25 13:07:57,654 - Group['ranger'] {}
2019-11-25 13:07:57,654 - Group['hdfs'] {}
2019-11-25 13:07:57,654 - Group['zeppelin'] {}
2019-11-25 13:07:57,655 - Group['hadoop'] {}
2019-11-25 13:07:57,655 - Group['users'] {}
2019-11-25 13:07:57,656 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,658 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:57,971 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
self.save_component_version_to_structured_out(self.command_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
stack_select_package_name = stack_select.get_package_name()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
supported_packages = get_supported_packages()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
Command failed after 1 tries
The problem appears to be
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
caused by
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
This is further reinforced by the fact that manually adding the ambari-hdp-1.repo and yum-installing hdp-select before adding the host to the cluster shows the same error messages, just truncated up to the parts of stdout/err shown here.
When running
[root@HW001 .ssh]# /usr/bin/hdp-select versions
3.1.0.0-78
from the ambari server node, I can see the command runs.
Looking at what the hook script is trying to run/access, I see
[root@client001 ~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
-rw-r--r-- 1 root root 1.2K Nov 25 10:51 /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
[root@client001 ~]# ls -lha /var/lib/ambari-agent/data/command-632.json
-rw------- 1 root root 545K Nov 25 13:07 /var/lib/ambari-agent/data/command-632.json
[root@client001 ~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY
total 0
drwxr-xr-x 4 root root 34 Nov 25 10:51 .
drwxr-xr-x 8 root root 147 Nov 25 10:51 ..
drwxr-xr-x 2 root root 34 Nov 25 10:51 files
drwxr-xr-x 2 root root 188 Nov 25 10:51 scripts
[root@client001 ~]# ls -lha /var/lib/ambari-agent/data/structured-out-632.json
ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory
[root@client001 ~]# ls -lha /var/lib/ambari-agent/tmp
total 96K
drwxrwxrwt 3 root root 4.0K Nov 25 13:06 .
drwxr-xr-x 10 root root 267 Nov 25 10:50 ..
drwxr-xr-x 6 root root 4.0K Nov 25 13:06 ambari_commons
-rwx------ 1 root root 1.4K Nov 25 13:06 ambari-sudo.sh
-rwxr-xr-x 1 root root 1.6K Nov 25 13:06 create-python-wrap.sh
-rwxr-xr-x 1 root root 1.6K Nov 25 10:50 os_check_type1574715018.py
-rwxr-xr-x 1 root root 1.6K Nov 25 11:12 os_check_type1574716360.py
-rwxr-xr-x 1 root root 1.6K Nov 25 11:29 os_check_type1574717391.py
-rwxr-xr-x 1 root root 1.6K Nov 25 13:06 os_check_type1574723161.py
-rwxr-xr-x 1 root root 16K Nov 25 10:50 setupAgent1574715020.py
-rwxr-xr-x 1 root root 16K Nov 25 11:12 setupAgent1574716361.py
-rwxr-xr-x 1 root root 16K Nov 25 11:29 setupAgent1574717392.py
-rwxr-xr-x 1 root root 16K Nov 25 13:06 setupAgent1574723163.py
Notice the ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory. Not sure if this is normal, though.
Anyone know what could be causing this or any debugging hints from this point?
UPDATE 01:
After adding some log-printing lines near the offending final line in the error trace (i.e. File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages), I print the return code and stdout:
2
ambari-python-wrap: can't open file '/usr/bin/hdp-select': [Errno 2] No such file or directory
So what the heck? It wants hdp-select to already be there, but the Ambari add-host UI complains if I manually install that binary myself beforehand. When I do manually install it (using the same repo file as on the rest of the existing cluster nodes), all I see is...
0
Packages:
accumulo-client
accumulo-gc
accumulo-master
accumulo-monitor
accumulo-tablet
accumulo-tracer
atlas-client
atlas-server
beacon
beacon-client
beacon-server
druid-broker
druid-coordinator
druid-historical
druid-middlemanager
druid-overlord
druid-router
druid-superset
falcon-client
falcon-server
flume-server
hadoop-client
hadoop-hdfs-client
hadoop-hdfs-datanode
hadoop-hdfs-journalnode
hadoop-hdfs-namenode
hadoop-hdfs-nfs3
hadoop-hdfs-portmap
hadoop-hdfs-secondarynamenode
hadoop-hdfs-zkfc
hadoop-httpfs
hadoop-mapreduce-client
hadoop-mapreduce-historyserver
hadoop-yarn-client
hadoop-yarn-nodemanager
hadoop-yarn-registrydns
hadoop-yarn-resourcemanager
hadoop-yarn-timelinereader
hadoop-yarn-timelineserver
hbase-client
hbase-master
hbase-regionserver
hive-client
hive-metastore
hive-server2
hive-server2-hive
hive-server2-hive2
hive-webhcat
hive_warehouse_connector
kafka-broker
knox-server
livy-client
livy-server
livy2-client
livy2-server
mahout-client
oozie-client
oozie-server
phoenix-client
phoenix-server
pig-client
ranger-admin
ranger-kms
ranger-tagsync
ranger-usersync
shc
slider-client
spark-atlas-connector
spark-client
spark-historyserver
spark-schema-registry
spark-thriftserver
spark2-client
spark2-historyserver
spark2-thriftserver
spark_llap
sqoop-client
sqoop-server
storm-client
storm-nimbus
storm-slider-client
storm-supervisor
superset
tez-client
zeppelin-server
zookeeper-client
zookeeper-server
Aliases:
accumulo-server
all
client
hadoop-hdfs-server
hadoop-mapreduce-server
hadoop-yarn-server
hive-server
Command failed after 1 tries
UPDATE 02:
Adding some custom logging around File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 322 to print the values of err_msg, code, out, and err, i.e.
....
312   if throw_on_failure and not code in returns:
313     err_msg = Logger.filter_text("Execution of '{0}' returned {1}. {2}".format(command_alias, code, all_output))
314
315     #TODO remove
316     print("\n----------\nMY LOGS\n----------\n")
317     print(err_msg)
318     print(code)
319     print(out)
320     print(err)
321
322     raise ExecutionFailed(err_msg, code, out, err)
323
324   # if separate stderr is enabled (by default it's redirected to out)
325   if stderr == subprocess32.PIPE:
326     return code, out, err
327
328   return code, out
....
I see
Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
6
usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-816.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-816.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-26 10:25:46,928 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
So it seems like it is failing to create the hive user (even though it had no problem creating the yarn-ats user just before that).
After just giving in and trying to manually create the hive user myself, I see
[root@airflowetl ~]# useradd -g hadoop -s /bin/bash hive
useradd: user 'hive' already exists
[root@airflowetl ~]# cat /etc/passwd | grep hive
<nothing>
[root@airflowetl ~]# id hive
uid=379022825(hive) gid=379000513(domain users) groups=379000513(domain users)
The fact that this existing user's uid looks like this and is not in the /etc/passwd file made me think that there is some existing Active Directory user (which this client node syncs with via installed SSSD) that already has the name hive. Checking our AD users, this turned out to be true.
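A quick way to confirm that the name is coming from SSSD/AD rather than the local files database (a diagnostic sketch, assuming glibc's getent and an sss entry in nsswitch.conf) is to query the two sources separately:
getent -s files passwd hive    # local /etc/passwd only: prints nothing here
getent passwd hive             # all configured sources (files + sss): prints the AD-backed entry with the large uid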
Temporarily stopping the SSSD service to pause the sync with AD (service sssd stop) before rerunning the client host add in Ambari fixed the problem for me (I'm not sure whether you can make a server ignore AD syncs for an individual user).
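In other words, the workaround on the new client node was roughly the following (a sketch of the steps described above; starting SSSD again afterwards is implied but worth spelling out):
service sssd stop        # pause the AD/SSSD sync so Ambari can create the local 'hive' user
# ... re-run the Add Host / client install from the Ambari UI ...
service sssd start       # resume the AD sync once the host has been added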
I have created a Hadoop cluster using Apache Ambari 2.1.0 with 3 datanodes.
Now, when I try to add another datanode to the existing cluster, it throws this error:
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum
-d 0 -e 0 -y install 'hadoop_2_3_*'' returned 1. No Presto metadata available for base
Delta RPMs reduced 3.6 M of updates to 798 k (78% saved)
Here is my web UI console log:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 153, in
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 34, in install
self.install_packages(env, params.exclude_packages)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 376, in install_packages
Package(name)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in init
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/init.py", line 45, in action_install
self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
shell.checked_call(cmd, sudo=True, logoutput=self.get_logoutput())
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum
-d 0 -e 0 -y install 'hadoop_2_3_*'' returned 1. No Presto metadata available for base Delta RPMs reduced 3.6 M of updates to 798 k (78%
saved)
Error downloading packages:
hadoop_2_3_4_0_3485-yarn-proxyserver-2.7.1.2.3.4.0-3485.el6.x86_64:
[Errno 256] No more mirrors to try.
This looks like there are two issues with yum and your repositories.
First I see the message:
No Presto metadata available for base
Delta RPMs reduced 3.6 M of updates to 798 k (78% saved)
Try running the following command on the host that you are trying to add as a datanode to fix the first issue:
sudo yum clean all
Then see if you can perform this command successfully:
sudo yum -v install hadoop_2_3_*
If you get to the prompt that asks whether you want to install (y/n), then it was successful; choose the no option and retry the add-datanode action from Ambari. If you get an error or some other failure, take a look at the verbose output to troubleshoot the problem further.
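Putting that together, the sequence on the new datanode host would look roughly like this (a sketch; quoting the glob just keeps the shell from expanding it locally):
sudo yum clean all
sudo yum makecache                  # rebuild the repo metadata cache after the clean
sudo yum -v install 'hadoop_2_3_*'
# answer 'n' at the install prompt, then retry the Add DataNode action from Ambari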
I'm trying to run bokeh-server with supervisor with redis as a backend and I get this error message on startup:
Traceback (most recent call last):
File "/usr/share/nginx/test-status/flask/bin/bokeh-server", line 7, in <module>
bokeh.server.run()
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/__init__.py", line 175, in run
start_server(args)
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/__init__.py", line 179, in start_server
start.start_simple_server(args)
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/start.py", line 54, in start_simple_server
start_redis()
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/start.py", line 40, in start_redis
save=redis_save)
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/services.py", line 81, in start_redis
stdin=subprocess.PIPE
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/services.py", line 32, in __init__
self.add_to_pidfile()
File "/usr/share/nginx/test-status/flask/lib/python2.7/site-packages/bokeh/server/services.py", line 46, in add_to_pidfile
with open(self.pidfilename, "w+") as f:
IOError: [Errno 13] Permission denied: '/bokehpids.json'
Note that I can run the server with supervisor if I use memory as the backend, and I can run bokeh-server manually with redis as a backend just fine. Does anyone know where the permissions I should change lie?
Turns out it was trying to access the pidfile in the root directory...
I solved this by changing the directory in the supervisor config file:
[program:bokeh]
...
directory=/usr/share/nginx/test-status
...
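After changing the config, supervisor has to reload it before the new working directory takes effect; a typical sequence (assuming the program is named bokeh as in the snippet above) is:
supervisorctl reread           # pick up the edited config file
supervisorctl update           # apply the change to the bokeh program group
supervisorctl restart bokeh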