Apache Ambari: DataNode installation failed when adding a host to an existing cluster

I created a Hadoop cluster with 3 DataNodes using Apache Ambari 2.1.0.
Now, when I try to add another DataNode to the existing cluster, it throws this error:
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum
-d 0 -e 0 -y install 'hadoop_2_3_*'' returned 1. No Presto metadata available for base
Delta RPMs reduced 3.6 M of updates to 798 k (78% saved)
Here is my web UI console log:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 153, in <module>
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 34, in install
self.install_packages(env, params.exclude_packages)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 376, in install_packages
Package(name)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 45, in action_install
self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
shell.checked_call(cmd, sudo=True, logoutput=self.get_logoutput())
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum
-d 0 -e 0 -y install 'hadoop_2_3_*'' returned 1. No Presto metadata available for base Delta RPMs reduced 3.6 M of updates to 798 k (78%
saved)
Error downloading packages:
hadoop_2_3_4_0_3485-yarn-proxyserver-2.7.1.2.3.4.0-3485.el6.x86_64:
[Errno 256] No more mirrors to try.

It looks like there are two issues with yum and your repositories.
First, I see the message:
No Presto metadata available for base Delta RPMs reduced 3.6 M of
updates to 798 k (78% saved)
To fix the first issue, try running the following command on the host that you are trying to add as a DataNode:
sudo yum clean all
Then see whether this command runs successfully:
sudo yum -v install 'hadoop_2_3_*'
If you reach the prompt asking whether to install (y/N), the command succeeded: answer no, then retry the Add DataNode action from Ambari. If you get an error or other failure instead, check the verbose output to troubleshoot the problem further.
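When reading the yum output, it can help to separate the lines that are usually informational from the ones that report the actual failure. The following is a minimal sketch, not part of Ambari or yum: the function name `classify_yum_output` and the marker lists are assumptions built only from the output quoted in the question.

```python
# Hypothetical helper: the marker strings below are taken from the output
# quoted in the question; they are not an exhaustive list of yum messages.
INFO_MARKERS = ("No Presto metadata available", "Delta RPMs reduced")
ERROR_MARKERS = ("Error downloading packages", "No more mirrors to try")


def classify_yum_output(output):
    """Split yum output into informational lines and likely error lines."""
    info, errors = [], []
    for line in output.splitlines():
        text = line.strip()
        if not text:
            continue
        if any(marker in text for marker in ERROR_MARKERS):
            errors.append(text)
        elif any(marker in text for marker in INFO_MARKERS):
            info.append(text)
    return info, errors


sample = """No Presto metadata available for base
Delta RPMs reduced 3.6 M of updates to 798 k (78% saved)
Error downloading packages:
hadoop_2_3_4_0_3485-yarn-proxyserver-2.7.1.2.3.4.0-3485.el6.x86_64:
[Errno 256] No more mirrors to try."""

info_lines, error_lines = classify_yum_output(sample)
print("informational:", len(info_lines))
print("errors:", error_lines)
```

In the quoted log, the lines that matter for troubleshooting are the ones about downloading packages and mirrors, which point at repository reachability rather than at Ambari itself.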

Cannot install petsc4py on Ubuntu 20.04

After successfully installing and testing PETSc I went ahead and tried to install petsc4py with:
$ sudo python3 -m pip install petsc4py
but got loads of errors. Here are the messages:
Collecting petsc4py
Using cached petsc4py-3.16.1.tar.gz (2.3 MB)
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-yg5szcfl/petsc4py/setup.py'"'"'; __file__='"'"'/tmp/pip-install-yg5szcfl/petsc4py/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info
cwd: /tmp/pip-install-yg5szcfl/petsc4py/
Complete output (54 lines):
running egg_info
creating /tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info/petsc4py.egg-info
writing /tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info/petsc4py.egg-info/PKG-INFO
writing dependency_links to /tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info/petsc4py.egg-info/dependency_links.txt
writing requirements to /tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info/petsc4py.egg-info/requires.txt
writing top-level names to /tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info/petsc4py.egg-info/top_level.txt
writing manifest file '/tmp/pip-install-yg5szcfl/petsc4py/pip-egg-info/petsc4py.egg-info/SOURCES.txt'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-yg5szcfl/petsc4py/setup.py", line 289, in <module>
main()
File "/tmp/pip-install-yg5szcfl/petsc4py/setup.py", line 286, in main
run_setup()
File "/tmp/pip-install-yg5szcfl/petsc4py/setup.py", line 135, in run_setup
setup(packages = ['petsc4py',
File "/usr/local/lib/python3.8/dist-packages/setuptools/__init__.py", line 155, in setup
return distutils.core.setup(**attrs)
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/core.py", line 148, in setup
return run_commands(dist)
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/core.py", line 163, in run_commands
dist.run_commands()
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/dist.py", line 967, in run_commands
self.run_command(cmd)
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/dist.py", line 986, in run_command
cmd_obj.run()
File "/usr/local/lib/python3.8/dist-packages/setuptools/command/egg_info.py", line 298, in run
self.find_sources()
File "/usr/local/lib/python3.8/dist-packages/setuptools/command/egg_info.py", line 305, in find_sources
mm.run()
File "/usr/local/lib/python3.8/dist-packages/setuptools/command/egg_info.py", line 540, in run
self.add_defaults()
File "/usr/local/lib/python3.8/dist-packages/setuptools/command/egg_info.py", line 577, in add_defaults
sdist.add_defaults(self)
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/command/sdist.py", line 226, in add_defaults
self._add_defaults_python()
File "/usr/local/lib/python3.8/dist-packages/setuptools/command/sdist.py", line 111, in _add_defaults_python
build_py = self.get_finalized_command('build_py')
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/cmd.py", line 299, in get_finalized_command
cmd_obj.ensure_finalized()
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/cmd.py", line 107, in ensure_finalized
self.finalize_options()
File "/usr/local/lib/python3.8/dist-packages/setuptools/command/build_py.py", line 29, in finalize_options
orig.build_py.finalize_options(self)
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/command/build_py.py", line 43, in finalize_options
self.set_undefined_options('build',
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/cmd.py", line 287, in set_undefined_options
src_cmd_obj.ensure_finalized()
File "/usr/local/lib/python3.8/dist-packages/setuptools/_distutils/cmd.py", line 107, in ensure_finalized
self.finalize_options()
File "/tmp/pip-install-yg5szcfl/petsc4py/conf/baseconf.py", line 411, in finalize_options
self.petsc_dir = config.get_petsc_dir(self.petsc_dir)
File "/tmp/pip-install-yg5szcfl/petsc4py/conf/baseconf.py", line 349, in get_petsc_dir
petsc_dir = petsc.get_petsc_dir()
AttributeError: module 'petsc' has no attribute 'get_petsc_dir'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
There are so many error messages that I have no idea where to start. Any help? The installation should be straightforward, so I don't know what's going on. Trying to fix the package (or its setup.py) manually is futile. I'm using Python 3.8.10.
Any hints will be much appreciated.
The easiest solution I found for this issue, in one of my Dockerfiles, is simply to use apt-get install -y python3-petsc4py-complex.
(If you want or have the real version of PETSc, use apt-get install -y python3-petsc4py-real instead.)
For those who are interested, the same idea applies to SLEPc with apt-get install -y python3-slepc4py-complex.
Feel free to tell me whether it works! If not, I will try to solve this on a new Dockerfile.
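The last line of the traceback says that a module named `petsc` was imported but has no `get_petsc_dir` attribute, so one thing worth checking is which `petsc` module Python actually finds. The sketch below is only a diagnostic aid under that assumption; `inspect_module` is a hypothetical helper name, and the two names passed to it come from the traceback in the question.

```python
import importlib
import importlib.util


def inspect_module(name, attribute):
    """Report where `name` would be imported from and whether it has `attribute`."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "has_attribute": False}
    module = importlib.import_module(name)
    return {
        "found": True,
        "origin": spec.origin,  # file that the import resolves to
        "has_attribute": hasattr(module, attribute),
    }


if __name__ == "__main__":
    # Names taken from the traceback: module 'petsc', attribute 'get_petsc_dir'.
    print(inspect_module("petsc", "get_petsc_dir"))
```

If `found` is true but `has_attribute` is false, the `origin` path shows which file is shadowing the module that setup.py expected.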

Ambari cluster restart error: Timeline Service V2.0 Reader not restarting

I'm attempting to restart an Ambari-managed cluster and am getting errors when the Timeline Service V2.0 Reader service starts:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 108, in <module>
ApplicationTimelineReader().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 51, in start
hbase(action='start')
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 80, in hbase
createTables()
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 147, in createTables
logoutput=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
returns=self.resource.returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
raise ExecuteTimeoutException(err_msg)
resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/lib64/qt-3.3/bin:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/maven/bin:/root/bin:/opt/maven/bin:/opt/maven/bin:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.0.0-1634/hbase/bin/hbase --config /usr/hdp/3.0.0.0-1634/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds
I have not changed any configs or installed anything new before the restart attempt; I simply stopped the cluster services and attempted to restart them. I'm not sure what this error message means. Any debugging tips or fixes?
I found the solution in another community post: go to the host where the Timeline Reader is installed and install the HBase Client on that host.
Here is how I installed the HBase Client via the Ambari UI:
In the Ambari UI, go to Hosts, then click the host on which you want to install the HBase Client component.
In the list of components, you will have the option to add more.
From there I installed the HBase Client.
Then I stopped and restarted the cluster via the Ambari UI. (I got a notification of stale configs, though I'm not sure whether that was my problem all along or whether installing the HBase Client raised the stale-configs alert.)

Cannot run dask-mpi with Python 3.7 -- timeout when connecting client to dask-mpi scheduler

I'm attempting to run the Dask-MPI "Getting Started" (http://mpi.dask.org/en/latest/) example in a fresh Anaconda environment.
I set up an environment using
conda create -n dask-mpi -c conda-forge python=3.7 dask-mpi
conda activate dask-mpi
Inside the environment, I run
mpirun -np 4 dask-mpi --scheduler-file ./scheduler.json
Then, from a Python interpreter on the same machine (and in the same folder), I run
from dask.distributed import Client
client = Client(scheduler_file='/path/to/scheduler.json')
This results in the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/distributed/client.py", line 712, in __init__
self.start(timeout=timeout)
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/distributed/client.py", line 858, in start
sync(self.loop, self._start, **kwargs)
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/distributed/utils.py", line 331, in sync
six.reraise(*error[0])
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/distributed/utils.py", line 316, in f
result[0] = yield future
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/tornado/gen.py", line 729, in run
value = future.result()
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/tornado/gen.py", line 736, in run
yielded = self.gen.throw(*exc_info) # type: ignore
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/distributed/client.py", line 954, in _start
yield self._ensure_connected(timeout=timeout)
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/tornado/gen.py", line 729, in run
value = future.result()
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/tornado/gen.py", line 736, in run
yielded = self.gen.throw(*exc_info) # type: ignore
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/distributed/client.py", line 1015, in _ensure_connected
timedelta(seconds=timeout), self._update_scheduler_info()
File "/home/nleaf/anaconda3/envs/dask-mpi/lib/python3.7/site-packages/tornado/gen.py", line 729, in run
value = future.result()
tornado.util.TimeoutError: Timeout
The terminal that I ran dask-mpi from does not have any output which would indicate that something is trying to connect. I have verified that the port in question, 8786, is open. I've also verified via debugger that the client is getting the correct address from the scheduler file.
I've tried this in quite a few different environments and on a few different machines, including a fresh Ubuntu 18.04 docker container. I'm completely at a loss for what steps I might be missing.
It turns out this was due to an error in newer versions of dask.distributed (1.25.3) which broke the behavior of dask-mpi. This seems to be fixed as of dask-mpi 1.0.3 (https://github.com/dask/dask-mpi/releases/tag/1.0.3).
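The connectivity checks described in the question (reading the address from the scheduler file and confirming that the port accepts connections) can be scripted. This is a minimal sketch that assumes the scheduler file is JSON with an `address` field such as `tcp://127.0.0.1:8786`; the function names are hypothetical. It only rules out network problems and would not detect the version incompatibility described above.

```python
import json
import socket
from urllib.parse import urlsplit


def scheduler_endpoint(scheduler_file):
    """Return (host, port) from the "address" field of a Dask scheduler file."""
    with open(scheduler_file) as handle:
        info = json.load(handle)
    parts = urlsplit(info["address"])  # assumed form: "tcp://host:port"
    return parts.hostname, parts.port


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (the path is whatever was passed to --scheduler-file):
# host, port = scheduler_endpoint("./scheduler.json")
# print(host, port, can_connect(host, port))
```

If the TCP connection succeeds but `Client(...)` still times out, as in the question, the installed versions of dask-mpi and distributed are the next thing to compare.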

Pandas not installing in 64 bit windows

I recently installed a new version of Anaconda3 (4.2) on a Windows laptop.
All the other packages work fine, but pandas has not worked for me since day 1.
So I tried uninstalling it and installing a new version, pandas 0.19.
While using pip:
C:\Users\>python -m pip install --user pandas
Collecting pandas
Using cached pandas-0.19.2-cp35-cp35m-win_amd64.whl
Exception:
Traceback (most recent call last):
File "D:\Anaconda3-4.2\lib\site-packages\pip\basecommand.py", line 215, in main
status = self.run(options, args)
File "D:\Anaconda3-4.2\lib\site-packages\pip\commands\install.py", line 335, in run
wb.build(autobuilding=True)
File "D:\Anaconda3-4.2\lib\site-packages\pip\wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "D:\Anaconda3-4.2\lib\site-packages\pip\req\req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "D:\Anaconda3-4.2\lib\site-packages\pip\req\req_set.py", line 620, in _prepare_file
session=self.session, hashes=hashes)
File "D:\Anaconda3-4.2\lib\site-packages\pip\download.py", line 821, in unpack_url
hashes=hashes
File "D:\Anaconda3-4.2\lib\site-packages\pip\download.py", line 663, in unpack_http_url
unpack_file(from_path, location, content_type, link)
File "D:\Anaconda3-4.2\lib\site-packages\pip\utils\__init__.py", line 599, in unpack_file
flatten=not filename.endswith('.whl')
File "D:\Anaconda3-4.2\lib\site-packages\pip\utils\__init__.py", line 499, in unzip_file
fp = open(fn, 'wb')
PermissionError: [Errno 13] Permission denied: C:\\Users\\AppData\\Local\\Temp\\pip-build-h5ip5q8f\\pandas\\pandas/io/tests/data/blank.xls
While using conda:
An unexpected error has occurred.
Please consider posting the following information to the
conda GitHub issue tracker at:
https://github.com/conda/conda/issues
Current conda install:
platform : win-64
conda version : 4.3.8
conda is private : False
conda-env version : 4.3.8
conda-build version : 2.0.2
python version : 3.5.2.final.0
requests version : 2.12.4
root environment : D:\Anaconda3-4.2 (writable)
default environment : D:\Anaconda3-4.2
envs directories : D:\Anaconda3-4.2\envs
package cache : D:\Anaconda3-4.2\pkgs
channel URLs : https://repo.continuum.io/pkgs/free/win-64
https://repo.continuum.io/pkgs/free/noarch
https://repo.continuum.io/pkgs/r/win-64
https://repo.continuum.io/pkgs/r/noarch
https://repo.continuum.io/pkgs/pro/win-64
https://repo.continuum.io/pkgs/pro/noarch
https://repo.continuum.io/pkgs/msys2/win-64
https://repo.continuum.io/pkgs/msys2/noarch
config file : None
offline mode : False
user-agent : conda/4.3.8 requests/2.12.4 CPython/3.5.2 Windows/7
Windows/6.1.7601
D:\Anaconda3-4.2\Scripts\conda-script.py install pandas
Traceback (most recent call last):
File "D:\Anaconda3-4.2\lib\site-packages\conda\exceptions.py", line 617, in conda_exception_handler
return_value = func(*args, **kwargs)
File "D:\Anaconda3-4.2\lib\site-packages\conda\cli\main.py", line 137, in _main
exit_code = args.func(args, p)
File "D:\Anaconda3-4.2\lib\site-packages\conda\cli\main_install.py", line 80, in execute
install(args, parser, 'install')
File "D:\Anaconda3-4.2\lib\site-packages\conda\cli\install.py", line 347, in install
execute_actions(actions, index, verbose=not context.quiet)
File "D:\Anaconda3-4.2\lib\site-packages\conda\plan.py", line 837, in execute_actions
execute_instructions(plan, index, verbose)
File "D:\Anaconda3-4.2\lib\site-packages\conda\instructions.py", line 258, in execute_instructions
cmd(state, arg)
File "D:\Anaconda3-4.2\lib\site-packages\conda\instructions.py", line 118, in UNLINKLINKTRANSACTION_CMD
txn = UnlinkLinkTransaction.create_from_dists(index, prefix, unlink_dists, link_dists)
File "D:\Anaconda3-4.2\lib\site-packages\conda\core\link.py", line 121, in create_from_dists
for dist, pkg_dir in zip(link_dists, pkg_dirs_to_link))
File "D:\Anaconda3-4.2\lib\site-packages\conda\core\link.py", line 121, in <genexpr>
for dist, pkg_dir in zip(link_dists, pkg_dirs_to_link))
File "D:\Anaconda3-4.2\lib\site-packages\conda\gateways\disk\read.py", line 71, in read_package_info
index_json_record = read_index_json(extracted_package_directory)
File "D:\Anaconda3-4.2\lib\site-packages\conda\gateways\disk\read.py", line 94, in read_index_json
with open(join(extracted_package_directory, 'info', 'index.json')) as fi:
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\Anaconda3-4.2\\pkgs\\pandas-0.19.2-np111py35_1\\info\\index.json
The problem is that you are trying to install pandas as a regular user, and therefore cannot change the contents of the Anaconda installation folder (where installed Python packages live). You need to run CMD or PowerShell (whichever you are using) as an administrator. Right-click its shortcut in the Start menu and click "Run as administrator", then run the same command again.
Try using the Anaconda package manager, conda, instead of pip:
C:\> conda install pandas
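Before switching to an elevated prompt, one way to test the permissions explanation is to probe whether the current user can create files in the directories involved. The sketch below is an illustration only; `is_writable` is a hypothetical helper, and probing `sys.prefix` and the temp directory is an assumption about which locations matter here.

```python
import sys
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created inside `directory`."""
    try:
        # The file is deleted automatically when the context exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:  # includes PermissionError and FileNotFoundError
        return False


# sys.prefix is the root of the running Python installation
# (D:\Anaconda3-4.2 in the question).
print(sys.prefix, is_writable(sys.prefix))
print(tempfile.gettempdir(), is_writable(tempfile.gettempdir()))
```

A `False` for either location would be consistent with the `PermissionError` in the pip output.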

Yum broken on RHEL 6

Using 64-bit RHEL 6, receiving this error from Yum:
[root /]# yum install [package_name]
---Start Error---
Traceback (most recent call last):
File "/usr/bin/yum", line 29, in <module>
yummain.user_main(sys.argv[1:], exit_code=True)
File "/usr/share/yum-cli/yummain.py", line 288, in user_main
errcode = main(args)
File "/usr/share/yum-cli/yummain.py", line 140, in main
result, resultmsgs = base.doCommands()
File "/usr/share/yum-cli/cli.py", line 436, in doCommands
self._getTs(needTsRemove)
File "/usr/lib/python2.6/site-packages/yum/depsolve.py", line 99, in _getTs
self._getTsInfo(remove_only)
File "/usr/lib/python2.6/site-packages/yum/depsolve.py", line 110, in _getTsInfo
pkgSack = self.pkgSack
File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 887, in <lambda>
pkgSack = property(fget=lambda self: self._getSacks(),
File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 669, in _getSacks
self.repos.populateSack(which=repos)
File "/usr/lib/python2.6/site-packages/yum/repos.py", line 308, in populateSack
sack.populate(repo, mdtype, callback, cacheonly)
File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 187, in populate
dobj = repo_cache_function(xml, csum)
File "/usr/lib64/python2.6/site-packages/sqlitecachec.py", line 46, in getPrimary
self.repoid))
TypeError: Parsing primary.xml error: Start tag expected, '<' not found
---End Error---
This just started today. It was working fine a couple of days ago, and I haven't installed anything on this system since I last used it.
I have already rebuilt Python 2.6 and Yum 3.4.3, but I still get the same errors as above. Any ideas?
Clear the repo cache and rebuild it:
yum clean all
yum update
Alternatively, run this:
sudo su
export LD_LIBRARY_PATH=/usr/lib64:/usr/local/lib
yum clean all
yum update yum
I think this fixes it. It worked for me.
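The message "Start tag expected, '<' not found" suggests that a cached metadata file is not valid XML, which fits with clearing the cache as a fix. To check a specific file before deleting it, a sketch like the following may help. It assumes the file is either plain or gzip-compressed XML, and `starts_like_xml` is a hypothetical helper name; the location of the cached file on your system is not given in the question.

```python
import gzip


def starts_like_xml(path):
    """Return True if the file's first non-whitespace byte is '<'."""
    with open(path, "rb") as raw:
        magic = raw.read(2)
    # Gzip files begin with the two bytes 0x1f 0x8b.
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rb") as handle:
        head = handle.read(512)
    return head.lstrip().startswith(b"<")
```

A result of `False` for a cached primary.xml file would match the parser error in the traceback.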