Attempting to restart an Ambari-managed cluster and getting errors when the Timeline Service V2.0 Reader service starts:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 108, in <module>
ApplicationTimelineReader().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 51, in start
hbase(action='start')
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 80, in hbase
createTables()
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 147, in createTables
logoutput=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
returns=self.resource.returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
raise ExecuteTimeoutException(err_msg)
resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/lib64/qt-3.3/bin:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/maven/bin:/root/bin:/opt/maven/bin:/opt/maven/bin:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.0.0-1634/hbase/bin/hbase --config /usr/hdp/3.0.0.0-1634/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds
I have not changed any configs or installed anything new between stopping and restarting; I simply stopped the cluster services and attempted to restart them. I'm not sure what this error message means. Any debugging tips or fixes?
Found the solution on another community post.
Navigate to the host where the Timeline Reader is installed and install the HBase Client on that host.
Here is how I installed the HBase Client via the Ambari UI...
In the Ambari UI, go to Hosts, then click the host you want to install the HBase Client component on.
In the list of components, you will have the option to add more, see...
From here I installed the HBase Client.
Then I stopped and restarted the cluster via the Ambari UI. (I got a notification of stale configs, though I'm not sure if this was my problem all along or if installing the HBase Client raised the stale-configs alert.)
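For reference, the same thing can be done without the UI through Ambari's REST API. A minimal sketch, where the credentials (admin:admin), Ambari server (ambari-host), cluster name (MyCluster) and target host (node1) are all placeholders for your own values:
# register the HBASE_CLIENT component on the target host
curl -u admin:admin -H 'X-Requested-By: ambari' -X POST http://ambari-host:8080/api/v1/clusters/MyCluster/hosts/node1/host_components/HBASE_CLIENT
# tell Ambari to install the newly registered component
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"HostRoles": {"state": "INSTALLED"}}' http://ambari-host:8080/api/v1/clusters/MyCluster/hosts/node1/host_components/HBASE_CLIENT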
I have a Redis server that I am trying to connect to from Mule 4 applications.
My objectives are:
Connect to Redis using a Mule 4 app: success.
Connect to Redis using RedisInsight to visualise the data: problem.
While connecting using RedisInsight I do the following:
Launch the RedisInsight tool. It starts at http://localhost:8001/.
Click on "I already have a database"
Click on "Connect to a Redis Database"
Here I provide the host, port, name (which, as per the documentation, can be anything, say redis_test) and password.
I get the error message: "Something went wrong adding the database. Please try again"
Interestingly, when connecting from Mule I just need to provide the host, port and password, and it works.
Please help. Thanks in advance.
From the RedisInsight logs:
ERROR 2021-02-09 14:14:20,123 django.request Internal Server Error: /api/instance/
Traceback (most recent call last):
File "django\core\handlers\exception.py", line 34, in inner
File "django\core\handlers\base.py", line 115, in _get_response
File "django\core\handlers\base.py", line 113, in _get_response
File "django\views\decorators\csrf.py", line 54, in wrapped_view
File "django\views\generic\base.py", line 71, in view
File "rest_framework\views.py", line 495, in dispatch
File "rest_framework\views.py", line 455, in handle_exception
File "rest_framework\views.py", line 492, in dispatch
File "redisinsight\core\views\instance.py", line 208, in post
File "redisinsight\core\views\instance.py", line 147, in _save_redis_instance
File "redisinsight\core\services\database\_routines.py", line 80, in _wrapped_add_db_func
File "redisinsight\core\services\database\_routines.py", line 765, in add_redis_database
File "redisinsight\core\services\database\_routines.py", line 809, in add_standalone_db
File "redisinsight\core\services\database\_routines.py", line 576, in _add_standalone_db
File "redisinsight\core\services\database\_routines.py", line 190, in _assert_db_type
File "redisinsight\core\services\database\_routines.py", line 175, in _probe_db_type
File "redis\client.py", line 1281, in info
File "redis\client.py", line 878, in execute_command
File "redis\client.py", line 892, in parse_response
File "redis\connection.py", line 752, in read_response
redis.exceptions.ResponseError: unknown command `INFO`, with args beginning with:
It looks like the INFO command is disabled on your Redis server. RedisInsight needs basic commands like INFO and PING to be enabled.
To re-enable the INFO command:
Edit the Redis config file:
sudo nano /etc/redis/redis.conf
Search for the line that renames the INFO command,
something like:
rename-command INFO ""
Comment out that line and restart Redis:
sudo systemctl restart redis
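Afterwards, you can verify the command is reachable again with redis-cli (the host, port and password below are placeholders, adjust them to your setup):
# should return PONG
redis-cli -h 127.0.0.1 -p 6379 -a yourpassword PING
# should print the server section of the INFO output
redis-cli -h 127.0.0.1 -p 6379 -a yourpassword INFO server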
I am following the tutorial for deploying the Inception model using TensorFlow Serving. I am using Ubuntu 16.04 and Bazel 13.0. The server is running and I am able to ping the server. But when I upload a pic, it shows the following error:
jennings@Jennings:~/serving$ bazel-bin/tensorflow_serving/example/inception_client --server=localhost:9000 --image=./Xiang_Xiang_panda.jpg
Traceback (most recent call last):
File "/home/jennings/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 56, in <module>
tf.app.run()
File "/home/jennings/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/jennings/serving/bazel-bin/tensorflow_serving/example/inception_client.runfiles/tf_serving/tensorflow_serving/example/inception_client.py", line 51, in main
result = stub.Predict(request, 10.0) # 10 secs timeout
File "/home/jennings/.local/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
self._request_serializer, self._response_deserializer)
File "/home/jennings/.local/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.ExpirationError: ExpirationError(code=StatusCode.DEADLINE_EXCEEDED, details="Deadline Exceeded")
This happens when the TensorFlow Serving client is not able to communicate with the server, or it might be due to a network error. If you are using Docker to host your TensorFlow model server, you need to publish the serving port from the container, and the host port you publish must match the port in the client's --server flag (9000 in your command), e.g.:
docker run --name=tensorflow_container -p 9000:9000 -it $USER/tensorflow-serving-devel
Let me know if this works. Have a good one.
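Before rebuilding anything, it may also help to confirm that the serving port is actually reachable from the client machine. A quick check, assuming the localhost:9000 endpoint from the client command above:
# succeeds only if something is listening on the port
nc -zv localhost 9000
# if the server runs in Docker, list the ports the container actually publishes
docker port tensorflow_container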
I am trying to get Let's Encrypt SSL certificates installed on a CentOS 6 server using Certbot-Auto, however no matter what I try, it just hangs on:
Apache version is 2.2.15
Command
./certbot-auto -v
When I press CTRL + C to exit the program, it takes about 15 seconds and then exits with a stack trace:
Exiting abnormally:
Traceback (most recent call last):
File "/opt/eff.org/certbot/venv/bin/letsencrypt", line 9, in <module>
load_entry_point('letsencrypt==0.7.0', 'console_scripts', 'letsencrypt')()
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/main.py", line 1240, in main
return config.func(config, plugins)
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/main.py", line 981, in run
installer, authenticator = plug_sel.choose_configurator_plugins(config, plugins, "run")
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/plugins/selection.py", line 189, in choose_configurator_plugins
authenticator = installer = pick_configurator(config, req_inst, plugins)
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/plugins/selection.py", line 25, in pick_configurator
(interfaces.IAuthenticator, interfaces.IInstaller))
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/plugins/selection.py", line 77, in pick_plugin
verified.prepare()
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/plugins/disco.py", line 248, in prepare
return [plugin_ep.prepare() for plugin_ep in six.itervalues(self._plugins)]
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/plugins/disco.py", line 248, in <listcomp>
return [plugin_ep.prepare() for plugin_ep in six.itervalues(self._plugins)]
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot/plugins/disco.py", line 130, in prepare
self._initialized.prepare()
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot_apache/configurator.py", line 225, in prepare
self.parser = self.get_parser()
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot_apache/override_centos.py", line 39, in get_parser
self.version, configurator=self)
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot_apache/override_centos.py", line 47, in __init__
super(CentOSParser, self).__init__(*args, **kwargs)
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot_apache/parser.py", line 74, in __init__
if self.find_dir("Define", exclude=False):
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/certbot_apache/parser.py", line 401, in find_dir
"%s//*[self::directive=~regexp('%s')]" % (start, regex))
File "/opt/eff.org/certbot/venv/lib64/python3.4/site-packages/augeas.py", line 413, in match
ctypes.byref(array))
KeyboardInterrupt
Please see the logfiles in /var/log/letsencrypt for more details.
I thought it might be a Python version issue, but when I checked, the server is running Python 2.6.6 which, according to the Certbot System Requirements, is acceptable.
Letsencrypt.log
When I checked the log, it contained exactly the same stack trace the script reported above.
Any ideas?
I am using HDP 2.4.2 and I had previously installed the Zeppelin server. It was working fine, but today when I restarted the cluster (the AWS nodes were restarted), Ambari shows that the Zeppelin server is not running and fails to start it with the following error:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/2.4/services/ZEPPELIN/package/scripts/master.py", line 235, in <module>
Master().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/2.4/services/ZEPPELIN/package/scripts/master.py", line 169, in start
+ params.zeppelin_log_file, user=params.zeppelin_user)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/hdp/current/zeppelin-server/lib/bin/zeppelin-daemon.sh start >> /var/log/zeppelin/zeppelin-setup.log' returned 1. /usr/hdp/current/zeppelin-server/lib/bin/zeppelin-daemon.sh: line 187: /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid: Permission denied
cat: /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid: No such file or directory
In the Zeppelin logs:
ERROR [2016-06-06 03:20:36,714] ({main} VFSNotebookRepo.java[list]:140) - Can't read note file:///usr/hdp/current/zeppelin-server/lib/notebook/screenshots java.io.IOException: file:///usr/hdp/current/zeppelin-server/lib/notebook/screenshots/note.json not found
ERROR [2016-06-06 03:34:12,795] ({main} Notebook.java[loadNoteFromRepo]:330) - Failed to load 2BHU1G67J java.io.IOException: file:///usr/hdp/current/zeppelin-server/lib/notebook/2BHU1G67J is not a directory
But for some reason the Zeppelin port is listening and, despite these errors, the Zeppelin server is running fine and executing all the queries. Please advise on how to correct the issue in Ambari and start the service without errors.
The problem is with the PID file for the Zeppelin service. It's either owned by the wrong user or has the wrong permissions. Manually stop the Zeppelin service, then delete the PID file located at /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid. Double-check the owner/permissions on the /var/run/zeppelin-notebook folder as well. You should then be able to restart the service from the Ambari UI.
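A minimal sketch of those steps on the affected host, assuming the service runs as a zeppelin user (adjust the user/group to whatever actually owns the Zeppelin process on your node):
# stop Zeppelin with the daemon script from the error message
sudo /usr/hdp/current/zeppelin-server/lib/bin/zeppelin-daemon.sh stop
# remove the stale PID file named in the error
sudo rm -f /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid
# make sure the PID directory is owned by the service user
sudo chown -R zeppelin:zeppelin /var/run/zeppelin-notebook
After that, restarting the service from the Ambari UI should succeed.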
I have created a Hadoop cluster with 3 datanodes using Apache Ambari 2.1.0.
Now when I try to add another datanode to the existing cluster, it throws the following error:
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum
-d 0 -e 0 -y install 'hadoop_2_3_*'' returned 1. No Presto metadata available for base
Delta RPMs reduced 3.6 M of updates to 798 k (78% saved)
Here is my web UI console log:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 153, in <module>
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 34, in install
self.install_packages(env, params.exclude_packages)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 376, in install_packages
Package(name)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 45, in action_install
self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
shell.checked_call(cmd, sudo=True, logoutput=self.get_logoutput())
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum
-d 0 -e 0 -y install 'hadoop_2_3_*'' returned 1. No Presto metadata available for base Delta RPMs reduced 3.6 M of updates to 798 k (78%
saved)
Error downloading packages:
hadoop_2_3_4_0_3485-yarn-proxyserver-2.7.1.2.3.4.0-3485.el6.x86_64:
[Errno 256] No more mirrors to try.
This looks like there are two issues with yum and your repositories.
First I see the message:
No Presto metadata available for base Delta RPMs reduced 3.6 M of
updates to 798 k (78% saved)
Try running the following command on the host that you are trying to add as a datanode to fix the first issue:
sudo yum clean all
Then see if you can perform this command successfully:
sudo yum -v install 'hadoop_2_3_*'
If you get to the prompt that asks whether you want to install (y/n), then it was successful; choose the no option and retry the add-datanode action from Ambari. If you get an error or some other failure, take a look at the verbose output to troubleshoot the problem further.
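The "No Presto metadata" line comes from yum's delta-RPM (presto/deltarpm) machinery. If it keeps getting in the way even after cleaning the cache, a commonly suggested workaround is to disable delta RPMs entirely; this assumes a stock CentOS/RHEL setup where /etc/yum.conf is the main yum config file:
# turn off delta RPMs system-wide
echo 'deltarpm=0' | sudo tee -a /etc/yum.conf
Then re-run the yum test above and retry the add-host action from Ambari.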