yarn logs throws an error of "not valid bcfile" - hadoop-yarn

HDP version: 3.1.0.78, using ambari 2.7.3 to install.
yarn version: 3.1.1
The application is finished with failed status.
Here is the complete log:
yarn logs -applicationId application_1565920695836_0005
19/08/16 11:49:57 INFO client.AHSProxy: Connecting to Application History server at dn1003.jja.bigo/10.221.117.44:10300
Not a valid BCFile.
Can not find any log file matching the pattern: [ALL] for the application: application_1565920695836_0005
Can not find the logs for the application: application_1565920695836_0005 with the appOwner: hadoop
Thanks for your reply.
UPDATE:
After using debug with export YARN_ROOT_LOGGER="DEBUG,console", it says "/data1/app-logs/hadoop/logs/application_1566555356033_0034/meta" is not valid bcfile.

Related

Hyperledger Fabric error: "TLS: bad certificate server" when installing chaincode

I'm just starting learning HLF, and I have an error while following tutorial from the docs: link
I downloaded fabric-samples using this command (replaced bit.ly link with the destination):
curl -sSL https://raw.githubusercontent.com/hyperledger/fabric/master/scripts/bootstrap.sh | bash -s -- 2.2.2 1.4.9
I run logspout in one terminal and try to execute peer lifecycle chaincode install basic.tar.gz in another one, and this is the result i get
Error: failed to retrieve endorser client for install: endorser client
failed to connect to localhost:7051: failed to create new connection:
context deadline exceeded
Log presented by Logspout:
peer0.org1.example.com|2022-03-15 13:03:24.452 UTC [core.comm]
ServerHandshake -> ERRO 04a Server TLS handshake failed in 2.650245ms
with error remote error: tls: bad certificate server=PeerServer
remoteaddress=172.22.0.1:61126
I set the envs in terminal as instructed in the docs, and I checked that CORE_PEER_TLS_ROOTCERT_FILE variable points to an existing file. The content of the file is the same as on the container.
What I tried to do:
download fabric-samples again and redo all the setup with copy-pasting the commands directly from docs
Do you have any suggestions where I can look for an issue?
I resolved the problem, I was using peer version 2.2.1 from previous experiments, it probably collided with FABRIC_CFG_PATH

Install Ambari, can't download hortonworks HDP from amazon S3

I'm tring to install apache ambari on my wsl(ubuntu 20.04) as the Ambari User Guides step by step. while install and packing the project to deb files use command:
mvn -B clean install jdeb:jdeb -DnewVersion=2.7.5.0.0 -DbuildNumber=5895e4ed6b30a2da8a90fee2403b6cab91d19972 -DskipTests -Dpython.ver="python >= 2.6" .
got this error
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (default) on project ambari-metrics-timelineservice: An Ant BuildException has occured: Can't get https://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/3.x/BUILDS/3.1.4.0-315/tars/hbase/hbase-2.0.2.3.1.4.0-315-bin.tar.gz to /root/apache-ambari-2.7.5-src/ambari-metrics/ambari-metrics-timelineservice/target/embedded/hbase.tar.gz
[ERROR] around Ant part ...<get usetimestamp="true" src="https://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/3.x/BUILDS/3.1.4.0-315/tars/hbase/hbase-2.0.2.3.1.4.0-315-bin.tar.gz" dest="/root/apache-ambari-2.7.5-src/ambari-metrics/ambari-metrics-timelineservice/target/embedded/hbase.tar.gz"/>... # 5:273 in /root/apache-ambari-2.7.5-src/ambari-metrics/ambari-metrics-timelineservice/target/antrun/build-Download HBase.xml
apache-ambari-2.7.5-src/ambari-metrics/pom.xml defined the nortonworks HDP sources:
<hbase.tar>http://dev.hortonworks.com.s3.amazonaws.com/HDP/centos7/3.x/BUILDS/3.0.0.0-1634/tars/hbase/hbase-2.0.0.3.0.0.0-1634-bin.tar.gz</hbase.tar>
<hbase.folder>hbase-2.0.0.3.0.0.0-1634</hbase.folder>
<hadoop.tar>http://dev.hortonworks.com.s3.amazonaws.com/HDP/centos7/3.x/BUILDS/3.0.0.0-1634/tars/hadoop/hadoop-3.1.0.3.0.0.0-1634.tar.gz</hadoop.tar>
<hadoop.folder>hadoop-3.1.0.3.0.0.0-1634</hadoop.folder>
<phoenix.tar>http://dev.hortonworks.com.s3.amazonaws.com/HDP/centos7/3.x/BUILDS/3.0.0.0-1634/tars/phoenix/phoenix-5.0.0.3.0.0.0-1634.tar.gz</phoenix.tar>
<phoenix.folder>phoenix-5.0.0.3.0.0.0-1634</phoenix.folder>
I tried to download official Hbase-2.3.2, Hadoop-3.3.0,phoenix-5.0.0-HBase-2.0 to instead hotonworks HDP, but failed and got an other error.
I tried to download hortonworks HDP directly use wget and got:
Resolving dev.hortonworks.com.s3.amazonaws.com (dev.hortonworks.com.s3.amazonaws.com)... 52.217.40.204
Connecting to dev.hortonworks.com.s3.amazonaws.com
(dev.hortonworks.com.s3.amazonaws.com)|52.217.40.204|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
where/how can i download these hortonworks HDP files and continue to install ambari?
This solution from the Apache Ambari team fixed this and other similar issues with HDP urls: https://github.com/apache/ambari/pull/3283
You need to make the corresponding code changes: https://github.com/apache/ambari/pull/3283/commits/3dca705f831383274a78a8c981ac2b12e2ecce85
replace the downlink with
<hbase.tar>https://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/3.x/BUILDS/3.1.4.1-1/tars/hbase/hbase-2.0.2.3.1.4.1-1-bin.tar.gz</hbase.tar>
<hbase.tar>https://private-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.1-1/tars/hbase/hbase-2.0.2.3.1.4.1-1-bin.tar.gz</hbase.tar>
<hbase.folder>hbase-2.0.2.3.1.4.1-1</hbase.folder>
<hadoop.tar>https://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/3.x/BUILDS/3.1.4.1-1/tars/hadoop/hadoop-3.1.1.3.1.4.1-1.tar.gz</hadoop.tar>
<hadoop.tar>https://private-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.1-1/tars/hadoop/hadoop-3.1.1.3.1.4.1-1.tar.gz</hadoop.tar>
<hadoop.folder>hadoop-3.1.1.3.1.4.1-1</hadoop.folder>
<grafana.folder>grafana-6.4.2</grafana.folder>
<grafana.tar>https://dl.grafana.com/oss/release/grafana-6.4.2.linux-amd64.tar.gz</grafana.tar>
<phoenix.tar>https://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/3.x/BUILDS/3.1.4.1-1/tars/phoenix/phoenix-5.0.0.3.1.4.1-1.tar.gz</phoenix.tar>
<phoenix.tar>https://private-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.4.1-1/tars/phoenix/phoenix-5.0.0.3.1.4.1-1.tar.gz</phoenix.tar>
<phoenix.folder>phoenix-5.0.0.3.1.4.1-1</phoenix.folder>
edit file ambari-metrics/pom.xml and replace download link, for example I changed
https://s3.amazonaws.com/dev.hortonworks.com/HDP/centos7/3.x/BUILDS/3.1.4.0-315/tars/hbase/hbase-2.0.2.3.1.4.0-315-bin.tar.gz
for
https://downloads.apache.org/hbase/2.3.5/hbase-2.3.5-src.tar.gz

Unable to provision rabbitmq using chef and testkitchen

I am trying to install an old version of RabbitMQ using Chef (cookbook 'rabbitmq', '~> 5.8.5') and Kitchen, below my configuration:
Attributes
#Erlang
default['erlang']['install_method'] = 'source'
default['erlang']['source']['version']='R13B03'
default['erlang']['source']['checksum']='e7c46c8b2778f22064a3b369c1a1b572a1cc0e8a2198166858d4b9a1b488d662'
#RabbitMQ
default['rabbitmq']['erlang']['enabled'] = true
default['rabbitmq']['version'] = "3.4.4"
default['rabbitmq']['rpm_package'] ='rabbitmq-server-3.4.4-1.noarch.rpm'
Recipe:
include_recipe 'rabbitmq::default'
When I run kitchen converge, I am getting the following exception:
Running handlers:
[2020-08-22T22:20:07+00:00] ERROR: Running exception handlers
Running handlers complete
[2020-08-22T22:20:07+00:00] ERROR: Exception handlers complete
Chef Infra Client failed. 9 resources updated in 06 minutes 26 seconds
[2020-08-22T22:20:07+00:00] FATAL: Stacktrace dumped to /tmp/kitchen/cache/chef-stacktrace.out
[2020-08-22T22:20:07+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2020-08-22T22:20:07+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: rpm_package[/tmp/kitchen/cache/rabbitmq-server-3.4.4-1.noarch.rpm] (rabbitmq::default line 224) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of ["rpm", "-i", "/tmp/kitchen/cache/rabbitmq-server-3.4.4-1.noarch.rpm"] ----
STDOUT:
STDERR: warning: /tmp/kitchen/cache/rabbitmq-server-3.4.4-1.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID 056e8e56: NOKEY
error: Failed dependencies:
erlang >= R13B-03 is needed by rabbitmq-server-3.4.4-1.noarch
---- End output of ["rpm", "-i", "/tmp/kitchen/cache/rabbitmq-server-3.4.4-1.noarch.rpm"] ----
Ran ["rpm", "-i", "/tmp/kitchen/cache/rabbitmq-server-3.4.4-1.noarch.rpm"] returned 1
But when I logged in to the VM, I can see erlang is installed:
[vagrant#kitchen-rmq-server-centos-7 ~]$ erl
Erlang R13B03 (erts-5.7.4) [source] [64-bit] [rq:1] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.7.4 (abort with ^G)
1>
And it is the same version required by RMQ (R13B03)
Any idea how to solve this issue?
Edit: to replicate the issue https://github.com/Proximator/chef-rmq
Firstly, we have to make sure erlang is installed by the rabbitmq cookbook, and not by any other means. This is the note found on Chef supermarket for rabbitmq cookbook:
The packages are cannot be installed alongside with other Erlang packages, for example, those from standard Debian repositories or Erlang Solutions.
To make sure that the Erlang cookbook is not used by rabbitmq::default
Also, there is a compatibility matrix of RabbitMQ and Erlang versions. RabbitMQ 3.7.0 being the lowest supported version, for which the lowest compatible Erlang version is 19.3.
There are zero dependency Erlang RPMs "just enough to run RabbitMQ" as documented here:
https://github.com/rabbitmq/erlang-rpm
For example - to install RabbitMQ 3.7.x with the compatible Erlang 19.3.x:
You should have these attributes:
default['rabbitmq']['erlang']['enabled'] = true
default['rabbitmq']['version'] = '3.7.6'
default['rabbitmq']['erlang']['yum']['baseurl'] = 'https://dl.bintray.com/rabbitmq-erlang/rpm/erlang/19/el/7'
default['rabbitmq']['erlang']['version'] = '19.3.6.13'
Then include below recipes:
include_recipe 'rabbitmq::erlang_package'
include_recipe 'rabbitmq::default'

Yarn Launch Container Failed with Privilege Issue

Stack: Ambari 2.4.2.0, HDP 2.5.3.0, CentOS 6.8, FreeIPA 3.0.0
When I try to use hdp user to submit a job on yarn, _000001 container can be created and launched successfully, but I got error when _000002 container is being launched after container created:
2018-11-27 22:13:35,919 WARN privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(170)) - Shell execution returned exit code: 255. Privileged Execution Operation Output:
main : command provided 1
main : run as user is hdp
main : requested yarn user is hdp
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...
Getting exit code file...
Creating script paths...
Full command array for failed execution:
[/usr/hdp/current/hadoop-yarn-nodemanager/bin/container-executor, hdp, hdp, 1, application_1543327888220_0001, container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/usercache/hdp/appcache/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/launch_container.sh, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.tokens, /hadoop/yarn/local/nmPrivate/application_1543327888220_0001/container_e14_1543327888220_0001_01_000002/container_e14_1543327888220_0001_01_000002.pid, /hadoop/yarn/local, /hadoop/yarn/log, cgroups=none]
2018-11-27 22:13:35,921 WARN runtime.DefaultLinuxContainerRuntime (DefaultLinuxContainerRuntime.java:launchContainer(107)) - Launch container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:175)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:103)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:89)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:392)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
at org.apache.hadoop.util.Shell.run(Shell.java:844)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
... 9 more
There is no more log about Privilege, anybody has some idea?
Thanks in advance!
Problem resolved and the problem is submitted job itself not YARN/Privilege.
Suggestion is that you'd better try to find details in container log not resourcemanager/nodemanager log.

Getting BlazeInfoException in IntelliJ from the Bazel plugin

I just installed the Bazel plugin for IntelliJ, and I keep getting this exception:
com.google.idea.blaze.base.command.info.BlazeInfoException: blaze info failed with exit code: -1
java.util.concurrent.ExecutionException:
com.google.idea.blaze.base.command.info.BlazeInfoException: blaze info failed with exit code: -1
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:502)
[...]
at com.google.idea.blaze.base.async.FutureUtil$Builder.lambda$run$0(FutureUtil.java:93)
[...]
at com.intellij.openapi.progress.impl.CoreProgressManager.lambda$runProcess$1(CoreProgressManager.java:170)
at com.intellij.openapi.progress.impl.CoreProgressManager.registerIndicatorAndRun(CoreProgressManager.java:548)
at com.intellij.openapi.progress.impl.CoreProgressManager.executeProcessUnderProgress(CoreProgressManager.java:493)
at com.intellij.openapi.progress.impl.ProgressManagerImpl.executeProcessUnderProgress(ProgressManagerImpl.java:94)
at com.intellij.openapi.progress.impl.CoreProgressManager.runProcess(CoreProgressManager.java:157)
at com.google.idea.blaze.base.async.executor.BlazeExecutor$3.call(BlazeExecutor.java:108)
at com.google.idea.blaze.base.async.executor.BlazeExecutor$3.call(BlazeExecutor.java:105)
[...]
Caused by: com.google.idea.blaze.base.command.info.BlazeInfoException: blaze info failed with exit code: -1
at com.google.idea.blaze.base.command.info.BlazeInfoRunnerImpl.runBlazeInfo(BlazeInfoRunnerImpl.java:105)
at com.google.idea.blaze.base.command.info.BlazeInfoRunnerImpl.lambda$runBlazeInfo$2(BlazeInfoRunnerImpl.java:75)
... 6 more
Edit: I am using IntelliJ 2017.3 on MacOS El Capitan 10.11.6, Java JRE 1.8.0, and the Bazel version is 0.8.1-homebrew (the plugin version is 2017.11.20.0.4). I get this error when I try to setup a new Java project (an Hello World with only one class and one BUILD file containing a java_binary similar to this tutorial), and it appears when I click on the Bazel sync button.
The output in the Bazel console is:
Syncing project: Sync (incremental)...
Updating VCS...
Running Bazel info...
Command: info --tool_tag=ijwb:IDEA:community --curses=no --color=no --experimental_ui=no --progress_in_terminal_title=no --
==== TIMING REPORT ====
Sync: 47ms
BazelInfo: 4ms
Timing summary:
BlazeInvocation: 4ms
Sync failed
Command: git diff --name-status --no-renames abc8913346474d12ad45226503438848011929ae
Does anybody have an idea about what is it and/or how to fix it? Thanks!
I found my answer, thanks to this post.
In IntelliJ > Settings > Other Settings > Bazel Settings, the field "Bazel binary location" was empty. For my case, I entered /usr/local/bin/bazel and now it works!