We are trying to run a fixed number of Hive queries in parallel using GNU parallel.
Even with parallelism set to 1 (i.e., sequential execution via -j1), the first execution works, but the second one gets stuck:
$ parallel -j1 --eta --verbose beeline -e '"SELECT \"{}\";"' ::: a b c
beeline -e "SELECT \"a\";"
Computers / CPU cores / Max jobs to run
1:local / 40 / 1
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 0s Left: 3 AVG: 0.00s local:1/0/100%/0.0s
+------+
| _c0 |
+------+
| a |
+------+
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH/jars/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH/jars/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://en02.example.cloud:2181,mn01.example.cloud:2181,mn02.example.cloud:2181/default;principal=hive/_HOST@example.cloud;serviceDiscoveryMode=zooKeeper;ssl=true;zooKeeperNamespace=hiveserver2
21/11/16 10:41:39 [main]: INFO jdbc.HiveConnection: Connected to mn01.example.cloud:10000
Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
Transaction isolation: TRANSACTION_REPEATABLE_READ
INFO : Compiling command(queryId=hive_20211116104139_7be8d8ee-f58d-4572-88a4-43533846160b): SELECT "a"
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20211116104139_7be8d8ee-f58d-4572-88a4-43533846160b); Time taken: 0.102 seconds
INFO : Executing command(queryId=hive_20211116104139_7be8d8ee-f58d-4572-88a4-43533846160b): SELECT "a"
INFO : Completed executing command(queryId=hive_20211116104139_7be8d8ee-f58d-4572-88a4-43533846160b); Time taken: 0.006 seconds
INFO : OK
1 row selected (0.191 seconds)
Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
Closing: 0: jdbc:hive2://en02.example.cloud:2181,mn01.example.cloud:2181,mn02.example.cloud:2181/default;principal=hive/_HOST@example.cloud;serviceDiscoveryMode=zooKeeper;ssl=true;zooKeeperNamespace=hiveserver2
beeline -e "SELECT \"b\";"
ETA: 79s Left: 2 AVG: 42.00s local:1/1/100%/47.0s
Simplifying this further: even a parallel call to beeline --help gets stuck on the second run in the same way, so it does not seem to be related to the connection to the Hive DB.
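A minimal reproduction along these lines (a sketch; the inputs only serve to schedule two consecutive runs and are passed to beeline as extra arguments):
parallel -j1 --verbose beeline --help ::: 1 2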
The solutions with which we finally got it working are:
parallel -j1 --eta --verbose beeline -e '"SELECT \"{}\";"' < /dev/null ::: a b c
and (thanks @OleTange!)
parallel -j1 --eta --verbose --tty beeline -e '"SELECT \"{}\";"' ::: a b c
How we found out:
We added a set -x to the beeline bash script and to some of the scripts it calls, logged the results of the parallel runs to separate files, and diffed them.
We saw that the logs contained a test of the form
[ -p /dev/stdin ]
and, below that, a few environment variables that were set in the first parallel execution but not in the second.
We then played around with various ways to give beeline a stdin, and the /dev/null version finally worked.
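For completeness, the same stdin fix can also be applied per job rather than to parallel itself (a sketch we did not benchmark; it assumes beeline merely needs a stdin that yields an immediate EOF, which is what the working variants above provide):
parallel -j1 --eta --verbose 'beeline -e "SELECT \"{}\";" < /dev/null' ::: a b c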
I am trying to migrate from DSpace 6.4 to 7.1. The new DSpace is installed on another machine (a virtual machine on CentOS 7 with 8 GB of RAM).
I have created a full-site AIP backup including user passwords (total size of the packages: 11 GB).
I've tried to do a full restore but always get the same error.
So I'm just trying to import only the first level, without children:
JAVA_OPTS="-Xmx2048m -Xss4m -Dfile.encoding=UTF-8" /dspace/bin/dspace packager -r -k -t AIP -e dinkwi.test@gmail.com -o skipIfParentMissing=true -i 123456789/0 /home/dimich/11111/repo.zip
It doesn't matter whether I use the -k or -f parameter; the output is always the same:
Ingesting package located at /home/dimich/11111/repo.zip
Exception: null
java.lang.StackOverflowError
at org.dspace.eperson.GroupServiceImpl.getChildren(GroupServiceImpl.java:788)
at org.dspace.eperson.GroupServiceImpl.getChildren(GroupServiceImpl.java:802)
.... (more than 1k lines)
at org.dspace.eperson.GroupServiceImpl.getChildren(GroupServiceImpl.java:802)
My dspace.log ends with:
2021-12-20 11:05:28,141 INFO unknown unknown org.dspace.eperson.GroupServiceImpl @ dinkwi.test@gmail.com::update_group:group_id=9e6a2038-01d9-41ad-96b9-c6fb55b44381
2021-12-20 11:05:30,048 INFO unknown unknown org.dspace.eperson.GroupServiceImpl @ dinkwi.test@gmail.com::update_group:group_id=23aaa7e9-ca2d-4af5-af64-600f7126e2be
2021-12-20 11:05:30,800 INFO unknown unknown org.springframework.cache.ehcache.EhCacheManagerFactoryBean @ Shutting down EhCache CacheManager 'org.dspace.services'
So I just want to figure out the problem: is the stack too small, is there a bug in the user/group data that leads to infinite recursion, or is it something else?
The main problem: I'm good with PHP/MySQL but have no experience with Java/PostgreSQL or with how to debug this.
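If it is just stack exhaustion, I guess one cheap test is to raise the thread stack size that is already being set via JAVA_OPTS (16m is an arbitrary guess; a genuine cycle in the group data would still overflow):
JAVA_OPTS="-Xmx2048m -Xss16m -Dfile.encoding=UTF-8" /dspace/bin/dspace packager -r -k -t AIP -e dinkwi.test@gmail.com -o skipIfParentMissing=true -i 123456789/0 /home/dimich/11111/repo.zip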
Any help would be appreciated.
P.S. After a failed restore I always run this command:
/dspace/bin/dspace cleanup -v
I'm trying to create a TensorFlow cluster on top of an Ignite cluster in my local multi-node environment.
I followed the tutorials and tried the following command:
./ignite-tf.sh start TESTDATA models python /usr/local/grid/cifar10_main.py
This gives me an "Unmatched argument" error:
Unmatched argument:
Usage: ignite-tf [-hV] [-c=<cfg>] [COMMAND]
Apache Ignite and TensorFlow integration command line tool that allows to
start, maintain and stop distributed deep learning utilizing Apache Ignite
infrastructure and data.
-c, --config=<cfg> Apache Ignite client configuration.
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
start Starts a new TensorFlow cluster and attaches to user script process.
stop Stops a running TensorFlow cluster.
attach Attaches to running TensorFlow cluster (user script process).
ps Prints identifiers of all running TensorFlow clusters.
I'm not sure which argument is unmatched. I need help getting this to work.
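If the tool follows the usual picocli convention of per-command help (an assumption on my part), printing the start subcommand's own usage and comparing it with the arguments above might narrow this down:
./ignite-tf.sh start --help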
I have downloaded this specific version and tried to run the same command. I don't get any error messages, and it tries to start a node:
311842 pts/12 S+ 0:00 | | \_ bash ./ignite-tf.sh start TESTDATA models python /usr/local/grid/cifar10_main.py
311902 pts/12 Sl+ 0:03 | | \_ /usr/lib/jvm/java-8-openjdk-amd64//bin/java -XX:+AggressiveOpts -Xms1g -Xmx1g -server -XX:MaxMetaspaceSize=256m -DIGNITE_QUIET=false -DIGNITE_SUCCESS_FILE=/home/user/Downloads/gridgain-community-8.7.24/work/ignite_success_20e882c5-5b64-4d0a-b7ed-9587c08a0e44 -DIGNITE_HOME=/home/user/Downloads/gridgain-community-8.7.24 -DIGNITE_PROG_NAME=./ignite-tf.sh -cp /home/user/Downloads/gridgain-community-8.7.24/libs/*:/home/user/Downloads/gridgain-community-8.7.24/libs/ignite-control-utility/*:/home/user/Downloads/gridgain-community-8.7.24/libs/ignite-indexing/*:/home/user/Downloads/gridgain-community-8.7.24/libs/ignite-opencensus/*:/home/user/Downloads/gridgain-community-8.7.24/libs/ignite-spring/*:/home/user/Downloads/gridgain-community-8.7.24/libs/licenses/*:/home/user/Downloads/gridgain-community-8.7.24/libs/optional//ignite-tensorflow/*:/home/user/Downloads/gridgain-community-8.7.24/libs/optional//ignite-slf4j/* org.apache.ignite.tensorflow.submitter.JobSubmitter start TESTDATA models python /usr/local/grid/cifar10_main.py
I am combining Singularity and Snakemake to create a workflow for some sequencing data. I modeled my pipeline on this Git project: https://github.com/sci-f/snakemake.scif. The version of the pipeline that does not use Singularity runs absolutely fine. The version that uses Singularity always stops after the first rule with the following error:
$ singularity run --bind data/raw_data/:/scif/data/ /gpfs/data01/heinzlab/home/cag104/bin/chip-seq-pipeline/chip-seq-pipeline-hg38.simg run snakemake all
[snakemake] executing /bin/bash /scif/apps/snakemake/scif/runscript all
Copying Snakefile to /scif/data
Copying config.yaml to /scif/data
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 bowtie2_mapping
1 create_bigwig
1 create_tag_directories
1 fastp
1 fastqc
1 quality_metrics
1 samtools_index
8
rule fastp:
input: THP-1_PU1-cMyc_PU1_sc_S40_R1_001.fastq.gz
output: fastp/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.fastp.fastq.gz, fastp_report/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.html, fastp_report/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.json
log: logs/fastp/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.log
jobid: 7
wildcards: sample=THP-1_PU1-cMyc_PU1_sc_S40_R1_001
usage: scif run [-h] [cmd [cmd ...]]
positional arguments:
cmd app and optional arguments to target for the entry
optional arguments:
-h, --help show this help message and exit
Waiting at most 5 seconds for missing files.
MissingOutputException in line 16 of /scif/data/Snakefile:
Missing files after 5 seconds:
fastp/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.fastp.fastq.gz
fastp_report/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.html
fastp_report/THP-1_PU1-cMyc_PU1_sc_S40_R1_001.json
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Will exit after finishing currently running jobs.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /scif/data/.snakemake/log/2018-04-06T224320.958604.snakemake.log
The run does, however, create the fastp and fastp_report directories, as well as the logs directory. I tried increasing the latency wait to 50 seconds, but I still get the same error.
Any ideas on what to try here?
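One clue may be the usage: scif run block printed between the rule header and the MissingOutputException: it suggests the scif run call issued for the rule received no command at all, so fastp never executed and its outputs were never written. A quick check (a sketch; the app name fastp comes from the job list above, and the bind path is the one already used) is to invoke the app by hand inside the container and see whether it starts or prints the same usage text:
singularity run --bind data/raw_data/:/scif/data/ /gpfs/data01/heinzlab/home/cag104/bin/chip-seq-pipeline/chip-seq-pipeline-hg38.simg run fastp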
I wrote a Pig script:
my_script.pig
bag_1 = LOAD '$INPUT' USING PigStorage('|') AS (LN_NR:chararray, ET_NR:chararray, ET_ST_DT:chararray, ED_DT:chararray, PI_ID:chararray);
bag_2 = LIMIT bag_1 $SIZE;
DUMP bag_2;
and made a parameter file:
my_param.txt:
INPUT = hdfs://0.0.0.0:8020/user/training/example
SIZE = 10
Now I am calling the script like this:
pig my_param.txt my_script.pig
but I am getting this error:
ERROR 1000: Error during parsing. Lexical error
Any suggestions?
I think you need to provide the parameter file using the -m or -param_file option. See the help output below:
$ pig --help
Apache Pig version 0.11.0-cdh4.7.1 (rexported)
compiled Nov 18 2014, 09:08:23
USAGE: Pig [options] [-] : Run interactively in grunt shell.
Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
Pig [options] [-f[ile]] file : Run cmds found in file.
options include:
-4, -log4jconf - Log4j configuration file, overrides log conf
-b, -brief - Brief logging (no timestamps)
-c, -check - Syntax check
-d, -debug - Debug level, INFO is default
-e, -execute - Commands to execute (within quotes)
-f, -file - Path to the script to execute
-g, -embedded - ScriptEngine classname or keyword for the ScriptEngine
-h, -help - Display this message. You can specify topic to get help for that topic.
properties is the only topic currently supported: -h properties.
-i, -version - Display version information
-l, -logfile - Path to client side log file; default is current working directory.
-m, -param_file - Path to the parameter file
-p, -param - Key value pair of the form param=val
-r, -dryrun - Produces script with substituted parameters. Script is not executed.
-t, -optimizer_off - Turn optimizations off. The following values are supported:
SplitFilter - Split filter conditions
PushUpFilter - Filter as early as possible
MergeFilter - Merge filter conditions
PushDownForeachFlatten - Join or explode as late as possible
LimitOptimizer - Limit as early as possible
ColumnMapKeyPrune - Remove unused data
AddForEach - Add ForEach to remove unneeded columns
MergeForEach - Merge adjacent ForEach
GroupByConstParallelSetter - Force parallel 1 for "group all" statement
All - Disable all optimizations
All optimizations listed here are enabled by default. Optimization values are case insensitive.
-v, -verbose - Print all error messages to screen
-w, -warning - Turn warning logging on; also turns warning aggregation off
-x, -exectype - Set execution mode: local|mapreduce, default is mapreduce.
-F, -stop_on_failure - Aborts execution on the first failed job; default is off
-M, -no_multiquery - Turn multiquery optimization off; default is on
-P, -propertyFile - Path to property file
$
You are not using the command correctly.
To use a parameter file, pass -param_file on the command line:
pig -param_file <file> pig_script.pig
You can find more details under Parameter Substitution.
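Applied to the files above, the call would be (assuming my_param.txt and my_script.pig are in the current directory):
pig -param_file my_param.txt my_script.pig
Equivalently, the same values can be passed inline with -p / -param:
pig -param INPUT=hdfs://0.0.0.0:8020/user/training/example -param SIZE=10 my_script.pig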
I am using Apache Pig version 0.10.1.21 (rexported). When I execute a Pig script, there are a lot of INFO logging lines that look like this:
2013-05-18 14:30:12,810 [Thread-28] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local_0005_r_000000_0' done.
2013-05-18 14:30:18,064 [main] WARN org.apache.pig.tools.pigstats.PigStatsUtil - Failed to get RunningJob for job job_local_0005
2013-05-18 14:30:18,094 [Thread-31] WARN org.apache.hadoop.mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2013-05-18 14:30:18,114 [Thread-31] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-05-18 14:30:18,254 [Thread-32] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3fcb2dd1
2013-05-18 14:30:18,265 [Thread-32] INFO org.apache.hadoop.mapred.MapTask - io.sort.mb = 10
Is there a SET command within the Pig script, or a command-line flag, to control the logging level? Basically I would like to hide the [Thread-xx] INFO messages and show only WARNING and ERROR. I have tried the command-line debug flag, but unfortunately the INFO messages still show up:
pig -x local -d WARN MyScript.pig
Hope there is a solution. Thanks in advance for any help.
SOLVED: Answered by Loran Bendig: set log4j.properties. Summarized here for convenience.
Step 1: Copy the log4j config file to the folder where my Pig scripts are located:
cp /etc/pig/conf.dist/log4j.properties log4j_WARN
Step 2: Edit the log4j_WARN file and make sure these two lines are present:
log4j.logger.org.apache.pig=WARN, A
log4j.logger.org.apache.hadoop = WARN, A
Step 3: Run the Pig script and instruct it to use the custom log4j config:
pig -x local -4 log4j_WARN MyScript.pig
Another approach works like this:
Create a file named nolog.conf with the following content:
log4j.rootLogger=fatal
and then run Pig as follows:
pig -x local -4 nolog.conf
You can override the default log configuration (which includes INFO messages) like this:
pig -4 log4j.properties MyScript.pig
You need to set the rootLogger too:
log4j.rootLogger=ERROR, A
log4j.logger.org.apache.pig=ERROR, A
log4j.logger.org.apache.hadoop = ERROR, A
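For reference, a minimal self-contained log4j config would look like this (a sketch: the appender name A matches the lines above, but its definition here is an assumption, since the file copied from conf.dist normally defines it already):
log4j.rootLogger=ERROR, A
log4j.logger.org.apache.pig=ERROR, A
log4j.logger.org.apache.hadoop=ERROR, A
# 'A' must be defined somewhere, e.g. as a console appender:
log4j.appender.A=org.apache.log4j.ConsoleAppender
log4j.appender.A.layout=org.apache.log4j.PatternLayout
log4j.appender.A.layout.ConversionPattern=%d [%t] %-5p %c - %m%n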