GCG: Error "Calling hmetis unsuccessful!" happens during 'optimize' - SCIP

I installed GCG and created a soft link to hMETIS in the scipoptsuite/gcg directory:
ubuntu18:~/Documents/Software/scipoptsuite-6.0.0/gcg$ ln -s /home/yang/Documents/Software/scipoptsuite-6.0.0/hmetis-2.0pre1/Linux-x86_64/hmetis2.0pre1 hmetis
When I run the example test /check/instances/cs/TEST0055.lp, there are some differences compared to the log at http://gcg.or.rwth-aachen.de/doc/EXAMPLE.html, which uses the same TEST0055.lp file:
Presolving Time: 0.01
start creating seeedpool for current problem
created seeedpool for current problem, n detectors: 25
Consclassifier "nonzeros" yields a classification with 2 different constraint classes
Consclassifier "constypes" yields a classification with 2 different constraint classes
Consclassifier "constypes according to miplib" yields a classification with 2 different constraint classes
Consclassifier "constypes according to miplib" is not considered since it offers the same structure as "constypes" consclassifier
Varclassifier "vartypes" yields a classification with 2 different variable classes
Varclassifier "varobjvals" yields a classification with 2 different variable classes
Varclassifier "varobjvalsigns" yields a classification with 2 different variable classes
Varclassifier "varobjvalsigns" is not considered since it offers the same structure as "varobjvals"
Begin of detection round 0 of 1 total rounds
Start to propagate seeed with id 1 (0 of 1 in round 0)
in dec_consclass: there are 2 different constraint classes
the current constraint classifier "nonzeros" consists of 2 different classes
the current constraint classifier "constypes" consists of 2 different classes
dec_consclass found 6 new seeeds
dec_densemasterconss found 1 new seeed
sh: 1: zsh: not found
[src/dec_hrgpartition.cpp:314] ERROR: Calling hmetis unsuccessful! See the above error message for more details.
[src/dec_hrgpartition.cpp:315] ERROR: Call was zsh -c "hmetis gcg-r-1.metis.1l488e 20 -seed 1 -ptype rb -ufactor 5.000000 > /dev/null"
sh: 1: zsh: not found
[src/dec_hrgpartition.cpp:314] ERROR: Calling hmetis unsuccessful! See the above error message for more details.
[src/dec_hrgpartition.cpp:315] ERROR: Call was zsh -c "hmetis gcg-r-1.metis.1l488e 10 -seed 1 -ptype rb -ufactor 5.000000 > /dev/null"
sh: 1: zsh: not found
[src/dec_hrgpartition.cpp:314] ERROR: Calling hmetis unsuccessful! See the above error message for more details.
[src/dec_hrgpartition.cpp:315] ERROR: Call was zsh -c "hmetis gcg-r-1.metis.1l488e 29 -seed 1 -ptype rb -ufactor 5.000000 > /dev/null"
Detecting Arrowhead structure: 20 10 29 done, 0 seeeds found.
Start finishing of partial decomposition 1.
The objective value is the same as in the example on GCG's website, but the solutions are different.
Why do these errors appear? Is there anything wrong with my GCG or SCIP installation? Another odd thing is that the number of Solving Nodes is only '1' in my test, whereas it is '82' in the http://gcg.or.rwth-aachen.de/doc/EXAMPLE.html example. I also ran the instance 'bpp/N1C1W4_M.BPP.lp', and the same errors appear:
Begin of detection round 0 of 1 total rounds
Start to propagate seeed with id 39 (0 of 1 in round 0)
in dec_consclass: there are 1 different constraint classes
the current constraint classifier "nonzeros" consists of 2 different classes
dec_consclass found 3 new seeeds
dec_densemasterconss found 1 new seeed
sh: 1: zsh: not found
[src/dec_hrgpartition.cpp:314] ERROR: Calling hmetis unsuccessful! See the above error message for more details.
[src/dec_hrgpartition.cpp:315] ERROR: Call was zsh -c "hmetis gcg-r-39.metis.wDKr6U 50 -seed 1 -ptype rb -ufactor 5.000000 > /dev/null"
sh: 1: zsh: not found
[src/dec_hrgpartition.cpp:314] ERROR: Calling hmetis unsuccessful! See the above error message for more details.
[src/dec_hrgpartition.cpp:315] ERROR: Call was zsh -c "hmetis gcg-r-39.metis.wDKr6U 51 -seed 1 -ptype rb -ufactor 5.000000 > /dev/null"
Detecting Arrowhead structure: 50 51 done, 0 seeeds found.
It is also strange that the number of Solving Nodes is still 1.
SCIP Status : problem is solved [optimal solution found]
Solving Time (sec) : 0.72
Solving Nodes : 1
Primal Bound : +4.10000000000000e+01 (3 solutions)
Dual Bound : +4.10000000000000e+01
Gap : 0.00 %
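The "sh: 1: zsh: not found" lines suggest the hMETIS-based arrowhead detector is skipped simply because zsh is not installed, which would also explain why the detection (and with it the number of solving nodes) differs from the web example. A quick hedged check on Ubuntu (assuming you want that detector to run; these are just the standard shell/package-manager calls):
which zsh || sudo apt-get install zsh
which hmetis    # the hmetis symlink (or binary) must also be findable on the PATH of the shell GCG spawns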

What's a bad file descriptor?

I have the following SWI-Prolog program in a file called 'system.pl':
helloWorld :- read(X), write(X).
I want to test it, so I wrote this:
:- begin_tests(helloWorld_test).
test(myTest, true(Output == "hello")) :-
    with_output_to(string(Output), getEntry).
:- end_tests(helloWorld_test).
getEntry :-
    open('testcase.test', read, Myfile),
    set_input(Myfile),
    process_create(path(swipl), ['-g', 'main', '-t', 'halt', 'system.pl'],
                   [stdin(stream(Myfile)), stdout(pipe(Stream))]),
    copy_stream_data(Stream, current_output),
    close(Myfile).
testcase.test contains the following:
hello.
Now, when I call swipl -g run_tests -t halt system.pl I get this:
% PL-Unit: helloWorld_test ERROR: -g helloWorld: read/1: I/O error in read on stream user_input (Bad file descriptor)
ERROR: c:/programasvscode/prolog/programasrandom/system.pl:40:
test myTest: wrong answer (compared using ==)
ERROR: Expected: "hello"
ERROR: Got: ""
done
% 1 test failed
% 0 tests passed
ERROR: -g run_tests: false
Warning: Process "c:\swipl\bin\swipl.exe": exit status: 2
I tried using read/2 with current_input, but I got the same error, just mentioning read/2 instead of read/1.
What does this mean, and how can I fix it?
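For reference, the program itself should behave as expected when its standard input really is a pipe; a quick sanity check from a Unix-like shell (assuming system.pl is in the current directory) is:
echo "hello." | swipl -g helloWorld -t halt system.pl
This should print hello, which suggests the failure is about how the input stream is wired up inside the test rather than about helloWorld itself.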

How do I get Source Extractor to Analyze an Image?

I'm relatively inexperienced in coding, so right now I'm just familiarizing myself with the basics of how to use SE, which I'll need to use in the near future.
At the moment I'm trying to get it to analyze a FITS file on my computer (which is a Mac). I'm sure this is something obvious, but I haven't been able to get it to do that. Following the instructions in Chapters 6 and 7 of Source Extractor for Dummies (linked below), I input the following:
sex MedSpiral_20deg_Serl2_.45_.fits.fits -c configuration_file.txt
And got the following error message:
WARNING: configuration_file.txt not found, using internal defaults
----- SExtractor 2.19.5 started on 2020-02-05 at 17:10:59 with 1 thread
Setting catalog parameters
ERROR: can't read default.param
I then tried entering parameters manually:
sex MedSpiral_20deg_Ser12_.45_.fits.fits -c configuration_file.txt -DETECT_TYPE CCD -MAG_ZEROPOINT 2.5 -PIXEL_SCALE 0 -SATUR_LEVEL 1 -SEEING_FWHM 1
And got the same error message. I tried referencing default.sex directly:
sex MedSpiral_20deg_Ser12_.45_.fits.fits -c default.sex
And got the same error message again, substituting "configuration_file.txt not found" with "default.sex not found" (I checked that default.sex was on my computer, it is). The same thing happened when I tried to use default.param.
Here's the link to SE for Dummies (Chapter 6 begins on page 19):
http://astroa.physics.metu.edu.tr/MANUALS/sextractor/Guide2source_extractor.pdf
If you run the command "sex MedSpiral_20deg_Ser12_.45_fits.fits -c default.sex" from within the config folder (inside the sextractor folder), it works.
However, how can I run the sextractor command from any folder on my computer?
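One way to do this is to keep copies of the configuration files in whatever directory you run from, or to point SExtractor at them explicitly. A rough sketch, assuming the shipped files live in ~/sextractor/config (adjust that path to wherever your default.sex, default.param and default.conv actually are):
cp ~/sextractor/config/default.sex ~/sextractor/config/default.param ~/sextractor/config/default.conv .
sex MedSpiral_20deg_Ser12_.45_fits.fits -c default.sex
Alternatively, pass absolute paths on the command line instead of copying:
sex MedSpiral_20deg_Ser12_.45_fits.fits -c ~/sextractor/config/default.sex -PARAMETERS_NAME ~/sextractor/config/default.param -FILTER_NAME ~/sextractor/config/default.conv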

Snakemake in cluster mode with --no-shared-fs: How to set cluster-status

I'm running Snakemake in a cluster environment and would like to use S3 as the shared file system for writing output files.
The options --default-remote-provider, --default-remote-prefix and --no-shared-fs are set accordingly. The cluster uses UGE as its scheduler, so setting --cluster is straightforward, but how do I set --cluster-status, whose use is enforced when using --no-shared-fs?
My best guess was a naive --cluster-status "qstat -j" which resulted in
subprocess.CalledProcessError: Command 'qstat Your job 2 ("snakejob.bwa_map.1.sh") has been submitted' returned non-zero exit status 1.
So I guess my question is, how do I get the actual jobid in there?
Thanks!
Andreas
EDIT 1:
I found https://groups.google.com/forum/#!topic/snakemake/7cyqAIfgeq4, so --cluster-status has to be a script. I therefore wrote a Python script that can parse the above line, but snakemake still fails with:
/bin/sh: -c: line 0: syntax error near unexpected token `('
/bin/sh: -c: line 0: `/home/ec2-user/clusterstatus.py Your job 2 ("snakejob.bwa_map.1.sh") has been submitted'
...
subprocess.CalledProcessError: Command '/home/ec2-user/clusterstatus.py
Your job 2 ("snakejob.bwa_map.1.sh") has been submitted' returned non-zero exit status 1.
To answer my own question:
First, I needed the -terse option for qsub (which I had not added at first, and snakemake somehow remembered the wrong cluster command).
Second, the --cluster-status argument needs to point to a script that takes the job id as its only argument, determines the job status, and prints "failed", "running" or "success".
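A minimal sketch of such a status script for UGE/SGE, assuming qstat and qacct are available on the submit host (the qacct field parsing may need adjusting for your grid engine version):
#!/usr/bin/env python3
import subprocess
import sys

jobid = sys.argv[1]  # Snakemake passes the job id as the only argument

# While the scheduler still knows the job, "qstat -j <jobid>" returns 0.
if subprocess.run(["qstat", "-j", jobid],
                  stdout=subprocess.DEVNULL,
                  stderr=subprocess.DEVNULL).returncode == 0:
    print("running")
    sys.exit(0)

# Otherwise the job has left the queue; look up its exit status in the accounting data.
acct = subprocess.run(["qacct", "-j", jobid], capture_output=True, text=True)
exit_status = None
for line in acct.stdout.splitlines():
    if line.startswith("exit_status"):
        exit_status = line.split()[1]

print("success" if exit_status == "0" else "failed")
It would then be used roughly as snakemake ... --cluster "qsub -terse ..." --cluster-status /home/ec2-user/clusterstatus.py (with the script made executable).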

Handling SIGPIPE error in snakemake

The following snakemake script:
rule all:
    input:
        'test.done'
rule pipe:
    output:
        'test.done'
    shell:
        """
        seq 1 10000 | head > test.done
        """
fails with the following error:
snakemake -s test.snake
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 pipe
2
rule pipe:
output: test.done
jobid: 1
Error in job pipe while creating output file test.done.
RuleException:
CalledProcessError in line 9 of /Users/db291g/Tritume/test.snake:
Command '
seq 1 10000 | head > test.done
' returned non-zero exit status 141.
File "/Users/db291g/Tritume/test.snake", line 9, in __rule_pipe
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job pipe since they might be corrupted:
test.done
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
The explanation "returned non-zero exit status 141" seems to say that snakemake has caught the SIGPIPE failure sent by head. Strictly speaking snakemake is probably doing the right thing in catching it, but I wonder if it is possible to ignore some types of errors like this one. I have a snakemake script using the head command and I'm trying to find a workaround for this error.
Yes, Snakemake sets pipefail by default, because in most cases this is what people implicitly expect. You can always deactivate it for specific commands by prepending set +o pipefail; to the shell command.
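Applied to the rule from the question, that would look like this (only the shell command changes):
rule pipe:
    output:
        'test.done'
    shell:
        """
        set +o pipefail; seq 1 10000 | head > test.done
        """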
A somewhat clunky alternative is to append || true to the command. By itself this would make the command always exit cleanly, which is not acceptable; to check whether the script actually succeeded you can query the array variable ${PIPESTATUS[@]} and make sure it contains the expected exit codes:
This script is ok:
seq 1 10000 | head | grep 1 > test.done || true
echo ${PIPESTATUS[@]}
141 0 0
This is not ok:
seq 1 10000 | head | FOOBAR > test.done || true
echo ${PIPESTATUS[@]}
0

Wrong qstat GPU resource count SGE

I have a GPU resource called gpus. When I run qstat -F gpus I get weird output of the form "qc:gpus=-1", i.e. a negative number of available GPUs is reported, while qstat -g c says I have multiple GPUs available. Multiple jobs fail because of "unavailable gpus". It is as if the counting of GPUs starts from 1 instead of 8 on each node, so as soon as more than 1 is used the count goes negative. My queue is:
hostlist node-01 node-02 node-03 node-04 node-05
seq_no 0
load_thresholds NONE
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors UNDEFINED
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list smp mpich2
rerun FALSE
slots 1,[node-01=8],[node-02=8],[node-03=8],[node-04=8],[node-05=8]
Does anyone have any idea why this is happening?
I believe you set the "gpus" complex in the host configuration. You can see it if you do
qconf -se node-01
And you can check the definition of the "gpus" complex with
qconf -sc
For instance, my UGE has this definition for the "ngpus" complex:
#name shortcut type relop requestable consumable default urgency
ngpus gpu INT <= YES YES 0 1000
And an example node "qconf -se gpu01":
hostname gpu01.cm.cluster
...
complex_values exclusive=true,m_mem_free=65490.000000M, \
m_mem_free_n0=32722.546875M,m_mem_free_n1=32768.000000M, \
ngpus=2,slots=16,vendor=intel
You can modify the value by "qconf -me node-01". See the man page complex(5) for details.
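So if each node really has 8 GPUs, a likely fix (hedged, since the exact complex name and host list are specific to your site) is to make every execution host advertise them as a consumable, e.g.:
qconf -me node-01
and, in the editor that opens, make sure the complex_values line contains something like:
complex_values gpus=8
repeating this for node-02 through node-05.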