SGE: Command Not Found, Undefined Variable

I'm attempting to set up a new compute cluster and am currently seeing errors when submitting jobs with qsub under SGE. Here's a simple experiment that shows the problem:
test.sh
#!/usr/bin/zsh
test="hello"
echo "${test}"
test.sh.eXX
test=hello: Command not found.
test: Undefined variable.
test.sh.oXX
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
If I run the script directly on the head node (sh test.sh), the output is correct. I submit the job to SGE by typing "qsub test.sh".
If I submit the exact same script in the same way on an established compute cluster (an existing HPC system), it works exactly as expected. What setting could be causing this problem?
Thanks for any help on this matter.

Most likely the queues on your cluster are set to posix_compliant mode with a default shell of /bin/csh. With posix_compliant, the #! line of your script is ignored, so the job is executed by csh rather than zsh, which matches the output you see: csh does not understand the assignment test="hello" (hence "Command not found.") and the variable is therefore never set (hence "Undefined variable."). You can either change the queues to unix_behavior or specify the required shell via qsub's -S option, for example as an embedded directive:
#$ -S /bin/sh
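To inspect or change the queue configuration, or to request the shell at submission time instead, something like the following should work (a sketch; the queue name all.q and the zsh path are assumptions, adjust them to your site):
# Inspect the queue's shell settings (look at the shell and shell_start_mode values)
qconf -sq all.q | grep shell
# Option 1 (requires admin rights): edit the queue and set shell_start_mode to unix_behavior
qconf -mq all.q
# Option 2 (per job): request the interpreter explicitly when submitting
qsub -S /usr/bin/zsh test.sh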

Related

GitHub Actions: permission denied when using a custom shell

I am trying to use a shell script as a custom shell in GitHub Actions like this:
- name: Test bash-wrapper
  shell: bash-wrapper {0}
  run: |
    echo Hello world
However, when I try to run it, I get Permission denied.
Background: I have set up a chroot jail, which I use with QEMU user mode emulation in order to build for non-IA64 architectures with toolchains that lack cross-compilation support.
The script is intended to provide a bash shell on the target architecture and looks like this:
#!/bin/bash
sudo chroot --userspec=`whoami`:`whoami` $CROSS_ROOT qemu-arm-static /bin/bash -c "$*"
It resides in /bin/bash-wrapper and is thus on $PATH.
Digging a bit deeper, I found:
Running bash-wrapper "echo Hello world" in a GHA step with the default shell works as expected.
Running bash-wrapper 'echo Running as $(whoami)' from the default shell correctly reports we are running as user runner.
Removing --userspec from the chroot command in bash-wrapper (thus running the command as root) does not make a difference – the custom shell gives the same error.
GHA converts each step into a script file and passes it to the shell.
File ownership on these files is runner:docker, runner being the user that runs the job by default.
Interestingly, the script files generated by GHA are not executable. I suspect that is what is ultimately causing the permission error.
Indeed, if I modify bash-wrapper to set the executable bit on the script before running it, everything works as expected.
I imagine non-executable script files would cause all sorts of trouble with various shells, so I would expect GHA to have a way of dealing with that – in fact I am a bit surprised these on-the-fly scripts are not executable by default.
Is there a less hacky way of fixing this, such as telling GHA to set the executable bit on temporary scripts? (How does Github expect this to be solved?)
When calling your script, try running it like this:
- name: Test bash-wrapper
  shell: bash-wrapper {0}
  run: |
    bash <your_script>.sh
Alternatively, try running this command locally, then commit and push the repository:
git update-index --chmod=+x <your_script>.sh
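If you would rather use the workaround the asker describes (setting the executable bit from inside the wrapper itself), a minimal sketch could look like the following; GHA passes the path of the generated step script as the argument, and $CROSS_ROOT and qemu-arm-static are taken from the question:
#!/bin/bash
# /bin/bash-wrapper: invoked by GHA as "bash-wrapper /path/to/generated-script"
# Mark the generated step script executable before handing it to the chroot'ed bash.
chmod +x "$1"
sudo chroot --userspec="$(whoami):$(whoami)" "$CROSS_ROOT" qemu-arm-static /bin/bash -c "$*"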

Apache Airflow command not found with SSHOperator

I am trying to use the SSHOperator to SSH into a remote machine and run an external application through the command line. I have set up the SSH connection via the admin page.
This section of code is used to define the commands and the SSH connection to the external machine.
sshHook = SSHHook(ssh_conn_id='remote_comp')
command_1 = """
cd /files/232-065/Rans
bash run.sh
"""
Where 'run.sh' is the following shell script:
#!/bin/sh
starccm+ -batch run_export.java Rans_Model.sim
Which simply runs the commercial software starccm+ with some options I have specified.
This section defines the task:
inlet_profile = SSHOperator(
    task_id='inlet_profile',
    ssh_hook=sshHook,
    command=command_1
)
I have confirmed the SSH connection works by giving a simple 'ls' command and checking the output.
The error that I get is:
bash run.sh, error: run.sh: line 2: starccm+: command not found
The command in 'run.sh' works when I am logged into the machine myself (it does not require a GUI). This makes me think the SSH session that Apache Airflow opens is not the same kind of session as the one I log into, but I am not sure how to solve this problem.
Does anyone have any experience with this?
There is no issue with the SSH connection (at least judging from the error message); the issue is with the starccm+ installation path.
Please check the installation path of starccm+.
Check whether that installation path is part of the $PATH environment variable:
$ echo $PATH
If not, then either install it in a standard location like /bin or /usr/bin (provided those are included in $PATH), or add the installation directory to the PATH variable like this,
$ export PATH=$PATH:/<absolute_path>
It is not ideal, but if you struggle with setting the PATH variable you can run starccm+ by giving its full path, like:
/directory/where/star/is/installed/starccm+ -batch run_export.java Rans_Model.sim
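Note that the command sent over SSH runs in a non-interactive session, so PATH additions made in your login files may not be applied, which would explain why starccm+ is found when you log in yourself. One option is to set PATH inside run.sh (a sketch; /opt/starccm/bin is only a placeholder for the real install directory):
#!/bin/sh
# run.sh: put the starccm+ install directory on PATH for this non-interactive session
export PATH="$PATH:/opt/starccm/bin"
starccm+ -batch run_export.java Rans_Model.sim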

Why is $PATH different when executing commands via SSH and libssh?

I'm trying to run a command on a remote host via libssh2 as wrapped by the ssh2 Rust crate.
So I would like to run the command cargo build, but when I try to run it via libssh, I get the error:
cargo: command not found
However, when I ssh into the server manually from the command line everything works fine.
I have also noticed that $PATH is different when running ssh from the command line and via libssh:
for instance, when I echo $PATH,
ssh gives me:
/home/<user>/.cargo/bin:/usr/share/swift/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bi
while libssh gives me:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
So it looks like what's happening is that the modifications made to $PATH inside .bashrc and .profile are not making it in when running via libssh.
I also get the same behavior if I run /bin/bash -c "echo ${PATH}"
Why would this be the case, and is there any way to get the same behavior in both these cases?
This comes down to the difference between login and non-login shells.
TL;DR: A login shell first reads /etc/profile and then ~/.bash_profile; a non-login shell reads /etc/bash.bashrc and then ~/.bashrc. A command executed over libssh2's exec channel gets a non-interactive, non-login shell, so those per-user files are typically not sourced and the PATH additions from .bashrc and .profile never take effect.
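One workaround (a sketch, assuming bash is the remote user's shell) is to request a login shell explicitly so the profile files are sourced before the command runs:
# from the command line, the same trick looks like this:
ssh user@host 'bash -lc "cargo build"'
# with the ssh2 crate / libssh2, pass the same string to Channel::exec, e.g.
#   channel.exec("bash -lc \"cargo build\"")?;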

Can't run powerline-config during startup (in .tmux.conf)

When I start tmux, I get a failure when trying to configure powerline. I have set an environment variable with this:
export POWERLINE_CONFIG_COMMAND=`which powerline-config`
My ~/.tmux.conf contains the following:
if-shell "POWERLINE_CONFIG_COMMAND" \
run-shell "$POWERLINE_CONFIG_COMMAND tmux setup"
The error I get is:
unknown command: /path/to/powerline-config
I can run the config command manually after tmux starts with this:
$POWERLINE_CONFIG_COMMAND tmux setup
I don't understand why tmux can't run the command during the startup when it can run just fine afterwards.
I don't understand how you get that error; as written, you should not get any message at all, and nothing should work.
if-shell "POWERLINE_CONFIG_COMMAND" \
run-shell "$POWERLINE_CONFIG_COMMAND tmux setup"
will fail, because POWERLINE_CONFIG_COMMAND is not a command. Your if-shell should have a $ in front of POWERLINE_CONFIG_COMMAND.
Let's assume that was a typo, and it's correct in your actual .conf. Then, the problem is that run-shell runs against tmux, the way it'd run if you typed <prefix>: in your tmux session.
tmux $POWERLINE_CONFIG_COMMAND tmux setup is not a valid command.
You could instead do
run-shell 'send-keys "$POWERLINE_CONFIG_COMMAND tmux setup" Enter'
if you wanted it to run in a single pane.
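As a quick sanity check, independent of .tmux.conf, you can confirm from a shell inside tmux that the variable is set and points at the right binary (plain shell, nothing tmux-specific; this only verifies the pieces the config relies on):
# should print the path reported by "which powerline-config"
echo "$POWERLINE_CONFIG_COMMAND"
# the command the config is ultimately trying to run
"$POWERLINE_CONFIG_COMMAND" tmux setup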

How to keep the snakemake shell file while running in cluster

While running my snakemake file on the cluster with the command below, I keep getting an error.
snakemake -j 20 --cluster "qsub -o out.txt -e err.txt -q debug" \
    -s seadragon/scripts/viral_hisat.snake \
    --config json="<input file>" output="<output file>"
This gives me the following error:
Error in job run_salmon while creating output file
/gpfs/home/user/seadragon/output/quant_v2_4/test.
ClusterJobException in line 58 of seadragon/scripts/viral_hisat.snake:
Error executing rule run_salmon on cluster (jobid: 1, external: 156618.sn-mgmt.cm.cluster, jobscript: /gpfs/home/user/.snakemake/tmp.j9nb0hyo/snakejob.run_salmon.1.sh). For detailed error see the cluster log.
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message
I can't find any way to track down the error, since my cluster does not give me a way to store the log files, and the /gpfs/home/user/.snakemake/tmp.j9nb0hyo/snakejob.run_salmon.1.sh file is deleted immediately after the job finishes.
Please let me know if there is a way to keep this shell file even if snakemake fails.
I am not a qsub user anymore, but if I remember correctly, stdout and stderr are stored in the working directory, under the jobid that Snakemake gives you under external in the error message.
You need to redirect the standard output and standard error output to a file yourself instead of relying on the cluster or snakemake to do this for you.
Instead of the following:
my_script.sh
run the following:
my_script.sh > output_file.txt 2> error_file.txt
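The same idea can be applied to the cluster submission itself by pointing qsub's -o and -e at a directory on the shared filesystem, so each job's stdout/stderr lands in its own file (named after the job) instead of being overwritten (a sketch; the logs directory is an assumption):
# create a log directory on the shared filesystem first
mkdir -p /gpfs/home/user/logs
# submit with -o/-e pointing at that directory; the scheduler writes one file per job there
snakemake -j 20 \
    --cluster "qsub -o /gpfs/home/user/logs/ -e /gpfs/home/user/logs/ -q debug" \
    -s seadragon/scripts/viral_hisat.snake \
    --config json="<input file>" output="<output file>"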