Setting additional jars property in Pig while running in local mode - apache-pig

When Pig is running in distributed (HDFS) mode, you can pass additional jars to it from the command line using the following syntax, so that you don't have to explicitly use REGISTER calls:
pig -Dpig.additional.jars=jar1.jar:jar2.jar -f pigfile.pig
But when I do the same thing while running in local mode, it fails:
pig -x local -Dpig.additional.jars=jar1.jar:jar2.jar -f pigfile.pig
Does anyone know how to register additional jars while running Pig in local mode?

Properties should be passed before any Pig-specific options:
pig -Dpig.additional.jars=jar1.jar:jar2.jar -x local -f pigfile.pig

Related

GitHub Actions, permission denied when using custom shell

I am trying to use a shell script as a custom shell in GitHub Actions like this:
- name: Test bash-wrapper
  shell: bash-wrapper {0}
  run: |
    echo Hello world
However, when I try to run it, I get Permission denied.
Background: I have set up a chroot jail, which I use with QEMU user mode emulation in order to build for non-IA64 architectures with toolchains that lack cross-compilation support.
The script is intended to provide a bash shell on the target architecture and looks like this:
#!/bin/bash
sudo chroot --userspec="$(whoami)":"$(whoami)" "$CROSS_ROOT" qemu-arm-static /bin/bash -c "$*"
It resides in /bin/bash-wrapper and is thus on $PATH.
Digging a bit deeper, I found:
Running bash-wrapper "echo Hello world" in a GHA step with the default shell works as expected.
Running bash-wrapper 'echo Running as $(whoami)' from the default shell correctly reports we are running as user runner.
Removing --userspec from the chroot command in bash-wrapper (thus running the command as root) does not make a difference – the custom shell gives the same error.
GHA converts each step into a script file and passes it to the shell.
File ownership on these files is runner:docker, runner being the user that runs the job by default.
Interestingly, the script files generated by GHA are not executable. I suspect that is what is ultimately causing the permission error.
Indeed, if I modify bash-wrapper to set the executable bit on the script before running it, everything works as expected.
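For reference, that modification looks roughly like this (a sketch; it assumes GHA substitutes the generated script path for {0}, so the wrapper receives it as $1):
#!/bin/bash
# GHA-generated step scripts are not executable by default, so mark
# the incoming script executable before handing it to the chroot shell
chmod +x "$1"
sudo chroot --userspec="$(whoami)":"$(whoami)" "$CROSS_ROOT" qemu-arm-static /bin/bash -c "$*"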
I imagine non-executable script files would cause all sorts of trouble with various shells, so I would expect GHA to have a way of dealing with that – in fact, I am a bit surprised these on-the-fly scripts are not executable by default.
Is there a less hacky way of fixing this, such as telling GHA to set the executable bit on temporary scripts? (How does Github expect this to be solved?)
When calling your script, try running it like this:
- name: Test bash-wrapper
  shell: bash-wrapper {0}
  run: |
    bash <your_script>.sh
Alternatively, try running this command locally, then commit and push the repository:
git update-index --chmod=+x <your_script>.sh

One-liner ssh gets different environment variables than normal ssh

I am using AWS Beanstalk, in case it may be relevant to the question.
The issue that I have is that when I do this from my local terminal:
ssh mozart-api printenv
I am missing most of the environment variables. Instead, if I do:
ssh mozart-api
..wait for it to open..
printenv
I get all the environment variables I was expecting.
At first I thought it could be an ssh configuration issue on the server, but I can't find anything strange.
If I do:
ssh mozart-api "export hello=123 && echo $hello"
then it outputs 123, which means that variables can be set and queried. However, I just cannot get the existing variables from the server.
This is causing an issue because I am preparing a script that will run a command over ssh on this server, but because the variables are not loaded, the project fails to open the database.
I tried reimporting them in a one-liner:
ssh mozart-api "sudo chmod +777 /etc/profile.d/sh.local && (/opt/elasticbeanstalk/bin/get-config environment | jq -r 'keys[] as \$k | \"echo export \(\$k)=\(.[\$k])\"') > /etc/profile.d/sh.local && printenv"
But I still can't see the newly added variables.
ssh mozart-api executes a login shell, which probably sources one or more files that define your environment variables.
ssh mozart-api printenv executes printenv instead of a login shell, so the only variables you see are the ones you inherit from the parent process, not any of the variables defined in your shell configuration files.
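A common workaround (a sketch, not specific to Beanstalk) is to request a login shell explicitly on the remote side, so the profile files get sourced before your command runs:
# bash -l starts a login shell, sourcing /etc/profile and ~/.profile
# before executing the command
ssh mozart-api 'bash -lc printenv'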

Run kubectl in Apache

I have this bash script:
#!/bin/bash
USERNAME=$1
WORKDIR="dir_$USERNAME"
# Create a per-user working directory and copy the deployment template
mkdir "deployment/$WORKDIR"
cat deployment/deploy.yml > "deployment/$WORKDIR/deploy.yml"
# Substitute the placeholder username in the template
sed -i "s/alopezfu/$USERNAME/g" "deployment/$WORKDIR/deploy.yml"
kubectl apply -f "deployment/$WORKDIR/deploy.yml"
rm -rf "deployment/$WORKDIR/"
And I use PHP's exec function to run it.
I get this message in /var/log/apache/error.log:
To view or setup config directly use the 'config' command.
error: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
error: Missing or incomplete configuration info. Please point to an existing, complete config file:
Via the command-line flag --kubeconfig
Via the KUBECONFIG environment variable
In your home directory as ~/.kube/config
I need help 🙏
Since you are running the script as a different user, you need to tell kubectl where the configuration file is.
This can be done setting the variable KUBECONFIG in your environment.
Supposing the Kubernetes config file is in the directory /var/www/ with the correct permissions to be readable, you can configure your PHP script like this:
<?php
$kubeconfig = "/var/www/config";   // the config file
putenv("KUBECONFIG=$kubeconfig");  // export KUBECONFIG to child processes
// Prefix the assignment directly (no ';'), so kubectl sees the variable
// even if putenv() is disabled for this PHP installation
$output = shell_exec("KUBECONFIG=$kubeconfig kubectl get pods -A");
echo "<pre>$output</pre>";         // and return the expected output
?>
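To verify outside PHP that the web server user can actually read and use that config (assuming Apache runs as www-data; adjust to your distribution), a quick test is:
# Run kubectl as the Apache user against the same config file
sudo -u www-data kubectl --kubeconfig=/var/www/config get pods -A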
Please be aware that:
Setting certain environment variables may be a potential security risk.
Some actions that should mitigate the impacts:
Make sure your config file is safe and not reachable from the browser;
Consider creating a ServiceAccount with limited permissions;
Here you can find some useful commands and kubectl tips.
How to create a service account for kubectl
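For example, a minimal read-only setup could look like this (a sketch; all names are illustrative, and it grants access to pods in the default namespace only):
# ServiceAccount the web app will act as
kubectl create serviceaccount web-kubectl
# Role that may only read pods
kubectl create role pod-reader --verb=get,list,watch --resource=pods
# Bind the role to the ServiceAccount
kubectl create rolebinding web-kubectl-pod-reader --role=pod-reader --serviceaccount=default:web-kubectl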

Pig basic program error

I am getting the below error while running a Pig script: [screenshot of the error]
Please read the Pig manual carefully:
https://pig.apache.org/docs/r0.9.1/start.html
and observe that -x expects the execution mode to be specified (either local or mapreduce). So the correct command would be:
pig -x local wordcount.pig

SGE Command Not Found, Undefined Variable

I'm attempting to set up a new compute cluster, and am currently experiencing errors when using the qsub command in SGE. Here's a simple experiment that shows the problem:
test.sh
#!/usr/bin/zsh
test="hello"
echo "${test}"
test.sh.eXX
test=hello: Command not found.
test: Undefined variable.
test.sh.oXX
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
If I run the script on the head node (sh test.sh), the output is correct. I submit the job to the SGE by typing "qsub test.sh".
If I submit the exact same job script in the same way on an established compute cluster like HPC, it works perfectly as expected. What setting could be causing this problem?
Thanks for any help on this matter.
Most likely the queues on your cluster are set to posix_compliant mode with a default shell of /bin/csh. The posix_compliant setting means your #! line is ignored. You can either change the queues to unix_behavior or specify the required shell using qsub's -S option.
#$ -S /bin/sh
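The same can be done on the command line when submitting (a sketch; adjust the path to wherever zsh lives on your compute nodes):
# Tell SGE which shell should interpret the job script
qsub -S /usr/bin/zsh test.sh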