Apache pig script delete a folder if exists - apache-pig

I want to delete the output folder of the previous execution through the apache pig script. This command works fine.
sh [ -e /home/LocalPig/test ] && rm -rf /home/LocalPig/test
but if I write
sh OutpuFile=/home/LocalPig/test
sh [ -e OutputFile] && rm -rf OutputFile
I got the error about OutputFile!
ERROR 2997: Encountered IOException. org.apache.pig.tools.parameters.ParameterSubstitutionException: Undefined parameter : OutputFile
Does anybody have any idea?
Thanks

Hope this solves the problem. You don't have to write any shell command: this can be done from within the Pig environment using the built-in fs command, with a single statement in the .pig script file.
For example, put a statement like the one below in your Pig script. It will not error out if the folder does not exist: it deletes the folder if it exists and otherwise exits gracefully.
fs -rm -f -r -R /user/horton/denver_total;
Of course you can also do a lot of this work outside Pig, but it is very useful to perform any delete within the script that controls creation of the data. It makes it simpler to trace the lineage of when those files are created and destroyed.

Reference: Parameter Substitution
%declare OutputFile '/home/LocalPig/test'
sh [ -e '$OutputFile' ] && rm -rf '$OutputFile'

Related

GitLab pipeline - Copy file if exists

I have a pipeline that needs to copy some files from a folder to a new one only if the files exists in the source folder.
This is my script line:
script:
- cp source_folder/file.txt dest_folder/ 2>/dev/null
I have also tried this:
script:
- test -f source_folder/file.txt && cp source_folder/file.txt dest_folder/ 2>/dev/null
but it still fails if the file does not exist.
Cleaning up project directory and file based variables.
ERROR: Job failed: exit code 1
How can I check the file and copy it only if exists?
EDIT:
This command is executed on a server that the pipeline logs into over ssh.
Check for the existence of the file (-f) and, if it exists, copy it.
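script:
  - |
    files=(conf.yaml log.txt)
    # "${files[@]}" expands to every array element; a bare $files is only the first one in bash
    for file in "${files[@]}"; do
      if [[ -f "source_folder/$file" ]]; then
        cp "source_folder/$file" dest_folder/
      fi
    done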
Take a look at other answers for one-shot less-flexible statements.
Note: I haven't tested the script above, but I'm quite familiar with GitLab pipelines and bash.
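A likely reason the `test -f ... && cp ...` attempt in the question still fails: when the file is missing, the exit status of the whole `&&` list is the non-zero status of `test`, and the job log in the question shows the job failing on exit code 1. An `if` statement whose condition is false and which has no `else` branch returns 0 instead. The sketch below only illustrates that difference in a temporary directory; the paths are placeholders, not the pipeline's real layout.

```shell
# Temporary layout standing in for the pipeline's folders (placeholder paths)
workdir=$(mktemp -d)
mkdir "$workdir/source_folder" "$workdir/dest_folder"
src="$workdir/source_folder/file.txt"   # deliberately never created

# "&&" list: with the file missing, the list's status is that of test (1)
test -f "$src" && cp "$src" "$workdir/dest_folder/"
and_status=$?

# "if" statement: false condition and no else branch gives status 0
if [ -f "$src" ]; then
  cp "$src" "$workdir/dest_folder/"
fi
if_status=$?

echo "&& list: $and_status, if statement: $if_status"
rm -rf "$workdir"
```

This is why the `if`-based loop above does not fail the job for a missing file, while the one-line `&&` form does unless something like `|| true` is appended.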

Can cd commands be used multiple times in a script?

I am writing a script. Please confirm whether I can use multiple cd commands in it, as I have to create directories and cd into them several times to make the job run. Can I use cd again and again?
I have created a small function to mkdir and cd in one command, but it's not working.
1.
function mkdircd () { mkdir -p testjdk && eval cd "$_" ; }
mkdircd /tmp/testjdk
pwd
2.
mkdir test && cd "$_"
However, the 2nd one works if I run it directly at the prompt, but inside the script it's not working.
I am assuming you want a bash script that makes a directory and then cds into it? Something similar to what is shown below will work.
You need to pass an argument to the script and, inside the script, pass it on to the function. At the function call, $1 is the argument you gave the script on the command line; inside the function, $1 is the argument that was passed to the function.
Say this script is named test.sh. You would run it with something like source test.sh ./my_dir, where ./my_dir is the relative path to the directory that you want to create and enter. Sourcing matters here: a script run as a child process cannot change the working directory of the shell that launched it. If you want to create the directory under the filesystem root, specify the full path; creating it there needs root privileges (e.g. sudo).
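#!/bin/bash
# Function: create the directory given as the first argument, then enter it
myFunction() {
  mkdir -p "$1" && cd "$1"
}
# Function call: forward the script's first argument (quoted, so paths with spaces work)
myFunction "$1"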
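For comparison with the function in the question, which hard-codes testjdk and ignores the path it is called with, here is a minimal sketch of a mkdir-and-cd helper that uses its argument. The name mkdircd comes from the question; the temporary base directory is only an assumption for the demonstration.

```shell
# Create the directory named by the first argument and enter it.
# "--" stops option parsing, so names that start with "-" are handled too.
mkdircd() {
  mkdir -p -- "$1" && cd -- "$1"
}

# Demonstration in a throwaway location (placeholder path)
base=$(mktemp -d)
mkdircd "$base/testjdk"
pwd
```

Because the function runs in the current shell, cd takes effect for every later command in the same script, and it can be called as many times as needed.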

Exit code from docker-compose breaking while loop

I've got a case: there's a WordPress project where I'm supposed to create a script for updating plugins and committing source changes to a separate branch. While doing this I ran into a strange issue.
Input variable:
akimset,4.0.3
all-in-one-wp-migration,6.71
What I wanted to do was iterate over each line of this variable
while read -r line; do
echo $line
done <<< "$variable"
and this piece of code worked perfectly fine, but when I added the docker-compose logic everything started to act weirdly
while read -r line; do
docker-compose run backend echo $line
done <<< "$variable"
Now only one line was executed, and after that the script exited with 0 and stopped iterating. I found a workaround:
echo $variable > file.tmp
for line in $(cat file.tmp); do
docker-compose run backend echo $line
done
and that works perfectly fine and iterates over each line. Now my question is: why? ZSH and shell scripting can be a bit mysterious, and running into edge cases like this one isn't anything new for me, but I'm wondering why a successfully executed command broke the input stream.
The problem with this
while read -r line; do
docker-compose run backend echo $line
done <<< "$variable"
is that docker-compose run allocates a pseudo-TTY and attaches to standard input. After its first execution (the first loop iteration) it reads from the loop's input, using up the remaining lines, so read finds nothing left.
You have to pass the -T parameter to the 'docker-compose run' command to stop Docker from allocating a pseudo-TTY. Working code then looks like this:
while read -r line; do
docker-compose run -T backend echo $line
done <<< "$variable"
Update
The above solution is for Docker version 18 and docker-compose version 1.17. For newer versions, where the -T parameter does not work, you can try:
-d instead of -T to run the container in background mode, but then you will not see stdout in the terminal.
If you have docker-compose v1.25.0, add the parameter stdin_open: false to the service in your docker-compose.yml.
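The stdin behaviour described in this answer can be reproduced without Docker. In the sketch below, greedy is a hypothetical stand-in for any command that reads standard input inside the loop (it is not docker-compose itself), and redirecting its input from /dev/null is a general shell technique rather than something taken from the Docker documentation. A here-document is used instead of <<< only so that the sketch also runs in plain sh.

```shell
variable=$(printf 'akimset,4.0.3\nall-in-one-wp-migration,6.71')

# Stand-in for a command that reads stdin (hypothetical, replaces docker-compose run)
greedy() { cat > /dev/null; echo "processed: $1"; }

# Broken: greedy swallows the rest of the loop's input after the first line
broken_count=0
while read -r line; do
  greedy "$line"
  broken_count=$((broken_count + 1))
done <<EOF
$variable
EOF

# Fixed: give the inner command its own (empty) stdin
fixed_count=0
while read -r line; do
  greedy "$line" < /dev/null
  fixed_count=$((fixed_count + 1))
done <<EOF
$variable
EOF

echo "broken=$broken_count fixed=$fixed_count"
```

The first loop runs once and the second runs twice, which matches the symptom in the question: the loop ends normally with status 0 because read simply reaches end of input.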
I was able to solve the same problem by using a different loop :
for line in $(cat $variable)
do
docker-compose run backend echo $line
done
I ran into a nearly identical problem about a year ago, though the shell was bash (the command/problem was also slightly different, but it applied to your issue). I ended up writing the script in zsh.
I'm not certain what's going on, but it's not actually the exit code (you can confirm by running the following):
variable=$'akimset,4.0.3\nall-in-one-wp-migration,6.71'
while read line; do docker-compose run backend print "$line"; print "$?"; done <<<($variable)
... which yielded ...
(akimset,4.0.3
0
(I'm not at all sure where the ( came from and perhaps solving that would answer why this problem happens)
Working Script
for line in "${(f)variable}"; do
docker-compose run backend echo "$line"
done
The (f) flag tells zsh to split on newlines; the "${(f)variable}" is in quotes so that any blank lines aren't lost. If you're going to include escape sequences that you do not want converted to the corresponding values (something that I often need when reading file contents from a variable), make the flags (fV).

Handle gsutil ls and rm command errors if no files present

I am running the following command to remove files from a gcs bucket prior to loading new files there.
gsutil -m rm gs://mybucket/subbucket/*
If there are no files in the bucket, it throws the "CommandException: One or more URLs matched no objects".
I would like it to delete the files if they exist, without throwing the error.
The same error occurs with gsutil ls gs://mybucket/subbucket/*
How can I rewrite this without having to handle the exception explicitly? Or, how can I best handle these exceptions in a batch script?
Try this:
gsutil -m rm gs://mybucket/foo/* 2> /dev/null || true
Or:
gsutil -m ls gs://mybucket/foo/* 2> /dev/null || true
This has the effect of suppressing stderr (it's redirected to /dev/null) and returning a success exit code even on failure.
You might not want to ignore all errors, as a failure might indicate something other than "file not found". The following script ignores only the 'One or more URLs matched no objects' error and shows you any other error. If there is no error, it just deletes the files:
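gsutil -m rm gs://mybucket/subbucket/* 2> temp
if [ $? -eq 1 ]; then
  # Look for the expected "nothing matched" message in the captured stderr
  if grep -q 'One or more URLs matched no objects' temp; then
    echo "no such file"
  else
    # Any other error: print the captured stderr
    cat temp
  fi
fi
rm temp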
This redirects stderr to a temp file and checks the message to decide whether to ignore it or show it.
It also works for single-file deletions. I hope it helps.
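The same idea can be wrapped in a small helper so the temp file gets a unique name and is always cleaned up. This is only a sketch: run_ignoring is a hypothetical name, not part of gsutil, and it assumes mktemp is available. Like the script above, it hides stderr when the command succeeds.

```shell
# run_ignoring MESSAGE COMMAND [ARGS...]
# Runs COMMAND; a failure whose stderr contains MESSAGE is treated as success,
# any other failure prints the captured stderr and keeps its exit status.
run_ignoring() {
  pattern=$1
  shift
  errfile=$(mktemp) || return 1
  status=0
  "$@" 2> "$errfile" || status=$?
  if [ "$status" -ne 0 ]; then
    if grep -q -F -- "$pattern" "$errfile"; then
      status=0                 # expected "nothing to delete" case
    else
      cat "$errfile" >&2       # unexpected error: show it
    fi
  fi
  rm -f "$errfile"
  return "$status"
}

# Intended use (not executed here, needs gsutil and a real bucket):
# run_ignoring 'One or more URLs matched no objects' gsutil -m rm 'gs://mybucket/subbucket/*'
```

Matching on the message text is fragile if the wording ever changes, so the pattern is passed in as an argument rather than hard-coded.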
Refs:
How to grep standard error stream
Bash Reference Manual - Redirections
You may like rsync for syncing files and folders to a bucket. I used it to clear a folder in a bucket and replace it with new files from my build script.
gsutil rsync -d newdata gs://mybucket/data   # replaces the data folder with newdata

Running .sh scripts in Git Bash

I'm on a Windows machine using Git 2.7.2.windows.1 with MinGW 64.
I have a script in C:/path/to/scripts/myScript.sh.
How do I execute this script from my Git Bash instance?
It was possible to add it to the .bashrc file and then just execute the entire bashrc file.
But I want to add the script to a separate file and execute it from there.
Let's say you have a script script.sh. To run it (using Git Bash), you do the following: [a] add a shebang line as the first line (e.g. #!/bin/bash) and then [b]:
# Use ./ (or any valid dir spec):
./script.sh
Note: chmod +x does nothing to a script's executability on Git Bash. It won't hurt to run it, but it won't accomplish anything either.
#!/usr/bin/env sh
This is how Git Bash knows a file is executable. chmod a+x does nothing in Git Bash. (Note: any shebang will work, e.g. #!/bin/bash, etc.)
If you wish to execute a script file from the git bash prompt on Windows, just precede the script file with sh
sh my_awesome_script.sh
If you are on Linux (e.g. Ubuntu), write ./file_name.sh
If you are on Windows, just write sh before the file name, like sh file_name.sh
For Linux -> ./file_name.sh
For Windows -> sh file_name.sh
If you're running an export command in your bash script, the solutions given above may not export anything into your current shell even though the script runs. As an alternative, you can run your script using
. script.sh
Now if you echo your variable, it will be shown. Here is the result in my Git Bash:
(coffeeapp) user (master *) capstone
$ . setup.sh
done
(coffeeapp) user (master *) capstone
$ echo $ALGORITHMS
[RS256]
(coffeeapp) user (master *) capstone
$
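The difference between executing and sourcing can be checked with a throwaway script. In this sketch the file name setup.sh is borrowed from the session above, while DEMO_VAR and its value are made up for the illustration.

```shell
unset DEMO_VAR

# Throwaway script that exports a variable (hypothetical name and value)
tmpdir=$(mktemp -d)
printf 'export DEMO_VAR=hello\n' > "$tmpdir/setup.sh"

# Executed as a child process: the export stays in the child and is lost
sh "$tmpdir/setup.sh"
after_run=${DEMO_VAR:-unset}

# Sourced with ".": the export lands in the current shell
. "$tmpdir/setup.sh"
after_source=${DEMO_VAR:-unset}

echo "after sh: $after_run, after source: $after_source"
rm -rf "$tmpdir"
```

A child process gets a copy of the environment, so nothing it exports can flow back to the shell that started it. Sourcing runs the script's commands in the current shell instead.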
I had a similar problem, but I was getting an error message
cannot execute binary file
I discovered that the filename contained non-ASCII characters. When those were fixed, the script ran fine with ./script.sh.
Once you're in the directory, just run it as ./myScript.sh
If by any chance you've changed the default program for opening .sh files to a text editor, like I had, you can just run "bash .\yourscript.sh", provided you have Git Bash installed and in your path.
I had two .sh scripts to start and stop DigitalOcean servers that I wanted to run from Windows 10. What I did:
Downloaded "Git for Windows" (from https://git-scm.com/download/win).
Installed Git.
To execute the .sh script, I just double-clicked the script file and it started running.
Now, to run the script each time, I just double-click it.
#!/bin/bash at the top of the file automatically makes the .sh file executable.
I agree the chmod does not do anything but the above line solves the problem.
You can either give the full path to the script in Git Bash to execute it, or add the directory that contains it to the PATH variable
export PATH=$PATH:/path/to/the/script
Then you can run it from anywhere.