Difference between &> and > in Hive

Can anyone tell me the difference between &> and > in Hive?
When should I use which one?
For my work, both hive -f Script.hql > output.xls and hive -f Script.hql &> output.xlsx seem to work.

This is not specific to Hive; it is a feature of your shell. command > file redirects stdout to a file, while command &> file redirects both stdout and stderr to the same file (in Bash, &> file is shorthand for > file 2>&1).
See also Redirect stderr and stdout in a Bash script
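To see the difference without running Hive, here is a minimal Bash sketch. The emit function is a made-up stand-in for any command that writes results to stdout and log messages to stderr; the file names are arbitrary.

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for a command such as hive:
# results go to stdout, log messages go to stderr.
emit() {
    echo "query result"
    echo "log message" >&2
}

workdir=$(mktemp -d)

# '>' captures stdout only; the stderr line still shows up on the terminal.
emit > "$workdir/stdout_only.txt"

# '&>' captures both streams in the same file (Bash syntax).
emit &> "$workdir/both.txt"

echo "--- stdout_only.txt ---"
cat "$workdir/stdout_only.txt"
echo "--- both.txt ---"
cat "$workdir/both.txt"
```

So if you want only the query results in the output file, use >; if you also want the log and error messages in it, use &>.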

Related

Handle gsutil ls and rm command errors if no files present

I am running the following command to remove files from a gcs bucket prior to loading new files there.
gsutil -m rm gs://mybucket/subbucket/*
If there are no files in the bucket, it throws "CommandException: One or more URLs matched no objects".
I would like it to delete the files if they exist, without throwing the error.
The same error occurs with gsutil ls gs://mybucket/subbucket/*
How can I rewrite this without having to handle the exception explicitly? Or, what is the best way to handle these exceptions in a batch script?

Try this:
gsutil -m rm gs://mybucket/foo/* 2> /dev/null || true
Or:
gsutil -m ls gs://mybucket/foo/* 2> /dev/null || true
This suppresses stderr (it is redirected to /dev/null) and returns a success exit code even on failure.

You might not want to ignore all errors, as a failure might indicate something other than "file not found". The following script ignores only the 'One or more URLs matched no objects' error and shows you any other error. If there is no error, it just deletes the files:
gsutil -m rm gs://mybucket/subbucket/* 2> temp
if [ $? -ne 0 ]; then
    # Ignore only the "no objects" error; print anything else.
    if grep -q 'One or more URLs matched no objects' temp; then
        echo "no such file"
    else
        cat temp
    fi
fi
rm temp
This redirects stderr to a temp file and checks the message to decide whether to ignore it or show it.
It also works for single-file deletions. I hope it helps.
Refs:
How to grep standard error stream
Bash Reference Manual - Redirections
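If you need this in more than one place, the same idea can be wrapped in a function. This is only a sketch: run_ignoring is a name I made up, and it assumes the command reports the error on stderr and exits non-zero, as gsutil does in the question.

```shell
#!/usr/bin/env bash
# run_ignoring (hypothetical helper): run a command and treat a failure as
# success only when its stderr contains the given message.
run_ignoring() {
    local pattern=$1; shift
    local errfile status=0
    errfile=$(mktemp) || return 1
    "$@" 2> "$errfile" || status=$?
    if [ "$status" -ne 0 ]; then
        if grep -qF -- "$pattern" "$errfile"; then
            status=0              # the expected "nothing to delete" case
        else
            cat "$errfile" >&2    # surface any other error
        fi
    fi
    rm -f "$errfile"
    return "$status"
}

# Intended usage (not executed here):
# run_ignoring 'One or more URLs matched no objects' \
#     gsutil -m rm 'gs://mybucket/subbucket/*'
```

Note that whatever the command writes to stderr on success is discarded, just like in the 2> /dev/null version.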

You might prefer gsutil rsync for syncing files and folders to a bucket. I used this to clear a folder in a bucket and replace it with new files from my build script.
gsutil rsync -d newdata gs://mybucket/data
This replaces the contents of the data folder with newdata (-d deletes files in the destination that are not in the source).

Executing a SQL file in interactive impala-shell session

In an interactive impala-shell session, is there a way to load and execute a text file containing one or more SQL statements? In Hive's beeline, for example, you can use !run <filename> to run the SQL commands in that file.

This is not currently possible. You can file a JIRA.

I believe it is possible - see impala-shell -h (version v2.1.1-cdh5):
-f QUERY_FILE, --query_file=QUERY_FILE
Execute the queries in the query file, delimited by ;
[default: none]
combine this with shell command in interactive mode:
shell impala-shell -f file;

Open PDF found with volatility

My task is to analyze a memory dump. I've found the location of a PDF file and I want to analyze it with VirusTotal, but I can't figure out how to "download" it from the memory dump.
I've already tried this command:
python vol.py -f img.vmem dumpfiles -r pdf$ -i --name -D dumpfiles/
But my dumpfiles directory contains just a .vacb file, which is not a valid PDF.

I think you may have missed a command-line argument in your command:
python vol.py -f img.vmem dumpfiles -r pdf$ -i --name -D dumpfiles/
If you are not getting a .dat file in your output folder you can add -u:
-u, --unsafe Relax safety constraints for more data
I can't test this without access to the dump, but you should be able to rename the .dat file it creates to .pdf.
So it should look something like this:
python vol.py -f img.vmem dumpfiles -r pdf$ -i --name -D dumpfiles/ -u
You can check out the documentation on the commands here

VACB is "virtual address control block". Your output type seems to be wrong.
Try something like:
$ python vol.py -f img.vmem dumpfiles --output=pdf --output-file=bla.pdf --profile=[your profile] -D dumpfiles/
or check out the cheat sheet: here

cron job script creating unwanted files

I have a cron job script and I used >/dev/null 2>&1 to stop it from sending emails. But each time it runs, a file is created with the same name as the PHP file plus a trailing number, like phpfile.php.1, phpfile.php.2, phpfile.php.3….
Is there any way to stop that?

Add -O /dev/null to your wget command.

Is there a curl/wget option that prevents saving files in case of http errors?

I want to download a lot of urls in a script but I do not want to save the ones that lead to HTTP errors.
As far as I can tell from the man pages, neither curl nor wget provides such functionality.
Does anyone know of another downloader that does?

I think the -f option to curl does what you want:
-f, --fail
(HTTP) Fail silently (no output at all) on server errors. This is mostly done to better
enable scripts etc to better deal with failed attempts. In normal cases when an HTTP
server fails to deliver a document, it returns an HTML document stating so (which often
also describes why and more). This flag will prevent curl from outputting that and
return error 22. [...]
However, if the response was actually a 301 or 302 redirect, that still gets saved, even if its destination would result in an error:
$ curl -fO http://google.com/aoeu
$ cat aoeu
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
To follow the redirect to its dead end, also give the -L option:
-L, --location
(HTTP/HTTPS) If the server reports that the requested page has moved to a different
location (indicated with a Location: header and a 3XX response code), this option will
make curl redo the request on the new place. [...]

One liner I just setup for this very purpose:
(works only with a single file, might be useful for others)
A=$$; ( wget -q "http://foo.com/pipo.txt" -O $A.d && mv $A.d pipo.txt ) || (rm $A.d; echo "Removing temp file")
This will attempt to download the file from the remote Host. If there is an Error, the file is not kept. In all other cases, it's kept and renamed.

Ancient thread, but I landed here looking for a solution and ended up writing some shell code to do it.
if [ "$(curl -s -w "%{http_code}" --compressed -o /tmp/something \
    http://example.com/my/url/)" = "200" ]; then
    echo "yay"; cp /tmp/something /path/to/destination/filename
fi
This downloads the output to a temp file and creates or overwrites the output file only if the status was 200. My use case is slightly different: the output takes more than 10 seconds to generate, and I did not want the destination file to remain blank for that duration.

NOTE: I am aware that this is an older question, but I believe I have found a better solution for those using wget than any of the above answers provide.
wget -q $URL 2>/dev/null
Will save the target file to the local directory if and only if the HTTP status code is within the 200 range (Ok).
Additionally, if you wanted to do something like print out an error whenever the request was met with an error, you could check the wget exit code for non-zero values like so:
wget -q $URL 2>/dev/null
if [ $? != 0 ]; then
    echo "There was an error!"
fi
I hope this is helpful to someone out there facing the same issues I was.
Update:
I just put this into a more script-able form for my own project, and thought I'd share:
function dl {
    pushd . > /dev/null
    cd "$(dirname "$1")"
    wget -q "$BASE_URL/$1" 2> /dev/null
    if [ $? != 0 ]; then
        echo ">> ERROR could not download file \"$1\"" 1>&2
        exit 1
    fi
    popd > /dev/null
}

I have a workaround to propose: it does download the file, but it also removes it if its size is 0 (which happens if a 404 occurs).
wget -O <filename> <url/to/file>
if [[ $(du <filename> | cut -f 1) == 0 ]]; then
    rm <filename>
fi
It works in zsh, but you can adapt it for other shells.
Note that wget only leaves an empty file behind in the first place if you provide the -O option.

As an alternative, you can download to a temporary file and rotate it into place:
wget http://example.net/myfile.json -O myfile.json.tmp -t 3 -q && mv myfile.json.tmp myfile.json
This command always downloads to "myfile.json.tmp", but the file is rotated to "myfile.json" only when the wget exit status is 0.
This prevents the final file from being overwritten when a network failure occurs.
The advantage of this method is that if something goes wrong, you can inspect the temporary file and see what error message was returned.
The "-t" parameter makes wget retry the download several times in case of error.
The "-q" parameter is quiet mode, which is important with cron because cron will report any output from wget.
The "-O" parameter is the output file path and name.
Remember that for cron schedules it is very important to always provide the full path for all files, and in this case for the "wget" program itself as well.
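Several answers above use the same pattern: download to a temporary file, then move it into place only on success. Here is a minimal sketch of that pattern with curl. The function name fetch_if_ok and the example URL and path are placeholders I made up; -f, -s, -S, -L and -o are standard curl options.

```shell
#!/usr/bin/env bash
# fetch_if_ok (hypothetical helper): download URL to DEST, but leave DEST
# untouched unless the download succeeds.
fetch_if_ok() {
    local url=$1 dest=$2 tmp
    # Create the temp file next to DEST so the final mv stays on one filesystem.
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    # -f: exit non-zero on HTTP errors (>= 400); -L: follow redirects;
    # -sS: no progress meter, but still print error messages.
    if curl -fsSL -o "$tmp" "$url"; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Intended usage (not executed here; URL and path are placeholders):
# fetch_if_ok "http://example.net/myfile.json" "/full/path/to/myfile.json" \
#     || echo "download failed, keeping the old file" >&2
```

Keep in mind that mktemp creates the file with mode 600, so you may need a chmod after the mv if other users have to read it.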

You can download the file without saving using "-O -" option as
wget -O - http://jagor.srce.hr/
You can get more information at http://www.gnu.org/software/wget/manual/wget.html#Advanced-Usage
