CommandException: Caught non-retryable exception - aborting rsync - gsutil

After using gsutil for more than a year, I suddenly get this error:
.....
At destination listing 8350000...
At destination listing 8360000...
CommandException: Caught non-retryable exception - aborting rsync
.....
I tried to locate the files with this sync problem but I am not able to do so. Is there a "skip error" option, or is there a way I can make gsutil more verbose?
My command line is like this:
gsutil -V -m rsync -d -r -U -P -C -e -x 'Download/*' /opt/ gs://mybucket1/kraanloos/
I have created a script to split the problem. This gives me more info for a solution:
#!/bin/bash
array=(
3ware
AirTime
Amsterdam
BigBag
Download
guide
home
Install
Holding
Multimedia
newsite
Overig
Trak-r
)
for i in "${array[#]}"
do
echo Processing : $i
PROCESS="/usr/bin/gsutil -m rsync -d -r -U -P -C -e -x 'Backup/*' /opt/$i/ gs://mybucket1/kraanloos/$i/"
echo $PROCESS
$PROCESS
echo ""
echo ""
done

I've been struggling with the same problem for the last few days. One way to make it super verbose is to put the -D flag before the rsync argument, as in:
gsutil -D rsync ...
By doing that, I found that my problem was due to having # characters in filenames, as described in a related question.
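If you suspect the same cause, a quick check like this (my own sketch, reusing the /opt path from the question) will list any offending filenames:
# List files under /opt whose names contain a '#' character
find /opt -name '*#*' -print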

In my case, it was because of a broken link to a directory.
As blambert said, use the -D option to see exactly what file causes the problem.
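A quick way to hunt for broken links (again just a sketch against the /opt path from the question):
# GNU find: -xtype l matches symlinks whose targets no longer exist
find /opt -xtype l -print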

I struggled with this problem as well and have now figured it out.
You need to re-authenticate your Google Cloud SDK shell and set a target project again.
It seems that rsync will not show the correct error message.
Try cp instead; it will guide you to authenticate and set the correct primary project:
gsutil cp OBJECT_LOCATION gs://DESTINATION_BUCKET_NAME/
After that, your gsutil rsync should run fine.
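For reference, the re-authentication flow usually looks something like this (PROJECT_ID is a placeholder for your own project):
# Log in again and point gsutil at the right project
gcloud auth login
gcloud config set project PROJECT_ID
# Then the cp above, and afterwards rsync, should work
gsutil cp OBJECT_LOCATION gs://DESTINATION_BUCKET_NAME/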

Related

Run RapSearch-Program with Torque PBS and qsub

My problem is that I have a cluster-server with Torque PBS and want to use it to run a sequence-comparison with the program rapsearch.
The normal RapSearch command is:
./rapsearch -q protein.fasta -d database -o output -e 0.001 -v 10 -x t -z 32
Now I want to run it with 2 nodes on the cluster-server.
I've tried with: echo "./rapsearch -q protein.fasta -d database -o output -e 0.001 -v 10 -x t -z 32" | qsub -l nodes=2 but nothing happened.
Do you have any suggestions? Where am I going wrong? Please help.
Standard output (and error output) files are placed in your home directory by default; take a look. You are looking for a file named STDIN.e[numbers]; it will contain the error message.
However, I see that you're using ./rapsearch but are not really being explicit about what directory you're in. Your problem is therefore probably a matter of changing directory into the directory that you submitted from. When your terminal is in the directory of the rapsearch executable, try echo "cd \$PBS_O_WORKDIR && ./rapsearch [arguments]" | qsub [arguments] to submit your job to the cluster.
Other tips:
You could add rapsearch to your path if you use it often. Then you can use it like a regular command anywhere. It's a matter of adding the line export PATH=/full/path/to/rapsearch/bin:$PATH to your .bashrc file.
Create a submission script for use with qsub; a minimal example is sketched below.
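For instance, a submission script might look like this sketch (the file name rapsearch.pbs and the directives are illustrative; adjust the resource request to your cluster):
#!/bin/bash
#PBS -N rapsearch
#PBS -l nodes=2
#PBS -j oe

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"
./rapsearch -q protein.fasta -d database -o output -e 0.001 -v 10 -x t -z 32
Submit it with qsub rapsearch.pbs.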

Open PDF found with volatility

My task is to analyze a memory dump. I've found the location of a PDF file and I want to analyze it with VirusTotal, but I can't figure out how to "download" it from the memory dump.
I've already tried it with this command:
python vol.py -f img.vmem dumpfiles -r pdf$ -i --name -D dumpfiles/
But in my dumpfiles directory there is just a .vacb file, which is not a valid PDF.
I think you may have missed a command line argument from your command:
python vol.py -f img.vmem dumpfiles -r pdf$ -i --name -D dumpfiles/
If you are not getting a .dat file in your output folder, you can add -u:
-u, --unsafe Relax safety constraints for more data
I can't test this without access to the dump, but you should be able to rename the .dat file created to .pdf.
So it should look something like this:
python vol.py -f img.vmem dumpfiles -r pdf$ -i --name -D dumpfiles/ -u
You can check out the documentation on the commands here.
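If the dump does produce .dat files, a small loop like this (paths are illustrative and assume the dumpfiles/ directory from the commands above) renames them for upload:
# Rename every .dat produced by dumpfiles to .pdf
for f in dumpfiles/*.dat; do
    mv "$f" "${f%.dat}.pdf"
done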
VACB is "virtual address control block". Your output type seems to be wrong.
Try something like:
$ python vol.py -f img.vmem dumpfiles --output=pdf --output-file=bla.pdf --profile=[your profile] -D dumpfiles/
Or check out the cheat sheet here.

Rsync issue pulling from remote server to local server

I've been hitting a wall with this one and I'm not entirely sure what the issue may be. I will post two examples of what I'm trying to do: one which works and one which doesn't; unfortunately, I need the latter to work.
The Error:
Unexpected local arg: /data/
If arg is a remote file/dir, prefix it with a colon (:).
rsync error: syntax or usage error (code 1) at main.c(1215) [receiver=3.0.6]
This command executes without issues; however, it seems that it does not fully rsync my /data directory.
rsync -e "ssh -e 'none'" --force -azPxvIh --delete-after --exclude-from="/tmp/exclude.txt" root#myserver:/ / > "/tmp/wetrun.txt"
This is the command I've been trying to utilize to rsync my /data directory, and it seems to fail.
rsync -e "ssh -e 'none'" --force -azPxvIh --delete-after --exclude-from="/tmp/exclude.txt" root#myserver:/data/ /data/ > "/tmp/wetrun.txt"

Error when cassandra-cli command is executed over ssh

I have two servers, A and B. I have a shell script on serverA which logs into serverB (through ssh) and runs the following command:
sh cassandra-cli -h <serverB> -v -f database_import.txt;
so when I do this manually, I follow these steps:
serverA:~$ ssh serverB
serverB:~$ sh cassandra-cli -h <serverB> -v -f database_import.txt;
It works properly when I follow these steps manually, but when I automate this process in a shell script with the following line:
serverA:~$ ssh serverB "sh cassandra-cli -h <serverB> -v -f database_import.txt;"
I get this error,
cassandra-cli: 46: cassandra-cli: -ea: not found
So, as you already pointed out, $JAVA is empty through ssh.
This is because .bashrc is not sourced when you log in using ssh. You can source it like this:
. ~/.bashrc
And your command is going to look like this:
ssh serverB ". ~/.bashrc; sh cassandra-cli -h <serverB> -v -f database_import.txt;"
You can also try placing this into your .bash_profile instead of invoking it manually each time.
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
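Another option (a sketch, assuming bash is the login shell on serverB) is to force a login shell on the remote side so the profile files are read automatically:
# bash -l sources ~/.bash_profile before running the command
ssh serverB "bash -lc 'sh cassandra-cli -h <serverB> -v -f database_import.txt'"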

Is there a curl/wget option that prevents saving files in case of http errors?

I want to download a lot of urls in a script but I do not want to save the ones that lead to HTTP errors.
As far as I can tell from the man pages, neither curl nor wget provides such functionality.
Does anyone know about another downloader that does?
I think the -f option to curl does what you want:
-f, --fail
(HTTP) Fail silently (no output at all) on server errors. This is mostly done to better
enable scripts etc to better deal with failed attempts. In normal cases when an HTTP
server fails to deliver a document, it returns an HTML document stating so (which often
also describes why and more). This flag will prevent curl from outputting that and
return error 22. [...]
However, if the response was actually a 301 or 302 redirect, that still gets saved, even if its destination would result in an error:
$ curl -fO http://google.com/aoeu
$ cat aoeu
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
To follow the redirect to its dead end, also give the -L option:
-L, --location
(HTTP/HTTPS) If the server reports that the requested page has moved to a different
location (indicated with a Location: header and a 3XX response code), this option will
make curl redo the request on the new place. [...]
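Putting the two flags together (the URL and output name here are placeholders; depending on your curl version you may still need to clean up an empty output file on failure):
# Fail on HTTP errors and follow redirects before deciding
curl -fL -o page.html http://example.com/some/path || echo "download failed"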
One-liner I just set up for this very purpose:
(works only with a single file, might be useful for others)
A=$$; ( wget -q "http://foo.com/pipo.txt" -O $A.d && mv $A.d pipo.txt ) || (rm $A.d; echo "Removing temp file")
This will attempt to download the file from the remote host. If there is an error, the file is not kept. In all other cases, it's kept and renamed.
Ancient thread.. landed here looking for a solution... ended up writing some shell code to do it.
if [ "$(curl -s -w "%{http_code}" --compress -o /tmp/something \
      http://example.com/my/url/)" = "200" ]; then
    echo "yay"; cp /tmp/something /path/to/destination/filename
fi
This will download output to a tmp file, and create/overwrite the output file only if the status was 200. My use case is slightly different: in my case the output takes > 10 seconds to generate, and I did not want the destination file to remain blank for that duration.
NOTE: I am aware that this is an older question, but I believe I have found a better solution for those using wget than any of the above answers provide.
wget -q $URL 2>/dev/null
This will save the target file to the local directory if and only if the HTTP status code is within the 200 range (OK).
Additionally, if you wanted to do something like print out an error whenever the request was met with an error, you could check the wget exit code for non-zero values like so:
wget -q $URL 2>/dev/null
if [ $? != 0 ]; then
    echo "There was an error!"
fi
I hope this is helpful to someone out there facing the same issues I was.
Update:
I just put this into a more scriptable form for my own project and thought I'd share:
function dl {
    pushd . > /dev/null
    cd $(dirname $1)
    wget -q $BASE_URL/$1 2> /dev/null
    if [ $? != 0 ]; then
        echo ">> ERROR could not download file \"$1\"" 1>&2
        exit 1
    fi
    popd > /dev/null
}
I have a workaround to propose: it does download the file, but it also removes it if its size is 0 (which happens if a 404 occurs).
wget -O <filename> <url/to/file>
if [[ $(du <filename> | cut -f 1) == 0 ]]; then
    rm <filename>;
fi;
It works for zsh but you can adapt it for other shells.
But it only saves the file in the first place if you provide the -O option.
As an alternative, you can download to a temporary file and rotate it:
wget http://example.net/myfile.json -O myfile.json.tmp -t 3 -q && mv myfile.json.tmp myfile.json
The previous command always downloads to "myfile.json.tmp", but only when the wget exit status is 0 is the file rotated to "myfile.json".
This prevents the final file from being overwritten when a network failure occurs.
The advantage of this method is that, if something goes wrong, you can inspect the temporary file and see what error message was returned.
The "-t" parameter attempts the download several times in case of error.
The "-q" enables quiet mode, which is important under cron because cron reports any output from wget.
The "-O" is the output file path and name.
Remember that for cron schedules it is very important to always provide the full path for all files, and in this case for the "wget" program itself as well.
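For example, a crontab entry along those lines might look like this sketch (paths and URL are placeholders):
# Every 15 minutes: download to a temp file, rotate only on success, absolute paths everywhere
*/15 * * * * /usr/bin/wget -q -t 3 -O /var/data/myfile.json.tmp http://example.net/myfile.json && /bin/mv /var/data/myfile.json.tmp /var/data/myfile.json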
You can download the file without saving it by using the "-O -" option, as in:
wget -O - http://jagor.srce.hr/
You can get more information at http://www.gnu.org/software/wget/manual/wget.html#Advanced-Usage