Batch script - redirecting - possible file lock?

I have a batch script that unzips some files from a folder, and this script may be called several times.
For unzipping I use unzip.exe, and I log the result to a log file. For instance, this is what goes into the log file:
ECHO %DATE% - %TIME% >> Unzipped.log
ECHO ERROR LEVEL IS: !ERRORLEVEL! >> Unzipped.log
ECHO Error with file %1 >> Unzipped.log
My question is: is it possible to run into a file lock on the "Unzipped.log" file if my batch script is called several times in a short time period?
I've tried to Google this but with no luck. The only time I have seen a problem is when I open the "Unzipped.log" file in Word; then my batch script won't write to it. When I have it open in Notepad/Notepad++ there is no problem writing to the log file.

Yes, you most definitely can get a failure due to file locking if a batch process attempts to open the file for writing while another process already has it open for writing. The two processes could be on the same machine, or they could be on different machines if you are dealing with a file on a shared network drive. Both processes could be batch processes, but they don't have to be.
It is possible to safely write to a log file "simultaneously" from multiple batch processes with a little bit of code to manage the locking of the file. See How do you have shared log files under Windows?
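That linked answer boils down to retrying the append until the redirection succeeds. A minimal sketch of the idea, adapted to the log lines from the question (the :retryLog label, the ping-based delay, and the RC variable are illustrative choices, not taken verbatim from that answer):
@echo off
setlocal EnableDelayedExpansion

rem ... unzip.exe runs here ...

rem Capture unzip's exit code before any later command resets ERRORLEVEL.
set "RC=%ERRORLEVEL%"

:retryLog
2>nul (
  >>Unzipped.log (
    echo %DATE% - %TIME%
    echo ERROR LEVEL IS: !RC!
    echo Error with file %1
  )
) || (
  rem Log is locked by another instance: wait about a second and try again.
  ping -n 2 localhost >nul
  goto retryLog
)
The 2>nul hides cmd's "being used by another process" message, and the || clause only fires when the >>Unzipped.log redirection fails because another instance has the log open.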

Related

How to delete remote file using Kettle Pentaho

I have a directory on a remote Linux machine where files are being archived and kept for a certain period of time. I want to delete a file from the remote (Linux) machine using a Kettle transformation, based on some condition.
If the file does not exist, the job should not throw any error, but if the file exists at the remote location, the job should delete it, or raise an error if it fails for some other reason, e.g. a permission issue.
Here, the file name will be retrieved as a variable from the previous steps of the transformation, and the directory path of the archived files will be fixed.
How can I achieve this in a Pentaho Kettle transformation?
Make use of "Run SSH commands" utility to pass commands to your remote server.
Assuming you do a rm -f /path/file it won't error for a non-existent file.
You can capture the output and perform an error handling as well (Filter rows and trigger the course of action).
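A minimal sketch of what the command sent through that step could look like (the /archive/dir path is a placeholder for your fixed directory, and ${FILE_NAME} is the variable coming from the previous steps):
FILE="/archive/dir/${FILE_NAME}"

# rm -f exits with status 0 when the file is absent, but still returns a
# non-zero status for real problems such as a permission error.
if rm -f "$FILE"; then
    echo "deleted (or already absent): $FILE"
else
    echo "failed to delete $FILE" >&2
    exit 1
fi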
Or you can mount the remote directory on the machine where Kettle runs and try to delete the file as a regular local file.
Using ssh is, I think, non-trivial. It takes a lot of experimenting to find out the error types and a way to distinguish them: it might be an error with the ssh connection or an error deleting the file.

How can I use boto or boto-rsync to do a full backup of 1000+ files to an S3-compatible cloud?

I'm trying to back up my entire collection of over 1000 work files, mainly text but also pictures, and a few large (0.5-1 GB) audio recordings, to an S3 cloud (Dreamhost DreamObjects). I have tried to use boto-rsync to perform the first full 'put' with this:
$ boto-rsync --endpoint objects.dreamhost.com /media/Storage/Work/ \
> s3:/work.personalsite.net/ > output.txt
where '/media/Storage/Work/' is on a local hard disk, 's3:/work.personalsite.net/' is a bucket named after my personal web site for uniqueness, and output.txt is where I wanted a list of the files uploaded and error messages to go.
Boto-rsync grinds its way through the whole directory tree, but the constantly refreshing per-file progress output doesn't look so good when it's printed to a file. Still, while the upload is going, I 'tail output.txt' and see that most files are uploaded, but some reach less than 100% and some are skipped altogether. My questions are:
Is there any way to confirm that a transfer is 100% complete and correct?
Is there a good way to log the results and errors of a transfer?
Is there a good way to transfer a large number of files in a big directory hierarchy to one or more buckets for the first time, as opposed to an incremental backup?
I am on Ubuntu 12.04 running Python 2.7.3. Thank you for your help.
You can encapsulate the command in a script and start it with nohup:
nohup script.sh
nohup automatically generates a nohup.out file where all the output of the script/command is captured.
To send the output to a log file of your choosing, you can do:
nohup script.sh > /path/to/log
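Putting the two together, a minimal sketch using the boto-rsync call from the question (backup.sh and /path/to/backup.log are placeholder names):
cat > backup.sh <<'EOF'
#!/bin/sh
# Wraps the boto-rsync call from the question so it can run under nohup.
boto-rsync --endpoint objects.dreamhost.com /media/Storage/Work/ \
    s3:/work.personalsite.net/
EOF
chmod +x backup.sh

# Run detached from the terminal; 2>&1 sends errors to the same log so
# partially uploaded or skipped files can be reviewed afterwards.
nohup ./backup.sh > /path/to/backup.log 2>&1 &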

Using lsof to check if a file descriptor is open

I use the following command to find out if a file descriptor is open:
/usr/sbin/lsof -a -c sqlplus -u ${USER} | grep -l "${FILE_NAME}"
If it is not, I perform some actions. The file is a log spooled from sqlplus.
Sometimes lsof reports that the file descriptor is not open, but then I find new data in the file. It happens very seldom, so I cannot reproduce it.
What can be the reason?
How does SQL*Plus spooling work?
Does it keep the file descriptor open from the SPOOL file command until the SPOOL OFF command, or does it open and close the file descriptor several times?
You probably have a "race condition". SQL*Plus opened the file, put some new data in it, and closed it between the time lsof checked the file and the time you used lsof's result to process it.
Often, the best way to avoid race conditions in a file system is to rename the file concerned before processing it. Renaming is a relatively cheap operation, and it stops other processes from opening or modifying the file under its old name while your process deals with it. You do need to make sure that, if the file is open in another process when it is renamed, you wait until it is no longer being accessed via the open file handle before your process deals with it.
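For example, a minimal sketch of that rename-then-process approach (spool.log and process_log are placeholder names; the lsof call mirrors the one in the question):
LOG=spool.log

# Move the file aside; a spool started after this point creates a fresh
# spool.log instead of appending to the file we are about to read.
mv "$LOG" "$LOG.$$" || exit 1

# Renaming does not close an already-open descriptor, so if sqlplus is
# still writing, wait until lsof no longer reports the file before reading.
while /usr/sbin/lsof -a -c sqlplus -u "${USER}" | grep -q "$LOG"; do
    sleep 1
done

process_log "$LOG.$$"    # placeholder for whatever processing you do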
Most programmers write code that is littered with race conditions. These cause all sorts of unreproducible bugs. You'll be a much better programmer if you keep in mind that almost all programs have multiple processes sharing resources and that sharing must always be managed.

ssh scripting and copying files

I am writing a Bash deployment script on RH 5. The script runs great and sends out an email at the end of the run. However, at the end of the script, if I detect any failure, I need to copy log files back to the local server to attach to the email.
The script can detect failures fine; the question is how to copy the log files back. I don't want to just cat the log files, as they can be huge.
Any suggestions?
If I understand your problem correctly, you should use scp:
http://linux.die.net/man/1/scp
and here you can find out how to automate the login so you can use it in a script:
http://linuxproblem.org/art_9.html
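For instance, a minimal sketch of that approach once key-based login is set up (the host, remote script, and log path are placeholders):
REMOTE=user@remotehost
REMOTE_LOG=/var/log/deploy.log      # placeholder path

if ! ssh "$REMOTE" /path/to/deploy.sh; then
    # The remote step reported a failure: pull the log back so the email
    # step can attach it instead of cat-ing it across the wire.
    scp "$REMOTE:$REMOTE_LOG" "./deploy-$(date +%Y%m%d%H%M%S).log"
fi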
I can't see any easy way of avoiding a second login with scp/sftp. If you're sure that it's only the log file that will be returned, you could do something like the following:
ssh -e none REMOTE SCRIPT | gzip -dc > LOGFILE
Inside SCRIPT you have something like gzip -c LOGFILE when it fails.

Hadoop put command doing nothing!

I am running Cloudera's distribution of Hadoop and everything is working perfectly. HDFS contains a large number of .seq files. I need to merge the contents of all the .seq files into one large .seq file. However, the getmerge command did nothing for me. I then used cat and piped the data of some .seq files into a local file. When I want to "put" this file into HDFS, it does nothing. No error message shows up, and no file is created.
I am able to "touchz" files in HDFS, and user permissions are not a problem here. The put command simply does not work. What am I doing wrong?
Write a job that merges all the sequence files into a single one. It's just the standard mapper and reducer with only one reduce task.
If the "hadoop" command fails silently, you should have a look at it.
Just type 'which hadoop'; this will give you the location of the "hadoop" executable. It is a shell script, so just edit it and add logging to see what's going on.
If the hadoop bash script fails at the beginning, it is no surprise that the hadoop dfs -put command does not work.
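A minimal debugging sketch along those lines (merged.seq and the /user/me path are placeholders):
HADOOP_BIN=$(which hadoop)
echo "hadoop wrapper script: $HADOOP_BIN"

# Re-run the wrapper with shell tracing so every line it executes is printed;
# if it exits before ever launching the Java put command, the trace shows where.
bash -x "$HADOOP_BIN" dfs -put merged.seq /user/me/merged.seq
status=$?

if [ $status -eq 0 ]; then
    # Confirm the file actually landed in HDFS.
    hadoop dfs -ls /user/me/merged.seq
else
    echo "put failed with exit status $status" >&2
fi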