robocopy, jungledisk file copy problems - amazon-s3

I'm a huge fan of robocopy and use it extensively to copy between various servers I need to update.
Lately I've been archiving to an Amazon S3 account that I access via a mapped drive using JungleDisk. I then robocopy my files from local PC to S3.
Sometimes I get a very strange 'Incorrect function' error message in robocopy and the file fails to copy. I've tried xcopy and a straightforward copy and paste between Explorer windows; in each case I get some variation of 'Incorrect function' or 'Illegal MS-DOS function' and the file never copies.
Deleting the target file first makes no difference.
Any ideas?

Don't know if you're allowed to answer your own questions but I think I've fixed it...
I found this in the JungleDisk support forums:
The quick solution is to zip the files, delete the originals, then unzip the files, because zip can't handle extended attributes. Another solution is to move them to a FAT filesystem and then back to an NTFS filesystem, because FAT doesn't support extended attributes.
In both cases the result is that the extended attributes are deleted, and the files can then be moved to the JungleDisk drive.
The files can have extended attributes for different reasons, especially after migrations from other filesystems: in my case it was the migration of a CVS repository from an ext2 filesystem to NTFS.
Seems to have worked for me...

I've had similar issues from both OS X and Linux. At first I wasn't concerned, but then it occurred to me that these issues could result in data corruption or backup failure, so I have abandoned JungleDisk for everything except my lightweight work.
Zipping/tarring the files was not an option for me because of the size of my data set; with that approach you have to upload the entire data set every single time.

I'm not sure which attributes you're referring to, but could you run robocopy with the /COPY:DT switch (copy data and timestamps only) to strip off the attributes?
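For example (the source folder and the drive letter below are just placeholders for your local folder and the JungleDisk mapped drive):
robocopy C:\Source X:\Archive /E /COPY:DT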

Related

Use RStudio to connect to, and run queries on, a locally stored, compressed SQL database

I'm trying to connect to and run queries on two large, locally-stored SQL databases with file extensions like so:
filename.sql.zstd.part
filename2.sql.zstd
My preference is to use the RMySQL package; however, I am finding it hard to find documentation on a) how to access locally stored SQL files, and b) how to deal with the .zstd extension.
This may be very basic but help is appreciated!
Seems like you have problems understanding the file extensions.
filename.sql.zstd.part
.part usually means you are downloading a file from the internet but the download isn't complete yet (i.e. downloads that are in progress or have been interrupted).
So to get from filename.sql.zstd.part to filename.sql.zstd, you need to complete the download.
.zstd means it is a compressed file (to save disk space). You need a decompression program to get from filename.sql.zstd to filename.sql.
The compression algorithm used is called Zstandard, so you need a decompressor specifically for this format. Look at https://facebook.github.io/zstd/ for such a program.
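For example, with the zstd command-line tool installed, the decompression should look roughly like this (using the file name from the question; -o just names the output file explicitly):
zstd -d filename.sql.zstd -o filename.sql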
There was also an R package for this (zstdr), but it has been archived; you could still install an older version from the CRAN archive (https://cran.r-project.org/web/packages/zstdr/index.html).
Finally, filename.sql is not actually a database. An .sql file usually contains SQL statements for creating/modifying database structures and loading data. You'd have to install a database server, e.g. MariaDB, and then import this .sql file to actually have the data in a database on your computer. Then you would access that database from R.
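A rough sketch of that last step with the MariaDB/MySQL command-line client (the database name mydb is made up, and some dumps already contain their own CREATE DATABASE statement):
mysql -u root -p -e "CREATE DATABASE mydb"
mysql -u root -p mydb < filename.sql
After that, RMySQL can connect to the local server in the usual way.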

Can't copy to network path using Filesystem.filecopy or FileSystem.CopyFile

I'm trying to copy a file from a local computer to a remote share but I get an exception saying "Can't find the specified file".
My.Computer.FileSystem.CopyFile("C:\filename.jpg", "\\focserver2\consultoria\teste\filename.jpg")
The remote shared folder has full control permissions on "Everyone".
What am I doing wrong? Or is it not possible to copy to network paths using FileSystem.CopyFile?
Thanks.
João
The issue does not appear to be the destination but the source. I assume that the sample you showed above is NOT the real code, and that in the real code you combine a couple of values together to define the source. To help prevent issues, get in the habit of using the Path.Combine() method when concatenating strings for a file path. It's a life-saver.
The next most important thing is to learn how to debug your code by setting break-points and seeing what the values of your combined strings are before posting to websites. This is a good one to get you started. http://weblogs.asp.net/scottgu/debugging-tips-with-visual-studio-2010
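For illustration, building the source path with Path.Combine would look something like this (the folder and file names here are invented):
Dim sourcePath As String = System.IO.Path.Combine("C:\Temp", "filename.jpg")
My.Computer.FileSystem.CopyFile(sourcePath, "\\focserver2\consultoria\teste\filename.jpg")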
Most likely the problem is that you are trying to access the root folder of the C:\ drive. This folder is generally locked down by the operating system, and even simple file operations don't work easily there.
Try copying the file from a subfolder e.g.
My.Computer.FileSystem.CopyFile("C:\Temp\filename.jpg", "\\focserver2\consultoria\teste\filename.jpg")

FTP client sees a file that isn't there... How can I successfully delete/overwrite this "ghost" file?

So we have a client that creates "training packages" and then uploads them via ftp to their website. They create the training packages in PowerPoint, and then use some program to convert them into html/swf files and package them within a folder. When they upload, they use Filezilla, and just transfer the entire folder over. The folder is uniquely named, uses no spaces or special characters.
These files have uploaded fine for about a year. Recently, they've run into a problem. Whenever they try to upload a training package folder, they are immediately presented with the "This file already exists, do you want to overwrite?" message. Except... the folder they're moving is brand new, and the file it's asking to overwrite DOESN'T EXIST. When they choose "Overwrite", the file looks like it transfers, but the file size is wrong and the training package doesn't work correctly.
This happens with every training package they try to upload. It's not just a badly outputted package. Also, it's always the same file that has the problem--it's the main "player" for the training package, and though it contains different content for every package, it is the same file name (cplayer.swf) every time.
Things they've tried without success:
-Re-uploading the file again by itself, and overwriting
-Deleting the "bad" file and re-uploading the single file - Get the overwrite message again, even though the file DOES NOT EXIST.
-Renaming the file on the server and re-uploading the single file - Get the overwrite message.
-Renaming the single file locally within the package and uploading/renaming it - Won't let us rename because the file already exists.
-Used another FTP client - Same results as above, so not a client specific problem.
-Used a different FTP login - Same results as above, so not a permissions problem.
Other things of note:
-The file is small--it's not a time out problem. Plus, all other files upload fine, and some are a lot larger.
-They've emailed this file to me, and I've uploaded it successfully.
I am completely at my wits end. Does anyone have any ideas where I can at least troubleshoot a little further?
Thanks for the non-help, the downvote, and the general lack of response on what was a pretty serious issue for me.
In case anyone else has a similar problem, here's what was going on:
Antivirus software (specifically Malwarebytes) was blocking THIS ONE SINGLE FILE. All I had to do was exclude the folder that contained the file.

How can I determine if files in a "drop folder" are completely transferred

Remote clients will upload images (and perhaps some instructional files in specially formatted text) to a "drop folder." Once the upload is complete we need to begin processing these images. It would be an easy, but flawed, solution to just have a script automatically begin processing any files in the folder every few seconds (the files can be moved out of the folder once processed); but problems would arise when attempting to process large images that are only partially transferred.
What are some tricks I can use to ensure the files are fully uploaded before processing them?
A few of my own thoughts:
The script can check the validity of the file; i.e., a partial JPEG would result in an error, and you could respond to that error in the script, though this would be fairly CPU intensive. Some files have special markers at the end, but I can't count on this since I'm not sure what formats I'll be dealing with.
I've heard of "file handles" but haven't really figured out the basics of what they are and how I can tell if there is a "file handle" on a particular file. Basically the FTP daemon (actually, I'm on Windows, so "service") would keep a "handle" on the file while it's being uploaded and you would know not to process that file. These are just a few of my thoughts but I'm not really sure if they will work or if there are better or more accepted ways of solving this problem.
If you have a server-side upload script (PHP, ASP, JSP, whatever), you could have that script call another script to process the files, or have it create a flag file indicating the upload is done.
If your server is Linux-based, you can use lsof to check whether the file is open. As your FTP daemon/script/CGI will close the file once the upload completes, lsof will no longer show the file in the list.
If your server is Windows-based, you can use Process Explorer to list the open files.
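On Windows you can also do a similar check programmatically: try to open each file for exclusive access, and if the FTP service still holds a handle on it the open fails. A minimal sketch in VB.NET (the module and function names are made up for illustration):
Imports System.IO

Module DropFolderCheck
    ' Returns True if nothing else has the file open, i.e. the upload has
    ' (very likely) finished and the file is safe to process.
    Function IsUploadComplete(path As String) As Boolean
        Try
            Using fs As FileStream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.None)
                Return True
            End Using
        Catch ex As IOException
            Return False ' still locked by the uploader; skip it on this pass
        End Try
    End Function
End Module
Your processing script would call this for each file in the drop folder and only pick up the ones where it returns True.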
By what method are your users uploading the images?

Watch folder for files being Read

I am trying to watch files in a directory to determine when files are opened/accessed. I thought FileSystemWatcher would do the trick using the event Changed.
Problem is that some applications do not keep a lock on the file they open/access, and do not change either the date modified or the date accessed (even after fsutil behavior set disablelastaccess 0). Notepad, for example: apparently it makes a copy of the file in memory and works on that until you save it, and it doesn't update the date accessed either.
How can I monitor a directory of files and be notified when a file is simply opened/accessed by any program (e.g. Notepad)? Files may be opened from another computer, not necessarily on the computer running the "watcher".
I found lots of similar questions but did not see one focusing on file "access".
This is quite normal. Updating an existing file in place is quite dangerous since it can cause irretrievable data loss; a disk error (like disk full) while writing is very bad news. The common algorithm used is:
rename the original file
write a new file using the original name
no error: delete the renamed file
error: delete the new file, rename the original file back
Clearly this doesn't cause a Changed event to be raised; no existing file was actually changed.
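A rough sketch of that pattern in VB.NET (the names and the error handling are illustrative, not how any particular editor implements it):
Imports System.IO

Module SafeSaveSketch
    Sub SafeSave(path As String, contents As String)
        Dim backup As String = path & ".bak"
        File.Move(path, backup)               ' 1. rename the original file
        Try
            File.WriteAllText(path, contents) ' 2. write a new file using the original name
            File.Delete(backup)               ' 3. no error: delete the renamed file
        Catch
            If File.Exists(path) Then File.Delete(path) ' error: delete the new file...
            File.Move(backup, path)                     ' ...and rename the original back
            Throw
        End Try
    End Sub
End Module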
Sorry, I didn't read the question well enough. There is no notification whatsoever for an app just opening a file for reading. FSW can only detect changes to the file system. There is no ready alternative either; this requires a custom file system filter driver that snoops on driver requests, like the kind that Sysinternals' ProcMon utility uses. I'm not aware of such a driver ready for use in a C# program, and you can't write one in C# either. This just isn't a common requirement.