tar: how to exclude executables from backup? - backup

I wrote a backup utility that reads a file listing the paths to be excluded from the backup.
How can I exclude all executable files on a UNIX-like system, where executables typically have no extension?
tar --create --exclude-from=exclude.txt -f backup.tar
I would like a standard way of doing it (Linux, Mac OS X, BSD, ...).

find dir -executable -type f > execlist.txt
tar --create --exclude-from=exclude.txt --exclude-from=execlist.txt -f backup.tar
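Note that -executable is a GNU find extension, so it may not be available on BSD or macOS. A more portable sketch, assuming that testing the owner-execute bit is good enough for your purpose, is:
find dir -type f -perm -u+x > execlist.txt
tar --create --exclude-from=exclude.txt --exclude-from=execlist.txt -f backup.tar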

Related

Extract huge tar.gz archives from S3 without copying archives to a local system

I'm looking for a way to extract a huge dataset (18 TB+, found here: https://github.com/cvdfoundation/open-images-dataset#download-images-with-bounding-boxes-annotations). With this in mind, I need the process to be fast (i.e. I don't want to spend twice the time by first copying and then extracting the files). I also don't want the archives to take up extra space, not even one 20 GB+ archive.
Any thoughts on how one can achieve that?
If you can arrange to pipe the data straight into tar, it can uncompress and extract it without needing a temporary file.
Here is an example. First, create a tar file to play with:
$ echo abc >one
$ echo def >two
$ tar cvf test.tar one two
one
two
$ gzip test.tar
Remove the test files
$ rm one two
$ ls one two
ls: cannot access one: No such file or directory
ls: cannot access two: No such file or directory
Now extract the contents by piping the compressed tar file into the tar command.
$ cat test.tar.gz | tar xzvf -
one
two
$ ls one two
one two
The only part missing now is how to download the data and pipe it into tar. Assuming you can access the URL with wget, you can get it to send the data to stdout. So you end up with this
wget -qO- https://yourdata | tar xzvf -
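Since the archives in question live in S3, the same trick works with the AWS CLI, which can stream an object to stdout when - is given as the destination (the bucket and key below are placeholders):
aws s3 cp s3://your-bucket/path/archive.tar.gz - | tar xzvf -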

How to gzip a folder under a symlink

I'm trying to gzip all subdirectories and files of a folder. The peculiarity is that the folder I compress is a symbolic link to the last release of my site:
filename=$(date '+%Y%m%d')
cd /home/site
tar -zcvf $filename.tar.gz current/
scp $filename.tar.gz server:~/backups/production
rm $filename.tar.gz
When the operation ends and I open the archive, I see only the symlink of the folder, not its contents. What's wrong?
This is expected behavior. You need to specify the -h flag when creating the archive if you want to dereference symlinks. From the tar manual:
Normally, when tar archives a symbolic link, it writes a block to the
archive naming the target of the link. In that way, the tar archive is
a faithful record of the file system contents. When --dereference
(-h) is used with --create (-c), tar archives the files symbolic
links point to, instead of the links themselves.
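Applied to the script in the question, that would look something like this (same paths as before, only -h added):
tar -zcvhf $filename.tar.gz current/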

Deleting folder on my OS X 10.10.5 by executing a file

I have a list of files and folders I want to delete on my Mac. How do I automate this without entering each one into the terminal with
sudo rm -r folderName
Here is the list of folders:
/Library/Application Support/VMware
/Library/Application Support/VMware Fusion
/Library/Preferences/VMware Fusion
~/Library/Application Support/VMware Fusion
~/Library/Caches/com.vmware.fusion
~/Library/Preferences/VMware Fusion
~/Library/Preferences/com.vmware.fusion.LSSharedFileList.plist
~/Library/Preferences/com.vmware.fusion.LSSharedFileList.plist.lockfile
~/Library/Preferences/com.vmware.fusion.plist
~/Library/Preferences/com.vmware.fusion.plist.lockfile
~/Library/Preferences/com.vmware.fusionDaemon.plist
~/Library/Preferences/com.vmware.fusionDaemon.plist.lockfile
~/Library/Preferences/com.vmware.fusionStartMenu.plist
~/Library/Preferences/com.vmware.fusionStartMenu.plist.lockfile
You could use this in your terminal to remove the files in your list and recursively clear out the directories in it:
while IFS= read -r p; do rm -r "$p"; done <list.txt
list.txt is a list of files/directories with each entry on its own line.
This will loop over that list and call rm -r on each file/directory name; quoting "$p" matters because several of the paths contain spaces.
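One caveat: tilde expansion does not happen on the contents of a variable, so the ~/Library entries would be treated as a literal ~ directory. A minimal sketch that expands a leading ~/ by hand and adds sudo (needed for the /Library entries) might look like this:
while IFS= read -r p; do
  case "$p" in
    "~/"*) p="$HOME/${p#"~/"}" ;;   # expand a leading ~/ manually
  esac
  sudo rm -r "$p"                   # quotes matter: the paths contain spaces
done <list.txt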

Is it possible to make SCP ignore symbolic links during copy?

I need to reinstall one of our servers, and as a precaution I want to move /home, /etc, /opt, and /Services to a backup server.
However, I have a problem: because of the many symbolic links, a lot of files get copied multiple times.
Is it possible to make scp ignore the symbolic links (or better, copy a link as a link rather than as a directory or file)? If not, is there another way to do it?
I knew it was possible, I had just picked the wrong tool. I did it with rsync:
rsync --progress -avhe ssh /usr/local/ XXX.XXX.XXX.XXX:/BackUp/usr/local/
I found that the rsync method did not work for me; however, I found an alternative that did, on this website (www.docstore.mik.ua/orelly).
Specifically, section 7.5.3 of O'Reilly's SSH, The Secure Shell: The Definitive Guide.
7.5.3. Recursive Copy of Directories
...
Although scp can copy directories, it isn't necessarily the best method. If your directory contains hard links or soft links, they won't be duplicated. Links are copied as plain files (the link targets), and worse, circular directory links cause scp1 to loop indefinitely. (scp2 detects symbolic links and copies their targets instead.) Other types of special files, such as named pipes, also aren't copied correctly. A better solution is to use tar, which handles special files correctly, and send it to the remote machine to be untarred, via SSH:
$ tar cf - /usr/local/bin | ssh server.example.com tar xf -
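Note that tar strips the leading / from member names when creating the archive, so on the receiving end the files are extracted relative to the remote working directory. If you want them somewhere specific, add a -C on the remote side (the /backup path below is only illustrative):
$ tar cf - /usr/local/bin | ssh server.example.com 'tar xf - -C /backup'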
Using tar over ssh as both sender and receiver does the trick as well:
cd $DEST_DIR
ssh user@remote-host "cd $REMOTE_SRC_DIR; tar cf - ./" | tar xvf -
One solution is to use a shell pipe. I have a situation where I have some *.gz files and symbolic links generated by some software that point to the same *.gz files with a slightly shorter name. If I simply use scp, the symbolic links are copied as regular files, resulting in duplicates. I know rsync can ignore symbolic links, but my gz files are not compressed with rsyncable options, and rsync is very slow at copying these gz files. So I simply use the following script to copy the files over:
find . -type f -exec scp {} target_host:/directory/name/data \;
The -type f test matches only regular files and ignores symbolic links. You need to run this command on the source host. Hope this may help someone in my situation. Let me know if I missed anything.
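If one scp connection per file turns out to be too slow, a variation on the same idea (assuming GNU tar on both ends; the target path is the same placeholder as above) is to let find feed tar and send a single stream over ssh:
find . -type f -print0 | tar cf - --null -T - | ssh target_host 'tar xf - -C /directory/name/data'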
A one-liner that can be executed on the client to copy a folder from the server using tar + ssh:
ssh user@<server> 'mkdir -p <remote source directory>; cd <remote source directory>; tar cf - ./' | tar xf - -C <local destination directory>
Note: the mkdir is a must; if the remote source directory is not present, the cd fails and the command will simply archive the entire home directory of the remote server and extract it on the client.

Bash copying specific files

How can I get tar/cp to copy only files that don't end in .jar, and only from the root and /plugins directories?
So, I'm making a Minecraft server backup script. One of the options I wish to have is a backup of configuration files only. Here's the scenario:
There are many folders with massive amounts of data in them.
Configuration files mainly use the following extensions, but some may use a different one:
.yml
.json
.properties
.loc
.dat
.ini
.txt
Configuration files mainly appear in the /plugins folder
There are a few configuration files in the root directory, but none in any others except /plugins
The only other files in these two directories are .jar files - to an extent. These do not need to be backed up. That's the job of the currently-working plugins flag.
The code uses a mix of tar and cp depending on which flags the user started the process with.
The process is started with a command, then paths are added via a concatenated variable, such as $paths = plugins world_nether mysql/hawk where arguments can be added one at a time.
How can I selectively back up these configuration files with tar and cp? Due to the nature of the configuration process, we needn't use the same flags for both commands - they can be separate arguments for either command.
Here are the two snippets of code in concern:
Configure paths:
# My first, unsuccessful attempt.
if $BKP_CFG; then
# Tell user they are backing up config
echo " +CONFIG $confType - NOT CURRENTLY WORKING"
# Main directory, and everything in plugin directory only
# Jars are not allowed to be backed up
#paths="$paths --no-recursion * --recursion plugins$suffix --exclude *.jar"
fi
---More Pro Stuff----
# Set commands
if $ARCHIVE; then
command="tar -cpv"
if $COMPRESSION; then
command=$command"z"
fi
# Paths starts with a space </protip>
command=$command"C $SERVER_PATH -f $BACKUP_PATH/$bkpName$paths"
prep=""
else
prep="mkdir $BACKUP_PATH/$bkpName"
# Make each path an absolute path. Currently, they are all relative
for path in $paths; do
path=$SERVER_PATH/$path
done
command="cp -av$paths $BACKUP_PATH/$bkpName"
fi
I can provide more code/explanation where necessary.
find /actual/path -maxdepth 1 -type f ! -iname '*.jar' -exec cp {} /where/to/copy/ \;
find /actual/path/plugins -maxdepth 1 -type f ! -iname '*.jar' -exec cp {} /where/to/copy/ \;
Might help.
Final code:
if $BKP_CFG; then
# Tell user what's being backed up
echo " +CONFIG $confType"
# Main directory, and everything in plugin directory only
# Jars are not allowed to be backed up
# Find matches within the directory cd'd to earlier, strip leading ./
paths="$paths $(find . -maxdepth 1 -type f ! -iname '*.jar' | sed -e 's/\.\///')"
paths="$paths $(find ./plugins -type f ! -iname '*.jar' | sed -e 's/\.\///')"
fi