How to check if a Unix .tar.gz file is a valid file without uncompressing? - gzip

I found the question How to determine if data is valid tar file without a file?, but I was wondering: is there a ready-made command-line solution?

What about just getting a listing of the tarball and throwing away the output, rather than decompressing the file?
tar -tzf my_tar.tar.gz >/dev/null
Edited as per comment. Thanks zrajm!
Edit as per comment. Thanks Frozen Flame! This test in no way implies integrity of the data. Because tar was designed as a tape archival utility, most implementations will allow multiple copies of the same file in one archive!
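A minimal sketch wrapping the listing test in a check (the archive name is an assumption):

```shell
# The exit status of `tar -tzf` is non-zero if the archive cannot be read.
if tar -tzf my_tar.tar.gz > /dev/null 2>&1; then
    echo "archive looks OK"
else
    echo "archive is damaged"
fi
```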

You could probably use gzip's -t option to test the file's integrity:
http://linux.about.com/od/commands/l/blcmdl1_gzip.htm
from: http://unix.ittoolbox.com/groups/technical-functional/shellscript-l/how-to-test-file-integrity-of-targz-1138880
To test that the gzip file is not corrupt:
gunzip -t file.tar.gz
To test that the tar file inside is not corrupt:
gunzip -c file.tar.gz | tar -t > /dev/null
As part of the backup you could just run the latter command and
check the value of $? afterwards for a 0 (success) value. Note that in a
pipeline $? reflects only the last command (tar here); to catch a gunzip
failure as well, use bash's PIPESTATUS array or set -o pipefail.
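A sketch using bash's PIPESTATUS to check both stages of the pipeline (the archive name is an assumption):

```shell
#!/bin/bash
gunzip -c file.tar.gz | tar -t > /dev/null
status=("${PIPESTATUS[@]}")          # exit codes of gunzip and tar, in order
if [ "${status[0]}" -eq 0 ] && [ "${status[1]}" -eq 0 ]; then
    echo "both the gzip and tar layers are OK"
else
    echo "gzip exit: ${status[0]}, tar exit: ${status[1]}"
fi
```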

If you want to do a real test extract of a tar file without extracting to disk, use the -O option. This spews the extract to standard output instead of the filesystem. If the tar file is corrupt, the process will abort with an error.
Example of a failed tarball test...
$ echo "this will not pass the test" > hello.tgz
$ tar -xvzf hello.tgz -O > /dev/null
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error exit delayed from previous errors
$ rm hello.*
Working Example...
$ ls hello*
ls: hello*: No such file or directory
$ echo "hello1" > hello1.txt
$ echo "hello2" > hello2.txt
$ tar -cvzf hello.tgz hello[12].txt
hello1.txt
hello2.txt
$ rm hello[12].txt
$ ls hello*
hello.tgz
$ tar -xvzf hello.tgz -O
hello1.txt
hello1
hello2.txt
hello2
$ ls hello*
hello.tgz
$ tar -xvzf hello.tgz
hello1.txt
hello2.txt
$ ls hello*
hello1.txt hello2.txt hello.tgz
$ rm hello*

You can also check the contents of a *.tar.gz file using pigz (parallel gzip) to speed up the archive check:
pigz -cvdp number_of_threads /[...]path[...]/archive_name.tar.gz | tar -tv > /dev/null

A nice option is to use tar -tvvf <filePath>, which adds a line reporting the archive format and compression.
Example in a valid .tar file:
> tar -tvvf filename.tar
drwxr-xr-x 0 diegoreymendez staff 0 Jul 31 12:46 ./testfolder2/
-rw-r--r-- 0 diegoreymendez staff 82 Jul 31 12:46 ./testfolder2/._.DS_Store
-rw-r--r-- 0 diegoreymendez staff 6148 Jul 31 12:46 ./testfolder2/.DS_Store
drwxr-xr-x 0 diegoreymendez staff 0 Jul 31 12:42 ./testfolder2/testfolder/
-rw-r--r-- 0 diegoreymendez staff 82 Jul 31 12:42 ./testfolder2/testfolder/._.DS_Store
-rw-r--r-- 0 diegoreymendez staff 6148 Jul 31 12:42 ./testfolder2/testfolder/.DS_Store
-rw-r--r-- 0 diegoreymendez staff 325377 Jul 5 09:50 ./testfolder2/testfolder/Scala.pages
Archive Format: POSIX ustar format, Compression: none
Corrupted .tar file:
> tar -tvvf corrupted.tar
tar: Unrecognized archive format
Archive Format: (null), Compression: none
tar: Error exit delayed from previous errors.

I have tried the following commands and they work well.
bzip2 -t file.bz2
gunzip -t file.gz
However, these two commands are time-consuming. We may need a quicker way to check the integrity of compressed files.

These are all very sub-optimal solutions. From the GZIP spec:
ID1 (IDentification 1) and ID2 (IDentification 2)
These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139
(0x8b, \213), to identify the file as being in gzip format.
This has to be coded in whatever language you're using.
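A sketch of that check from the shell, without writing any custom code: read the first two bytes and compare them to the gzip magic values (the file name here is an assumption):

```shell
# The first two bytes of any gzip stream are 0x1f 0x8b.
printf 'hello' | gzip > sample.gz
magic=$(head -c2 sample.gz | od -An -tx1 | tr -d ' \n')
if [ "$magic" = "1f8b" ]; then
    echo "looks like gzip"
else
    echo "not a gzip file"
fi
```

Note that this only validates the header; it says nothing about the integrity of the rest of the stream.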

> use the -O option. [...] If the tar file is corrupt, the process will abort with an error.
Sometimes yes, but sometimes not. Let's see an example of a corrupted file:
echo Pete > my_name
tar -cf my_data.tar my_name
# Simulate a corruption
sed < my_data.tar 's/Pete/Fool/' > my_data_now.tar
# "my_data_now.tar" is the corrupted file
tar -xvf my_data_now.tar -O
It shows:
my_name
Fool
Even if you execute
echo $?
tar reports no error:
0
but the file was corrupted: it now contains "Fool" instead of "Pete".
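Since tar itself cannot detect this kind of corruption, one way to catch it is to record a checksum next to the archive when it is created (a sketch; the file names are assumptions):

```shell
# At backup time: store a checksum next to the archive.
echo Pete > my_name
tar -cf my_data.tar my_name
sha256sum my_data.tar > my_data.tar.sha256

# At verification time: this fails if even one byte of the archive changed.
sha256sum -c my_data.tar.sha256
```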

Related

gitlab backup: make gitlab-rake produce tar.gz files not tar

The backup files I get with gitlab-rake are tar files; how can I get tar.gz?
Here are the files:
root@gitlab:~# ll /mnt/backup-git/ -h
total 1.9G
-rw------- 1 git git 57M Nov 29 15:57 1480431448_gitlab_backup.tar
-rw------- 1 git git 57M Nov 29 15:57 1480431473_gitlab_backup.tar
-rw------- 1 git git 452M Nov 30 02:00 1480467623_gitlab_backup.tar
Here are my configuration values for the backup:
$ grep -i backup /etc/gitlab/gitlab.rb | grep -v '^#'
gitlab_rails['backup_path'] = "/mnt/backup-git/"
gitlab_rails['backup_keep_time'] = 604800
To create them, following the documentation here, (omnibus installation):
root@gitlab:~# crontab -l | grep -v '^#'
0 2 * * * /opt/gitlab/bin/gitlab-rake gitlab:backup:create CRON=1
It doesn't really make sense to compress the GitLab backup tar files: the tar file is the final tarball made during the backup process, and its contents were already compressed during the backup process. You can read more here
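If you still want .tar.gz files anyway (for example, for an off-site copy), here is a sketch of a post-backup step that compresses any plain backup tarballs; the path and naming pattern come from the question, but the wrapper itself is an assumption:

```shell
# Compress each uncompressed GitLab backup tarball in place.
for f in /mnt/backup-git/*_gitlab_backup.tar; do
    [ -e "$f" ] && gzip "$f"     # produces "$f.gz"
done
```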

Count number of files in directory then scp transfer a certain range such as 21404-42806

I found the number of files in /dev/shm/split/1/ to be 42806 using:
/bin/ls -lU /dev/shm/split/1/ | wc -l
What I can't seem to find anywhere online is how to select a certain range, say from 21404-42806, and use scp to securely copy those files. Then, for management purposes, I would like to move the files I copied to another folder, say /dev/shm/split/2/.
How do I do that using CentOS?
I tried:
sudo chmod 400 ~/emails/name.pem ; ls -1 /dev/shm/split/1/ | sed -n '21443,42806p' | xargs -i scp -i ~/emails/name.pem {} root@ipaddress:/dev/shm/split/2/
This produced:
no such file or directory
errors on all of the files...
ls lists files relative to the directory you give it. This means your ls prints the bare filenames in the directory, but later on, scp doesn't have the path to them. You can fix this in two ways:
Give the path to scp:
ls -1 /dev/shm/split/1/ | sed -n '21443,42806p' | xargs -i \
scp -i ~/emails/name.pem /dev/shm/split/1/{} root@ipaddress:/dev/shm/split/2/
Change to that directory and it will work:
cd /dev/shm/split/1/; ls -1 | sed -n '21443,42806p' | xargs -i \
scp -i ~/emails/name.pem {} root@ipaddress:/dev/shm/split/2/
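For the second part of the question (moving the copied files aside afterwards), the same range selection can drive mv instead of scp. A sketch, assuming the same directories as the question:

```shell
# Move lines 21443-42806 of the listing into the management folder.
cd /dev/shm/split/1/
ls -1 | sed -n '21443,42806p' | xargs -i mv {} /dev/shm/split/2/
```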

Login via Shell Script

My issue is that I want to run a script as root, for which I always have to log in as root manually by typing "su -" on the command line.
My query is that the script I am executing should automatically log in as root, prompting me only for the password. Help me!
::::::::::Script:::::::::::::
if [ "$(whoami)" != "root" ]; then
echo -e '\E[41m'"\033[1mYou must be root to run this script\033[0m"
su -   # at this line I want to log in as root but it is not working
exit 1
fi
sleep 1
if [ "$(pwd)" != "/root" ]; then
echo -e '\E[41m'"\033[1mCopy this script to /root & then try again\033[0m"
cd /root
exit 1
fi
sleep 1
echo -e '\E[36;45m'"\033[1mDownloading Flash Player from ftp.etilizepak.com\033[0m"
sleep 2
wget -vcxr ftp://soft:S0ft\!@ftp.abc.com/ubuntu/ubuntu\ 12.04/adobe-flashplugin=/install_flash_player_11_linux.i386.tar.gz
cd ftp.abc.com/ubuntu/ubuntu\ 12.04/adobe-flashplugin/
sleep 1
echo -e '\E[42m'"\033[1mUnzipping .tar File...\033[0m"
sleep 1
tar -xzf install_flash_player_11_linux.i386.tar.gz
echo "Unzipping Completed"
sleep 2
echo -e '\E[42m'"\033[1mCopying libflashplayer.so\033[0m"
cp libflashplayer.so /usr/lib/mozilla/plugins/
:::::::::::::::END:::::::::::::::::::::
I'm not sure if I understand your question, but I suppose you want to run something inside your script with root privileges; in that case you should use the "sudo" command.
You can also suppress the password prompt; this can be configured in the "sudoers" configuration file.
Some more info here:
https://unix.stackexchange.com/questions/35338/su-vs-sudo-s-vs-sudo-bash
Shell script calls sudo; how do I suppress the password prompt
There are tons of examples; google something like "linux sudo examples" and you will find plenty of examples of how to use su, sudo, and sudoers.
According to your comments on my previous answer, here is how I do it:
There are two files in the same directory:
-rwx------ 1 root root 19 Sep 10 13:04 test2.sh
-rwxrwxrwx 1 root root 29 Sep 10 13:06 test.sh
File test.sh:
#!/bin/bash
# put your message here
su -c ./test2.sh
File test2.sh:
#!/bin/bash
echo You run as:
whoami
# put your code here
Result:
> ./test.sh
Password:****
You run as:
root
If you want to suppress the password prompt for this script only, replace "su -c" with "sudo" and configure the sudoers file according to the instructions here: https://askubuntu.com/questions/155791/how-do-i-sudo-a-command-in-a-script-without-being-asked-for-a-password
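As a sketch of what such a sudoers entry could look like (the user name and script path are placeholders; edit only with visudo):

```
# /etc/sudoers.d/test2 (hypothetical): let "youruser" run this one
# script as root without a password prompt.
youruser ALL=(root) NOPASSWD: /root/test2.sh
```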

What is the equivalent command for objdump in IBM AIX

I am not able to find the objdump command on an IBM AIX 5.1 machine. I want to get the assembly instructions (disassemble) from a library generated on AIX. Linux has the objdump command and Solaris the dis command to do this. What is the equivalent command on IBM AIX?
You can use the dis command to disassemble object files on AIX; it should come with xlc.
It may be easier to install the GNU binutils suite just to get objdump, though. It's available from the AIX Toolbox for Linux Applications.
I have only part of an answer. Following up on @CoreyStup, I found the dis command in /opt/IBM/xlc/16.1.0/exe/dis (not the bin directory). But it was very recalcitrant, and seemed unwilling to print to stdout or stderr. I did find it was writing the output to a filename created by replacing the .o on the command line with .s. So:
% /opt/IBM/xlc/16.1.0/exe/dis aix/ktraceback.o
% ls -l aix/ktraceback.s
-rw-r--r-- 1 ota staff 10432 Nov 19 14:01 aix/ktraceback.s
% /opt/IBM/xlc/16.1.0/exe/dis -o /tmp/foo.s aix/ktraceback.o
% ls /tmp/foo.s
-rw-r--r-- 1 ota staff 10432 Nov 19 14:06 /tmp/foo.s
Using strings -a -n2, I was able to extract a possible usage message, but it was unclear what most of the options do, with the exception of -o.
dis disassembler version 1.27.0.1 Nov 9 2018 08:18:36
%s [-D] [-G] [-g] [-h] [-i] [-k] [-L] [-l] [-M] [-m <architecture>]
[-o <file name>] [-p <level>] [-r] [-R] [-S] [-T] [-t] [ filename ]
-D
disassemble .data and .bss only
-G
do not print symbolic debugging information
-g
print symbolic debugging information (default)
-H
print BO branch hints
-h
print headers
-i
line input mode
-k
do not interpret traceback table
-L
print linker section
-l
print line number table
-M
print text maps
-e
print except entries
-m
force architecture selection:
pwr|pwrx|pwr2|pwr2s|p2sc|com|403|601|602|603|603e|604|604e|620|
ppc|ppcgr|ppc64|rs64a|rs64b|rs64c|pwr3|pwr4|pwr4x|pwr5|pwr5x|
pwr6|pwr6e|pwr7|pwr8|pwr9|[ppc]970|440|440d|450|450d
-o
output to file
-p
print level
-R
print relative offsets (no added labels)
-r
print relocation table
-S
suppress printing symbolic definitions
-T
disassemble .text only
-t
print symbol table

Check the total content size of a tar gz file

How can I extract the size of the total uncompressed file data in a .tar.gz file from command line?
This works for any file size:
zcat archive.tar.gz | wc -c
For files smaller than 4Gb you could also use the -l option with gzip:
$ gzip -l compressed.tar.gz
compressed uncompressed ratio uncompressed_name
132 10240 99.1% compressed.tar
This will sum the total content size of the extracted files:
$ tar tzvf archive.tar.gz | sed 's/ \+/ /g' | cut -f3 -d' ' | sed '2,$s/^/+ /' | paste -sd' ' | bc
The output is given in bytes.
Explanation: tar tzvf lists the files in the archive in verbose format like ls -l. sed and cut isolate the file size field. The second sed puts a + in front of every size except the first and paste concatenates them, giving a sum expression that is then evaluated by bc.
Note that this doesn't include metadata, so the disk space taken up by the files when you extract them is going to be larger - potentially many times larger if you have a lot of very small files.
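With GNU tar's listing format, the same sum can be written more compactly with awk (field 3 is the size column; the archive name is an assumption):

```shell
# Sum the size column of the verbose listing.
tar -tzvf archive.tar.gz | awk '{sum += $3} END {print sum}'
```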
The command gzip -l archive.tar.gz doesn't report sizes correctly for large files: gzip stores the uncompressed size modulo 2^32, so sizes above 4 GiB come out wrong. I would recommend zcat archive.tar.gz | wc --bytes instead for really large files.
I know this is an old answer; but I wrote a tool just for this two years ago. It’s called gzsize and it gives you the uncompressed size of a gzip'ed file without actually decompressing the whole file on disk:
$ gzsize <your file>
Use the following command:
tar -xzf archive.tar.gz --to-stdout|wc -c
I searched many sites on the web, but none of them solved the problem of getting the size when the file is bigger than 4 GB.
First, which is fastest?
[oracle@base tmp]$ time zcat oracle.20180303.030001.dmp.tar.gz | wc -c
6667028480
real 0m45.761s
user 0m43.203s
sys 0m5.185s
[oracle@base tmp]$ time gzip -dc oracle.20180303.030001.dmp.tar.gz | wc -c
6667028480
real 0m45.335s
user 0m42.781s
sys 0m5.153s
[oracle@base tmp]$ time tar -tvf oracle.20180303.030001.dmp.tar.gz
-rw-r--r-- oracle/oinstall 111828 2018-03-03 03:05 oracle.20180303.030001.log
-rw-r----- oracle/oinstall 6666911744 2018-03-03 03:05 oracle.20180303.030001.dmp
real 0m46.669s
user 0m44.347s
sys 0m4.981s
All three take about the same time, but tar -tvf prints the file sizes directly in its listing.
How can we cancel execution once those listing lines have been printed?
my solution is this:
[oracle@base tmp]$ time echo $(timeout --signal=SIGINT 1s tar -tvf oracle.20180303.030001.dmp.tar.gz | awk '{print $3}') | grep -o '[[:digit:]]*' | awk '{ sum += $1 } END { print sum }'
6667023572
real 0m1.005s
user 0m0.013s
sys 0m0.066s
A tar file is uncompressed until/unless it is filtered through another program, such as gzip, bzip2, lzip, compress, or lzma. The size of the tar file is roughly the sum of the extracted files' sizes, plus some overhead: a 512-byte header per member, zero-padding of each member to a 512-byte boundary, and end-of-archive blocks.