I have a .txt file and on each line is a different file location, e.g.
file1.zip
file2.zip
file3.zip
How can I open that file, loop through each line, and run rm -f on each filename?
Also, will deleting a file throw an error if it doesn't exist (has already been deleted), and if so, how can I avoid this?
EDIT: The file names may have spaces in them, so this needs to be catered for as well.
You can use a for loop with cat to iterate through the lines:
IFS=$'\n'
for file in $(cat list.txt); do
    if [ -f "$file" ]; then
        rm -f "$file"
    fi
done
The if [ -f "$file" ] check verifies that the file exists and is a regular file (not a directory); if the check fails, the file is skipped. The IFS=$'\n' at the top sets the word delimiter to newlines only, which lets the loop handle file names containing whitespace. Note that rm -f on its own already suppresses the error for files that no longer exist, so already-deleted entries will not cause complaints.
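Alternatively, a common pattern that copes with whitespace without changing IFS for the rest of the script is a read loop. A minimal sketch, assuming the list is in list.txt:
while IFS= read -r file; do
    rm -f -- "$file"    # -f keeps rm quiet if the file is already gone
done < list.txt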
xargs -n1 echo < test.txt
Replace echo with rm -f or any other command. You can also pipe the file in: cat test.txt | xargs -n1 echo
See man xargs for more info.
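Note that xargs splits on any whitespace by default, so file names containing spaces would be broken into several arguments. With GNU xargs you can switch the delimiter to newlines; a sketch, assuming the list is in test.txt:
xargs -d '\n' rm -f -- < test.txt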
Suppose these are the files:
folder1/11.txt
folder1/12.txt
folder1/levela/11a1.txt
folder1/levela/11a2.txt
folder1/levela/levelb/11b1.txt
folder1/levela/levelb/11b2.txt
folder2/21.txt
folder2/22.txt
folder2/levela/21a1.txt
folder2/levela/21a2.txt
folder2/levela/levelb/21b1.txt
folder2/levela/levelb/21b2.txt
folder3/a/b/c/d/e/deepfile1.txt
folder3/a/b/c/d/e/deepfile2.txt
Is there a way (for example using ls, find, grep, or any GnuWin32 commands) to show the first file from every subfolder, please?
Desired output:
folder1/11.txt
folder1/levela/11a1.txt
folder1/levela/levelb/11b1.txt
folder2/21.txt
folder2/levela/21a1.txt
folder2/levela/levelb/21b1.txt
folder3/a/b/c/d/e/deepfile1.txt
Thank you.
Suggesting this solution:
find -type f -printf "%p %h\n"|sort --key 2.1,1.1|uniq --skip-fields=1|awk '{print $1}'
Explanation:
find -type f -printf "%p %h\n"
This find command searches for all regular files under the current directory and prints, for each file, its relative path, a space, and its parent directory's relative path.
Suggesting to run this command on your directory.
sort --key 2.1,1.1
Sorts the file list lexicographically, by the 2nd field (the directory) and then by the 1st field (the path).
The result is that all files are grouped and sorted within their specific directory.
Suggesting to try this:
find -type f -printf "%p %h\n"|sort --key 2.1,1.1
uniq --skip-fields=1
From the sorted file list, removes the lines having a duplicate directory (field #2), keeping only the first line for each directory.
awk '{print $1}'
Prints only the first field, the file's relative path.
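For reference, the last two stages can also be expressed as a single awk filter that keeps only the first line seen per directory; a sketch under the same assumption as the pipeline above (no whitespace in the file names):
find -type f -printf "%p %h\n" | sort | awk '!seen[$2]++ {print $1}'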
A bash script:
script.sh
#!/bin/bash
declare -A filesArr # declare an associative array mapping each directory to its first file
for currFile in $(find "$1" -type f); do # main loop: scan all files under $1
    currDir=$(dirname "$currFile") # get the current file's directory
    if [[ -z ${filesArr["$currDir"]} ]]; then # if current directory is not stored in filesArr yet
        filesArr[$currDir]="$currFile" # store the directory with the current file
    fi
    if [[ ${filesArr["$currDir"]} > "$currFile" ]]; then # if current file sorts before the stored file
        filesArr[$currDir]="$currFile" # replace the stored file with the current file
    fi
done
for currFile in "${filesArr[@]}"; do # loop over the array to output the first file of each directory
    echo "$currFile"
done
Running script.sh on the /tmp folder:
chmod a+x script.sh
./script.sh /tmp
BTW: the sort and uniq pipeline above is much faster.
I would like to process multiple .gz files with gawk.
I was thinking of decompressing and passing it to gawk on the fly
but I have an additional requirement to also store/print the original file name in the output.
The thing is, there are hundreds of rather large .gz files to process.
I'm looking for anomalies (~0.001% of rows) and want to print out the list of found inconsistencies ALONG with the file name and row number that contained them.
If I could have all the files decompressed, I would simply use the FILENAME variable to get this.
Because of the large quantity and size of those files, I can't decompress them upfront.
Any ideas how to pass the filename (in addition to the gzip stdout) to gawk to produce the required output?
Assuming you are looping over all the files and piping their decompression directly into awk, something like the following will work.
for file in *.gz; do
gunzip -c "$file" | awk -v origname="$file" '.... {print origname " whatever"}'
done
Edit: To use a list of filenames from some source other than a direct glob, something like the following can be used.
$ ls *.awk
a.awk e.awk
$ while IFS= read -d '' filename; do
echo "$filename";
done < <(find . -name \*.awk -printf '%P\0')
e.awk
a.awk
To use xargs instead of the above loop, the body of the command will (I believe) need to live in a pre-written script file that xargs can then call with each filename.
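For example, a minimal sketch of that idea, assuming a hypothetical helper script named process_gz.sh:
#!/bin/bash
# process_gz.sh -- decompress one .gz file and tag every output line with its name (hypothetical helper)
file=$1
gunzip -c "$file" | gawk -v origname="$file" '{ print origname ": " $0 }'
It could then be driven by xargs like this:
find . -name '*.gz' -print0 | xargs -0 -n1 ./process_gz.sh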
This uses a combination of xargs and sh (to be able to pipe between two commands, gzip and awk):
find *.gz -print0 | xargs -0 -I fname sh -c 'gzip -dc fname | gawk -v origfile="fname" -f printbadrowsonly.awk >> baddata.txt'
I'm wondering if there's any bad practice with the above approach…
I've been having problems with multiple hidden infected PHP files on my server which are encrypted (ClamAV can't see them).
I would like to know how you can run an SSH command that searches for all the infected files and edits them.
Up until now I have located them by the file contents like this:
find /home/***/public_html/ -exec grep -l '$tnawdjmoxr' {} \;
Note: $tnawdjmoxr is a piece of the code
How do you locate and remove this code inside all PHP files in the directory /public_html/?
You can add xargs and sed:
find /home/***/public_html/ -exec grep -l '$tnawdjmoxr' {} \; | xargs -d '\n' -n 100 sed -i 's|\$tnawdjmoxr||g' --
You may also run sed directly instead of grep first, but that would alter the modification time of every file it touches (even ones that contain no match) and may also make some unexpected modifications, like perhaps to line endings.
-d '\n' makes sure that every argument is read line by line. It's helpful if filenames have spaces in them.
-n 100 limits the number of files that sed would process in one invocation.
-- makes sed recognize filenames starting with a dash. It would also be advisable for grep to have it: grep -l -e '$tnawdjmoxr' -- {} \;
File searching may be faster with grep -F.
sed -i enables in-place editing.
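Putting the grep-side suggestions together (the pattern here is a fixed string, so -F applies), the search half could look like this:
find /home/***/public_html/ -exec grep -F -l -e '$tnawdjmoxr' -- {} \;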
Besides using xargs it would also be possible to use Bash:
find /home/***/public_html/ -exec grep -l '$tnawdjmoxr' {} \; | while IFS= read -r FILE; do sed -i 's|\$tnawdjmoxr||g' -- "$FILE"; done
while IFS= read -r FILE; do sed -i 's|\$tnawdjmoxr||g' -- "$FILE"; done < <(exec find /home/***/public_html/ -exec grep -l '$tnawdjmoxr' {} \;)
readarray -t FILES < <(exec find /home/***/public_html/ -exec grep -l '$tnawdjmoxr' {} \;)
sed -i 's|\$tnawdjmoxr||g' -- "${FILES[@]}"
I am new to awk and shell-based programming. I have a bunch of files named file_0001.dat, file_0002.dat, ..., file_1000.dat. I want to change the file names so that the number after file_ becomes 4 times its current value. So I want to change
file_0001.dat to file_0004.dat
file_0002.dat to file_0008.dat
and so on.
Can anyone suggest a simple script to do it? I have tried the following, but without any success.
#!/bin/bash
a=$(echo $1 sed -e 's:file_::g' -e 's:.dat::g')
b=$(echo "${a}*4" | bc)
shuf file_${a}.dat > file_${b}.dat
This script will do the trick for you:
#!/bin/bash
for i in `ls -r *.dat`; do
a=`echo $i | sed 's/file_//g' | sed 's/\.dat//g'`
almost_b=`bc -l <<< "$a*4"`
b=`printf "%04d" $almost_b`
rename "s/$a/$b/g" $i
done
Files before:
file_0001.dat file_0002.dat
Files after first execution:
file_0004.dat file_0008.dat
Files after second execution:
file_0016.dat file_0032.dat
Here's a pure bash way of doing it (without bc, rename or sed).
#!/bin/bash
for i in $(ls -r *.dat); do
prefix="${i%%_*}_"
oldnum="${i//[^0-9]/}"
newnum="$(printf "%04d" $(( 10#$oldnum * 4 )))"
mv "$i" "${prefix}${newnum}.dat"
done
To test it you can do
mkdir tmp && cd $_
touch file_{0001..1000}.dat
(paste code into convert.sh)
chmod +x convert.sh
./convert.sh
Using bash/sed/find:
files=$(find -name 'file_*.dat' | sort -r)
for file in $files; do
n=$(sed 's/[^_]*_0*\([^.]*\).*/\1/' <<< "$file")
let n*=4
nfile=$(printf "file_%04d.dat" "$n")
mv "$file" "$nfile"
done
ls -r1 | awk -F '[_.]' '{printf "%s %s_%04d.%s\n", $0, $1, 4*$2, $3}' | xargs -n2 mv
ls -r1 lists the files in reverse order, to avoid renaming onto a file that has not been processed yet
the awk part generates the old and new file name on one line. For example: file_0002.dat becomes the line file_0002.dat file_0008.dat
xargs -n2 passes two arguments at a time to mv
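To preview the renames without performing them, you could insert echo before mv:
ls -r1 | awk -F '[_.]' '{printf "%s %s_%04d.%s\n", $0, $1, 4*$2, $3}' | xargs -n2 echo mv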
This might work for you:
paste <(seq -f'mv file_%04g.dat' 1000) <(seq -f'file_%04g.dat' 4 4 4000) |
sort -r |
sh
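As with the other pipeline answers, you can drop the final sh stage to inspect the generated mv commands before actually running them:
paste <(seq -f'mv file_%04g.dat' 1000) <(seq -f'file_%04g.dat' 4 4 4000) | sort -r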
This can help:
#!/bin/bash
for i in `cat /path/to/requestedfiles |grep -o '[0-9]*'`; do
count=`bc -l <<< "$i*4"`
echo $count
done
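As written, this only prints the new numbers. A hedged sketch of how the loop could be extended to actually rename the files, assuming the list contains the original file names with zero-padded four-digit numbers (processed in reverse order so a rename never overwrites a file that has not been handled yet):
#!/bin/bash
for i in $(grep -o '[0-9]*' /path/to/requestedfiles | sort -r); do
  count=$(printf "%04d" $(( 10#$i * 4 )))   # 10# forces base 10 so e.g. 0008 is not read as octal
  mv -- "file_${i}.dat" "file_${count}.dat"
done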
I am writing a simple script which displays the regular files in a directory.
#!/bin/bash
for FILE in "$@"
do
if [ -f "$FILE" ]
then
ls -l "$FILE"
fi
done
Even though my directory has 2 files, this script is not showing anything.
Can someone please tell me what is wrong in my script?
Why don't you go for a simple command like:
ls -p|grep -v /
Coming to your issue:
#!/bin/bash
for FILE in "$@"
do
if [ -f "$FILE" ]
then
ls -l "$FILE"
fi
done
Try this:
for FILE in $@/*
instead of
for FILE in "$@"
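For completeness, a minimal corrected sketch, assuming the directory to list is passed as the first argument:
#!/bin/bash
# list the regular files directly inside the directory given as $1
for FILE in "$1"/*
do
    if [ -f "$FILE" ]
    then
        ls -l "$FILE"
    fi
done
Run it as, e.g., ./script.sh /some/directory.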