bulk renaming files rearranging file names based on delimiter - awk

I have seen questions that are close to this but I have not seen the exact answer I need and can't seem to get my head wrapped around the regex, awk, sed, grep, rename that I would need to make it happen.
I have files in one directory sequentially named from multiple sub directories of a different directory created using find piped to xargs.
Command I used:
find `<dir1>` -name "*.png" | xargs cp -t `<dir2>`
This resulted in the second directory containing duplicate filenames sequentially named as follows:
<name>.png
<name>.png.~1~
<name>.png.~2~
...
<name>.png.~n~
What I would like to do is take all files ending in ~*~ and rename it as follows:
<name>.#.png where the '#" is the number between the "~"s at the end of the file name
Any help would be appreciated.

With Perl's rename (stand alone command):
rename -nv 's/^([^.]+)\.(.+)\.~([0-9]+)~/$1.$3.$2/' *
If everything looks fine remove option -n.

There might be an easier way to this, but here is a small shell script using grep and awk to achieve what you wanted
for i in $(ls|grep ".png."); do
name=$(echo $i|awk -F'png' '{print $1}');
n=$(echo $i|awk -F'~' '{print $2}');
mv $i $name$n.png;
done

Related

How to search/replace a single inline with sed/awk? [duplicate]

This question already has answers here:
Save modifications in place with awk
(7 answers)
Closed 1 year ago.
I have a lot of files, where I would like to edit only those lines that start with private.
It principle I want to
gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_"); print}' filename
but this only prints out the modified part of the file, and not everything.
Question
Does gawk have a way similar to sed -i inplace?
Or is there are much simpler way to do the above woth either sed or gawk?
Just move the final print outside of the filtered pattern. eg:
gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_")} {print}'
usually, that is simplified to:
gawk '/private/{gsub(/\//, "_"); gsub(/-/, "_")}1'
You really, really, really, (emphasis on "really") do not want to use something like sed -i to edit the files "in-place". (I put "in-place" in quotes, because gnu's sed does not edit the files in place, but creates new files with the same name.) Doing so is a recipe for data corruption, and if you have a lot of files you don't want to take that risk. Just write the files into a new directory tree. It will make recovery much simpler.
eg:
d=backup/$(dirname "$filename")
mkdir -p "$d"
awk '...' "$filename" > "$d/$filename"
Consider if you used something like -i which puts backup files in the same directory structure. If you're modifying files in bulk and the process is stopped half-way through, how do you recover? If you are putting output into a separate tree, recovery is trivial. Your original files are untouched and pristine, and there are no concerns if your filtering process is terminated prematurely or inadvertently run multiple times. sed -i is a plague on humanity and should never be used. Don't spread the plague.
GNU awk from 4.1.0 has the in place ability.
And you should put the print outside the reg match block.
Try this:
gawk '/^private/{gsub(/[/-]/, "_");} 1' filename
or, make sure you backed up the file:
gawk -i inplace '/^private/{gsub(/[/-]/, "_");} 1' filename
You forgot the ^ to denote start, you need it to change lines started with private, otherwise all lines contain private will be modified.
And yeah, you can combine the two gsubs with a single one.
The sed command to do the same would be:
sed '/^private/{s/[/-]/_/g;}' filename
Add the -i option when you done testing it.

Using awk to find and replace strings in every file in directory

I have a directory full of output files, with files with names:
file1.out,file2.out,..., fileN.out.
There is a certain key string in each of these files, lets call it keystring. I want to replace every instance of keystring with newstring in all files.
If there was only one file, I know I can do:
awk '{gsub("keystring","newstring",$0); print $0}' file1.out > file1.out
Is there a way to loop through all N files in awk?
You could use find command for the same. Please make sure you run this on a test file first once it works fine then only run it in your actual path(on all your actual files) for safer side. This also needs gawk newer version which has inplace option in it to save output into files itself.
find your_path -type f -name "*.out" -exec awk -i inplace -f myawkProgram.awk {} +
Where your awk program is as follows: as per your shown samples(cat myawkProgram.awk is only to show contents of your awk program here).
cat myawkProgram.awk
{
gsub("keystring","newstring",$0); print $0
}
2nd option would be pass all .out format files into your gawk program itself with -inplace by doing something like(but again make sure you run this on a single test file first and then run actual command for safer side once you are convinced by command):
awk -i inplace '{gsub("keystring","newstring",$0); print $0}' *.out
sed is the most ideal solution for this and so integrating it with find:
find /directory/path -type f -name "*.out" -exec sed -i 's/keystring/newstring/g' {} +
Find files with the extension .out and then execute the sed command on as many groups of the found files as possible (using + with -exec)

Search file contents recursively when know where in file

I am interested in efficiently searching files for content using bash and related tools (eg sed, grep), in the specific case that I have additional information about where in the file the intended content is. For example, I want to replace a particular string in line #3 of each file that contains a specific string on line 3 of the file. Therefore, I don't want to do a recursive grep -r on the whole directory as that would search the entirety of each file, wasting time since I know that the string of interest is on line #3, if it is there. This full-grep approach could be done with grep -rl 'string_to_find_in_files' base_directory_to_search_recursively. Instead I am thinking about using sed -i ".bak" '3s/string_to_replace/string_to_replace_with' files to search only on line #3 of all files recursively in a directory, however sed seems to only be able to take one file as input argument. How can I apply sed to multiple files recursively? find -exec {} \; and find -print0 | xargs -0 seem to be very slow.. Is there a faster method than using find? I can achieve the desired effect very quickly with awk but only on a single directory, it does not seem to me to be recursive, such as using awk 'FNR==3{print $0}' directory/*. Any way to make this recursive? Thanks.
You can use find to have the list of files and feed to sed or awk one by one by xargs
for example, this will print the first lines of all files listed by find.
$ find . -name "*.csv" | xargs -L 1 sed -n '1p'

How to run awk script on multiple files

I need to run a command on hundreds of files and I need help to get a loop to do this:
have a list of input files /path/dir/file1.csv, file2.csv, ..., fileN.csv
need to run a script on all those input files
script is something like: command input=/path/dir/file1.csv output=output1
I have tried things like:
for f in /path/dir/file*.csv; do command, but how do I get to read and write the new file every time?
Thank you....
Try this, (after changing /path/to/data to the correct path. Same with /path/to/awkscript and other places, pointing to your test data.)
#!/bin/bash
cd /path/to/data
for f in *.csv ; do
echo "awk -f /path/to/awkscript \"$f\" > ${f%.csv}.rpt"
#remove_me awk -f /path/to/awkscript "$f" > ${f%.csv}.rpt
done
make the script "executable" with
chmod 755 myScript.sh
The echo version will help you ensure the script is going to work as expected. You still have to carefully examine that output OR work on a copy of your data so you don't wreck you base-line data.
You could take the output of the last iteration
awk -f /path/to/awkscript myFileLast.csv > myFileLast.rpt
And copy/paste to cmd-line to confirm it will work.
WHen you comfortable that the awk script works as you need, then comment out the echo awk .. line, and uncomment the word #remove_me (and save your bash script).
for f in /path/to/files/*.csv ; do
bname=`basename $f`
pref=${bname%%.csv}
awk -f /path/to/awkscript $f > /path/to/store/output/${pref}_new.txt
done
Hopefully this helps, I am on my blackberry so there may be typos

Retain backslashes with while read loop in multiple shells

I have the following code:
#!/bin/sh
while read line; do
printf "%s\n" $line
done < input.txt
Input.txt has the following lines:
one\two
eight\nine
The output is as follows
onetwo
eightnine
The "standard" solutions to retain the slashes would be to use read -r.
However, I have the following limitations:
must run under #!/bin/shfor reasons of portability/posix compliance.
not all systems
will support the -r switch to read under /sh
The input file format cannot be changed
Therefore, I am looking for another way to retain the backslash after reading in the line. I have come up with one working solution, which is to use sed to replace the \ with some other value (e.g.||) into a temporary file (thus bypassing my last requirement above) then, after reading them in use sed again to transform it back. Like so:
#!/bin/sh
sed -e 's/[\/&]/||/g' input.txt > tempfile.txt
while read line; do
printf "%s\n" $line | sed -e 's/||/\\/g'
done < tempfile.txt
I'm thinking there has to be a more "graceful" way of doing this.
Some ideas:
1) Use command substitution to store this into a variable instead of a file. Problem - I'm not sure command substitution will be portable here either and my attempts at using a variable instead of a file were unsuccessful. Regardless, file or variable the base solution is really the same (two substitutions).
2) Use IFS somehow? I've investigated a little, but not sure that can help in this issue.
3) ???
What are some better ways to handle this given my constraints?
Thanks
Your constraints seem a little strict. Here's a piece of code I jotted down(I'm not too sure of how valuable your while loop is for the other stuffs you would like to do, so I removed it off just for ease). I don't guarantee this code to be robustness. But anyway, the logic would give you hints in the direction you may wish to proceed. (temp.dat is the input file)
#!/bin/sh
var1="$(cut -d\\ -f1 temp.dat)"
var2="$(cut -d\\ -f2 temp.dat)"
iter=1
set -- $var2
for x in $var1;do
if [ "$iter" -eq 1 ];then
echo $x "\\" $1
else
echo $x "\\" $2
fi
iter=$((iter+1))
done
As Larry Wall once said, writing a portable shell is easier than writing a portable shell script.
perl -lne 'print $_' input.txt
The simplest possible Perl script is simpler still, but I imagine you'll want to do something with $_ before printing it.