I want to create a copy command that copies files from one directory to its parent directory, stripping the date suffix. There are multiple such files.
e.g. for the file LOAN.DAILY.20191204
I want to generate the command
cp LOAN.DAILY.20191204 ../LOAN.DAILY
My attempt
ls -lrt | awk ' /DAILY/{ print "cp " , $9 , "../" , sub(/\.20191204$/,""); $9 }'
I get this output:
cp LOAN.DAILY.20191204 ../ 1
Why is this 1 appearing?
This might work for you (GNU sed):
ls *DAILY* | sed -E 's#^(.*)\..*#cp & ../\1#'
and once the output has been checked use this version to enact the copy.
ls *DAILY* | sed -E 's#^(.*)\..*#cp & ../\1#e'
or an alternative using GNU parallel:
parallel --dry-run cp {} ../{.} ::: *DAILY*
again, check the result and if all ok, use:
parallel cp {} ../{.} ::: *DAILY*
One simple way:
shopt -s nullglob
for file in *.DAILY.* ; do cp "$file" ../"${file%.*}"; done
shopt -s nullglob: avoids any unnecessary copies in case the glob doesn't match anything.
"${file%.*}": Shell's parameter expansion to strip off the everything from strings's end till the first matched . in reverse direction.
I can't recall better and shorter ways to do this, although I suppose there are many.
According to https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html:
As mentioned, the third argument to sub() must be a variable, field, or array element. Some versions of awk allow the third argument to be an expression that is not an lvalue. In such a case, sub() still searches for the pattern and returns zero or one, but the result of the substitution (if any) is thrown away because there is no place to put it.
This explains why you get a 1 in the output.
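You can see that return value in isolation:
echo LOAN.DAILY.20191204 | awk '{print sub(/\.20191204$/, "")}'
This prints 1, the number of substitutions made, while the modified $0 is never printed.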
If you want to modify the value of the ninth column, you need to pass it as the third argument to sub():
ls -lrt | awk ' /DAILY/{ orig=$9; sub(/\.20191204$/,"", $9); print "cp " , orig , "../", $9 }'
In this command, the original value of $9 is stored in a variable orig, then the date suffix is removed using sub, and finally the cp command is constructed using the old and new values.
Related
I am attempting to rename the files {out1.hmm, out2.hmm, ..., outn.hmm} to unique identifiers taken from the third line of each file, e.g. {PF12574.hmm, PF09847.hmm, PF0024.hmm}. The script works on a single file; however, the variable does not get overwritten, and only one file remains after running the command below:
for f in *.hmm;
do output="$(sed -n '3p' < $f |
awk -F ' ' '{print $2}' |
cut -f1 -d '.' | cat)" |
mv $f "${output}".hmm; done;
The first line takes all the outn.hmm files as input. The second line sets a variable meant to hold the desired unique identifier; sed, awk, and cut are used to extract it. The variable is supposed to rename the current file to the unique identifier, but it seems to remain locked, and each rename overwrites the previous file.
out1.hmm out2.hmm out3.hmm becomes PF12574.hmm
How can I overwrite the variable to get the following file structure:
out1.hmm out2.hmm out3.hmm becomes PF12574.hmm PF09847.hmm PF0024.hmm
You're piping the (empty) output of the assignment statement for the variable output into the mv command. Each side of a pipeline runs in its own subshell, so the variable is not set when mv runs; what I think will happen is that you will, one after the other, rename all the files matching *.hmm to the single file named .hmm.
Try ls -a to see if that's what actually happened.
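You can also reproduce the effect in isolation (a minimal sketch; PF12574 is just a stand-in value):
output=$(echo PF12574) | echo "output is '$output'"
Each side of the pipe runs in its own subshell, so this prints output is '': the assignment never reaches the right-hand side.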
The sed, awk, cut, and (unneeded) cat are a bit much. awk can do all you need. Then do the mv as a separate command:
for f in *.hmm
do
output=$(awk 'NR == 3 {print $2}' "$f")
mv "$f" "${output%.*}.hmm"
done
Note that the above does not do any checking to verify that output is assigned to a reasonable value: one that is non-empty, that is a proper "identifier", etc.
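If you do want such a check, one possible sketch (the warning message is purely illustrative):
for f in *.hmm
do
    output=$(awk 'NR == 3 {print $2}' "$f")
    if [ -z "$output" ]; then
        echo "skipping $f: no identifier on line 3" >&2
        continue
    fi
    mv "$f" "${output%.*}.hmm"
done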
I have a lot of text files in which I would like to find the word 'CASE' and replace it with the related filename.
I tried
find . -type f | while read file
do
awk '{gsub(/CASE/,print "FILENAME",$0)}' $file >$file.$$
mv $file.$$ >$file
done
but I got the following error
awk: syntax error at source line 1 context is >>> {gsub(/CASE/,print <<< "CASE",$0)}
awk: illegal statement at source line 1
I also tried
for i in $(ls *);
do
awk '{gsub(/CASE/,${i},$0)}' ${i} > file.txt;
done
getting an empty output and
awk: syntax error at source line 1 context is >>> {gsub(/CASE/,${ <<<
awk: illegal statement at source line 1
Why awk? sed is what you want:
while read -r file; do
sed -i "s/CASE/${file##*/}/g" "$file"
done < <( find . -type f )
or
while read -r file; do
sed -i.bak "s/CASE/${file##*/}/g" "$file"
done < <( find . -type f )
To create a backup of the original.
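The ${file##*/} expansion strips everything up to the last /, so only the base name is substituted for CASE, e.g.:
file=./sub/dir/notes.txt
echo "${file##*/}"    # notes.txt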
You didn't post any sample input and expected output so this is a guess but maybe this is what you want:
find . -type f |
while IFS= read -r file
do
awk '{gsub(/CASE/,FILENAME)} 1' "$file" > "${file}.$$" &&
mv "${file}.$$" "$file"
done
Every change I made to the shell code is important so if you don't understand why I changed any part of it, ask the question.
btw if after making the changes you are still getting the error message:
awk: syntax error at source line 1
awk: illegal statement at source line 1
then you are using old, broken awk (/usr/bin/awk on Solaris). Never use that awk. On Solaris use /usr/xpg4/bin/awk (or nawk if you must).
Caveats: the above will fail if your file name contains newlines or ampersands (&) or escaped digits (e.g. \1). See Is it possible to escape regex metacharacters reliably with sed for details. If any of that is a problem, post some representative sample input and expected output.
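For instance, the ampersand caveat is easy to reproduce:
echo CASE | awk '{gsub(/CASE/,"a&b")} 1'
prints aCASEb rather than a&b, because & in the replacement re-inserts the matched text.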
The print in that first script is the error.
The second argument to gsub is the replacement string, not a command.
You want just FILENAME. (Note: not "FILENAME", which is a literal string, but FILENAME, the variable.)
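A one-file demonstration of the difference (demo.txt is just a scratch file):
printf 'CASE here\n' > demo.txt
awk '{gsub(/CASE/,FILENAME)} 1' demo.txt      # prints: demo.txt here
awk '{gsub(/CASE/,"FILENAME")} 1' demo.txt    # prints: FILENAME here (the literal string)
With that fixed, the full loop becomes: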
find . -type f -print0 | while IFS= read -d '' file
do
awk '{gsub(/CASE/,FILENAME,$0)} 7' "$file" >"$file.$$"
mv "$file.$$" "$file"
done
Note that I quoted all your variables and fixed your find | read pipeline to work correctly for files with odd characters in the names (see Bash FAQ 001 for more about that). I also fixed the erroneous > in the mv command.
See the answers on this question for how to properly escape the original filename to make it safe to use in the replacement portion of gsub.
Also note that recent (4.1+, I believe) versions of GNU awk have the -i inplace option.
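For example, this untested sketch would edit the files in place with no temp files at all (GNU awk only):
find . -type f -exec gawk -i inplace '{gsub(/CASE/,FILENAME)} 1' {} +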
To fix the second script you need to add the quotes you removed from the first script.
for i in *; do awk '{gsub(/CASE/,"'"${i}"'",$0)} 1' "${i}" > file.txt; done
Note that I got rid of the worse-than-useless use of ls (worse than useless because it actively breaks for files with spaces or shell metacharacters in their names; see Parsing ls for more on that).
That command is somewhat ugly and unsafe for filenames containing various special characters, though, and would be better written as:
for i in *; do awk -v fname="$i" '{gsub(/CASE/,fname,$0)} 1' "${i}" > file.txt; done
since that version handles filenames containing double quotes etc. correctly, whereas the direct variable-expansion version does not.
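To see why, consider a file whose name contains a double quote:
touch 'foo"bar'
for i in *; do awk '{gsub(/CASE/,"'"${i}"'",$0)} 1' "${i}" > file.txt; done
The expansion builds the awk program {gsub(/CASE/,"foo"bar",$0)} 1, and awk reports a syntax error because the quote in the filename terminates the string early. The -v version never passes the filename through awk's parser as program text.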
That being said the corrected first script is the right answer.
I've spent countless hours trying to get this work and I think it's time to get some help. I have a 2-column file - let's call it "result.txt" with a list of values like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
I want to recreate this list using only unique results for each file type, so it should look like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
I created a list of file prefixes which should make this possible, the idea being to use grep or sed to extract the first line containing each prefix:
fileA
fileB
fileC
fileD
We'll call this result2.txt.
I have attempted to write the following c-shell script:
foreach l (`cat result2.txt`)
set name = "$l"
echo "$name"
grep -m1 "$name" result.txt >> result3.txt
end
The output file, "result3.txt" is empty. The script runs perfectly up to the grep command. When I run the grep command outside of the loop, using a line from result2.txt, it works fine. I get the same result using this: sed -n '/"\$name\"/p'
And I think I tried an awk command at some point.
The problem seems to be in getting those programs to recognise the $name or $l variables. I have tried different combinations of " and ' around $name and I have tried adding backslashes: e.g. $\name. Can anyone please tell me what the issue is?
Sounds like a job for awk. Use underscore or whitespace as the field separator, and print a line only if the first field has not been seen yet:
awk -F '[_[:space:]]+' '!seen[$1]++' << END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
END
Output:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
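Applied to the actual files from the question, that would be:
awk -F '[_[:space:]]+' '!seen[$1]++' result.txt > result3.txt
The !seen[$1]++ condition is true only the first time a given first field appears, so only the first line for each file prefix gets printed.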
I've just tried this in csh, and both your version and the following simplified version just work. Note: no quotation marks at all.
foreach name (`cat result2.txt`)
grep -m1 $name result.txt >>result3.txt
end
Could you please check whether result.txt really contains what you mentioned at the beginning?
cat result.txt
sed -n 's/.*/²&³/;H
$ {x;s/\(.\).*/&\1/
t again
: again
s/²\([^_]\{1,\}_\)\(.*\)²\1[^³]*³./²\1\2/
t again
s/.\(.*\)./\1/;s/[²³]//g
p
}' YourFile
Two temporary delimiters, ² and ³, are used because of sed's limitations when manipulating \n.
How do I iterate over all the lines output by a command using zsh, without setting IFS?
The reason is that I want to run a command against every file output by a command, and some of these files contain spaces.
Eg, given the deleted file:
foo/bar baz/gamma
That is, a single directory 'foo', containing a sub directory 'bar baz', containing a file 'gamma'.
Then running:
git ls-files --deleted | xargs ls
will result in that file being handled as two files: 'foo/bar' and 'baz/gamma'.
I need it to handle it as one file: 'foo/bar baz/gamma'.
If you want to run the command once for all the lines:
ls "${(#f)$(git ls-files --deleted)}"
The f parameter expansion flag means to split the command's output on newlines. There's a more general form (s:||:) to split at an arbitrary string like ||. The @ flag means to retain empty records. Somewhat confusingly, the whole expansion needs to be inside double quotes, to avoid IFS splitting on the output of the command substitution, but it will still produce separate words for each record.
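A self-contained illustration, using the sample path from the question plus a second, hypothetical name:
lines=("${(@f)$(printf 'foo/bar baz/gamma\nanother deleted file')}")
print -rl -- "${(@)lines}"
This prints two lines; the embedded spaces survive because splitting happens only at newlines.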
If you want to run the command for each line in turn, the portable idiom isn't particularly complicated:
git ls-files --deleted | while IFS= read -r line; do ls "$line"; done
If you want to run the command as few times as the command line length limit permits, use zargs.
autoload -U zargs
zargs -- "${(#f)$(git ls-files --deleted)}" -- ls
Using tr and the -0 option of xargs, assuming that the lines don't contain \000 (NUL), which is a fair assumption due to NUL being one of the characters that can't appear in filenames:
git ls-files --deleted | tr '\n' '\000' | xargs -0 ls
this turns the line foo/bar baz/gamma\n into foo/bar baz/gamma\000, which xargs -0 knows how to handle.
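You can inspect the transformation directly:
printf 'foo/bar baz/gamma\n' | tr '\n' '\000' | od -c
od -c shows the trailing \n replaced by \0, which is exactly the delimiter xargs -0 expects.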
I wrote a small script using awk's split() function to get the current directory name.
echo $PWD
I need to replace '8' with the number of tokens as a result of the split operation.
# If PWD = /home/username/bin, I am trying to get "bin" into package.
package="`echo $PWD | awk '{split($0,a,"/"); print a[8] }'`"
echo $package
Can you please tell me what to substitute in place of 'print a[8]' to make the script work for any directory path?
You don't need awk for that. If you always want the last dir in a path just do:
#!/bin/sh
cur_dir="${PWD##*/}/"
echo "$cur_dir"
The above has the added benefit of not creating any subshells and/or forks to external binaries. It's all native POSIX shell syntax.
You could use print a[length(a)], but it's better to avoid splitting altogether and use a custom field separator with $NF:
echo "$PWD" | awk -F/ '{print $NF}'
But in that specific case you should rather use basename:
basename "$PWD"
The other answers are better replacements to perform the function you're trying to accomplish. However, here is the specific answer to your question:
package=$(echo "$PWD" | awk '{n = split($0, a, "/"); print a[n]}')
echo "$package"
split() returns the number of resulting elements.
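For example:
echo "/home/username/bin" | awk '{n = split($0, a, "/"); print n, a[n]}'
prints 4 bin. Note that n is 4, not 3, because the leading / produces an empty a[1].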