Augmentation of a path using awk and/or sed commands

I'm an awk and sed newbie. I have the following string in a file
mount --bind /vsepr_app_repo/fedora/20/plone/4.3.4/Plone/buildout-cache/downloads /buildout-cache/downloads
and I want to produce the following output from it:
mount --bind /vsepr_app_repo/fedora/20/plone/4.3.4/Plone/buildout-cache/downloads /Plone/buildout-cache/downloads
How can I do that using sed and awk commands in a shell script?
I want to repeat the same operations on many lines of my file.
Any suggestion would help me a lot.

Without knowing a few more details, the following awk command is a start.
$ cat data
mount --bind /vsepr_app_repo/fedora/20/plone/4.3.4/Plone/buildout-cache/downloads /buildout-cache/downloads
$ awk '/^mount.*buildout-cache.downloads/ { $NF = "/Plone" $NF; print }' < data
mount --bind /vsepr_app_repo/fedora/20/plone/4.3.4/Plone/buildout-cache/downloads /Plone/buildout-cache/downloads
It will prefix the last token on the line with /Plone for any line that starts with mount and contains buildout-cache/downloads (the pattern is not anchored at the end of the line).
The /^mount.*buildout-cache.downloads/ part makes the block apply to lines that match the regular expression. The command block uses $NF which is a reference to the last field on the line, prepends it with "/Plone", and then prints the entire line out.
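Since the question mentions repeating this on many lines, the one-liner can be wrapped in a small shell function. This is only a sketch: the function name prefix_last_field and the awk variable prefix are made up for illustration, and unlike the command above it prints every line, changed or not, so that the rest of the file is kept.

```shell
# Hypothetical helper: prepend PREFIX to the last field of matching lines.
# Usage: prefix_last_field PREFIX [FILE...]
prefix_last_field() {
  prefix=$1
  shift
  awk -v prefix="$prefix" '
    # Only touch mount lines that mention buildout-cache/downloads.
    /^mount.*buildout-cache\/downloads/ { $NF = prefix $NF }
    # Print every line so that unrelated lines pass through unchanged.
    { print }
  ' "$@"
}

printf '%s\n' \
  'mount --bind /vsepr_app_repo/fedora/20/plone/4.3.4/Plone/buildout-cache/downloads /buildout-cache/downloads' \
  '# an unrelated line' |
  prefix_last_field /Plone
```

Assigning to $NF makes awk rebuild the line with single spaces between fields, which is harmless here but would collapse any runs of whitespace in the input.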

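For a generic folder name in the path (not just Plone):
sed 's#\(/[^/]\{1,\}\)\(/.\{1,\}\)\([[:space:]]\{1,\}\)\2$#\1\2\3\1\2#' YourFile
This uses the last path as a pattern (the \2 back-reference) to find where it occurs at the end of the first path, and copies the folder name just before it.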

Related

Git URL - Pull out substring via Shell (awk & sed)?

I have got the following URL:
https://xcg5847#git.rz.bankenit.de/scm/smat/sma-mes-test.git
I need to pull out sma-mes-test and smat:
git config --local remote.origin.url|sed -n 's#.*/\([^.]*\)\.git#\1#p'
sma-mes-test
This works. But I also need the project name, which is smat
I am not really familiar with complex regex and sed; I found the command above in another post here. Does anyone know how I can extract the smat value as well?
With your shown samples, please try the following awk code. In short: it sets the field separators to / and .git for all lines, and the main program prints the 3rd-last and 2nd-last fields of the line (the last field is the empty string after .git).
your_git_command | awk -F'/|\\.git' '{print $(NF-2),$(NF-1)}'
Your sed is pretty close. You can just extend it to capture 2 values and print them:
git config --local remote.origin.url |
sed -E 's~.*/([^/]+)/([^.]+)\.git$~\1 \2~'
smat sma-mes-test
If you want to populate shell variable using these 2 values then use this read command in bash:
read v1 v2 < <(git config --local remote.origin.url |
sed -E 's~.*/([^/]+)/([^.]+)\.git$~\1 \2~')
# check variable values
declare -p v1 v2
declare -- v1="smat"
declare -- v2="sma-mes-test"
Using sed
$ sed -E 's#.*/([^/]*)/#\1 #' input_file
smat sma-mes-test.git
I would harness GNU AWK for this task following way, let file.txt content be
https://xcg5847#git.rz.bankenit.de/scm/smat/sma-mes-test.git
then
awk 'BEGIN{FS="/"}{sub(/\.git$/,"",$NF);print $(NF-1),$NF}' file.txt
gives output
smat sma-mes-test
Explanation: I tell GNU AWK that the field separator is the slash character. I then remove .git (note that . is escaped so that it means a literal dot) at the end ($) of the last field ($NF), and print the second-to-last field ($(NF-1)) and the last field ($NF), separated by a space, which is the default output field separator. If you want a different separator, set OFS (output field separator) in BEGIN. If you want to know more about NF, read 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
(tested in gawk 4.2.1)
Why not sed 's!.*/\(.*/.*\)!\1!'?
# 22 bytes = the 21 characters of smat/sma-mes-test.git plus the trailing newline
string=$(git config --local remote.origin.url | tail -c 22)
var1=$(echo "${string}" | cut -d'/' -f1)
var2=$(echo "${string}" | cut -d'/' -f2 | sed 's#\.git##')
If you have multiple URLs of varying lengths, this will not work, but if you only have the one, it will.
var1=smat
var2=sma-mes-test
If I did have input of varying length, I would personally replace all of the forward slashes with newlines, write the result to a file, and then extract the last and second-to-last lines with ed, which would give me the last two segments of the URL.
Regular expressions literally give me a migraine, but as long as I can get everything on its own line, I can quite easily bypass the need for them entirely.
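The line-splitting idea described above can be sketched without ed or a temporary file: tr puts each path segment on its own line, and tail and head pick out the last two. The function name last_two_segments is hypothetical, and removing .git with shell parameter expansion is a substitution of mine for the sed call, not something the answer spells out.

```shell
# Hypothetical helper: print the last two "/"-separated segments of a URL,
# with a trailing ".git" removed from the final one.
last_two_segments() {
  # One segment per line, then keep only the last two lines.
  segments=$(printf '%s\n' "$1" | tr '/' '\n' | tail -n 2)
  project=$(printf '%s\n' "$segments" | head -n 1)
  repo=$(printf '%s\n' "$segments" | tail -n 1)
  # ${repo%.git} drops a literal ".git" suffix if one is present.
  printf '%s %s\n' "$project" "${repo%.git}"
}

last_two_segments 'https://xcg5847#git.rz.bankenit.de/scm/smat/sma-mes-test.git'
```

Because the segments are counted from the end, this does not depend on the length of the URL.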

Check if all multiple strings exist in one line

I have a file that has this info:
IRE_DRO_Fabric_A drogesx0112_IRE_DRO_A_ISIL03_091_871
IRE_DRO_Fabric_A drogesx0112_IRE_DRO_A_NETAPP_7890_2D5_1D8
IRE_DRO_Fabric_A drogesx0112_SAN_A
IRE_DRO_Fabric_B drogesx0112_IRE_DRO_B_ISIL03_081_873
IRE_DRO_Fabric_B drogesx0112_IRE_DRO_B_NETAPP_7890_9D3_2D8
IRE_DRO_Fabric_B drogesx0112_SAN_B
and I want to check whether multiple strings are found on the same line. I tried these commands, but they are not working. Is this possible for this kind of text?
grep 'drogesx0112.*ISIL03_091_871\|ISIL03_091_871.*drogesx0112' file << tried this but not working
grep 'drogesx0112' file | grep 'ISIL03_091_871' << tried this but not working
I'm looking for this output (the strings I'm searching for are string1 (drogesx0112) and string2 (ISIL03_091_871)):
>grep 'drogesx0112.*ISIL03_091_871\|ISIL03_091_871.*drogesx0112' file # command
>IRE_DRO_Fabric_A drogesx0112_IRE_DRO_A_ISIL03_091_871 < output
In other words, I want to check whether drogesx0112 and ISIL03_091_871 are both present on a single line in the file.
Simple awk
$ awk ' /drogesx0112/ && /ISIL03_091_871/ ' gafm.txt
IRE_DRO_Fabric_A drogesx0112_IRE_DRO_A_ISIL03_091_871
$
Simple Perl
$ perl -ne ' print if /drogesx0112/ and /ISIL03_091_871/ ' gafm.txt
IRE_DRO_Fabric_A drogesx0112_IRE_DRO_A_ISIL03_091_871
$
If the order does not matter and you simply want to check whether both strings are present on a single line, try the following.
awk '/drogesx0112/ && /ISIL03_091_871/' Input_file
If the strings must appear in a particular order on the line:
If your line has drogesx0112 first and then ISIL03_091_871, try the following.
awk '/drogesx0112.*ISIL03_091_871/' Input_file
If your line has ISIL03_091_871 first and then drogesx0112, try the following.
awk '/ISIL03_091_871.*drogesx0112/' Input_file
This might work for you (GNU sed):
sed '/drogesx0112/!d;/ISIL03_091_871/!d' file
This deletes the current line if it does not contain drogesx0112, and also deletes it if it does not contain ISIL03_091_871, so only lines containing both are printed.
Another way:
sed -n '/drogesx0112/{/ISIL03_091_871/p}' file
A third:
sed '/drogesx0112/{/ISIL03_091_871/p};d' file
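The answers above hard-code two strings. As a rough sketch of the same idea for any number of strings, the following function passes them as extra arguments and compares them literally with index(), so regex metacharacters in the strings need no escaping. The name grep_all and the calling convention are assumptions for illustration only.

```shell
# Hypothetical helper: print the lines of FILE that contain every STRING.
# Usage: grep_all FILE STRING...   (use - as FILE to read standard input)
grep_all() {
  awk '
    BEGIN {
      # ARGV[1] is the file; the remaining arguments are search strings.
      # Blanking an ARGV entry stops awk from treating it as a file name.
      for (i = 2; i < ARGC; i++) { needle[i] = ARGV[i]; ARGV[i] = "" }
    }
    {
      # Skip the line as soon as one string is missing (literal match).
      for (i in needle) if (index($0, needle[i]) == 0) next
      print
    }
  ' "$@"
}

printf '%s\n' \
  'IRE_DRO_Fabric_A drogesx0112_IRE_DRO_A_ISIL03_091_871' \
  'IRE_DRO_Fabric_A drogesx0112_SAN_A' \
  'IRE_DRO_Fabric_B drogesx0112_IRE_DRO_B_ISIL03_081_873' |
  grep_all - drogesx0112 ISIL03_091_871
```

The order in which the strings appear on the line does not matter, as with the /a/ && /b/ form.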

How do I write a sed or a awk command that finds a pattern and deletes it in a text file

I have the following lines in a text file:
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:FOO.${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:BAR.${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:BAZ.${TAG_NAME}
I want to write a sed or awk command that finds all occurrences of FOO., BAR. and BAZ. in the above lines in a file and deletes these occurrences so that the result looks like this in the end:
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
Sed one:
sed 's/\b\(FOO\|BAR\|BAZ\)\.//' input_file
If you want to replace the first occurrence of :something. in every line, test
sed 's/:[^.]*[.]/:/' inputfile
When this works and you want it replaced in the file without making a backup, you can use the option -i:
sed -i 's/:[^.]*[.]/:/' inputfile
When ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME} might have a :, the pattern is matched at the wrong place. When you are sure that the ${TAG_NAME} is without :, use
sed -r 's/:[^.]*[.]([^:]*)$/:\1/' inputfile
Edit: After the comment of @potong, I replaced '.' with ':' in the replacement strings. I had kept the wrong character.
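One case the anchored command above still seems to miss is an earlier colon with no dot between it and the tag, such as a registry port: [^.]* can run across the second colon, so the match starts too early. A possible tightening, offered only as a sketch, is to exclude the colon from that class as well. The function name strip_tag_prefix and the localhost:5000 line are invented for the example.

```shell
# Hypothetical helper: drop "PREFIX." from the tag after the last colon.
strip_tag_prefix() {
  # [^.:]* cannot cross a ":" and ([^:]*)$ cannot contain one, so the
  # colon that is matched has to be the last colon on the line.
  sed -E 's/:[^.:]*[.]([^:]*)$/:\1/' "$@"
}

printf '%s\n' \
  'docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:FOO.${TAG_NAME}' \
  'docker pull localhost:5000/repo/image:BAR.v1' |
  strip_tag_prefix
```

The second input line is the one where the two variants differ: with [^.]* the text from :5000 onwards would be consumed, while here only BAR. is removed.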
Use one of these Perl one-liners:
# Remove { FOO, BAR or BAZ } and the following '.' :
perl -pe 's/(FOO|BAR|BAZ)[.]//' in_file > out_file
# Remove *anything* between the last ':' (exclusive) and '.' (inclusive) :
perl -pe 's/\A(.+:)[^.]+[.]/$1/' in_file > out_file
The Perl one-liners use these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
The regex uses:
\A : beginning of the line,
(.+:) : capture into variable $1 everything from the first character up to and including the last colon (.+ is greedy),
[^.]+[.] : 1 or more occurrence of any character other than '.', followed by '.'. It is surrounded by brackets to match a literal dot: [.]
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)
perldoc perlre: Perl regular expressions (regexes): Quantifiers; Character Classes and other Special Escapes; Assertions; Capture groups
In awk the variable FS containing a regex can help.
$ cat file
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:FOO.${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:BAR.${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:BAZ.${TAG_NAME}
In the input, we can see three words of 3 characters each (FOO, BAR, BAZ), each followed by a dot. We can put a regex that matches them in the FS separator:
FS='[A-Z]{3}\\.'
In the awk manual we can read that "the value of FS may be a string containing any regular expression. In this case, each match in the record for the regular expression separates fields."
https://www.gnu.org/software/gawk/manual/html_node/Regexp-Field-Splitting.html
So we have
awk -v FS='[A-Z]{3}\\.' '{print $1 $2}' file
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
Note that there is no comma between $1 and $2, so the two fields are concatenated without a space.
A regex listing the 3 specific words gives the same result:
$ awk -v FS='(FOO|BAR|BAZ)\\.' '{print $1 $2}' file
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}
docker pull ${DOCKER_REGISTRY}/{REPOSITORY}/{IMAGE_NAME}:${TAG_NAME}

How to use sed/awk to replace the original file and get the following desired output?

I'm writing a bash script that translates one file into another, and I have run into an issue.
Whenever the program sees something like this (the ...... parts stand for surrounding text that is not shown):
......Mul(-a1+b2-c3...+f+e)......
change it to:
......M(-a1)*M(b2)*M(-c3)*...*M(f)*M(e)......
The number of variables in Mul is unknown, and there can be multiple occurrences of Mul in the file. There are also other places in the file where + or - appears, and variables can be one or more characters long.
I tried grouping in sed, with a group followed by a "*", but it doesn't work because the number of variables to replace is unknown.
Here is a sed script that will do it:
:a
s/\(Mul(.[^)]*\)\([+-].\)/\1)*Mul(\2/
ta
s/Mul(+\{0,1\}/M(/g
The trick is to use the t (test) command to jump back to the beginning after each successful substitution. Because [^)]* is greedy, each pass splits off the last term (e.g. "Mul(a+b+c)"=>"Mul(a+b)*Mul(+c)").
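To try the script from a shell without a separate file, it can be passed as several -e parts, which keeps the label and the branch on their own lines. The wrapper name expand_mul is hypothetical; the sed commands are the ones from the script above, and the comments trace how the loop peels off one term per pass.

```shell
# Hypothetical wrapper around the sed script above.
expand_mul() {
  sed -e ':a' \
      -e 's/\(Mul(.[^)]*\)\([+-].\)/\1)*Mul(\2/' \
      -e 'ta' \
      -e 's/Mul(+\{0,1\}/M(/g' "$@"
}

# Passes of the loop for the sample input:
#   Mul(-a1+b2-c3+f+e)
#   Mul(-a1+b2-c3+f)*Mul(+e)
#   Mul(-a1+b2-c3)*Mul(+f)*Mul(+e)
#   Mul(-a1+b2)*Mul(-c3)*Mul(+f)*Mul(+e)
#   Mul(-a1)*Mul(+b2)*Mul(-c3)*Mul(+f)*Mul(+e)
# The final substitution renames Mul( to M( and drops a leading "+".
printf '%s\n' 'x=Mul(-a1+b2-c3+f+e);' | expand_mul
```

The first `.` after `Mul(` consumes a leading sign, which is why a term such as Mul(-a1) is not split again and the loop terminates.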
$ cat tst.awk
match($0,/Mul\([^()]+\)/) {
    tgt = substr($0,RSTART+4,RLENGTH-5)
    gsub(/[-+][[:alnum:]]+/,"*M(&)",tgt)
    gsub(/\+/,"",tgt)
    sub(/^\*/,"",tgt)
    print substr($0,1,RSTART-1) tgt substr($0,RSTART+RLENGTH)
}
$ awk -f tst.awk file
......M(-a1)*M(b2)*M(-c3)*M(f)*M(e)......
The above was run on this input file:
$ cat file
......Mul(-a1+b2-c3+f+e)......
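The tst.awk script above rewrites only the first Mul(...) on a line and prints only lines that contain one. Since the question mentions multiple occurrences, here is a sketch of one way to loop over every occurrence and pass other lines through unchanged. The function name expand_all_mul is hypothetical, and adding a "+" in front of an unsigned first term (so that Mul(a+b) also works) is an addition of mine rather than part of the original script.

```shell
# Hypothetical variant of tst.awk that handles every Mul(...) on a line.
expand_all_mul() {
  awk '{
    out = ""
    rest = $0
    while (match(rest, /Mul\([^()]+\)/)) {
      # Text between "Mul(" and ")".
      tgt = substr(rest, RSTART + 4, RLENGTH - 5)
      # Give an unsigned first term an explicit sign so every term matches.
      if (tgt !~ /^[+-]/) tgt = "+" tgt
      gsub(/[+-][^+-]+/, "*M(&)", tgt)   # wrap each signed term
      gsub(/\+/, "", tgt)                # drop the plus signs
      sub(/^\*/, "", tgt)                # drop the leading "*"
      out = out substr(rest, 1, RSTART - 1) tgt
      rest = substr(rest, RSTART + RLENGTH)
    }
    # Lines without Mul(...) are printed unchanged.
    print out rest
  }' "$@"
}

printf '%s\n' 'y=Mul(a+b)+Mul(c-d)' 'no match here' | expand_all_mul
```

Each pass consumes the text up to the end of the current match, so the loop cannot revisit output it has already produced.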

Processing of awk with multiple variable from previous processing?

I have a question about awk processing. I have the file below:
cat test.txt
/home/shhh/
abc.c
/home/shhh/2/
def.c
gthjrjrdj.c
/kernel/sssh
sarawtera.c
wrawrt.h
wearwaerw.h
My goal is to build full paths such as /home/shhh/abc.c by joining each directory line with the file names that follow it.
This is the command I got from someone:
cat test.txt | awk '/^\/.*/{path=$0}/^[a-zA-Z]/{printf("%s/%s\n",path,$0);}'
It works, but I don't fully understand how to read it step by step.
Could you explain how to interpret it?
Result :
/home/shhh//abc.c
/home/shhh/2//def.c
/home/shhh/2//gthjrjrdj.c
/kernel/sssh/sarawtera.c
/kernel/sssh/wrawrt.h
/kernel/sssh/wearwaerw.h
What you probably want is the following:
$ awk '/^\//{path=$0}/^[a-zA-Z]/ {printf("%s/%s\n",path,$0)}' file
/home/shhh//abc.c
/home/shhh/2//def.c
/home/shhh/2//gthjrjrdj.c
/kernel/sssh/sarawtera.c
/kernel/sssh/wrawrt.h
/kernel/sssh/wearwaerw.h
Explanation
/^\//{path=$0} on lines starting with a /, store it in the path variable.
/^[a-zA-Z]/ {printf("%s/%s\n",path,$0)} on lines starting with a letter, print the stored path together with the current line.
Note you can also say
awk '/^\//{path=$0; next} {printf("%s/%s\n",path,$0)}' file
Some comments
cat file | awk '...' is better written as awk '...' file.
You don't need the ; at the end of a block {} if you are executing just one command. It is implicit.
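The output above contains // wherever the directory line already ends in a slash. If a single slash is wanted, one option is to strip trailing slashes when the directory is stored. This is a sketch under that assumption; the function name join_paths is made up, and it treats every line that does not start with / as a file name, as in the shorter variant above.

```shell
# Hypothetical variant: strip trailing slashes from the directory line so
# the joined path has exactly one "/" between directory and file name.
join_paths() {
  awk '
    # Directory line: remember it without trailing slashes, print nothing.
    /^\// { path = $0; sub(/\/+$/, "", path); next }
    # Any other line is a file name under the last directory seen.
    { print path "/" $0 }
  ' "$@"
}

printf '%s\n' /home/shhh/ abc.c /home/shhh/2/ def.c /kernel/sssh wrawrt.h |
  join_paths
```

A directory line consisting of just / becomes the empty string, so files under it are still printed as /name.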