Extract directory path from file path - awk

I have a requirement for getting part of the string which should be read from end of the string. Like below:
a/b/c/d.txt
Now I want to get the output as a/b/c/, basically the path of the file. For this, I want the string to be read from the end, and everything from the start of the string up to the first / found that way should be printed.

If you have a single variable, how about parameter expansion?
Let's say we have the following variable A holding your provided value.
echo $A
a/b/c/d.txt
Then the following gives you the directory path of the file using parameter expansion.
echo ${A%/*}/
a/b/c/
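A minimal sketch of how that expansion behaves on a few inputs, assuming a POSIX shell. The helper name dir_of is hypothetical and not part of the answer above. One detail worth knowing: ${A%/*} leaves a value with no slash unchanged, so the sketch treats that case separately and assumes the current directory is meant.

```shell
# dir_of: hypothetical helper that prints the directory part of a path
# with a trailing slash, using only parameter expansion.
dir_of() {
    case $1 in
        */*) printf '%s/\n' "${1%/*}" ;;   # strip the shortest trailing "/..." match
        *)   printf './\n' ;;              # no slash: assume the current directory
    esac
}

dir_of a/b/c/d.txt
dir_of d.txt
```

The first call prints a/b/c/ and the second prints ./ because d.txt contains no slash to strip at.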

echo a/b/c/d.txt | awk -F/ '{$NF=""}1' OFS=/
a/b/c/

This should be done with parameter expansion like in @RavinderSingh13's very good answer or dirname as @aragaer suggests, but if you are gung-ho about an awk solution, you could do something like:
echo "a/b/c/d.txt" | awk -F"/" '{ for (f=1;f<NF;f++){printf "%s/", $f}; printf "\n"}'
But that's horrible overkill when you can just echo $(dirname "a/b/c/d.txt")/

Simple sed approach:
echo "a/b/c/d.txt" | sed 's~/[^/]*$~/~'
a/b/c/

Related

Strip last field

My script will be receiving various lengths of input and I want to strip the last field separated by a "/". An example of the input I will be dealing with is.
this/that/and/more
But the issue I am running into is that the length of the input will vary like so:
this/that/maybe/more/and/more
or/even/this/could/be/it/and/maybe/more
short/more
In any case, the expected output should be the whole string minus the last "/more".
Note: The word "more" will not be a constant these are arbitrary examples.
Example input:
this/that/and/more
this/that/maybe/more/and/more
Expected output:
this/that/and
this/that/maybe/more/and
What I know works for a string whose number of fields you know would be
cut -d'/' -f[x]
What I need, I'm assuming, is a '/'-delimited awk command like:
awk '{$NF=""; print $0}'
With awk as requested:
$ awk '{sub("/[^/]*$","")} 1' file
this/that/maybe/more/and
or/even/this/could/be/it/and/maybe
short
but this is the type of job sed is best suited for:
$ sed 's:/[^/]*$::' file
this/that/maybe/more/and
or/even/this/could/be/it/and/maybe
short
The above were run against this input file:
$ cat file
this/that/maybe/more/and/more
or/even/this/could/be/it/and/maybe/more
short/more
Depending on how you have the input in your script, bash's Shell Parameter Expansion may be convenient:
$ s1=this/that/maybe/more/and/more
$ s2=or/even/this/could/be/it/and/maybe/more
$ s3=short/more
$ echo ${s1%/*}
this/that/maybe/more/and
$ echo ${s2%/*}
or/even/this/could/be/it/and/maybe
$ echo ${s3%/*}
short
(Lots of additional info on parameter expansion at https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html)
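If the input arrives line by line rather than in separate variables, the same expansion can be applied inside a read loop. This is only a sketch under that assumption; the function name strip_last_field is hypothetical.

```shell
# strip_last_field: hypothetical helper; reads lines on stdin and prints each
# one without its final "/field", using only shell parameter expansion.
strip_last_field() {
    # The "|| [ -n "$line" ]" part also handles a last line with no newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%/*}"
    done
}

printf '%s\n' this/that/and/more short/more | strip_last_field
```

This avoids starting an external process per line, which matters mostly for large inputs.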
In your script, you could write a loop that removes the last character of the input string on each iteration, for as long as that character is not a slash. When the loop reaches a slash, exit the loop and then remove that final character (the slash) as well.
Pseudo-code:
while (lastCharacter != '/') {
removeLastCharacter();
}
removeLastCharacter(); # removes the slash
(Sorry, it's been a while since I wrote a bash script.)
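For what it's worth, the pseudo-code could be turned into runnable POSIX shell roughly like this. It is a sketch, not a recommendation over ${var%/*}; the function name strip_last_by_loop is hypothetical, and an input without any slash ends up as an empty string.

```shell
# strip_last_by_loop: hypothetical helper that follows the pseudo-code literally.
strip_last_by_loop() {
    s=$1
    # Drop characters from the end until the string ends in a slash (or is empty).
    # "${s%/}" equals "$s" exactly when $s does not end in a slash.
    while [ -n "$s" ] && [ "${s%/}" = "$s" ]; do
        s=${s%?}
    done
    # Remove the slash itself.
    s=${s%/}
    printf '%s\n' "$s"
}

strip_last_by_loop this/that/and/more
```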
Another awk alternative, using fields instead of regexes:
awk -F/ '{printf "%s", $1; for (i=2; i<NF; i++) printf "/%s", $i; printf "\n"}'
Here is an alternative shell solution:
while read -r path; do dirname "$path"; done < file

removing part of a value from variable using sed

I have the following value in a variable:
manager&amp;org.apache.catalina.filters.CSRF_NONCE=314C54E5671790D592A37C2C4A6B9AAF
I need to modify the value above to remove amp from it, so the variable should look like this:
manager&;org.apache.catalina.filters.CSRF_NONCE=314C54E5671790D592A37C2C4A6B9AAF
Please suggest.
I put your variable in a file and was able to do what you wanted with sed. The trick to make sure you do not remove any other references to just amp is to include the & as part of the substitution.
$ cat /tmp/file
manager&amp;org.apache.catalina.filters.CSRF_NONCE=314C54E5671790D592A37C2C4A6B9AAF
$ cat /tmp/file | sed 's/\&amp/\&/g'
manager&;org.apache.catalina.filters.CSRF_NONCE=314C54E5671790D592A37C2C4A6B9AAF
sed 's/&amp;/\&;/g' <<<"yourString"
The above line should help.
example:
kent$ sed 's/&amp;/\&;/g' <<< "foo&amp;bar&amp;blah"
foo&;bar&;blah
In bash, you can use Parameter Expansion - Pattern substitution to remove a substring:
VAR='manager&amp;org.apache.catalina.filters.CSRF_NONCE=314C54E5671790D592A37C2C4A6B9AAF'
echo "${VAR/amp}"
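Pattern substitution is a bash feature. In a plain POSIX sh, a similar effect can be sketched with prefix and suffix stripping instead, which is a different technique from the one above. The helper name remove_first is hypothetical, and the sample value is a shortened stand-in for the one in the question.

```shell
# remove_first: hypothetical helper; prints $1 with the first occurrence of
# the literal substring $2 removed, using only POSIX prefix/suffix stripping.
remove_first() {
    case $1 in
        *"$2"*)
            before=${1%%"$2"*}   # everything before the first occurrence
            after=${1#*"$2"}     # everything after the first occurrence
            printf '%s\n' "$before$after"
            ;;
        *)
            printf '%s\n' "$1"   # substring absent: print unchanged
            ;;
    esac
}

remove_first 'manager&amp;org' amp
```

With the sample value this prints manager&;org.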

Using grep or sed in a foreach loop won't work

I've spent countless hours trying to get this to work and I think it's time to get some help. I have a 2-column file - let's call it "result.txt" - with a list of values like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
I want to recreate this list using only unique results for each file type, so it should look like this:
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
I created a list of files which would be able to do this by using grep or sed to extract the first line containing the matching file:
fileA
fileB
fileC
fileD
We'll call this result2.txt.
I have attempted to write the following c-shell script:
foreach l (`cat result2.txt`)
set name = "$l"
echo "$name"
grep -m1 "$name" result.txt >> result3.txt
end
The output file, "result3.txt" is empty. The script runs perfectly up to the grep command. When I run the grep command outside of the loop, using a line from result2.txt, it works fine. I get the same result using this: sed -n '/"\$name\"/p'
And I think I tried an awk command at some point.
The problem seems to be in getting those programs to recognise the $name or $l variables. I have tried different combinations of " and ' around $name and I have tried adding backslashes: e.g. $\name. Can anyone please tell me what the issue is?
Thanks
Sounds like a job for awk. Use underscore or whitespace as the field separator, and print a line only if the first field has not been seen yet:
awk -F '[_[:space:]]+' '!seen[$1]++' << END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileC_2.ext -9.5
fileD.ext -9.4
fileC_3.ext -9.3
END
fileA.ext -10.3
fileB.ext -9.8
fileC_1.ext -9.7
fileD.ext -9.4
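The same one-liner can be wrapped so it reads from a file or a pipe instead of a here-document. This is a sketch; the function name first_per_type is hypothetical. The key is the text before the first underscore or whitespace, so fileC_1.ext and fileC_2.ext share the key fileC, while a name without an underscore keeps its extension in the key.

```shell
# first_per_type: hypothetical wrapper around the awk filter shown above.
# seen[$1]++ evaluates to 0 (false) the first time a key appears, so the
# negation is true and awk prints the line; later lines with that key are skipped.
first_per_type() {
    awk -F '[_[:space:]]+' '!seen[$1]++' "$@"
}

printf '%s\n' 'fileC_1.ext -9.7' 'fileC_2.ext -9.5' 'fileD.ext -9.4' 'fileC_3.ext -9.3' |
    first_per_type
```

Called as first_per_type result.txt > result3.txt, it would produce the deduplicated file the question asks for in a single pass, without the intermediate list of names.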
I've just tried in CSH and both your version and the following simplified version just work. Note, no quotation marks at all.
foreach name (`cat result2.txt`)
grep -m1 $name result.txt >>result3.txt
end
Could you please check whether result.txt really contains what you mentioned at the beginning?
cat result.txt
sed -n 's/.*/²&³/;H
$ {x;s/\(.\).*/&\1/
t again
: again
s/²\([^_]\{1,\}_\)\(.*\)\²\1[^³]*³./²\1\2/
t again
s/.\(.*\)./\1/;s/[²³]//g
p
}' YourFile
This uses two temporary delimiters, ² and ³, because of limitations in \n manipulation.

How to quote a shell variable in a TCL-expect string

I'm using the following awk command in an expect script to get the gateway for a particular destination
route | grep $dest | awk '{print $2}'
However the expect script does not like the $2 in the above statement.
Does anyone know of an alternative to awk to perform the same function as above? ie. output 2nd column.
You can use cut:
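route | grep $dest | cut -d \  -f 2
That uses a single space as the field delimiter and pulls out the second field. Note the two spaces after the backslash: the first is the escaped delimiter, the second separates it from -f.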
To answer your Expect question, single quotes have no special meaning to the Tcl parser. You need to use braces to protect the body of the awk script:
route | grep $dest | awk {{print $2}}
And as awk can do what grep does, you can get away with one less process:
route | awk -v d=$dest {$0 ~ d {print $2}}
Before switching to another utility, check whether changing the field separator works. Field separators are covered in the GNU Awk documentation.
SED is the best alternative to use. If you don't mind a dependency, Perl should also be sufficient to solve the task
Depending on the structure of your data, you can use either cut, or use sed to do both filtering and printing the second column.
Alternatively, you could use Perl:
perl -ne 'if(/foo/) { @_ = split(/:/); print $_[1]; }'
This will print second token of each line containing foo, with : as token separator.

awk split question

I wrote a small script, using awk 'split' command to get the current directory name.
echo $PWD
I need to replace '8' with the number of tokens as a result of the split operation.
// If PWD = /home/username/bin. I am trying to get "bin" into package.
package="`echo $PWD | awk '{split($0,a,"/"); print a[8] }'`"
echo $package
Can you please tell me what do I substitute in place of 'print a[8]' to get the script working for any directory path ?
-Sachin
You don't need awk for that. If you always want the last dir in a path just do:
#!/bin/sh
cur_dir="${PWD##*/}/"
echo "$cur_dir"
The above has the added benefit of not creating any subshells and/or forks to external binaries. It's all native POSIX shell syntax.
You could use print a[length(a)] (calling length on an array is not supported by every awk), but it's better to avoid splitting and use a custom field separator and $NF:
echo $PWD | awk -F/ '{print $NF}'
But in that specific case you should rather use basename:
basename "$PWD"
The other answers are better replacements to perform the function you're trying to accomplish. However, here is the specific answer to your question:
package=$(echo $PWD | awk '{n = split($0,a,"/"); print a[n] }')
echo "$package"
split() returns the number of resulting elements.
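A small sketch that wraps this in a function, so the behavior can be tried on arbitrary paths. The name last_component is hypothetical. Note that a path ending in a slash yields an empty last element, so the function prints an empty line for it.

```shell
# last_component: hypothetical helper; prints the last "/"-separated piece of $1.
last_component() {
    # split() stores the pieces in parts[1..n] and returns n, so parts[n]
    # is always the final piece regardless of how deep the path is.
    printf '%s\n' "$1" | awk '{ n = split($0, parts, "/"); print parts[n] }'
}

last_component /home/username/bin
```

For the path in the question this prints bin.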