awk split question

awk split question - awk

I wrote a small script, using awk 'split' command to get the current directory name.
echo $PWD
I need to replace '8' with the number of tokens as a result of the split operation.
// If PWD = /home/username/bin. I am trying to get "bin" into package.
package="`echo $PWD | awk '{split($0,a,"/"); print a[8] }'`"
echo $package
Can you please tell me what do I substitute in place of 'print a[8]' to get the script working for any directory path ?
-Sachin

You don't need awk for that. If you always want the last dir in a path just do:
#!/bin/sh
cur_dir="${PWD##*/}/"
echo "$cur_dir"
The above has the added benefit of not creating any subshells and/or forks to external binaries. It's all native POSIX shell syntax.

You could use print a[length(a)] but it's better to avoid splitting and use custom fields separator and $NF:
echo $PWD | awk -F/ '{print $NF}'
But in that specific case you should rather use basename:
basename "$PWD"

The other answers are better replacements to perform the function you're trying to accomplish. However, here is the specific answer to your question:
package=$(echo $PWD | awk '{n = split($0,a,"/"); print a[n] }')
echo "$package"
split() returns the number of resulting elements.

Related

awk/grep/sed: find multiple patterns at one line in my files

I read a lot here about awk and variables, but could not find what I want.
I have some files ($FILES) in a directory ($DIR) and I want to search in those files for all lines containing: both the 2 strings (SEARCH1 and SEARCH2). Using sh (/bin/bash): I do NOT want to use the read command, so I prefer awk/grep/sed. The wanted output is the line(s) containing the 2 strings and the corresp. file name(s) of the file(s).
When I use this code, everything is ok:
FILES="news_*.txt"
DIR="/news"
awk '/Corona US/&&/Infected/{print a[FILENAME]?$0:FILENAME RS $0;a[FILENAME]++}' ${DIR}/${FILES}
Now I want to replace the 2 patterns ('Corona US' and "Infected') with variables in the awk command and I tried:
SEARCH1="Corona US"
SEARCH2="Infected"
awk -v str1="$SEARCH1" -v str2="$SEARCH2" '/str1/&&/str2/{print a[FILENAME]?$0:FILENAME RS $0;a[FILENAME]++}' ${DIR}/${FILES}
However that did not give me the right output: it came up empty (didn't find anything).

Since you have not shown sample of output so couldn't test it, based on OP's code trying to fix it.
awk -v str1="$SEARCH1" -v str2="$SEARCH2" 'index($0,str1) && index($0,str2){print (seen[FILENAME]++ ? "" : FILENAME ORS) $0;a[FILENAME]++}' ${DIR}/${FILES}
OR
awk -v str1="$SEARCH1" -v str2="$SEARCH2" '$0 ~ str1 && $0 ~ str2{print (seen[FILENAME]++ ? "" : FILENAME ORS) $0;a[FILENAME]++}' ${DIR}/${FILES}
OP's code issue: We can't search variables inside /var/ in should be used like index or $0 ~ str style.

It isn't 100% clear exactly what you are looking for, but it sounds like grep -H with an alternate pattern would allow you to output the filename and the line that matches $SEARCH1 or $SEARCH2 anywhere in the line. For example, you could do:
grep -H "$SEARCH1.*$SEARCH2\|$SEARCH2.*$SEARCH1" "$DIR/"$FILES
(note $FILES must NOT be quoted in order for * expansion to take place.)
If you just want a list of filenames that contain a match on any line, you can change -H to -l.

Extract directory path from file path

I have a requirement for getting part of the string which should be read from end of the string. Like below:
a/b/c/d.txt
Now I want to get the output as /a/b/c/ – basically the path of the file. For this, I want the string to be read from the end and where the first / appears, it prints till the first text of the string.

If you have single variable then how about parameter expansion.
Let's say we have following A variable with your provided value.
echo $A
a/b/c/d.txt
Then following could provide you path name for files using parameter expansion.
echo ${A%/*}/
a/b/c/

echo a/b/c/d.txt | awk -F/ '{$NF=""}1' OFS=/
a/b/c/

This should be done with parameter expansion like in #RavinderSingh13's very good answer or dirname as #aragaer suggests, but if you are gung-ho about an awk solution, you could do something like:
echo "a/b/c/d.txt" | awk -F"/" '{ for (f=1;f<NF;f++){printf "%s/", $f}; printf "\n"}'
But that's horrible overkill when you can just echo $(dirname "a/b/c/d.txt")/

Simple sed approach:
echo "a/b/c/d.txt" | sed 's~/[^/]*$~/~'
a/b/c/

Retain backslashes with while read loop in multiple shells

I have the following code:
#!/bin/sh
while read line; do
printf "%s\n" $line
done < input.txt
Input.txt has the following lines:
one\two
eight\nine
The output is as follows
onetwo
eightnine
The "standard" solutions to retain the slashes would be to use read -r.
However, I have the following limitations:
must run under #!/bin/shfor reasons of portability/posix compliance.
not all systems
will support the -r switch to read under /sh
The input file format cannot be changed
Therefore, I am looking for another way to retain the backslash after reading in the line. I have come up with one working solution, which is to use sed to replace the \ with some other value (e.g.||) into a temporary file (thus bypassing my last requirement above) then, after reading them in use sed again to transform it back. Like so:
#!/bin/sh
sed -e 's/[\/&]/||/g' input.txt > tempfile.txt
while read line; do
printf "%s\n" $line | sed -e 's/||/\\/g'
done < tempfile.txt
I'm thinking there has to be a more "graceful" way of doing this.
Some ideas:
1) Use command substitution to store this into a variable instead of a file. Problem - I'm not sure command substitution will be portable here either and my attempts at using a variable instead of a file were unsuccessful. Regardless, file or variable the base solution is really the same (two substitutions).
2) Use IFS somehow? I've investigated a little, but not sure that can help in this issue.
3) ???
What are some better ways to handle this given my constraints?
Thanks

Your constraints seem a little strict. Here's a piece of code I jotted down(I'm not too sure of how valuable your while loop is for the other stuffs you would like to do, so I removed it off just for ease). I don't guarantee this code to be robustness. But anyway, the logic would give you hints in the direction you may wish to proceed. (temp.dat is the input file)
#!/bin/sh
var1="$(cut -d\\ -f1 temp.dat)"
var2="$(cut -d\\ -f2 temp.dat)"
iter=1
set -- $var2
for x in $var1;do
if [ "$iter" -eq 1 ];then
echo $x "\\" $1
else
echo $x "\\" $2
fi
iter=$((iter+1))
done

As Larry Wall once said, writing a portable shell is easier than writing a portable shell script.
perl -lne 'print $_' input.txt
The simplest possible Perl script is simpler still, but I imagine you'll want to do something with $_ before printing it.

Can we use shell variables in awk?

Can we use shell variables in AWK like $VAR instead of $1, $2? For example:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'`
echo $NUSR
echo ${UL[*]}|awk -F, '{print $NUSR}'
Actually am an oracle DBA we get lot of import requests. I'm trying to automate it using the script. The script will find out the users in the dump and prompt for the users to which dump needs to be loaded.
Suppose the dumps has two users AKHIL, SWATHI (there can be may users in the dump and i want to import more number of users). I want to import the dumps to new users AKHIL_NEW and SWATHI_NEW. So the input to be read some think like AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW.
First, I need to find the Number of users to be created, then I need to get new users i.e. AKHIL_NEW,SWATHI_NEW from the input we have given. So that I can connect to the database and create the new users and then import. I'm not copying the entire code: I just copied the code from where it accepts the input users.
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW) ## it can be many users like USER1:USER1_NEW,USER2_USER2_NEW,USER3:USER_NEW..
NUSR=`echo ${UL[*]}|awk -F, '{print NF}'` #finding number of fields or users
y=1
while [ $y -le $NUSR ] ; do
USER=`echo ${UL[*]}|awk -F, -v NUSR=$y '{print $NUSR}' |awk -F: '{print $2}'` #getting Users to created AKHIL_NEW and SWATHI_NEW and passing to SQLPLUS
if [[ $USER = SCPO* ]]; then
TBS=SCPODATA
else
if [[ $USER = WWF* ]]; then
TBS=WWFDATA
else
if [[ $USER = STSC* ]]; then
TBS=SCPODATA
else
if [[ $USER = CSM* ]]; then
TBS=CSMDATA
else
if [[ $USER = TMM* ]]; then
TBS=TMDATA
else
if [[ $USER = IGP* ]]; then
TBS=IGPDATA
fi
fi
fi
fi
fi
fi
sqlplus -s '/ as sysdba' <<EOF # CREATING the USERS in the database
CREATE USER $USER IDENTIFIED BY $USER DEFAULT TABLESPACE $TBS TEMPORARY TABLESPACE TEMP QUOTA 0K on SYSTEM QUOTA UNLIMITED ON $TBS;
GRANT
CONNECT,
CREATE TABLE,
CREATE VIEW,
CREATE SYNONYM,
CREATE SEQUENCE,
CREATE DATABASE LINK,
RESOURCE,
SELECT_CATALOG_ROLE
to $USER;
EOF
y=`expr $y + 1`
done
impdp sysem/manager DIRECTORY=DATA_PUMP DUMPFILE=imp.dp logfile=impdp.log SCHEMAS=AKHIL,SWATHI REMPA_SCHEMA=${UL[*]}
In the last impdp command I need to get the original users in the dumps i.e AKHIL,SWATHI using the variables.

Yes, you can use the shell variables inside awk. There are a bunch of ways of doing it, but my favorite is to define a variable with the -v flag:
$ echo | awk -v my_var=4 '{print "My var is " my_var}'
My var is 4
Just pass the environment variable as a parameter to the -v flag. For example, if you have this variable:
$ VAR=3
$ echo $VAR
3
Use it this way:
$ echo | awk -v env_var="$VAR" '{print "The value of VAR is " env_var}'
The value of VAR is 3
Of course, you can give the same name, but the $ will not be necessary:
$ echo | awk -v VAR="$VAR" '{print "The value of VAR is " VAR}'
The value of VAR is 3
A note about the $ in awk: unlike bash, Perl, PHP etc., it is not part of the variable's name but instead an operator.

Awk and Gawk provide the ENVIRON associative array that holds all exported environment variables. So in your awk script you can use ENVIRON["VarName"] to get the value of VarName, provided that VarName has been exported before running awk.
Note ENVIRON is a predefined awk variable NOT a shell environment variable.
Since I don't have enough reputation to comment on the other answers I have to include them here!
The earlier answer showing $ENVIRON is incorrect - that syntax would be expanded by the shell, and probably result in expanding to nothing.
Further earlier comments about C not being able to access environment variable is wrong. Contrary to what is said above, C (and C++) can access environment variables using the getenv("VarName") function. Many other languages provide similar access (e.g., Java: System.getenv(), Python: os.environ, Haskell System.Environment, ...). Note in all cases access to environment variables is read-only, you cannot change an environment variable in a program and get that value back to the calling script.

There are two ways to pass variables to awk: one way is defining the variable in a command line argument:
$ echo ${UL[*]}|awk -F, -v NUSR=$NUSR '{print $NUSR}'
SWATHI:SWATHI_NEW
Another way is converting the shell variable to an environment variable using export, and reading the environment variable from the ENVIRON array:
$ export NUSR
$ echo ${UL[*]}|awk -F, '{print $ENVIRON["NUSR"]}'
SWATHI:SWATHI_NEW
Update 2016: The OP has comma-separated data and wants to extract an item given its index. The index is in the shell variable NUSR. The value of NUSR is passed to awk, and awk's dollar operator extracts the item.
Note that it would be simpler to declare UL as an array of more than one element, and do the extraction in bash, and take awk out of the equation completely. This however uses 0-based indexing.
UL=(AKHIL:AKHIL_NEW SWATHI:SWATHI_NEW)
NUSR=1
echo ${UL[NUSR]} # prints SWATHI:SWATHI_NEW

There is another way, but it could cause immense confusion:
$ VarName="howdy" ; echo | awk '{print "Just saying '$VarName'"}'
Just saying howdy
$
So you are temporarily exiting the single quote environment (which would normally prevent the shell from interpreting '$') to interpret the variable and then going back into it. It has the virtue of being relatively brief.

Not sure if i understand your question.
But lets say we got a variable number=3 and we want to use it istead of $3, in awk we can do that with the following code
results="100 Mbits/sec 110 Mbits/sec 90 Mbits/sec"
number=3
speed=$(echo $results | awk '{print '"\$${number}"'}')
so the speed variable will get the value 110.
Hope this helps.

No. You can pass the value of a shell variable to an awk script just like you can pass the value of a shell variable to a C program but you cannot access a shell variable in an awk script any more than you could access a shell variable in a C program. Like C, awk is not shell. See question 24 in the comp.unix.shell FAQ at cfajohnson.com/shell/cus-faq-2.html#Q24.
One way to write your code would be:
UL="AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW"
NUSR=$(awk -F, -v ul="$UL" 'BEGIN{print gsub(FS,""); exit}')
echo "$NUSR"
echo "$UL" | awk -F, -v nusr="$NUSR" '{print $nusr}' # could have just done print $NF
but since your original starting point:
UL=(AKHIL:AKHIL_NEW,SWATHI:SWATHI_NEW)
was declaring UL as an array with just one entry, you might want to rethink whatever it is you're trying to do as you may have completely the wrong approach.

How to quote a shell variable in a TCL-expect string

I'm using the following awk command in an expect script to get the gateway for a particular destination
route | grep $dest | awk '{print $2}'
However the expect script does not like the $2 in the above statement.
Does anyone know of an alternative to awk to perform the same function as above? ie. output 2nd column.

You can use cut:
route | grep $dest | cut -d \ -f 2
That uses spaces as the field delimiter and pulls out the second field

To answer your Expect question, single quotes have no special meaning to the Tcl parser. You need to use braces to protect the body of the awk script:
route | grep $dest | awk {{print $2}}
And as awk can do what grep does, you can get away with one less process:
route | awk -v d=$dest {$0 ~ d {print $2}}

Before switching to another utility, check if changing field separator worrks. Documentation for field separators in GNU Awk here.

SED is the best alternative to use. If you don't mind a dependency, Perl should also be sufficient to solve the task

Depending on the structure of your data, you can use either cut, or use sed to do both filtering and printing the second column.

Alternatively, you could use Perl:
perl -ne 'if(/foo/) { #_ = split(/:/); print $_[1]; }'
This will print second token of each line containing foo, with : as token separator.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

awk split question - awk

You don't need awk for that. If you always want the last dir in a path just do: #!/bin/sh cur_dir="${PWD##*/}/" echo "$cur_dir" The above has the added benefit of not creating any subshells and/or forks to external binaries. It's all native POSIX shell syntax.

You could use print a[length(a)] but it's better to avoid splitting and use custom fields separator and $NF: echo $PWD | awk -F/ '{print $NF}' But in that specific case you should rather use basename: basename "$PWD"

The other answers are better replacements to perform the function you're trying to accomplish. However, here is the specific answer to your question: package=$(echo $PWD | awk '{n = split($0,a,"/"); print a[n] }') echo "$package" split() returns the number of resulting elements.

Related

awk/grep/sed: find multiple patterns at one line in my files

Extract directory path from file path

Retain backslashes with while read loop in multiple shells

Can we use shell variables in awk?

How to quote a shell variable in a TCL-expect string

Categories

Resources