How to specify case insensitivity in AWK within for loop? [duplicate] - awk

This question already has answers here:
ignorecase in AWK
(4 answers)
Closed 8 years ago.
I have an awk line :
awk -F'/|//' '{for(i=1;i<=NF;i++) if($i=="CUST")break;print $(i)}'
I want the CUST to be case insensitive and I am using ($i==CUST) because the file contains words like CUSTOMER, which should not get matched.
I tried using Character class like: if($i=="[Cc][Uu][Ss][Tt]") but that throws an error.

Your mistake is using ==, which does string comparison; the regular expression match operator is ~. The regular expression itself should be written as /^[Cc][Uu][Ss][Tt]$/ (the anchors ^ and $ stop it from overmatching, for example on CUSTOMER):
awk -F'/|//' '{for (i=1;i<=NF;i++) if ($i ~ /^[Cc][Uu][Ss][Tt]$/) break; print $i}'
A better approach would be to use the IGNORECASE variable (GNU awk only) or the tolower and toupper functions.

Use awk's toupper($i) or tolower($i), like this:
awk -F'/|//' '{for (i=1;i<=NF;i++) if (tolower($i) == "cust")break; print $i}'
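To see the tolower approach on concrete input, here is a minimal sketch. The function name find_cust and the sample lines are made up for illustration, and the separator is simplified to a single / to keep the demo short.

```shell
# Print the first field that equals "cust" in any letter case.
# find_cust is a hypothetical helper name; input is read from stdin.
find_cust() {
    awk -F'/' '{
        for (i = 1; i <= NF; i++)
            if (tolower($i) == "cust") break
        # When nothing matched, i is NF+1 and $i is empty.
        print $i
    }'
}

printf '%s\n' 'a/Cust/b' 'x/CUSTOMER/y' 'p/q/cust' | find_cust
```

The first and third lines print Cust and cust. The second prints an empty line, because CUSTOMER is compared as a whole string and is not equal to cust.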

Related

how to use "," as field delimiter [duplicate]

This question already has answers here:
Escaping separator within double quotes, in awk
(3 answers)
Closed 1 year ago.
I have a file like this:
"1","ab,c","def"
Using only a comma as the field delimiter gives the wrong result, so I want to use "," (quotes included) as the field delimiter. I tried it like this:
awk -F "," '{print $0}' file
or like this:
awk -F "","" '{print $0}' file
or like this:
awk -F '","' '{print $0}' file
but the result is incorrect. I don't know how to include the double quotes as part of the field delimiter itself.
If you can handle GNU awk, you could use FPAT:
$ echo '"1","ab,c","def"' | # echo outputs with double quotes
gawk ' # use GNU awk
BEGIN {
FPAT="([^,]*)|(\"[^\"]+\")" # a field is either comma-free text or a double-quoted string
}
{
for(i=1;i<=NF;i++) # loop all fields
gsub(/^"|"$/,"",$i) # remove leading and trailing double quotes
print $2 # output for example the second field
}'
Output:
ab,c
Note that FPAT cannot handle a record separator (a newline, by default) inside the quotes.
What you are attempting seems misdirected anyway. How about this instead?
awk '/^".*"$/{ sub(/^\"/, ""); sub(/\"$/, ""); gsub(/\",\"/, ",") }1'
The proper solution to handling CSV files with quoting in them is to use a language which has an actual CSV parser. My thoughts go to Python, which includes a csv module in its standard library.
In GNU AWK
{print $0}
prints the whole line. If no field has been changed, the original line is printed as it was read, so whatever field separator you set, you get the original lines back when the only action is print $0. Use $1=$1 to force the record to be rebuilt.
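The rebuild behaviour is easy to check with a short sketch. The function name rebuild_demo, the sample record and the | output separator are chosen only for illustration.

```shell
# Show that $0 keeps its original text until a field is assigned.
# rebuild_demo is a hypothetical helper name.
rebuild_demo() {
    printf 'a,b,c\n' | awk -F',' -v OFS='|' '{
        print $0    # untouched record: printed exactly as read
        $1 = $1     # assigning to a field rebuilds $0 using OFS
        print $0
    }'
}

rebuild_demo
```

The first line of output still has commas; only the second, printed after the assignment, uses the output separator.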
If you must do it via FS AT ANY PRICE, then you might do it as follows: let file.txt content be
"1","ab,c","def"
then
BEGIN{FS="\x22,?\x22?"}{$1=$1;print $0}
output
1 ab,c def
Note the leading space (ab,c is $3): the opening " is itself a separator, so $1 is empty. Explanation: I inform GNU AWK that the field separator is a literal " (\x22; " is 22 hex in ASCII) followed by zero or one (?) , followed by zero or one (?) literal " (\x22). $1=$1 triggers the line rebuild mentioned earlier. Disclaimer: this solution assumes that you never have an escaped " inside your strings.
(tested in gawk 4.2.1)

why is awk not writing variable to csv file? [duplicate]

This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 1 year ago.
I have a csv file that I need to add a number of columns at the end. The new columns are variables taken from other files.
STNO=3389
STNNAME=SWORDS
awk -F "," '{ stnname='"$STNNAME"';stno='"$STNO"';print $0","stnname","stno }' infile
example of the output.
992501062,-6.278983000,202105210736,,3389
The stno is written fine but the stnname is blank. It seems like I can use numeric variables but not text.
any help appreciated.
thanks.
You are interpolating the literal symbol SWORDS where apparently you were hoping to interpolate a quoted string. Awk has no variable named SWORDS so it resolves to an empty string.
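One way to see this is to print the program text exactly as awk receives it, after the shell has joined the quoted pieces. This is only a sketch; the function name expanded_program is made up, and the variable values are the ones from the question.

```shell
# Print the awk program text that results from the quoting in the question.
# expanded_program is a hypothetical helper name.
expanded_program() {
    STNNAME=SWORDS
    STNO=3389
    printf '%s\n' '{ stnname='"$STNNAME"';stno='"$STNO"';print $0","stnname","stno }'
}

expanded_program
```

The printed program contains stnname=SWORDS with no quotes around SWORDS, so awk reads it as a reference to an unset variable, while stno=3389 is a valid numeric literal.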
Better hygiene and stability altogether is to avoid interpolating junk right into your Awk script. Use -v on the Awk command line to pass in values.
awk -v stnname="$STNNAME" -v stno="$STNO" 'BEGIN {FS=OFS=","}
{ print $0, stnname , stno }' infile
Tangentially, avoid upper case for your private shell variables.
It is very easy to get lost in a sea of quotes. Instead, pass the shell variables in with -v, like this:
awk -v stnname="$STNNAME" -v stno="$STNO" -F "," '{ print $0,stnname,stno }' infile
Then you can use them in the program directly, without trying to piece together a string.

how can I read an argument within gsub [duplicate]

This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 2 years ago.
I want to write a script which replaces the 'gene' feature in the 3rd column of the file given as $1 with 'quant'.
#!/bin/bash
awk -F "\t" '{gsub("gene","quant",$3);print}' $1
The code works well, however I would like to read "gene" as an argument, so how can I specify argument $2 instead of 'gene' in the above code?
Thanks!
Use -v awkvar="$value" to create an awk variable with a given value. Thus:
#!/bin/bash
awk -v orig="$2" -F '\t' '{gsub(orig,"quant",$3);print}' "$1"
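Two details are worth keeping in mind with this pattern: gsub treats its first argument as a regular expression, and modifying $3 makes awk rebuild the line with OFS, which is a single space by default. The sketch below sets OFS to a tab so the output stays tab-separated. The function name replace_col3 and the sample line are made up for illustration.

```shell
# Replace a pattern in the 3rd tab-separated column with "quant".
# replace_col3 is a hypothetical helper; $1 is the pattern, input comes from stdin.
replace_col3() {
    awk -v orig="$1" 'BEGIN { FS = OFS = "\t" } { gsub(orig, "quant", $3); print }'
}

printf 'chr1\tsrc\tgene\t100\n' | replace_col3 gene
```

Without the OFS setting, every line where a substitution happens would come out space-separated.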

using a wildcard in awk

Using awk, I want to print all lines that have a string in the first column that starts with 22_
I tried the following, but obviously * does not work as a wildcard in awk:
awk '$1=="22_*" {print $0}' input > output
Is this possible in awk?
Let's start with a test file:
$ cat >file
22_something keep
23_other omit
To keep only lines that start with 22_:
$ awk '/^22_/' file
22_something keep
Alternatively, if you prefer to reference the first field explicitly, we could use:
$ awk '$1 ~ /^22_/' file
22_something keep
Note that we don't have to write {print $0} after the condition because that is exactly the default action that awk associates with a condition.
At the start of a regular expression, ^ matches the beginning of the string being tested. Thus, if you want 22_ to occur at the start of a line or the start of a field, you want to write ^22_.
In the condition $1 ~ /^22_/, note that the operator is ~. That operator tells awk to check if the preceding string, $1, matches the regular expression ^22_.
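The difference between ~ and == can be seen side by side in a small sketch. The function names and the third sample line, whose first field is literally 22_*, are made up for illustration.

```shell
# Regex match on the first field versus literal string comparison.
# match_regex, match_string and sample are hypothetical helper names.
match_regex()  { awk '$1 ~ /^22_/'; }
match_string() { awk '$1 == "22_*"'; }
sample() { printf '%s\n' '22_something keep' '23_other omit' '22_* literal'; }

sample | match_regex
sample | match_string
```

The regex version keeps both lines whose first field starts with 22_, while the == version keeps only the line whose first field is exactly the five characters 22_*.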
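The chosen answer does not show how to use a wildcard in awk. The regular expression counterpart of the shell's * is .*, and it has to be used with the match operator ~ rather than ==, which only compares literal strings:
awk '$1 ~ /^22_.*/ {print $0}' input > output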

simple value with variable replacement in awk doesn't work [duplicate]

This question already has answers here:
Awk with a variable
(2 answers)
Closed 6 years ago.
Having this line
Doc=$(awk '/1516001/ { print substr($0,15,11) }' /home/data.txt)
I want to replace the 1516001 with a variable.
for example:
Var='1516001'
Doc=$(awk '/$Var/ { print substr($0,15,11) }' /home/data.txt)
But it doesn't work
@Pedro, the shell does not expand its variables inside the single-quoted awk program, so we have to assign the shell variable's value to an awk variable and then use that.
Doc=$(awk -vvar="$Var" '{if($0 ~ var){print substr($0,15,11)} }' /home/data.txt)
Let me know if this helps you.
You can use awk -v:
Var='1516001'
Doc=$(awk -v pat="$Var" ' $0~pat{ print substr($0,15,11) }' /home/data.txt)
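With $0 ~ pat the variable is used as a regular expression, so characters such as . in the value act as metacharacters. If the value should be matched literally, index() is one alternative. The sketch below shows both; the function names and the sample data are made up for illustration.

```shell
# Match a shell-supplied value as a regex or as a literal substring.
# extract_doc and extract_doc_literal are hypothetical helper names;
# $1 is the value to look for and input comes from stdin.
extract_doc() {
    awk -v pat="$1" '$0 ~ pat { print substr($0, 15, 11) }'
}
extract_doc_literal() {
    awk -v pat="$1" 'index($0, pat) { print substr($0, 15, 11) }'
}

data='1516001 abcdefDOC-0000042 tail
1616001 abcdefDOC-0000099 tail'

printf '%s\n' "$data" | extract_doc 1516001
printf '%s\n' "$data" | extract_doc '1.16001'
printf '%s\n' "$data" | extract_doc_literal '1.16001'
```

The first call prints one document id, the second prints both because . matches any character, and the third prints nothing because neither line contains the literal text 1.16001.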