why is awk not writing variable to csv file? [duplicate] - awk

This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 1 year ago.
I have a csv file that I need to add a number of columns at the end. The new columns are variables taken from other files.
STNO=3389
STNNAME=SWORDS
awk -F "," '{ stnname='"$STNNAME"';stno='"$STNO"';print $0","stnname","stno }' infile
example of the output.
992501062,-6.278983000,202105210736,,3389
The stno is written fine but the stnname is blank. It seems like I can use numeric variables but not text.
any help appreciated.
thanks.

You are interpolating the literal symbol SWORDS where apparently you were hoping to interpolate a quoted string. Awk has no variable named SWORDS so it resolves to an empty string.
Better hygiene and stability altogether is to avoid interpolating junk right into your Awk script. Use -v on the Awk command line to pass in values.
awk -v stnname="$STNNAME" -v stno="$STNO" 'BEGIN {FS=OFS=","}
{ print $0, stnname , stno }' infile
Tangentially, avoid upper case for your private shell variables.
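To see why the original command fails, it can help to print the program text that awk actually receives once the shell has spliced the values in. The following is a minimal sketch that reuses the values and the data line from the question (upper-case names kept only to mirror it); the names program and result are purely illustrative.

```shell
STNNAME=SWORDS
STNO=3389

# The awk program text after the shell has spliced the values in.
# SWORDS arrives unquoted, so awk reads it as an (unset) awk variable.
program='{ stnname='"$STNNAME"';stno='"$STNO"';print $0","stnname","stno }'
printf '%s\n' "$program"

# Passing the values with -v keeps them as data instead of program text.
result=$(printf '%s\n' '992501062,-6.278983000,202105210736' |
  awk -v stnname="$STNNAME" -v stno="$STNO" 'BEGIN { FS = OFS = "," } { print $0, stnname, stno }')
printf '%s\n' "$result"
```

The first printf shows stnname=SWORDS with no quotes around SWORDS, which is why the column came out empty, while 3389 happens to be a valid numeric literal.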

It is very easy to get lost in a sea of quotes. Maybe pass the shell variables in with -v like this:
awk -v stnname="$STNNAME" -v stno="$STNO" -F "," -v OFS="," '{ print $0,stnname,stno }' infile
Then you can use them in the script directly without trying to piece together a string. Setting OFS makes print join the values with commas instead of the default space.

Related

how to use "," as field delimiter [duplicate]

This question already has answers here:
Escaping separator within double quotes, in awk
(3 answers)
Closed 1 year ago.
i have a file like this:
"1","ab,c","def"
Using only a comma as the field delimiter gives the wrong result, so I want to use "," (quote, comma, quote) as the field delimiter. I tried this:
awk -F "," '{print $0}' file
or like this:
awk -F "","" '{print $0}' file
or like this:
awk -F '","' '{print $0}' file
but the result is incorrect. I don't know how to include the double quotes as part of the field delimiter itself.
If you can handle GNU awk, you could use FPAT:
$ echo '"1","ab,c","def"' |        # echo outputs with double quotes
gawk '                             # use GNU awk
BEGIN {
    FPAT="([^,]*)|(\"[^\"]+\")"    # a field is either a run of non-commas or a double-quoted string
}
{
    for(i=1;i<=NF;i++)             # loop over all fields
        gsub(/^"|"$/,"",$i)        # remove leading and trailing double quotes
    print $2                       # output, for example, the second field
}'
Output:
ab,c
Note that FPAT cannot handle a record separator (RS, by default a newline) inside the quotes.
What you are attempting seems misdirected anyway. How about this instead?
awk '/^".*"$/{ sub(/^\"/, ""); sub(/\"$/, ""); gsub(/\",\"/, ",") }1'
The proper solution to handling CSV files with quoting in them is to use a language which has an actual CSV parser. My thoughts go to Python, which includes a csv module in its standard library.
In GNU AWK
{print $0}
prints the whole line. If no field has been modified, the original line is printed unchanged, so whatever field separator you set, you get the original lines back when the only action is print $0. Assign $1=$1 to force awk to rebuild the record from its fields.
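The effect of $1=$1 can be checked with a short sketch. It uses the sample line from the question and '","' as the field separator; the output separator | is an arbitrary choice made here only so that the rebuilt record is easy to spot.

```shell
line='"1","ab,c","def"'

# No field is touched: $0 is printed exactly as it was read.
unchanged=$(printf '%s\n' "$line" | awk -F '","' '{ print $0 }')

# Assigning to a field makes awk rebuild $0, joining the fields with OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F '","' -v OFS='|' '{ $1 = $1; print $0 }')

printf '%s\n' "$unchanged" "$rebuilt"
```

The rebuilt line still carries the outermost quotes, because they belong to the first and last fields rather than to a separator.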
If you must do it via FS AT ANY PRICE, then you might do it as follows: let file.txt content be
"1","ab,c","def"
then
BEGIN{FS="\x22,?\x22?"}{$1=$1;print $0}
output
1 ab,c def
Note the leading space: the line starts with a separator, so $1 is empty and ab,c is $3. Explanation: I tell GNU AWK that the field separator is a literal " (\x22; " is 22 hex in ASCII), followed by zero or one (?) comma, followed by zero or one (?) literal " (\x22). $1=$1 triggers the record rebuild mentioned earlier. Disclaimer: this solution assumes that you never have an escaped " inside your strings.
(tested in gawk 4.2.1)

How to append string with special characters to each line using AWK

I have a csv and for each line in the csv, I need to add new columns. One of the columns is a timestamp and it seems to be breaking the operation.
Example:
col1,col2,col3
1,2,3
4,5,7
After
col1,col2,col3,col4,col5,col6
1,2,3,01-01-2020 01:01:01,name,class
4,5,7,01-01-2020 01:01:01,name,class
I used How to add new column with header to csv with awk
for reference.
ORIG_FILE='sample.csv'
NEW_FILE='new_sample.csv'
values="01-01-2020 01:01:01,name,class"
awk -v d=$values -F"," 'BEGIN {OFS = ","} {printf("%s%s",$0,FNR>1?d RS:"col4,col5,col6" RS)}' $ORIG_FILE > $NEW_FILE
I am very new to using bash and I am trying to figure this out.
Any help greatly appreciated.
orig_file='sample.csv'
new_file='new_sample.csv'
values='01-01-2020 01:01:01,name,class'
awk -v d="$values" 'BEGIN{FS=OFS=","} { print $0, (NR>1 ? d : "col4,col5,col6") }' "$orig_file" > "$new_file"
Always quote your shell variables and don't use all upper case for non-exported shell variable names.
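The quoting matters because the value contains a space. A rough sketch of what goes wrong, assuming a POSIX sh or bash with the default IFS; count_args is a hypothetical helper that only reports how many arguments it received, and the input lines are the sample data from the question:

```shell
values='01-01-2020 01:01:01,name,class'

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args -v d=$values)    # the value is split at the space
quoted=$(count_args -v d="$values")    # the value stays one argument

# With the value passed intact, the new columns are appended as intended.
out=$(printf 'col1,col2,col3\n1,2,3\n4,5,7\n' |
  awk -v d="$values" 'BEGIN { FS = OFS = "," } { print $0, (NR > 1 ? d : "col4,col5,col6") }')

printf '%s\n' "$unquoted" "$quoted" "$out"
```

In the unquoted case awk receives d=01-01-2020 as the assignment and then takes 01:01:01,name,class as the program text, which is why the original command broke.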

how can I read an argument within gsub [duplicate]

This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 2 years ago.
I want to write a script which replaces 'gene' with 'quant' in the 3rd column of the file given as $1.
#!/bin/bash
awk -F "\t" '{gsub("gene","quant",$3);print}' $1
The code works well, however I would like to read "gene" as an argument, so how can I specify argument $2 instead of 'gene' in the above code?
Thanks!
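Use -v awkvar="$value" to create an awk variable with a given value. Setting OFS to a tab as well keeps the output tab-separated, because modifying $3 makes awk rebuild the line using OFS. Thus:
#!/bin/bash
awk -v orig="$2" -F '\t' -v OFS='\t' '{gsub(orig,"quant",$3);print}' "$1"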
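One thing worth knowing is that gsub treats the value of the variable as a regular expression, not as a fixed string. A small sketch follows; replace_col3 is a hypothetical wrapper and the tab-separated input lines are invented for illustration:

```shell
# Hypothetical helper: replace every match of the pattern in column 3 with "quant".
replace_col3() {
  awk -v orig="$1" -F '\t' -v OFS='\t' '{ gsub(orig, "quant", $3); print }'
}

# A plain word behaves as expected.
literal=$(printf 'chr1\tsrc\tgene\n' | replace_col3 gene)

# The value is used as a regular expression, so "." matches any character.
regex=$(printf 'chr1\tsrc\tgone\n' | replace_col3 'g.ne')

printf '%s\n' "$literal" "$regex"
```

If the argument may contain regex metacharacters that should be taken literally, they would need to be escaped before being passed in.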

Using an awk variable in the string substitution portion of gsub

I want to use a command line variable to replace text found with a regular expression.
Something like:
awk --lint=fatal -v awk_var=XYZ '{ gsub(/^ABCD=.*$/, "ABCD=<awk_var>"); print}'
Haven't been able to figure out what the awk_var syntax should be.
Since you have not shown sample input, this is based on your code and description. Try the following:
awk --lint=fatal -v awk_var=XYZ '{ gsub(/^ABCD=.*$/, "ABCD=" awk_var); print}'
Do not put double quotes around the variable name; otherwise it is treated as literal text. Writing the string constant and the variable next to each other concatenates them.
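The difference is easy to check side by side. In this sketch the input line ABCD=old is made up for illustration, and --lint=fatal is left out so that it runs with any POSIX awk:

```shell
input='ABCD=old'

# String constant followed by the variable: the two are concatenated.
concatenated=$(printf '%s\n' "$input" |
  awk -v awk_var=XYZ '{ gsub(/^ABCD=.*$/, "ABCD=" awk_var); print }')

# Variable name inside the quotes: it is just literal text.
quoted=$(printf '%s\n' "$input" |
  awk -v awk_var=XYZ '{ gsub(/^ABCD=.*$/, "ABCD=awk_var"); print }')

printf '%s\n' "$concatenated" "$quoted"
```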

Using awk to filter a CSV file with quotes in it

I have a text file with comma separated values.
A sample line can be something like
"Joga","Bonito",7,"Machine1","Admin"
The " seen are part of the text and are needed when this csv gets converted back to a java object.
I want to filter out some lines from this file based on some field in the csv.
The following statement doesn't work.
awk -F "," '($2== "Bonito") {print}' filename.csv
I am guessing that this has something to do with the " appearing in the text.
I saw an example like:
awk -F "\"*,\"*"
I am not sure how this works. It looks like a regex, but the use of the last * flummoxed me.
Is there a better option than the last awk statement I wrote?
How does it work?
Since some fields have double quotes and others do not, you can filter on a quoted field by including the quotes in the comparison:
awk -F, '$2 == "\"Bonito\""' filename.csv
To filter on a field that does not have double quotes, just do:
awk -F, '$3 == 7' filename.csv
Another way is to include the double quote in the field separator regex (the ? makes the double quote optional):
awk -F '"?,"?' '$2 == "Bonito"' filename.csv
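To see how such a separator carves up a line, the fields can be printed in brackets. This is only a sketch on the sample line from the question; show_fields is a hypothetical helper. The "\"*,\"*" separator from the question works on the same idea, with * meaning zero or more quotes instead of at most one.

```shell
sample='"Joga","Bonito",7,"Machine1","Admin"'

# Hypothetical helper: print each field of the input wrapped in brackets,
# splitting on the regular expression given as the first argument.
show_fields() {
  awk -F "$1" '{ for (i = 1; i <= NF; i++) printf "[%s]", $i; print "" }'
}

optional=$(printf '%s\n' "$sample" | show_fields '"?,"?')
repeated=$(printf '%s\n' "$sample" | show_fields '"*,"*')

printf '%s\n' "$optional" "$repeated"
```

The quotes next to a comma are consumed as part of the separator, but the very first and very last quote stay attached to $1 and $NF, since no comma is adjacent to them.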
But this has a drawback of also matching the following line:
"Joga",Bonito",7,"Machine1","Admin"
First, a slightly more thorough test file:
$ cat file
"Joga","Bonito",7,"Machine1","Admin"
"Joga",Bonito,7,"Machine1","Admin"
Using the regex ^\"? (i.e. the field starts with or without a double quote):
$ awk -F, '$2~/^\"?Bonito\"?$/' file
"Joga","Bonito",7,"Machine1","Admin"
"Joga",Bonito,7,"Machine1","Admin"