Remove whitespace from field in awk - awk

I have a csv that I'm translating using an awk script.
One of the fields contains whitespace that needs to be removed; however, the function I am currently using is not working.
grnd_tack_number = gsub(/ /, "", $13)
input = 487 060210996314
desired output= 487060210996314
Current output = 5

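gsub() modifies its third argument in place and returns the number of substitutions made, not the modified string, so assigning its return value gives you a count. I suggest debugging with this code sample, which copies the field first and then applies gsub to the copy: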
{
grnd_tack_number = $13;
print grnd_tack_number;
gsub(/ /, "", grnd_tack_number);
print grnd_tack_number;
}
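To make the difference between the return value and the modified variable visible, here is a minimal sketch run from the shell. The helper name strip_field_spaces, its two parameters, and the sample CSV line are made up for illustration; only gsub itself comes from the question.

```shell
# Hypothetical helper: strip spaces from one field of a CSV line.
# $1 = the CSV line, $2 = the field number (both illustrative parameters).
strip_field_spaces() {
    printf '%s\n' "$1" | awk -F, -v n="$2" '{
        value = $n                      # work on a copy of the field
        count = gsub(/ /, "", value)    # gsub returns how many replacements it made
        print count ": " value          # the cleaned text lives in value, not in count
    }'
}

strip_field_spaces 'a,487 060210996314,b' 2
# prints: 1: 487060210996314
```

The count goes into one variable and the cleaned string stays in the other, which is why assigning the result of gsub directly to grnd_tack_number yields a number.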

Related

Can't replace string to multi-lined string with sed

I'm trying to replace a fixed parse ("replaceMe") in a text with multi-lined text with sed.
My bash script goes as follows:
content=$(awk '{print $5}' < data.txt | sort | uniq)
target=$(cat install.sh)
text=$(sed "s/replaceMe/$content/" <<< "$target")
echo "${text}"
If content contains only one line, the replacement works, but if it contains several lines I get:
sed:... unterminated `s' command
I read about "fetching" multi-line content, but I couldn't find anything about inserting a multi-line string.
You'll have more problems than that depending on the contents of data.txt since sed doesn't understand literal strings (see Is it possible to escape regex metacharacters reliably with sed). Just use awk which does:
text="$( awk -v old='replaceMe' '
NR==FNR {
if ( !seen[$5]++ ) {
new = (NR>1 ? new ORS : "") $5
}
next
}
s = index($0,old) { $0 = substr($0,1,s-1) new substr($0,s+length(old)) }
{ print }
' data.txt install.sh )"
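As a rough check of how that script behaves, the sketch below runs the same awk program against two small throwaway files. The function name replace_literal and the sample file contents are assumptions made for the demonstration. Note that this version keeps the fifth-column values in first-seen order rather than sorted.

```shell
# Hypothetical wrapper around the awk program above.
# $1 = file whose 5th column supplies the replacement, $2 = file to rewrite.
replace_literal() {
    awk -v old='replaceMe' '
    NR==FNR {                                # first file: collect unique 5th fields
        if ( !seen[$5]++ ) {
            new = (NR>1 ? new ORS : "") $5   # join them with newlines
        }
        next
    }
    s = index($0,old) {                      # literal match, no regex involved
        $0 = substr($0,1,s-1) new substr($0,s+length(old))
    }
    { print }
    ' "$1" "$2"
}

# Made-up sample data for the demonstration.
tmp=$(mktemp -d)
printf 'a b c d one\na b c d two\na b c d one\n' > "$tmp/data.txt"
printf 'before\nrun replaceMe now\nafter\n'     > "$tmp/install.sh"
replace_literal "$tmp/data.txt" "$tmp/install.sh"
rm -rf "$tmp"
```

Because index() and substr() treat both strings literally, characters such as / or & in the data do not need escaping.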

Awk replace nth Character with blank value

I have the file below with hundreds of entries, and I want to replace the 46th |-separated field (N) with a blank using an awk command on a Unix box. Does anyone know the best way to do this?
TESTENTRY1||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N|N|N|N|N
TESTENTRY2||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N|N|N|N|N
So it looks like the below:
TESTENTRY1||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
TESTENTRY2||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
$ awk 'BEGIN { FS=OFS="|" } { $46 = "" }1' nnn.txt
TESTENTRY1||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
TESTENTRY2||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
BEGIN { FS=OFS="|" } sets the input and output field separators to the vertical bar before the records are read.
{ $46 = "" } sets the 46th column to be empty in each record.
The trailing 1 prints the resulting record to the output.
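The same idea on a shorter record, as a small sketch: the function name blank_field and its field-number parameter are invented here so the field to clear can be passed in rather than hard-coded.

```shell
# Hypothetical helper: blank out field number $1 in |-separated records on stdin.
blank_field() {
    awk -v n="$1" 'BEGIN { FS = OFS = "|" } { $n = "" } 1'
}

printf 'A|N|Y|N\n' | blank_field 3
# prints: A|N||N
```

Assigning to a field makes awk rebuild the record with OFS, which is why OFS has to be set to "|" as well as FS.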

How to remove space and the specific character in string - awk

Below is a input.
!{ID=34, ID2=35}
>
!{ID=99, ID2=23}
>
!{ID=18, ID2=87}
<
I am trying to produce the result below. That is, I want to remove the space, '{' and '}' characters and check whether the next line is '>' or '<'.
In fact, the input above repeats. I also need to parse the '>' and '<' characters so that I can put the parsed string (YES or NO) into a database.
ID=34,ID=35#YES#NO
ID=99,ID=23#YES#NO
ID=18,ID=87#NO#YES
So, with 'sub' function I thought I can replace the space with blank but the result shows:
1#YES#NO
Can you let me know what is wrong?
If possible, teach me how to remove '{' and '}' as well.
Appreciated if you could show me the awk file version instead of one-liner.
BEGIN {
VALUES = ""
L_EXIST = "NO"
R_EXIST = "NO"
}
/!/ { VALUES = gsub(" ", "", $0);
getline;
if ($1 == ">") L_EXIST = "YES";
else if ($1 == "<") R_EXIST = "YES";
print VALUES"#"L_EXIST"#"R_EXIST
}
END {
}
Given your sample input:
$ cat file
!{ID=34, ID2=35}
>
!{ID=99, ID2=23}
>
!{ID=18, ID2=87}
<
This script produces the desired output:
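BEGIN { FS="[}{=, \n]+"; RS="!" }
NR > 1 { printf "ID=%d,ID=%d#%s\n", $3, $5, ($6==">"?"YES#NO":"NO#YES") }
The Field Separator is set to consume the spaces, newlines and other characters between the parts of the record that you're interested in. With RS set to something other than a newline, newlines inside a record are ordinary characters, so \n has to be listed in FS for > or < to become a field of its own. The Record Separator is set to !, so that each pair of lines is treated as a single record.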
The first record is empty (the start of the first line, up to the first !), so we only process the ones after that. The output is constructed using printf, with a ternary to determine the last part (I assume that there are only two options, > or <).
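For comparison, here is a line-oriented sketch that stays closer to the asker's original attempt: it copies the line, strips the unwanted characters from the copy with gsub, and prints once the marker line arrives. The function name parse_pairs is made up, and unlike the printf version this one keeps the ID2 label as it appears in the input.

```shell
# Hypothetical helper: reads the "!{...}" / marker line pairs on stdin.
parse_pairs() {
    awk '
    /^!/ {
        values = $0
        gsub(/[!{} ]/, "", values)   # remove "!", both braces and spaces from the copy
        next                         # wait for the marker line
    }
    $1 == ">" { print values "#YES#NO" }
    $1 == "<" { print values "#NO#YES" }
    '
}

printf '!{ID=34, ID2=35}\n>\n!{ID=18, ID2=87}\n<\n' | parse_pairs
# prints:
# ID=34,ID2=35#YES#NO
# ID=18,ID2=87#NO#YES
```

The body between the single quotes can be saved to a file and run with awk -f if a script file is preferred over a one-liner.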
Let's say you have this input:
input.txt
!{ID=34, ID2=35}
!{ID=36, ID2=37}
>
You can use the following awk command
awk -F'[!{}, ]' 'NR>1{yn="NO";if($1==">")yn="YES";print l"#"yn}{l=$3","$5}' input.txt
to produce this output:
ID=34,ID2=35#NO
ID=36,ID2=37#YES

Removing Quote From Field For Filename Using AWK

I've been playing around with this for an hour trying to work out how to embed the removal of quotes from a specific field using AWK.
Basically, the file encapsulates text in quotes, but I want to use the second field to name the file and split them based on the first field.
ID,Name,Value1,Value2,Value3
1,"AAA","DEF",1,2
1,"AAA","GGG",7,9
2,"BBB","DEF",1,2
2,"BBB","DEF",9,0
3,"CCC","AAA",1,1
What I want to get out are three files, all with the header row named:
AAA [1].csv
BBB [2].csv
CCC [3].csv
I have got it all working, except for the fact that I can't for the life of me work out how to remove the quotes around the filename!!
So, this command does everything, except that the file is named with quotes around $2; I need to do some kind of transformation on $2 before it goes into evname. In the actual file, I want to keep the encapsulating quotes.
awk -F, 'NR==1{h=$0;next}!($1 in files){evname=$2" ["$1"].csv";files[$1]=1;print h>evname}{print > evname}' DataExtract.csv
I've tried to push a gsub into this, but I'm struggling to work out exactly how this should look.
This is I think as close as I have got, but it is just calling everything "2" for $2, I'm not sure if this means I need to do an escape of $2 somehow in the gsub, but trying that doesn't seem to be working, so I'm at a loss as to what I'm doing wrong.
awk -F, 'NR==1{h=$0;next}!($1 in files){evname=gsub(""\","", $2)" - Event ID ["$1"].csv";files[$1]=1;print h>evname}{print > evname}' DataExtract.csv
Any help greatly appreciated.
Thanks in advance!!
Gannon
If I understand what you are attempting correctly, then
awk -F, 'NR==1{h=$0;next}!($1 in files){stub=$2; gsub(/"/, "", stub); evname=stub" ["$1"].csv";files[$1]=1;print h>evname}{print > evname}' DataExtract.csv
should work. That is
NR == 1 {
h = $0;
next
}
!($1 in files) {
stub = $2 # <-- this is the new bit: make a working copy
# of $2 (so that $2 is unchanged and the line
# is not rebuilt with changes for printing),
gsub(/"/, "", stub) # remove the quotes from it, and
evname = stub " [" $1 "].csv" # use it to assemble the filename.
files[$1] = 1;
print h > evname
}
{
print > evname
}
You can, of course, use
evname = stub " - Event ID [" $1 "].csv"
or any other format after the substitution (this one seems to be what you tried to get in your second code snippet).
The gsub function returns the number of substitutions made, not the result of the substitutions; that is why evname=gsub(""\","", $2)" - Event ID ["$1"].csv" does not work.
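The reason for working on a copy rather than on $2 itself can be seen in a short sketch. The function name show_rebuild is hypothetical; the sample row is taken from the question.

```shell
# Hypothetical demo: compare gsub on a copy with gsub on the field itself.
show_rebuild() {
    awk -F, '{
        copy = $2
        gsub(/"/, "", copy)     # only the copy changes, $0 is untouched
        print copy ": " $0
        gsub(/"/, "", $2)       # changing a field rebuilds $0 with OFS (a space by default)
        print $2 ": " $0
    }'
}

printf '1,"AAA","DEF",1,2\n' | show_rebuild
# prints:
# AAA: 1,"AAA","DEF",1,2
# AAA: 1 AAA "DEF" 1 2
```

After the second gsub the record has lost its commas as well as the quotes around the second field, which would end up in the output files.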
Things are always clearer with a little white space:
awk -F, '
NR==1 { hdr=$0; next }
!seen[$1]++ {
evname = $2
gsub(/"/,"",evname)
outfile = evname " [" $1 "].csv"
print hdr > outfile
}
{ print > outfile }
' DataExtract.csv
Aside: It's pretty unusual for someone to WANT to create files with spaces in their names given the complexity that introduces in any later scripts you write to process them. You sure you want to do that?
P.S. here's the gawk version as suggested by @JID below
awk -F, '
NR==1 { hdr=$0; next }
!seen[$1]++ {
outfile = gensub(/"/,"","g",$2) " [" $1 "].csv"
print hdr > outfile
}
{ print > outfile }
' DataExtract.csv
Apply the gsub before you make the assignment:
awk -F, 'NR==1{h=$0;next}
!($1 in files){
gsub("\"","",$2); # Add this line
evname=$2" ["$1"].csv";files[$1]=1;print...

Need help in understanding the code below awk(&&&&) code:

#!/bin/awk -f
{
if (length($0) < 80)
{
prefix = "";
for (i = 1;i<(80-length($0))/2;i++)
prefix = prefix " ";
print prefix $0;
}
else
{
print;
}
}
Could any one please tell me what exactly the prefix variable is doing in the above code.
This centres the incoming text on an 80-column line.
Read the line.
Initialise the variable prefix to an empty string.
Use the for loop to append one space to prefix per iteration, until it holds roughly (80 - length of the line) / 2 spaces.
Print prefix followed by the line.
Note: in awk, $0 is the complete input line, e.g. "I want to test this", whereas $1 is "I" and $2 is "want". In the shell, by contrast, $0 expands to the name of the shell or script that is running.
It's adding front padding to center the string on the line if it's shorter than the line length but you can do the same thing with just:
awk '{ printf "%*s\n",(80+length($0))/2, $0 }' file
Each pass through the loop appends one space to prefix, so the line is printed with leading spaces according to the formula.
echo "test" | awk -f script
test
It builds a string of spaces (the left padding) whose length is about (80 - length of the line) / 2.
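To see how much padding each approach produces, the sketch below wraps the loop from the question and the printf one-liner from the answer in two functions and measures the output length. The function names center_loop and center_printf are made up for this comparison, and it assumes an awk that supports the %*s width form.

```shell
# Hypothetical wrappers around the two centring approaches shown above.
center_loop() {
    awk '{
        prefix = ""
        for (i = 1; i < (80 - length($0)) / 2; i++)
            prefix = prefix " "          # one space per iteration
        print prefix $0
    }'
}

center_printf() {
    awk '{ printf "%*s\n", (80 + length($0)) / 2, $0 }'   # right-align in a computed width
}

printf 'test\n' | center_loop   | awk '{ print length($0) }'   # prints: 41
printf 'test\n' | center_printf | awk '{ print length($0) }'   # prints: 42
```

For the four-character input the loop adds 37 spaces because i starts at 1, while printf pads to a width of 42, which is 38 spaces, so the two results are one column apart here.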