I have the file below with hundreds of entries. In each line I want to replace the 46th field (N) with a blank, using an awk command on a Unix box. Does anyone know the best way to do this?
TESTENTRY1||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N|N|N|N|N
TESTENTRY2||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N|N|N|N|N
The result should look like this:
TESTENTRY1||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
TESTENTRY2||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
$ awk 'BEGIN { FS=OFS="|" } { $46 = "" }1' nnn.txt
TESTENTRY1||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
TESTENTRY2||||||N|Y|N|OFF||N||||N|L|N|0|N|0|N|N||||A|0||0||N|N|N|Y|N||0|N|N||0|||N||N|N|N
BEGIN { FS=OFS="|" } sets the input and output field separators to the vertical bar before the records are read.
{ $46 = "" } sets the 46th column to be empty in each record.
The trailing 1 is an always-true pattern with no action, so awk applies its default action and prints the resulting record to the output.
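If the column number may vary, the same idea can be parameterised. The following is only a minimal sketch: the function name blank_field and the awk variable col are illustrative and not part of the answer above. Note that if col is larger than the number of fields, awk extends the record with empty fields.

```shell
# blank_field COL [FILE...]: empty field number COL of every pipe-delimited record.
# The function name and the "col" variable are hypothetical, chosen for this sketch.
blank_field() {
    col=$1
    shift
    # With no FILE arguments awk reads standard input.
    awk -v col="$col" 'BEGIN { FS = OFS = "|" } { $col = "" } 1' "$@"
}

# Example: blank the 3rd field of a short record read from standard input.
printf 'a|b|c|d\n' | blank_field 3
```

For the file in the question this would be called as blank_field 46 nnn.txt.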
I am new to awk, so please help me with your suggestions. I tried the commands below to find the maximum value and to skip the first and last lines of the sample text file. Each works when I run it on its own.
My query:
I need to ignore the last line and the first few lines of the file, and then take the maximum value of field 7 using awk.
I also need to ignore the lines that contain alphabetic characters. Can anyone suggest how to combine both commands and get the required output?
Sample file:
Linux 3.10.0-957.5.1.el7.x86_64 (j051s784) 11/24/2020 _x86_64_ (8 CPU)
12:00:02 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
12:10:01 AM 4430568 61359128 93.27 1271144 27094976 66771548 33.04 39005492 16343196 1348
12:20:01 AM 4423380 61366316 93.28 1271416 27102292 66769396 33.04 39012312 16344668 1152
12:30:04 AM 4406324 61383372 93.30 1271700 27108332 66821724 33.06 39028320 16343668 2084
12:40:01 AM 4404100 61385596 93.31 1271940 27107724 66799412 33.05 39031244 16344532 1044
06:30:04 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
07:20:01 PM 3754904 62034792 94.29 1306112 27555948 66658632 32.98 39532204 16476848 2156
Average: 4013043 61776653 93.90 1293268 27368986 66755606 33.03 39329729 16427160 2005
Commands used:
cat testfile | awk '{print $7}' | head -n -1 | tail -n+7
awk 'BEGIN{a= 0}{if ($7>0+a) a=$7} END{print a}' testfile
Expected output:
The maximum value of column 7, excluding any line that contains alphabetic characters.
1st solution (generic): Pass the name of the field whose maximum you want in an awk variable. The script looks up that name in the very first line to find its field number and works from there. This assumes that the first line of the input is the header containing the field name. Lines without an AM/PM marker in field 2 (such as the Average: line) are shifted by one column, so the script reads the previous field there.
awk -v var="kbcached" '
FNR==1{
  for(i=1;i<=NF;i++){
    if($i==var){ field=i }
  }
  next
}
/kbmemused/{
  next
}
{
  if($2!~/^[AP]M$/){
    val=$(field-1)
  }
  else{
    val=$field
  }
}
{
  max=(max>val?max:val)
  val=""
}
END{
  print "Maximum value is:" max
}
' Input_file
2nd solution (based on the shown samples only): Could you please try the following. I am assuming you want the value of the kbcached column.
awk '
/kbmemfree/{
  next
}
{
  if($2!~/^[AP]M$/){
    val=$6
  }
  else{
    val=$7
  }
}
{
  max=(max>val?max:val)
  val=""
}
END{
  print "Maximum value is:" max
}
' Input_file
awk '$7 ~ /^[[:digit:]]+$/ && $1 != "Average:" {
    max[$7]=""
}
END {
    PROCINFO["sorted_in"]="@ind_num_asc";
    for (i in max) {
        maxtot=i
    }
    print maxtot
}' file
One liner:
awk '$7 ~ /^[[:digit:]]+$/ && $1 != "Average:" { max[$7]="" } END { PROCINFO["sorted_in"]="@ind_num_asc";for (i in max) { maxtot=i } print maxtot }' file
This uses GNU awk. It selects the lines where field 7 consists only of digits and field 1 is not "Average:", and for each of them creates an array entry with field 7 as the index. In the END block the array is traversed in ascending numeric index order, setting maxtot on every iteration, so after the loop maxtot holds the last (highest) kbcached value, which is then printed.
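Where GNU awk is not available, the same filter can be combined with a running maximum instead of a sorted array. This is a sketch under that assumption, not the method of the answer above; the function name max_col7 and the variable seen are illustrative.

```shell
# max_col7 [FILE...]: largest all-digit value in field 7, skipping the "Average:" line.
# Portable sketch; the function name is hypothetical.
max_col7() {
    awk '$1 != "Average:" && $7 ~ /^[0-9]+$/ {
             # "seen" guards the first comparison so the initial value need not be guessed
             if (!seen || $7 + 0 > max) { max = $7 + 0; seen = 1 }
         }
         END { if (seen) print max }' "$@"
}

# Example with three lines taken from the sample file; prints 27555948.
printf '%s\n' \
  '12:10:01 AM 4430568 61359128 93.27 1271144 27094976 66771548 33.04 39005492 16343196 1348' \
  '07:20:01 PM 3754904 62034792 94.29 1306112 27555948 66658632 32.98 39532204 16476848 2156' \
  'Average: 4013043 61776653 93.90 1293268 27368986 66755606 33.03 39329729 16427160 2005' |
  max_col7
```

Header lines are skipped automatically because their seventh field (kbcached) is not numeric.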
I have a comma-separated file. I would like to print every alternate column on a new row.
Example input file:
Name : John, Age : 30, DOB : 30-Oct-2018
Example output:
Name,Age,DOB
John,30,30-Oct-2018
non-awk solution
$ sed 's/[,:]/\n/g;s/ //g' file | pr -3ts,
Name,Age,DOB
John,30,30-Oct-2018
awk 'BEGIN{FS="[[:blank:]]*[:,][[:blank:]]*"}
{ for(i=1;i<=NF;i+=2) printf "%s%s", (i==1?"":","), $i; print "" }
{ for(i=2;i<=NF;i+=2) printf "%s%s", (i==2?"":","), $i; print "" }' inputfile
Based on your example and output:
$ awk -F', ' '/ : / { for (i=1;i<=NF;i++) { if ( match($i,/ : /) ) { linekeys=linekeys substr($i,1,RSTART-1) ","; linevalues=linevalues substr($i,RSTART+RLENGTH) ","; } } print(substr(linekeys,1,length(linekeys)-1)); print(substr(linevalues,1,length(linevalues)-1)); linekeys=""; linevalues=""; }' file.txt
Name,Age,DOB
John,30,30-Oct-2018
Here is a general idea you could use to implement a solution with awk's split function.
Split the entire line into an array rows using the row delimiter (", "), and save the number of rows.
Split each row into an array cols using the column delimiter (" : "), and save the number of columns. Iterate over the column values and store them in a table indexed by row and column, e.g. data[row","col].
Finally, iterate first over the columns and then over the rows, printing data[row","col].
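The steps above could be sketched roughly as follows. The function name kv_to_rows is illustrative, the table is indexed with awk's built-in data[r, c] form rather than an explicit "," string, and the sketch assumes that values never contain a colon.

```shell
# kv_to_rows [FILE...]: turn "key : value, key : value" lines into a keys row and a values row.
# Hypothetical helper illustrating the split()-based idea; not taken from the thread.
kv_to_rows() {
    awk '{
        nrows = split($0, rows, /, */)              # "Name : John", "Age : 30", ...
        for (r = 1; r <= nrows; r++) {
            ncols = split(rows[r], cols, / *: */)   # cols[1] = key, cols[2] = value
            for (c = 1; c <= ncols; c++)
                data[r, c] = cols[c]
        }
        for (c = 1; c <= ncols; c++) {              # columns first, then rows
            line = ""
            for (r = 1; r <= nrows; r++)
                line = line (r > 1 ? "," : "") data[r, c]
            print line
        }
    }' "$@"
}

printf 'Name : John, Age : 30, DOB : 30-Oct-2018\n' | kv_to_rows
```

For the example input this prints the keys on the first line and the values on the second.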
I have this simple awk script with which I am trying to check the number of characters in the first line.
If the first line has more or fewer than 10 characters, I want to store the number
of characters in a variable.
Somehow the first print statement works, but storing the result in a variable doesn't.
Please help.
I tried removing the dollar sign, "thelength=(length($0))",
and removing the parentheses, "thelength=length($0)", but it doesn't print anything...
Thanks!
#!/bin/ksh
awk ' BEGIN {FS=";"}
{
if (NR==1)
if(length($0)!=10)
{
print(length($0))
thelength=$(length($0))
print "The length of the first line is: ",$thelength;
exit 1;
}
}
END { print "STOP" }' $1
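Two issues, both caused by using shell syntax inside the awk program ...
$(...) is not command substitution in awk; $ is the field-reference operator, so $(length($0)) is the field whose number equals the line length. Use thelength=length($0)
awk variables are referenced without a leading $, since $thelength means field number thelength; use print ... ,thelength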
So your code becomes:
#!/bin/ksh
awk ' BEGIN {FS=";"}
{
    if (NR==1)
        if(length($0)!=10)
        {
            print(length($0))
            thelength=length($0)
            print "The length of the first line is: ",thelength;
            exit 1;
        }
}
END { print "STOP" }' $1
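To see why the leading $ matters, here is a small demonstration with made-up input. Inside awk, $n means "the field whose number is stored in n", which is different from the value of n itself. The function name field_ref_demo is only there for illustration.

```shell
# With the input "a b c" and n set to 2, n is the number 2 but $n is the 2nd field.
field_ref_demo() {
    echo 'a b c' | awk '{ n = 2; print "n is " n; print "$n is " $n }'
}

# Prints "n is 2" followed by "$n is b".
field_ref_demo
```

In the original script, $thelength therefore asked for a field numbered by the line length, which normally does not exist and so printed as empty.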
Below is the input.
!{ID=34, ID2=35}
>
!{ID=99, ID2=23}
>
!{ID=18, ID2=87}
<
I am trying to produce the final result shown below. That is, I want to remove the space, '{' and '}' characters and check whether the next line is '>' or '<'.
In fact, the input above is repeated. I also need to parse the '>' and '<' characters so that I can put the parsed string (YES or NO) into a database.
ID=34,ID=35#YES#NO
ID=99,ID=23#YES#NO
ID=18,ID=87#NO#YES
So I thought I could use the 'sub' function to replace the space with nothing, but the result shows:
1#YES#NO
Can you let me know what is wrong?
If possible, teach me how to remove '{' and '}' as well.
I would appreciate it if you could show me the awk file version instead of a one-liner.
BEGIN {
VALUES = ""
L_EXIST = "NO"
R_EXIST = "NO"
}
/!/ { VALUES = gsub(" ", "", $0);
getline;
if ($1 == ">") L_EXIST = "YES";
else if ($1 == "<") R_EXIST = "YES";
print VALUES"#"L_EXIST"#"R_EXIST
}
END {
}
Given your sample input:
$ cat file
!{ID=34, ID2=35}
>
!{ID=99, ID2=23}
>
!{ID=18, ID2=87}
<
This script produces the desired output:
BEGIN { FS="[}{=, \n]+"; RS="!" }
NR > 1 { printf "ID=%d,ID=%d#%s\n", $3, $5, ($6==">"?"YES#NO":"NO#YES") }
The Field Separator is set to consume the spaces, newlines and other characters between the parts of the line that you're interested in (the newline has to be included so that the > or < ends up in a field of its own). The Record Separator is set to !, so that each pair of lines is treated as a single record.
The first record is empty (the start of the first line, up to the first !), so we only process the ones after that. The output is constructed using printf, with a ternary to determine the last part (I assume that there are only two options, > or <).
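As for what went wrong in the question's script: gsub edits its target in place and returns the number of substitutions it made, so VALUES = gsub(...) stores a count (the 1 in the output) rather than the cleaned text. A minimal sketch that stays close to the original approach is shown below. The function name parse_ids and the variable names are illustrative, and the output keeps the ID2 label exactly as it appears in the input.

```shell
# parse_ids [FILE...]: strip "!", "{", "}" and spaces from each "!{...}" line,
# then read the following line to see whether it is ">" or "<".
# Hypothetical helper; variable names are chosen for this sketch.
parse_ids() {
    awk '
    /^!/ {
        values = $0
        gsub(/[!{} ]/, "", values)      # modifies values in place; the return value is only a count
        l_exist = r_exist = "NO"        # reset the flags for every record
        if ((getline marker) > 0) {
            if (marker == ">") l_exist = "YES"
            else if (marker == "<") r_exist = "YES"
        }
        print values "#" l_exist "#" r_exist
    }' "$@"
}

printf '%s\n' '!{ID=34, ID2=35}' '>' '!{ID=18, ID2=87}' '<' | parse_ids
```

To get a file version, the program text between the single quotes can be saved to a file and run with awk -f.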
Let's say you have this input:
input.txt
!{ID=34, ID2=35}
!{ID=36, ID2=37}
>
You can use the following awk command
awk -F'[!{}, ]' 'NR>1{yn="NO";if($1==">")yn="YES";print l"#"yn}{l=$3","$5}' input.txt
to produce this output:
ID=34,ID2=35#NO
ID=36,ID2=37#YES
#!/bin/awk -f
{
    if (length($0) < 80)
    {
        prefix = "";
        for (i = 1; i < (80 - length($0)) / 2; i++)
            prefix = prefix " ";
        print prefix $0;
    }
    else
    {
        print;
    }
}
Could anyone please tell me what exactly the prefix variable is doing in the above code?
It centre-aligns the incoming text.
Read the text
Declare an empty string in the variable named prefix
The for loop works out where the text should start by appending spaces to prefix, one per iteration, until the prefix is about (80 - length of your string) / 2 characters long
Print prefix followed by your string
Note: $0 in awk is the complete input line, e.g. "I want to test this", whereas $1 will be "I" and $2 will be "want". In the shell, $0 instead expands to the name of the current shell or script.
It's adding front padding to center the string on the line if it's shorter than the line length, but you can do much the same thing with just:
awk '{ printf "%*s\n",(80+length($0))/2, $0 }' file
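The width does not have to be fixed at 80. The sketch below takes it as a parameter and builds the printf format by string concatenation rather than with %*s, on the assumption that some awk implementations may not accept the * width. The function name center and the variable w are illustrative.

```shell
# center WIDTH [FILE...]: center each line within WIDTH columns; longer lines pass through unchanged.
# Hypothetical helper, not from the answers in this thread.
center() {
    width=$1
    shift
    awk -v w="$width" '{
        n = length($0)
        if (n < w) {
            fmt = "%" int((w + n) / 2) "s\n"   # right-justify in a field of (w + n) / 2 columns
            printf fmt, $0
        } else {
            print
        }
    }' "$@"
}

# Example: "test" centered in 10 columns gets 3 leading spaces.
echo "test" | center 10
```

Right-justifying in a field of (w + n) / 2 columns leaves about (w - n) / 2 spaces in front of the text, which is the padding the original loop builds by hand.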
It appends a blank to prefix on each iteration, building the run of leading spaces called for by the formula.
echo "test" | awk -f script
                                     test
It builds a string of spaces (the left padding) whose length is about (80 - length of the line)/2.