I really do not know why my output file gives strange arrangement in linux

I really do not know why my output file gives a strange arrangement.
I am using the awk command to print columns of a file, e.g.
awk '{print $3, $4, $5, $6, $7, $2, $1}' inputfile > outputfile
However, I do not get an output file with the columns in that order; instead, I have an output file in this order: column1, column2, column6, column7.
When I viewed the output file in Excel, I realized the file is in the order below:
row1: column3, column4, column5, column6, column7
row2: column1, column2
Can anyone help with what I am probably doing wrong?

Excel may show CSV (comma-separated values) files incorrectly. Also, the command awk '{print $3, $4, $5, $6, $7, $2, $1}' does in fact print the variables $1, $2, etc., but where do the values of those variables come from?

I found the solution to my question. Apparently, there was a carriage-return character (^M) in my file, which marks an end of line and affects the arrangement of my file. So I removed the carriage-return character in Vim by typing :%s/^M//
(Press Ctrl+V Ctrl+M to insert that ^M.) Then save and quit by typing ZZ.
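For reference, the same cleanup can be done from the shell before running awk. A minimal sketch, assuming the whole file uses Windows-style (CRLF) line endings (the inputfile.unix name is just an example):
tr -d '\r' < inputfile > inputfile.unix
awk '{print $3, $4, $5, $6, $7, $2, $1}' inputfile.unix > outputfile
If dos2unix is installed, dos2unix inputfile does the same conversion in place.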

Related

not enough arguments to satisfy format string

What is wrong with this, please?
awk '{printf "%10.5f %6.4f %6.4f %6.4f %6.4f R\n, $1, $4, $5, $7, $8"}' R.dat > R0
It gives the error: not enough arguments to satisfy format string
There are five % specifiers to format the five $ columns.
You should do it in the following manner: enclose all the formatting parameters inside "....", and then in the other section (after the comma) put whatever field values/variables you want to print for the current line.
awk '{printf "%10.5f %6.4f %6.4f %6.4f %6.4f R\n", $1, $4, $5, $7, $8}' R.dat > R0
OR to make it more clear use like:
awk '{printf("%10.5f %6.4f %6.4f %6.4f %6.4f R\n", $1, $4, $5, $7, $8)}' R.dat > R0

Trying to read data from two files, and subtract values from both files using awk

I have two files
0.975301988947238963 1.75276754663189283 2.00584
0.0457467532388459441 1.21307648993841410 1.21394
-0.664000617674924687 1.57872850852906366 1.71268
-0.812129324498058969 4.86617859243825635 4.93348
and
1.98005959631337536 -3.78935155011290536 4.27549
-1.04468782080821154 4.99192849476267053 5.10007
-1.47203672235857397 -3.15493073343947694 3.48145
2.68001948430755244 -0.0630730371855307004 2.68076
I want to subtract the two values in column 3 of each file.
My first awk statement was
awk 'BEGIN {print "Test"} FNR>1 && FNR==NR { r[$1]=$3; next} FNR>1 { print $3, r[$1], (r[$1]-$3)}' zzz0.dat zzz1.dat
Test
5.10007 -5.10007
3.48145 -3.48145
2.68076 -2.68076
This suggests it does not recognize r[$1]=$3.
I created an additional column xyz by
awk 'NR==1{$(NF+1)="xyz"} NR>1{$(NF+1)="xyz"}1' zzz0.dat
then
awk 'BEGIN {print "Test"} FNR>1 && FNR==NR { xyz[$4]=$3; next} FNR>1 { print $3, xyz[$4], (xyz[$4]-$3)}' zzz00.dat zzz11.dat
Test
5.10007 4.93348 -0.16659
3.48145 4.93348 1.45203
2.68076 4.93348 2.25272
This now shows three columns, but xyz[$4] is printing only a single value (the one from the last line), instead of creating an array.
My real files have thousands of lines. How can I resolve this problem?
You can do it relatively easily using a numeric index for your array. For example:
awk 'NR==FNR {a[++n]=$3; next} o<n{++o; printf "%lf - %lf = %lf\n", a[o], $3, a[o]-$3}' file1 file2
That way you preserve the ordering of the records across files. Without a numeric index, the arrays are associative and there is no specific ordering preserved.
Example Use/Output
With your files in file1 and file2 respectively, you would have:
$ awk 'NR==FNR {a[++n]=$3; next} o<n{++o; printf "%lf - %lf = %lf\n", a[o], $3, a[o]-$3}' file1 file2
2.005840 - 4.275490 = -2.269650
1.213940 - 5.100070 = -3.886130
1.712680 - 3.481450 = -1.768770
4.933480 - 2.680760 = 2.252720
Let me know if that is what you intended; if I missed your intent, drop a comment and I will help further.
If the records are aligned in both files, the easiest is:
$ paste file1 file2 | awk '{print $3,$6,$3-$6}'
2.00584 4.27549 -2.26965
1.21394 5.10007 -3.88613
1.71268 3.48145 -1.76877
4.93348 2.68076 2.25272
If you're only interested in the difference, change the print statement to print $3-$6.
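If you want a fixed number of decimal places in that output, similar to the %lf formatting used above, OFMT can be set in a BEGIN block; a small sketch (the precision is just an example):
paste file1 file2 | awk 'BEGIN{OFMT="%.5f"} {print $3, $6, $3-$6}'
OFMT only affects numbers awk has to convert for printing (here, the computed difference); $3 and $6 are printed exactly as they appear in the files.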

How to use awk script to generate a file

I have a very large compressed file (dataFile.gz) from which I want to generate another file using cat and awk: cat to view the contents, then piping it to awk to generate the new file.
The contents of the compressed file are like below:
Time,SequenceNumber,MsgType,MsgLength,CityOrign,RTime
7:20:13,1,A,34,Tokyo,0
7:20:13,2,C,35,Nairobi,7:20:14
7:20:14,3,E,30,Berlin,7:20:15
7:20:16,4,A,34,Berlin,7:20:17
7:20:17,5,C,35,Denver,0
7:20:17,6,D,33,Helsinki,7:20:18
7:20:18,7,F,37,Tokyo,0
….
….
….
For the new file I want to generate, I only want the Time, MsgType and RTime, meaning columns 0, 2 and 5. And for column 5, if the value is 0, replace it with the value at column 0, i.e. replace RTime with Time:
Time,MsgType,RTime
7:20:13,A,7:20:13
7:20:13,C,7:20:14
7:20:14,E,7:20:15
7:20:16,A,7:20:17
7:20:17,C,7:20:17
7:20:17,D,7:20:18
7:20:18,F,7:20:18
This is my script so far:
#!/usr/bin/awk -f
BEGIN {FS=","
print %0,%2,
if ($5 == "0") {
print $0
} else {
print $5
}
}
My question is: will this script work, and how do I call it? Can I call it on the terminal like below?
zcat dataFile.gz | <awk script> > generatedFile.csv
awk field numbering starts at 1 and $0 represents the full record, so your column numbers would be 1, 3 and 6.
You may use this awk:
awk 'BEGIN{FS=OFS=","} !$6{$6=$1} {print $1, $3, $6}' file
Time,MsgType,RTime
7:20:13,A,7:20:13
7:20:13,C,7:20:14
7:20:14,E,7:20:15
7:20:16,A,7:20:17
7:20:17,C,7:20:17
7:20:17,D,7:20:18
7:20:18,F,7:20:18
Could you please try the following, a slightly shorter version of @anubhava's solution. This one does NOT assign to the 6th field; it only checks whether it is zero or not and prints the values accordingly.
awk 'BEGIN{FS=OFS=","} {print $1, $3, $6==0?$1:$6}' Input_file
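Either one-liner can be fed from the compressed file exactly as the question's last line shows, by piping from zcat; for example:
zcat dataFile.gz | awk 'BEGIN{FS=OFS=","} !$6{$6=$1} {print $1, $3, $6}' > generatedFile.csv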

awk compare two columns in same file with letters in one column

How to do column matching in the same file? For example, compare the server and FQDN columns. The FQDN column has extra words, so I couldn't find a way to strip them.
"server","cpu","memory","disk","FQDN"
"host1",4,32,100,"host2.xxx.com"
"host2",2,10,20,"host2.xxx.com"
"host3",6,4,100,"host1.xxx.com"
"host4",2,10,30,"host4.xxx.com"
"host5",3,6,32,"host3.xxx.com"
awk -F, '$1 ~ /$5/' test.csv
expected results:
"host1",4,32,100,"host2.xxx.com"
"host3",6,4,100,"host1.xxx.com"
"host5",3,6,32,"host3.xxx.com"
Test a substring of $5 that has the same length as $1:
awk -F, 'substr( $5, 1, length($1) ) == $1' test.csv
Your "expected results" show lines where these fields don't match. If that's what you want, do the same transformation but test for inequality instead:
awk -F, 'substr( $5, 1, length($1) ) != $1' test.csv
For getting matches, try the following:
awk -F'[,.]' '$1==$5' Input_file
For getting NON-matching lines, try the following:
awk -F'[,.]' '$1!=$5' Input_file
Or, to also remove the header from the output, try:
awk -F'[,.]' 'FNR>1 && $1!=$5' Input_file
Explanation: this sets the field separator to either , or . for all lines of Input_file and then simply compares $1 and $5, using the == condition for the matching case and the != condition for the NON-matching case.
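If the double quotes shown in the sample are literally present in the file, they are part of the fields as awk sees them, so it can be safer to strip them before comparing. A sketch for the NON-matching case (test.csv as above):
awk -F, 'FNR>1{h=$1; f=$5; gsub(/"/,"",h); gsub(/"/,"",f); split(f,p,"."); if (h!=p[1]) print}' test.csv
This strips the quotes from both fields, takes the part of the FQDN before the first dot, and prints the line when it differs from the server column, which reproduces the expected results above.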

Adding columns with awk. What is wrong with this awk command?

I want to add two columns to a file with ~10,000 columns. I want to insert, as the first column, the number 22 on each row. Then I want the original first column as the second column, then as the third column I want to insert the line number (NR), and after that I want the rest of the original columns to be printed. I thought I could do that with the following awk line:
awk '{print 22, $1, NR; for(i=2;i<=NF;++i) print $i}' file
It prints the first three columns (22, $1, NR) well, but after that a new line is started for each value, so the file is printed like this:
22 $1 NR
$2
$3
$4
etc...
instead of:
22 $1 NR $2 $3 $4 etc...
What did I do wrong?
How about using printf instead, since print adds a newline?
awk '{printf("%s %s %s", 22, $1, NR); for(i=2;i<=NF;++i) printf(" %s", $i); print ""}' file
Or you can play with the ORS and OFS, the Output Record Separator and the Output Field Separator. Normally you add those in a BEGIN statement like this:
awk 'BEGIN { ORS = " " } {print 22, $1, NR; for(i=2;i<=NF;++i) print $i}{print "\n"}' file
Note that the extra print "\n" at the end is needed, or else everything ends up on one line.
Read more about output separators in the gawk manual.
For more precise control over the output format than what is provided by print (which prints a newline by default), use printf.
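Another option that avoids the loop entirely (just a sketch, assuming the default space separators are acceptable) is to prepend the new values to $1 and let awk rebuild the record:
awk '{ $1 = 22 " " $1 " " NR } 1' file
Assigning to $1 makes awk reassemble $0 with the output field separator, so each record comes out as 22, the original first column, the line number, and then the remaining columns, all on one line.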