This question already has answers here:
Why does my tool output overwrite itself and how do I fix it?
I'm having trouble doing something very simple with awk. I'd like to print the last field, followed by another field.
Input file looks like this:
03 Oct 22, Southern ,Mad,WIN,Gro,,33.10
03 Oct 22, Mpd ,Mad,WIN,Auto,-208.56,
23 Sep 22, Thank ,n/a,WIN,,-97.93,
This way round works fine:
$ awk -F',' '{print "first " $6 " and then " $7}' input.csv
first and then 33.10
first -208.56 and then
first -97.93 and then
But when I swap the fields over I get the strangest result:
$ awk -F',' '{print "first " $7 " and then " $6}' input.csv
and then 0
and then -208.56
and then -97.93
I must be missing something really simple. What on earth is going on?
$ awk --version
GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)
Only suggestion I have is to update awk. It works perfectly fine on my MacBook - with this version:
awk --version
GNU Awk 5.1.1, API: 3.1 (GNU MPFR 4.1.0, GNU MP 6.2.1)
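The linked duplicate points at a likely cause other than the awk version: if input.csv has Windows-style CRLF line endings, the last field ends in a carriage return, which moves the cursor back to the start of the line so that the text printed after it overwrites what was already there. A minimal sketch, assuming that is what is happening here (the sample line is recreated with printf rather than read from input.csv, and the variable names are only illustrative):

```shell
# Recreate one sample line with a CRLF ending (assumption: input.csv has these).
line='03 Oct 22, Southern ,Mad,WIN,Gro,,33.10'

# Strip a trailing carriage return from the record before using the fields.
fixed=$(printf '%s\r\n' "$line" |
    awk -F',' '{ sub(/\r$/, ""); print "first " $7 " and then " $6 }')

printf '%s\n' "$fixed"
```

Passing the file through tr -d '\r' or dos2unix before awk should have the same effect.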
What is the difference on Ubuntu between awk and awk -F? For example, to display the frequency of CPU core 0 we use the command
cat /proc/cpuinfo | grep -i "^cpu MHz" | awk -F ":" '{print $2}' | head -1
But why does it use awk -F? We could use awk without the -F and it would still work, of course (already tested).
Because without -F, awk would not know which separator to split on, and it would not print the right result. -F specifies the field separator for this awk invocation. Without it, awk falls back to the default separator, whitespace. For example, in ps | grep xeyes | awk '{print $1}' it splits on blanks and prints the first value, the PID of the xeyes process. I found this at https://www.shellunix.com/awk.html. Thanks to all.
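To make the difference concrete, here is a small sketch that splits the same line both ways. The sample line only imitates the /proc/cpuinfo layout (the value is made up), and the variable names are illustrative:

```shell
# Default separator: runs of blanks and tabs, so $2 is the second word.
default_field=$(printf 'cpu MHz\t\t: 2400.000\n' | awk '{ print $2 }')

# -F ":" splits on the colon, so $2 is everything after it.
colon_field=$(printf 'cpu MHz\t\t: 2400.000\n' | awk -F ":" '{ print $2 }')

printf 'default: [%s]\ncolon:   [%s]\n' "$default_field" "$colon_field"
```

With the default separator the second field is the word MHz, while with -F ":" it is the frequency (including its leading space).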
I would like to get the right value of the following command as a string without double quotes.
$ grep '^VERSION=' /etc/os-release
VERSION="20.04.3 LTS (Focal Fossa)"
When I pipe it with the following awk, I don't get the desired output.
$ grep '^VERSION=' /etc/os-release | awk '{print $0}'
VERSION="20.04.3 LTS (Focal Fossa)"
$ grep '^VERSION=' /etc/os-release | awk '{print $1}'
VERSION="20.04.3
$ grep '^VERSION=' /etc/os-release | awk '{print $2}'
LTS
How can I fix that?
You may use this single awk command:
awk -F= '$1=="VERSION" {gsub(/"/, "", $2); print $2}' /etc/os-release
20.04.3 LTS (Focal Fossa)
1st solution: With your shown samples, please try following awk code.
awk 'match($0,/^VERSION="[^"]*/){print substr($0,RSTART+9,RLENGTH-9)}' Input_file
Explanation: The match function of awk matches from the starting VERSION=" up to the next occurrence of ", and substr then prints only the part after the 9-character prefix VERSION=" (which is the desired output for the OP's shown samples).
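The offsets can be checked with a small sketch that prints RSTART and RLENGTH next to the extracted text. The sample line is copied from the question; the variable names line and info are only illustrative:

```shell
# Sample line copied from the question.
line='VERSION="20.04.3 LTS (Focal Fossa)"'

info=$(printf '%s\n' "$line" | awk 'match($0, /^VERSION="[^"]*/) {
    # The prefix VERSION=" is 9 characters long, so skip 9 characters
    # from RSTART and shorten RLENGTH by the same amount.
    print RSTART, RLENGTH
    print substr($0, RSTART + 9, RLENGTH - 9)
}')

printf '%s\n' "$info"
```

The first output line shows where the match starts and how long it is; the second is the version string without the prefix or the quotes.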
2nd solution: Using GNU grep with PCRE regex enabled option try following.
grep -oP '^VERSION="\K[^"]*' Input_file
3rd solution: Using awk's capability to set different field separators and then check conditions accordingly and print values.
awk -F'"' '$1=="VERSION="{print $2}' Input_file
Assuming that "the right value" you want output is 20.04.3:
$ awk -F'[" ]' '/^VERSION=/{print $2}' file
20.04.3
or if it's the whole quoted string:
$ awk -F'"' '/^VERSION=/{print $2}' file
20.04.3 LTS (Focal Fossa)
You can use an awk command like the following (the three-argument form of match is a GNU awk extension):
awk 'match($0, /^VERSION="([^"]*)"/, m) {print m[1]}' /etc/os-release
Here, ^VERSION="([^"]*)" matches VERSION=" at the start of the string (^), then captures into Group 1 any zero or more chars other than " (with ([^"]*)) and then matches ". The match is saved in m where m[1] holds the Group 1 value.
Or, sed like
sed -n '/^VERSION="\([^"]*\)".*/s//\1/p' /etc/os-release
See an online test:
s='VERSION="20.04.3 LTS (Focal Fossa)"'
awk 'match($0, /^VERSION="([^"]*)"/, m) {print m[1]}' <<< "$s"
sed -n '/^VERSION="\([^"]*\)".*/s//\1/p' <<< "$s"
Here, the -n option suppresses the default line output, /^VERSION="\([^"]*\)".*/ matches a string starting with VERSION=", then captures into Group 1 any zero or more chars other than ", and then matches " and the rest of the string, and the whole match is replaced with the Group 1 value. The empty pattern // means the previous regex pattern is reused. p prints only the result of the substitution.
Both output 20.04.3 LTS (Focal Fossa).
Since the file /etc/os-release conforms to a variable assignment in bash or the shell in general (POSIX), sourcing it should do the job.
source /etc/os-release; echo "$VERSION"
Using a subshell, in case one does not want to pollute the current environment variables.
( source /etc/os-release; echo "$VERSION" )
Assigning it to a variable.
version=$( source /etc/os-release; echo "$VERSION" )
If the shell you're using is not POSIX-compatible, run it through sh.
sh -c '. /etc/os-release; echo "$VERSION"'
See your local man page if available.
man 5 os-release
Example input
42 -0.400000000000000022
I want to add 9'000'000'000'000'000'000 to the 1st column, and add 30 to the 2nd column.
$ echo 42 -0.400000000000000022 | awk '{ $1 += 9000000000000000000; $2 += 30 } { print }'
9000000000000000000 29.6
Computation for the 1st column is wrong, but the 2nd column is OK.
From the documentation and the man page, there's a --bignum option which should help me for the big integer computation.
$ echo 42 -0.400000000000000022 | awk --bignum '{ $1 += 9000000000000000000; $2 += 30 } { print }'
9000000000000000042 30
Now the 1st column is OK, but the 2nd one isn't!
Here's my AWK version, running on Ubuntu 16.04:
$ awk -V
GNU Awk 4.1.3, API: 1.1 (GNU MPFR 3.1.4, GNU MP 6.1.0)
Copyright (C) 1989, 1991-2015 Free Software Foundation.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
What's even weirder is that I tested this inside an Ubuntu 16.04 docker container, and the output is correct for both columns when using --bignum.
I actually don't know what to look for to fix this.
With big numbers, I recommend setting LC_ALL=C for the awk invocation:
$ echo 42 -0.400000000000000022 | LC_ALL=C awk --bignum '{ $1 += 9000000000000000000; $2 += 30 } { print }'
9000000000000000042 29.6
Successfully tested with GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0-p13, GNU MP 6.2.0)
I have a sample file:
💻~/dev/test[1]⑂master*$ cat test.properties
startTime: 0515
stopTime: 2015
dataFiles: foo
fixVersion: 4.2
retry: 5
kafkaRelay.type: kafkaSink
kafkaRelay.producerId: blah
kafkaRelay.partitioningTag: 49
kafkaRelay.topic: topicname-pre-transform-{0,date,yyyyMMdd}
I have a particular awk command I want to run. It gives different output when I use -F vs BEGIN { FS = ... }:
💻~/dev/test[4]⑂master*$ awk 'BEGIN{ FS = ": *"; OFS =": " } \
$1 ~ /(startTime|fixVersion)/ {print $2, $1}; \
$1 ~ /kafkaRelay.topic/ {$1="kafkaWriter.topic";print; $1="kafkaReader.topic"; print}; \
$1 ~ /stopTime/ { $2+=100; $2%=2400; printf("%s: %04d\n", $1, $2) }' test.properties
0515: startTime
stopTime: 2115
4.2: fixVersion
kafkaWriter.topic: topicname-pre-transform-{0,date,yyyyMMdd}
kafkaReader.topic: topicname-pre-transform-{0,date,yyyyMMdd}
💻~/dev/test[5]⑂master*$ awk -F': *' -OFS': ' \
'$1 ~ /(startTime|fixVersion)/ {print $2, $1}; \
$1 ~ /kafkaRelay.topic/ {$1="kafkaWriter.topic";print; $1="kafkaReader.topic"; print}; \
$1 ~ /stopTime/ { $2+=100; $2%=2400; printf("%s: %04d\n", $1, $2) }' test.properties
startTime: 0515
stopTime: 2015: 0100
fixVersion: 4.2
kafkaWriter.topic
kafkaReader.topic
The first version outputs exactly as I expect. The second version has a bunch of differences and I don't understand how they come about. I also tried 1-4 \ in front of the * in the second version, hoping it was something to do with escaping the *, but that had no effect.
Why does this happen? I followed awk's regexp field splitting, and the command line field separator doesn't have any special instructions for -F vs FS =. The only StackOverflow question I could find fails to use BEGIN, which isn't my problem.
For reference:
💻~/dev/test[6]⑂master*$ awk --version
GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)
The problem is the use of -OFS.
POSIX guidelines for command-line parsing say that, after a single dash, options are parsed character by character. So -OFS': ' is read as -O followed by -F with the argument 'S: ', and that second -F overrides the earlier -F ': *'.
If you want to set OFS, doing it in a BEGIN block is the Right Thing.
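Another option is the standard -v var=value flag, which assigns an awk variable before the program starts. A minimal sketch with two made-up lines in the style of test.properties (the variable name result is only illustrative):

```shell
# -F sets the input separator; -v OFS=... sets the output separator.
result=$(printf 'startTime: 0515\nfixVersion: 4.2\n' |
    awk -F': *' -v OFS=': ' '{ print $2, $1 }')

printf '%s\n' "$result"
```

This keeps both separators on the command line without relying on a clustered option such as -OFS.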
Can someone explain why two different hex values are converted to the same decimal number?
$ echo A0000044956EA2 | gawk '{print strtonum("0x" $1)}'
45035997424348832
$ echo A0000044956EA0 | gawk '{print strtonum("0x" $1)}'
45035997424348832
Starting with GNU awk 4.1 you can use --bignum or -M
$ awk -M 'BEGIN {print 0xA0000044956EA2}'
45035997424348834
$ awk -M 'BEGIN {print 0xA0000044956EA0}'
45035997424348832
See the Command-Line Options section of the gawk manual.
Not so much an answer as a workaround, so that the strtonum function does not have to be abandoned completely:
It does seem to be down to doubles. I found the calculation here: strtonum.
Nothing is wrong with it.
However, if you really need this in awk, you could strip the last digit from the hex number and add it back manually after strtonum has done its calculation on the main part.
So 0xA0000044956EA1, 0xA0000044956EA2 and 0xA0000044956EA"whatever" should all become 0xA0000044956EA0 with a simple regex, and then the "whatever" is added.
Edit: Maybe I should delete this altogether, as I have to downgrade it even further. This does not work satisfactorily either: I just tried it, and a number that small cannot be added to a number this big, i.e. print (45035997424348832 + 4) just comes out as 45035997424348832. So with this workaround the output has to stay in a form like 45035997424348832 + 4 for hex 0xA0000044956EA4.
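The rounding described here is what one would expect if awk stores numbers as IEEE 754 doubles, which is the default when -M is not given. A double has 53 bits of precision, so above 2^53 not every integer is representable, and between 2^55 and 2^56 (where 45035997424348832 lies) adjacent doubles are 8 apart. A minimal sketch under that assumption, with illustrative variable names:

```shell
# Adding 2 or 4 lands between two representable doubles and rounds back.
plus2=$(awk 'BEGIN { printf "%.0f\n", 45035997424348832 + 2 }')
plus4=$(awk 'BEGIN { printf "%.0f\n", 45035997424348832 + 4 }')

# 2^53 is the first point where adding 1 no longer changes the value.
limit=$(awk 'BEGIN { printf "%.0f %.0f\n", 2^53, 2^53 + 1 }')

printf '%s\n%s\n%s\n' "$plus2" "$plus4" "$limit"
```

Without -M, all three lines are expected to show the unchanged base value, which matches the strtonum results in the question.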