substr & index - extract from string $2 until the last string of the line - indexing

I have multiple files, and from a specific line in each I want to extract everything from string $2 to the end of the line (there may be one more string after $2, or several).
My idea was to use awk with substr and index, but I don't know how to write the index part so that it prints up to the end of the line.
EXAMPLE
input:
DATA USA CALIFORNIA
DATA CANADA NORTH Quebec city
DATA AMERICA Washington DC
output
USA CALIFORNIA
CANADA NORTH Quebec city
AMERICA Washington DC
Code:
awk '{num=NR; var=substr($2, index($2, " ")+1, NF)}'
But this doesn't work.
Any help would be more than appreciated!
Thank you in advance

When you have one space between the words, use
cut -sd' ' -f2- inputfile

You can do $1 = "" to remove the first field, then print the updated line.
awk '{$1 = ""; print}'
The above will however print a space at the start of each line. If you want to remove that space:
awk '{$1 = ""; $0=substr($0, 2); print}'

From your examples, it seems that you only want to remove $1. In that case you could use sed to remove it:
sed -E 's/[^[:space:]]+[[:space:]]//'
[^[:space:]]+ - match 1 or more non whitespace characters
[[:space:]] - followed by a whitespace character
Substitute with an empty string (the ending //).

You're not printing anything from your code, so that might be why "this doesn't work". Assuming that's not the problem, please edit your question to tell us what the problem is you need help with as "this doesn't work" is famously the worst possible problem statement when asking for help with software or anything else in life in general.
Having said that, regarding index($2, " ") - that's trying to find a space within a field when fields are separated by spaces so obviously that can never succeed. ITYM index($0, " ") and then substr($2... would be substr($0.... I'm not sure what you were thinking by having NF (the number of fields in the line) at the end of the substr() - maybe you meant length() (the number of chars in the line) but that'd also be wrong (and unnecessary) since that'd be more chars than are left after the substr() and just going til the end of the string as you want is substr()s default behavior anyway.
To fix your existing code try this:
$ awk '{num=NR; var=substr($0, index($0, " ")+1); print var}' file
USA CALIFORNIA
CANADA NORTH Quebec city
AMERICA Washington DC
or more robustly in case of most regexp FS and most input values:
$ awk '{num=NR; var=substr($0, match($0, FS)+1); print var}' file
USA CALIFORNIA
CANADA NORTH Quebec city
AMERICA Washington DC
The above and all other answers so far would fail if you're using the default FS and your input starts with blanks since given input like:
<blank>X<blank>Y
$1 is X and $2 is Y so if you want to print from Y on then you can't just delete whatever's before the first blank as $1 comes AFTER any leading whitespace when the default FS is used.
You also can't rely on using index() since it only matches strings while a multi-char FS is a regexp, nor can you rely on using match() since a single-char FS is a literal character.
So a robust solution to extract from string $2 until the last string of the line would have to handle:
FS being a blank to handle leading/trailing spaces and match any white space between fields,
Any other single-char FS as a literal character.
Any multi-char FS as a regexp.
FS being null in which case there's just 1 field, $1.
Let us know if you actually need that.
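For the default-FS case only, the leading-blank problem described above can be sketched like this. The function name strip_first_field is made up for illustration, and the pattern assumes fields are separated by spaces or tabs; it does not attempt the single-char or regexp FS cases from the list.

```shell
# Hypothetical helper: drop $1 plus the blanks around it, assuming the default FS.
strip_first_field() {
    awk '{
        # optional leading blanks, then the first field, then the blanks after it
        sub(/^[ \t]*[^ \t]+[ \t]+/, "")
        print
    }' "$@"
}

printf ' X Y\nDATA CANADA NORTH Quebec city\n' | strip_first_field
# prints:
# Y
# CANADA NORTH Quebec city
```

A line with only one field is printed unchanged by this sketch, which may or may not be what you want.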

Your code
awk '{num=NR; var=substr($2, index($2, " ")+1, NF)}'
has three issues: first, you store what substr returns in a variable but never print it; second, you assume the desired length is the number of fields, which is not the case; third, $2 is USA, CANADA and AMERICA respectively, whilst you also want the fields that follow.
After repairing your code it becomes
awk '{num=NR; var=substr($0, index($0, " ")+1); print var}' file.txt
which for
DATA USA CALIFORNIA
DATA CANADA NORTH Quebec city
DATA AMERICA Washington DC
gives output
USA CALIFORNIA
CANADA NORTH Quebec city
AMERICA Washington DC
That being said, if you do not have to use index at any price, you might use the sub function to get the desired output with more concise code:
awk '{sub(/^[^ ]+ /,"");print}' file.txt
It replaces the start of the string (^) followed by one or more (+) non-space characters ([^ ]) and a single space with the empty string, i.e. deletes them, then prints the changed line.
(tested in GNU Awk 5.0.1)

mawk NF=NF FS='^[ \t]*[^ \t]+[ \t]+' OFS=
USA CALIFORNIA
CANADA NORTH Quebec city
AMERICA Washington DC

Modifying the input:
$ cat input
DATA USA CALIFORNIA # 2x spaces between DATA and USA
DATA CANADA NORTH Quebec city
DATA AMERICA Washington , DC # starts with 1x tab
DATA South Korea Seoul # starts with 2x spaces
Another awk variation:
awk '
{ print substr($0,index($0,$2)) } # strip off 1st field by finding starting point of 2nd field
' input
# or as a one-liner sans comments
awk '{ print substr($0,index($0,$2)) }' input
NOTES:
for this particular example there's no need for the num and var variables so they've been removed
this assumes the 2nd field is not a substring of the 1st field; if the 2nd field is any of D / A / T / DA / AT / TA / DAT / ATA / DATA then index() will match on the same string in the 1st field (DATA) which means we'll fail to strip off the 1st field
This generates:
USA CALIFORNIA
CANADA NORTH Quebec city
AMERICA Washington , DC
South Korea Seoul
Addressing the scenario where the 2nd field could be a substring of the 1st field ...
Sample input:
$ cat input2
DATA DA USA CALIFORNIA # DA is substring of DATA
DATA A CANADA NORTH Quebec city # A is substring of DATA
DATA TA AMERICA Washington, DC # TA is substring of DATA
DATA DAT South Korea Seoul # DAT is substring of DATA
A couple awk ideas:
awk '
{ match($0,$1) # find 1st field location
$0=substr($0,RSTART+RLENGTH) # strip off 1st field; this will not strip off any field separators so we need to ...
print substr($0,index($0,$1)) # find start of new 1st field (old 2nd field)
}
' input2
# replace match()/substr() with substr()/index()/length()
awk '
{ $0=substr($0,index($0,$1)+length($1)) # find 1st field location and length; this strips off 1st field but not the trailing field separator(s) so we still need to ...
print substr($0,index($0,$1)) # find start of new 1st field (old 2nd field)
}
' input2
These both generate:
DA USA CALIFORNIA
A CANADA NORTH Quebec city
TA AMERICA Washington, DC
DAT South Korea Seoul

Related

return matching but not exact same strings

Is there any way to find a word that contains a given string but is not the exact match. For e.g.
# cat t.txt
first line
ind is a shortform of india
I am trying to return the word "india" because it contains the string "ind" but I do not need the exact match. I have tried this...
# grep -o 'ind' t.txt
ind
ind
Would you please try the following:
grep -Eo '[A-Za-z]+ind|ind[A-Za-z]+' t.txt
Output:
india
The regex [A-Za-z]+ind|ind[A-Za-z]+ matches ind together with the letters that precede or follow it.
$ grep -Eo '[[:alpha:]]+ind[[:alpha:]]*|[[:alpha:]]*ind[[:alpha:]]+' file
india
fooindbar
the above was run on this input file (note the added test case of ind appearing in the middle of a string instead of just the start or end):
$ cat file
first line
ind is a shortform of india
this fooindbar is the mid-word text
You can do the same with GNU awk (for multi-char RS, RT, and \s shorthand for [[:space:]]) if you prefer:
$ awk -v RS='\\s+' '/[[:alpha:]]+ind[[:alpha:]]*|[[:alpha:]]*ind[[:alpha:]]+/' file
india
fooindbar
or:
$ awk -v RS='[[:alpha:]]+ind[[:alpha:]]*|[[:alpha:]]*ind[[:alpha:]]+' 'RT{print RT}' file
india
fooindbar
I would use GNU AWK for this task following way, let file.txt content be
first line
ind is a shortform of india
then
awk 'BEGIN{RS="[[:space:]]+"}match($0,/ind/)&&length>RLENGTH{print}' file.txt
output
india
Explanation: I inform GNU AWK that the record separator (RS) is one or more whitespace characters, so every word is treated as a record. Then for every record (that is, every word) I use the match function, which returns the starting position of the match (nonzero) if one is found and 0 otherwise, and sets RSTART and RLENGTH. If a match is found, I check whether the length of the current record (that is, the word) is greater than that of the match; if so, I print the word. Note that every word is output on its own line, so for example if the input file content were
india ind india ind india
then output would be
india
india
india
(tested in gawk 4.2.1)
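If GNU awk is not available, the same idea can be sketched in plain awk by looping over the fields instead of changing RS. The function name words_containing is invented for this example, and it assumes words are separated by whitespace.

```shell
# Hypothetical helper: print every whitespace-separated word that contains
# the string given as the first argument but is longer than it.
words_containing() {
    needle=$1
    shift
    awk -v s="$needle" '{
        for (i = 1; i <= NF; i++)
            if (index($i, s) && length($i) > length(s))   # contains s, not equal to s
                print $i
    }' "$@"
}

printf 'first line\nind is a shortform of india\nthis fooindbar is the mid-word text\n' |
    words_containing ind
# prints:
# india
# fooindbar
```

Unlike the grep -o versions, this prints the whole field, so punctuation attached to a word (for example india,) would be printed with it.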

Most efficient way to gsub strings in awk where strings come from a separate file

I have a tab-separated file called cities that looks like this:
Washington Washington N 3322 +Geo+Cap+US
Munich München N 3842 +Geo+DE
Paris Paris N 4948 +Geo+Cap+FR
I have a text file called countries.txt which looks like this:
US
DE
IT
I'm reading this file into a Bash variable and sending it to an awk program like this:
#!/usr/bin/env bash
countrylist=$(<countries.txt)
awk -v countrylist="$countrylist" -f countries.awk cities
And I have an awk file which should split the countrylist variable into an array, then process the cities file in such a way that we replace "+"VALUE with "" in $5 only if VALUE is in the countries array.
{
FS = "\t"; OFS = "\t";
split(countrylist, countries, /\n/)
# now gsub efficiently every country in $5
# but only if it's in the array
# i.e. replace "+US" with "" but not
# "+FR"
}
I am stuck in this last bit because I don't know how to check if $5 has a value from the array countries and to remove it only then.
Many thanks in advance!
[Edit]
The output should be tab-delimited:
Washington Washington N 3322 +Geo+Cap
Munich München N 3842 +Geo
Paris Paris N 4948 +Geo+Cap+FR
Could you please try following, if I understood your requirement correctly.
awk 'FNR==NR{a[$0]=$0;next} {for(i in a){if(index($5,a[i])){gsub(a[i],"",$5)}}} 1' countries.txt cities
A non-one liner form of code is as follows(you could set FS and OFS to \t in case your Input_file is TAB delimited):
awk '
FNR==NR{
a[$0]=$0
next
}
{
for(i in a){
if(index($5,a[i])){
gsub(a[i],"",$5)
}
}
}
1
' countries.txt cities
Output will be as follows.
Washington Washington N 3322 +Geo+Cap+
Munich München N 3842 +Geo+
Paris Paris N 4948 +Geo+Cap+FR
This is the awk way of doing it:
$ awk '
BEGIN {
FS=OFS="\t" # delimiters
}
NR==FNR { # process countries file
countries[$0] # hash the countries to an array
next # skip to the next line of the countries file
}
{
n=split($5,city,"+") # split the 5th column by +
if(city[n] in countries) # search the last part in countries
sub(city[n] "$","",$5) # if found, remove it from the end of the 5th column
}1' countries.txt cities # output and mind the order of files
Output (with actual tabs in data):
Washington Washington N 3322 +Geo+Cap+
Munich München N 3842 +Geo+
Paris Paris N 4948 +Geo+Cap+FR
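Both outputs above still carry a trailing +, whereas the question asked for +Geo+Cap. One way to drop the plus sign together with the country code is to split the 5th column and rebuild it, sketched below. The function name strip_countries is hypothetical, and the sketch assumes the 5th column always starts with + as in the sample data.

```shell
# Hypothetical helper: remove "+CODE" from column 5 for every CODE listed in
# the first file (one code per line); the second file is the tab-separated data.
strip_countries() {
    awk '
    BEGIN { FS = OFS = "\t" }
    NR == FNR { countries[$0]; next }      # first file: remember each code
    {
        n = split($5, tag, "[+]")          # "+Geo+Cap+US" -> "", "Geo", "Cap", "US"
        out = ""
        for (i = 2; i <= n; i++)           # tag[1] is the empty text before the first +
            if (!(tag[i] in countries))
                out = out "+" tag[i]       # keep only tags that are not country codes
        $5 = out
    }
    1' "$1" "$2"
}

# usage: strip_countries countries.txt cities
```

Because the column is rebuilt piece by piece, a country code is removed wherever it appears in the column, not only at the end.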

Printing out a particular row based on condition in another row

Apologies if this is really basic stuff, but I just started with awk.
I have an input file that I'm piping into awk, like the one below. The format never changes.
name: Jim
gender: male
age: 40
name: Joe
gender: female
age: 36
name: frank
gender: Male
age: 40
I'm trying to list all names where age is 40
I can find them like so
awk '$2 == "40" {print $2 }'
but can't figure out how to print the name.
Could you please try following(I am driving as of now so couldn't test it).
awk '/^age/{if($NF==40){print val};val="";next} /^name/{val=$0}' Input_file
Explanation: if a line starts with name, store that line in the variable val. If a line starts with age, check whether its last field equals 40; if it does, print the value of val. Either way, val is then reset.
Setting the record separator (RS) to the empty string makes awk work in paragraph mode, where each block separated by blank lines is one record.
awk -v RS="" '/age: 40/ {print $2}' file
Jim
frank
Some shorter awk versions of suspectus and RavinderSingh13 post
awk '/^name/{n=$2} /^age/ && $NF==40 {print n}' file
awk '/^name/{n=$2} /^age: 40/ {print n}' file
Jim
frank
If a line starts with name, store the name in n.
If a line starts with age and the age is 40, print n.
Awk knows the concepts of records and fields.
Files are split in records where consecutive records are split by the record separator RS. Each record is split in fields, where consecutive fields are split by the field separator FS.
By default, the record separator RS is set to be the <newline> character (\n) and thus each record is a line. The record separator has the following definition:
RS:
The first character of the string value of RS shall be the input record separator; a <newline> by default. If RS contains more than one character, the results are unspecified. If RS is null, then records are separated by sequences consisting of a <newline> plus one or more blank lines, leading or trailing blank lines shall not result in empty records at the beginning or end of the input, and a <newline> shall always be a field separator, no matter what the value of FS is.
So with the file format you give, we can define the records based on RS="".
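To see paragraph mode in action, here is a small sketch, assuming the blocks in the input are separated by blank lines. The function name paragraph_summary is made up for this example.

```shell
# Hypothetical helper: with RS set to the empty string, each blank-line-separated
# block is one record, and newlines act as field separators alongside blanks.
paragraph_summary() {
    awk 'BEGIN { RS = "" } { print NR ": " $2 " (" NF " fields)" }' "$@"
}

printf 'name: Jim\ngender: male\nage: 40\n\nname: Joe\ngender: female\nage: 36\n' |
    paragraph_summary
# prints:
# 1: Jim (6 fields)
# 2: Joe (6 fields)
```

Each three-line block is a single record with six fields, which is why $2 is the name.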
So based on this, we can immediately list all records who have the line age: 40
$ awk 'BEGIN{RS="";ORS="\n\n"}/age: 40/' file
There are a couple of problems with the above line:
What if we have a person who is 400 years old? That record will be listed too, because the line age: 400 contains the requested string age: 40.
What if we have a record with a typo stating age:40 or age : 40
What if our record has a line stating wage: 40 USD/min
To solve most of these problems, it is easier to work with well-defined fields in the record and build the key-value-pairs per record:
key value
---------------
name => Jim
gender => male
age => 40
and then, we can use this to select the requested information:
$ awk 'BEGIN{RS="";FS="\n"}
# build the record
{ delete rec;
for(i=1;i<=NF;++i) {
# find the first ":" and select key and value as substrings
j=index($i,":"); key=substr($i,1,j-1); value=substr($i,j+1)
# remove potential spaces from front and back
gsub(/^[[:blank:]]+|[[:blank:]]+$/,"",key)
gsub(/^[[:blank:]]+|[[:blank:]]+$/,"",value)
# store key-value pair
rec[key] = value
}
}
# select requested information and print
(rec["age"] == 40) { print rec["name"] }' file
This is not a one-liner, but it is robust. Furthermore, this method is fairly flexible and adaptable to make selections based on a more complex logic.
If you are not averse to using grep and the format is always the same:
cat filename | grep -B2 "age: 40" | grep -oP "(?<=name: ).*"
Jim
frank
awk -F':' '/^name/{name=$2} \
/^age/{if ($NF==40)print name}' input_file

reordering the columns in a csv file + awk + keeping the comma delimiter

this is my file:
$ cat temp
country,latitude,longitude,name,code
AU,-25.274398,133.775136,Australia,61
CN,35.86166,104.195397,China,86
DE,51.165691,10.451526,Germany,49
FR,46.227638,2.213749,France,33
NZ,-40.900557,174.885971,New Zealand,64
WS,-13.759029,-172.104629,Samoa,685
CH,46.818188,8.227512,Switzerland,41
US,37.09024,-95.712891,United States,1
VU,-15.376706,166.959158,Vanuatu,678
I want to reorder the columns like below. but I want to keep the comma delimiter and don't want the space delimiter. How do I do this?
$ awk -F"," '{ print $5,$4,$1,$2,$3 }' temp
code name country latitude longitude
61 Australia AU -25.274398 133.775136
86 China CN 35.86166 104.195397
49 Germany DE 51.165691 10.451526
33 France FR 46.227638 2.213749
64 New Zealand NZ -40.900557 174.885971
685 Samoa WS -13.759029 -172.104629
41 Switzerland CH 46.818188 8.227512
1 United States US 37.09024 -95.712891
678 Vanuatu VU -15.376706 166.959158
The OFS variable also needs to be set if you don't want the output field separator to be a space character (the default).
$ awk 'BEGIN{FS=OFS=","}{ print $5,$4,$1,$2,$3 }' temp
code,name,country,latitude,longitude
61,Australia,AU,-25.274398,133.775136
86,China,CN,35.86166,104.195397
49,Germany,DE,51.165691,10.451526
33,France,FR,46.227638,2.213749
64,New Zealand,NZ,-40.900557,174.885971
685,Samoa,WS,-13.759029,-172.104629
41,Switzerland,CH,46.818188,8.227512
1,United States,US,37.09024,-95.712891
678,Vanuatu,VU,-15.376706,166.959158

Printing everything except the first field with awk

I have a file that looks like this:
AE United Arab Emirates
AG Antigua & Barbuda
AN Netherlands Antilles
AS American Samoa
BA Bosnia and Herzegovina
BF Burkina Faso
BN Brunei Darussalam
And I'd like to invert the order, printing first everything except $1 and then $1:
United Arab Emirates AE
How can I do the "everything except field 1" trick?
$1="" leaves a space as Ben Jackson mentioned, so use a for loop:
awk '{for (i=2; i<=NF; i++) print $i}' filename
So if your string was "one two three", the output will be:
two
three
If you want the result in one row, you could do as follows:
awk '{for (i=2; i<NF; i++) printf "%s ", $i; print $NF}' filename
This will give you: "two three"
Assigning $1 works but it will leave a leading space: awk '{first = $1; $1 = ""; print $0, first; }'
You can also find the number of columns in NF and use that in a loop.
From Thyag: To eliminate the leading space, add sed to the end of the command:
awk '{first = $1; $1=""; print $0, first}' filename | sed 's/^ //'
Use the cut command with -f 2- (POSIX) or --complement (not POSIX):
$ echo a b c | cut -f 2- -d ' '
b c
$ echo a b c | cut -f 1 -d ' '
a
$ echo a b c | cut -f 1,2 -d ' '
a b
$ echo a b c | cut -f 1 -d ' ' --complement
b c
Maybe the most concise way:
$ awk '{$(NF+1)=$1;$1=""}sub(FS,"")' infile
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
Explanation:
$(NF+1)=$1: Generator of a "new" last field.
$1="": Set the original first field to null
sub(FS,""): After the first two actions {$(NF+1)=$1;$1=""} get rid of the first field separator by using sub. The final print is implicit.
awk '{sub($1 FS,"")}7' YourFile
Remove the first field and separator, and print the result (7 is a non zero value so printing $0).
awk '{ saved = $1; $1 = ""; print substr($0, 2), saved }'
Setting the first field to "" leaves a single copy of OFS at the start of $0. Assuming that OFS is only a single character (by default, it's a single space), we can remove it with substr($0, 2). Then we append the saved copy of $1.
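If OFS may be longer than one character, the same idea still works when you skip length(OFS) characters instead of one. A minimal sketch follows; the function name rotate_first is hypothetical, and the separator passed in is assumed to contain no backslashes, since awk -v interprets escape sequences.

```shell
# Hypothetical helper: move $1 to the end of the line, removing exactly one
# leading copy of OFS. The output separator is given as the first argument.
rotate_first() {
    sep=$1
    shift
    awk -v OFS="$sep" '{
        saved = $1
        $1 = ""                                   # $0 is rebuilt and now starts with OFS
        print substr($0, length(OFS) + 1), saved  # skip that leading OFS, append old $1
    }' "$@"
}

printf 'AE United Arab Emirates\n' | rotate_first ' | '
# prints: United | Arab | Emirates | AE
```

As with the original, assigning to $1 rebuilds the record, so runs of input separators are collapsed to a single OFS.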
If you're open to a Perl solution...
perl -lane 'print join " ",@F[1..$#F,0]' file
is a simple solution with an input/output separator of one space, which produces:
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
This next one is slightly more complex
perl -F'  ' -lane 'print join "  ",@F[1..$#F,0]' file
and assumes that the input/output separator is two spaces:
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
These command-line options are used:
-n loop around every line of the input file, do not automatically print every line
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace
-F autosplit modifier, in this example splits on '  ' (two spaces)
-e execute the following perl code
@F is the array of words in each line, indexed starting with 0
$#F is the index of the last element of @F (one less than the number of words)
@F[1..$#F] is an array slice of element 1 through the last element
@F[1..$#F,0] is an array slice of element 1 through the last element plus element 0
Let's shift every field one position to the left and put the original first field last:
$ awk '{a=$1; for (i=2; i<=NF; i++) $(i-1)=$i; $NF=a}1' file
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
Explanation
a=$1 save the first value into a temporary variable.
for (i=2; i<=NF; i++) $(i-1)=$i save the Nth field value into the (N-1)th field.
$NF=a save the first value ($1) into the last field.
{}1 true condition to make awk perform the default action: {print $0}.
This way, if you happen to have another field separator, the result is also good:
$ cat c
AE-United-Arab-Emirates
AG-Antigua-&-Barbuda
AN-Netherlands-Antilles
AS-American-Samoa
BA-Bosnia-and-Herzegovina
BF-Burkina-Faso
BN-Brunei-Darussalam
$ awk 'BEGIN{OFS=FS="-"}{a=$1; for (i=2; i<=NF; i++) $(i-1)=$i; $NF=a}1' c
United-Arab-Emirates-AE
Antigua-&-Barbuda-AG
Netherlands-Antilles-AN
American-Samoa-AS
Bosnia-and-Herzegovina-BA
Burkina-Faso-BF
Brunei-Darussalam-BN
The field separator in gawk (at least) can be a string as well as a character (it can also be a regex). If your data is consistent, then this will work:
awk -F "  " '{print $2,$1}' inputfile
That's two spaces between the double quotes.
awk '{ tmp = $1; sub(/^[^ ]+ +/, ""); print $0, tmp }'
Option 1
There is a solution that works with some versions of awk:
awk '{ $(NF+1)=$1;$1="";$0=$0;} NF=NF ' infile.txt
Explanation:
$(NF+1)=$1 # add a new field equal to field 1.
$1="" # erase the contents of field 1.
$0=$0;} NF=NF # force a re-calc of fields.
# and use NF to promote a print.
Result:
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
However that might fail with older versions of awk.
Option 2
awk '{ $(NF+1)=$1;$1="";sub(OFS,"");}1' infile.txt
That is:
awk '{ # call awk.
$(NF+1)=$1; # Add one trailing field.
$1=""; # Erase first field.
sub(OFS,""); # remove leading OFS.
}1' # print the line.
Note that what needs to be erased is the OFS, not the FS. The line gets re-calculated when the field $1 is assigned, which changes every run of FS to a single OFS.
But even that option still fails with several delimiters, as is clearly shown by changing the OFS:
awk -v OFS=';' '{ $(NF+1)=$1;$1="";sub(OFS,"");}1' infile.txt
That line will output:
United;Arab;Emirates;AE
Antigua;&;Barbuda;AG
Netherlands;Antilles;AN
American;Samoa;AS
Bosnia;and;Herzegovina;BA
Burkina;Faso;BF
Brunei;Darussalam;BN
That reveals that runs of FS are being changed to one OFS.
The only way to avoid that is to avoid the field re-calculation.
One function that can avoid re-calc is sub.
The first field could be captured, then removed from $0 with sub, and then both re-printed.
Option 3
awk '{ a=$1;sub("[^"FS"]+["FS"]+",""); print $0, a;}' infile.txt
a=$1 # capture first field.
sub( " # replace:
[^"FS"]+ # A run of non-FS
["FS"]+ # followed by a run of FS.
" , "" # for nothing.
) # Default to $0 (the whole line).
print $0, a # Print in reverse order, with OFS.
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN
Even if we change the FS, the OFS and/or add more delimiters, it works.
If the input file is changed to:
AE..United....Arab....Emirates
AG..Antigua....&...Barbuda
AN..Netherlands...Antilles
AS..American...Samoa
BA..Bosnia...and...Herzegovina
BF..Burkina...Faso
BN..Brunei...Darussalam
And the command changes to:
awk -vFS='.' -vOFS=';' '{a=$1;sub("[^"FS"]+["FS"]+",""); print $0,a;}' infile.txt
The output will be (still preserving delimiters):
United....Arab....Emirates;AE
Antigua....&...Barbuda;AG
Netherlands...Antilles;AN
American...Samoa;AS
Bosnia...and...Herzegovina;BA
Burkina...Faso;BF
Brunei...Darussalam;BN
The command could be expanded to several fields, but only with modern awks and with --re-interval option active. This command on the original file:
awk -vn=2 '{a=$1;b=$2;sub("([^"FS"]+["FS"]+){"n"}","");print $0,a,b;}' infile.txt
Will output this:
Arab Emirates AE United
& Barbuda AG Antigua
Antilles AN Netherlands
Samoa AS American
and Herzegovina BA Bosnia
Faso BF Burkina
Darussalam BN Brunei
There's a sed option too...
sed 's/\([^ ]*\) \(.*\)/\2 \1/' inputfile.txt
Explained...
Swap
\([^ ]*\) = Match anything until we reach a space, store as group 1
\(.*\) = Match everything else, store as group 2
With
\2 = Retrieve group 2
\1 = Retrieve group 1
More thoroughly explained...
s = Substitute (swap)
/ = Beginning of source pattern
\( = start storing this value
[^ ] = text not matching the space character
* = 0 or more of the previous pattern
\) = stop storing this value
\( = start storing this value
. = any character
* = 0 or more of the previous pattern
\) = stop storing this value
/ = End of source pattern, beginning of replacement
\2 = Retrieve the 2nd stored value
\1 = Retrieve the 1st stored value
/ = end of replacement
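The same swap can be sketched with extended regular expressions, which saves the backslashes. This variant anchors the match at the start of the line and accepts a run of spaces after the first field; it assumes a sed that supports -E (GNU and BSD sed both do). The function name swap_first is made up for this example.

```shell
# Hypothetical helper: move the first space-delimited field to the end of the line.
swap_first() {
    # group 1: the first field; then one or more spaces; group 2: the rest of the line
    sed -E 's/^([^ ]+) +(.*)$/\2 \1/' "$@"
}

printf 'AE  United Arab Emirates\nBF Burkina Faso\n' | swap_first
# prints:
# United Arab Emirates AE
# Burkina Faso BF
```

Lines without a space after the first field do not match and are printed unchanged.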
If you're open to another Perl solution:
perl -ple 's/^(\S+)\s+(.*)/$2 $1/' file
A first stab at it seems to work for your particular case.
awk '{ f = $1; sub(/^[A-Z][A-Z] +/, ""); print $0, f }'
Yet another way...
...this rejoins the fields 2 thru NF with the FS and outputs one line per line of input
awk '{for (i=2;i<=NF;i++){printf "%s", $i; if (i < NF) {printf "%s", FS};}printf "%s", RS}'
I use this with git to see what files have been modified in my working dir:
git diff| \
grep '\-\-git'| \
awk '{print$NF}'| \
awk -F"/" '{for (i=2;i<=NF;i++){printf "%s", $i; if (i < NF) {printf "%s", FS};}printf "%s", RS}'
Another easy way, which assumes every line has at most six fields (shorter lines get extra separators before the final field):
cat filename | awk '{print $2,$3,$4,$5,$6,$1}' > newfilename