I have the following sample text:
a b c
x_y_
d e f
x_y_
g h i
x_y_
k l m
x_y_
I need it to be formatted as follows:
x_y_ a b c
x_y_ d e f
x_y_ g h i
x_y_ k l m
Using sed, awk or something else in bash, how do we accomplish this?
Another awk:
$ awk 'NR%2==0{print $0,p}{p=$0}' file
Output:
x_y_ a b c
x_y_ d e f
x_y_ g h i
x_y_ k l m
Explained:
$ awk '
NR%2==0 { # on every even numbered record
print $0,p # output current record and previous
}{
p=$0 # buffer record for next round
}' file
Update:
In case of an odd number of records (mostly due to the peer pressure :), you also need to deal with the left-over last line, x y z here:
$ awk 'NR%2==0{print $0,p}{p=$0}END{if(NR%2)print p}' file
Output:
...
x_y_ g h i
x_y_ k l m
x y z
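To try the odd-record case without creating a file, here is a minimal sketch. The function name swap_pairs is only an illustrative choice, and it prints the buffered p in END rather than relying on $0 still being set there.

```shell
# swap_pairs: hypothetical helper; reads lines on stdin and prints each
# even-numbered line followed by the odd-numbered line before it.
swap_pairs() {
  awk '
    NR%2==0 { print $0, p }    # even line: current record, then the buffered one
    { p = $0 }                 # buffer every record for the next round
    END { if (NR%2) print p }  # odd count: emit the unpaired last line
  '
}

printf '%s\n' 'a b c' 'x_y_' 'd e f' 'x_y_' 'x y z' | swap_pairs
```

With this input the last line has no partner, so it should come out unchanged after the two swapped pairs.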
With sed:
sed -E 'N;s/(.*)\n(.*)/\2 \1/g' sample.txt
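A short pipeline (it assumes an even number of lines, because the pairs are formed starting from the end of the file):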
tac file | paste -d ' ' - - | tac
$ awk 'NR%2{s=$0; next} {print $0, s}' file
x_y_ a b c
x_y_ d e f
x_y_ g h i
x_y_ k l m
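1st solution: Read the input in paragraph mode (RS="") with newline as the field separator, so that every line becomes a field, then print the fields in swapped pairs. Tested with GNU awk.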
awk -v RS="" -v FS="\n" '{for(i=2;i<=NF;i+=2){printf("%s\n",$i OFS $(i-1))}}' Input_file
OR(with print):
awk -v RS="" -v FS="\n" '{for(i=2;i<=NF;i+=2){print $i,$(i-1)}}' Input_file
2nd solution: On every even-numbered line, print the current line followed by the previous one. If Input_file has an odd number of lines, the END block prints the last remaining line too (by checking whether the variable prev is still set).
awk 'prev && FNR%2==0{print $0 OFS prev;prev="";next} {prev=$0} END{if(prev){print prev}}' Input_file
Output will be as follows.
x_y_ a b c
x_y_ d e f
x_y_ g h i
x_y_ k l m
This might work for you (GNU sed):
sed '1~2{h;d};G;s/\n/ /' file
This saves each odd-numbered line in the hold space and deletes it from the pattern space; on each even-numbered line it appends the held line (G) and replaces the newline with a space.
Another variation:
sed -n 'h;n;G;s/\n/ /p' file
There are many more ways to achieve this, as can be seen by answers above.
How about this:
parallel -N2 echo "{2} {1}" :::: file
This uses GNU parallel.
I have data in format
A ((!(A1+A2)))
B (A1A2)
C (A1+A2)
D (!(A1+A2)B1)
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
I want output as
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
I want to get that line which has ! only with alphanumeric.
I used
awk -F' ' '$2 ~ /\!/' file
And this also
awk '$2 ~ /^[(!)_[:alnum:]]+[(!)+_[:alnum:]]+$/' file
But this lists every line that contains !, not only the lines where ! is attached to an alphanumeric.
I want to get that line which has ! only with alphanumeric:
You can just use this regex for matching ! followed by an alphanumeric character:
/![[:alnum:]]/
awk command:
awk '$2 ~ /![[:alnum:]]/' file
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
$ grep '![[:alpha:]]' file
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
$ awk '/![[:alpha:]]/' file
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
$ sed -n '/![[:alpha:]]/p' file
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
If that's not all you need, then edit your question to state your requirements more clearly and provide sample input/output that this doesn't work for.
I guess I want to get that line which has ! only with alphanumeric means: not !(
$ awk '/![^(]/' file
Output:
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
If your real data is not so tight, you may want to throw in some space checking, for example: /! *[^ (]/ (the space inside the brackets keeps "! (" from matching).
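The space-tolerant check can be tried on a few made-up lines. This is only a sketch: the helper name bang_before_operand and the spaced sample lines are assumptions, and the bracket expression excludes the space as well as "(" so that "! (" is not counted as a match.

```shell
# bang_before_operand: hypothetical helper; prints lines in which a "!"
# is followed, after optional spaces, by something other than "(".
bang_before_operand() {
  awk '/! *[^ (]/'
}

# Made-up sample lines with spaces after "!" (not from the original data).
printf '%s\n' 'A ((! (A1+A2)))' 'D (!(A1+A2)B1)' 'E (! A1+!A2)' | bang_before_operand
```

Only the E line should be printed, since in the A and D lines every ! is followed (with or without spaces) by an opening parenthesis.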
Also with awk:
awk '/![[:alpha:]]+/' file
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
Or
awk '/![[:alpha:]][[:digit:]]/' file
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
You can require a starting (leading) word boundary (\<, a GNU awk extension) right after !:
awk '$2 ~ /!\</' file
Note that whitespace is the default field separator, so you do not need -F' '.
See the online awk demo:
s='A ((!(A1+A2)))
B (A1A2)
C (A1+A2)
D (!(A1+A2)B1)
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)'
awk '$2 ~ /!\</' <<< "$s"
Output:
E (!A1+!A2)
F (!(!A1!A2)+A3+A4)
G ((A1!A2)B1)
I would like to know how to put a comma after one column (at one of the spaces). For example, given:
a b c d e
I would like this:
a b c d, e
That is, a comma at the 4th space.
I tried with this command.
awk -F '{print $4}' < file.txt | cut -d"," -f4-
$ awk '{$4=$4","}1' file
a b c d, e
If your Input_file has only 5 fields (or has more fields and you want to do this for the second-to-last one), the following may also help.
awk '{$(NF-1)=$(NF-1)","} 1' Input_file
Or with sed, simply replace the 4th space with a comma followed by a space:
sed 's/ /, /4' Input_file
echo a b c d e| awk '{$0=gensub(/ /,", ",4)}1'
a b c d, e
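If the field number should be a parameter, the field-assignment approach can be wrapped as below. This is a sketch under assumptions: comma_after_field is a made-up name, fields are whitespace-separated, and lines with fewer than N fields are passed through unchanged. Assigning to a field makes awk rebuild the line with single spaces between fields.

```shell
# comma_after_field N: hypothetical helper; appends a comma to field N
# of every line read from stdin (fields are whitespace-separated).
comma_after_field() {
  awk -v n="$1" '
    NF >= n { $n = $n "," }  # only touch lines that actually have field N
    1                        # print every line, modified or not
  '
}

echo 'a b c d e' | comma_after_field 4
```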
I have a text file in the following format; the letters are IDs separated by spaces.
OG1: A B C D E
OG2: C F G D R
OG3: A D F F F
I would like to randomly extract one id from each group as
OG1: E
OG2: D
OG3: A
I tried using
shuf -n 1 data.txt
which gives me
OG2: C F G D R
awk to the rescue!
$ awk -v seed=$RANDOM 'BEGIN{srand(seed)} {print $1,$(rand()*(NF-1)+2)}' file
OG1: D
OG2: F
OG3: F
To skip a certain letter (C here), you can change the main block to
... {while ((r=$(rand()*(NF-1)+2))=="C"); print $1,r}' file
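The index arithmetic is easier to check when the rounding down is written out. The sketch below assumes a helper name, pick_one, that is not part of the answer; it takes the seed as an argument so a run can be repeated.

```shell
# pick_one SEED: hypothetical helper; for each "GROUP: id id ..." line on
# stdin, prints the group label and one randomly chosen id.
pick_one() {
  awk -v seed="$1" '
    BEGIN { srand(seed) }
    {
      # rand() is in [0,1), so int(rand()*(NF-1)) is 0..NF-2 and i is 2..NF
      i = int(rand() * (NF - 1)) + 2
      print $1, $i
    }
  '
}

# Falls back to the shell PID as seed where $RANDOM is not available.
printf '%s\n' 'OG1: A B C D E' 'OG2: C F G D R' | pick_one "${RANDOM:-$$}"
```

Field 1 is the group label, so the usable ids are fields 2 through NF, which is exactly the range the index covers.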
perl -lane 'print "$F[0] ".$F[rand($#F)+1]' data.txt
Explanation:
These command-line options are used:
-n loop around each line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace.
-e execute the perl code
@F is the array of words in each line, indexed starting with $F[0]
$#F is the index of the last element of @F (one less than the number of words)
output:
OG1: A
OG2: F
OG3: F
I'm trying to fetch the data in columns B to D from a tab-delimited file "FILE". The simple AWK code I use fetches the data, but unfortunately it puts the output in a single column and removes the identifiers (shown below).
Any suggestions, please?
CODE
awk '{for(i=2;i<=4;++i)print $i}' FILE
FILE
A B C D E F G
1_at 10.8435630935 10.8559287854 8.6666141543 8.820310681 9.9024050571 8.613199083 11.9807771094
2_at 4.7615531106 4.5209119307 11.2467919586 8.8105151099 7.1831990104 11.0645055836 4.3726598561
3_at 6.0025262754 5.4058080843 3.2475272982 3.1869728585 3.5654989547
OUTPUT OBTAINED
B
C
D
10.8435630935
10.8559287854
8.6666141543
4.7615531106
4.5209119307
11.2467919586
6.0025262754
5.4058080843
3.2475272982
Why don't you directly use cut?
$ cut -d$'\t' -f2-4 < file
B C D
10.8435630935 10.8559287854 8.6666141543
4.7615531106 4.5209119307 11.2467919586
6.0025262754 5.4058080843 3.2475272982
With awk you would need printf to avoid the newline that print adds after each field:
awk -F"\t" '{for(i=2;i<=4;++i) printf "%s%s", $i, (i==4?RS:FS)}' FILE
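The same printf idea works for any column range. In this sketch the helper name cols_range and the sample row are assumptions; it uses OFS and ORS for the output separators instead of reusing FS and RS.

```shell
# cols_range FIRST LAST: hypothetical helper; prints tab-separated
# columns FIRST..LAST of stdin, one output row per input row.
cols_range() {
  awk -F'\t' -v OFS='\t' -v a="$1" -v b="$2" '{
    for (i = a; i <= b; i++)
      printf "%s%s", $i, (i == b ? ORS : OFS)  # tab between fields, newline after the last
  }'
}

# Made-up rows, only to show the shape of the output.
printf 'A\tB\tC\tD\tE\nr1\t1\t2\t3\t4\n' | cols_range 2 4
```

Each input row should produce exactly one output row holding columns 2 to 4, still separated by tabs.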
I need to split a single column of data in a large file into two columns as follows:
A
B         B A
C  ---->  D C
D         F E
E         H G
F
G
H
Is there an easy way of doing it with unix shell commands and/or small shell script? awk?
$ awk 'NR%2{s=$0;next} {print $0,s}' file
B A
D C
F E
H G
You can use the following awk script:
awk 'NR % 2 != 0 {cache=$0}; NR % 2 == 0 {print $0, cache}' data.txt
Output:
B A
D C
F E
H G
It caches each odd line and, on each even line, prints that line followed by the cached one.
I know this is tagged awk, but I just can't stop myself from posting a sed solution, since the question left it open for "easy way . . . with unix shell commands":
$ sed -n 'h;n;G;s/\n/ /g;p' data.txt
B A
D C
F E
H G
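When every line holds a single token with no spaces, as in the question, paste can do the pairing and awk only has to swap the two fields. This is a sketch under that assumption; swap_columns is a made-up name, and a left-over line from an odd count is printed on its own.

```shell
# swap_columns: hypothetical helper; pairs consecutive stdin lines with
# paste, then prints the second of each pair before the first.
# Assumes each input line is one token without whitespace.
swap_columns() {
  paste -d ' ' - - | awk '
    NF == 2 { print $2, $1; next }  # complete pair: swap the order
    { print $1 }                    # odd count: last line has no partner
  '
}

printf '%s\n' A B C D E | swap_columns
```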