Awk print different character on seperate line - awk

I have been search aorund and could not find the answer... wonder if anyone can help here.
suppose I have a file contents the following:
File1:
name Joe
day Wednesday
lunch was fish
name John
dinner pie
day tuesday
lunch was noodles
name Mary
day Friday
lunch was fish pie
I wanted to grep and print only their name and what they had for lunch.
I suppose i can do
cat file1 | grep -iE 'name|lunch'
but what if i want to do a awk to just have their name and food like this output below?
Joe
fish
John
noodles
Mary
fish pie
I am aware to use awk to print, but this may require awk, is it possible for awk to lets say print $2 on one line, and print $3 on another?
Can I also output it in this format:
Person food
Joe fish
John noodles
Mary fish pie
Thanks

You can for example say:
$ awk '/name/ {print $2} /lunch/ {$1=$2=""; print}' file
Joe
fish
John
noodles
Mary
fish pie
Or remove the lunch was text:
awk '/name/ {print $2} /lunch/ {gsub("lunch was ",""); print}' file
To make the output in two columns:
$ awk -v OFS="\t" '/name/ {name=$2} /lunch/ {gsub("lunch was ",""); print name, $0}' a
Joe fish
John noodles
Mary fish pie

awk
with awk you can do it in one shot,
awk -v RS="" '{n=$2;sub(/.*lunch was\s*/,"");print n,$0}' file
Note that with this one-liner, the format of your input file should be fixed. Your data should be stored in data blocks and lunch was line should be at the end of each data block.
test with your example:
kent$ awk -v RS="" '{n=$2;sub(/.*lunch was\s*/,"");print n,$0}' file
Joe fish
John noodles
Mary fish pie
grep & sed
also you can do it in two steps, grep the values out, and merge lines
grep -Po 'name\s*\K.*|lunch was\s*\K.*' file|sed 'N;s/\n/ /'
with your input file, it outputs:
kent$ grep -Po 'name\s*\K.*|lunch was\s*\K.*' file|sed 'N;s/\n/ /'
Joe fish
John noodles
Mary fish pie

Related

awk: how do you ensure consistent column numbering when there are blank columns?

my_file is like this:
SELECTED NAME AGE
* adam 30
bob 70
I'd like to output:
adam
bob
however, if I try: cat my_file|awk '{print $2}' it outputs
NAME
adam
70
Any suggestions on how you get awk to account for a blank column?
with gawk field widths
$ awk -v FIELDWIDTHS='11 8 3' '{print $2}' file
NAME
adam
bob
Could you please try following.
awk '{printf("%s%s",/^ +/?$1:$2,ORS)}' Input_file
Output will be as follows.
NAME
adam
bob

Sed replace nth column of multiple tsv files without header

Here are multiple tsv files, where I want to add 'XX' characters only in the second column (everywhere except in the header) and save it to this same file.
Input:
$ls
file1.tsv file2.tsv file3.tsv
$head -n 4 file1.tsv
a b c
James England 25
Brian France 41
Maria France 18
Ouptut wanted:
a b c
James X1_England 25
Brian X1_France 41
Maria X1_France 18
I tried this, but the result is not kept in the file, and a simple redirection won't work:
# this works, but doesn't save the changes
i=1
for f in *tsv
do awk '{if (NR!=1) print $2}’ $f | sed "s|^|X${i}_|"
i=$((i+1))
done
# adding '-i' option to sed: this throws an error but would be perfect (sed no input files error)
i=1
for f in *tsv
do awk '{if (NR!=1) print $2}’ $f | sed -i "s|^|T${i}_|"
i=$((i+1))
done
Some help would be appreciated.
The second column is particularly easy because you simply replace the first occurrence of the separator.
for file in *.tsv; do
sed -i '2,$s/\t/\tX1_/' "$file"
done
If your sed doesn't recognize the symbol \t, use a literal tab (in many shells, you type it with ctrlv tab.) On *BSD (and hence MacOS) you need -i ''
AWK solution:
awk -i inplace 'BEGIN { FS=OFS="\t" } NR!=1 { $2 = "X1_" $2 } 1' file1.tsv
Input:
a b c
James England 25
Brian France 41
Maria France 18
Output:
a b c
James X1_England 25
Brian X1_France 41
Maria X1_France 18

Awk Scripting printf ignoring my sort command

I am trying to run a script that I have set up but when I go to sort the contents and display the text the content is printed but the sort command is ignored and the information is just printed. I tried this code format using awk and the sort function is ignored but I am not sure why.
Command I tried:
sort -t, -k4 -k3 | awk -F, '{printf "%-18s %-27s %-15s %s\n", $1, $2, $3, $4 }' c_list.txt
The output I am getting is:
Jim Girv 199 pathway rd Orlando FL
Megan Rios 205 highwind dr Sacremento CA
Tyler Scott 303 cross st Saint James NY
Tim Harding 1150 Washton ave Pasadena CA
The output I need is:
Tim Harding 1150 Washton ave Pasadena CA
Megan Rios 205 highwind dr Sacremento CA
Jim Girv 199 pathway rd Orlando FL
Tyler Scott 303 cross st Saint James NY
It just ignores the sort command but still prints the info I need in the format from the file.
I need it to sort based off the fourth field first the state and the third field next the town then display the information.
An example where each field is separated by a comma.
Field 1 Field 2 Field 3 Field 4
Jim Girv, 199 pathway rd, Orlando, FL
The problem is you're doing sort | awk 'script' file instead of sort file | awk 'script' so sort is sorting nothing and consequently producing no output while awk is operating on your original file and so producing output from that. You should have noticed that your sort command is hanging too for lack of input and you should have mentioned that in your question.
To demonstrate:
$ cat file
c
b
a
$ sort | awk '1' file
c
b
a
$ sort file | awk '1'
a
b
c

Exclude characters using awk

I am trying to find a way to exclude numbers on a file when I cat ti but I only want to exclude the numbers on print $1 and I want to keep the number that is in front of the word. I have something that I thought might might work but is not quite giving me what I want. I have also showed an example of what the file looks like.The file is separated by pipes.
cat files | awk -F '|' ' {print $1 "\t" $2}' |sed 's/0123456789//g'
input:
b1ark45 | dog23 | brown
m2eow66| cat24 |yellow
h3iss67 | snake57 | green
Output
b1ark dog23
m2eow cat24
h3iss nake57
try this:
awk -F'|' -v OFS='|' '{gsub(/[0-9]/,"",$1)}7' file
the output of your example would be:
bark | dog23 | brown
meow| cat24 |yellow
hiss | snake57 | green
EDIT
this outputs col1 (without ending numbers and spaces) and col2, separated by <tab>
kent$ echo "b1ark45 | dog23 | brown
m2eow66| cat24 |yellow
h3iss67 | snake57 | green"|awk -F'|' -v OFS='\t' '{gsub(/[0-9]*\s*$/,"",$1);print $1,$2}'
b1ark dog23
m2eow cat24
h3iss snake57
This might work for you (GNU sed):
sed -r 's/[0-9]*\s*\|\s*(\S*).*/ \1/' file

Substitute in only field 3

What's wrong with my syntax here?
awk -F '|' 'sub/\s*\w*/,"Visit our website!","$3"' merchant_report
it's suppose to turn
|bob|jones| blagblag| texas
|tom|markus| | alabama
into
|bob|jones|Visit our website!| texas
|tom|markus| | alabama
this line may do what you want:
awk -F'|' -v OFS="|" 'NR==1{$4="Visit our website!"}1' file
in your awk codes:
you have had FS to separate the fields, you don't need the sub func., just set $3 directly.
even if with sub( ) function, your syntax is not correct. you can get detail info by man gawk
it is actually not $3, it is $4. because your line starting with |
if you want to just change the first line, you should add NR==1 otherwise awk will do the change on all lines
example with the code:
kent$ cat file
|bob|jones| blagblag| texas
|tom|markus| | alabama
kent$ awk -F'|' -v OFS="|" 'NR==1{$4="Visit our website!"}1' file
|bob|jones|Visit our website!| texas
|tom|markus| | alabama
In awk you would just assign the field the new value for the given line. If you are more comfortable with the substitution approach try sed:
sed '1s/|[^|]*/|Visit our website!/3' file
|bob|jones|Visit our website!| texas
|tom|markus| | alabama