Get only part of a file name in Awk

I have tried
awk '{print FILENAME}'
and the result was the full path of the file.
I want only the file name. For example, from "test/testing.test.txt" I want just "testing", without ".test.txt".

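Use -F to split on the period and print the first field:
awk -F'.' '{ print $1 }'
Note that this splits each input line, so it expects the file names as input, and it keeps any directory part ("test/testing.test.txt" becomes "test/testing").
Alternatively,
ls -l | awk '{ print $9 }' | awk -F"." '{ print $1 }'
will run through the whole folder
(there's a fancier way to do it, but that's easy).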

Use the sub and/or split functions to extract the part of FILENAME you want.
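As a minimal sketch of that suggestion: split FILENAME on "/" to drop the directory, then use sub to cut everything from the first period on. The temporary directory and the variable names (tmpdir, result, parts, name) are assumptions made for illustration; the path mirrors the example in the question.

```shell
# Throwaway file mirroring the example path from the question (hypothetical setup).
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/test"
echo 'some content' > "$tmpdir/test/testing.test.txt"

result=$(awk 'FNR == 1 {
    n = split(FILENAME, parts, "/")   # parts[n] is the last path component
    name = parts[n]
    sub(/\..*/, "", name)             # remove everything from the first period on
    print name
}' "$tmpdir/test/testing.test.txt")

echo "$result"
rm -rf "$tmpdir"
```

FNR == 1 makes the name print once per file rather than once per line. Splitting on "/" first matters, because a period in a directory name would otherwise cut the result short.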

Related

How to extract word from a string that may/may not start with a single quote

Sample string:
'kernel-rt|kernel-alt|/kernel-' 'headers|xen|firmware|tools|python|utils'
cut -d' ' -f 1 string.txt gives me
'kernel-rt|kernel-alt|/kernel-'
But how do we proceed further to get just the 'kernel' from it?
Assuming you want only the third kernel (the one in /kernel-) and not the others:
'kernel-rt|kernel-alt|/kernel-' 'headers|xen|firmware|tools|python|utils'
Here is how to extract it with a single awk command (this needs gawk, the default awk on many Linux distributions).
input="kernel-rt|kernel-alt|/kernel-' 'headers|xen|firmware|tools|python|utils"
echo "$input" | awk -F"|" '{split($3,a,"-");match(a[1],"[[:alnum:]]+",b);print b[0]}'
Explanation:
-F"|" sets the field separator to |, so that only the 3rd field needs to be examined
split($3,a,"-") splits the 3rd field on -, assigning the left part to a[1]
match(a[1],"[[:alnum:]]+",b) extracts the first run of alphanumeric characters from a[1] into b[0] (the three-argument form of match is a gawk extension)
print b[0] outputs the matched string
To extract kernel from the 2nd or 1st field instead, change $3 to $2 or $1.
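For awk implementations without the three-argument match, the same idea can be sketched with the RSTART and RLENGTH variables that the two-argument match sets. This is only an illustration: the variable name word is made up here, and [A-Za-z0-9] stands in for [[:alnum:]] on the assumption that some older awks lack POSIX character classes.

```shell
input="'kernel-rt|kernel-alt|/kernel-' 'headers|xen|firmware|tools|python|utils'"

word=$(printf '%s\n' "$input" | awk -F'|' '{
    split($3, a, "-")                       # a[1] is "/kernel"
    if (match(a[1], /[A-Za-z0-9]+/))        # sets RSTART and RLENGTH on success
        print substr(a[1], RSTART, RLENGTH)
}')

echo "$word"
```

Quoting "$input" in the pipeline keeps the shell from word-splitting or glob-expanding the string before awk sees it.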
$ cat file
'kernel-rt|kernel-alt|/kernel-' 'headers|xen|firmware|tools|python|utils'
$
$ awk '{print $1}' file
'kernel-rt|kernel-alt|/kernel-'
$
$ awk '{gsub(/\047/,"",$1); print $1}' file
kernel-rt|kernel-alt|/kernel-
$
$ awk '{gsub(/\047/,""); split($1,f,/[|]/); print f[1]}' file
kernel-rt
and just to make you think...
$ awk '{gsub(/\047|\|.*/,"")}1' file
kernel-rt

Extract fields from logs with awk and aggregate them for a new command

I have this kind of log:
2018-10-05 09:12:38 286 <190>1 2018-10-05T09:12:38.474640+00:00 app web - - Class uuid=uuid-number-one cp=xxx action='xxxx'
2018-10-05 10:11:23 286 <190>1 2018-10-05T10:11:23.474640+00:00 app web - - Class uuid=uuid-number-two cp=xxx action='xxxx'
I need to extract uuid and run a second query with:
./getlogs --search 'uuid-number-one OR uuid-number-two'
For the moment for the first query I do this to extract uuid:
./getlogs | grep 'uuid' | awk 'BEGIN {FS="="} { print $2 }' | cut -d' ' -f1
My three questions:
1. I think I could get rid of grep and cut and use only awk?
2. How could I capture only the value of uuid? I tried awk '/uuid=\S*/{ print $1 }' and awk 'BEGIN {FS="uuid=\\S*"} { print $1 }', but both failed.
3. How could I aggregate the results into one shell variable that I can then use for the new command?
Question 1:
You could define two field separators, = and space:
$ awk -F'[= ]' '/uuid/{print $12}' file
Result:
uuid-number-one
uuid-number-two
Question 2:
The pattern part in awk just selects lines to process. It doesn't change the internal variables like $1 or NF. You need to do the replacement afterwards:
$ awk '/uuid=/{print gensub(/.*uuid=(\S*).*/, "\\1", "")}' file
Question 3:
var=$(awk -F'[= ]' '/uuid/{r=r","$12}END{print substr(r,2)}' file)
Accumulate the values line by line (here with r=r","$12) and print the result in the END block.
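Since the second query in the question joins the values with OR rather than a comma, here is a hedged sketch of the same accumulate-then-print idea with a different separator. It uses the POSIX two-argument match instead of field positions; the variable names (log, search, out, sep) are made up for illustration, and the two sample lines stand in for the output of ./getlogs.

```shell
# Two sample lines from the question stand in for the ./getlogs output.
log="2018-10-05 09:12:38 286 <190>1 2018-10-05T09:12:38.474640+00:00 app web - - Class uuid=uuid-number-one cp=xxx action='xxxx'
2018-10-05 10:11:23 286 <190>1 2018-10-05T10:11:23.474640+00:00 app web - - Class uuid=uuid-number-two cp=xxx action='xxxx'"

search=$(printf '%s\n' "$log" | awk 'match($0, /uuid=[^ ]*/) {
    out = out sep substr($0, RSTART + 5, RLENGTH - 5)   # skip the 5 characters of "uuid="
    sep = " OR "                                        # empty before the first value
} END { print out }')

echo "$search"
# The second query would then be: ./getlogs --search "$search"
```

Because sep is still empty when the first value is appended, no leading separator has to be stripped afterwards.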
Could you please try the following (tested on the shown samples in a Bash environment).
awk 'match($0,/uuid=[^ ]*/){print substr($0,RSTART+5,RLENGTH-5)}' Input_file
Solution 2: if your uuid values contain no spaces, use the following.
awk '{sub(/.*uuid=/,"");sub(/ .*/,"")} 1' Input_file
Solution 3: using sed (again assuming the uuid values contain no spaces).
sed 's/\(.*uuid=\)\([^ ]*\)\(.*\)/\2/' Input_file
Solution 4: using awk field separators, for the shown samples.
awk -F'uuid=| cp' '{print $2}' Input_file
To concatenate all values into a shell variable, use the following.
shell_var=$(awk 'match($0,/uuid=[^ ]*/){val=val?val OFS substr($0,RSTART+5,RLENGTH-5):substr($0,RSTART+5,RLENGTH-5)} END{print val}' Input_file)

awk to parse field by using period and output unique digits

I am trying to use awk to split $2 at the first . in the string and output the digits, with the header row above them. The current output is close, but both commands seem to be taking $1 as well. Do I need to specify something in the command so that it only prints the digits in $2? Thank you :).
file
R_2016_09_20_12_47
IonXpress_007 16-0001.xxx.xxx
IonXpress_008 16-0002.xxx.xxx
IonXpress_009 16-0003.xxx.xxx
R_2016_09_20_12_46
IonXpress_007 16-0004.xxx.xxx
IonXpress_008 16-0005.xxx.xxx
IonXpress_009 16-0006.xxx.xxx
desired output
R_2016_09_20_12_47
16-0001
16-0002
16-0003
R_2016_09_20_12_46
16-0004
16-0005
16-0006
awk
awk -F. '{print $1}' file
cut
cut -d'.' -f1 file
current output
R_2016_09_20_12_47
IonXpress_007 16-0001
IonXpress_008 16-0002
IonXpress_009 16-0003
R_2016_09_20_12_46
IonXpress_007 16-0004
IonXpress_008 16-0005
IonXpress_009 16-0006
Try this:
% awk -F'[ .]' '{print $2 ? $2 : $1}' file
R_2016_09_20_12_47
16-0001
16-0002
16-0003
R_2016_09_20_12_46
16-0004
16-0005
16-0006
NOTE
I take both space and . as field separators.
I use the ternary operator to print $2 when it is non-empty, and $1 otherwise.
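One caveat with testing $2 directly is that a second field of 0 would count as false. A possible variant, sketched here on a shortened copy of the sample data, tests the field count NF instead. The variable names data and parsed are illustrative only.

```shell
data='R_2016_09_20_12_47
IonXpress_007 16-0001.xxx.xxx
IonXpress_008 16-0002.xxx.xxx'

# Header rows have a single field; data rows have more once split on space and period.
parsed=$(printf '%s\n' "$data" | awk -F'[ .]' '{ print (NF > 1 ? $2 : $1) }')

echo "$parsed"
```

The parentheses around the ternary expression keep the print statement unambiguous across awk implementations.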

Awk print string with variables

How do I print a string with variables?
I am trying this:
awk -F ',' '{printf /p/${3}_abc/xyz/${5}_abc_def/}' file
but I need this output:
/p/APPLE_abc/xyz/MANGO_abc_def/
where ${3} = APPLE
and ${5} = MANGO
printf allows interpolation of variables. With this as the test file:
$ cat file
a,b,APPLE,d,MANGO,f
We can use printf to achieve the output you want as follows:
$ awk -F, '{printf "/p/%s_abc/xyz/%s_abc_def/\n",$3,$5;}' file
/p/APPLE_abc/xyz/MANGO_abc_def/
In printf, the string %s means insert-a-variable-here-as-a-string. We have two occurrences of %s, one for $3 and one for $5.
Not as readable, but the printf isn't necessary here. Awk concatenates adjacent strings and fields, so you can place the fields between quoted string literals.
$ cat file.txt
1,2,APPLE,4,MANGO,6,7,8
$ awk -F, '{print "/p/" $3 "_abc/xyz/" $5 "_abc_def/"}' file.txt
/p/APPLE_abc/xyz/MANGO_abc_def/
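The two approaches can be checked side by side. This is only a small sketch on the first sample row; the variable names line, with_printf and with_print are made up for the comparison.

```shell
line='a,b,APPLE,d,MANGO,f'

# printf needs an explicit \n; fields are substituted for each %s in order.
with_printf=$(printf '%s\n' "$line" | awk -F, '{ printf "/p/%s_abc/xyz/%s_abc_def/\n", $3, $5 }')

# print appends the newline itself; adjacent strings and fields are concatenated.
with_print=$(printf '%s\n' "$line" | awk -F, '{ print "/p/" $3 "_abc/xyz/" $5 "_abc_def/" }')

echo "$with_printf"
echo "$with_print"
```

Both commands produce the same line, so the choice is mostly about readability.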

use awk to print a column, adding a comma

I have a file, from which I want to retrieve the first column, and add a comma between each value.
Example:
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad
to obtain
AAAA,BBBB,CCCC
I decided to use awk, so I did
awk '{ $1=$1","; print $1 }'
The problem is that this also adds a comma after the last value, which is not what I want, and I also get a space between values.
How do I remove the comma after the last element, and how do I remove the space? I spent 20 minutes looking at the manual without luck.
$ awk '{printf "%s%s",sep,$1; sep=","} END{print ""}' file
AAAA,BBBB,CCCC
or if you prefer:
$ awk '{printf "%s%s",(NR>1?",":""),$1} END{print ""}' file
AAAA,BBBB,CCCC
or if you like golf and don't mind it being inefficient for large files:
$ awk '{r=r s $1;s=","} END{print r}' file
AAAA,BBBB,CCCC
awk '{print $1","$2","$3}' file_name
This is the shortest I know.
Why make it complicated :) (as long as the file is not too large)
awk '{a=NR==1?$1:a","$1} END {print a}' file
AAAA,BBBB,CCCC
For better portability:
awk '{a=(NR>1?a",":"")$1} END {print a}' file
You can do this:
awk 'a++{printf ","}{printf "%s", $1}' file
a++ is interpreted as a condition. In the first row its value is 0, so the comma is not added.
EDIT:
If you want a newline, you have to add END{printf "\n"}. If you have problems reading in the file, you can also try:
cat file | awk 'a++{printf ","}{printf "%s", $1}'
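Putting the a++ condition and the END newline together might look like the sketch below. The wrapper function join_first_column and the variable names are hypothetical, added only so the command can be reused on a file or on standard input.

```shell
data='AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad'

# Hypothetical helper: reads the named files, or standard input if none are given.
join_first_column() {
    awk 'a++ { printf "," } { printf "%s", $1 } END { printf "\n" }' "$@"
}

joined=$(printf '%s\n' "$data" | join_first_column)
echo "$joined"
```

a++ evaluates to 0 on the first record and to a positive number afterwards, so the comma is printed before every value except the first.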
awk 'NR==1{printf "%s",$1;next;}{printf "%s%s",",",$1;}' input.txt
It says: if it is the first line, print only the first field; for the other lines, print , first and then the first field.
Output:
AAAA,BBBB,CCCC
In this case, a simple cut and paste solution:
cut -d" " -f1 file | paste -s -d, -
In case somebody like me wants to use awk for cleaning Docker images:
docker image ls | grep tag_name | awk '{print $1":"$2}'
Surprised that no one is using OFS (output field separator). Here is probably the simplest solution that sticks with awk and works on Linux and Mac: use "-v OFS=," to output with a comma as the delimiter:
$ echo '1:2:3:4' | awk -F: -v OFS=, '{print $1, $2, $4, $3}'
generates:
1,2,4,3
It works for multi-character separators too:
$ echo '1:2:3:4' | awk -F: -v OFS=., '{print $1, $2, $4, $3}'
outputs:
1.,2.,4.,3
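One detail worth knowing: OFS is applied when print receives comma-separated arguments, as above, or when awk rebuilds the record after a field assignment. A plain print of an unmodified record keeps the original separators. A small sketch of the difference, with illustrative variable names rebuilt and untouched:

```shell
# Assigning a field to itself forces awk to rebuild $0 using OFS.
rebuilt=$(echo '1:2:3:4' | awk -F: -v OFS=, '{ $1 = $1; print }')

# Without any field assignment, $0 is printed exactly as it was read.
untouched=$(echo '1:2:3:4' | awk -F: -v OFS=, '{ print }')

echo "$rebuilt"
echo "$untouched"
```

The $1 = $1 idiom is handy when every field should be kept and only the delimiter should change.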
Using Perl
$ cat group_col.txt
AAAA 12345 xccvbn
BBBB 43431 fkodks
CCCC 51234 plafad
$ perl -lane ' push(@x,$F[0]); END { print join(",",@x) } ' group_col.txt
AAAA,BBBB,CCCC
$
This can be as simple as:
awk -F',' '{print $1","$1","$2","$3}' inputFile
where the input file contains:
1,2,3
2,3,4
etc.
I used the following because it lists the api-resource names along with the resources, which is useful if you want to access them directly. I also use the label "application" to find specific apps in a namespace:
kubectl -n ops-tools get $(kubectl api-resources --no-headers=true --sort-by=name | awk '{printf "%s%s",sep,$1; sep=","}') -l app.kubernetes.io/instance=application