awk split with asterisk - awk

I am trying to split a variable as follows. Is there an efficient way to do this, preferably using awk?
echo 262146*10,69636*32 |awk -F, 'split($1, DCAP,"\\*") {print DCAP[1]}; split($2, DCAP,"\\*"){print DCAP[1]}'

echo '262146*10,69636*32' | awk -F '[,*]' '{print $1; print $3}'
or
echo '262146*10,69636*32' | awk -F '[,*]' '{printf("%d\n%d\n",$1,$3)}'
Output:
262146
69636

If you have a longer sequence you could try:
echo '262146*10,69636*32,10*3' | awk 'BEGIN {FS="*"; RS=","} {print $1}'
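The same result can also be had with an explicit loop over the comma-separated pairs, which keeps the split() call from the question. A minimal sketch, assuming any POSIX awk; the function name first_parts is made up for illustration, and the separator is written as the bracket expression [*] so the asterisk cannot be read as a regex operator:

```shell
# first_parts is a hypothetical helper name, not part of awk.
first_parts() {
    awk -F, '{
        for (i = 1; i <= NF; i++) {
            split($i, pair, "[*]")   # [*] matches a literal asterisk
            print pair[1]            # the part before the asterisk
        }
    }'
}

echo '262146*10,69636*32,10*3' | first_parts
```

This prints 262146, 69636 and 10 on separate lines, however many pairs the input holds.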

Related

Slow process in code

Can you help me make this code faster? With 50000 lines in my file it takes a long time.
I appreciate your help.
input
17/11/27 03:13:50:480000
17/11/27 03:12:54:380000
17/11/27 03:14:39:980000
output
1195787648480000
1195787592380000
1195787697980000
my code
ts=$(date -d'01/06/1980 00:00:00' +%s)
lap=18
cat file |
while read tt
do
dt=`echo $tt | awk '{print $1}' | awk -F"/" '{print $2"/"$3"/"$1}'`
tm=`echo $tt | awk '{print substr($2,1,8)}'`
ms=`echo $tt | awk '{print $2}' | awk -F":" '{print $NF}'`
line=`echo $dt" " $tm`
echo $line\ $(date -d "${line/// }" "+%s") |
awk '{print (($3 - '$ts') + '$lap')'$ms'}'
done
Please, help me to improve my code to get results faster.
Many thanks.
With single GNU awk process:
awk -F'[[:space:]]*|/|:' -v ts=$(date -d'01/06/1980 00:00:00' +%s) -v lap=18 '{
print (mktime(sprintf("20%d %d %d %d %d %d",$1,$2,$3,$4,$5,$6)) - ts)+lap $NF
}' file
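The output (each value is 3600 seconds higher than the expected one, presumably because mktime uses the local time zone of the machine it ran on):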
1195791248480000
1195791192380000
1195791297980000
Enjoy )
similar with gawk
$ awk -F'[/: ]' -v ts=$(date -d'01/06/1980' +%s) \
-v lap=18 '{ms=$NF; $NF=""; d=sprintf(20$0);
print mktime(d)+lap-ts ms}' file
1195787648480000
1195787592380000
1195787697980000
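Since mktime() interprets its argument in the local time zone, pinning TZ makes the result reproducible. A sketch under several assumptions: an awk that provides mktime() (gawk does; it is not part of POSIX awk), input timestamps meant as UTC, the constant 315964800 (the Unix time of 1980-01-06 00:00:00 UTC) in place of the date call, and a made-up function name to_offset:

```shell
# to_offset is a hypothetical helper name; it needs an awk with mktime().
to_offset() {
    TZ=UTC awk -F'[/: ]' -v ts=315964800 -v lap=18 '{
        # Fields: yy mm dd HH MM SS fraction
        secs = mktime("20" $1 " " $2 " " $3 " " $4 " " $5 " " $6)
        # Seconds since ts plus lap, with the fraction appended as text
        printf "%d%s\n", secs - ts + lap, $7
    }'
}

echo '17/11/27 03:13:50:480000' | to_offset
```

For the first input line this prints 1195787648480000, the value the question expects.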

AWK: Apply filter only if field separator is present

I was surprised to find that when you do this:
echo "hello" | awk -F'|' '{print $1;}'
you get:
hello
How can I return nothing when the field separator '|' is absent from the line?
I do this to extract dates at the beginning of log lines, but some lines don't start with a date and then give me this problem. Thanks, I am quite new to awk.
You can do this
echo "hello" | awk -F'|' 'NF>1 {print $1}'
echo "hello|1" | awk -F'|' 'NF>1 {print $1}'
hello
Only when you have more than one field, return the first field
On a file
cat testing
record1|val1
record2|val2
record3
record4|val4
awk -F'|' 'NF>1 {print $1}' testing
record1
record2
record4
Alternatively, you could use
awk -F'|' '$1!=$0 {print $1}'
If no separator is present, field one contains the whole line, so $1 equals $0 and the line is skipped.
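Another way to state the condition is to test for the separator directly with index(), which returns 0 when the character does not occur in the line. A small sketch using the contents of the sample file above; the function name first_if_sep is invented here:

```shell
# first_if_sep is a hypothetical helper name.
first_if_sep() {
    # index() returns the position of "|" in the line, or 0 if absent
    awk -F'|' 'index($0, "|") {print $1}'
}

printf 'record1|val1\nrecord2|val2\nrecord3\nrecord4|val4\n' | first_if_sep
```

This prints record1, record2 and record4, and skips record3.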

Convert a decimal data field to hexadecimal using sed or awk

Can someone correct this command so that it produces the desired output?
Input: "1|2|30|4"
echo "1|2|30|4" | awk -F, -v OFS=| '{print $1,$2; printf "%04X", $3; print $4}'
Expected output:
1|2|001E|4
Best regards.
$ echo "1|2|30|4" |
awk -F'|' -v OFS='|' '{print $1, $2, sprintf("%X", $3), $4}'
1|2|1E|4
echo "1|2|30|4" | awk -F"|" '{printf "%s|%s|%04X|%s\n", $1, $2, $3, $4}'
Output:
1|2|001E|4
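If the record has many fields, reassigning the one field and letting awk rebuild the line avoids listing every field in the format string. A sketch, with dec3_to_hex as a made-up name:

```shell
# dec3_to_hex is a hypothetical helper name.
dec3_to_hex() {
    # Assigning to $3 makes awk rebuild the record using OFS
    awk 'BEGIN {FS = OFS = "|"} {$3 = sprintf("%04X", $3); print}'
}

echo "1|2|30|4" | dec3_to_hex
```

This prints 1|2|001E|4.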

multiple field separator in awk

I'm trying to process an input which has two field separators, ; and space. I'm able to parse the input with one separator using:
echo "10.23;7.15;6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";"} {print $1,$2,$3}'
10.23;7.15;6.23
For an input with two separators, I tried this, but it doesn't split on both of them:
echo "10.23;7.15 6.23" | awk -v OFMF="%0.2f" 'BEGIN{FS=OFS=";" || " "} {print $1,$2,$3}'
You want to set FS to a character list:
awk -F'[; ]' 'script' file
and the other builtin variable you're trying to set is named OFMT, not OFMF:
$ echo "10.23;7.15 6.23" | awk -F'[; ]' -v OFMT="%0.2f" '{print $1,$2,$3}'
10.23 7.15 6.23
$ echo "10.23;7.15 6.23" | awk 'BEGIN{FS="[; ]"; OFS=";"; OFMT="%0.2f"} {print $1,$2,$3}'
10.23;7.15;6.23
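Note that OFMT only affects numbers that awk converts to strings when printing; fields read from the input are printed unchanged, as strings. To actually round the values, printf is one option. A sketch with a made-up function name round2 and an input chosen here to show the rounding:

```shell
# round2 is a hypothetical helper name.
round2() {
    # %.2f forces each field to be treated as a number and rounded
    awk -F'[; ]' '{printf "%.2f;%.2f;%.2f\n", $1, $2, $3}'
}

echo "10.234;7.1 6.2" | round2
```

This prints 10.23;7.10;6.20.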

Tab separated values in awk

How do I select the first column from the TAB separated string?
# echo "LOAD_SETTLED LOAD_INIT 2011-01-13 03:50:01" | awk -F'\t' '{print $1}'
The above will return the entire line and not just "LOAD_SETTLED" as expected.
Update:
I need to change the third column in the tab separated values.
The following does not work.
echo $line | awk 'BEGIN { -v var="$mycol_new" FS = "[ \t]+" } ; { print $1 $2 var $4 $5 $6 $7 $8 $9 }' >> /pdump/temp.txt
This however works as expected if the separator is comma instead of tab.
echo $line | awk -v var="$mycol_new" -F'\t' '{print $1 "," $2 "," var "," $4 "," $5 "," $6 "," $7 "," $8 "," $9}' >> /pdump/temp.txt
You need to set the OFS variable (output field separator) to be a tab:
echo "$line" |
awk -v var="$mycol_new" -F'\t' 'BEGIN {OFS = FS} {$3 = var; print}'
(make sure you quote the $line variable in the echo statement)
Make sure they're really tabs! In bash, you can insert a literal tab with Ctrl-V followed by TAB, or pass one to awk with ANSI-C quoting ($'\t'):
$ echo "LOAD_SETTLED LOAD_INIT 2011-01-13 03:50:01" | awk -F$'\t' '{print $1}'
LOAD_SETTLED
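To rule out the shell or an editor turning tabs into spaces, the test line can be generated with printf, which expands \t itself. A sketch that also replaces the third column, as asked in the update; set_col3 and the value NEW are placeholders:

```shell
# set_col3 is a hypothetical helper; its first argument is the new value.
set_col3() {
    awk -v var="$1" 'BEGIN {FS = OFS = "\t"} {$3 = var; print}'
}

printf 'LOAD_SETTLED\tLOAD_INIT\t2011-01-13 03:50:01\n' | set_col3 NEW
```

The output keeps the tab separators and has NEW in the third column.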
Use:
awk -v FS='\t' -v OFS='\t' ...
Example from one of my scripts.
I use the FS and OFS variables to manipulate BIND zone files, which are tab delimited:
awk -v FS='\t' -v OFS='\t' \
-v record_type="$record_type" \
-v hostname="$hostname" \
-v ip_address="$ip_address" '
$1==hostname && $3==record_type {$4=ip_address}
{print}
' "$zone_file" > "$temp"
This is a clean and easy to read way to do this.
You can set the Field Separator:
... | awk 'BEGIN {FS="\t"}; {print $1}'
Excellent read:
https://docs.freebsd.org/info/gawk/gawk.info.Field_Separators.html
echo "LOAD_SETTLED LOAD_INIT 2011-01-13 03:50:01" | awk -v var="test" 'BEGIN { FS = "[ \t]+" } ; { print $1 "\t" var "\t" $3 }'
If your fields are separated by tabs - this works for me in Linux.
awk -F'\t' '{print $1}' < tab_delimited_file.txt
I use this to process data generated by mysql, which generates tab-separated output in batch mode.
From awk man page:
-F fs
--field-separator fs
Use fs for the input field separator (the value of the FS predefined
variable).
1st column only
— awk NF=1 FS='\t'
LOAD_SETTLED
First 3 columns
— awk NF=3 FS='\t' OFS='\t'
LOAD_SETTLED LOAD_INIT 2011-01-13
Except first 2 columns
— {g,n}awk NF=NF OFS= FS='^([^\t]+\t){2}'
— {m}awk NF=NF OFS= FS='^[^\t]+\t[^\t]+\t'
2011-01-13 03:50:01
Last column only
— awk '($!NF=$NF)^_' FS='\t', or
— awk NF=NF OFS= FS='^.*\t'
03:50:01
Should this not work?
echo "LOAD_SETTLED LOAD_INIT 2011-01-13 03:50:01" | awk '{print $1}'