How to display the date of each file as the first element of each line with bash/awk?

I have 7 txt files which are the output of the df -m command on AIX 7.2.
I need to keep only the first column and the second column for one filesystem. So I do this:
cat *.txt | grep hd4 | awk '{print $1","$2}' > test1.txt
And the output is:
/dev/hd4,384.00
/dev/hd4,394.00
/dev/hd4,354.00
/dev/hd4,384.00
/dev/hd4,484.00
/dev/hd4,324.00
/dev/hd4,384.00
Each file is created from the crontab, and their filenames are:
df_command-2019-09-03-12:50:00.txt
df_command-2019-08-28-12:59:00.txt
df_command-2019-08-29-12:51:00.txt
df_command-2019-08-30-12:52:00.txt
df_command-2019-08-31-12:53:00.txt
df_command-2019-09-01-12:54:00.txt
df_command-2019-09-02-12:55:00.txt
I would like to keep only the date from the filename, and I'm able to do that:
test=df_command-2019-09-03-12:50:00.txt
echo $test | cut -d'-' -f2,3,4
output:
2019-09-03
But I would like to put each date as the first element of the corresponding line of my test1.txt:
2019-08-28,/dev/hd4,384.00
2019-08-29,/dev/hd4,394.00
2019-08-30,/dev/hd4,354.00
2019-08-31,/dev/hd4,384.00
2019-09-01,/dev/hd4,484.00
2019-09-02,/dev/hd4,324.00
2019-09-03,/dev/hd4,384.00
Do you have any idea how to do that?

This awk may do:
awk '/hd4/ {split(FILENAME,a,"-");print a[2]"-"a[3]"-"a[4]","$1","$2}' *.txt > test1.txt
/hd4/ finds the lines containing hd4
split(FILENAME,a,"-") splits the filename into array a on -
print a[2]"-"a[3]"-"a[4]","$1","$2 prints year-month-day, field 1 and field 2
> test1.txt writes the result to file test1.txt
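If you prefer to stay closer to your original cut-based approach, a plain shell loop does the same job (a sketch built only from the commands already shown in the question):
for f in df_command-*.txt; do
    d=$(echo "$f" | cut -d'-' -f2,3,4)                    # e.g. 2019-09-03 from the filename
    grep hd4 "$f" | awk -v d="$d" '{print d","$1","$2}'   # prepend the date to the two fields
done > test1.txt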

Date output file: dates.txt
2019-08-20
2019-08-08
2019-08-01
File system data: fsys.txt
/dev/hd4,384.00
/dev/hd4,394.00
/dev/hd4,354.00
paste can be used to combine the files line by line as columns. Use -d to specify comma as the separator.
paste -d ',' dates.txt fsys.txt
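With the sample files above, this produces:
2019-08-20,/dev/hd4,384.00
2019-08-08,/dev/hd4,394.00
2019-08-01,/dev/hd4,354.00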

Related

Match 2 columns in 2 files and get another value from the first file

I have 2 csv files which have the following structure:
File 1:
date,keyword,location,page
2019-04-11,ABC,mumbai,http://www.insurers.com
and so on.
File 2:
date,site,market,location,url
2019-05-12,denmark,de ,Frankfurt,http://lufthansa.com
2019-04-11,Netherlands,nl,amsterdam,http://www.insurers.com
The problem is that I need to match the dates in both files as well as the url. Example:
2019-04-11 and http://www.insurers.com (File 1)
with
2019-04-11 and http://www.insurers.com (File 2)
Edit:
If this condition is satisfied, the keyword (ABC) in File 1 should be inserted into File 2 as the third column (a new column).
Expected Output:
date,site,keyword,market,location,url
2019-04-11,Netherlands,ABC,nl,amsterdam,http://www.insurers.com
I have tried putting the dates and URLs in a map in Java, but too many URLs are duplicated.
So I am seeking a bash, awk, grep or sed solution.
Thanks.
$ awk '
BEGIN { FS=OFS="," }                  # comma-separated input and output
NR==FNR {                             # while reading file1:
    m[$1,(NR>1?$4:"url")]=$2          #   map date+page (the header row keys on "url") to the keyword
    next
}
($1,$5) in m {                        # while reading file2: is this date+url in the map?
    $2=$2 OFS m[$1,$5]; print         #   append the keyword after the site column and print
}
' file1 file2
date,site,keyword,market,location,url
2019-04-11,Netherlands,ABC,nl,amsterdam,http://www.insurers.com
Try GNU sed:
sed -En 's!^([0-9]{4}-[0-9]+-[0-9]+,).+(http://\w.+)!s#^\1([^,]+),[^,]+,\\s*\2#\\1#p!p' File2| sed -Enf - File1 >Result

Awk: merge each 12 lines into one line for an insert query

I have the following script:
curl -s 'https://someonepage=5m' | jq '.[]|.[0],.[1],.[2],.[3],.[4],.[5],.[6],.[7],.[8],.[9],.[10],.[11],.[12]' | perl -p -e 's/\"//g' |awk '/^[0-9]/{print; if (++onr%12 == 0) print ""; }'
This is part of the result:
1517773500000
0.10250100
0.10275700
0.10243500
0.10256600
257.26700000
1517773799999
26.38912220
1229
104.32200000
10.70579910
0
1517773800000
0.10256600
0.10268000
0.10231600
0.10243400
310.64600000
1517774099999
31.83806883
1452
129.70500000
13.29758266
0
1517774100000
0.10243400
0.10257500
0.10211800
0.10230000
359.06300000
1517774399999
36.73708621
1296
154.78500000
15.84041910
0
I want to insert this data into a MySQL database. For each group of 12 lines I want this result:
(1517773800000,0.10256600,0.10268000,0.10231600,0.10243400,310.64600000,1517774099999,31.83806883,1452,129.70500000,13.29758266,0)
(1517774100000,0.10243400,0.10257500,0.10211800,0.10230000,359.06300000,1517774399999,36.73708621,1296,154.78500000,15.84041910,0)
I need to merge each group of 12 lines into one line; can anyone help me get this result?
Here's an all-jq solution:
.[] | .[0:12] | @tsv | gsub("\t";",") | "(\(.))"
In the sample, all the subarrays have length 12, so you might be able to drop the .[0:12] part of the pipeline. If using jq 1.5 or later, you could use join(",") instead of the @tsv|gsub portion of the pipeline. You might, for example, want to consider:
.[] | join(",") | "(\(.))"   # jq 1.5 or later
Invocation: use the -r command-line option
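Combined with the curl command from the question (the URL is the question's placeholder), the full invocation would look something like:
curl -s 'https://someonepage=5m' | jq -r '.[] | .[0:12] | @tsv | gsub("\t";",") | "(\(.))"'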
Sample output:
(1517627400000,0.10452300,0.10499000,0.10418200,0.10449400,819.50400000,1517627699999,85.57150693,2340,452.63400000,47.27213035,0)
(1517627700000,0.10435700,0.10449200,0.10366000,0.10370000,717.37000000,1517627999999,74.60582079,1996,321.25500000,33.42273846,0)
(1517628000000,0.10376600,0.10390000,0.10366000,0.10370400,519.59400000,1517628299999,53.88836170,1258,239.89300000,24.88613854,0)
$ awk 'BEGIN {RS=""; OFS=","} {$1=$1; $0="("$0")"}1' file
(1517773500000,0.10250100,0.10275700,0.10243500,0.10256600,257.26700000,1517773799999,26.38912220,1229,104.32200000,10.70579910,0)
(1517773800000,0.10256600,0.10268000,0.10231600,0.10243400,310.64600000,1517774099999,31.83806883,1452,129.70500000,13.29758266,0)
(1517774100000,0.10243400,0.10257500,0.10211800,0.10230000,359.06300000,1517774399999,36.73708621,1296,154.78500000,15.84041910,0)
RS="":
Treat groups of lines separated one or more blank lines as a record
OFS=","
Set the output separator to be a ","
$1=$1
Reconstitute the line, replacing the input separators with the output separator
$0="("$0")"
Surround the record with parens
1
Print the record
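Since the pipeline in the question already prints a blank line after every 12 values, you could feed its output straight into this stage, e.g.:
... | awk '/^[0-9]/{print; if (++onr%12 == 0) print ""; }' | awk 'BEGIN {RS=""; OFS=","} {$1=$1; $0="("$0")"}1'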

exclude sequences depending on description ID in AWK

I have fasta files which have some description IDs (isoform 2, ..., isoform 9), and I want to exclude those sequences from the fasta files.
I used this command line to see which files contain the isoform 2 to 9 IDs:
for i in `ls *.fasta`; do l=`grep 'isoform X[2-9]' $i | head -1`; echo $i $l; done | awk '(NF==1){print}' | head
Is there a way to include something in my command line to remove them all?
Thanks.
sed 's/isoform [2-9]//g' *.fasta
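Note that this only deletes the ID text from each line and prints to stdout; with GNU sed you can add -i to edit the files in place. If the goal is instead to drop the whole sequence record whose header carries the ID, here is a sketch in awk (assuming headers start with > and reusing the isoform X[2-9] pattern from your grep):
awk '/^>/ {skip = /isoform X[2-9]/} !skip' file.fasta > filtered.fasta
On each header line it decides whether the following record should be skipped, and !skip prints only the lines of records that were kept.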

base64 decoding from file column

I have a file with 6 columns per line, separated by ",". The last column is gzipped and base64-encoded. The output file should contain column 3 and column 6 (decoded/unzipped).
I tried to do this with:
awk -F',' '{"echo "$6" | base64 -di | gunzip" | getline x;print $3,x }' OFS=',' inputfile.csv >outptfile_decoded.csv
The result for the first lines is OK, but after some lines the decoded output is the same as on the line before. It seems that the decoding and unzipping hangs, but I don't get an error message.
A single decode/unzip works fine, i.e.
echo "H4sIAAAAAAAAA7NJTkuxs0lMLrEztNEHUTZAgcy8tHw7m7zSXLuS1BwrbRNjMzMTc3MDAzMDG32QqE1uSWVBqh2QB2HYlCYX2xnb6IMoG324ASCWHQAaafi1YQAAAA==" | base64 -di | gunzip
What can be the reason for this effect? (There are no error messages.)
Is there another way that works reliably?
Without a test case it's difficult to recommend anything, but here is a working script with input data.
First, create a test data file:
$ while read f; do echo $f,$(echo $f | gzip -f | base64); done < <(seq 5) | tee file.g
1,H4sIAJhBuVkAAzPkAgBT/FFnAgAAAA==
2,H4sIAJhBuVkAAzPiAgCQr3xMAgAAAA==
3,H4sIAJhBuVkAAzPmAgDRnmdVAgAAAA==
4,H4sIAJhBuVkAAzPhAgAWCCYaAgAAAA==
5,H4sIAJhBuVkAAzPlAgBXOT0DAgAAAA==
and decode:
$ awk 'BEGIN {FS=OFS=","}
{cmd="echo "$2" | base64 -di | gunzip"; cmd | getline v; print $1,v}' file.g
1,1
2,2
3,3
4,4
5,5
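As to the original symptom: the command is never close()d, so awk eventually runs out of open pipes; when that happens getline fails silently and x keeps the value from the previous line. A hedged fix (the DECODE_ERROR marker is just an illustration) is to reset the variable, check getline's return value, and close the command after every line:
awk 'BEGIN {FS=OFS=","}
{
  cmd = "echo " $6 " | base64 -di | gunzip"
  x = ""                                          # reset so a failed decode cannot reuse the old value
  if ((cmd | getline x) <= 0) x = "DECODE_ERROR"  # hypothetical error marker
  close(cmd)                                      # release the pipe before the next line
  print $3, x
}' inputfile.csv > outptfile_decoded.csv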

print whole variable contents if the number of lines is greater than N

How can I print all lines if a certain condition matches?
Example:
echo "$ip"
this is a sample line
another line
one more
last one
If this file has more than 3 lines then print the whole variable.
I have tried:
echo $ip| awk 'NR==4'
last one
echo $ip|awk 'NR>3{print}'
last one
echo $ip|awk 'NR==12{} {print}'
this is a sample line
another line
one more
last one
echo $ip| awk 'END{x=NR} x>4{print}'
Need to achieve this:
If this file has more than 3 lines then print the whole file. I can do this using wc and bash, but I need a one-liner.
The right way to do this (no echo, no pipe, no loops, etc.):
$ awk -v ip="$ip" 'BEGIN{if (gsub(RS,"&",ip)>2) print ip}'
this is a sample line
another line
one more
last one
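The trick is that gsub() returns the number of substitutions it made, so gsub(RS,"&",ip) replaces every newline with itself and returns the newline count; more than 2 newlines means more than 3 lines. A quick check:
$ awk -v ip="$(printf 'a\nb\nc\nd')" 'BEGIN{print gsub(RS,"&",ip)}'
3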
You can use awk as follows:
echo "$ip" | awk '{a[$0]; next}END{ if (NR>3) { for(i in a) print i }}'
one more
another line
this is a sample line
last one
You can also make the value 3 configurable via an awk variable:
echo "$ip" | awk -v count=3 '{a[$0]; next}END{ if (NR>count) { for(i in a) print i }}'
The idea is to store the contents of each line with {a[$0]; next} as it is processed; by the time the END clause is reached, the NR variable holds the line count of the string/file. The lines are printed if the condition matches, i.e. the number of lines is greater than 3 (or whatever value you configure). Note that keying the array by $0 collapses duplicate lines and that for (i in a) does not guarantee output order.
And always remember to double-quote the variables in bash to avoid undergoing word-splitting done by the shell.
Using James Brown's useful comment below to preserve the order of lines, do:
echo "$ip" | awk -v count=3 '{a[NR]=$0; next}END{if(NR>count)for(i=1;i<=NR;i++)print a[i]}'
this is a sample line
another line
one more
last one
Another approach in awk. First, the test files:
$ cat 3
1
2
3
$ cat 4
1
2
3
4
Code:
$ awk 'NR<4{b=b (NR==1?"":ORS)$0;next} b{print b;b=""}1' 3 # look ma, no lines
[this line left intentionally blank. no wait!]
$ awk 'NR<4{b=b (NR==1?"":ORS)$0;next} b{print b;b=""}1' 4
1
2
3
4
Explained:
NR<4 { # for the first 3 records
b=b (NR==1?"":ORS)$0 # buffer them to b with ORS delimiter
next # proceed to next record
}
b { # if the buffer has records, i.e. NR>=4
print b # output buffer
b="" # and reset it
}1 # print all records after that