Sum specific columns based on date using awk - awk

I have data separated by commas:
LBA0SF004,2018-10-01,4681,4681
LBA0SF004,2018-10-01,919,919
LBA0SF004,2018-10-01,3,3
LBA0SF004,2018-10-01,11453,11453
LBA0SF004,2018-10-02,4681,4681
LBA0SF004,2018-10-02,1052,1052
LBA0SF004,2018-10-02,3,3
LBA0SF004,2018-10-02,8032,8032
I need an awk command that sums the 3rd and 4th columns, grouped by date. Where the same server appears with several dates, I need output like this:
LBA0SF004 2018-10-01 17056 17056
LBA0SF004 2018-10-02 13768 13768

The GNU awk construct below should do what you are looking for (it uses gawk's true multidimensional arrays):
awk '
BEGIN {
    FS = ","
    OFS = " "
}
{
    if (NF == 4) {
        a[$1][$2]["3rd"] += $3
        a[$1][$2]["4th"] += $4
    }
}
END {
    for (i in a)
        for (j in a[i])
            print i, j, a[i][j]["3rd"], a[i][j]["4th"]
}
' Input_File.txt
Explanation:
FS is the input field separator, which in your case is ,
OFS is the output field separator, which here is a single space
Build an array a indexed by the first column and the second column, accumulating the sums of the third and fourth columns
At the END, print the contents of the array
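The a[$1][$2] arrays-of-arrays syntax is a gawk extension; under a POSIX awk (mawk, BSD awk) a sketch using a plain composite key does the same job. The array names sum3/sum4/order are my own; Input_File.txt holds the sample data from the question:

```shell
cat > Input_File.txt <<'EOF'
LBA0SF004,2018-10-01,4681,4681
LBA0SF004,2018-10-01,919,919
LBA0SF004,2018-10-01,3,3
LBA0SF004,2018-10-01,11453,11453
LBA0SF004,2018-10-02,4681,4681
LBA0SF004,2018-10-02,1052,1052
LBA0SF004,2018-10-02,3,3
LBA0SF004,2018-10-02,8032,8032
EOF
result=$(awk '
BEGIN { FS = ","; OFS = " " }
NF == 4 {
    key = $1 OFS $2                       # composite "server date" key
    if (!(key in sum3)) order[++n] = key  # remember first-seen order
    sum3[key] += $3
    sum4[key] += $4
}
END {
    for (i = 1; i <= n; i++)
        print order[i], sum3[order[i]], sum4[order[i]]
}' Input_File.txt)
printf '%s\n' "$result"
```

Unlike the unordered for (i in a) loops, the order array also keeps the groups in the order they first appear in the file.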

Related

awk - split column then get average

I'm trying to compute the average based on the values in column 5.
I intend to split the entries at the comma, sum the two numbers, compute the average and assign them to two new columns (ave1 and ave2).
I keep getting the error fatal: division by zero attempted, and I cannot get around it.
Below is the table in image format since I tried to post a markdown table and it failed.
Here's my code:
awk -v FS='\t' -v OFS='\t' '{split($5,a,","); sum=a[1]+a[2]}{ print $0, "ave1="a[1]/sum, "ave2="a[2]/sum}' vcf_table.txt
Never write a[2]/sum, always write (sum ? a[2]/sum : 0) or similar instead to protect from divide-by-zero.
You also aren't taking your header row into account. Try this:
awk '
BEGIN { FS = OFS = "\t" }
NR == 1 {
    ave1 = "AVE1"
    ave2 = "AVE2"
}
NR > 1 {
    split($5, a, ",")
    sum = a[1] + a[2]
    if ( sum ) {
        ave1 = a[1] / sum
        ave2 = a[2] / sum
    }
    else {
        ave1 = 0
        ave2 = 0
    }
}
{ print $0, ave1, ave2 }
' vcf_table.txt
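As a quick check of the divide-by-zero guard, here is that script run against a small hypothetical vcf_table.txt. The real table in the question was only posted as an image, so these column names are invented; only the comma-pair in column 5 matters:

```shell
# Hypothetical sample input: the question's table was an image,
# so CHROM/POS/ID/REF/AD are made-up columns; only $5 matters.
printf 'CHROM\tPOS\tID\tREF\tAD\nchr1\t100\t.\tA\t8,2\nchr1\t200\t.\tC\t0,0\n' > vcf_table.txt
result=$(awk '
BEGIN { FS = OFS = "\t" }
NR == 1 { ave1 = "AVE1"; ave2 = "AVE2" }   # header row gets column titles
NR > 1 {
    split($5, a, ",")
    sum = a[1] + a[2]
    if (sum) { ave1 = a[1] / sum; ave2 = a[2] / sum }
    else     { ave1 = 0;          ave2 = 0 }    # guard: 0,0 row stays 0
}
{ print $0, ave1, ave2 }
' vcf_table.txt)
printf '%s\n' "$result"
```

The 0,0 row that previously triggered "division by zero attempted" now comes through with zeros appended.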

bash or awk - generating report from complex data set

I have a program that generates a large data file; I put a small sample in the input section. What I am trying to do is start with an AOUT, then look at the 4th column to find its next connection, which shows up in the second column somewhere else in the file, and repeat those steps until the chain ends with an AIN in the first column. The number of connections between the AOUT and the AIN varies from just one to over ten. If there isn't an AIN at the end, there shouldn't be any output. The output should start with the AOUT and show each connection until it reaches the AIN. Is there a way to use awk or anything else to create my desired output?
input (this is a small section; there are many more rows, and the order they appear in is not standard)
AOUT,03xx:LY0372A,LIC0372.OUT,LIC0372
PIDA,03xx:LIC0372,LT372_SEL.OUT,LT372_SEL
SIGSEL,03xx:LT372_SEL,LT1_0372.PNT,LT1_0372
AIN,03xx:LT1_0372
output:
03xx:LY0372A
=03xx:LT372_SEL.OUT
=03xx:LT1_0372.PNT
=03xx:LT1_0372
output format:
(AOUT)
=(any number of jumps)
=(any number of jumps)
=(AIN)
Pending more input and answers to the questions in the comments above, a possible solution in awk could be:
#!/bin/bash
awk -F',' '{
    if ($1 == "AOUT") {
        output = $2 "\n"
        connector = $4
        sub(":.*", "", $2)
        label = $2
    }
    else if ($1 == "AIN") {
        output = output "=" $2
        print output
        output = ""
    }
    else if (output != "") {
        if ($2 == label ":" connector) {
            output = output "=" label ":" $3 "\n"
            connector = $4
        }
    }
}' input.csv
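Since the question says row order is not standard, a read-everything-first variant may be more robust: index every row by its second field, then walk each AOUT chain in END. A sketch, assuming the 03xx-style prefix is constant within a chain and chains contain no cycles (the hops cap is just a safety net); typ/pnt/nxt are my own array names:

```shell
# Sample rows deliberately shuffled to show order independence
cat > input.csv <<'EOF'
AIN,03xx:LT1_0372
SIGSEL,03xx:LT372_SEL,LT1_0372.PNT,LT1_0372
AOUT,03xx:LY0372A,LIC0372.OUT,LIC0372
PIDA,03xx:LIC0372,LT372_SEL.OUT,LT372_SEL
EOF
result=$(awk -F',' '
{   typ[$2] = $1; pnt[$2] = $3; nxt[$2] = $4   # index every row by $2
    if ($1 == "AOUT") starts[++n] = $2 }
END {
    for (s = 1; s <= n; s++) {
        start = starts[s]
        label = start; sub(/:.*/, "", label)   # prefix, e.g. "03xx"
        buf = start; cur = label ":" nxt[start]; found = 0
        for (hops = 0; hops < 100 && (cur in typ); hops++) {
            if (typ[cur] == "AIN") { buf = buf "\n=" cur; found = 1; break }
            buf = buf "\n=" label ":" pnt[cur]
            cur = label ":" nxt[cur]
        }
        if (found) print buf   # chains not ending in AIN produce no output
    }
}' input.csv)
printf '%s\n' "$result"
```

Even with the AOUT row third in the file, the chain is reconstructed in connection order.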

Print every alternate column in row in a text file

I have a comma-separated file. I would like to print every alternate column on a new row.
Example input file:
Name : John, Age : 30, DOB : 30-Oct-2018
Example output:
Name,Age,DOB
John,30,30-Oct-2018
non-awk solution
$ sed 's/[,:]/\n/g;s/ //g' file | pr -3ts,
Name,Age,DOB
John,30,30-Oct-2018
awk 'BEGIN{FS="[[:blank:]]*[:,][[:blank:]]*"}
     { for(i=1;i<=NF;i+=2) printf "%s%s", (i==1?"":","), $i; print "" }
     { for(i=2;i<=NF;i+=2) printf "%s%s", (i==2?"":","), $i; print "" }' inputfile
Per your example and output:
$ awk -F', ' '/ : / {
    for (i=1; i<=NF; i++)
        if ( match($i, / : /) ) {
            linekeys = linekeys substr($i, 1, RSTART-1) ","
            linevalues = linevalues substr($i, RSTART+RLENGTH) ","
        }
    print substr(linekeys, 1, length(linekeys)-1)
    print substr(linevalues, 1, length(linevalues)-1)
    linekeys = linevalues = ""
}' file.txt
Name,Age,DOB
John,30,30-Oct-2018
Here's a general idea you could use to implement a solution, using awk's split function:
Split the entire line into an array rows on the row delimiter (", "), and save the number of rows.
Split each row into an array cols on the column delimiter (" : "), then iterate over the column values and store them in a table indexed by row and column, e.g. data[row","col].
Finally, iterate first over the number of columns and then over the number of rows, printing data[row","col].
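The steps above can be sketched as follows, using the question's sample line (the rows/cols/data names are just illustrative):

```shell
echo 'Name : John, Age : 30, DOB : 30-Oct-2018' > file
result=$(awk '{
    n = split($0, rows, /, */)                # outer split: the key/value pairs
    for (r = 1; r <= n; r++) {
        m = split(rows[r], cols, / *: */)     # inner split: key, then value
        for (c = 1; c <= m; c++) data[r, c] = cols[c]
    }
    for (c = 1; c <= m; c++) {                # print keys row, then values row
        line = data[1, c]
        for (r = 2; r <= n; r++) line = line "," data[r, c]
        print line
    }
}' file)
printf '%s\n' "$result"
```

The transpose in the last loop is what turns each alternate column into its own output row.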

Replace blank value with previous non blank first column value using awk (value separated columns)

I have a comma-separated file with two columns, where the first column is always empty and the second one is sometimes empty (when the last column is empty there is no final comma):
,value_c1_1
,,value_c2_1
,,value_c2_2
,,value_c2_3
,value_c1_2
I would like to use awk to fill each empty column value with the previous non-empty column value, and then get rid of the rows where the second column is empty:
,value_c1_1,value_c2_1
,value_c1_1,value_c2_2
,value_c1_1,value_c2_3
The big difference with the answer to this question
awk '/^ /{$0=(x)substr($0,length(x)+1)}{x=$1}1' file
is that the fields are character separated (instead of being of fixed length) and that the first column is always empty.
awk -F, 'BEGIN { OFS = FS } { if ($2 == "") $2 = last2; else last2 = $2; print }'
If column 2 is empty, replace it with the saved value; otherwise, save the value that's in column 2 for future use. Print the line. (The BEGIN block ensures the output field separator OFS is the same as the (input) field separator, FS.)
If you only want to print lines with 3 fields, then:
awk -F, 'BEGIN { OFS = FS }
{ if ($2 == "") $2 = last2; else last2 = $2; if (NF == 3) print }'
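As a quick check, running that three-field version against the sample input from the question produces exactly the requested output:

```shell
cat > file <<'EOF'
,value_c1_1
,,value_c2_1
,,value_c2_2
,,value_c2_3
,value_c1_2
EOF
# Fill an empty $2 from the last non-empty one; print only 3-field rows
result=$(awk -F, 'BEGIN { OFS = FS }
    { if ($2 == "") $2 = last2; else last2 = $2; if (NF == 3) print }' file)
printf '%s\n' "$result"
```

Note that assigning to $2 rebuilds the record with OFS, which is why OFS must be set back to the comma in the BEGIN block.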

awk | Rearrange fields of CSV file on the basis of column value

I need your help writing awk for the problem below. I have a source file and the required output for it.
Source File
a:5,b:1,c:2,session:4,e:8
b:3,a:11,c:5,e:9,session:3,c:3
Output File
session:4,a=5,b=1,c=2
session:3,a=11,b=3,c=5|3
Notes:
Fields are not organised in the source file
In the output file, fields are organised in a specific format; for example, all a values are in the 2nd column, then b, then c
The value c appears multiple times in the second line, so in the output its values are merged with a pipe symbol
Please help.
Will work in any modern awk:
$ cat file
a:5,b:1,c:2,session:4,e:8
a:5,c:2,session:4,e:8
b:3,a:11,c:5,e:9,session:3,c:3
$ cat tst.awk
BEGIN{ FS="[,:]"; split("session,a,b,c",order) }
{
    split("",val)   # or delete(val) in gawk
    for (i=1; i<NF; i+=2) {
        val[$i] = (val[$i]=="" ? "" : val[$i] "|") $(i+1)
    }
    for (i=1; i in order; i++) {
        name = order[i]
        printf "%s%s", (i==1 ? name ":" : "," name "="), val[name]
    }
    print ""
}
$ awk -f tst.awk file
session:4,a=5,b=1,c=2
session:4,a=5,b=,c=2
session:3,a=11,b=3,c=5|3
If you actually want the e values printed, unlike your posted desired output, just add ,e to the string in the split() in the BEGIN section wherever you'd like those values to appear in the ordered output.
Note that when b was missing from the input on line 2 above, it output a null value as you said you wanted.
Try this with GNU awk (needed for asorti() and length() on an array):
awk '
BEGIN {
    FS = "[,:]"
    OFS = ","
}
{
    for ( i = 1; i <= NF; i += 2 ) {
        if ( $i == "session" ) { printf "%s:%s", $i, $(i+1); continue }
        hash[$i] = hash[$i] (hash[$i] ? "|" : "") $(i+1)
    }
    asorti( hash, hash_orig )
    for ( i = 1; i <= length(hash); i++ ) {
        printf ",%s:%s", hash_orig[i], hash[ hash_orig[i] ]
    }
    printf "\n"
    delete hash
    delete hash_orig
}
' infile
It splits each line on any comma or colon and traverses the odd fields, saving each field name and its values in a hash that is printed at the end of the record. It yields:
session:4,a:5,b:1,c:2,e:8
session:3,a:11,b:3,c:5|3,e:9