Convert a CSV with headers to HTML in Bash - awk

I have data like the below in a CSV file:
ServerName,Index,Status
10.xxx.xx.xx,1.5.1.1,2
10.xxx.xx.xx,1.5.1.2,3
I need to convert this data to HTML and also color the row if the value of "Status" is 3/4/5.
Please help me with this.
I tried the below:
awk 'BEGIN{
    FS=","
    print "<HTML>""<TABLE border="1"><TH>JOB_NAME</TH><TH>RUN_DATE</TH><TH>STATUS</TH>"
}
{
    printf "<TR>"
    for(i=1;i<=NF;i++)
        printf "<TD>%s</TD>", $i
    print "</TR>"
}
END{
    print "</TABLE></BODY></HTML>"
}
' 10.106.40.45_FinalData.csv > file.html
sed -i "s/2/<font color="green">2<\/font>/g;s/4/<font color="red">4<\/font>/g;s/5/<font color="red">5<\/font>/g;" file.html
In the latest code I tried, I need to check the value of the Status column only and color just that cell.

$ cat tst.awk
BEGIN {
    FS = ","
    colors[3] = "red"
    colors[4] = "green"
    colors[5] = "blue"
    print "<HTML><BODY>"
    print "<TABLE border=\"1\">"
    print "<TR><TH>JOB_NAME</TH><TH>RUN_DATE</TH><TH>STATUS</TH></TR>"
}
NR>1 {
    printf "<TR>"
    for (i=1; i<=NF; i++) {
        if ( (i == NF) && ($i in colors) ) {
            on  = "<font color=\"" colors[$i] "\">"
            off = "</font>"
        }
        else {
            on = off = ""
        }
        printf "<TD>%s%s%s</TD>", on, $i, off
    }
    print "</TR>"
}
END {
    print "</TABLE>"
    print "</BODY></HTML>"
}
$ awk -f tst.awk file
<HTML><BODY>
<TABLE border="1">
<TR><TH>JOB_NAME</TH><TH>RUN_DATE</TH><TH>STATUS</TH></TR>
<TR><TD>10.xxx.xx.xx</TD><TD>1.5.1.1</TD><TD>2</TD></TR>
<TR><TD>10.xxx.xx.xx</TD><TD>1.5.1.2</TD><TD><font color="red">3</font></TD></TR>
</TABLE>
</BODY></HTML>
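To produce the HTML file from the actual CSV, the invocation would look something like this (using the filenames from the question):
awk -f tst.awk 10.106.40.45_FinalData.csv > file.html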

You don't actually say what the problem is, but I presume it's colorizing the numbers when they appear in the addresses also?
The best solution is probably to add a conditional into your awk script (untested):
if (i == 3 && $i == 2) {
    print "<TD><font color=\"green\">2</font></TD>"
} else .....
Alternatively, your status field is the only cell that is just a bare number, whereas the addresses are not, so you can adjust your sed pattern:
's/>2</><font color="green">2<\/font></g;......'
I.e. match the surrounding brackets so that only whole cells are replaced.
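Putting the first idea together, here is a minimal sketch of the conditional approach (untested, assuming Status is always the third field; adjust the colours to taste):
awk 'BEGIN {
    FS = ","
    print "<HTML><BODY><TABLE border=\"1\">"
    print "<TR><TH>ServerName</TH><TH>Index</TH><TH>Status</TH></TR>"
}
NR > 1 {
    printf "<TR><TD>%s</TD><TD>%s</TD>", $1, $2
    # colour only the Status cell: red for 3/4/5, green otherwise
    if ($3 >= 3 && $3 <= 5)
        printf "<TD><font color=\"red\">%s</font></TD>", $3
    else
        printf "<TD><font color=\"green\">%s</font></TD>", $3
    print "</TR>"
}
END { print "</TABLE></BODY></HTML>" }' 10.106.40.45_FinalData.csv > file.html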

You can also use jq for this task.
jq structures the CSV data instead of working purely on text. This makes it easy to remove empty rows or to colour only the "Status" column.
#!/bin/bash
CSV='
ServerName,Index,Status
10.xxx.xx.xx,1.5.1.1,2
10.xxx.xx.xx,1.5.1.2,3
'
jq -srR '
  def colorize($status):
    if   $status == "3" then "yellow"
    elif $status == "4" then "orange"
    elif $status == "5" then "red"
    else "green"
    end
    | "<font color=\"\(.)\">\($status)</font>";

  split("\n")                   # split input into lines
  | map(select(length > 0))     # remove empty lines from the CSV
  | map(split(","))             # split each line into fields
  | .[1:]                       # drop the first line with the headers
  | "<table>",                  # convert to an HTML table
    " <tr> <th>ServerName</th> <th>Index</th> <th>Status</th> </tr>",
    (.[] | " <tr> <td>\(.[0])</td> <td>\(.[1])</td> <td>\(colorize(.[2]))</td> </tr>"),
    "</table>"
' <<< "$CSV"
output:
<table>
<tr> <th>ServerName</th> <th>Index</th> <th>Status</th> </tr>
<tr> <td>10.xxx.xx.xx</td> <td>1.5.1.1</td> <td><font color="green">2</font></td> </tr>
<tr> <td>10.xxx.xx.xx</td> <td>1.5.1.2</td> <td><font color="yellow">3</font></td> </tr>
</table>

Related

remove above/below line and append to a file

I have a file with the following lines. I can filter on a specific word and display the lines below/above it. However, I also want to remove those lines from the original file and append them to a new file.
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>green</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
I can do this with grep -i green origfile -A1 -B1 >> newfile, but how can I remove it from the original file?
origfile:
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
newfile:
<tr>
<td>tree</td><td>apple</td><td>green</td>
</tr>
Is there a cleaner/quicker way to do it?
You could do it with a single awk invocation, segregating records into different files. This looks for the word green, writes that line plus the line before and after it into a new file, and removes them from the original file.
awk '
FNR==NR{
    if($0~/green/){
        words[FNR]
    }
    next
}
((FNR+1) in words) || (FNR in words) || ((FNR-1) in words){
    print > "newfile"
    next
}
1
' Input_file Input_file > temp && mv temp Input_file
Explanation: a detailed explanation of the above code.
awk '                     ##Starting awk program from here.
FNR==NR{                  ##Condition FNR==NR is TRUE the first time Input_file is read.
    if($0~/green/){       ##If the line contains the string green then do the following.
        words[FNR]        ##Create array words indexed by the current line number.
    }
    next                  ##next skips all further statements from here.
}
((FNR+1) in words) || (FNR in words) || ((FNR-1) in words){  ##If current line+1 OR current line OR current line-1 is in the words array then do the following.
    print > "newfile"     ##Print the current line into the newfile output file.
    next                  ##next skips all further statements from here.
}
1                         ##Print the current line.
' Input_file Input_file > temp && mv temp Input_file  ##Mention Input_file twice and do an in-place save into it.
$ cat tst.awk
$0 == "<tr>" { inRow=1; row=$0; next }
inRow {
row = row ORS $0
if ( $0 == "</tr>" ) {
inRow = 0
if ( index(row,"<td>green</td>") ) {
print row | "cat>&2"
next
}
else {
$0 = row
}
}
}
!inRow
$ awk -f tst.awk file >o1 2>o2
$ head o?
==> o1 <==
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
==> o2 <==
<tr>
<td>tree</td><td>apple</td><td>green</td>
</tr>
To modify the original file:
$ awk -f tst.awk file >o1 2>o2 && mv o1 file
$ cat file
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
<tr>
<td>tree</td><td>apple</td><td>red</td>
</tr>
Here is an ed solution.
#!/usr/bin/env bash
ed -s origfile.txt <<-EOF
/<td>green<\/td>/;?^<tr>?;/^<\/tr>/w newfile.txt
.;/^<\/tr>/d
w
q
EOF
Or use a separate ed script; save it as script.ed:
/<td>green<\/td>/;?^<tr>?;/^<\/tr>/w newfile.txt
.;/^<\/tr>/d
w
q
Then
ed -s origfile.txt < script.ed

grep between multiple patterns

Here is a (real-world) text:
<tr>
randomtext
ip_(45.54.58.85)
randomtext..
port(randomtext45)
randomtext random...
</tr>
<tr>
randomtext ran
ip_(5.55.45.8)
randomtext4
port(other$_text_other_length444)
</tr>
<tr>
randomtext
random
port(other$text52)
</tr>
output should be:
45.54.58.85 45
5.55.45.8 444
I know how to grep 45.54.58.85 and 5.55.45.8
awk 'BEGIN{ RS="<tr>"}1' file | grep -oP '(?<=ip_\()[^)]*'
How do I grep the port, taking into account that there is random text of random length after port(?
I put in a third record that should not appear in the output, as it has no ip.
Using GNU Awk:
gawk 'BEGIN { RS = "<tr>" } match($0, /.*^ip_[(]([^)]+).*^port[(].*[^0-9]+([0-9]+)[)].*/, a) { print a[1], a[2] }' your_file
And another that's compatible with any Awk:
awk -F '[()]' '$1 == "<tr>" { i = 0 } $1 == "ip_" { i = $2 } $1 == "port" && i { sub(/.*[^0-9]/, "", $2); if (length($2)) print i, $2 }' your_file
Output:
45.54.58.85 45
5.55.45.8 444
Through GNU awk, grep and paste:
$ awk 'BEGIN{ RS="<tr>"}/ip_/{print;}' file | grep -oP 'ip_\(\K[^)]*|port\(\D*\K\d+' | paste - -
45.54.58.85 45
5.55.45.8 444
Explanation:
awk 'BEGIN{ RS="<tr>"}/ip_/{print;}' file : with the Record Separator set to <tr>, this awk command prints only the records that contain the string ip_.
ip_\(\K[^)]* matches only the text immediately after ip_( up to the next ) symbol. \K in the pattern discards the previously matched characters.
| is the alternation (logical OR) symbol.
port\(\D*\K\d+ matches only the trailing digits inside the port() string.
paste - - combines every two lines.
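For example, on its own paste - - simply joins consecutive pairs of input lines with a tab:
$ printf '%s\n' 45.54.58.85 45 5.55.45.8 444 | paste - -
45.54.58.85	45
5.55.45.8	444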
Here is another awk
awk -F"[()]" '/^ip/ {ip=$2;f=NR} f && NR==f+2 {n=split($2,a,"[a-z]+");print ip,a[n]}' file
45.54.58.85 45
5.55.45.8 444
How it works:
awk -F"[()]" ' # Set field separator to "()"
/^ip/ { # If line starts with "ip" do
ip=$2 # Set "ip" to field $2
f=NR} # Set "f" to line number
f && NR==f+2 { # Go two line down and
n=split($2,a,"[a-z]+") # Split second part to get port
print ip,a[n] # Print "ip" and "port"
}' file # Read the file
With any modern awk:
$ awk -F'[()]' '
$1=="ip_" { ip=$2 }
$1=="port" { sub(/.*[^[:digit:]]/,"",$2); port=$2 }
$1=="</tr>" { if (ip) print ip, port; ip="" }
' file
45.54.58.85 45
5.55.45.8 444
Couldn't be much simpler and clearer IMHO.

awk | Rearrange fields of CSV file on the basis of column value

I need your help in writing awk for the problem below. I have one source file and the required output for it.
Source File
a:5,b:1,c:2,session:4,e:8
b:3,a:11,c:5,e:9,session:3,c:3
Output File
session:4,a=5,b=1,c=2
session:3,a=11,b=3,c=5|3
Notes:
Fields are not in any particular order in the source file.
In the output file, fields are organised in a specific order; for example, all a values are in the 2nd column, then b, then c.
The value c appears multiple times in the second line, so in the output its values are merged with a PIPE symbol.
Please help.
This will work in any modern awk:
$ cat file
a:5,b:1,c:2,session:4,e:8
a:5,c:2,session:4,e:8
b:3,a:11,c:5,e:9,session:3,c:3
$ cat tst.awk
BEGIN{ FS="[,:]"; split("session,a,b,c",order) }
{
    split("",val)   # or delete(val) in gawk
    for (i=1;i<NF;i+=2) {
        val[$i] = (val[$i]=="" ? "" : val[$i] "|") $(i+1)
    }
    for (i=1;i in order;i++) {
        name = order[i]
        printf "%s%s", (i==1 ? name ":" : "," name "="), val[name]
    }
    print ""
}
$ awk -f tst.awk file
session:4,a=5,b=1,c=2
session:4,a=5,b=,c=2
session:3,a=11,b=3,c=5|3
If you actually want the e values printed, unlike your posted desired output, just add ,e to the string in the split() in the BEGIN section wherever you'd like those values to appear in the ordered output.
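For example (a sketch), to print the e values after c, the BEGIN line would simply become:
BEGIN{ FS="[,:]"; split("session,a,b,c,e",order) }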
Note that when b was missing from the input on line 2 above, it output a null value as you said you wanted.
Try with:
awk '
BEGIN {
    FS = "[,:]"
    OFS = ","
}
{
    for ( i = 1; i <= NF; i += 2 ) {
        if ( $i == "session" ) { printf "%s:%s", $i, $(i+1); continue }
        hash[$i] = hash[$i] (hash[$i] ? "|" : "") $(i+1)
    }
    asorti( hash, hash_orig )
    for ( i = 1; i <= length(hash); i++ ) {
        printf ",%s:%s", hash_orig[i], hash[ hash_orig[i] ]
    }
    printf "\n"
    delete hash
    delete hash_orig
}
' infile
It splits each line on any comma or colon and traverses the odd-numbered fields, saving each key and its values in a hash that is printed at the end of the record. (Note that asorti() and length() on an array are GNU awk extensions.) It yields:
session:4,a:5,b:1,c:2,e:8
session:3,a:11,b:3,c:5|3,e:9

awk system not setting variables properly

I am having an issue getting the output of grep (used via system() in nawk) assigned to a variable.
nawk '{
    CITIZEN_COUNTRY_NAME = "INDIA"
    CITIZENSHIP_CODE=system("grep "CITIZEN_COUNTRY_NAME " /tmp/OFAC/country_codes.config | cut -d # -f1")
}' /tmp/*****
The value IND is displayed in the console, but when I do a printf the value of CITIZENSHIP_CODE is 0. Can you please help me here?
printf("Country Tags|%s|%s\n", CITIZEN_COUNTRY_NAME ,CITIZENSHIP_CODE)
Contents of country_codes.config file
IND#INDIA
IND#INDIB
CAN#CANADA
system returns the exit value of the called command, but the output of the command is not returned to awk (or nawk). To get the output, you want to use getline directly. For example, you might re-write your script:
awk ' {
    file = "/tmp/OFAC/country_codes.config";
    CITIZEN_COUNTRY_NAME = "INDIA";
    FS = "#";
    while( getline < file ) {
        if( $0 ~ CITIZEN_COUNTRY_NAME ) {
            CITIZENSHIP_CODE = $1;
        }
    }
    close( file );
}'
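If you do want to capture a command's output directly inside awk, the pipe-to-getline form is another option; here is a minimal sketch (untested, assuming the same config path as in the question):
nawk 'BEGIN {
    CITIZEN_COUNTRY_NAME = "INDIA"
    # run the pipeline and read the first line of its output into the variable
    cmd = "grep " CITIZEN_COUNTRY_NAME " /tmp/OFAC/country_codes.config | cut -d \"#\" -f1"
    if ((cmd | getline CITIZENSHIP_CODE) > 0) {
        printf "Country Tags|%s|%s\n", CITIZEN_COUNTRY_NAME, CITIZENSHIP_CODE
    }
    close(cmd)
}'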
Pre-load the config file with awk:
nawk '
NR == FNR {
    split($0, x, "#")
    country_code[x[2]] = x[1]
    next
}
{
    CITIZEN_COUNTRY_NAME = "INDIA"
    if (CITIZEN_COUNTRY_NAME in country_code) {
        value = country_code[CITIZEN_COUNTRY_NAME]
    } else {
        value = "null"
    }
    print "found " value " for country name " CITIZEN_COUNTRY_NAME
}
' country_codes.config filename

Concatenating multiple lines with a discriminator

I have input like this.
Input:
a,b,c
d,e,f
g,h,i
k,l,m
n,o,p
q,r,s
I want to be able to concatenate the lines with a discriminator like "|".
Output:
a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s
The file has 1 million lines and I want to concatenate lines as in the example above.
Any ideas about how to approach this?
@OP, if you want to group them for every 3 records:
$ awk 'ORS=(NR%3==0)?"\n":"|"' file
a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s
with Perl,
$ perl -lne 'print $_ if $\ = ($. % 3 == 0) ? "\n" : "|"' file
a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s
Since your tags include sed, here is a way to use it:
sed 'N;N;s/\n/|/g' datafile
gawk:
BEGIN {
    state=0
}
state==0 {
    line=$0
    state=1
    next
}
state==1 {
    line=line "|" $0
    state=2
    next
}
state==2 {
    print line "|" $0
    state=0
    next
}
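Saved to a file (say group3.awk, a name chosen here for illustration), it runs as:
gawk -f group3.awk file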
If Perl is fine, you can try:
$i = 0;
while (<>) {
    chomp;
    $line .= $line ? "|$_" : $_;    # join lines within a group with "|"
    unless (++$i % 3) {             # after every 3rd line, print the group
        print "$line\n";
        $line = "";
    }
}
print "$line\n" if length $line;    # print any final partial group
to run:
perl perlfile.pl 1millionlinesfile.txt
$ paste -sd'|' input | sed -re 's/([^|]+\|[^|]+\|[^|]+)\|/\1\n/g'
With paste, we join the lines together, and then sed dices them up. The pattern grabs runs of 3 pipe-terminated fields and replaces their respective final pipes with newlines.
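With the six sample input lines this should yield:
a,b,c|d,e,f|g,h,i
k,l,m|n,o,p|q,r,s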
With Perl:
#! /usr/bin/perl -ln
push @a => $_;
if (@a == 3) {
    print join "|" => @a;
    @a = ();
}
END { print join "|" => @a if @a }