Tcl code to catch repeated lines while processing a file

Hi, I have the following scenario. I read in a file line by line; each line looks like: 2 0, 3 0, 4 0, 9 0, 11 3, etc., i.e. "string string". Each line is then put in a variable: $line holds one such pair in each iteration of the while loop. Now I want to be able to catch when a line repeats (or is similar to) one seen previously.
myFile will contain:
2 0
3 0
9 0
11 3
3 5
2 9
2 0
3 5
Here is the code:
set in [open myFile r]
set exline ""
while {[gets $in line] >= 0} {
    lappend exline $line
    if { [lsearch $exline $line] > 0 } {
        puts "same number repeated $line"
    }
}
close $in

How about:
set fid [open myFile]
while {[gets $fid line] != -1} {dict incr lines [string trim $line]}
close $fid
dict for {line count} $lines {if {$count > 1} {puts "duplicated: $line"}}
duplicated: 2 0
duplicated: 3 5
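Since every other answer on this page reaches for awk, here is the same duplicate check as an awk one-liner, for comparison (a sketch of mine, not part of the Tcl answer above):

```shell
# recreate the sample myFile from the question
printf '2 0\n3 0\n9 0\n11 3\n3 5\n2 9\n2 0\n3 5\n' > myFile

# seen[$0]++ evaluates to 0 (false) the first time a line appears and to a
# truthy count afterwards; awk's default action for a true pattern is to
# print the line, so only repeated lines are printed.
awk 'seen[$0]++' myFile
```

On the sample above this prints "2 0" and "3 5", once per extra occurrence.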

Using awk, how to average numbers in column between two strings in a text file

I have a text file containing multiple tab-delimited columns between marker strings; an example is below.
Code 1 (3)
5 10 7 1 1
6 10 9 1 1
7 10 10 1 1
Code 2 (2)
9 11 3 1 3
10 8 5 2 1
Code 3 (1)
12 10 2 1 1
Code 4 (2)
14 8 1 1 3
15 8 7 5 1
I would like to average the numbers in the third column for each code block. The example below is what the output should look like.
8.67
4
2
4
Attempt 1
awk '$3~/^[[:digit:]]/ {i++; sum+=$3; print $3} $3!~/[[:digit:]]/ {print sum/i; sum=0;i=0}' in.txt
Returned fatal: division by zero attempted.
Attempt 2
awk -v OFS='\t' '/^Code/ { if (NR > 1) {i++; sum+=$3;} {print sum/i;}}' in.txt
Returned another division by zero error.
Attempt 3
awk -v OFS='\t' '/^Code/ { if (NR > 1) { print s/i; s=0; i=0; } else { s += $3; i += 1; }}' in.txt
Returned 1 value: 0.
Attempt 4
awk -v OFS='\t' '/^Code/ {
if (NR > 1)
i++
print sum += $3/i
}
END {
i++
print sum += $3/i
}'
Returned:
0
0
0
0.3
I am not sure where that last number is coming from, but this has been the closest solution so far. I am getting a number for each block, but not the average.
Could you please try the following.
awk '
/^Code/{
    if(value){
        print sum/value
    }
    sum=value=""
    next
}
{
    sum+=$3
    value++
}
END{
    if(value){
        print sum/value
    }
}
' Input_file
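For reference, here is that per-block idea run on the sample input, summing $3 (the third column the question asks about) and printing three significant digits so the result matches the expected 8.67 (the %.3g format is my addition, not part of the answer above):

```shell
# recreate the sample input from the question
printf 'Code 1 (3)\n5 10 7 1 1\n6 10 9 1 1\n7 10 10 1 1\nCode 2 (2)\n9 11 3 1 3\n10 8 5 2 1\nCode 3 (1)\n12 10 2 1 1\nCode 4 (2)\n14 8 1 1 3\n15 8 7 5 1\n' > in.txt

# average the 3rd column of each block; a block ends at the next "Code" line
awk '/^Code/ { if (n) printf "%.3g\n", sum / n; sum = n = 0; next }
     { sum += $3; n++ }
     END { if (n) printf "%.3g\n", sum / n }' in.txt
```

This prints 8.67, 4, 2 and 4, one per line.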

awk: transpose only specific rows into multiple columns

Does anybody know how to transpose this input of rows in a file?
invStatus: ONLINE
System: 55
StatFail: 0
invState: 0 Unknown
invFailReason: None
invBase: SYS-MG5-L359-XO1-TRAFFIC STAT 5: TRAF2
invFlag: 0xeee5 SEMAN PRESENT STATUS H_DOWN BASE LOGIC_ONLINE DEX EPASUS INDEX ACK
dexIn: 0
dexIO: 0
badTrans: 0
badSys: 0
io_IN: 0
io_OUT: 0
Tr_in: 0
Tr_out: 0
into output similar to this:
invBase: SYS-MG5-L359-XO1-TRAFFIC STAT 5: TRAF2
invFlag: 0xeee5 SEMAN PRESENT STATUS H_DOWN BASE LOGIC_ONLINE DEX EPASUS INDEX ACK
invStatus: ONLINE System: 55 StatFail: 0 invState: 0 Unknown invFailReason: None
dexIn: 0 dexIO: 0 badTrans: 0 badSys: 0
io_IN: 0 io_OUT: 0 Tr_in: 0 Tr_out: 0
My first attempt was to append ";" to the end of each row, then join the rows and re-split them on that string, but I still get messy output.
I am at this stage with formatting:
cat port | sed 's/$/;/g' | awk 'ORS=/;$/?" ":"\n"'
I'd start with this
awk -F: '
{ data[$1] = $0 }
END {
    OFS = "\t"
    print data["invBase"]
    print data["invFlag"]
    print data["invStatus"], data["System"], data["StatFail"], data["invState"], data["invFailReason"]
    print data["dexIn"], data["dexIO"], data["badTrans"], data["badSys"]
    print data["io_IN"], data["io_OUT"], data["Tr_in"], data["Tr_out"]
}
' file
invBase: SYS-MG5-L359-XO1-TRAFFIC STAT 5: TRAF2
invFlag: 0xeee5 SEMAN PRESENT STATUS H_DOWN BASE LOGIC_ONLINE DEX EPASUS INDEX ACK
invStatus: ONLINE System: 55 StatFail: 0 invState: 0 Unknown invFailReason: None
dexIn: 0 dexIO: 0 badTrans: 0 badSys: 0
io_IN: 0 io_OUT: 0 Tr_in: 0 Tr_out: 0
Then, to make it as pretty as you want, start by storing the line lengths and changing the print statements to printf statements that use some of those lengths.
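As a rough sketch of that printf idea, with arbitrary hard-coded widths rather than widths computed from stored line lengths:

```shell
# recreate the sample input from the question
printf 'invStatus: ONLINE\nSystem: 55\nStatFail: 0\ninvState: 0 Unknown\ninvFailReason: None\ninvBase: SYS-MG5-L359-XO1-TRAFFIC STAT 5: TRAF2\ninvFlag: 0xeee5 SEMAN PRESENT STATUS H_DOWN BASE LOGIC_ONLINE DEX EPASUS INDEX ACK\ndexIn: 0\ndexIO: 0\nbadTrans: 0\nbadSys: 0\nio_IN: 0\nio_OUT: 0\nTr_in: 0\nTr_out: 0\n' > file

awk -F: '
{ data[$1] = $0 }
END {
    print data["invBase"]
    print data["invFlag"]
    # left-justify each stored line in a fixed-width column
    printf "%-20s %-12s %-12s %-22s %s\n", data["invStatus"], data["System"], data["StatFail"], data["invState"], data["invFailReason"]
    printf "%-20s %-12s %-12s %s\n", data["dexIn"], data["dexIO"], data["badTrans"], data["badSys"]
    printf "%-20s %-12s %-12s %s\n", data["io_IN"], data["io_OUT"], data["Tr_in"], data["Tr_out"]
}
' file
```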
A closer look at the file reveals that, except for 3 lines, they are sequential and can be pasted into 4 columns:
awk -F: '
$1 == "invBase" || $1 == "invFlag" { print; next }
$1 == "invStatus" { invStatus = $0; next }
{ line[n++] = $0 }
END {
    printf "%s\t", invStatus
    paste = "paste - - - -"
    for (i=0; i<n; i++) { print line[i] | paste }
    close(paste)
}
' file
which provides the same output as above.
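Note that paste is an external command the END block shells out to; if you would rather stay inside awk, the same grouping can be sketched without it (the hdr/row buffering below is my variation, not the answer above):

```shell
# recreate the sample input from the question
printf 'invStatus: ONLINE\nSystem: 55\nStatFail: 0\ninvState: 0 Unknown\ninvFailReason: None\ninvBase: SYS-MG5-L359-XO1-TRAFFIC STAT 5: TRAF2\ninvFlag: 0xeee5 SEMAN PRESENT STATUS H_DOWN BASE LOGIC_ONLINE DEX EPASUS INDEX ACK\ndexIn: 0\ndexIO: 0\nbadTrans: 0\nbadSys: 0\nio_IN: 0\nio_OUT: 0\nTr_in: 0\nTr_out: 0\n' > file

awk -F: '
$1 == "invBase" || $1 == "invFlag" { hdr = hdr $0 "\n"; next }
$1 == "invStatus" { row = $0; next }
{
    # glue each remaining line onto the current row, four values per output line
    row = (row == "" ? $0 : row "\t" $0)
    if (++n % 4 == 0) { body = body row "\n"; row = "" }
}
END { printf "%s%s", hdr, body }
' file
```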

Difference between adjacent data rows, with multiple columns

If I have:
1 2 3 4 5 6 . .
3 4 5 4 2 1 . .
5 7 5 7 2 0 . .
.
.
I want to show the difference of adjacent data rows, so that it would show:
2 2 2 0 -3 -5 . .
2 3 0 3 0 -1 . .
.
.
I found the post "difference between number in the same column using AWK", and, adapting the second answer, I thought this would do the job:
awk 'NR>1{print $0-p} {p=$0}' file
But that produces a single column of output. How do I get it to retain the column structure of the data?
$ cat tst.awk
NR>1 {
    for (i=1; i<=NF; i++) {
        printf "%2d%s", $i - p[i], (i<NF ? OFS : ORS)
    }
}
{ split($0,p) }
$ awk -f tst.awk file
2 2 2 0 -3 -5
2 3 0 3 0 -1
Try something like this:
awk 'NR > 1 { for (i = 1; i <= NF; i++) printf "%s%s", $i - p[i], (i < NF ? OFS : ORS) }
     { split($0, p) }' numbers
Written out:
$ cat > subtr.awk
{
    for (i=1; i<=NF; i++) b[i]=a[i]
    # or: for (i in a) b[i]=a[i]
    n = split($0, a)
}
NR > 1 {
    for (i=1; i<=NF; i++) {
        # or: for (i in a) {
        printf "%s%s", a[i]-b[i], (i==n ? ORS : OFS)
    }
    delete b
}
Test it:
$ awk -f subtr.awk file
2 2 2 0 -3 -5
2 3 0 3 0 -1

Awk Script to process data from a trace file

I have a table (.tr file) with different rows (events).
Event      Time    PacketLength    PacketId
sent 1 100 1
dropped 2 100 1
sent 3 100 2
sent 4.5 100 3
dropped 5 100 2
sent 6 100 4
sent 7 100 5
sent 8 100 6
sent 10 100 7
And I would like to create a new table like the following, and I don't know how to do it in AWK.
SentTime    PacketLength    Dropped
1 100 Yes
3 100 Yes
4.5 100
6 100
7 100
8 100
10 100
I have simple code to find dropped or sent packets, time, and id, but I do not know how to add a column to my table with the results for dropped packets.
BEGIN {}
{
    Event = $1;
    Time = $2;
    Packet = $6;
    Node = $10;
    id = $11;
    if (Event == "s" && Node == "1.0.1.2") {
        printf("%f\t %d\n", $2, $6);
    }
}
END {}
You have to save all the information in an array to postprocess it at the end of the file. Obviously, if the file is huge, this could cause memory problems.
BEGIN {
    template = "#sentTime\t#packetLength\t#dropped";
}
{
    print $0;
    event = $1;
    time = $2;
    packet_length = $3;
    packet_id = $4;
    # save all the info in an array
    packet_info[packet_id] = packet_info[packet_id] "#" packet_length "#" time "#" event;
}
END {
    # traverse the information of the array
    for (time in packet_info) {
        print "the time is: " time " = " packet_info[time];
        # for every element in the array (= packet),
        # the data has this format "#100#1#sent#100#2#dropped"
        split(packet_info[time], info, "#");
        # info[2] <-- 100
        # info[3] <-- 1
        # info[4] <-- sent
        # info[5] <-- 100
        # info[6] <-- 2
        # info[7] <-- dropped
        line = template;
        line = gensub("#sentTime", info[3], "g", line);
        line = gensub("#packetLength", info[2], "g", line);
        if (info[4] == "dropped")
            line = gensub("#dropped", "yes", "g", line);
        if (info[7] == "dropped")
            line = gensub("#dropped", "yes", "g", line);
        line = gensub("#dropped", "", "g", line);
        print line;
    } # for
}
I would say...
awk '/sent/   {pack[$4]=$2; len[$4]=$3}
     /dropped/{drop[$4]}
     END {
         print "Sent time", "PacketLength", "Dropped"
         for (p in pack)
             print pack[p], len[p], ((p in drop) ? "yes" : "")
     }' file
This stores the packets in pack[], the lengths in len[], and the dropped flags in drop[], so they can be looked up later.
Test
$ awk '/sent/{pack[$4]=$2; len[$4]=$3} /dropped/{drop[$4]} END {print "Sent time", "PacketLength", "Dropped"; for (p in pack) print pack[p], len[p], ((p in drop)?"yes":"")}' a
Sent time PacketLength Dropped
1 100 yes
3 100 yes
4.5 100
6 100
7 100
8 100
10 100
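One caveat: for (p in pack) visits the packet ids in an unspecified order, so the rows are not guaranteed to come out sorted by send time. A sketch that remembers input order via an extra order[] array (my addition, not part of the answer above):

```shell
# recreate the trace from the question
printf 'sent 1 100 1\ndropped 2 100 1\nsent 3 100 2\nsent 4.5 100 3\ndropped 5 100 2\nsent 6 100 4\nsent 7 100 5\nsent 8 100 6\nsent 10 100 7\n' > file

awk '/sent/    { order[++n] = $4; sent[$4] = $2; len[$4] = $3 }
     /dropped/ { drop[$4] }
     END {
         print "SentTime", "PacketLength", "Dropped"
         for (i = 1; i <= n; i++) {
             p = order[i]
             # append the flag only when present, so kept rows have no trailing blank
             line = sent[p] " " len[p]
             if (p in drop) line = line " yes"
             print line
         }
     }' file
```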

Remove data using AWK

I have a file of values that I wish to plot using gnuplot. The problem is that there are some values that I wish to remove.
Here is an example of my data:
1 52
2 3
3 0
4 4
5 1
6 1
7 1
8 0
9 0
I want to remove any row in which the right column is 0, so the data above would end up looking like this:
1 52
2 3
4 4
5 1
6 1
7 1
Let's just check field 2:
awk '$2' file
If the 2nd field has a true value, that is, not 0 and not empty, the condition is true. In that case, awk performs its default action: print $0, i.e. print the current line.
Updated, shorter:
awk '$2 == 0 { next; } { print; }'
which is equivalent to the more explicit:
awk '{ if ($2 == 0) { next; } else { print; } }'
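A quick sanity check on the sample data (data.txt is just a scratch name here; any of the variants above gives the same result):

```shell
# recreate the sample input from the question
printf '1 52\n2 3\n3 0\n4 4\n5 1\n6 1\n7 1\n8 0\n9 0\n' > data.txt

# keep only rows whose second field is truthy (non-zero, non-empty)
awk '$2' data.txt
```

which prints the six surviving rows, 1 52 through 7 1.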