Find out range for every field in a column - indexing

I have a column that looks like this:
A 1
B 3
C 5
D 4
E 7
F 1
G 1
H 3
For every field in column 2, I want to calculate the range (max - min) within a radius of 3 fields up and down:
A range(1 3 5 4)
B range(1 3 5 4 7)
C range(1 3 5 4 7 1)
D range(1 3 5 4 7 1 1)
E range( 3 5 4 7 1 1 3)
F range( 5 4 7 1 1 3)
G range( 4 7 1 1 3)
H range( 7 1 1 3)
How can I do this in awk?
I could do the same in Perl using:
my $set_size = @values;
for ( my $i = 0 ; $i < $set_size ; $i++ ) {
my $min = $i - 4;
if ( $min < 0 ) { $min = 0; }
my $max = $i + 4;
if ( $max > ( $set_size - 1 ) ) { $max = $set_size - 1; }
my $min_val = $values[$min];
my $max_val = $values[$min];
for ( my $j = $min ; $j <= $max ; $j++ ) {
if ( $values[$j] <= $min_val ) { $min_val = $values[$j]; }
if ( $values[$j] >= $max_val ) { $max_val = $values[$j]; }
}
my $range = $max_val - $min_val;
printf "$points[$i] %.15f\n", $range;
}

I don't know exactly what "I want to calculate the range (max-min) of 3 field radius up and down" means, but to get the output you posted from the input you posted, this would do it:
$ cat tst.awk
{
keys[NR] = $1
values[NR] = $2
}
END {
range = 3
for (i=1; i<=NR; i++) {
min = ( (i - range) >= 1 ? i - range : 1 )
max = ( (i + range) <= NR ? i + range : NR )
printf "%s range(", keys[i]
for (j=min; j<=max; j++) {
printf "%s%s", values[j], (j<max ? " " : ")\n")
}
}
}
$
$ awk -f tst.awk file
A range(1 3 5 4)
B range(1 3 5 4 7)
C range(1 3 5 4 7 1)
D range(1 3 5 4 7 1 1)
E range(3 5 4 7 1 1 3)
F range(5 4 7 1 1 3)
G range(4 7 1 1 3)
H range(7 1 1 3)
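If the goal really is the numeric range (max - min) of each window rather than the window contents, the same clipping logic carries over. Here is a minimal Python sketch, assuming the values are already read into a list; the function name window_ranges is made up for illustration and is not part of the awk answer above.

```python
def window_ranges(values, radius=3):
    """Return (window, max - min) for each position, clipping at both ends."""
    results = []
    for i in range(len(values)):
        lo = max(i - radius, 0)                # first index of the window
        hi = min(i + radius, len(values) - 1)  # last index of the window
        window = values[lo:hi + 1]
        results.append((window, max(window) - min(window)))
    return results


keys = ["A", "B", "C", "D", "E", "F", "G", "H"]
values = [1, 3, 5, 4, 7, 1, 1, 3]

for key, (window, spread) in zip(keys, window_ranges(values)):
    print(key, "range(" + " ".join(map(str, window)) + ")", "=", spread)
```

With the sample input, the first window is 1 3 5 4, so its range is 5 - 1 = 4; every later window contains both 7 and 1, so its range is 6.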

Your sample perl doesn't print out your sample output, it seems to do something different... so here's how I'd do it in perl:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use List::Util qw/min max/;
my (@col1, @col2);
while (<>) {
    chomp;
    my ($v1, $v2) = split;
    push @col1, $v1;
    push @col2, $v2;
}
my @prefix;
for my $i (0 .. $#col1) {
    my @range = @col2[max($i - 3, 0) .. min($i + 3, $#col2)];
    push @prefix, ' ' if $i > 3;
    unshift @range, @prefix;
    say "$col1[$i] range(@range)"
}
}
running it:
$ perl range.pl input.txt
A range(1 3 5 4)
B range(1 3 5 4 7)
C range(1 3 5 4 7 1)
D range(1 3 5 4 7 1 1)
E range( 3 5 4 7 1 1 3)
F range( 5 4 7 1 1 3)
G range( 4 7 1 1 3)
H range( 7 1 1 3)
The formatting will break if any of the numbers are greater than 9, though.
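One way around that limitation is to pad every cell to the width of the widest value, so that skipped leading cells take up exactly as much room as a printed one. This is only a sketch of the idea in Python, not a translation of the Perl above; aligned_rows is a hypothetical helper name.

```python
def aligned_rows(keys, values, radius=3):
    """Format each window so that the same value lands in the same column."""
    width = max(len(str(v)) for v in values)  # widest value sets the cell width
    rows = []
    for i, key in enumerate(keys):
        lo = max(i - radius, 0)
        hi = min(i + radius, len(values) - 1)
        # Blank cells stand in for the values that fell out of the window on the left.
        cells = [" " * width] * lo
        cells += [str(v).rjust(width) for v in values[lo:hi + 1]]
        rows.append(f"{key} range({' '.join(cells)})")
    return rows


for row in aligned_rows(list("ABCDEFGH"), [1, 30, 5, 4, 7, 100, 1, 3]):
    print(row)
```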

Since you tagged tcl:
#!/usr/bin/env tclsh
lassign $argv file size
set fh [open $file r]
# assume exactly 2 space-separated words per line
set data [dict create {*}[split [read -nonewline $fh]]]
close $fh
set len [dict size $data]
set values [dict values $data]
set i 0
dict for {key _} $data {
set first [expr {max($i - $size, 0)}]
set last [expr {min($i + $size, $len)}]
puts [format "%s range(%s%s)" \
$key \
[string repeat " " $first] \
[lrange $values $first $last] \
]
incr i
}
outputs
A range(1 3 5 4)
B range(1 3 5 4 7)
C range(1 3 5 4 7 1)
D range(1 3 5 4 7 1 1)
E range( 3 5 4 7 1 1 3)
F range( 5 4 7 1 1 3)
G range( 4 7 1 1 3)
H range( 7 1 1 3)

TXR at the shell prompt:
bash$ txr -c '@(collect)
@label @num
@(end)
@(bind rng @[window-map 3 nil (opip list (remq nil) (mapcar toint)) num])
@(output)
@  (repeat)
@label range(@rng) -> @(find-min rng)..@(find-max rng)
@  (end)
@(end)'
A 1
B 3
C 5
D 4
E 7
F 1
G 1
H 3
[Ctrl-D][Enter]
A range(1 3 5 4) -> 1..5
B range(1 3 5 4 7) -> 1..7
C range(1 3 5 4 7 1) -> 1..7
D range(1 3 5 4 7 1 1) -> 1..7
E range(3 5 4 7 1 1 3) -> 1..7
F range(5 4 7 1 1 3) -> 1..7
G range(4 7 1 1 3) -> 1..7
H range(7 1 1 3) -> 1..7

Related

Find and replace and move a line that contains a specific string

Assuming I have the following text file:
a b c d 1 2 3
e f g h 1 2 3
i j k l 1 2 3
m n o p 1 2 3
How do I replace '1 2 3' with '4 5 6' in the line that contains the letter (e) and move it after the line that contains the letter (k)?
N.B. The line that contains the letter (k) may be anywhere in the file; the lines are not assumed to be in any order.
My approach is:
1. Remove the line I want to replace
2. Find the lines before the line I want to move it after
3. Find the lines after the line I want to move it after
4. Append the output to a file
grep -v 'e' $original > $file
grep -B999 'k' $file > $output
grep 'e' $original | sed 's/1 2 3/4 5 6/' >> $output
grep -A999 'k' $file | tail -n+2 >> $output
rm $file
mv $output $original
But there are a lot of issues with this solution:
- it uses a lot of grep commands that seem unnecessary
- the arguments -A999 and -B999 assume the file has no more than 999 lines; it would be better to have another way to get the lines before and after the matched line
I am looking for a more efficient way to achieve that.
Using sed
$ sed '/e/{s/1 2 3/4 5 6/;h;d};/k/{G}' input_file
a b c d 1 2 3
i j k l 1 2 3
e f g h 4 5 6
m n o p 1 2 3
Here is a GNU awk solution:
awk '
/\<e\>/{
s=$0
sub("1 2 3", "4 5 6", s)
next
}
/\<k\>/ && s {
printf("%s\n%s\n",$0,s)
next
} 1
' file
Or POSIX awk:
awk '
function has(x) {
for(i=1; i<=NF; i++) if ($i==x) return 1
return 0
}
has("e") {
s=$0
sub("1 2 3", "4 5 6", s)
next
}
has("k") && s {
printf("%s\n%s\n",$0,s)
next
} 1
' file
Either prints:
a b c d 1 2 3
i j k l 1 2 3
e f g h 4 5 6
m n o p 1 2 3
This works regardless of the order of e and k in the file:
awk '
function has(x) {
for(i=1; i<=NF; i++) if ($i==x) return 1
return 0
}
has("e") {
s=$0
sub("1 2 3", "4 5 6", s)
next
}
FNR<NR && has("k") && s {
printf("%s\n%s\n",$0,s)
s=""
next
}
FNR<NR
' file file
This awk should work for you:
awk '
/(^| )e( |$)/ {
sub(/1 2 3/, "4 5 6")
p = $0
next
}
1
/(^| )k( |$)/ {
print p
p = ""
}' file
a b c d 1 2 3
i j k l 1 2 3
e f g h 4 5 6
m n o p 1 2 3
This might work for you (GNU sed):
sed -n '/e/{s/1 2 3/4 5 6/;s#.*#/e/d;/k/s/.*/\&\\n&/#p};' file | sed -f - file
Design a sed script by passing the file twice and applying the sed instructions from the first pass to the second.
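The same "pull the line out first, then re-insert it" idea can be sketched outside sed. The Python below is an illustration only, assuming the whole file fits in memory and that e and k are whitespace-separated fields; replace_and_move and its parameters are hypothetical names.

```python
def replace_and_move(lines, src="e", dst="k", old="1 2 3", new="4 5 6"):
    """Edit the line holding field `src` and move it after the line holding `dst`."""
    moved = None
    rest = []
    # Pass 1: take the src line out of the stream and apply the replacement.
    for line in lines:
        if moved is None and src in line.split():
            moved = line.replace(old, new)
        else:
            rest.append(line)
    # Pass 2: emit everything else, dropping the edited line in after dst.
    out = []
    for line in rest:
        out.append(line)
        if moved is not None and dst in line.split():
            out.append(moved)
            moved = None
    if moved is not None:
        # No dst line was found: keep the edited line at the end so nothing is lost.
        out.append(moved)
    return out


sample = ["a b c d 1 2 3", "e f g h 1 2 3", "i j k l 1 2 3", "m n o p 1 2 3"]
print("\n".join(replace_and_move(sample)))
```

Because the source line is removed before the second pass starts, the result does not depend on whether e comes before or after k in the input.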
Another solution is to use ed:
cat <<\! | ed file
/e/s/1 2 3/4 5 6/
/e/m/k/
wq
!
Or if you prefer:
<<<$'/e/s/1 2 3/4 5 6/\n.m/k/\nwq' ed -s file

How can I aggregate strings from many cells into one cell?

Say I have two classes with a handful of students each, and I want to think of the possible pairings in each class. In my original data, I have one line per student.
What's the easiest way in Pandas to turn this dataset
Class Students
0 1 A
1 1 B
2 1 C
3 1 D
4 1 E
5 2 F
6 2 G
7 2 H
into this new one?
Class Students
0 1 A,B
1 1 A,C
2 1 A,D
3 1 A,E
4 1 B,C
5 1 B,D
6 1 B,E
7 1 C,D
8 1 C,E
9 1 D,E
10 2 F,G
11 2 F,H
12 2 G,H
Try this:
import itertools
import pandas as pd

# Build the sample data from the question.
cla = [1, 1, 1, 1, 1, 2, 2, 2]
s = ["A", "B", "C", "D", "E", "F", "G", "H"]
df = pd.DataFrame(cla, columns=["Class"])
df['Student'] = s

def create_combos(list_students):
    # Every unordered pair of students, joined as "X,Y".
    combos = itertools.combinations(list_students, 2)
    str_students = []
    for i in combos:
        str_students.append(str(i[0])+","+str(i[1]))
    return str_students

def iterate_df(class_id):
    # Pair up the students of one class and repeat the class id once per pair.
    df_temp = df.loc[df['Class'] == class_id]
    list_student = list(df_temp['Student'])
    list_combos = create_combos(list_student)
    list_id = [class_id for i in list_combos]
    return list_id, list_combos

# sorted() keeps the classes in a predictable order (a bare set has none).
list_classes = sorted(set(df['Class']))
new_id = []
new_combos = []
for idx in list_classes:
    tmp_id, tmp_combo = iterate_df(idx)
    new_id += tmp_id
    new_combos += tmp_combo
new_df = pd.DataFrame(new_id, columns=["Class"])
new_df["Student"] = new_combos
print(new_df)
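The core of the answer above is itertools.combinations applied once per class. As a rough sketch of just that step, here is a standard-library-only version that works on plain (class, student) tuples; pair_rows is a hypothetical name, and the resulting list could then be handed to pd.DataFrame(pairs, columns=["Class", "Students"]) if a DataFrame is needed.

```python
from itertools import combinations, groupby


def pair_rows(rows):
    """Return (class_id, "X,Y") for every unordered pair of students in a class."""
    pairs = []
    # groupby only merges adjacent items, so sort by class first.
    for class_id, group in groupby(sorted(rows), key=lambda r: r[0]):
        students = [student for _, student in group]
        for a, b in combinations(students, 2):
            pairs.append((class_id, f"{a},{b}"))
    return pairs


rows = [(1, "A"), (1, "B"), (1, "C"), (1, "D"), (1, "E"),
        (2, "F"), (2, "G"), (2, "H")]
for class_id, pair in pair_rows(rows):
    print(class_id, pair)
```

Five students give 10 pairs and three students give 3, so the sample produces 13 rows in total.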

Using awk to count number of row range

I have a data set: (file.txt)
X Y
1 a
2 b
3 c
10 d
11 e
12 f
15 g
20 h
25 i
30 j
35 k
40 l
41 m
42 n
43 o
46 p
I want to add two columns, Up10 and Down10:
Up10: the number of rows whose X lies between (X-10) and (X).
Down10: the number of rows whose X lies between (X) and (X+10).
For example:
X Y Up10 Down10
35 k 3 5
For Up10 (35-10): X=35, X=30, X=25; total = 3 rows
For Down10 (35+10): X=35, X=40, X=41, X=42, X=43; total = 5 rows
I have tried the following, but I can't get the 3rd and 4th columns to show:
awk 'BEGIN{ FS=OFS="\t" }
NR==FNR{
a[$1]+=$3
next
}
{ $(NF+10)=a[$3] }
{ $(NF-10)=a[$4] }
1
' file.txt file.txt > file-2.txt
Desired Output:
X Y Up10 Down10
1 a 1 5
2 b 2 5
3 c 3 4
10 d 4 5
11 e 5 4
12 f 5 3
15 g 4 3
20 h 5 3
25 i 3 3
30 j 3 3
35 k 3 5
40 l 3 5
41 m 3 4
42 n 4 3
43 o 5 2
46 p 5 1
This is Pierre François' solution (thanks again, @Pierre François):
awk '
BEGIN{OFS="\t"; print "X\tY\tUp10\tDown10"}
(NR == FNR) && (FNR > 1){a[$1] = $1 + 0}
(NR > FNR) && (FNR > 1){
up = 0; upl = $1 - 10
down = 0; downl = $1 + 10
for (i in a) { i += 0 # tricky: convert i to integer
if ((i >= upl) && (i <= $1)) {up++}
if ((i >= $1) && (i <= downl)) {down++}
}
print $1, $2, up, down;
}
' file.txt file.txt > file-2.txt
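The awk solution compares every row against every other row. If the X values can be sorted, each count can instead be read off with two binary searches. The following Python sketch shows that alternative under the assumption that only the X column matters; up_down_counts is a made-up name, and unlike the awk array keyed on $1 it counts duplicate X values separately.

```python
from bisect import bisect_left, bisect_right


def up_down_counts(xs, span=10):
    """For each x, count rows in [x - span, x] (up) and in [x, x + span] (down)."""
    xs = sorted(xs)
    out = []
    for x in xs:
        # bisect_left finds the first index >= bound, bisect_right the first index > bound.
        up = bisect_right(xs, x) - bisect_left(xs, x - span)
        down = bisect_right(xs, x + span) - bisect_left(xs, x)
        out.append((x, up, down))
    return out


xs = [1, 2, 3, 10, 11, 12, 15, 20, 25, 30, 35, 40, 41, 42, 43, 46]
for x, up, down in up_down_counts(xs):
    print(x, up, down, sep="\t")
```

For X=35 this gives up=3 (25, 30, 35) and down=5 (35, 40, 41, 42, 43), matching the example in the question.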

Excel VBA Loop x = x + 1

I'm new to Excel VBA and I'm trying to make a loop that keeps a running count (X = X + 1), but each time the inner loop ends, the count starts over instead of continuing from the last value.
This is what I have:
For I = 1 To 3
J = 2
For K = 1 To J * 2 Step 1
Debug.Print K
Next K
Next I
This is what I get: 1 2 3 4 1 2 3 4 1 2 3 4.
What I would like to get is: 1 2 3 4 5 6 7 8 9 10 11 12.
Thanks for the help provided. I thought this would solve my problem, but it's a bit more complicated. I need this because I'm adding coordinates in X, Y, Z format with this code:
For I = 1 To 6
X = 0
J = 10
RobApp.Project.Structure.Nodes.Create X = X + 1, 0, 0, J * (I - 1)
RobApp.Project.Structure.Nodes.Create X = X + 1, Range("N34") * 0.15, 0, J * (I - 1)
Next I
"X = X+1" is the node number. I want it to be sequential (1, 2, 3, 4 and so on) while J increases the Z coordinate. For example, for the first line of code:
Node 1 = 0,0,0
Node 2 = 0,0,10
Node 3 = 0,0,20
and so on!
Or rather, use the extra variable X as you originally planned:
X = 0
For I = 1 To 3
J = 2
For K = 1 To J * 2 Step 1
X = X + 1
Debug.Print X
Next K
Next I
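The point is that the counter is initialised once, before the outer loop, and incremented on its own line before it is used. The same pattern applied to the node example might look like the Python sketch below. It only builds a list of (node, x, y, z) tuples and does not call the RobApp API; node_rows is a hypothetical name, and x_offset is a placeholder for the value of Range("N34") * 0.15.

```python
def node_rows(levels=6, spacing=10, x_offset=1.5):
    """Number nodes sequentially while z grows by `spacing` per level."""
    node = 0  # initialised once, outside both loops
    rows = []
    for i in range(1, levels + 1):
        z = spacing * (i - 1)
        for x in (0, x_offset):  # two nodes per level, as in the question
            node += 1            # increment first, then use the new number
            rows.append((node, x, 0, z))
    return rows


for row in node_rows(levels=3):
    print(row)
```

With three levels this numbers the nodes 1 through 6, with z taking the values 0, 10 and 20.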

Counter in awk if else loop

Can you explain to me why this simple one-liner does not work? Thanks for your time.
awk 'BEGIN{i=1}{if($2 == i){print $0} else{print "0",i} i=i+1}' check
input text file with name "check":
a 1
b 2
c 3
e 5
f 6
g 7
desired output:
a 1
b 2
c 3
0 4
e 5
f 6
g 7
output received:
a 1
b 2
c 3
0 4
0 5
0 6
awk 'BEGIN{i=1}{ if($2 == i){print $0; } else{print "0",i++; print $0 } i++ }' check
- Increment i one more time in the else branch (you are inserting a new line).
- Print the current line in the else branch, too.
- This works only if at most one line is missing between the lines that are present; otherwise you need a loop that prints the missing lines.
Or simplified:
awk 'BEGIN{i=1}{ if($2 != i){print "0",i++; } print $0; i++ }' check
Yours is broken because:
1. you read the next line ("e 5"),
2. $2 is not equal to your counter,
3. you print the placeholder line and increment your counter (to 5),
4. you do not print the current line,
5. you read the next line ("f 6"),
6. goto 2
A while loop is warranted here -- that will also handle the case when you have gaps greater than a single number.
awk '
NR == 1 {prev = $2}
{
while ($2 > prev+1)
print "0", ++prev
print
prev = $2
}
' check
or, if you like impenetrable one-liners:
awk 'NR==1{p=$2}{while($2>p+1)print "0",++p;p=$2}1' check
All you need is:
awk '{while (++i<$2) print 0, i}1' file
Look:
$ cat file
a 1
b 2
c 3
e 5
f 6
g 7
k 11
n 14
$ awk '{while (++i<$2) print 0, i}1' file
a 1
b 2
c 3
0 4
e 5
f 6
g 7
0 8
0 9
0 10
k 11
0 12
0 13
n 14
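For comparison, the same gap-filling loop can be spelled out in Python. This is a sketch that assumes the second column is an increasing integer starting at or after 1; fill_gaps is an illustrative name, not something from the answers above.

```python
def fill_gaps(rows):
    """Insert ("0", n) placeholder rows for every number missing from the sequence."""
    out = []
    expected = 1  # next number we expect to see
    for label, n in rows:
        while expected < n:  # emit one placeholder per missing number
            out.append(("0", expected))
            expected += 1
        out.append((label, n))  # always print the current row
        expected = n + 1
    return out


rows = [("a", 1), ("b", 2), ("c", 3), ("e", 5), ("f", 6),
        ("g", 7), ("k", 11), ("n", 14)]
for label, n in fill_gaps(rows):
    print(label, n)
```

The inner while loop is what handles gaps wider than one number, which is exactly where the original if/else version fell over.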