OCaml: Print a long int list 10 elements per row - printf

I'm working with really long lists of integers and need a way of printing them 10 to a row. This is what I've got so far and now I'm stuck:
open Printf
let print_list list = List.iter (printf "%d ") list;;
(* Remove first n elements from list *)
let rec remove n list =
if n== 0 then list
else match list with
| [] -> []
| hd::tl -> remove (n-1) tl;;
(* Remove and return first n elements from a list *)
let rec take n list =
match n with
| 0 -> []
| _ -> List.hd list :: take (n-1) (List.tl list);;
let rec print_rows list =
if List.length list > 10 then
begin
let l = take 10 list;
print_list l;
print_endline " ";
print_rows (remove 5 list)
end else print_list list;;
I'm sure there is a better way recursively with matching patterns, but I can't figure this out. Help!

Here's a function that does something close to what you want. It doesn't do anything fancy, it just counts the number of ints printed so far and inserts endlines at the right times.
let printby10 intlist =
let iprint count n =
Printf.printf "%d " n;
if count mod 10 = 9 then Printf.printf "\n";
count + 1
in
ignore (List.fold_left iprint 0 intlist)
This code leaves an incomplete line if the number of ints isn't a multiple of 10. Maybe you would want to fix that up.
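One way to handle the incomplete last line mentioned above is to cut the list into rows first and end every row with a newline, including a short final one. The sketch below shows the idea in Python rather than OCaml, purely as an illustration; `format_rows` and `per_row` are hypothetical names, not part of the answer.

```python
def format_rows(values, per_row=10):
    """Return one space-separated string per row of at most per_row values."""
    rows = []
    for start in range(0, len(values), per_row):
        # The final slice is simply shorter when len(values) is not a
        # multiple of per_row, so no special case is needed.
        chunk = values[start:start + per_row]
        rows.append(" ".join(str(v) for v in chunk))
    return rows


if __name__ == "__main__":
    # Every row, including the last partial one, ends with a newline.
    for row in format_rows(list(range(1, 24))):
        print(row)
```

The same shape carries over to OCaml: take up to n elements, print them, then recurse on the rest.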

Another approach, very close to @Jeffrey Scofield's, is to use the standard function List.iteri, which passes each element's index along with the element:
let print_by_rows n_per_row =
List.iteri (fun i x ->
print_int x;
if (i + 1) mod n_per_row <> 0 then print_string " "
else print_newline ())
A test:
μ> print_by_rows 10 (Array.to_list (Array.make 20 42));;
42 42 42 42 42 42 42 42 42 42
42 42 42 42 42 42 42 42 42 42
- : unit = ()
And one more:
μ> print_by_rows 5 (Array.to_list (Array.make 20 42));;
42 42 42 42 42
42 42 42 42 42
42 42 42 42 42
42 42 42 42 42
- : unit = ()

How to print optimal tours of a vehicle routing problem in CPLEX?

I modeled a Vehicle Routing Problem in CPLEX and now I'd like to print the optimal tours it found using post-processing.
My decision variable looks like this:
dvar boolean x[vehicles][edges];
1, if the edge is traversed by the vehicle, 0 otherwise.
An edge is a tuple containing two customers, as follows:
tuple edge {
string i;
string j;
}
with customers being:
{string} customers = {"0", "1", "2", "3", "4", "5", "6"}
where 0 and 6 represent the depot where all tours start and end.
My post-processing currently looks like this:
execute {
writeln("Optimal value: ", cplex.getObjValue());
writeln("The following tours should be driven:");
for (var k in vehicles) {
write("Vehicle ", k, ": ");
var y = 0;
write(y);
for (var a in edges) {
if (x[k][a] == 1 && a.i == y) {
write(" - ", a.j);
y = a.j;
}
}
writeln();
}
}
Sadly, it doesn't work the intended way.
You need to turn the boolean values for the edges into tours.
See the MTZ example from How to with OPL:
// What is better and relies on CPLEX is the MTZ model ( Miller-Tucker-Zemlin formulation )
// Cities
int n = ...;
range Cities = 1..n;
// Edges -- sparse set
tuple edge {int i; int j;}
setof(edge) Edges = {<i,j> | ordered i,j in Cities};
int dist[Edges] = ...;
setof(edge) Edges2 = {<i,j> | i,j in Cities : i!=j};
int dist2[<i,j> in Edges2] = (<i,j> in Edges)?dist[<i,j>]:dist[<j,i>];
// Decision variables
dvar boolean x[Edges2];
dvar int u[1..n] in 1..n;
/*****************************************************************************
*
* MODEL
*
*****************************************************************************/
// Objective
minimize sum (<i,j> in Edges2) dist2[<i,j>]*x[<i,j>];
subject to {
// Each city is linked with two other cities
forall (j in Cities)
{
sum (<i,j> in Edges2) x[<i,j>]==1;
sum (<j,k> in Edges2) x[<j,k>] == 1;
}
// MTZ
u[1]==1;
forall(i in 2..n) 2<=u[i]<=n;
forall(e in Edges2:e.i!=1 && e.j!=1) (u[e.j]-u[e.i])+1<=(n-1)*(1-x[e]);
};
{edge} solution={e | e in Edges2 : x[e]==1};
int follower[Cities];
{int} sol;
execute
{
//writeln("path ",solution);
for(var e in solution) follower[e.i]=e.j;
var k=1;
for(var i in Cities)
{
sol.add(k);
k=follower[k];
}
writeln("sol = ",sol);
}
/*
which gives
// solution (optimal) with objective 7542
sol = {1 22 31 18 3 17 21 42 7 2 30 23 20 50 29 16 46 44 34 35 36 39 40 37 38 48
24 5 15 6 4 25 12 28 27 26 47 13 14 52 11 51 33 43 10 9 8 41 19 45 32
49}
*/
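The post-processing idea in the execute block above (record each node's follower, then walk from the start) can be sketched outside OPL as well. The Python below is only an illustration under assumptions: `follow_tour` is a hypothetical helper, and the edge list stands in for the edges with x == 1 for a single vehicle. It also suggests why a single pass over an unordered edge set, as in the question, can miss part of the tour: an edge may be visited before its start node becomes the current node.

```python
def follow_tour(chosen_edges, start):
    """Order the selected (i, j) edges of one vehicle into a node sequence."""
    # Map each node to the node that follows it in the tour.
    successor = {}
    for i, j in chosen_edges:
        successor[i] = j

    tour = [start]
    node = start
    # A tour over k edges visits at most k + 1 nodes; the length check
    # stops the walk if the edges form a closed cycle.
    while node in successor and len(tour) <= len(chosen_edges):
        node = successor[node]
        tour.append(node)
    return tour


if __name__ == "__main__":
    # Edges deliberately listed out of tour order; "0" and "6" are the depot.
    edges = [("2", "5"), ("0", "2"), ("5", "6")]
    print(" - ".join(follow_tour(edges, "0")))
```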

Using awk to count number of row group

I have a data set: (file.txt)
X Y
1 a
2 b
3 c
10 d
11 e
12 f
15 g
20 h
25 i
30 j
35 k
40 l
41 m
42 n
43 o
46 p
I want to add two columns, Up10 and Down10:
Up10: the number of rows whose X lies between X-10 and X (inclusive).
Down10: the number of rows whose X lies between X and X+10 (inclusive).
For example:
X Y Up10 Down10
35 k 3 5
For Up10 (35-10=25): X=25, X=30, X=35, a total of 3 rows.
For Down10 (35+10=45): X=35, X=40, X=41, X=42, X=43, a total of 5 rows.
Desired Output:
X Y Up10 Down10
1 a 1 5
2 b 2 5
3 c 3 4
10 d 4 5
11 e 5 4
12 f 5 3
15 g 4 3
20 h 5 3
25 i 3 3
30 j 3 3
35 k 3 5
40 l 3 5
41 m 3 4
42 n 4 3
43 o 5 2
46 p 5 1
This is Pierre François's solution (thanks again, @Pierre François):
awk '
BEGIN{OFS="\t"; print "X\tY\tUp10\tDown10"}
(NR == FNR) && (FNR > 1){a[$1] = $1 + 0}
(NR > FNR) && (FNR > 1){
up = 0; upl = $1 - 10
down = 0; downl = $1 + 10
for (i in a) { i += 0 # tricky: convert i to integer
if ((i >= upl) && (i <= $1)) {up++}
if ((i >= $1) && (i <= downl)) {down++}
}
print $1, $2, up, down;
}
' file.txt file.txt > file-2.txt
But when I run this command on 13 GB of data, it takes too long.
I then tried this approach on the same 13 GB of data:
awk 'BEGIN{ FS=OFS="\t" }
NR==FNR{a[NR]=$1;next} {x=y=FNR;while(--x in a&&$1-10<a[x]){} while(++y in a&&$1+10>a[y]){} print $0,FNR-x,y-FNR}
' file.txt file.txt > file-2.txt
When file-2.txt reaches 1.1 GB, the command appears to freeze. I have waited several hours, but the command never finishes and the output file is never completed.
Note: I am working on Google Cloud. Machine type:
e2-highmem-8 (8 vCPUs, 64 GB memory)
A single-pass awk that keeps a sliding window of the last 10 records and uses it to count the ups and downs. For symmetry's sake there should be deletes in the END block too, but I guess a few extra array elements in memory aren't going to make a difference:
$ awk '
BEGIN {
FS=OFS="\t"
}
NR==1 {
print $1,$2,"Up10","Down10"
}
NR>1 {
a[NR]=$1
b[NR]=$2
for(i=NR-9;i<=NR;i++) {
if(a[i]>=a[NR]-10&&i>=2)
up[NR]++
if(a[i]<=a[NR-9]+10&&i>=2)
down[NR-9]++
}
}
NR>10 {
print a[NR-9],b[NR-9],up[NR-9],down[NR-9]
delete a[NR-9]
delete b[NR-9]
delete up[NR-9]
delete down[NR-9]
}
END {
for(nr=NR+1;nr<=NR+9;nr++) {
for(i=nr-9;i<=nr;i++)
if(a[i]<=a[nr-9]+10&&i>=2&&i<=NR)
down[nr-9]++
print a[nr-9],b[nr-9],up[nr-9],down[nr-9]
}
}' file
Output:
X Y Up10 Down10
1 a 1 5
2 b 2 5
...
35 k 3 5
...
43 o 5 2
46 p 5 1
Another single-pass approach with a sliding window:
awk '
NR == 1 { next } # skip the header
NR == 2 { min = max = cur = 1; X[cur] = $1; Y[cur] = $2; next }
{ X[++max] = $1; Y[max] = $2
if (X[cur] >= $1 - 10) next
for (; X[cur] + 10 < X[max]; ++cur) {
for (; X[min] < X[cur] - 10; ++min) {
delete X[min]
delete Y[min]
}
print X[cur], Y[cur], cur - min + 1, max - cur
}
}
END {
for (; cur <= max; ++cur) {
for (; X[min] < X[cur] - 10; ++min);
for (i = max; i > cur && X[cur] + 10 < X[i]; --i);
print X[cur], Y[cur], cur - min + 1, i - cur + 1
}
}
' file
The script assumes the X column is ordered numerically.
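To make the counting rule explicit, here is a small Python sketch that computes Up10 and Down10 for a sorted X column with binary search. It is not a drop-in replacement for the awk scripts: it holds the whole column in memory, and `up_down_counts` and `width` are hypothetical names. Like the script above, it assumes X is sorted numerically.

```python
from bisect import bisect_left, bisect_right


def up_down_counts(xs, width=10):
    """For each x in the sorted list xs, count rows in [x-width, x] and [x, x+width]."""
    counts = []
    for x in xs:
        # Rows whose value lies in the closed interval [x - width, x].
        up = bisect_right(xs, x) - bisect_left(xs, x - width)
        # Rows whose value lies in the closed interval [x, x + width].
        down = bisect_right(xs, x + width) - bisect_left(xs, x)
        counts.append((up, down))
    return counts


if __name__ == "__main__":
    xs = [1, 2, 3, 10, 11, 12, 15, 20, 25, 30, 35, 40, 41, 42, 43, 46]
    for x, (up, down) in zip(xs, up_down_counts(xs)):
        print(x, up, down, sep="\t")
```

Each lookup is logarithmic in the number of rows, so the cost per row does not grow with the window contents.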

Splitting a coordinate string into X and Y columns with a pandas data frame

So I created a pandas data frame showing the coordinates for an event and number of times those coordinates appear, and the coordinates are shown in a string like this.
Coordinates Occurrences x
0 (76.0, -8.0) 1 0
1 (-41.0, -24.0) 1 1
2 (69.0, -1.0) 1 2
3 (37.0, 30.0) 1 3
4 (-60.0, 1.0) 1 4
.. ... ... ..
63 (-45.0, -11.0) 1 63
64 (80.0, -1.0) 1 64
65 (84.0, 24.0) 1 65
66 (76.0, 7.0) 1 66
67 (-81.0, -5.0) 1 67
I want to create a new data frame that shows the x and y coordinates individually, along with their occurrences, like this:
x    Occurrences    y    Occurrences
76   ...            -8   ...
-41  ...            -24  ...
69   ...            -1   ...
37   ...            30   ...
-60  ...            1    ...
I have tried to split the string, but I don't think I am doing it correctly, and I don't know how to add the result to the table in any case. I think I'd have to use something like a for loop later in my code. I scraped the data from an API; here is the code that sets up the data frame shown:
import pandas as pd

# Assumed setup (not shown in the original snippet): tallies keyed by (x, y)
shots, goals = {}, {}

for key in contents['liveData']['plays']['allPlays']:
    # for plays in key['result']['event']:
    #     print(key)
    if key['result']['event'] == "Shot":
        # print(key['result']['event'])
        scoordinates = (key['coordinates']['x'], key['coordinates']['y'])
        if scoordinates not in shots:
            shots[scoordinates] = 1
        else:
            shots[scoordinates] += 1
    if key['result']['event'] == "Goal":
        # print(key['result']['event'])
        gcoordinates = (key['coordinates']['x'], key['coordinates']['y'])
        if gcoordinates not in goals:
            goals[gcoordinates] = 1
        else:
            goals[gcoordinates] += 1

# create the data frames using pandas
gdf = pd.DataFrame(list(goals.items()), columns=['Coordinates', 'Occurrences'])
print(gdf)
sdf = pd.DataFrame(list(shots.items()), columns=['Coordinates', 'Occurrences'])
print(sdf)
Try this:
import re
df[['x', 'y']] = df.Coordinates.apply(lambda c: pd.Series(dict(zip(['x', 'y'], re.findall(r'-?[0-9]+\.[0-9]+', c.strip())))))
Using the built-in string methods should be performant:
df[["x", "y"]] = df["Coordinates"].str.strip("()").str.split(",", expand=True).astype(float)
(This also converts x and y to float values, which was not requested but is probably desired.)
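Both answers assume the Coordinates column holds strings such as "(76.0, -8.0)". In the question's code the dictionary keys are tuples, so the values may already be tuples rather than strings. The sketch below uses only the standard library (no pandas) to show the split itself and the per-axis tallies the question asks for; `split_coordinates` is a hypothetical helper that accepts either form.

```python
import ast
from collections import Counter


def split_coordinates(rows):
    """Tally occurrences per x value and per y value.

    rows: iterable of (coordinate, occurrences), where coordinate is either
    an (x, y) tuple or its string form such as "(76.0, -8.0)".
    """
    x_counts, y_counts = Counter(), Counter()
    for coordinate, occurrences in rows:
        if isinstance(coordinate, str):
            # Safely parse the literal "(x, y)" text into a tuple of floats.
            coordinate = ast.literal_eval(coordinate)
        x, y = coordinate
        x_counts[x] += occurrences
        y_counts[y] += occurrences
    return x_counts, y_counts


if __name__ == "__main__":
    sample = [("(76.0, -8.0)", 1), ((76.0, 7.0), 2), ("(-41.0, -8.0)", 1)]
    xs, ys = split_coordinates(sample)
    print(dict(xs))
    print(dict(ys))
```

Each counter can then be turned into its own two-column table, one for x and one for y.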

Output the result of each loop in different columns

price.txt file has two columns: (name and value)
Mary 134
Lucy 56
Jack 88
range.txt file has three columns: (fruit and min_value and max_value)
apple 57 136
banana 62 258
orange 88 99
blueberry 98 121
My aim is to test whether each value in price.txt is between the min_value and max_value in range.txt. If yes, output 1; if not, output "x".
I tried:
awk 'FNR == NR { name=$1; price[name]=$2; next} {
for (name in price) {
if ($2<=price[name] && $3>=price[name]) {print 1} else {print "x"}
}
}' price.txt range.txt
But my results are all in one column, as follows:
1
1
x
x
x
x
x
x
1
1
1
x
Actually, I want my result to be like: (Each name has one column)
1 x 1
1 x 1
x x 1
x x x
Because I need to use paste to add the output file and range.txt file together. The final result should be like:
apple 57 136 1 x 1
banana 62 258 1 x 1
orange 88 99 x x 1
blueberry 98 121 x x x
So, how can I get the result of each loop in different columns? And is there any way to output the final result without paste, based on my current code? Thank you.
This builds on what you provided:
# load prices by index to maintain read order
FNR == NR {
    price[names++] = $2
    next
}
# names now holds the number of prices read, so there is no need for the
# non-standard length(array)
{
    l = $1 " " $2 " " $3
    for (i = 0; i < names; i++) {
        if ($2 <= price[i] && $3 >= price[i]) {
            l = l " 1"
        } else {
            l = l " x"
        }
    }
    print l
}
and generates the output:
apple 57 136 1 x 1
banana 62 258 1 x 1
orange 88 99 x x 1
blueberry 98 121 x x x
However, you don't have the person name for the score (anonymous results) - maybe that's intentional?
The change here is to index the array populated in the first block explicitly, which preserves the read order.
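For comparison, the same row-building logic can be sketched in Python. This is only an illustration with the sample data hard-coded; `flag_rows` is a hypothetical name, and the price list is kept in file order for the same reason the awk version indexes its array.

```python
def flag_rows(prices, ranges):
    """Build one output line per range, with one flag column per price."""
    rows = []
    for fruit, low, high in ranges:
        # One flag per price, in the order the prices were read.
        flags = ["1" if low <= price <= high else "x" for price in prices]
        rows.append(" ".join([fruit, str(low), str(high)] + flags))
    return rows


if __name__ == "__main__":
    prices = [134, 56, 88]  # Mary, Lucy, Jack
    ranges = [("apple", 57, 136), ("banana", 62, 258),
              ("orange", 88, 99), ("blueberry", 98, 121)]
    print("\n".join(flag_rows(prices, ranges)))
```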

Get Ascii Code?

To retrieve the ASCII codes of all characters in the 13th column of a file, I wrote this script:
awk -v ch="'" '{
for (i=1;i<=length(substr($13,6,length($13)));i++)
{cmd = printf \"%d\\n\" \"" ch substr(substr($13,6,length($13)),i,1) "\"" cmd | getline output close(cmd) ;
Number= Number " " output
}
print Number ; Number=""
}' ~/a.test
But it doesn't work correctly: it works fine for a while and then produces weird results.
As an example, for this input (assume it's the 13th column):
CQ:Z:%8%%%%0%%%%9%%%%:%%%%%%%%%%%%%%%%%%
I should get this:
37 56 37 37 37 37 48 37 37 37 37 57 37 37 37 37 58 37 37 37 37 ...............
But I get this:
37 56 37 37 37 37 48 48 48 48 48 57 57 57 57 57 58 58 58 58 58 ...............
As you can see, the first miscomputation appears after the character "0" (48 in the result).
Do you know which part of my code is responsible for this error?
Try this:
awk '{
str = substr($13, 6)
for (i=1; i<=length(str); i++) {
cmd = "printf %d \42\47" substr(str, i, 1) "\42"
cmd | getline output
close(cmd)
Number= Number " " output
}
print Number
Number=""
}' ~/a.test
\42 is " and \47 is ', so this runs printf %d "'${char}" in the shell for each ${char}, which triggers evaluation as a C constant with the POSIX extension dictating a numeric value as noted in the final bullet of the POSIX printf definition's §Extended Description.
N.B. The formatting matters!
Don't try to squeeze the code unless you know exactly what you're doing!
And a pure awk solution (I took the ord/chr functions directly from the manual):
printf '%s\n' 'CQ:Z:%8%%%%0%%%%9%%%%:%%%%%%%%%%%%%%%%%%'|
awk 'BEGIN { _ord_init() }
{
str = substr($0, 6)
for (i = 0; ++i <= length(str);)
printf "%s", (ord(substr(str, i, 1)) (i < length(str) ? OFS : ORS))
}
func _ord_init( low, high, i, t) {
low = sprintf("%c", 7) # BEL is ascii 7
if (low == "\a") { # regular ascii
low = 0
high = 127
}
else if (sprintf("%c", 128 + 7) == "\a") {
# ascii, mark parity
low = 128
high = 255
}
else { # ebcdic(!)
low = 0
high = 255
}
for (i = low; i <= high; i++) {
t = sprintf("%c", i)
_ord_[t] = i
}
}
func ord(str, c) {
# only first character is of interest
c = substr(str, 1, 1)
return _ord_[c]
}
func chr(c) {
# force c to be numeric by adding 0
return sprintf("%c", c + 0)
}'
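As a cross-check on the expected numbers, the same conversion can be sketched in Python with the built-in `ord`. The helper name `ascii_codes` and its `skip` parameter are hypothetical; `skip=5` mirrors the `substr(..., 6)` calls above, which drop the leading "CQ:Z:".

```python
def ascii_codes(field, skip=5):
    """Return the character codes of field after dropping the first skip characters."""
    return [ord(ch) for ch in field[skip:]]


if __name__ == "__main__":
    field = "CQ:Z:%8%%%%0%%%%9%%%%:%%%%%%%%%%%%%%%%%%"
    print(" ".join(str(code) for code in ascii_codes(field)))
```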
This might work for you:
awk -vSQ="'" -vDQ='"' '{format=args=space="";n=split($13,a,"");for(i=1;i<=n;i++){args=args space DQ SQ a[i] DQ;format=format space "%d";space=" "};format=DQ format "\\n" DQ;system("printf " format " " args)}'