Compact way to save JuMP optimization results in DataFrames

I would like to save all the variables and dual variables of my solved LP in an efficient manner. My current solution works, but it is neither elegant nor suited to larger optimization programs with many variables and constraints, because I define every variable separately and push! it into a DataFrame. Is there a way to iterate through the variables using all_variables(), and through the constraints using all_constraints() for the duals? While iterating, I would like to push the results into DataFrames with the variable's index names as columns and store each DataFrame in a Dict().
A conceptual example for variables would be:
Result_vars = Dict()
for vari in all_variables(Model)
Result_vars["vari"] = DataFrame(data=[indexval(vari),value(vari)],columns=[index(vari),"Value"])
end
An example of how the variable is declared in JuMP and how it should appear in the DataFrame:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0,base_name="Expected production")
And Result_vars[p] should look approximately like:
t,n,m,Value
1,1,1,50
2,1,1,60
3,1,1,145

Presumably, you could do something like:
x = all_variables(model)
DataFrame(
name = name.(x),
Value = value.(x),
)
If you want a more complicated structure, you need to write custom code:
T, N, M, primal_solution = [], [], [], []
for t in s_time, n in s_n, m in s_m
push!(T, t)
push!(N, n)
push!(M, m)
push!(primal_solution, value(p[t, n, m]))
end
DataFrame(t = T, n = N, m = M, Value = primal_solution)
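The long-format layout built by that loop (one row per index combination, with the value in the last column) can also be sketched outside Julia. The Python snippet below is only an illustration under assumptions: the index sets and the `solution` dictionary are made-up stand-ins for `s_time`, `s_n`, `s_m` and `value(p[t, n, m])`, not output from a real model.

```python
from itertools import product

# Hypothetical index sets standing in for s_time, s_n and s_m.
s_time, s_n, s_m = [1, 2, 3], [1], [1]

# Hypothetical solved values keyed by (t, n, m); a real model would supply these.
solution = {(1, 1, 1): 50, (2, 1, 1): 60, (3, 1, 1): 145}

# One row per index combination, with the value in the last column.
rows = [
    {"t": t, "n": n, "m": m, "Value": solution[(t, n, m)]}
    for t, n, m in product(s_time, s_n, s_m)
]

for row in rows:
    print(row)
```

Each row carries its own index values, so the table stays meaningful however many index sets the variable has.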
See here for constraints: https://jump.dev/JuMP.jl/stable/constraints/#Accessing-constraints-from-a-model-1. You want something like:
for (F, S) in list_of_constraint_types(model)
for con in all_constraints(model, F, S)
@show dual(con)
end
end

Thanks to Oscar, I have built a solution that could help automate the extraction of results.
The solution is built around a naming convention using base_name in the variable definition. One can copy and paste the variable definition into base_name, followed by a colon. E.g.:
@variable(Model, p[t=s_time,n=s_n,m=s_m], lower_bound=0,base_name="p[t=s_time,n=s_n,m=s_m]:")
The naming convention and syntax can be changed: comments can be added, for example, or one can simply not define a base_name. The following function splits the name into the variable name, the sets (if needed) and the index:
function var_info(vars::VariableRef)
split_conv = [":","]","[",","]
x_str = name(vars)
if occursin(":",x_str)
x_str = replace(x_str, " " => "") #Deletes all spaces
x_name,x_index = split(x_str,split_conv[1]) #splits raw variable name+ sets and index
x_name = replace(x_name, split_conv[2] => "")
x_name,s_set = split(x_name,split_conv[3])#splits raw variable name and sets
x_set = split(s_set,split_conv[4])
x_index = replace(x_index, split_conv[2] => "")
x_index = replace(x_index, split_conv[3] => "")
x_index = split(x_index,split_conv[4])
return (x_name,x_set,x_index)
else
println("Var base_name not properly defined. Special Syntax required in form var[s=set]: ")
end
end
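To make the splitting logic easier to follow, here is a rough Python sketch of the same idea. It assumes the full variable name has the form `p[t=s_time,n=s_n,m=s_m]:[1,1,1]`, i.e. the base_name followed by the bracketed index; that form is inferred from the function above and is not verified against JuMP. Unlike the Julia version, this sketch raises an error instead of printing a message when the colon is missing.

```python
def var_info(name):
    """Split a name like 'p[t=s_time,n=s_n,m=s_m]:[1,1,1]' into (name, sets, index)."""
    if ":" not in name:
        raise ValueError("expected a name of the form var[s=set]:[index]")
    name = name.replace(" ", "")                # drop all spaces
    head, index_part = name.split(":", 1)       # 'p[...]' and '[1,1,1]'
    var_name, set_part = head.rstrip("]").split("[", 1)
    sets = set_part.split(",")                  # e.g. ['t=s_time', 'n=s_n', 'm=s_m']
    index = index_part.strip("[]").split(",")   # e.g. ['1', '1', '1']
    return var_name, sets, index


print(var_info("p[t=s_time, n=s_n, m=s_m]:[1,1,1]"))
```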
The next functions create the column names and, for each variable, a row holding the primal solution ("Value") followed by its index values.
function create_columns(x)
col_ind=[String(var_info(x)[2][col]) for col in 1:size(var_info(x)[2])[1]]
cols = append!(["Value"],col_ind)
return cols
end
function create_index(x)
col_ind=[String(var_info(x)[3][ind]) for ind in 1:size(var_info(x)[3])[1]]
index = append!([string(value(x))],col_ind)
return index
end
function create_sol_matrix(varss,model)
nested_sol_array=[create_index(xx) for xx in all_variables(model) if varss[1]==var_info(xx)[1]]
sol_array=hcat(nested_sol_array...)
return sol_array
end
Finally, the last function creates the Dict which holds all results of the variables in DataFrames in the previously mentioned style:
function create_var_dict(model)
Variable_dict=Dict(vars[1]
=>DataFrame(Dict(vars[2][1][cols]
=>create_sol_matrix(vars,model)[cols,:] for cols in 1:size(vars[2][1])[1]))
for vars in unique([[String(var_info(x)[1]),[create_columns(x)]] for x in all_variables(model)]))
return Variable_dict
end
When those functions are added to your script, you can simply retrieve all the solutions of the variables after the optimization by calling create_var_dict():
var_dict = create_var_dict(model)
Be aware: they are nested functions. When you change the naming convention, you might have to update the other functions as well. If you add more comments you have to avoid using [, ], and ,.
This solution is obviously far from optimal. I believe there could be a more efficient solution falling back to MOI.

Stata: for loop for storing values of Gini coefficient

I have 133 variables on income (each variable represents a group). I want the Gini coefficients of all these groups, so I use ineqdeco in Stata. I can't compute all these coefficients by hand so I created a for loop:
gen sgini = .
foreach var of varlist C07-V14 {
forvalue i=1/133 {
ineqdeco `var'
replace sgini[i] = $S_gini
}
}
Also tried changing the order:
foreach var of varlist C07-V14 {
ineqdeco `var'
forvalue i=1/133 {
replace sgini[i] = $S_gini
}
}
And specifying i beforehand:
gen i = 1
foreach var of varlist C07-V14 {
ineqdeco `var'
replace sgini[i] = $S_gini
replace i = i+1
}
}
I don't know if this last method works anyway.
In all cases I get the error: weight not allowed r(101). I don't know what this means, or what to do. Basically, I want to compute the Gini coefficient of all 133 variables, and store these values in a vector of length 133, so a single variable with all the coefficients stored in it.
Edit: I found that the error has to do with the replace command. I replaced this line with:
replace sgini = $S_gini in `i'
But now it does not "loop", so I get the first value in all entries of sgini.
There is no obvious reason for your inner loop. If you have no more variables than observations, then this might work:
gen sgini = .
gen varname = ""
local i = 1
foreach var of varlist C07-V14 {
ineqdeco `var'
replace sgini = $S_gini in `i'
replace varname = "`var'" in `i'
local i = `i' + 1
}
The problems evident in your code (seem to) include:
Confusion between variables and local macros. If you have much experience with other languages, it is hard to break old mental habits. (Mata is more like other languages here.)
Not being aware that a loop over observations is automatic. Or perhaps not seeing that there is just a single loop needed here, the twist being that the loop over variables is easy but your concomitant loop over observations needs to be arranged with your own code.
Putting a subscript on the LHS of a replace. The [] notation is reserved for weights but is illegal there in any case. To find out about weights, search weights or help weight.
Note that with this way of recording results, the Gini coefficients are not aligned with anything else. A token fix for that is to record the associated variable names alongside, as done above.
A more advanced version of this solution would be to use postfile to save to a new dataset.
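For readers who think more easily in a general-purpose language, the single-loop idea (one pass over the variables, storing each result next to its variable name) might be sketched in Python as follows. The data are invented, and `gini` uses the textbook mean-absolute-difference definition, which is an assumption here and may differ in detail from what ineqdeco reports.

```python
def gini(values):
    """Gini coefficient via the mean absolute difference (assumes a positive mean)."""
    n = len(values)
    mean = sum(values) / n
    total_diff = sum(abs(a - b) for a in values for b in values)
    return total_diff / (2 * n * n * mean)


# Hypothetical income groups standing in for the variables C07-V14.
groups = {"C07": [1, 1, 1, 1], "C08": [0, 0, 0, 1]}

# A single loop over the groups; the name and the coefficient stay on the same row.
results = [(name, gini(incomes)) for name, incomes in groups.items()]

for name, coefficient in results:
    print(name, coefficient)
```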

Matlab's arrayfun for uniform output of class objects

I need to build an array of objects of class ID using arrayfun:
% ID.m
classdef ID < handle
properties
id
end
methods
function obj = ID(id)
obj.id = id;
end
end
end
But get an error:
>> ids = 1:5;
>> s = arrayfun(@(id) ID(id), ids)
??? Error using ==> arrayfun
ID output type is not currently implemented.
I can build it alternatively in a loop:
s = [];
for k = 1 : length(ids)
s = cat(1, s, ID(ids(k)));
end
but what is wrong with this usage of arrayfun?
Edit (clarification of the question): The question is not how to work around the problem (there are several solutions), but why the simple syntax s = arrayfun(@(id) ID(id), ids); doesn't work. Thanks.
Perhaps the easiest is to use cellfun, or to force arrayfun to return a cell array by setting the 'UniformOutput' option to false. Then you can convert this cell array to an array of objects (same as using cat above).
s = arrayfun(@(x) ID(x), ids, 'UniformOutput', false);
s = [s{:}];
You are asking arrayfun to do something it isn't built to do.
The output from arrayfun must be:
scalar values (numeric, logical, character, or structure) or cell arrays.
Objects don't count as any of the scalar types, which is why the "workarounds" all involve using a cell array as the output. One thing to try is using cell2mat to convert the output to your desired form; it can be done in one line. (I haven't tested it though.)
s = cell2mat(arrayfun(@(id) ID(id), ids,'UniformOutput',false));
This is how I would create an array of objects:
s = ID.empty(0,5);
for i=5:-1:1
s(i) = ID(i);
end
It is always a good idea to provide a "default constructor" with no arguments, or at least use default values:
classdef ID < handle
properties
id
end
methods
function obj = ID(id)
if nargin<1, id = 0; end
obj.id = id;
end
end
end

Import a dictionary into the current scope as variables

I have a .mat file in which I put data previously processed. When I perform
dict = scipy.io.loadmat('training_data.mat')
I get back a dict that is like this
{'encoders' : ......, 'decoders' : ........, 'stuff' : .....}
I want to selectively import the encoders and decoders variables into my current scope. The effect is the same as:
encoders = dict['encoders']
decoders = dict['decoders']
How do I cleanly do this without typing 10-15 lines?
You could import a dictionary d into the global scope using
globals().update(d)
The same thing is impossible for local scopes, since modifying the dictionary returned by locals() results in undefined behaviour.
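Since the question asks for a selective import, a minimal sketch of that variant at module level could look like this. The `loaded` dictionary is a made-up stand-in for the result of `scipy.io.loadmat`, and `wanted` is a hypothetical name for the keys to copy.

```python
# Hypothetical stand-in for the dict returned by scipy.io.loadmat.
loaded = {"encoders": [1, 2], "decoders": [3, 4], "stuff": "unused"}

# Copy only the wanted keys into the module's global namespace.
wanted = ("encoders", "decoders")
globals().update({key: loaded[key] for key in wanted})

print(encoders, decoders)  # both names now exist at module level
```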
A slightly hacky trick you could use in this situation is to import the names into the dictionary of an on-the-fly created type:
d = {"encoders": 1, "decoders": 2}
t = type("", (), d)
print(t.encoders)
print(t.decoders)
This will at least be slightly more convenient than using d["decoders"] etc.
Alternatively, you could use exec to create your variables (at module level):
d = {"encoders": 1, "decoders": 2}
for k, v in d.items():
    exec(k + " = v")
This could also be done selectively.
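If the goal is just to avoid a line per variable, two standard-library options that work in any scope are `operator.itemgetter` and `types.SimpleNamespace`. The sketch below uses a made-up `data` dictionary in place of the loaded .mat contents.

```python
from operator import itemgetter
from types import SimpleNamespace

# Hypothetical stand-in for the loaded data.
data = {"encoders": 1, "decoders": 2, "stuff": 3}

# Pull several keys out in one statement.
encoders, decoders = itemgetter("encoders", "decoders")(data)

# Or wrap the whole dict for attribute-style access.
ns = SimpleNamespace(**data)

print(encoders, decoders, ns.stuff)
```

Neither approach touches globals() or relies on exec, so both behave the same inside a function as at module level.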

overriding a method in R, using NextMethod

How does this work in R?
I am using a package (zoo 1.6-4) that defines an S3 class for time series sets.
I am writing a derived class in which I want to override a few methods, and I can't get past this one: [.zoo!
In my derived class, rows are indexed by timestamp, as in zoo, but unlike zoo I allow only POSIXct values in the index. My users will be selecting columns all of the time and slicing series only occasionally, so I want to offer obj[name] instead of obj[, name].
My objects have class c("delftfews", "zoo").
But how do I override a method?
I tried this:
"[.delftfews" <- function(x, i, j, drop=TRUE, ...) {
if (missing(i)) return(NextMethod())
if (all(class(i) == "character") && missing(j)) {
return(NextMethod('[', x=x, i=1:NROW(x), j=i, drop=drop, ...))
}
NextMethod()
}
but I get this error: Error in rval[i, j, drop = drop., ...] : incorrect number of dimensions.
I have solved it by editing the zoo source: I removed those ..., but I don't understand why that works. Can anybody explain what is going on here?
The problem is that with the above definition of [.delftfews this code:
library(zoo)
z <- structure(zoo(cbind(a = 1:3, b = 4:6)), class = c("delftfews", "zoo"))
z["a"]
# generates this call: `[.zoo`(x = 1:6, i = 1:3, j = "a", drop = TRUE, z, "a")
Your code does work as is if you write the call like this:
z[j = "a"]
# generates this call: `[.zoo`(x = z, j = "a")
I think what you want is to change the relevant line in [.delftfews to this:
return(NextMethod(.Generic, object = x, i = 1:NROW(x), drop = drop))
# z["a"] now generates this call: `[.zoo`(x = z, i = 1:3, j = "a", drop = TRUE)
A point of clarification: allowing only POSIXct index values does not allow indexing columns by name only. I'm not sure how you arrived at that conclusion.
You're overriding zoo correctly, but I think you misunderstand NextMethod. The error is caused by if (missing(i)) return(NextMethod()), which calls [.zoo if i is missing, but [.zoo requires i because zoo's internal data structure is a matrix. Something like this should work:
if (missing(i)) i <- 1:NROW(x)
though I'm not sure if you have to explicitly pass this new i to NextMethod...
You may be interested in the xts package, if you haven't already taken a look at it.

Why is MATLAB reporting my variable uninitialized?

I made a class and in one of its methods I needed to calculate the distance between two points. So I wrote an ordinary function named "remoteness" to do this for me.
Compilation Error:
At compilation, "remoteness" was determined to be a variable and this variable is uninitialized.
"remoteness" is also a function name and previous versions of MATLAB would have called the function.
However, MATLAB 7 forbids the use of the same name in the same context as both a function and a variable.
Error in ==> TRobot>TRobot.makeVisibilityGraph at 58
obj.visiblityGraph(k,k+1) = remoteness(:,obj.VGVertices(k),obj.VGVertices(:,k+1));
I thought the name remoteness might be a name of another function, but when I changed its name to kamran the error persisted. It should be noted that I can use the kamran function (or remoteness) in the command line without any problem.
Command line example:
>> kamran([0,0],[3,4])
ans = 5
The code of the kamran function is in a separate m file.
Code for kamran function:
function dist = kamran(v1,v2)
dist = sqrt( (v1(1) - v2(1)) ^2 + (v1(2) - v2(2)) ^2 );
Code example for how kamran function is used:
function obj = makeVisibilityGraph(obj)
verticesNumber = 0;
for num = 1: size(obj.staticObstacle,2)
verticesNumber = verticesNumber + size(obj.staticObstacle(num).polygon,2);
end
% in the below line, 2 is for start and goal vertices
obj.visibilityGraph = ones(2 + size(obj.VGVertices,2)) * Inf;
for j=1 : size(obj.staticObstacle,2)
index = size(obj.VGVertices,2);
obj.VGVertices = [obj.VGVertices, obj.staticObstacle(j).polygon];
obj.labelVGVertices = [obj.labelVGVertices, ones(1,size(obj.staticObstacle(j).polygon,2))* j ];
for k = index+1 : (size(obj.VGVertices,2)-1)
obj.visiblityGraph(k,k+1) = kamran(:,obj.VGVertices(k),obj.VGVertices(:,k+1));
end
% as the first and last point of a polygon are visible to each
% other, so set them visible to each other
obj.visibilityGraph(index+1,size(obj.VGVertices,2)) = ...
kamran( obj.VGVertices(:,index+1), obj.VGVertices(:,size(obj.VGVertices,2)));
end
end
You seem to be trying to use kamran as an array:
kamran(:,obj.VGVertices(k),obj.VGVertices(:,k+1));
Notice the first parameter ":"?
I would bet MATLAB assumes that kamran (as called here) should be a 3-dimensional array, and you are trying to select the subset containing
kamran(all-of-first-index, Nth-of-second, Mth-of-third)
The second invocation of kamran looks right:
kamran( obj.VGVertices(:,index+1), obj.VGVertices(:,size(obj.VGVertices,2)))
I do not know MATLAB but I notice on this line, you are running kamran with what looks like 3 arguments. In all other cases, it is executed with 2 arguments. Maybe there is something to that?
kamran(:,obj.VGVertices(k),obj.VGVertices(:,k+1));