How to iterate over all but the last index of an AbstractArray

In Julia, the recommended way to iterate over all indices of an AbstractArray is to use eachindex, e.g.,
for i in eachindex(a)
    do_something(a[i], i)
end
In contrast to 1:length(a), eachindex(a) supports arrays with unconventional indexing, i.e., indices not starting at 1. Additionally, it is more efficient for arrays with slow linear indexing.
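For example, with an array from the OffsetArrays.jl package (used here purely to illustrate unconventional indexing), 1:length(a) would index the wrong positions while eachindex(a) iterates the actual indices:
using OffsetArrays
a = OffsetArray([10, 20, 30], 0:2)  # valid indices are 0, 1, 2
eachindex(a)  # iterates 0, 1, 2
1:length(a)   # 1:3: misses index 0 and would go out of bounds at 3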
If I want to skip the first index I can use Iterators.drop(eachindex(a), 1) (is there a better way?) but how do I skip the last one in a generic way?

A "front" iterator is relatively simple and generally useful. Edit: it's also totally overkill to define it in full generality just for this case. It's much easier and simpler to rely on Base's builtins with a definition like:
front(itr, n=1) = Iterators.take(itr, length(itr)-n)
This will work for all iterators with length defined — which will include everything that eachindex will return.
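Plugged into the original loop, this skips the last index:
a = [10, 20, 30, 40]
for i in front(eachindex(a))  # iterates 1, 2, 3
    do_something(a[i], i)
end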
Alternatively, you can define a specialized iterator from first principles that doesn't depend upon length being defined. I'm not aware of such a structure in any existing packages. Using Julia 0.6, an implementation could look like:
struct Front{T}
    itr::T
end
# Basic iterator definition
function Base.start(f::Front)
    s = start(f.itr)
    done(f.itr, s) && throw(ArgumentError("cannot take the front of an empty iterator"))
    return next(f.itr, s)
end
function Base.next(f::Front, state)
    val, s = state
    return val, next(f.itr, s)
end
Base.done(f::Front, state) = done(f.itr, state[2])
# Inherit traits as appropriate
Base.iteratorsize(::Type{Front{T}}) where {T} = _dropshape(Base.iteratorsize(T))
_dropshape(x) = x
_dropshape(::Base.HasShape) = Base.HasLength()
Base.iteratoreltype(::Type{Front{T}}) where {T} = Base.iteratoreltype(T)
Base.length(f::Front) = length(f.itr) - 1
Base.eltype(f::Front{T}) where {T} = eltype(T)
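Note that the start/next/done protocol used above was removed in Julia 0.7. For Julia ≥ 1.0, here is an untested sketch of the same iterator against the current iterate protocol (reusing the Front struct and the _dropshape, length, and eltype definitions, which carry over unchanged; the traits were renamed to IteratorSize/IteratorEltype):
function Base.iterate(f::Front)
    fst = iterate(f.itr)
    fst === nothing && throw(ArgumentError("cannot take the front of an empty iterator"))
    return iterate(f, fst)
end
function Base.iterate(f::Front, (val, state))
    # Look one element ahead; if the underlying iterator is exhausted,
    # val was its last element, so drop it.
    nxt = iterate(f.itr, state)
    nxt === nothing && return nothing
    return val, nxt
end
Base.IteratorSize(::Type{Front{T}}) where {T} = _dropshape(Base.IteratorSize(T))
Base.IteratorEltype(::Type{Front{T}}) where {T} = Base.IteratorEltype(T)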
Now:
julia> collect(Front(eachindex(rand(5))))
4-element Array{Int64,1}:
1
2
3
4
julia> collect(Front(eachindex(sprand(3, 2, .2))))
5-element Array{CartesianIndex{2},1}:
CartesianIndex{2}((1, 1))
CartesianIndex{2}((2, 1))
CartesianIndex{2}((3, 1))
CartesianIndex{2}((1, 2))
CartesianIndex{2}((2, 2))

Another way to define @MattB's Front is
front(itr, n=1) = (first(x) for x in Iterators.partition(itr, n+1, 1))
This takes sliding windows of length n+1 with step 1 and keeps the first element of each window, which yields everything except the last n elements. (Note the three-argument partition with a step comes from the IterTools.jl package; Base's Iterators.partition has no step argument.)
This also gives:
julia> front(eachindex([1,2,3,4,5]))|>collect
4-element Array{Int64,1}:
1
2
3
4
and as a bonus:
julia> front(eachindex([1,2,3,4,5]),2)|>collect
3-element Array{Int64,1}:
1
2
3
which is the counterpart, at the other end, of Iterators.drop(eachindex([1,2,3,4,5]), 2).
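Yet another Base-only variant, assuming the iterator can be traversed twice (true for everything eachindex returns), zips the iterator with a copy of itself shifted by n; since zip stops at the shorter of the two, the last n elements are dropped:
front(itr, n=1) = (x for (x, _) in zip(itr, Iterators.drop(itr, n)))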

There's also the following (written with the Julia 0.6 names CartesianRange and indices):
for I in CartesianRange(Base.front(indices(A)))
    @show I A[I, :]
end
On A = reshape(1:27, 3, 3, 3) this yields
I = CartesianIndex{2}((1,1))
A[I,:] = [1,10,19]
I = CartesianIndex{2}((2,1))
A[I,:] = [2,11,20]
I = CartesianIndex{2}((3,1))
A[I,:] = [3,12,21]
I = CartesianIndex{2}((1,2))
A[I,:] = [4,13,22]
I = CartesianIndex{2}((2,2))
A[I,:] = [5,14,23]
I = CartesianIndex{2}((3,2))
A[I,:] = [6,15,24]
I = CartesianIndex{2}((1,3))
A[I,:] = [7,16,25]
I = CartesianIndex{2}((2,3))
A[I,:] = [8,17,26]
I = CartesianIndex{2}((3,3))
A[I,:] = [9,18,27]
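On Julia ≥ 0.7 the same loop is spelled with the renamed CartesianIndices and axes:
for I in CartesianIndices(Base.front(axes(A)))
    @show I A[I, :]
end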

Related

Trouble writing OptimizationFunction for automatic forward differentiation during Parameter Estimation of an ODEProblem

I am trying to learn Julia for its potential use in parameter estimation. I am interested in estimating kinetic parameters of chemical reactions, which usually involves optimizing reaction parameters with multiple independent batches of experiments. I have successfully optimized a single batch, but need to expand the problem to use many different batches. In developing a sample problem, I am trying to optimize using two toy batches. I know there are probably smarter ways to do this (subject of a future question), but my current workflow involves calling an ODEProblem for each batch, calculating its loss against the data, and minimizing the sum of the residuals for the two batches. Unfortunately, I get an error when initiating the optimization with Optimization.jl. The current code and error are shown below:
using DifferentialEquations, Plots, DiffEqParamEstim
using Optimization, ForwardDiff, OptimizationOptimJL, OptimizationNLopt
using Ipopt, OptimizationGCMAES, Optimisers
using Random
#Experimental data, species B is NOT observed in the data
times = [0.0, 0.071875, 0.143750, 0.215625, 0.287500, 0.359375, 0.431250,
0.503125, 0.575000, 0.646875, 0.718750, 0.790625, 0.862500,
0.934375, 1.006250, 1.078125, 1.150000]
A_obs = [1.0, 0.552208, 0.300598, 0.196879, 0.101175, 0.065684, 0.045096,
0.028880, 0.018433, 0.011509, 0.006215, 0.004278, 0.002698,
0.001944, 0.001116, 0.000732, 0.000426]
C_obs = [0.0, 0.187768, 0.262406, 0.350412, 0.325110, 0.367181, 0.348264,
0.325085, 0.355673, 0.361805, 0.363117, 0.327266, 0.330211,
0.385798, 0.358132, 0.380497, 0.383051]
P_obs = [0.0, 0.117684, 0.175074, 0.236679, 0.234442, 0.270303, 0.272637,
0.274075, 0.278981, 0.297151, 0.297797, 0.298722, 0.326645,
0.303198, 0.277822, 0.284194, 0.301471]
#Create additional data sets for a multi data set optimization
#Simple noise added to data for testing
times_2 = times[2:end] .+ rand(range(-0.05,0.05,100))
P_obs_2 = P_obs[2:end] .+ rand(range(-0.05,0.05,100))
A_obs_2 = A_obs[2:end] .+ rand(range(-0.05,0.05,100))
C_obs_2 = C_obs[2:end] .+ rand(range(-0.05,0.05,100))
#ki = [2.78E+00, 1.00E-09, 1.97E-01, 3.04E+00, 2.15E+00, 5.27E-01] #Target optimized parameters
ki = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1] #Initial guess of parameters
IC = [1.0, 0.0, 0.0, 0.0] #Initial condition for each species
tspan1 = (minimum(times),maximum(times)) #tuple timespan of data set 1
tspan2 = (minimum(times_2),maximum(times_2)) #tuple timespan of data set 2
# data = VectorOfArray([A_obs,C_obs,P_obs])'
data = vcat(A_obs',C_obs',P_obs') #Make multidimensional array containing all observed data for dataset1, transpose to match shape of ODEProblem output
data2 = vcat(A_obs_2',C_obs_2',P_obs_2') #Make multidimensional array containing all observed data for dataset2, transpose to match shape of ODEProblem output
#make dictionary containing data, time, and initial conditions
keys1 = ["A","B"]
keys2 = ["time","obs","IC"]
entryA =[times,data,IC]
entryB = [times_2, data2,IC]
nest=[Dict(zip(keys2,entryA)),Dict(zip(keys2,entryB))]
exp_dict = Dict(zip(keys1,nest)) #data dictionary
#rate equations in power law form r = k [A][B]
function rxn(x, k)
    A = x[1]
    B = x[2]
    C = x[3]
    P = x[4]
    k1 = k[1]
    k2 = k[2]
    k3 = k[3]
    k4 = k[4]
    k5 = k[5]
    k6 = k[6]
    r1 = k1 * A
    r2 = k2 * A * B
    r3 = k3 * C * B
    r4 = k4 * A
    r5 = k5 * A
    r6 = k6 * A * B
    return [r1, r2, r3, r4, r5, r6] #returns reaction rate of each equation
end
#Mass balance differential equations
function mass_balances(di,x,args,t)
    k = args
    r = rxn(x, k)
    di[1] = - r[1] - r[2] - r[4] - r[5] - r[6] #Species A
    di[2] = + r[1] - r[2] - r[3] - r[6] #Species B
    di[3] = + r[2] - r[3] + r[4] #Species C
    di[4] = + r[3] + r[5] + r[6] #Species P
end
function ODESols(time,uo,parms)
    time_init = (minimum(time),maximum(time))
    prob = ODEProblem(mass_balances,uo,time_init,parms)
    sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8, save_idxs=[1,3,4], saveat=time) #Integrate prob
    return sol
end
function cost_function(data_dict,parms)
    res_dict = Dict(zip(keys(data_dict),[0.0,0.0]))
    for key in keys(data_dict)
        pred = ODESols(data_dict[key]["time"],data_dict[key]["IC"],parms)
        loss = L2Loss(data_dict[key]["time"],data_dict[key]["obs"])
        err = loss(pred)
        res_dict[key] = err
    end
    residual = sum(res_dict[key] for key in keys(res_dict))
    @show typeof(residual)
    return residual
end
lb = [0.0,0.0,0.0,0.0,0.0,0.0] #parameter lower bounds
ub = [10.0,10.0,10.0,10.0,10.0,10.0] #parameter upper bounds
optfun = Optimization.OptimizationFunction(cost_function,Optimization.AutoForwardDiff())
optprob = Optimization.OptimizationProblem(optfun,exp_dict, ki,lb=lb,ub=ub,reltol=1E-8) #Set up optimization problem
optsol=solve(optprob, BFGS(),maxiters=10000) #Solve optimization problem
println(optsol.u) #print solution
when I call optsol I get the error:
ERROR: MethodError: no method matching ForwardDiff.GradientConfig(::Optimization.var"#89#106"{OptimizationFunction{true, Optimization.AutoForwardDiff{nothing}, typeof(cost_function), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}, ::Dict{String, Dict{String, Array{Float64}}}, ::ForwardDiff.Chunk{2})
Searching online suggests that the issue may be that my cost_function is not generic enough for ForwardDiff to handle. However, I am not sure how to identify where the issue lies in this function, or whether it is related to the functions (mass_balances and rxn) called within cost_function. Another potential issue is that I am not calling the functions appropriately when building the OptimizationFunction or the OptimizationProblem, but I cannot identify the issue here either.
Thank you for any suggestions and your help in troubleshooting this application!
res_dict = Dict(zip(keys(data_dict),[0.0,0.0]))
This dictionary is constructed with hard-coded Float64 zeros, so it cannot hold the ForwardDiff.Dual numbers that AutoForwardDiff() sends through your objective. Derive the zero from the parameter type instead:
zerotype = zero(parms[1])
res_dict = Dict(zip(keys(data_dict),[zerotype, zerotype]))
or
res_dict = Dict(zip(keys(data_dict),zeros(eltype(parms),2)))
Either way, you want your intermediate calculations to match the element type of parms when using AutoForwardDiff().
In addition to the variable type specification suggested by Chris, my model also had an issue with the order of the arguments of cost_function and how I passed the arguments to the problem in optprob. This solution was shown by Contradict here
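For reference, a sketch of the reordered setup, assuming the definitions above: Optimization.jl calls the objective as f(u, p), so the decision variables must be the first argument and the data dictionary moves to the parameter slot.
function cost_function(parms, data_dict)
    # zeros(eltype(parms), 2) keeps the accumulator compatible with
    # ForwardDiff.Dual numbers, per the answer above.
    res_dict = Dict(zip(keys(data_dict), zeros(eltype(parms), 2)))
    for key in keys(data_dict)
        pred = ODESols(data_dict[key]["time"], data_dict[key]["IC"], parms)
        loss = L2Loss(data_dict[key]["time"], data_dict[key]["obs"])
        res_dict[key] = loss(pred)
    end
    return sum(values(res_dict))
end
optfun = Optimization.OptimizationFunction(cost_function, Optimization.AutoForwardDiff())
optprob = Optimization.OptimizationProblem(optfun, ki, exp_dict, lb = lb, ub = ub)
optsol = solve(optprob, BFGS(), maxiters = 10000)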

How to add a new variable to an already existing set of variables (based on a SparseAxisArray) in JuMP?

I am currently working with a JuMP model where I define the following example variables:
using JuMP
N = 3
outN = [[4,5],[1,3],[5,7]]
m = Model()
@variable(m, x[i=1:N, j in outN[i]] >= 0)
At some point, I want to add, for example, a variable x[1,7]. How can I do that in an effective way? Likewise, how can I remove it afterwards? Is there an alternative to just fixing it to 0?
Thanks in advance
You're probably better off just using a dictionary:
using JuMP
N = 3
outN = [[4,5],[1,3],[5,7]]
model = Model()
x = Dict(
    (i, j) => @variable(model, lower_bound = 0, base_name = "x[$i, $j]")
    for i in 1:N for j in outN[i]
)
x[1, 7] = @variable(model, lower_bound = 0)
delete(model, x[1, 4])
delete!(x, (1, 4))
Nothing about JuMP restricts you to using only the built-in variable containers: https://jump.dev/JuMP.jl/stable/variables/#User-defined-containers-1
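The dictionary then composes with the rest of JuMP like any other collection of variables; for example (the bound of 10 is purely illustrative):
@constraint(model, sum(values(x)) <= 10)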

Is there a wrapper library for solving optimisation problems by declaring known and unknown variables?

cvxpy has a very neat way to write out the optimisation form without worrying too much about converting it into a "standard" matrix form as this is done internally somehow. Best to explain with an example:
import cvxpy as cp
import numpy as np
from scipy.optimize import minimize, LinearConstraint

def cvxpy_implementation():
    var1 = cp.Variable()
    var2 = cp.Variable()
    constraints = [
        var1 <= 3,
        var2 >= 2,
    ]
    obj_fun = cp.Minimize(var1**2 + var2**2)
    problem = cp.Problem(obj_fun, constraints)
    problem.solve()
    return var1.value, var2.value

def scipy_implementation1():
    A = np.diag(np.ones(2))
    lb = np.array([-np.inf, 2])
    ub = np.array([3, np.inf])
    con = LinearConstraint(A, lb, ub)
    def obj_fun(x):
        return (x**2).sum()
    result = minimize(obj_fun, [0, 0], constraints=con)
    return result.x

def scipy_implementation2():
    con = [
        {'type': 'ineq', 'fun': lambda x: 3 - x[0]},
        {'type': 'ineq', 'fun': lambda x: x[1] - 2},
    ]
    def obj_fun(x):
        return (x**2).sum()
    result = minimize(obj_fun, [0, 0], constraints=con)
    return result.x
All of the above give the correct result, but the cvxpy implementation is much "easier" to write out. Specifically, I don't have to worry about assembling the inequalities into matrix form, and I can give the variables useful names when writing out the constraints. Compare that to the scipy1 and scipy2 implementations, where in the first case I have to write out the extra infs and in the second case I have to remember which variable is which. You can imagine a case where I have 100 variables; while concatenating them will ultimately need to be done, I'd like to be able to write the problem out as in cvxpy.
Question:
Has anyone implemented this for scipy? Or is there an alternative library that could make this work?
Thank you
I wrote something up that does this and seems to cover the main issues I had in mind.
The general idea is that you define variables, create a simple expression just as you would normally write it out, and then the solver class optimises over the defined variables:
https://github.com/evan54/optimisation/blob/master/var.py
The example below illustrates a simple use case
# fake data
a = 2
m = 3
x = np.linspace(0, 10)
y = a * x + m + np.random.randn(len(x))
a_ = Variable()
m_ = Variable()
y_ = a_ * x + m_
error = y_ - y
prob = Problem((error**2).sum(), None)
prob.minimize()
print(f'a = {a}, a_ = {a_}')
print(f'm = {m}, m_ = {m_}')

Numpy: Construct Slice A La Carte

Suppose I have the following:
# in pseudo code
# function input 1
chord = [0,1,17,35,47,0]
dims = [0,1,2,4,5,6]
x_axis = 3
t_axis = 7
# what I'd like to return
np.squeeze(arr[0,1,17,:,35,47,0,:])
# function input 2
chord = [0,3,4,5,6,7]
dims = [0,2,3,4,5,6]
x_axis = 1
t_axis = 7
# desired return
np.squeeze(arr[0,:,3,4,5,6,7,:])
How do I construct these numpy slices given input that I can arbitrarily specify a pair of axes and a chord coordinate?
I implemented a string-building (eval-based) solution:
def reflection_window(arr: np.ndarray, chord: list, dim0, dim1):
    var = "arr"
    bra = "["
    ket = "]"
    coord = [str(i) for i in chord]
    coord.insert(dim0, ':')
    coord.insert(dim1, ':')
    chordstr = ','.join(coord)
    slicer = var + bra + chordstr + ket
    return eval(slicer)
Staying native to numpy is probably better, but since Python can be used like a shell scripting language, it sometimes makes sense to treat it that way when necessary.
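For comparison, a sketch of the native approach (tuple_window is just an illustrative name): build the index tuple directly with slice(None), which is what ':' means inside square brackets, and avoid eval entirely:
import numpy as np

def tuple_window(arr: np.ndarray, chord: list, dim0, dim1):
    # Insert a full slice at each of the two free axes, exactly as the
    # string version inserts ':' characters.
    idx = [int(i) for i in chord]
    idx.insert(dim0, slice(None))
    idx.insert(dim1, slice(None))
    return arr[tuple(idx)]
np.squeeze can then be applied to the result exactly as in the examples at the top of the question.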

How to map different indices in Pyomo?

I am a new Pyomo/Python user. I need to formulate a set of constraints with index 'n', where all 3 components carry different indices that correlate with index 'n'. I am curious how I can map the relationship between these sets.
In my case, I read csv files whose indices are related to 'n' to generate my sets. For example: a1.n1, a2.n3, a3.n5 /// b1.n2, b2.n4, b3.n6, b4.n7 /// c1.n1, c2.n2, c3.n4, c4.n6 ///. The constraint expressions for indices n1 and n2, for example, are as follows:
for n1: P(a1.n1) + L(c1.n1) == D(n1)
for n2: - F(b1.n2) + L(c2.n2) == D(n2)
Now let's get to the coding. The set-creating code is as follows, within a class:
import pyomo
import pandas
import pyomo.opt
import pyomo.environ as pe

class MyModel:
    def __init__(self, Afile, Bfile, Cfile):
        self.A_data = pandas.read_csv(Afile)
        self.A_data.set_index(['a'], inplace = True)
        self.A_data.sort_index(inplace = True)
        self.A_set = self.A_data.index.unique()
        ... ...
Then I tried to map the relationship in the constraint construction as follows:
def createModel(self):
    self.m = pe.ConcreteModel()
    self.m.A_set = pe.Set( initialize = self.A_set )

    def obj_rule(m):
        return ...

    self.m.OBJ = pe.Objective(rule = obj_rule, sense = pe.minimize)

    def constr(m, n):
        As = self.A_data.reset_index()
        Amap = As[ As['n'] == n ]['a']
        Bs = self.B_data.reset_index()
        Bmap = Bs[ Bs['n'] == n ]['b']
        Cs = self.C_data.reset_index()
        Cmap = Cs[ Cs['n'] == n ]['c']
        return sum(m.P[(p,n)] for p in Amap) - sum(m.F[(s,n)] for s in Bmap) + sum(m.L[(r,n)] for r in Cmap) == self.D_data.ix[n, 'D']

    self.m.cons = pe.Constraint(self.m.D_set, rule = constr)

def solve(self):
    ... ...
Finally, this error is raised when I run it:
KeyError: "Index '(1, 1)' is not valid for indexed component 'P'"
I know it is the wrong way, so I am wondering if there is a good way to map their relationships. Thanks in advance!
Gabriel
I forgot to post the answer to my own question when I solved it a week ago. The key to this problem is setting up a mapping index.
Let me modify the code in the question. First, the dataframes need to include the information of the mapped indices. Then the set for the mapped index can be constructed, taking 2 mapped indices as an example:
self.m.A_set = pe.Set( initialize = self.A_set, dimen = 2 )
The names of the two mapped indices are 'alpha' and 'beta' respectively. Then the constraint can be formulated, based on the variables declared at the beginning:
def constr(m, n):
    Amap = self.A_data[ self.A_data['alpha'] == n ]['beta']
    Bmap = self.B_data[ self.B_data['alpha'] == n ]['beta']
    return sum(m.P[(i,n)] for i in Amap) + sum(m.L[(r,n)] for r in Bmap) == D.loc[n, 'D']

m.TravelingBal = pe.Constraint(m.A_set, rule = constr)
The summation gathers all of the B entries associated with a given A through the mapped index set.
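For concreteness, a hypothetical sketch of how the two-dimensional set could be built from a dataframe with 'alpha' and 'beta' columns (column names assumed, matching the answer above):
# Collect (alpha, beta) tuples from the dataframe and register them
# as a two-dimensional Pyomo set.
pairs = list(self.A_data[['alpha', 'beta']].itertuples(index=False, name=None))
self.m.A_set = pe.Set(initialize=pairs, dimen=2)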