Does the mxnet symbol API support conditional control flow? - mxnet

I want to add some conditional control flow to my symbol. It seems that if-else is evaluated at symbol construction time, but I want it evaluated at symbol run time:
a = mx.symbol.Variable(name='a')
b = mx.symbol.Variable(name='b')
if a > b:
    c = a - b
else:
    c = a + b
TensorFlow provides the tf.cond() operator to deal with this; is there a counterpart in mxnet?

You can use mx.symbol.where.
Compute both a_minus_b and a_plus_b, then return an array in which each element is taken from a_minus_b or a_plus_b depending on the corresponding value in a condition array. Here is an example:
a = mx.symbol.Variable(name='a')
b = mx.symbol.Variable(name='b')
a_minus_b = a - b
a_plus_b = a + b
# gt = a > b
gt = a.__gt__(b)
result = mx.sym.where(condition=gt, x=a_minus_b, y=a_plus_b)
ex = result.bind(ctx=mx.cpu(), args={'a':mx.nd.array([1,2,3]), 'b':mx.nd.array([3,2,1])})
r = ex.forward()
print(r[0].asnumpy()) #result should be [1+3, 2+2, 3-1]
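Note that mx.sym.where evaluates both branches and then selects element-wise, so this is not lazy branching like a Python if. For a quick sanity check, the same logic can be written with the imperative NDArray API (a minimal sketch; mx.nd.where has the same semantics):

import mxnet as mx

a = mx.nd.array([1, 2, 3])
b = mx.nd.array([3, 2, 1])
# the condition is 1 where a > b and 0 elsewhere; where() picks from a-b or a+b element-wise
c = mx.nd.where(a > b, a - b, a + b)
print(c.asnumpy())  # expected: [4. 4. 2.]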

Related

Trouble writing OptimizationFunction for automatic forward differentiation during Parameter Estimation of an ODEProblem

I am trying to learn Julia for its potential use in parameter estimation. I am interested in estimating kinetic parameters of chemical reactions, which usually involves optimizing reaction parameters with multiple independent batches of experiments. I have successfully optimized a single batch, but need to expand the problem to use many different batches. In developing a sample problem, I am trying to optimize using two toy batches. I know there are probably smarter ways to do this (subject of a future question), but my current workflow involves calling an ODEProblem for each batch, calculating its loss against the data, and minimizing the sum of the residuals for the two batches. Unfortunately, I get an error when initiating the optimization with Optimization.jl. The current code and error are shown below:
using DifferentialEquations, Plots, DiffEqParamEstim
using Optimization, ForwardDiff, OptimizationOptimJL, OptimizationNLopt
using Ipopt, OptimizationGCMAES, Optimisers
using Random
#Experimental data, species B is NOT observed in the data
times = [0.0, 0.071875, 0.143750, 0.215625, 0.287500, 0.359375, 0.431250,
0.503125, 0.575000, 0.646875, 0.718750, 0.790625, 0.862500,
0.934375, 1.006250, 1.078125, 1.150000]
A_obs = [1.0, 0.552208, 0.300598, 0.196879, 0.101175, 0.065684, 0.045096,
0.028880, 0.018433, 0.011509, 0.006215, 0.004278, 0.002698,
0.001944, 0.001116, 0.000732, 0.000426]
C_obs = [0.0, 0.187768, 0.262406, 0.350412, 0.325110, 0.367181, 0.348264,
0.325085, 0.355673, 0.361805, 0.363117, 0.327266, 0.330211,
0.385798, 0.358132, 0.380497, 0.383051]
P_obs = [0.0, 0.117684, 0.175074, 0.236679, 0.234442, 0.270303, 0.272637,
0.274075, 0.278981, 0.297151, 0.297797, 0.298722, 0.326645,
0.303198, 0.277822, 0.284194, 0.301471]
#Create additional data sets for a multi data set optimization
#Simple noise added to data for testing
times_2 = times[2:end] .+ rand(range(-0.05,0.05,100))
P_obs_2 = P_obs[2:end] .+ rand(range(-0.05,0.05,100))
A_obs_2 = A_obs[2:end].+ rand(range(-0.05,0.05,100))
C_obs_2 = C_obs[2:end].+ rand(range(-0.05,0.05,100))
#ki = [2.78E+00, 1.00E-09, 1.97E-01, 3.04E+00, 2.15E+00, 5.27E-01] #Target optimized parameters
ki = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1] #Initial guess of parameters
IC = [1.0, 0.0, 0.0, 0.0] #Initial condition for each species
tspan1 = (minimum(times),maximum(times)) #tuple timespan of data set 1
tspan2 = (minimum(times_2),maximum(times_2)) #tuple timespan of data set 2
# data = VectorOfArray([A_obs,C_obs,P_obs])'
data = vcat(A_obs',C_obs',P_obs') #Make multidimensional array containing all observed data for dataset1, transpose to match shape of ODEProblem output
data2 = vcat(A_obs_2',C_obs_2',P_obs_2') #Make multidimensional array containing all observed data for dataset2, transpose to match shape of ODEProblem output
#make dictionary containing data, time, and initial conditions
keys1 = ["A","B"]
keys2 = ["time","obs","IC"]
entryA =[times,data,IC]
entryB = [times_2, data2,IC]
nest=[Dict(zip(keys2,entryA)),Dict(zip(keys2,entryB))]
exp_dict = Dict(zip(keys1,nest)) #data dictionary
#rate equations in power law form r = k [A][B]
function rxn(x, k)
    A = x[1]
    B = x[2]
    C = x[3]
    P = x[4]
    k1 = k[1]
    k2 = k[2]
    k3 = k[3]
    k4 = k[4]
    k5 = k[5]
    k6 = k[6]
    r1 = k1 * A
    r2 = k2 * A * B
    r3 = k3 * C * B
    r4 = k4 * A
    r5 = k5 * A
    r6 = k6 * A * B
    return [r1, r2, r3, r4, r5, r6] #returns reaction rate of each equation
end
#Mass balance differential equations
function mass_balances(di,x,args,t)
    k = args
    r = rxn(x, k)
    di[1] = - r[1] - r[2] - r[4] - r[5] - r[6] #Species A
    di[2] = + r[1] - r[2] - r[3] - r[6]        #Species B
    di[3] = + r[2] - r[3] + r[4]               #Species C
    di[4] = + r[3] + r[5] + r[6]               #Species P
end
function ODESols(time,uo,parms)
    time_init = (minimum(time),maximum(time))
    prob = ODEProblem(mass_balances,uo,time_init,parms)
    sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8, save_idxs=[1,3,4], saveat=time) #Integrate prob
    return sol
end
function cost_function(data_dict,parms)
    res_dict = Dict(zip(keys(data_dict),[0.0,0.0]))
    for key in keys(data_dict)
        pred = ODESols(data_dict[key]["time"],data_dict[key]["IC"],parms)
        loss = L2Loss(data_dict[key]["time"],data_dict[key]["obs"])
        err = loss(pred)
        res_dict[key] = err
    end
    residual = sum(res_dict[key] for key in keys(res_dict))
    # @show typeof(residual)
    return residual
end
lb = [0.0,0.0,0.0,0.0,0.0,0.0] #parameter lower bounds
ub = [10.0,10.0,10.0,10.0,10.0,10.0] #parameter upper bounds
optfun = Optimization.OptimizationFunction(cost_function,Optimization.AutoForwardDiff())
optprob = Optimization.OptimizationProblem(optfun,exp_dict, ki,lb=lb,ub=ub,reltol=1E-8) #Set up optimization problem
optsol=solve(optprob, BFGS(),maxiters=10000) #Solve optimization problem
println(optsol.u) #print solution
when I call optsol I get the error:
ERROR: MethodError: no method matching ForwardDiff.GradientConfig(::Optimization.var"#89#106"{OptimizationFunction{true, Optimization.AutoForwardDiff{nothing}, typeof(cost_function), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED_NO_TIME), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}, ::Dict{String, Dict{String, Array{Float64}}}, ::ForwardDiff.Chunk{2})
Searching online suggests that the issue may be that my cost_function function is not generic enough for ForwardDiff to handle; however, I am not sure how to identify where the issue is in this function, or whether it is related to the functions (mass_balances and rxn) that are called within cost_function. Another potential issue is that I am not calling the functions appropriately when building the OptimizationFunction or the OptimizationProblem, but I cannot identify the issue here either.
Thank you for any suggestions and your help in troubleshooting this application!
res_dict = Dict(zip(keys(data_dict),[0.0,0.0]))
This dictionary is constructed with the wrong value type: the 0.0 literals fix it to Float64, but with AutoForwardDiff() the parameters (and anything computed from them) are ForwardDiff dual numbers, which cannot be stored in it. Use
zerotype = zero(parms[1])
res_dict = Dict(zip(keys(data_dict), [zerotype, zerotype]))
or
res_dict = Dict(zip(keys(data_dict), zeros(eltype(parms), 2)))
Either way, you want your intermediate calculations to match the element type of parms when using AutoForwardDiff().
In addition to the variable type fix suggested by Chris, my model also had an issue with the order of the arguments of cost_function (Optimization.jl expects the objective as f(u, p), with the optimization variables first) and with how I passed the arguments to the problem in optprob. This solution was shown by Contradict here.

jDE(Adaptive Differential Evolution)

In jDE, each individual has its own F and CR values. How do I assign these values to each individual programmatically, and how do I update them?
Pseudo-code would help.
If you want each individual to have its own F and CR values, you can simply store them in the list that represents the individual. (Pseudo-code in Python:)
import numpy as np

ID_POS = 0
ID_FIT = 1
ID_F = 2
ID_CR = 3

def create_solution(problem_size):
    pos = np.random.uniform(lower_bound, upper_bound, problem_size)
    fit = fitness_function(pos)
    F = your_values
    CR = your_values
    return [pos, fit, F, CR]

def training(problem_size, pop_size, max_iteration):
    # Initialization
    pop = [create_solution(problem_size) for _ in range(0, pop_size)]
    # Evolution process
    for iteration in range(0, max_iteration):
        for i in range(0, pop_size):
            # Do your stuff here
            pos_new = ....
            fit_new = ....
            F_new = ...
            CR_new = ...
            if pop[i][ID_FIT] < fit_new:  # the new solution has better fitness than the old one (a larger fitness value is better here)
                pop[i][ID_F] = F_new
                pop[i][ID_CR] = CR_new    # this is how you update F and CR for every individual
            ...
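For the update step itself, the original jDE paper (Brest et al., 2006) re-samples F and CR with small probabilities tau1 and tau2 when generating each trial vector. Here is a minimal sketch of that rule; the constant names are my own, the values are the commonly cited ones:

import numpy as np

TAU1, TAU2 = 0.1, 0.1   # probabilities of re-sampling F and CR
F_L, F_U = 0.1, 0.9     # F is drawn uniformly from [F_L, F_L + F_U]

def adapt_F_CR(F_old, CR_old):
    # with probability TAU1 draw a new F, otherwise keep the individual's current F
    F_new = F_L + np.random.rand() * F_U if np.random.rand() < TAU1 else F_old
    # with probability TAU2 draw a new CR from [0, 1], otherwise keep the current CR
    CR_new = np.random.rand() if np.random.rand() < TAU2 else CR_old
    return F_new, CR_new

The freshly sampled F_new/CR_new are used to build the trial vector, and they overwrite the stored values only when the trial survives selection, which is exactly what the pop[i][ID_F] = F_new and pop[i][ID_CR] = CR_new lines in the pseudo-code above do.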
You can check out my repo, which contains most of the state-of-the-art meta-heuristics:
https://github.com/thieunguyen5991/metaheuristics

How to define a function to use with scipy.integrate.solve_ivp

I am trying to solve a differential equation using scipy.integrate.solve_ivp
L*Q'' + R*Q' + (1/C)*Q = E(t), E(t) = 230*sin(50*t)
for Q(t) and Q'(t)
C = 0.0014 #F
dQ_0 = 2.6 #A
L = 1.8 #H
n = 575 #/
Q_0 = 1e-06 #C
R = 43 #Ohm
t_f = 2.8 #s
import numpy as np
from scipy.integrate import solve_ivp
t = np.linspace(0, t_f, n)
def E(x):
    return 230*np.sin(50*x)

y = E(y)

def Q(t, y, R, L, C):
    return (y - L*Q'' - R*Q')*C
init_cond = [Q_0, dQ_0]
y_ivp = solve_ivp(Q, t_span=(0, t_f), y0=init_cond)
I am only trying to understand how to correctly define a function that is passed as an argument 'fun' in scipy.integrate.solve_ivp
The answer below is no longer fully valid: solve_ivp has since implemented the args parameter, similar to odeint. So indeed
y_ivp = solve_ivp(Q, t_span=(0, t_f), y0=init_cond, args=(R, L, C))
is now valid (do not forget to set appropriate error tolerances, or at least check that the default values rtol=1e-3, atol=1e-6 are appropriate).
Always available was the use of semi-global variables via a closure or lambda expression:
y_ivp = solve_ivp(lambda t, y: Q(t, y, R, L, C), t_span=(0, t_f), y0=init_cond)
(With the lambda you must not also pass args=(R, L, C), since the lambda only accepts t and y.)
(Obsolete part) solve_ivp had no parameter-passing mechanism, so treat the parameters as global variables. You are formulating an ODE for Q; since it is a second-order ODE, the state also contains the first derivative, as you somehow recognized in the composition of the initial state. The ODE function then needs to produce the derivative values at a given state. Identify Q(t) = Q[0] and Q'(t) = Q[1], then
def Q_ode(t, Q):
    return [Q[1], (E(t) - R*Q[1] - (1/C)*Q[0])/L]
I would continue to name the variables containing Q values with the letter Q.
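Putting the pieces together, a minimal runnable sketch using the constants from the question and the args mechanism (available in SciPy 1.4+) could look like this:

import numpy as np
from scipy.integrate import solve_ivp

# constants from the question
C = 0.0014   # F
L = 1.8      # H
R = 43       # Ohm
Q_0 = 1e-06  # C
dQ_0 = 2.6   # A
t_f = 2.8    # s
n = 575

def E(t):
    return 230*np.sin(50*t)

def Q_ode(t, Q, R, L, C):
    # state: Q[0] = Q(t), Q[1] = Q'(t)
    return [Q[1], (E(t) - R*Q[1] - (1/C)*Q[0])/L]

t_eval = np.linspace(0, t_f, n)
sol = solve_ivp(Q_ode, t_span=(0, t_f), y0=[Q_0, dQ_0], args=(R, L, C),
                t_eval=t_eval, rtol=1e-8, atol=1e-8)
print(sol.y[0][:5])  # first few values of Q(t)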

How to iterate over all but the last index of an AbstractArray

In Julia, the recommended way to iterate over all indices of an AbstractArray is to use eachindex, e.g.,
for i in eachindex(a)
do_something(a[i], i)
end
In contrast to 1:length(a), eachindex(a) supports arrays with unconventional indexing, i.e., indices not starting at 1. Additionally, it is more efficient for arrays with slow linear indexing.
If I want to skip the first index I can use Iterators.drop(eachindex(a), 1) (is there a better way?) but how do I skip the last one in a generic way?
A "front" iterator is relatively simple and generally useful. Edit: it's also totally overkill to define it in full generality just for this case. It's much easier and simpler to rely on Base's builtins with a definition like:
front(itr, n=1) = Iterators.take(itr, length(itr)-n)
This will work for all iterators with length defined — which will include everything that eachindex will return.
Alternatively, you can define a specialized iterator from first principles that doesn't depend upon length being defined. I'm not aware of such a structure in any existing packages. Using Julia 0.6, an implementation could look like:
struct Front{T}
    itr::T
end

# Basic iterator definition
function Base.start(f::Front)
    s = start(f.itr)
    done(f.itr, s) && throw(ArgumentError("cannot take the front of an empty iterator"))
    return next(f.itr, s)
end
function Base.next(f::Front, state)
    val, s = state
    return val, next(f.itr, s)
end
Base.done(f::Front, state) = done(f.itr, state[2])

# Inherit traits as appropriate
Base.iteratorsize(::Type{Front{T}}) where {T} = _dropshape(Base.iteratorsize(T))
_dropshape(x) = x
_dropshape(::Base.HasShape) = Base.HasLength()
Base.iteratoreltype(::Type{Front{T}}) where {T} = Base.iteratoreltype(T)
Base.length(f::Front) = length(f.itr) - 1
Base.eltype(f::Front{T}) where {T} = eltype(T)
Now:
julia> collect(Front(eachindex(rand(5))))
4-element Array{Int64,1}:
1
2
3
4
julia> collect(Front(eachindex(sprand(3, 2, .2))))
5-element Array{CartesianIndex{2},1}:
CartesianIndex{2}((1, 1))
CartesianIndex{2}((2, 1))
CartesianIndex{2}((3, 1))
CartesianIndex{2}((1, 2))
CartesianIndex{2}((2, 2))
Another way to define @MattB.'s Front is
front(itr,n=1) = (first(x) for x in Iterators.partition(itr,n+1,1))
This also gives:
julia> front(eachindex([1,2,3,4,5]))|>collect
4-element Array{Int64,1}:
1
2
3
4
and as a bonus:
julia> front(eachindex([1,2,3,4,5]),2)|>collect
3-element Array{Int64,1}:
1
2
3
which is the counterpart of Iterators.drop(eachindex([1,2,3,4,5]), 2).
There's also the following:
for I in CartesianRange(Base.front(indices(A)))
    @show I A[I, :]
end
On A = reshape(1:27, 3, 3, 3) this yields
I = CartesianIndex{2}((1,1))
A[I,:] = [1,10,19]
I = CartesianIndex{2}((2,1))
A[I,:] = [2,11,20]
I = CartesianIndex{2}((3,1))
A[I,:] = [3,12,21]
I = CartesianIndex{2}((1,2))
A[I,:] = [4,13,22]
I = CartesianIndex{2}((2,2))
A[I,:] = [5,14,23]
I = CartesianIndex{2}((3,2))
A[I,:] = [6,15,24]
I = CartesianIndex{2}((1,3))
A[I,:] = [7,16,25]
I = CartesianIndex{2}((2,3))
A[I,:] = [8,17,26]
I = CartesianIndex{2}((3,3))
A[I,:] = [9,18,27]

Pandas Apply(), Transform() ERROR = invalid dtype determination in get_concat_dtype

Following on from this question, which I link as background, but this question is standalone.
4 questions:
1. I cannot understand the error I see when using apply or transform: "invalid dtype determination in get_concat_dtype"
2. Why does ClipAndNetMean work but the other two methods do not?
3. I am unsure if or why I need the .copy(deep=True).
4. Why is slightly different syntax needed to call the InnerFoo function?
The DataFrame:
              cost
section item
11      1       25
        2      100
        3       77
        4       10
12      5       50
        1       39
        2        7
        3       32
13      4       19
        1       21
        2       27
The code:
import pandas as pd
import numpy as np
df = pd.DataFrame(data={'section': [11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13],
                        'item': [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2],
                        'cost': [25., 100., 77., 10., 50., 39., 7., 32., 19., 21., 27.]
                        })
df.set_index(['section', 'item'], inplace=True)
upper = 50
lower = 10

def ClipAndNetMean(cost, upper, lower):
    avg = cost.mean()
    new_cost = (cost - avg).clip(lower, upper)
    return new_cost

def MiniMean(cost, upper, lower):
    cost_clone = cost.copy(deep=True)
    cost_clone['A'] = lower
    cost_clone['B'] = upper
    v = cost_clone.apply(np.mean, axis=1)
    return v.to_frame()

def InnerFoo(lower, upper):
    def inner(group):
        group_clone = group.copy(deep=True)
        group_clone['lwr'] = lower
        group_clone['upr'] = upper
        v = group_clone.apply(np.mean, axis=1)
        return v.to_frame()
    return inner
#These 2 work fine.
print df.groupby(level = 'section').apply(ClipAndNetMean,lower,upper)
print df.groupby(level = 'section').transform(ClipAndNetMean,lower,upper)
#apply works but not transform
print df.groupby(level = 'section').apply(MiniMean,lower,upper)
print df.groupby(level = 'section').transform(MiniMean,lower,upper)
#apply works but not transform
print df.groupby(level = 'section').apply(InnerFoo(lower,upper))
print df.groupby(level = 'section').transform(InnerFoo(lower,upper))
exit()
Following up on Chris's answer: note that if I add back the column header, the methods will also work in a transform call.
see v.columns = ['cost']
def MiniMean(cost, upper, lower):
    cost_clone = cost.copy(deep=True)
    cost_clone['A'] = lower
    cost_clone['B'] = upper
    v = cost_clone.apply(np.mean, axis=1)
    v = v.to_frame()
    v.columns = ['cost']
    return v

def InnerFoo(lower, upper):
    def inner(group):
        group_clone = group.copy(deep=True)
        group_clone['lwr'] = lower
        group_clone['upr'] = upper
        v = group_clone.apply(np.mean, axis=1)
        v = v.to_frame()
        v.columns = ['cost']
        return v
    return inner
1 & 2) transform expects something "like-indexed" (a result that aligns with the original rows), while apply is flexible. The two failing functions add additional columns.
3) In some cases (e.g. if you're passing a whole DataFrame into a function), it can be necessary to copy to avoid mutating the original. It should not be necessary here.
4) The first two functions take a DataFrame plus two parameters and return data. InnerFoo actually returns another function, so it needs to be called first, and the function it returns is what gets passed into apply.
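To make the "like-indexed" requirement concrete, here is a small sketch on toy data (not the frame from the question) contrasting the two calls:

import pandas as pd

df2 = pd.DataFrame({'section': [11, 11, 12], 'cost': [25.0, 100.0, 50.0]})
g = df2.groupby('section')['cost']

# transform must return a scalar per group or something aligned with the
# original rows, so the result always has one value per input row
print(g.transform('mean'))

# apply may return an arbitrarily shaped object, e.g. several summary rows
# per group
print(g.apply(lambda s: s.describe()))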