Understanding Julia DataFrames indexing

I am learning Julia and I've just found this line:
if(any(mach_df[start_slot:(start_slot + task_setup_time), Symbol(machine)].== 0))
What does it mean? I know any is a function that returns true if every value of the parameter is true, but I just can't understand what is inside the brackets.
Regards

Let us work inside out:
mach_df[start_slot:(start_slot + task_setup_time), Symbol(machine)] selects the rows in the range start_slot:(start_slot + task_setup_time) from the column named Symbol(machine) (Symbol is most likely not needed, but I would need to see your source code to tell you exactly); as a result you get a vector.
mach_df[start_slot:(start_slot + task_setup_time), Symbol(machine)] .== 0 gives you another vector, of Booleans, that is true wherever the value in the LHS vector is 0.
The any part returns true if at least one of the values in the vector produced above is true (it is all, not any, that requires every value to be true).
A more advanced (and efficient) way to write it would be:
any(==(0), @view mach_df[start_slot:(start_slot + task_setup_time), Symbol(machine)])
but I am not sure if you need performance in your use case.
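To make the inside-out reading concrete, here is a minimal sketch of the same three steps, written in plain Python lists rather than Julia. The column data and the helper name any_zero are hypothetical; the only point is the order of operations (select rows, compare elementwise, reduce with any) and the fact that Julia's range is 1-based and includes both ends.

```python
# Hypothetical stand-in for one column of mach_df: 0 marks a free slot.
machine_column = [1, 1, 0, 1, 1, 1]


def any_zero(column, start_slot, task_setup_time):
    """Mirror any(mach_df[start:(start + setup), col] .== 0) with 1-based, inclusive bounds."""
    # Step 1: select the rows start_slot .. start_slot + task_setup_time (both included).
    selected = column[start_slot - 1 : start_slot + task_setup_time]
    # Step 2: elementwise comparison, the counterpart of `.== 0`.
    is_zero = [value == 0 for value in selected]
    # Step 3: true as soon as at least one element is true.
    return any(is_zero)


print(any_zero(machine_column, 2, 2))  # rows 2..4 are [1, 0, 1]
print(any_zero(machine_column, 4, 2))  # rows 4..6 are [1, 1, 1]
```

The first call finds a zero in the selected rows and the second does not, which is the difference between any and all that the question mixes up.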

Related

writing a vector using "readTrajectory" function in Dymola

I write a vector in Dymola mos script in a simple manner like this:
x_axis = cell.spatialSummary.x_cell;
output: x_axis={1,2,3,4,5} // row vector
I want to do the same thing in a function. 'x_cell' has 5 values, which I want to store in a row vector. I use the DymolaCommands.Trajectories.readTrajectory function to read the x_cell values one by one in a for loop (I use a for loop because readTrajectory throws an error when I try to read the entire x_cell):
Real x_axis[:],axis_value[:,:];
Integer len=5;
for i in 1:len loop
  axis_value:=readTrajectory(result,{"cell.spatialSummary.x_cell["+String(i)+"]"},1); //This intermediate variable returns [1,1] matrix
  x_axis[i]:=scalar(axis_value);
end for;
I get an error:
Assignment failed x_axis[i] = scalar(axis_value);
What's wrong here? All I want to do is read all the values of x_cell and write them into a vector. How can I do this in a Dymola function?
Thank you!

Solution: initialize the vector with a certain value before the loop, so that it already has len elements that x_axis[i] can be assigned to. In this case,
x_axis := fill(0, len);
This solved the above problem for me.

Prefilling, as in the other solution, works and is generally the best solution. However, in some cases you might have to append to the vector instead, as follows:
x_axis:=fill(0.0, 0);
for i in 1:len loop
  axis_value:=readTrajectory(result,{"cell.spatialSummary.x_cell["+String(i)+"]"},1); //This intermediate variable returns [1,1] matrix
  x_axis:=cat(1, x_axis, {scalar(axis_value)});
end for;
(This takes x_axis and concatenates a new element at the end on every iteration. It is generally slower, since each cat builds a new vector.)

Defining a family of variables in sage

I am trying to migrate my scripts from Mathematica to Sage. I am stuck on something that seems elementary.
I need to work with arbitrarily large polynomials say of the form
a00 + a10*x + a01*y + a20 *x^2 + a11*x*y + ...
I consider them polynomials in x and y only and, given such a polynomial P, I need to get the list of its monomials.
For example if P = a20*x^2 + a12*x*y^2
I want a list of the form [a20*x^2,a12*x*y^2].
I figured out that a polynomial in Sage has a method called coefficients that returns the coefficients and a method called monomials that returns the monomials without the coefficients. Multiplying these two lists together elementwise gives the result I want.
The problem is that for this to work I need to explicitly declare all the a's as variables, which is something that is not always possible.
Is there any way to tell Sage that anything of the form a[number][number] is a variable? Or is there any way to define a whole family of variables in Sage?
In a perfect world I would like to make Sage behave like Mathematica, in the sense that anything which is not defined is considered a variable, but I guess this is too optimistic.

My answer does not fully address your question, but one trick I found for defining variables is to use PolynomialRing(). For example:
sage: R = PolynomialRing(RR, 'c', 20)
sage: c = R.gens()
sage: pol=sum(c[i]*x^i for i in range(10));pol
c9*x^9 + c8*x^8 + c7*x^7 + c6*x^6 + c5*x^5 + c4*x^4 + c3*x^3 + c2*x^2 + c1*x + c0
and later on you can define them as variables to solve(), for example:
sage: variables=[SR(c[i]) for i in srange(0,len(eq_list))];
sage: solution = solve(eqs,variables);

You'll almost certainly need some very minor string processing; the answers
this way of getting lists of symbolic variables
this other way of getting them that is similar
this sage-support post
are better than anything I can say. Naturally, this is possible to implement, but ...
"In a perfect world I would like to make Sage behave like Mathematica, in the sense that anything which is not defined is considered a variable, but I guess this is too optimistic."
True; indeed, that goes against Python's (and hence Sage's) philosophy of "explicit is better than implicit"; there were arguments for a long time over whether even x should be predefined as a symbolic variable (it is!).
(And truthfully, given how often I make typos, I'd really rather not have any arbitrary thing be considered a symbolic variable.)
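The "very minor string processing" could look something like the following sketch. It is plain Python and only builds the names; the helper coefficient_names is hypothetical, and handing the result to Sage (for instance via var or as generator names of a PolynomialRing) is an assumption that is not run here.

```python
def coefficient_names(prefix, max_degree):
    """Names such as a00, a10, a01, ... for every monomial x^i * y^j with i + j <= max_degree."""
    return [
        f"{prefix}{i}{total - i}"
        for total in range(max_degree + 1)   # group by total degree
        for i in range(total, -1, -1)        # exponent of x, descending
    ]


names = coefficient_names("a", 2)
print(names)
# In a Sage session (assumption, not executed here) the names could then be
# declared in one go, e.g. var(" ".join(names)).
```

Note that with exponents of 10 or more the plain concatenation becomes ambiguous (a110 could mean 1,10 or 11,0), so a separator such as an underscore would be safer for large polynomials.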

NLopt with univariate optimization

Does anyone know if NLopt works with univariate optimization? I tried to run the following code:
using NLopt
function myfunc(x, grad)
x.^2
end
opt = Opt(:LD_MMA, 1)
min_objective!(opt, myfunc)
(minf,minx,ret) = optimize(opt, [1.234])
println("got $minf at $minx (returned $ret)")
But I get the following error message:
> Error evaluating untitled
LoadError: BoundsError: attempt to access 1-element Array{Float64,1}:
1.234
at index [2]
in myfunc at untitled:8
in nlopt_callback_wrapper at /Users/davidzentlermunro/.julia/v0.4/NLopt/src/NLopt.jl:415
in optimize! at /Users/davidzentlermunro/.julia/v0.4/NLopt/src/NLopt.jl:514
in optimize at /Users/davidzentlermunro/.julia/v0.4/NLopt/src/NLopt.jl:520
in include_string at loading.jl:282
in include_string at /Users/davidzentlermunro/.julia/v0.4/CodeTools/src/eval.jl:32
in anonymous at /Users/davidzentlermunro/.julia/v0.4/Atom/src/eval.jl:84
in withpath at /Users/davidzentlermunro/.julia/v0.4/Requires/src/require.jl:37
in withpath at /Users/davidzentlermunro/.julia/v0.4/Atom/src/eval.jl:53
[inlined code] from /Users/davidzentlermunro/.julia/v0.4/Atom/src/eval.jl:83
in anonymous at task.jl:58
while loading untitled, in expression starting on line 13
If this isn't possible, does anyone know of a univariate optimizer where I can specify bounds and an initial condition?

There are a couple of things that you're missing here.
You need to specify the gradient (i.e. first derivative) of your function within the function. See the tutorial and examples on the GitHub page for NLopt. Not all optimization algorithms require this, but the one that you are using, LD_MMA, does (the LD prefix marks local, derivative-based algorithms). See here for a listing of the various algorithms and which require a gradient.
You should specify the tolerance conditions that must be met before you "declare victory" ¹ (i.e. decide that the function is sufficiently optimized). This is the xtol_rel!(opt,1e-4) in the example below. See also ftol_rel! for another way to specify a tolerance condition. According to the documentation, xtol_rel will "stop when an optimization step (or an estimate of the optimum) changes every parameter by less than tol multiplied by the absolute value of the parameter" and ftol_rel will "stop when an optimization step (or an estimate of the optimum) changes the objective function value by less than tol multiplied by the absolute value of the function value". See the "Stopping Criteria" section here for more information on the various options.
The function that you are optimizing should return a scalar. In your example, the output is a vector (albeit of length 1), because x.^2 is an elementwise operation on a vector and returns a vector. If your objective function doesn't ultimately return a single number, then it won't be clear what your optimization objective is (what does it mean to minimize a vector? You could minimize the norm of a vector, for instance, but minimizing a whole vector is not well defined).
Below is a working example, based on your code. Note that I included the printing output from the example on the github page, which can be helpful for you in diagnosing problems.
using NLopt
count = 0 # keep track of the number of function evaluations
function myfunc(x::Vector, grad::Vector)
    if length(grad) > 0
        grad[1] = 2*x[1] # derivative of x^2, written in place
    end
    global count
    count += 1
    println("f_$count($x)")
    return x[1]^2 # scalar objective value, not a vector
end
opt = Opt(:LD_MMA, 1)
xtol_rel!(opt, 1e-4)
min_objective!(opt, myfunc)
(minf, minx, ret) = optimize(opt, [1.234])
println("got $minf at $minx (returned $ret)")
¹ (In the words of optimization great Yinyu Ye.)
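For readers who want to see the callback convention in isolation, here is a small sketch in plain Python. It is not NLopt and not the MMA algorithm: the driver minimize_1d is a hypothetical helper that runs ordinary gradient descent, and it is only there to show an objective that returns a scalar, fills the gradient in place, and is stopped by a tolerance on the step. The tolerance is absolute rather than relative, because the minimizer of x^2 is exactly 0.

```python
def myfunc(x, grad):
    """Objective f(x) = x[0]**2; writes the derivative into grad when grad is non-empty."""
    if len(grad) > 0:
        grad[0] = 2.0 * x[0]
    return x[0] ** 2  # a single number, not a one-element list


def minimize_1d(objective, x0, step=0.25, xtol_abs=1e-8, max_iter=1000):
    """Plain gradient descent (hypothetical helper), stopped once the step is below xtol_abs."""
    x = [float(x0)]
    grad = [0.0]
    for _ in range(max_iter):
        objective(x, grad)                  # fills grad in place
        new_x = x[0] - step * grad[0]
        converged = abs(new_x - x[0]) <= xtol_abs
        x[0] = new_x
        if converged:
            break
    return objective(x, []), x[0]           # empty grad: value only


minf, minx = minimize_1d(myfunc, 1.234)
print(f"got {minf} at {minx}")
```

Passing an empty list for grad plays the role of a derivative-free call, which is why the objective checks the length before writing to it.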

pivot_table error - InvalidOperation: [<class 'decimal.InvalidOperation'>]

The above error is being raised from a pivot_table operation for a variable set to be the column grouping (if it matters, it's failing in the format.py module)
/anaconda/lib/python3.4/site-packages/pandas/core/format.py in __call__(self, num)
2477 sign = 1
2478
-> 2479 if dnum < 0: # pragma: no cover
2480 sign = -1
2481 dnum = -dnum
(pandas v0.17.1)
If I create random values for the 'problem' variable via numpy there is no error.
Whilst I doubt it's an edge case for the pivot_table function, I can't figure out what might be causing the problem on the data side:
i) The variable is the first integer from a modest sized sequence of integers (eg 2 from 246) (via df.var.str[0]).
ii) pd.unique(df.var) returns the expected 1-9 values
iii) There are no NaNs: notnull(df.var).all() returns True
iv) The dtype is int64 (if the integer is cast to a string, or turned into labels, these alternatives still fail with the same error)
v) a period index is used - and that forms the index for pivot table.
vi) the aggregation is 'count'
If I create another variable with random values with those characteristics (1-9 values from numpy's random.randint), the pivot_table call works. If I cast it to a string, or use labels, it still works.
Likewise, I've been playing with the data set for a while - usually on some other position in the sequence without issue. But today - the first place is causing a problem.
Possibly it's a data issue, but why doesn't pivot_table return empty cells or NaNs, rather than failing at that point?
I'm at a loss after a day of exploring.
Any thoughts on why the above error is being raised would be much appreciated (as it'll help me track down the data issue, if that is the case).
thanks
Chris

The simplest solution is to reset the pandas float formatting option:
pd.set_option('display.float_format', None)
Further details:
I encountered the same problem. As a workaround, you can also filter the dataframe that is pivoted so that the result contains no NaNs.
My problem was related to the use of pd.set_eng_float_format(2, True). Without this, all pivots work well.
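The traceback stops at an ordering comparison on dnum, and the answer above ties the failure to NaNs combined with the engineering float format. One plausible mechanism can be reproduced with the standard library alone. The sketch below assumes that the formatter builds dnum with decimal.Decimal(str(num)); that line is not shown in the traceback, and format_like_eng is a hypothetical stand-in, not the pandas code.

```python
import decimal


def format_like_eng(num):
    """Hypothetical stand-in for the sign check visible in the traceback."""
    dnum = decimal.Decimal(str(num))  # assumption: how dnum is built
    sign = 1
    if dnum < 0:  # ordering comparison; signals InvalidOperation when dnum is NaN
        sign = -1
        dnum = -dnum
    return sign, dnum


print(format_like_eng(-2.5))

try:
    format_like_eng(float("nan"))
except decimal.InvalidOperation as exc:
    print("InvalidOperation:", exc)
```

Under that assumption, any NaN cell that reaches the formatter raises instead of being displayed, which would explain why resetting display.float_format or filtering out the NaNs makes the error disappear.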

Smart indexing for matching values in Julia

Suppose I have the following:
# valuesToFind: n x 1 vector
# allValues: m x n matrix, in which every column allValues[:,i]
# contains among its components exactly 1 instance of the
# corresponding value valuesToFind[i] at some row position
I am trying to determine the position (row index) at which this match occurs for every value in valuesToFind and, currently, I achieve it with the following loop:
idx = Array(Int16, length(valuesToFind))
for (i, v) in enumerate(valuesToFind)
    idx[i] = findfirst(allValues[:,i], v)
end
Is it possible to do this without a loop, in a single statement?

Are you looking for:
[findfirst(allValues[:,i], v) for (i,v) in enumerate(valuesToFind)]
?
I'm not 100% convinced this is clearer (in terms of code readability) than a simple loop, but it would do the job in one line if that's what you're after.

Try:
map(x->ind2sub(allValues,x)[1],findin(allValues,valuesToFind))
This is a one-line solution to get the row number of each value in its column. Note that it relies on the assumption laid out in the question (a unique value in each column). It also relies on the column-major layout of a matrix. The layout assumption can be removed with a sort on the first index returned by ind2sub.
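The per-column lookup that both answers perform can be sketched in plain Python as follows. The matrix is stored as a list of rows, the data is made up, and match_rows is a hypothetical helper; row positions are reported 1-based to match the Julia code.

```python
def match_rows(all_values, values_to_find):
    """For each column i, return the 1-based row at which values_to_find[i] occurs."""
    columns = zip(*all_values)  # transpose: iterate over columns of a list of rows
    return [
        list(column).index(value) + 1
        for column, value in zip(columns, values_to_find)
    ]


all_values = [
    [7, 2, 9],
    [4, 5, 3],
    [1, 8, 6],
]
values_to_find = [4, 8, 9]
print(match_rows(all_values, values_to_find))
```

Like the comprehension in the first answer, this searches each column only for its own value, so it does not depend on a value being absent from the other columns.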
