matlab subsref: {} with string argument fails, why? - oop

There are a few implementations of a hash or dictionary class in the Mathworks File Exchange repository. All that I have looked at use parentheses overloading for key referencing, e.g.
d = Dict;
d('foo') = 'bar';
y = d('foo');
which seems a reasonable interface. It would be preferable, though, if you want to easily have dictionaries which contain other dictionaries, to use braces {} instead of parentheses, as this allows you to get around MATLAB's (arbitrary, it seems) syntax limitation that multiple parentheses are not allowed but multiple braces are allowed, i.e.
t{1}{2}{3} % is legal MATLAB
t(1)(2)(3) % is not legal MATLAB
So if you want to easily be able to nest dictionaries within dictionaries,
dict{'key1'}{'key2'}{'key3'}
as is a common idiom in Perl and is possible and frequently useful in other languages including Python, then unless you want to use n-1 intermediate variables to extract a dictionary entry n layers deep, this seems a good choice. And it would seem easy to rewrite the class's subsref and subsasgn operations to do the same thing for {} as they previously did for (), and everything should work.
Except it doesn't when I try it.
Here's my code. (I've reduced it to a minimal case. No actual dictionary is implemented here, each object has one key and one value, but this is enough to demonstrate the problem.)
classdef TestBraces < handle
properties
% not a full hash table implementation, obviously
key
value
end
methods(Access = public)
function val = subsref(obj, ref)
% Re-implement dot referencing for methods.
if strcmp(ref(1).type, '.')
% User trying to access a method
% Methods access
if ismember(ref(1).subs, methods(obj))
if length(ref) > 1
% Call with args
val = obj.(ref(1).subs)(ref(2).subs{:});
else
% No args
val = obj.(ref.subs);
end
return;
end
% User trying to access something else.
error(['Reference to non-existant property or method ''' ref.subs '''']);
end
switch ref.type
case '()'
error('() indexing not supported.');
case '{}'
theKey = ref.subs{1};
if isequal(obj.key, theKey)
val = obj.value;
else
error('key %s not found', theKey);
end
otherwise
error('Should never happen')
end
end
function obj = subsasgn(obj, ref, value)
%Dict/SUBSASGN Subscript assignment for Dict objects.
%
% See also: Dict
%
if ~strcmp(ref.type,'{}')
error('() and dot indexing for assignment not supported.');
end
% Vectorized calls not supported
if length(ref.subs) > 1
error('Dict only supports storing key/value pairs one at a time.');
end
theKey = ref.subs{1};
obj.key = theKey;
obj.value = value;
end % subsasgn
end
end
Using this code, I can assign as expected:
t = TestBraces;
t{'foo'} = 'bar'
(And it is clear that the assignment work from the default display output for t.) So subsasgn appears to work correctly.
But I can't retrieve the value (subsref doesn't work):
t{'foo'}
??? Error using ==> subsref
Too many output arguments.
The error message makes no sense to me, and a breakpoint at the first executable line of my subsref handler is never hit, so at least superficially this looks like a MATLAB issue, not a bug in my code.
Clearly string arguments to () parenthesis subscripts are allowed, since this works fine if you change the code to work with () instead of {}. (Except then you can't nest subscript operations, which is the object of the exercise.)
Either insight into what I'm doing wrong in my code, any limitations that make what I'm doing unfeasible, or alternative implementations of nested dictionaries would be appreciated.

Short answer, add this method to your class:
function n = numel(obj, varargin)
n = 1;
end
EDIT: The long answer.
Despite the way that subsref's function signature appears in the documentation, it's actually a varargout function - it can produce a variable number of output arguments. Both brace and dot indexing can produce multiple outputs, as shown here:
>> c = {1,2,3,4,5};
>> [a,b,c] = c{[1 3 5]}
a =
1
b =
3
c =
5
The number of outputs expected from subsref is determined based on the size of the indexing array. In this case, the indexing array is size 3, so there's three outputs.
Now, look again at:
t{'foo'}
What's the size of the indexing array? Also 3. MATLAB doesn't care that you intend to interpret this as a string instead of an array. It just sees that the input is size 3 and your subsref can only output 1 thing at a time. So, the arguments mismatch. Fortunately, we can correct things by changing the way that MATLAB determines how many outputs are expected by overloading numel. Quoted from the doc link:
It is important to note the significance of numel with regards to the
overloaded subsref and subsasgn functions. In the case of the
overloaded subsref function for brace and dot indexing (as described
in the last paragraph), numel is used to compute the number of
expected outputs (nargout) returned from subsref. For the overloaded
subsasgn function, numel is used to compute the number of expected
inputs (nargin) to be assigned using subsasgn. The nargin value for
the overloaded subsasgn function is the value returned by numel plus 2
(one for the variable being assigned to, and one for the structure
array of subscripts).
As a class designer, you must ensure that the value of n returned by
the built-in numel function is consistent with the class design for
that object. If n is different from either the nargout for the
overloaded subsref function or the nargin for the overloaded subsasgn
function, then you need to overload numel to return a value of n that
is consistent with the class' subsref and subsasgn functions.
Otherwise, MATLAB produces errors when calling these functions.
And there you have it.

Related

Assign a Seq(Seq) into arrays

What is it the correct syntax to assign a Seq(Seq) into multiple typed arrays without assign the Seq to an scalar first? Has the Seq to be flattened somehow? This fails:
class A { has Int $.r }
my A (#ra1, #ra2);
#create two arrays with 5 random numbers below a certain limit
#Fails: Type check failed in assignment to #ra1; expected A but got Seq($((A.new(r => 3), A.n...)
(#ra1, #ra2) =
<10 20>.map( -> $up_limit {
(^5).map({A.new( r => (^$up_limit).pick ) })
});
TL;DR Binding is faster than assignment, so perhaps this is the best practice solution to your problem:
:(#ra1, #ra2) := <10 20>.map(...);
While uglier than the solution in the accepted answer, this is algorithmically faster because binding is O(1) in contrast to assignment's O(N) in the length of the list(s) being bound.
Assigning / copying
Simplifying, your non-working code is:
(#listvar1, #listvar2) = list1, list2;
In Raku infix = means assignment / copying from the right of the = into one or more of the container variables on the left of the =.
If a variable on the left is bound to a Scalar container, then it will assign one of the values on the right. Then the assignment process starts over with the next container variable on the left and the next value on the right.
If a variable on the left is bound to an Array container, then it uses up all remaining values on the right. So your first array variable receives both list1 and list2. This is not what you want.
Simplifying, here's Christoph's answer:
#listvar1, #listvar2 Z= list1, list2;
Putting the = aside for a moment, Z is an infix version of the zip routine. It's like (a physical zip pairing up consecutive arguments on its left and right. When used with an operator it applies that operator to the pair. So you can read the above Z= as:
#listvar1 = list1;
#listvar2 = list2;
Job done?
Assignment into Array containers entails:
Individually copying as many individual items as there are in each list into the containers. (In the code in your example list1 and list2 contain 5 elements each, so there would be 10 copying operations in total.)
Forcing the containers to resize as necessary to accommodate the items.
Doubling up the memory used by the items (the original list elements and the duplicates copied into the Array elements).
Checking that the type of each item matches the element type constraint.
Assignment is in general much slower and more memory intensive than binding...
Binding
:(#listvar1, #listvar2) := list1, list2;
The := operator binds whatever's on its left to the arguments on its right.
If there's a single variable on the left then things are especially simple. After binding, the variable now refers precisely to what's on the right. (This is especially simple and fast -- a quick type check and it's done.)
But that's not so in our case.
Binding also accepts a standalone signature literal on its left. The :(...) in my answer is a standalone Signature literal.
(Signatures are typically attached to a routine without the colon prefix. For example, in sub foo (#var1, #var2) {} the (#var1, #var2) part is a signature attached to the routine foo. But as you can see, one can write a signature separately and let Raku know it's a signature by prefixing a pair of parens with a colon. A key difference is that any variables listed in the signature must have already been declared.)
When there's a signature literal on the left then binding happens according to the same logic as binding arguments in routine calls to a receiving routine's signature.
So the net result is that the variables get the values they'd have inside this sub:
sub foo (#listvar1, #listvar2) { }
foo list1, list2;
which is to say the effect is the same as:
#listvar1 := list1;
#listvar2 := list2;
Again, as with Christoph's answer, job done.
But this way we'll have avoided assignment overhead.
Not entirely sure if it's by design, but what seems to happen is that both of your sequences are getting stored into #ra1, while #ra2 remains empty. This violates the type constraint.
What does work is
#ra1, #ra2 Z= <10 20>.map(...);

Summation iterated over a variable length

I have written an optimization problem in pyomo and need a constraint, which contains a summation that has a variable length:
u_i_t[i, t]*T_min_run - sum (tnewnew in (t-T_min_run+1)..t-1) u_i_t[i,tnewnew] <= sum (tnew in t..(t+T_min_run-1)) u_i_t[i,tnew]
T is my actual timeline and N my machines
usually I iterate over t, but I need to guarantee the machines are turned on for certain amount of time.
def HP_on_rule(model, i, t):
return model.u_i_t[i, t]*T_min_run - sum(model.u_i_t[i, tnewnew] for tnewnew in range((t-T_min_run+1), (t-1))) <= sum(model.u_i_t[i, tnew] for tnew in range(t, (t+T_min_run-1)))
model.HP_on_rule = Constraint(N, rule=HP_on_rule)
I hope you can provide me with the correct formulation in pyomo/python.
The problem is that t is a running variable and I do not know how to implement this in Python. tnew is only a help variable. E.g. t=6 (variable), T_min_run=3 (constant) and u_i_t is binary [00001111100000...] then I get:
1*3 - 1 <= 3
As I said, I do not know how to implement this in my code and the current version is not running.
TypeError: HP_on_rule() missing 1 required positional argument: 't'
It seems like you didn't provide all your arguments to the function rule.
Since t is a parameter of your function, I assume that it corresponds to an element of set T (your timeline).
Then, your last line of your code example should include not only the set N, but also the set T. Try this:
model.HP_on_rule = Constraint(N, T, rule=HP_on_rule)
Please note: Building a Constraint with a "for each" part, you must provide the Pyomo Sets that you want to iterate over at the begining of the call for Constraint construction. As a rule of thumb, your constraint rule function should have 1 more argument than the number of Pyomo Sets specified in the Constraint initilization line.

Make interpreter execute faster

I've created an interprter for a simple language. It is AST based (to be more exact, an irregular heterogeneous AST) with visitors executing and evaluating nodes. However I've noticed that it is extremely slow compared to "real" interpreters. For testing I've ran this code:
i = 3
j = 3
has = false
while i < 10000
j = 3
has = false
while j <= i / 2
if i % j == 0 then
has = true
end
j = j+2
end
if has == false then
puts i
end
i = i+2
end
In both ruby and my interpreter (just finding primes primitively). Ruby finished under 0.63 second, and my interpreter was over 15 seconds.
I develop the interpreter in C++ and in Visual Studio, so I've used the profiler to see what takes the most time: the evaluation methods.
50% of the execution time was to call the abstract evaluation method, which then casts the passed expression and calls the proper eval method. Something like this:
Value * eval (Exp * exp)
{
switch (exp->type)
{
case EXP_ADDITION:
eval ((AdditionExp*) exp);
break;
...
}
}
I could put the eval methods into the Exp nodes themselves, but I want to keep the nodes clean (Terence Parr saied something about reusability in his book).
Also at evaluation I always reconstruct the Value object, which stores the result of the evaluated expression. Actually Value is abstract, and it has derived value classes for different types (That's why I work with pointers, to avoid object slicing at returning). I think this could be another reason of slowness.
How could I make my interpreter as optimized as possible? Should I create bytecodes out of the AST and then interpret bytecodes instead? (As far as I know, they could be much faster)
Here is the source if it helps understanding my problem: src
Note: I haven't done any error handling yet, so an illegal statement or an error will simply freeze the program. (Also sorry for the stupid "error messages" :))
The syntax is pretty simple, the currently executed file is in OTZ1core/testfiles/test.txt (which is the prime finder).
I appreciate any help I can get, I'm really beginner at compilers and interpreters.
One possibility for a speed-up would be to use a function table instead of the switch with dynamic retyping. Your call to the typed-eval is going through at least one, and possibly several, levels of indirection. If you distinguish the typed functions instead by name and give them identical signatures, then pointers to the various functions can be packed into an array and indexed by the type member.
value (*evaltab[])(Exp *) = { // the order of functions must match
Exp_Add, // the order type values
//...
};
Then the whole switch becomes:
evaltab[exp->type](exp);
1 indirection, 1 function call. Fast.

Overloading functions

Is there a way to have two functions with the same name but with different arguments inside the same class in Matlab?
In short : No, it is not possible.
However, You can mimic this kind of behavior:
Obviously, since Matlab is a dynamic language, you can pass arguments of any type and check them.
function foo(x)
if isnumeric(x)
disp(' Numeric behavior');
elseif ischar(x)
disp(' String behavior');
end
end
You can also use varargin, and check the number of parameters, and change the behavior
function goo(varargin)
if nargin == 2
disp('2 arguments behavior');
elseif nargin == 3
disp('3 arguments behavior');
end
end
The correct answer has already been given by Andrey. However, I've been running some experiments for some time now and I'd like to show what I think is another relatively straightforward way that has some benefits. Also, it's a method MATLAB uses for its built-in functions quite a bit.
I'm referring to this kind of key-value pair way of passing arguments:
x = 0:pi/50:2*pi;
y = sin(x);
plot(x, y, 'Color', 'blue', 'MarkerFaceColor', 'green');
There are numerous ways of parsing a varargin cell array, but the cleanest way to do this that I've found so far uses the MATLAB inputParser class.
Example:
function makeSandwiches(amount, varargin)
%// MAKESANDWICHES Make a number of sandwiches.
%// By default, you get a ham and egg sandwich with butter on white bread.
%// Options:
%// amount : number of sandwiches to make (integer)
%// 'butter' : boolean
%// 'breadType' : string specifying 'white', 'sourdough', or 'rye'
%// 'topping' : string describing everything you like, we have it all!
p = inputParser(); %// instantiate inputParser
p.addRequired('amount', #isnumeric); %// use built-in MATLAB for validation
p.addOptional('butter', 1, #islogical);
p.addOptional('breadType', 'white', ... %// or use your own (anonymous) functions
#(x) strcmp(x, 'white') || strcmp(x, 'sourdough') || strcmp(x, 'rye'));
p.addOptional('toppings', 'ham and egg', #(x) ischar(x) || iscell(x))
p.parse(amount, varargin{:}); %// Upon parsing, the variables are
%// available as p.Results.<var>
%// Get some strings
if p.Results.amount == 1
stringAmount = 'one tasty sandwich';
else
stringAmount = sprintf('%d tasty sandwiches', p.Results.amount);
end
if p.Results.butter
stringButter = 'with butter';
else
stringButter = 'without butter';
end
%// Make the sandwiches
fprintf(['I made you %s %s from %s bread with %s and taught you ' ...
'something about input parsing and validation in MATLAB at ' ...
'the same time!\n'], ...
stringAmount, stringButter, p.Results.breadType, p.Results.toppings);
end
(slashes after comments because SO doesn't support MATLAB syntax highlighting)
The added benefits of this method are:
- Possibility to set defaults (like Python, where you can set arg=val in the function signature)
- Possibility to perform input validation in an easy way
- You only have to remember the option names, not their order as the order doesn't matter
Downsides:
- there may be some overhead that may become significant when doing many function calls and not much else (not tested)
- you have to use the Results property from the inputParser class instead of using the variables directly

Why can't you use a object property as an index in a for loop? (MATLAB)

Below is an example that doesn't work in Matlab because obj.yo is used as the for loop's index. You can just convert this to the equivalent while loop and it works fine, so why won't Matlab let this code run?
classdef iter_test
properties
yo = 1;
end
methods
function obj = iter_test
end
function run(obj)
for obj.yo = 1:10
disp('yo');
end
end
end
end
Foreword: You shouldn't expect too much from Matlab's oop capabilities. Even though things have gotten better with matlab > 2008a, compared to a real programming language, oop support in Matlab is very poor.
From my experience, Mathworks is trying to protect the user as much as possible from doing mistakes. This sometimes also means that they are restricting the possibilities.
Looking at your example I believe that exactly the same is happening.
Possible Answer: Since Matlab doesn't have any explicit typing (variables / parameters are getting typed on the fly), your code might run into problems. Imagine:
$ a = iter_test()
% a.yo is set to 1
% let's overwrite 'yo'
$ a.yo = struct('somefield', [], 'second_field', []);
% a.yo is now a struct
The following code will therefore fail:
$ for a.yo
disp('hey');
end
I bet that if matlab would support typing of parameters / variables, your code would work just fine. However, since you can assign a completely different data type to a parameter / variable after initialization, the compiler doesn't allow you to do what you want to do because you might run into trouble.
From help
"properties are like fields of a struct object."
Hence, you can use a property to read/write to it. But not use it as variable like you are trying to do. When you write
for obj.yo = 1:10
disp('yo');
end
then obj.yo is being used as a variable, not a field name.
compare to actual struct usage to make it more clear:
EDU>> s = struct('id',10)
for s.id=1:10
disp('hi')
end
s =
id: 10
??? for s.id=1:10
|
Error: Unexpected MATLAB operator.
However, one can 'set' the struct field to new value
EDU>> s.id=4
s =
id: 4
compare the above error to what you got:
??? Error using ==> iter_test
Error: File: iter_test.m Line: 9 Column: 20
Unexpected MATLAB operator.
Therefore, I do not think what you are trying to do is possible.
The error is
??? Error: File: iter_test.m Line: 9 Column: 20
Unexpected MATLAB operator.
Means that the MATLAB parser doesn't understand it. I'll leave it to you to decide whether it's a bug or deliberate. Raise it with TMW Technical Support.
EDIT: This also occurs for all other kinds of subscripting:
The following all fail to parse:
a = [0 1];
for a(1) = 1:10, end
a = {0 1};
for a{1} = 1:10, end
a = struct('a', 0, 'b', 0);
for a.a = 1:10, end
It's an issue with the MATLAB parser. Raise it with Mathworks.