Calculate variance with VB.NET lambda expression

I am trying to convert the following code for the variance calculation
public static double Variance(this IEnumerable<double> source)
{
    double avg = source.Average();
    double d = source.Aggregate(0.0,
        (total, next) => total += Math.Pow(next - avg, 2));
    return d / (source.Count() - 1);
}
described on CodeProject, into the corresponding VB.NET lambda expression syntax, but I am stuck on converting the Aggregate call.
How can I implement that code in VB.NET?

The following will only work in VB 10. Prior versions didn’t support multi-line lambdas. The second parameter is named item rather than next because Next is a reserved word in VB.
Dim d = source.Aggregate(0.0,
    Function(total, item)
        total += (item - avg) ^ 2
        Return total
    End Function)
Function(foo) bar corresponds to the single-expression lambda (foo) => bar in C#, but a literal translation needs a multi-line lambda here, which has only existed since VB 10.
However, I’m wary of the original code. Modifying total seems like an error, since no Aggregate overload passes its arguments by reference. So I’m suggesting that the original code is wrong (even though it may actually compile), and that the correct solution (in VB) would look like this:
Dim d = source.Aggregate(0.0, _
    Function(total, item) total + (item - avg) ^ 2)
Furthermore, this doesn’t require any multi-line lambdas, and thus also works on older versions of VB.
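For comparison, here is the same fold as a minimal Python sketch (not the CodeProject code; the function name variance and the sample data are illustrative assumptions). The point it shows is the one made above: the lambda returns a new accumulator value instead of modifying total.

```python
from functools import reduce


def variance(source):
    """Sample variance computed with a fold, mirroring the Aggregate call."""
    values = list(source)
    avg = sum(values) / len(values)
    # The lambda returns the new running total; it does not mutate `total`.
    squared_sum = reduce(lambda total, item: total + (item - avg) ** 2, values, 0.0)
    # Divide by n - 1, as the original C# does (sample variance).
    return squared_sum / (len(values) - 1)


if __name__ == "__main__":
    print(variance([2, 4, 4, 4, 5, 5, 7, 9]))
```

Like the original, this sketch assumes at least two elements; with fewer it divides by zero.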

How does Kotlin optimize checks that a number belongs to a range?

I'm investigating Kotlin by decompiling it to Java code.
I've found one interesting nuance and can't understand how it is implemented.
Here's the Kotlin code:
val result = 50 in 1..100
I'm using IntelliJ IDEA's decompiler to see the equivalent Java code, and here's what we have:
public final class Test14Kt {
    private static final boolean result = true;
    public static final boolean getResult() {
        return result;
    }
}
So as I understand it, kotlinc somehow knows that the element is in the range and stores true in the result variable at compile time.
It's cool. But how is it achieved?
This is very simple constant folding:
Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:
i = 320 * 200 * 32;
Most modern compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000). The resulting code would load the computed value and store it rather than loading and multiplying several values.
Constant folding can even use arithmetic identities. When x is an integer type, the value of 0 * x is zero even if the compiler does not know the value of x.
Here,
50 in 1..100 ==
1 <= 50 && 50 <= 100 ==
true && true ==
true
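To make the folding steps above concrete, here is a toy constant folder in Python. It is only a sketch of the general technique, not how kotlinc is implemented; the tuple-based expression format and the names fold and range_check are assumptions made up for this illustration.

```python
import operator

# Supported binary operators for this toy folder.
OPS = {
    "*": operator.mul,
    "+": operator.add,
    "<=": operator.le,
    "and": lambda a, b: a and b,
}


def fold(node):
    """Fold (op, left, right) tuples; leaves are literals or variable-name strings."""
    if not isinstance(node, tuple):
        return node  # literal or variable name: nothing to fold
    op, left, right = node
    left, right = fold(left), fold(right)
    # If either side is still unknown (a variable or an unfolded subtree), keep the node.
    if isinstance(left, (str, tuple)) or isinstance(right, (str, tuple)):
        return (op, left, right)
    return OPS[op](left, right)


def range_check(value, low, high):
    """Desugar `value in low..high` into `low <= value and value <= high`."""
    return ("and", ("<=", low, value), ("<=", value, high))


if __name__ == "__main__":
    print(fold(("*", ("*", 320, 200), 32)))  # all operands known: folds to one constant
    print(fold(range_check(50, 1, 100)))     # all operands known: folds to a boolean
    print(fold(range_check("x", 1, 100)))    # x unknown: expression is left in place
```

When every operand is a literal the whole tree collapses to a single value, which is what the decompiled `result = true` reflects.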

Make interpreter execute faster

I've created an interpreter for a simple language. It is AST-based (to be more exact, an irregular heterogeneous AST) with visitors executing and evaluating nodes. However, I've noticed that it is extremely slow compared to "real" interpreters. For testing, I ran this code:
i = 3
j = 3
has = false
while i < 10000
    j = 3
    has = false
    while j <= i / 2
        if i % j == 0 then
            has = true
        end
        j = j+2
    end
    if has == false then
        puts i
    end
    i = i+2
end
I ran it in both Ruby and my interpreter (it just finds primes naively). Ruby finished in under 0.63 seconds, while my interpreter took over 15 seconds.
I'm developing the interpreter in C++ with Visual Studio, so I used the profiler to see what takes the most time: the evaluation methods.
50% of the execution time was spent calling the abstract evaluation method, which then casts the passed expression and calls the proper eval method. Something like this:
Value * eval (Exp * exp)
{
    switch (exp->type)
    {
    case EXP_ADDITION:
        return eval ((AdditionExp*) exp);
    ...
    }
}
I could put the eval methods into the Exp nodes themselves, but I want to keep the nodes clean (Terence Parr said something about reusability in his book).
Also, at evaluation I always reconstruct the Value object, which stores the result of the evaluated expression. Value is actually abstract, and it has derived value classes for the different types (that's why I work with pointers, to avoid object slicing when returning). I think this could be another reason for the slowness.
How could I make my interpreter as optimized as possible? Should I create bytecodes out of the AST and then interpret bytecodes instead? (As far as I know, they could be much faster)
Here is the source if it helps understanding my problem: src
Note: I haven't done any error handling yet, so an illegal statement or an error will simply freeze the program. (Also sorry for the stupid "error messages" :))
The syntax is pretty simple, the currently executed file is in OTZ1core/testfiles/test.txt (which is the prime finder).
I appreciate any help I can get; I'm a real beginner at compilers and interpreters.
One possibility for a speed-up would be to use a function table instead of the switch with dynamic retyping. Your call to the typed-eval is going through at least one, and possibly several, levels of indirection. If you distinguish the typed functions instead by name and give them identical signatures, then pointers to the various functions can be packed into an array and indexed by the type member.
Value *(*evaltab[])(Exp *) = { // the order of the functions must match
    Exp_Add,                   // the order of the type values
    //...
};
Then the whole switch becomes:
evaltab[exp->type](exp);
1 indirection, 1 function call. Fast.
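The same table-dispatch idea, sketched in Python for illustration only. The node layout (a tuple whose first element is the type tag) and the names EVAL_TABLE, eval_number and so on are assumptions for this sketch, not taken from the asker's source.

```python
# Type tags double as indexes into the dispatch table.
EXP_NUMBER, EXP_ADDITION, EXP_MULTIPLY = range(3)


def eval_number(exp):
    return exp[1]


def eval_addition(exp):
    return evaluate(exp[1]) + evaluate(exp[2])


def eval_multiply(exp):
    return evaluate(exp[1]) * evaluate(exp[2])


# The order of the functions must match the order of the type tags.
EVAL_TABLE = [eval_number, eval_addition, eval_multiply]


def evaluate(exp):
    # One table lookup and one call, with no switch and no cast.
    return EVAL_TABLE[exp[0]](exp)


if __name__ == "__main__":
    tree = (EXP_ADDITION,
            (EXP_NUMBER, 2),
            (EXP_MULTIPLY, (EXP_NUMBER, 3), (EXP_NUMBER, 4)))
    print(evaluate(tree))
```

All handlers share one signature, which is what lets them sit in a single array indexed by the type member.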

matlab subsref: {} with string argument fails, why?

There are a few implementations of a hash or dictionary class in the MathWorks File Exchange repository. All that I have looked at overload parentheses for key referencing, e.g.
d = Dict;
d('foo') = 'bar';
y = d('foo');
which seems a reasonable interface. It would be preferable, though, if you want to easily have dictionaries which contain other dictionaries, to use braces {} instead of parentheses, as this allows you to get around MATLAB's (arbitrary, it seems) syntax limitation that multiple parentheses are not allowed but multiple braces are allowed, i.e.
t{1}{2}{3} % is legal MATLAB
t(1)(2)(3) % is not legal MATLAB
So if you want to easily be able to nest dictionaries within dictionaries,
dict{'key1'}{'key2'}{'key3'}
as is a common idiom in Perl and is possible and frequently useful in other languages including Python, then unless you want to use n-1 intermediate variables to extract a dictionary entry n layers deep, this seems a good choice. And it would seem easy to rewrite the class's subsref and subsasgn operations to do the same thing for {} as they previously did for (), and everything should work.
Except it doesn't when I try it.
Here's my code. (I've reduced it to a minimal case. No actual dictionary is implemented here, each object has one key and one value, but this is enough to demonstrate the problem.)
classdef TestBraces < handle
    properties
        % not a full hash table implementation, obviously
        key
        value
    end
    methods(Access = public)
        function val = subsref(obj, ref)
            % Re-implement dot referencing for methods.
            if strcmp(ref(1).type, '.')
                % User trying to access a method
                % Methods access
                if ismember(ref(1).subs, methods(obj))
                    if length(ref) > 1
                        % Call with args
                        val = obj.(ref(1).subs)(ref(2).subs{:});
                    else
                        % No args
                        val = obj.(ref.subs);
                    end
                    return;
                end
                % User trying to access something else.
                error(['Reference to non-existant property or method ''' ref.subs '''']);
            end
            switch ref.type
                case '()'
                    error('() indexing not supported.');
                case '{}'
                    theKey = ref.subs{1};
                    if isequal(obj.key, theKey)
                        val = obj.value;
                    else
                        error('key %s not found', theKey);
                    end
                otherwise
                    error('Should never happen')
            end
        end
        function obj = subsasgn(obj, ref, value)
            %Dict/SUBSASGN Subscript assignment for Dict objects.
            %
            % See also: Dict
            %
            if ~strcmp(ref.type,'{}')
                error('() and dot indexing for assignment not supported.');
            end
            % Vectorized calls not supported
            if length(ref.subs) > 1
                error('Dict only supports storing key/value pairs one at a time.');
            end
            theKey = ref.subs{1};
            obj.key = theKey;
            obj.value = value;
        end % subsasgn
    end
end
Using this code, I can assign as expected:
t = TestBraces;
t{'foo'} = 'bar'
(And it is clear from the default display output for t that the assignment worked.) So subsasgn appears to work correctly.
But I can't retrieve the value (subsref doesn't work):
t{'foo'}
??? Error using ==> subsref
Too many output arguments.
The error message makes no sense to me, and a breakpoint at the first executable line of my subsref handler is never hit, so at least superficially this looks like a MATLAB issue, not a bug in my code.
Clearly string arguments to () parenthesis subscripts are allowed, since this works fine if you change the code to work with () instead of {}. (Except then you can't nest subscript operations, which is the object of the exercise.)
Either insight into what I'm doing wrong in my code, any limitations that make what I'm doing unfeasible, or alternative implementations of nested dictionaries would be appreciated.
Short answer, add this method to your class:
function n = numel(obj, varargin)
    n = 1;
end
EDIT: The long answer.
Despite the way that subsref's function signature appears in the documentation, it's actually a varargout function - it can produce a variable number of output arguments. Both brace and dot indexing can produce multiple outputs, as shown here:
>> c = {1,2,3,4,5};
>> [a,b,c] = c{[1 3 5]}
a =
1
b =
3
c =
5
The number of outputs expected from subsref is determined by the size of the indexing array. In this case, the indexing array has 3 elements, so there are three outputs.
Now, look again at:
t{'foo'}
What's the size of the indexing array? Also 3. MATLAB doesn't care that you intend to interpret this as a string instead of an array. It just sees that the input is size 3 and your subsref can only output 1 thing at a time. So, the arguments mismatch. Fortunately, we can correct things by changing the way that MATLAB determines how many outputs are expected by overloading numel. Quoted from the doc link:
It is important to note the significance of numel with regards to the
overloaded subsref and subsasgn functions. In the case of the
overloaded subsref function for brace and dot indexing (as described
in the last paragraph), numel is used to compute the number of
expected outputs (nargout) returned from subsref. For the overloaded
subsasgn function, numel is used to compute the number of expected
inputs (nargin) to be assigned using subsasgn. The nargin value for
the overloaded subsasgn function is the value returned by numel plus 2
(one for the variable being assigned to, and one for the structure
array of subscripts).
As a class designer, you must ensure that the value of n returned by
the built-in numel function is consistent with the class design for
that object. If n is different from either the nargout for the
overloaded subsref function or the nargin for the overloaded subsasgn
function, then you need to overload numel to return a value of n that
is consistent with the class' subsref and subsasgn functions.
Otherwise, MATLAB produces errors when calling these functions.
And there you have it.

Creating a log function with a custom base

I have a formula that uses a log function with a custom base, for example log with a base of b and a value of x. In Objective-C, I know there are log functions that take no base argument, and ones for base 2 or 10.
Is there a function that can calculate a log with a custom/variable base? Or maybe there is an alternative way of evaluating this formula.
The basic idea of my formula is log_(1+0.02)(1.26825), where 1+0.02 is the base. This should equal 12.000.
Like this:
double logWithBase(double base, double x) {
    return log(x) / log(base);
}
You can calculate arbitrary logarithms with log_b(x) = log_c(x) / log_c(b), where c is one of the more readily available bases such as 10 or e.
For your particular example, log_e(1.26825) = 0.237637997 and log_e(1.02) = 0.019802627. That's 12.000 (within the limits of my calculator's accuracy): 0.237637997 / 0.019802627 = 12.000326876.
In fact, 1.02^12 is actually 1.268241795 and, if you use that value, you get much closer to 12:
log_e(1.268241795) = 0.237631528
log_e(1.02) = 0.019802627
0.237631528 / 0.019802627 = 12.000000197.
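The change-of-base arithmetic above can be checked with a few lines of Python. This is just a sketch for verification; the function name log_with_base is made up here, and math.log is the natural logarithm from the standard library.

```python
import math


def log_with_base(base, x):
    """Change of base: log_b(x) = ln(x) / ln(b)."""
    return math.log(x) / math.log(base)


if __name__ == "__main__":
    # The rounded input from the question gives a value slightly above 12.
    print(log_with_base(1.02, 1.26825))
    # Using 1.02 ** 12 itself gives 12 up to floating-point error.
    print(log_with_base(1.02, 1.02 ** 12))
```

The result is independent of the intermediate base c, so dividing base-10 logarithms (math.log10) instead gives the same answer up to rounding.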
Ray is right, but here is an Obj-C method version of it:
-(double) logWithBase:(double)base andNumber:(double)x {
    return log(x) / log(base);
}

LINQ statement where result count is used in expression's condition

O' LINQ-fu masters, please help.
I have a requirement where I have to add items into a List(Of T) (let's call it Target) from an IEnumerable(Of T) (let's call it Source) using Target.AddRange() in VB.NET.
Target.AddRange(Source.TakeWhile(Function(X, Index) ?))
The ? part is a tricky condition that is something like: as long as the count of items not yet enumerated is not equal to what is needed to fill the list to the minimum required, randomly decide whether the current item should be taken; otherwise take the item.
Something like...
Source.Count() - Index = _minimum_required - _curr_count_of_items_taken _
OrElse GetRandomNumberBetween1And100() <= _probability_this_item_is_taken
' _minimum_required and _probability_this_item_is_taken are constants
The confounding part is that _curr_count_of_items_taken needs to be incremented each time the TakeWhile statement is satisfied. How would I go about doing that?
I'm also open to a solution that uses any other LINQ methods (Aggregate, Where, etc.) instead of TakeWhile.
If all else fails then I will go back to using a good old for-loop =)
But hoping there is a LINQ solution. Thanks in advance for any suggestions.
EDIT: Good old for-loop version as requested:
Dim _source_total As Integer = Source.Count()
For _index As Integer = 0 To _source_total - 1
    If _source_total - _index = MinimumRows - Target.Count _
        OrElse NumberGenerator.GetRandomNumberBetween1And100 <= _possibility_item_is_taken Then
        Target.Add(Source(_index))
    End If
Next
EDIT 2:
David's no-side-effects answer comes closest to what I need while staying readable. Maybe he's the only one who could understand my poorly communicated pseudo-code =). The OrderBy(GetRandomNumber) is brilliant in hindsight. I just need to change the Take(3) part to Take(MinimumRequiredPlusAnOptionalRandomAmountExtra) and drop the OrderBy and Select at the end. Thanks to the rest for the suggestions.
You need to introduce a side-effect, basically.
In C# this is relatively easy: you can use a lambda expression which updates a captured variable. In VB this may still be possible, but I wouldn't like to guess at the syntax. I don't quite understand your condition (it sounds a little backwards), but the C# would be something like:
int count = 0;
var query = source.TakeWhile(x => count < minimumRequired ||
                                  rng.Next(100) < probability)
                  .Select(x => { count++; return x; });
target.AddRange(query);
The count will be incremented each time an item is actually taken.
Note that I suspect you actually want Where instead of TakeWhile - otherwise the first time the rng gives a high number, the sequence will end.
EDIT: If you can't use side-effects directly you may be able to use a horrible hack. I haven't tried this, but...
public static T Increment<T>(ref int counter, T value)
{
    counter++;
    return value;
}
...
int count = 0;
var query = source.TakeWhile(x => count < minimumRequired ||
                                  rng.Next(100) < probability)
                  .Select(x => Increment(ref count, x));
target.AddRange(query);
In other words, you put the side-effect into a separate method, and call the method using pass-by-reference for the counter. No idea if it would work in VB, but possibly worth a try. On the other hand, a loop might be simpler...
As a completely different way of approaching it, is your source an in-memory collection already, which you can iterate through cheaply? If so, just use:
var query = Enumerable.Concat(source.Take(minimumRequired),
                              source.Skip(minimumRequired)
                                    .TakeWhile(condition));
In other words, definitely grab the first n elements, and then start again, skip the first n elements and take the rest based on the condition.
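As a rough Python equivalent of that Concat/Take/Skip/TakeWhile combination (a sketch only; the function name take_minimum_then_while and the sample condition are assumptions for illustration):

```python
from itertools import chain, islice, takewhile


def take_minimum_then_while(source, minimum_required, condition):
    """Always take the first `minimum_required` items, then keep taking while `condition` holds."""
    items = list(source)  # materialize so the sequence can be walked twice cheaply
    head = islice(items, minimum_required)
    tail = takewhile(condition, islice(items, minimum_required, None))
    return list(chain(head, tail))


if __name__ == "__main__":
    # The first two items are taken unconditionally; the rest stop at the first failure.
    print(take_minimum_then_while([9, 8, 1, 2, 7, 3], 2, lambda x: x < 5))
```

As with TakeWhile in LINQ, the tail stops at the first item that fails the condition, so later items that would pass are never reached.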
If your task is to extract 3 random images from a collection of 50 random images, this works great.
target.AddRange( source.OrderBy(GetRandomNumber).Take(3) );
If you require order preservation, that's not too hard to add:
target.AddRange( source
    .Select( (x, i) => new {x, i})
    .OrderBy(GetRandomNumber)
    .Take(3)
    .OrderBy( z => z.i)
    .Select( z => z.x)
);
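The order-preserving variant follows the same steps in any language: pair each item with its index, order by a random key, take the first few, then re-sort by index. A minimal Python sketch, assuming a hypothetical helper name sample_preserving_order and an optional rng parameter so the randomness can be seeded:

```python
import random


def sample_preserving_order(source, count, rng=random):
    """Pick `count` random items but return them in their original order."""
    indexed = list(enumerate(source))                       # (index, item) pairs
    shuffled = sorted(indexed, key=lambda _: rng.random())  # order by a random key
    taken = shuffled[:count]                                # take the first `count`
    taken.sort(key=lambda pair: pair[0])                    # restore original order
    return [item for _, item in taken]


if __name__ == "__main__":
    print(sample_preserving_order(range(50), 3))
```

If the source has fewer than count items, the slice simply returns everything, which matches how Take behaves in LINQ.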
If requirements are to (for whatever reason)
favor items at the end of the list
allow more items through than requested (5 instead of 3, but only sometimes)
then I'd write the foreach loop.