range() as method argument for loop iterator - iterator

this code generates a nice loop :
for i in range(self.ix):
__pyx_t_3 = __pyx_v_self->ix;
__pyx_t_4 = __pyx_t_3;
for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) {
__pyx_v_i = __pyx_t_5;
but i want to be able to pass this iterator as argument (so i can have different iterators/generators)
if i try this :
cdef ixrange(self): return range(self.ix)
already is too unwieldy and the loop is no longer a simple loop. :
static PyObject *__pyx_f_3lib_10sparse_ary_9SparseAry_ixrange(struct __pyx_obj_3lib_10sparse_ary_SparseAry *__pyx_v_self) {
PyObject *__pyx_r = NULL;
__Pyx_RefNannyDeclarations
__Pyx_RefNannySetupContext("ixrange", 0);
__Pyx_XDECREF(__pyx_r);
__pyx_t_1 = __Pyx_PyInt_From_unsigned_int(__pyx_v_self->ix); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 139, __pyx_L1_error)
__Pyx_GOTREF(__pyx_t_1);
__pyx_t_2 = __Pyx_PyObject_CallOneArg(__pyx_builtin_range, __pyx_t_1); if (unlikely(!__pyx_t_2)) __PYX_ERR(0, 139, __pyx_L1_error)
__Pyx_GOTREF(__pyx_t_2);
__Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
__pyx_r = __pyx_t_2;
__pyx_t_2 = 0;
goto __pyx_L0;
/* function exit code */
__pyx_L1_error:;
__Pyx_XDECREF(__pyx_t_1);
__Pyx_XDECREF(__pyx_t_2);
__Pyx_AddTraceback("lib.sparse_ary.SparseAry.ixrange", __pyx_clineno, __pyx_lineno, __pyx_filename);
__pyx_r = 0;
__pyx_L0:;
__Pyx_XGIVEREF(__pyx_r);
__Pyx_RefNannyFinishContext();
return __pyx_r;
}
any way i can make it simple loop again ?

I don't think it is possible.
The effect of for i in range(self.ix): is to generate a range object, and then call next on that range object until a StopIteration exception is raised. That's also the behaviour that Cython takes for general for loops.
There's actually a fairly long list of conditions that have to be met for Cython to transform that into an optimized loop:
range must be used directly - i.e. it = range(...); for i in it won't work because you might need the range object after the loop.
the types of the start and stop point must be integers.
the type of the loop variable must be an integer (Cython can infer this, but if you assign something of a different type to the loop variable then it won't work)
Cython must know that you haven't reassigned range
The sign of the stop variable must be known.
In your case I think you're hoping for Cython to look inside the cdef function and pick out the range. This is beyond what the current optimizer can do. But also, you can override cdef functions in a derived class (in C++ terms, they're "virtual"), so Cython could almost never safely make the optimization.
Cython is usually able to optimize best when you really restrict the types it's working with. The request that it should generate a fast C for loop but also be able to handle arbitrary iterators does not seem realistically achievable.

Related

Why can't I iterate after an assignment in Raku?

Given the following code, it seems that I cannot iterate over a Buf if it had been assigned to a variable, unless I cast it to a list, even though it's not a lazy sequence. What gives?
my $file = open $path, bin => True;
$_.chr.say for $file.read: 8; # works well
my $test = $file.read: 8;
$_.chr.say for $test; # fails with "No such method 'chr' for invocant of type 'Buf[uint8]'"
$_.chr.say for $test.list; # works well
$test.is-lazy.say; # False
The reason it fails, is that:
my $test = $file.read: 8;
puts the Buf that is returned by $file.read into a Scalar variable, aka inside a container. And containerized is interpreted by for as itemized, to be considered a single item. So with:
.chr.say for $test;
you're calling the .chr method on the whole Buf, rather than on the individual elements.
There are a number of solutions to this:
make sure there's no container:
my $test := $file.read: 8;
This makes sure there is no container by binding the Buf.
make it look like an array
my #test := $file.read: 8;
Same as 1 basically, make the #test be an alias for the Buf. Note that this should also use binding, otherwise you'll get the same effect as you saw.
make it work like an Iterable
.chr.say for #$test;
By prefixing the # you're telling to iterate over it. This is basically syntactic sugar for the $test.list workaround you already found.
Re the $test.is-lazy.say, that is False for just about anything, e.g. 42.is-lazy.say; # False, so that doesn't tell you very much :-)

How does one use Tensorflow's OpOutputList?

On the github, an OpOutputList is initialized like so:
OpOutputList outputs;
OP_REQUIRES_OK(context, context->output_list("output",&outputs));
And tensors are added like this:
Tensor* tensor0 = nullptr;
Tensor* tensor1 = nullptr;
long long int sz0 = 3;
long long int sz1 = 4;
...
OP_REQUIRES_OK(context, outputs.allocate(0, TensorShape({sz0}), &tensor0));
OP_REQUIRES_OK(context, outputs.allocate(1, TensorShape({sz1}), &tensor1));
I'm assuming that OpOutputList is like OpInputList in that jagged arrays are allowed.
My question is, how does OpOutputList work? Sometimes I get segfaults where I can't access the first index when I use Eigen::Tensor::flat() but because I don't understand how allocation works I can't pinpoint the error.
Many thanks.
OpOutputList object itself is a very simple value object containing just two integers - the start and end indices of the op outputs that are contained in this list. Being simple value objects, you generally just create them on the stack, no "allocation" required.
You allocate the tensors that logically belong to an OpOutputList just like any other tensor. Generally using allocate_output(). Here is the implementation of OpOutputList::allocate:
Status OpOutputList::allocate(int i, const TensorShape& shape,
Tensor** output) {
DCHECK_GE(i, 0);
DCHECK_LT(i, stop_ - start_);
return ctx_->allocate_output(start_ + i, shape, output);
}
As you can see it just checks that the index i is indeed within this OpOutputList and call allocate_output.

Eigen:Avoid using for loop

In the below code instead of using for loop I wanted to implement a one line code that would use Eigen library functions and help in vectorisation of code itself and thus making parallelization through OpenMP easy.
Eigen::VectorXd get_vector(int n, int j , int start){
Eigen::VectorXd foo(n);
indices = Eigen::VectorXd::LinSpaced(n, start + n - 1, start).array();
for(int i =0;i<indices.size();i++)
foo(i) = (array(indices(i)) - array(j))*(array(indices(i)) - array(j));
return foo;
}
// array is globally declared as Eigen::VectorXd and have length greater than n, it is already been defined.(set of N(>n) random double numbers)
Assuming array is an VectorXd and you don't need indices outside your function:
return (array.segment(start, n).array() - array(j)).square();
And you should consider returning a ArrayXd instead of VectorXd.
If array is actually a ArrayXd, you can omit the .array().

Is it possible to init a variable in the while condition body for Kotlin?

In the code below:
var verticesCount: Int // to read a vertices count for graph
// Reading until we get a valid vertices count.
while (!Assertions.checkEnoughVertices(
verticesCount = consoleReader.readInt(null, Localization.getLocStr("type_int_vertices_count"))))
// The case when we don't have enough vertices.
println(String.format(Localization.getLocStr("no_enough_vertices_in_graph"),
Assertions.CONFIG_MIN_VERTICES_COUNT))
val resultGraph = Graph(verticesCount)
we are getting next error on the last line:
Error:(31, 33) Kotlin: Variable 'verticesCount' must be initialized
Assertions.checkEnoughVertices accepts a safe type variable as an argument (verticesCount: Int), so it's impossible for verticesCount to be uninitialized or null here (and we're getting no corresponding errors on those lines).
What's going on on the last line when already initialized variable becomes uninitialized again?
The syntax you've used denotes a function call with named arguments, not the assignment of a local variable. So verticesCount = is just an explanation to the reader that the value which is being passed here to checkEnoughVertices corresponds to the parameter of that function named verticesCount. It has nothing to do with the local variable named verticesCount declared just above, so the compiler thinks you've still to initialize that variable.
In Kotlin, the assignment to a variable (a = b) is not an expression, so it cannot be used as a value in other expressions. You have to split the assignment and the while-loop condition to achieve what you want. I'd do this with an infinite loop + a condition inside:
var verticesCount: Int
while (true) {
verticesCount = consoleReader.readInt(...)
if (Assertions.checkEnoughVertices(verticesCount)) break
...
}
val resultGraph = Graph(verticesCount)
Well, technically it is possible to assign values to variables in the while condition - and anything else you might want to do there, too.
The magic comes from the also function:
Try this: (excuse the completely useless thing this is doing...)
var i = 10
var doubleI: Int
while ((i * 2).also { doubleI = it } > 0) {
i--
println(doubleI)
}
Any expression can be "extended" with "something to do" by calling also which takes the expression it is called upon as the it parameter and executes the given block. The value also returns is identical to its caller value.
Here's a very good article to explain this and much more: https://medium.com/#elye.project/mastering-kotlin-standard-functions-run-with-let-also-and-apply-9cd334b0ef84

How do I write a generic memoize function?

I'm writing a function to find triangle numbers and the natural way to write it is recursively:
function triangle (x)
if x == 0 then return 0 end
return x+triangle(x-1)
end
But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.
Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:
triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]
That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.
Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:
fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html
I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!
You're also asking the wrong question for your original problem ;)
This is a better way for that case:
triangle(n) = n * (n - 1) / 2
Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.
I bet something like this should work with variable argument lists in Lua:
local function varg_tostring(...)
local s = select(1, ...)
for n = 2, select('#', ...) do
s = s..","..select(n,...)
end
return s
end
local function memoize(f)
local cache = {}
return function (...)
local al = varg_tostring(...)
if cache[al] then
return cache[al]
else
local y = f(...)
cache[al] = y
return y
end
end
end
You could probably also do something clever with a metatables with __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.
In C# 3.0 - for recursive functions, you can do something like:
public static class Helpers
{
public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>, R> f)
{
var map = new Dictionary<A, R>();
Func<A, R> self = null;
self = (a) =>
{
R value;
if (map.TryGetValue(a, out value))
return value;
value = f(a, self);
map.Add(a, value);
return value;
};
return self;
}
}
Then you can create a memoized Fibonacci function like this:
var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
In Scala (untested):
def memoize[A, B](f: (A)=>B) = {
var cache = Map[A, B]()
{ x: A =>
if (cache contains x) cache(x) else {
val back = f(x)
cache += (x -> back)
back
}
}
}
Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:
val correctMem = memoize(memoize _)
I don't think that this will compile, but it does illustrate the idea.
Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.
I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.
Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.
If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.
function memoize (f)
local cache = {}
return function (x)
if cache[x] then
return cache[x]
else
local y = f(x)
cache[x] = y
return y
end
end
end
triangle = memoize(triangle);
Note that to avoid a stack overflow, triangle would still need to be seeded.
Here's something that works without converting the arguments to strings.
The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.
local function m(f)
local t = { }
local function mf(x, ...) -- memoized f
assert(x ~= nil, 'nil passed to memoized function')
if select('#', ...) > 0 then
t[x] = t[x] or m(function(...) return f(x, ...) end)
return t[x](...)
else
t[x] = t[x] or f(x)
assert(t[x] ~= nil, 'memoized function returns nil')
return t[x]
end
end
return mf
end
I've been inspired by this question to implement (yet another) flexible memoize function in Lua.
https://github.com/kikito/memoize.lua
Main advantages:
Accepts a variable number of arguments
Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
Works just fine with functions that return multiple values.
Pasting the code here as reference:
local globalCache = {}
local function getFromCache(cache, args)
local node = cache
for i=1, #args do
if not node.children then return {} end
node = node.children[args[i]]
if not node then return {} end
end
return node.results
end
local function insertInCache(cache, args, results)
local arg
local node = cache
for i=1, #args do
arg = args[i]
node.children = node.children or {}
node.children[arg] = node.children[arg] or {}
node = node.children[arg]
end
node.results = results
end
-- public function
local function memoize(f)
globalCache[f] = { results = {} }
return function (...)
local results = getFromCache( globalCache[f], {...} )
if #results == 0 then
results = { f(...) }
insertInCache(globalCache[f], {...}, results)
end
return unpack(results)
end
end
return memoize
Here is a generic C# 3.0 implementation, if it could help :
public static class Memoization
{
public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
{
var cache = new Dictionary<T, TResult>();
var nullCache = default(TResult);
var isNullCacheSet = false;
return parameter =>
{
TResult value;
if (parameter == null && isNullCacheSet)
{
return nullCache;
}
if (parameter == null)
{
nullCache = function(parameter);
isNullCacheSet = true;
return nullCache;
}
if (cache.TryGetValue(parameter, out value))
{
return value;
}
value = function(parameter);
cache.Add(parameter, value);
return value;
};
}
}
(Quoted from a french blog article)
In the vein of posting memoization in different languages, i'd like to respond to #onebyone.livejournal.com with a non-language-changing C++ example.
First, a memoizer for single arg functions:
template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
template <class F>
const Result& operator()(F f, const Arg& a){
typename ResultStore::const_iterator it = memo_.find(a);
if(it == memo_.end()) {
it = memo_.insert(make_pair(a, f(a))).first;
}
return it->second;
}
private:
ResultStore memo_;
};
Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).
Next, a driver functon, and an implementation. only the driver function need be public
int fib(int); // driver
int fib_(int); // implementation
Implemented:
int fib_(int n){
++total_ops;
if(n == 0 || n == 1)
return 1;
else
return fib(n-1) + fib(n-2);
}
And the driver, to memoize
int fib(int n) {
static memoizer1<int,int> memo;
return memo(fib_, n);
}
Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)
This only memoizes one input functions. Generalizing for multiple args or varying arguments left as an exercise for the reader.
In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.
The example from it's manpage:
# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments); # Is faster than it was before
You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.
Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!
Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html
Extending the idea, it's also possible to memoize functions with two input parameters:
function memoize2 (f)
local cache = {}
return function (x, y)
if cache[x..','..y] then
return cache[x..','..y]
else
local z = f(x,y)
cache[x..','..y] = z
return z
end
end
end
Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.
But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.
function gcd (a, b)
if b == 0 then return a end
return gcd(b, a%b)
end
As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Ever for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.
Recursion isn't necessary. The nth triangle number is n(n-1)/2, so...
public int triangle(final int n){
return n * (n - 1) / 2;
}
Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.
int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
sum+=i;
memo[i] = sum;
}
return memo[n];