Conversion of a Python variable to a real*16 Fortran one

I'm writing a code in Python that calls some subroutines written in Fortran. When the variables are defined in Fortran as:
real*8, intent(in) :: var1,var2
and, correspondingly, in Python,
var1 = 1.0
var2 = 1.0
everything is fine. But if I define an extended real, that is:
real*16, intent(in) :: var1,var2
and in python use
import numpy as np
var1 = np.float16(2)
var2 = np.float16(2)
the variables take on strange values when passed to the Fortran routine. Can anyone see what I'm doing wrong?

This numpy-discussion thread from last year indicates that numpy's quadruple precision varies from machine to machine. My guess is that your bunk data comes from the two languages disagreeing about what quad precision means. Keep in mind, too, that np.float16 is half precision (16 bits), while Fortran's real*16 is 16 bytes (128 bits), so those two types cannot match at all.
Note also that f2py really only understands <type>(kind=<precision>) where <type> is REAL/INTEGER/COMPLEX and <precision> is an integer 1, 2, 4, 8 (cf. the FAQ).
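If you want to see what your numpy build actually provides, a short check along these lines (this snippet is mine, not from the thread) illustrates both points:
import numpy as np

# np.float16 is IEEE half precision: 16 bits, not 16 bytes.
print(np.finfo(np.float16).eps)     # ~0.000977, nowhere near real*16

# np.longdouble is whatever the platform's C long double happens to be:
# 80-bit extended on most x86 Linux builds, plain float64 on Windows.
print(np.finfo(np.longdouble).eps)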

Related

How to make an np.array in numba with input-dependent rank?

I would like to @numba.njit this simple function that returns an array with a shape, in particular a rank, that depends on the input i:
E.g. for i = 4 the shape should be shape=(2, 2, 2, 2, 4)
import numpy as np
from numba import njit
@njit
def make_array_numba(i):
    shape = np.array([2] * i + [i], dtype=np.int64)
    return np.empty(shape, dtype=np.int64)

make_array_numba(4).shape
I tried many different ways, but always fail at the fact that I can't generate the shape tuple that numba wants to see in np.empty / np.reshape / np.zeros /...
In plain numpy one can pass lists / np.arrays as the shape, or generate a tuple on the fly such as (2,) * i + (i,).
Output:
>>> empty(array(int64, 1d, C), dtype=class(int64))
There are 4 candidate implementations:
- Of which 4 did not match due to:
Overload in function '_OverloadWrapper._build.<locals>.ol_generated': File: numba/core/overload_glue.py: Line 131.
With argument(s): '(array(int64, 1d, C), dtype=class(int64))':
Rejected as the implementation raised a specific error:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<intrinsic stub>) found for signature:
>>> stub(array(int64, 1d, C), class(int64))
There are 2 candidate implementations:
- Of which 2 did not match due to:
Intrinsic of function 'stub': File: numba/core/overload_glue.py: Line 35.
With argument(s): '(array(int64, 1d, C), class(int64))':
No match.
This is not possible with @njit alone. The reason is that Numba needs to fix the type of the array independently of any variable values so that it can compile the function before executing it, and the number of dimensions of an array is part of its type. Thus, here, Numba cannot determine the type of the array, since it depends on a value that is not a compile-time constant.
The only way to solve this problem (assuming you do not want to linearize your array) is to recompile the function for each possible i, which is certainly overkill and completely defeats the benefit of using Numba (at least in your example). Note that @generated_jit can be used when you really want to recompile the function for different values or input types, but I strongly advise you not to use it for your current use case. If you try, you will hit other similar issues, because the array cannot be indexed with a runtime-defined number of indices, and the resulting code will quickly become unmanageable.
A more general and cleaner solution is simply to linearize the array. This means flattening it and performing the fancy indexing computation yourself, like (((... + z) * stride_z) + y) * stride_y + x. The size and the indices can then be computed at runtime, independently of the typing system; a sketch follows below. Note that such indexing can be quite slow, but Numpy would not be faster for this access pattern either.
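A minimal sketch of that linearization for the logical shape (2,)*i + (i,), with helper names of my own choosing (not from the original answer):
import numpy as np
from numba import njit

@njit
def make_flat_array(i):
    # Logical shape is (2,)*i + (i,); stored as one flat 1D buffer.
    n = 2 ** i * i
    return np.empty(n, dtype=np.int64)

@njit
def flat_index(multi_index, i):
    # Row-major offset of (d_0, ..., d_{i-1}, k):
    # ((...(d_0 * 2 + d_1) * 2 + ...) * 2 + d_{i-1}) * i + k
    off = 0
    for d in multi_index[:-1]:
        off = off * 2 + d
    return off * i + multi_index[-1]

a = make_flat_array(4)
a[flat_index(np.array([1, 0, 1, 1, 2], dtype=np.int64), 4)] = 42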

PyCharm ignores "integ()" from numpy.polynomial's Polynomial / on Jupyter it works

When using PyCharm, the integ() call seems to be ignored:
from numpy.polynomial import Polynomial as P
p = P([1, 2, 3])
p.integ()
print(p)
outcome: 1.0 + 2.0 x**1 + 3.0 x**2 (no errors)
on Jupyter
it gives me the correct result: x ↦ 0.0 + 1.0 x + 1.0 x² + 1.0 x³
but I really prefer writing code in PyCharm. Can anyone tell me why this happens or how I could change it?
First, note that p.integ() doesn't change p. It returns a new polynomial object. When you execute print(p) after this expression, you are printing the original p that was created earlier.
In an interactive shell, when a line such as p.integ() contains a bare expression (with no assignment), the shell (i.e. Jupyter) prints the value of the expression. This is a feature of interactive shells, not of programs run by the Python interpreter: when such an expression is encountered in a program, the interpreter evaluates it but does not print it. If you want to print the integral of p, you can do something like
q = p.integ()
print(q)
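Putting it together, the original snippet becomes:
from numpy.polynomial import Polynomial as P

p = P([1, 2, 3])
q = p.integ()   # returns a new Polynomial; p itself is unchanged
print(p)        # 1.0 + 2.0 x + 3.0 x**2  (exact formatting varies by numpy version)
print(q)        # 0.0 + 1.0 x + 1.0 x**2 + 1.0 x**3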

Object arrays not supported on numpy with mkl?

I recently switched from numpy compiled with OpenBLAS to numpy compiled with MKL. In pure numeric operations there was a clear speed-up for matrix multiplication. However, when I ran some code I have been using that multiplies matrices containing sympy variables, I now get the error
'Object arrays are not currently supported'
Does anyone have information on why this is the case for MKL and not for OpenBLAS?
Release notes for 1.17.0
Support of object arrays in matmul
It is now possible to use matmul (or the @ operator) with object arrays. For instance, it is now possible to do:
from fractions import Fraction
a = np.array([[Fraction(1, 2), Fraction(1, 3)], [Fraction(1, 3), Fraction(1, 2)]])
b = a @ a
Are you using @ (matmul or dot)? A numpy array containing sympy objects will be object dtype. Math on object arrays depends on delegating the action to the objects' own methods; it cannot be performed by the fast compiled libraries, which only work with C types such as float and double.
As a general rule, you should not be trying to mix numpy and sympy: math is hit-or-miss, and never fast. Use sympy's own Matrix module, or lambdify the sympy expressions for numeric work.
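For instance, a small sketch of the lambdify route (the symbols and matrix here are just illustrative):
import sympy as sp

x, y = sp.symbols('x y')
M = sp.Matrix([[x, y], [y, x]])

# Symbolic work stays inside sympy:
print((M * M)[0, 0])                # x**2 + y**2

# For numbers, lambdify compiles the matrix into a function returning
# a plain float64 numpy array, so @ can take the fast BLAS path:
f = sp.lambdify((x, y), M, modules='numpy')
a = f(0.5, 0.25)
print(a @ a)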
What's the MKL version? You may have to explore this with the creator of that compilation.

Tensorflow: How does tf.get_variable work?

I have read about tf.get_variable from this question and also a bit from the documentation available at the tensorflow website. However, I am still not clear and was unable to find an answer online.
How does tf.get_variable work? For example:
var1 = tf.Variable(3., dtype=tf.float64)
var2 = tf.get_variable("var1", [], dtype=tf.float64)
Does it mean that var2 is another variable with initialization similar to var1? Or is var2 an alias for var1? (I tried it, and it doesn't seem to be.)
How are var1 and var2 related?
How is a variable constructed when the variable we are getting doesn't really exist?
tf.get_variable(name) creates a new variable called name in the tensorflow graph (or one with a suffix such as _1 if that name is already taken in the current scope).
In your example, you're creating a python variable called var1.
The name of that variable in the tensorflow graph is not var1, but Variable:0.
Every node you define has its own name, which you can either specify or let tensorflow assign a default (and always different) one. You can see the name by accessing the name property of the python variable (i.e. print(var1.name)).
On your second line, you're defining a Python variable var2 whose name in the tensorflow graph is var1.
The script
import tensorflow as tf
var1 = tf.Variable(3.,dtype=tf.float64)
print(var1.name)
var2 = tf.get_variable("var1",[],dtype=tf.float64)
print(var2.name)
In fact prints:
Variable:0
var1:0
If you, instead, want to define a variable (node) called var1 in the tensorflow graph and then get a reference to that node, you cannot simply use tf.get_variable("var1"), because it will create a new, different variable called var1_1.
This script
var1 = tf.Variable(3.,dtype=tf.float64, name="var1")
print(var1.name)
var2 = tf.get_variable("var1",[],dtype=tf.float64)
print(var2.name)
prints:
var1:0
var1_1:0
If you want to create a reference to the node var1, you have to:
Replace tf.Variable with tf.get_variable: variables created with tf.Variable can't be shared, while those created with tf.get_variable can.
Know what the scope of var1 is and allow the reuse of that scope when declaring the reference.
Looking at the code is the best way to understand:
import tensorflow as tf
#var1 = tf.Variable(3.,dtype=tf.float64, name="var1")
var1 = tf.get_variable(initializer=tf.constant_initializer(3.), dtype=tf.float64, name="var1", shape=())
current_scope = tf.contrib.framework.get_name_scope()
print(var1.name)
with tf.variable_scope(current_scope, reuse=True):
    var2 = tf.get_variable("var1", [], dtype=tf.float64)
    print(var2.name)
outputs:
var1:0
var1:0
If you define a variable with tf.get_variable() using a name that has already been defined, in a scope that does not allow reuse, TensorFlow throws an exception. Hence, it is convenient to use tf.get_variable() instead of tf.Variable(): with reuse enabled, tf.get_variable() returns the existing variable with the same name if it exists, and otherwise creates a variable with the specified shape and initializer.
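A minimal sketch of that reuse behaviour, using the same TF1-era API as the answers above (the variable name "w" is illustrative):
import tensorflow as tf

v = tf.get_variable("w", shape=(), initializer=tf.zeros_initializer())
# Calling tf.get_variable("w", shape=()) again right here would raise a
# ValueError, because "w" already exists and reuse is not enabled.

with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    v2 = tf.get_variable("w")   # returns the existing variable instead

print(v is v2)   # True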

Don't contract x*x*x to pow(x, 3) in sympy's `printing.ccode` method

I have a sympy equation that I need to translate to CUDA.
In its default configuration, sympy.printing.ccode will transform the expression x*x into pow(x, 2), which CUDA unfortunately handles a bit strangely (e.g. pow(0.1, 2) is 0 according to CUDA).
I would prefer sympy.printing.ccode to leave these kinds of expressions unaltered or, put another way, I would like it to expand any instance of pow with an integer exponent into a simple product, e.g. pow(x, 4) would become x*x*x*x. Does anyone know how to make this happen?
This should do it:
>>> import sympy as sp
>>> from sympy.utilities.codegen import CCodePrinter
>>> print(sp.__version__)
0.7.6.1
>>> x = sp.Symbol('x')
>>> CCodePrinter().doprint(x*x*x)
'pow(x, 3)'
>>> class MyCCodePrinter(CCodePrinter):
...     def _print_Pow(self, expr):
...         if expr.exp.is_integer and expr.exp.is_number:
...             return '(' + '*'.join([self._print(expr.base)] * int(expr.exp)) + ')'
...         else:
...             return super(MyCCodePrinter, self)._print_Pow(expr)
...
>>> MyCCodePrinter().doprint(x*x*x)
'(x*x*x)'
Note that this was a proposed change (with a restriction on the size of the exponent) for a while. Back then the motivation was performance of regular C code, but flags such as -ffast-math made that point moot. However, if this is something that is useful for CUDA code, we should definitely support the behaviour through a setting; feel free to open an issue for it if you think it is needed.
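As a side note, later sympy versions do ship something along these lines; if I remember the API correctly, sympy.codegen.rewriting.create_expand_pow_optimization gives this behaviour without a custom printer:
import sympy as sp
from sympy.codegen.rewriting import create_expand_pow_optimization

x = sp.Symbol('x')
expand_pow = create_expand_pow_optimization(4)   # unroll integer powers up to 4
print(sp.ccode(expand_pow(x**3)))   # x*x*x
print(sp.ccode(expand_pow(x**5)))   # pow(x, 5): above the limit, left as pow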