Can I use a Python float with float32 precision?
It will be serialized from a primitive Python float and deserialized into a primitive Java float (32-bit).
External libraries (e.g. numpy.float32) are not an option.
Python floats are 64-bit (a double in C/Java). However, you can serialize/deserialize them to 32-bit floats (a float in C/Java) using the struct module with the "f" format:
>>> import struct
>>> struct.pack("f", 0.1)
'\xcd\xcc\xcc='
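A minimal round trip (Python 3, where pack returns bytes): unpack hands you back an ordinary Python float, so the value is simply rounded to single precision rather than stored in a true 32-bit type. When the bytes are read on the Java side, an explicit byte order such as ">f" is usually what you want, since Java's DataInput reads big-endian:
>>> import struct
>>> packed = struct.pack(">f", 0.1)   # 4 bytes, IEEE 754 single precision, big-endian
>>> packed
b'=\xcc\xcc\xcd'
>>> struct.unpack(">f", packed)[0]    # back to a Python float (still a double internally)
0.10000000149011612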
Unpacking binary data in Python:
import struct
bytesarray = "01234567".encode('utf-8')
# Return a new Struct object which writes and reads binary data according to the format string.
s = struct.Struct('=BI3s')
fields = s.unpack(bytesarray)  # Output: (48, 875770417, b'567')
Does Raku have a similar function to Python's Struct? How can I unpack binary data according to a format string in Raku?
There's the experimental unpack
use experimental :pack;
my $bytearray = "01234567".encode('utf-8');
say $bytearray.unpack("A1 L H");
It's not exactly the same, though; this outputs "(0 875770417 35)". You can tweak your way through it a bit, maybe.
There's also an implementation of Perl's pack / unpack in P5pack.
I wonder which format floats take in a numpy array by default.
(Or do they even get converted when declaring an np.array? If so, what about Python lists?)
E.g. float16, float32 or float64?
float64. You can check it like this:
>>> np.array([1, 2]).dtype
dtype('int64')
>>> np.array([1., 2]).dtype
dtype('float64')
If you don't specify the data type when you create the array, then numpy will infer the type. From the docs:
dtype : data-type, optional - The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.
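If you want a specific width rather than the inferred one, pass dtype explicitly (a small sketch; the array values here are arbitrary):
>>> import numpy as np
>>> np.array([1., 2.], dtype=np.float32).dtype
dtype('float32')
>>> np.array([1., 2.], dtype=np.float16).dtype
dtype('float16')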
I recently switched from numpy compiled against OpenBLAS to numpy compiled against MKL. In pure numeric operations there was a clear speed-up for matrix multiplication. However, when I ran some code I have been using which multiplies matrices containing sympy variables, I now get the error
'Object arrays are not currently supported'
Does anyone have information on why this is the case for MKL and not for OpenBLAS?
Release notes for 1.17.0:
Support of object arrays in matmul
It is now possible to use matmul (or the @ operator) with object arrays. For instance, it is now possible to do:
from fractions import Fraction
a = np.array([[Fraction(1, 2), Fraction(1, 3)], [Fraction(1, 3), Fraction(1, 2)]])
b = a @ a
Are you using @ (matmul or dot)? A numpy array containing sympy objects will be object dtype. Math on object arrays depends on delegating the action to the objects' own methods; it cannot be performed by the fast compiled libraries, which only work with C types such as float and double.
As a general rule you should not be trying to mix numpy and sympy. Math is hit-or-miss, and never fast. Use sympy's own Matrix module, or lambdify the sympy expressions for numeric work (see the sketch below).
What's the MKL version? You may have to explore this with the creator of that compilation.
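A sketch of the lambdify route mentioned above (the symbols and the matrix here are made up; your actual expressions will differ):
import numpy as np
import sympy as sp

x, y = sp.symbols('x y')
M = sp.Matrix([[x, y], [y, x]])                 # symbolic 2x2 matrix
f = sp.lambdify((x, y), M, modules='numpy')     # compile to a plain numeric function

A = np.asarray(f(1.0, 2.0), dtype=np.float64)   # now an ordinary float64 array
B = A @ A                                       # matmul runs in the fast compiled path
print(B)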
I would like to pass a numpy array to Cython. The Cython C type should be float. Which numpy type do I have to choose? When I choose float or np.float, it is actually a C double.
You want np.float32. This is a 32-bit C float.
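You can verify the item sizes from Python itself (np.float is just an alias for the builtin float, i.e. a C double, and has been deprecated and later removed in newer NumPy releases):
>>> import numpy as np
>>> np.dtype(np.float32).itemsize   # 4 bytes: matches a C float
4
>>> np.dtype(np.float64).itemsize   # 8 bytes: a C double
8
>>> np.dtype(float).itemsize        # the builtin float is also a C double
8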
I have two computers with Python 2.7.2 [MSC v.1500 32 bit (Intel)] on win32 and numpy 1.6.1.
But
numpy.mean(data)
returns
1.13595094681 on my old computer
and
1.13595104218 on my new computer
where
data = [ 0.20227873 -0.02738848 0.59413314 0.88547146 1.26513398 1.21090782
1.62445402 1.80423951 1.58545554 1.26801944 1.22551131 1.16882968
1.19972098 1.41940248 1.75620842 1.28139281 0.91190684 0.83705413
1.19861531 1.30767155]
In both cases
s = 0
for n in data[:20]:
    s += n
print s / 20
gives
1.1359509334
Can anyone explain why, and how to avoid it?
Mads
If you want to avoid any differences between the two, then make them explicitly 32-bit or 64-bit float arrays. NumPy uses several other libraries that may be 32 or 64 bit. Note that rounding can occur in your print statements as well:
>>> import numpy as np
>>> a = [0.20227873, -0.02738848, 0.59413314, 0.88547146, 1.26513398,
1.21090782, 1.62445402, 1.80423951, 1.58545554, 1.26801944,
1.22551131, 1.16882968, 1.19972098, 1.41940248, 1.75620842,
1.28139281, 0.91190684, 0.83705413, 1.19861531, 1.30767155]
>>> x32 = np.array(a, np.float32)
>>> x64 = np.array(a, np.float64)
>>> x32.mean()
1.135951042175293
>>> x64.mean()
1.1359509335
>>> print x32.mean()
1.13595104218
>>> print x64.mean()
1.1359509335
Another point to note is that if you have lower-level libraries (e.g., ATLAS, LAPACK) that are multi-threaded, then for large arrays you may see differences in your results regardless, due to a possibly variable order of operations combined with limited floating-point precision.
Also, you are at the limit of precision for 32 bit numbers:
>>> x32.sum()
22.719021
>>> np.array(sorted(x32)).sum()
22.719019
This is happening because you have float32 arrays (single precision). Single precision is only accurate to about 6-7 significant decimal digits, so your results agree up to roughly the 6th decimal place (with the last digit rounded) but diverge beyond that. Different architectures/machines/compilers will yield different results past that point. If you want identical results you should use higher-precision arrays (e.g. float64).
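You can see the available precision directly from np.finfo (a quick check):
>>> import numpy as np
>>> np.finfo(np.float32).eps         # machine epsilon for single precision
1.1920929e-07
>>> np.finfo(np.float32).precision   # approximate number of significant decimal digits
6
>>> np.finfo(np.float64).precision
15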