Numpy array typecasting

I have the following Python code:
a = np.array([1, 2, '3'])
print(a)
output:
['1' '2' '3']
My question is: why are all elements converted into strings?
I know that if a numpy array consists of elements of different types, they will be typecast. But on what basis will they be typecast?

This is fairly well explained in the numpy.array documentation (highlighting is mine):
numpy.array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0, like=None)
[…]
dtype: data-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.
An integer can always be converted to a string, but the other way around is not always possible (e.g., 'a' cannot be converted to an integer).
The same happens if you mix floats and integers: the array will be cast to float.
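A quick check of the promotion in both cases (a minimal sketch; the exact string width may vary by platform):
import numpy as np
a = np.array([1, 2, '3'])
print(a.dtype)  # <U21 -- a fixed-width unicode string dtype
b = np.array([1, 2, 3.0])
print(b.dtype)  # float64 -- the integers were promoted to float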

Why are numpy arrays called homogeneous?

Why are numpy arrays called homogeneous when you can have elements of different types in the same numpy array, like this?
np.array([1,2,3,4,"a"])
I understand that I cannot perform some types of broadcasting operations; for example, I cannot perform
np1 * 4 here (where np1 is the array above), and it results in an error.
But my question really is: when it can have elements of different types, why is it called homogeneous?
Numpy automatically converts the elements to the most applicable datatype.
e.g.,
>>> np.array([1,2,3,4,"a"]).dtype.type
numpy.str_
In short, this means all elements are strings.
>>> np.array([1,2,3,4]).dtype.type
numpy.int64
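In other words, the array is homogeneous because every element is stored with the same dtype; the integers were converted to strings at construction time. A minimal check (the exact string width and repr may vary with the NumPy version):
>>> arr = np.array([1, 2, 3, 4, "a"])
>>> arr.dtype
dtype('<U21')
>>> type(arr[0])
<class 'numpy.str_'>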

How to compare numpy arrays of tuples?

Here's an MWE that illustrates the issue I have:
import numpy as np
arr = np.full((3, 3), -1, dtype="i,i")
doesnt_work = arr == (-1, -1)
n_arr = np.full((3, 3), -1, dtype=int)
works = n_arr == 10
arr is supposed to be an array of tuples, but it doesn't behave as expected.
works is an array of booleans, as expected, but doesnt_work is False. Is there a way to get numpy to do elementwise comparisons on more complex types, or do I have to resort to list comprehension, flatten and reshape?
There's a second problem:
f = arr[(0, 0)] == (-1, -1)
f is False, because arr[(0,0)] is of type numpy.void rather than a tuple. So even if the componentwise comparison worked, it would give the wrong result. Is there a clever numpy way to do this or should I just resort to list comprehension?
Both problems are actually the same problem, and both are related to the custom data type you created when you specified dtype="i,i".
If you run arr.dtype you will get dtype([('f0', '<i4'), ('f1', '<i4')]). That is, two signed integers placed in one contiguous block of memory. This is not a Python tuple. Thus it is clear why the naive comparison fails: (-1, -1) is a Python tuple and is not represented in memory the same way that the numpy data type is.
However, if you compare with a_comp = np.array((-1, -1), dtype="i,i"), you get the exact behavior you are expecting!
You can read more about how the custom dtype stuff works on the numpy docs:
https://numpy.org/doc/stable/reference/arrays.dtypes.html
Oh, and to address what np.void is: the name comes from the idea of a void C pointer, which is essentially an address to a contiguous block of memory of unspecified type. As long as you (the programmer) know what is stored in that memory (in this case, two back-to-back integers), it's fine, provided you are careful (compare with the same custom data type).
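Putting that together, a minimal sketch of the working comparison (output shown as expected when both sides share the dtype):
>>> arr = np.full((3, 3), -1, dtype="i,i")
>>> a_comp = np.array((-1, -1), dtype="i,i")
>>> (arr == a_comp).all()
True
>>> arr[0, 0] == a_comp
True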

Which float precision do numpy arrays have by default?

I wonder what format floats have in a numpy array by default
(and do they even get converted when declaring an np.array? If so, what about Python lists?),
e.g. float16, float32 or float64?
float64. You can check it like this:
>>> np.array([1, 2]).dtype
dtype('int64')
>>> np.array([1., 2]).dtype
dtype('float64')
If you don't specify the data type when you create the array, then numpy will infer the type. From the docs:
dtype: data-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.
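And if you need a different precision, you can request it explicitly (a minimal sketch):
>>> np.array([1., 2], dtype=np.float32).dtype
dtype('float32')
>>> np.array([1., 2], dtype=np.float16).dtype
dtype('float16')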

Evaluate several elements of numpy object array

I have an ndarray A that stores objects of the same type, in particular various LinearNDInterpolator objects. For example's sake, assume there are just 2:
>>> A
array([ <scipy.interpolate.interpnd.LinearNDInterpolator object at 0x7fe122adc750>,
<scipy.interpolate.interpnd.LinearNDInterpolator object at 0x7fe11daee590>], dtype=object)
I want to be able to do two things. First, I'd like to evaluate all objects in A at a certain point and get back an ndarray of A.shape with all the values in it. Something like
>> A[[0,1]](1,1) =
array([ 1, 2])
However, I get
TypeError: 'numpy.ndarray' object is not callable
Is it possible to do that?
Second, I would like to change the interpolation values without constructing new LinearNDInterpolator objects (since the nodes stay the same). I.e., something like
A[[0,1]].values = B
where B is an ndarray containing the new values for every element of A.
Thank you for your suggestions.
The same issue, but with simpler functions (add and multiply here being numpy's np.add and np.multiply):
In [221]: A=np.array([add,multiply])
In [222]: A[0](1,2) # individual elements can be called
Out[222]: 3
In [223]: A(1,2) # but not the array as a whole
---------------------------------------------------------------------------
TypeError: 'numpy.ndarray' object is not callable
We can iterate over a list of functions, or over that array as well, calling each element on the parameters. Done right, we can even zip a list of functions with a list of parameters, as sketched after the session below.
In [224]: ll=[add,multiply]
In [225]: [x(1,2) for x in ll]
Out[225]: [3, 2]
In [226]: [x(1,2) for x in A]
Out[226]: [3, 2]
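For instance, a sketch of the zip variant (the parameter list is just an assumed example):
params = [(1, 2), (3, 4)]
[f(*p) for f, p in zip(A, params)]  # -> [3, 12], i.e. [add(1, 2), multiply(3, 4)]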
Another test, the callable() function:
In [229]: callable(A)
Out[229]: False
In [230]: callable(A[0])
Out[230]: True
Can you change the interpolation values for individual Interpolators? If so, just iterate through the list and do that.
In general, object dtype arrays function like lists. They contain the same kind of object pointers. Most operations require the same sort of iteration. Unless you need to organize the elements in multiple dimensions, object dtype arrays have few, if any, advantages over lists.
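Putting that iteration together, a minimal sketch (call_all is a hypothetical helper, not a numpy function) that evaluates every callable in an object array at the same arguments and returns the results with the array's shape:
import numpy as np

def call_all(arr, *args):
    # Call each object in the array with the same arguments,
    # then reshape the results to match the input array.
    return np.array([f(*args) for f in arr.flat]).reshape(arr.shape)

A = np.array([np.add, np.multiply], dtype=object)
call_all(A, 1, 2)  # -> array([3, 2])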
Another thought: the normal array dtypes are numeric or fixed-length strings. Those elements are not callable, so there's no need to implement a .__call__ method on such arrays. Something like that could be written for object dtype arrays, but the core action would still be a Python call, so such a function would just hide the kind of iteration that I outlined.
In another recent question I showed how to use np.char.upper to apply a string method to every element of an S dtype array, but my time tests showed that this did not speed anything up.

numpy concatenate of empty with non-empty arrays yields float

I just found that concatenating an empty array with a non-empty array yields a one-value array containing the non-empty array's contents, but changed to float.
for example:
import numpy as np
np.concatenate(([1], [1]))
array([1, 1])
but
np.concatenate(([], [1]))
array([1.])
This works the same with np.hstack.
By default, the empty array in the code
np.concatenate(([], [1]))
is initialized with dtype=float, and concatenate casts the second, int array to float.
Now, it's worth asking whether it ever happens that you use concatenate on empty arrays. Clearly, you never write code like
a = np.array([1, 2, 3])  # int array
b = np.concatenate(([], a))
One scenario where it may happen follows:
a = np.array([1, 2, 3])  # int array
b = np.concatenate((a[:j], a))  # usually j != 0 here
Then for some reason the code is run with j=0. It is true that a[:0] is empty, but it still retains dtype=int, and the result of concatenate is an integer array anyway, as you expected.
So I would say that yes, your example shows somewhat unexpected behaviour at first sight, but it's quite harmless.
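If this ever does bite you, giving the empty array an explicit dtype avoids the upcast (a minimal sketch, shown on a platform where the default integer is int64):
np.array([]).dtype
dtype('float64')
np.concatenate((np.array([], dtype=int), [1])).dtype
dtype('int64')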