Can anyone tell me what I am doing wrong? I created an integer array from a boolean array, but I still cannot use it as an index for a list:
import numpy as np

dataset = []
dataset.append({
    "a": "few",
    "b": "cd"
})
dataset.append({
    "a": "fe",
    "b": "c"
})
dataset.append({
    "a": "f",
    "b": "cwef"
})
split = 0.5
# generate boolean mask
msk = np.random.rand(len(dataset)) < split
print(msk)
# transform mask to int version
msk = np.where(msk)
print(msk)
# take only the first part of the tuple as an index mask
# ERROR: only integer scalar arrays can be converted to a scalar index
dataset_low = dataset[msk[0]]
dataset_high = dataset[~msk[0]]
That fancy-indexing magic only works with numpy arrays; dataset is a plain list. You could convert it to a numpy array (here you get an object array of dicts). It also only works if you don't use np.where and stick with the boolean array. Negating (~) the result of np.where does not make sense, since that result holds indices, not boolean values.
...
#msk = np.where(msk)
...
dataset = np.array(dataset)
dataset_low = dataset[msk]     # boolean mask selects directly
dataset_high = dataset[~msk]   # ~ works because msk stayed boolean
...
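Putting it together, a minimal runnable sketch of the whole split (assuming the three-entry dataset from the question):

import numpy as np

dataset = [{"a": "few", "b": "cd"}, {"a": "fe", "b": "c"}, {"a": "f", "b": "cwef"}]
split = 0.5

dataset = np.array(dataset)                   # object array of dicts
msk = np.random.rand(len(dataset)) < split    # boolean mask, no np.where needed
dataset_low = dataset[msk]                    # entries where the mask is True
dataset_high = dataset[~msk]                  # the complementary entries

If you really want integer indices instead, idx = np.where(msk)[0] gives them, but then the complement is np.setdiff1d(np.arange(len(dataset)), idx) rather than ~.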
Related
I have two arrays:
values_arr = [[100,1], [20,5], [40,50]...[50,30]]
images_arr = [img1, img2, img3,...imgn]
Both arrays are numpy arrays.
values_arr and images_arr are in the same order,
i.e. [100, 1] corresponds to img1.
How do I get the image given the value of the index?
index = [20,5]
In this case, I should get img2 for index = [20,5].
You can build a dict:
values_arr_tup = [tuple(i) for i in values_arr]
dict_ = {key: value for key, value in zip(values_arr_tup, images_arr)}
then look up dict_[tuple(index)] to get the image.
You can use np.where to extract the index of the item; compare against the whole row with .all(axis=1) so a single matching element elsewhere cannot give a wrong index:
images_arr[np.where((values_arr == [20, 5]).all(axis=1))[0][0]]
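For illustration, a small self-contained sketch of both lookups (the "img1" strings are hypothetical stand-ins for the real image arrays):

import numpy as np

values_arr = np.array([[100, 1], [20, 5], [40, 50]])
images_arr = np.array(["img1", "img2", "img3"])   # placeholders for real images
index = [20, 5]

# dict lookup: O(1) per query after building the mapping once
dict_ = {tuple(v): img for v, img in zip(values_arr, images_arr)}
print(dict_[tuple(index)])                        # img2

# np.where lookup: scans the whole array on every query
row = np.where((values_arr == index).all(axis=1))[0][0]
print(images_arr[row])                            # img2

The dict pays its conversion cost once, which wins if you do many lookups; np.where avoids the extra structure for a one-off query.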
I have a DataFrame with many String columns that should be Float64 instead. I would like to transform all of the columns at once and turn the DataFrame into a Float64 matrix. How can I do this? Importantly, there are some Float columns too.
using DataFrames
df = DataFrame(a=["1", "2", "3"], b=["1.1", "2.2", "3.3"], c=[0.1, 0.2, 0.3])
# Verbose option
df.a = parse.(Float64, df.a)
df.b = parse.(Float64, df.b)
matrix = Matrix{Float64}(df)
# Is it possible to do this all at once, especially when there are float columns too?
# Here parse.(Float64, df.c) would throw an error
One way of doing this is by looping over the String columns:
for c ∈ names(df, String)
    df[!, c] = parse.(Float64, df[!, c])
end
Note that you don't need Matrix{Float64} if you've already turned everything into Floats; plain Matrix(df) will do.
I had the same question, landed on this page, and found that the above code did not work for me; a slight change that made it work is:
for c ∈ names(df, Any)
    df[!, c] = Float64.(df[!, c])
end
Note that the Any in names(df, Any) can be narrowed to a more specific column type. Bear in mind that Float64.(col) converts values that are already numeric; it does not parse Strings, so keep parse.(Float64, col) for String columns.
I have a list of 2-D numpy arrays, and I wish to create a single array in which every position that is non-zero in any of the arrays (an element-wise OR) is set to 1. For example
arr1 = np.array([[1,0],[0,0]])
arr2 = np.array([[0,10],[0,0]])
arr3 = np.array([[0,0],[0,8]])
arrs = [arr1, arr2, arr3]
And so my op would yield
op(arrs) = [[1, 1], [0, 1]]
What is an efficient way to do this in numpy (for about 8 image arrays of 600 by 600)?
Took me a while to understand. Try summing all the arrays while keeping their dimensions, then replacing every non-zero value with 1, as follows:
def op(arrs):
    # sum element-wise across the list, then map non-zero sums to 1
    return np.where(np.add.reduce(arrs) != 0, 1, 0)
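One caveat, since the question asks for an OR: summing can produce a false 0 when positive and negative values cancel at the same position. If your arrays can contain negative values, an explicit element-wise OR avoids that; a minimal sketch (op_any is just an illustrative name):

import numpy as np

def op_any(arrs):
    # stack into one (n, h, w) array and test each position for any non-zero value
    return np.any(np.stack(arrs) != 0, axis=0).astype(int)

arrs = [np.array([[1, 0], [0, 0]]),
        np.array([[0, 10], [0, 0]]),
        np.array([[0, 0], [0, 8]])]
print(op_any(arrs))
# [[1 1]
#  [0 1]]

Both versions are vectorized, so 8 arrays of 600 by 600 are no problem either way.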
I have a numpy array, a:
a = np.array([[-21.78878256, 97.37484004, -11.54228119],
              [ -5.72592375, 99.04189958,   3.22814204],
              [-19.80795922, 95.99377136, -10.64537733]])
I have another array, b:
b = np.array([[ 54.64642121, 64.5172014,  44.39991983],
              [  9.62420892, 95.14361441,  0.67014312],
              [ 49.55036427, 66.25136632, 40.38778238]])
I want to extract the index of the minimum value in each row of b:
ixs = [[2],
       [2],
       [2]]
Then, I want to extract elements from the array a using the indices ixs.
The expected answer is:
result = [[-11.54228119]
          [  3.22814204]
          [-10.64537733]]
I tried:
ixs = np.argmin(b, axis=1)
print(ixs)
[2 2 2]
result = np.take(a, ixs)
print(result)
Nope!
Any ideas are welcome.
You can use
result = a[np.arange(a.shape[0]), ixs]
np.arange(a.shape[0]) generates a row index for each row, and ixs supplies the column index within that row, so result contains exactly the required elements (use result[:, None] if you need the column-vector shape from the question).
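A quick sanity check, reusing a and b exactly as defined in the question:

ixs = np.argmin(b, axis=1)                # array([2, 2, 2])
result = a[np.arange(a.shape[0]), ixs]    # picks one element per row
print(result)           # [-11.54228119   3.22814204 -10.64537733]
print(result[:, None])  # same values as the expected (3, 1) column vector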
You can try the code below:
np.take(a, ixs, axis=1)[:, 0]
The np.take call builds a 3 by 3 array (the columns of a selected by ixs), and [:, 0] slices out the first column. Note that this only gives the right answer because every entry of ixs is the same; if the indices differ per row, you need the diagonal of that array, or the approach above.
>>> np.take(a, ixs, axis=1)
array([[-11.54228119, -11.54228119, -11.54228119],
       [  3.22814204,   3.22814204,   3.22814204],
       [-10.64537733, -10.64537733, -10.64537733]])
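For the general case where ixs differs from row to row, np.take_along_axis (available in numpy 1.15 and later) pairs each row with its own index; a minimal sketch reusing a and b from the question:

ixs = np.argmin(b, axis=1)                            # shape (3,)
result = np.take_along_axis(a, ixs[:, None], axis=1)  # shape (3, 1)
print(result)
# [[-11.54228119]
#  [  3.22814204]
#  [-10.64537733]]

Unlike the plain np.take version, this stays correct when the minimum sits in a different column on each row.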
I have a structured numpy array in which one of the fields has subfields:
import numpy, string, random
dtype = [('name', 'a10'), ('id', 'i4'),
         ('size', [('length', 'f8'), ('width', 'f8')])]
a = numpy.zeros(10, dtype=dtype)
for idx in range(len(a)):
    a[idx] = (''.join(random.sample(string.ascii_lowercase, 10)), idx,
              tuple(numpy.random.uniform(0, 1, size=2)))
I can easily sort it by any of the fields, like this:
a.sort(order = ['name'])
a.sort(order = ['size'])
When I try to sort it by the structured field ('size' in this example), it is effectively sorted by its first subfield ('length' in this example). However, I would like to have my elements sorted by 'width'. I tried the following, but it does not work:
a.sort(order = ['size[\'width\']'])
ValueError: unknown field name: size['width']
a.sort(order = ['size', 'width'])
ValueError: unknown field name: width
Is there a way to accomplish this?
I believe this is what you want:
a[a["size"]["width"].argsort()]