Generate random sample of values in a range with replacements using numpy

np.random.uniform(low, high, size) does not give you replacement values.
Is there any way to generate sample values with replacements?

numpy.random.randint
Example: draw 10 integers from 0 to 9 with replacement (the high bound is exclusive).
import numpy as np
np.random.randint(low=0, high=10, size=10)
# array([5, 0, 7, 8, 9, 8, 0, 6, 4, 8])
See many more examples in this post: Numpy distributions and statistical functions: examples and reference
You can also use np.random.choice:
import numpy as np
np.random.choice(10,10,replace=True)
# array([5, 0, 7, 8, 9, 8, 0, 6, 4, 8])
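The same idea can be sketched with NumPy's newer Generator interface. This is only an illustration: the seed, the `pool` values, and the variable names are assumptions made for the example, not part of the original answers.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility

# Integers in [0, 10): the upper bound is exclusive, as with randint.
ints = rng.integers(low=0, high=10, size=10)

# Sampling with replacement from an explicit (hypothetical) pool of values.
pool = np.array([1.5, 2.5, 3.5])
picks = rng.choice(pool, size=10, replace=True)

# Continuous draws are independent of each other, so repeats are allowed
# in principle, although exact duplicates are very unlikely for floats.
floats = rng.uniform(low=0.0, high=10.0, size=10)
```

Since `picks` holds 10 values drawn from a pool of 3, it necessarily contains repeats, which is what sampling with replacement permits.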

Related

Numpy subarrays and relative indexing

I have been searching for a standard method to create a subarray using relative indexes. Take the following array into consideration:
>>> m = np.arange(25).reshape([5, 5])
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
I want to access the 3x3 matrix at a specific array position, for example [2,2]:
>>> x, y = 2, 2
>>> m[slice(x-1,x+2), slice(y-1,y+2)]
array([[ 6, 7, 8],
[11, 12, 13],
[16, 17, 18]])
For the example above, I am looking for something like m.subarray(pos=[2,2], shape=[3,3]).
I want to sample an ndarray of n dimensions at a specific position, which might change.
I do not want to use a loop, as it might be inefficient. The Scipy functions correlate and convolve do this very efficiently, but for all positions. I am interested in sampling only one.
The best answer would also handle the edges. In my case I would like, for example, to have wrap mode:
(a b c d | a b c d | a b c d)
--------------------EDITED-----------------------------
Based on the answer from @Carlos Horn, I could create the following function.
from math import ceil, floor

def cell_neighbours(array, index, shape):
    # Pad each axis so that a window centred on any cell fits, wrapping around.
    pads = [(floor(dim / 2), ceil(dim / 2)) for dim in shape]
    padded = np.pad(array, pads, "wrap")
    views = np.lib.stride_tricks.sliding_window_view
    return views(padded, shape)[tuple(index)]
A last concern might be speed. From the docs: For many applications using a sliding window view can be convenient, but potentially very slow. Often specialized solutions exist.
From here it may be easier to arrive at a faster solution.

You could build a view of 3x3 matrices into the array as follows:
import numpy as np
m = np.arange(25).reshape(5,5)
m3x3view = np.lib.stride_tricks.sliding_window_view(m, (3,3))
Note that this shifts your indexing by half the window size, meaning
x_view = x - 3//2
y_view = y - 3//2
print(m3x3view[x_view,y_view]) # gives your result
In case a copy operation is fine, you could use:
mpad = np.pad(m, 1, mode="wrap")
mpad3x3view = np.lib.stride_tricks.sliding_window_view(mpad, (3,3))
print(mpad3x3view[x % 5,y % 5])
to use arbitrary x, y integer values.
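As an alternative to padding, the wrap behaviour can also be sketched with modulo arithmetic on the indices. The helper name `wrapped_window` is hypothetical, and the sketch assumes `index` and `shape` have one entry per array dimension. Because it uses fancy indexing, it returns a copy rather than a view.

```python
import numpy as np

def wrapped_window(array, index, shape):
    """Return the block of `shape` centred on `index`, wrapping at the edges."""
    axes = [
        (np.arange(size) + centre - size // 2) % length
        for centre, size, length in zip(index, shape, array.shape)
    ]
    # np.ix_ builds an open mesh so the per-axis indices broadcast together.
    return array[np.ix_(*axes)]

m = np.arange(25).reshape(5, 5)
centre_block = wrapped_window(m, (2, 2), (3, 3))  # same block as m[1:4, 1:4]
corner_block = wrapped_window(m, (0, 0), (3, 3))  # wraps to the last row/column
```

Only the requested neighbourhood is gathered here, so the cost depends on the window size rather than on the size of the whole array.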

Determine number of preceding equal elements

Using numpy, given a sorted 1D array, how to efficiently obtain a 1D array with equal size where the value at each position is the number of preceding equal elements? I have very large arrays and processing each element in Python code one way or another is not acceptable.
Example:
input = [0, 0, 4, 4, 4, 5, 5, 5, 5, 6]
output = [0, 1, 0, 1, 2, 0, 1, 2, 3, 0]

import numpy as np
A=np.array([0, 0, 4, 4, 4, 5, 5, 5, 5, 6])
uni,counts=np.unique(A, return_counts=True)
out=np.concatenate([np.arange(n) for n in counts])
print(out)
I am not certain about the efficiency (there is probably a better way to form the out array than concatenating), but this is a very straightforward way to get the result you are looking for. It counts the unique elements, runs np.arange on each count to get the ascending sequence, then concatenates these arrays together.
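One way to avoid the Python-level list comprehension, sketched here under the question's assumption that the input is sorted, is to subtract each run's starting position from every element's own position:

```python
import numpy as np

A = np.array([0, 0, 4, 4, 4, 5, 5, 5, 5, 6])

# For a sorted array, return_index gives the position where each run starts.
_, starts, counts = np.unique(A, return_index=True, return_counts=True)

# Subtract each element's run start from its own position.
out = np.arange(A.size) - np.repeat(starts, counts)
print(out)
```

Every step here is a whole-array operation, so no per-element Python code runs.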

what does `replace` in `pandas.DataFrame.sample()` do? [duplicate]

The linked page explains the function numpy.random.choice. However, I am confused about the third parameter, replace. What is it, and in which cases is it useful? Thanks!

It controls whether the sample is returned to the sample pool. If you want only unique samples then this should be false.

Use it when you want to sample some elements from a list and do not want any element to repeat: in that case, set replace=False.
For example:
from numpy import random as rd
ary = list(range(10))
# usage
In[18]: rd.choice(ary, size=8, replace=False)
Out[18]: array([0, 5, 9, 8, 2, 1, 6, 3]) # no repeated elements
In[19]: rd.choice(ary, size=8, replace=True)
Out[19]: array([4, 9, 8, 5, 4, 1, 1, 9]) # elements may be repeated
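A practical consequence of the replace flag is that sampling without replacement cannot return more items than the pool holds. A minimal sketch, using the Generator interface and a hypothetical five-element pool:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility
pool = np.arange(5)

# Without replacement every drawn element is distinct.
unique_draw = rng.choice(pool, size=5, replace=False)

# With replacement the sample may be larger than the pool.
big_draw = rng.choice(pool, size=8, replace=True)

# Without replacement, asking for more items than the pool holds fails.
try:
    rng.choice(pool, size=8, replace=False)
    oversample_failed = False
except ValueError:
    oversample_failed = True
```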

numpy find values of maxima pointed to by argmax [duplicate]

This question already has answers here:
Index n dimensional array with (n-1) d array
(3 answers)
Closed 4 years ago.
I have a 3-d array. I find the indexes of the maxima along an axis using argmax. How do I now use these indexes to obtain the maximal values?
2nd part: How to do this for arrays of N-d?
Eg:
u = np.arange(12).reshape(3,4,1)
In [125]: e = u.argmax(axis=2)
In [126]: e
Out[126]:
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
It would be nice if u[e] produced the expected results, but it doesn't work.

The return value of argmax along an axis can't be simply used as an index. It only works in a 1d case.
In [124]: u = np.arange(12).reshape(3,4,1)
In [125]: e = u.argmax(axis=2)
In [126]: u.shape
Out[126]: (3, 4, 1)
In [127]: e.shape
Out[127]: (3, 4)
e is (3,4), but its values only index the last dimension of u.
In [128]: u[e].shape
Out[128]: (3, 4, 4, 1)
Instead we have to construct indices for the other 2 dimensions, ones which broadcast with e. For example:
In [129]: I,J=np.ix_(range(3),range(4))
In [130]: I
Out[130]:
array([[0],
[1],
[2]])
In [131]: J
Out[131]: array([[0, 1, 2, 3]])
Those are (3,1) and (1,4). Those are compatible with (3,4) e and the desired output
In [132]: u[I,J,e]
Out[132]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
This kind of question has been asked before, so it should probably be marked as a duplicate. The fact that your last dimension has size 1, and hence e is all 0s, distracts readers from the underlying issue (using a multidimensional argmax as an index).
numpy: how to get a max from an argmax result
Get indices of numpy.argmax elements over an axis
Assuming you've taken the argmax on the last dimension
In [156]: ij = np.indices(u.shape[:-1])
In [157]: u[(*ij,e)]
Out[157]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
or:
ij = np.ix_(*[range(i) for i in u.shape[:-1]])
If the axis is in the middle, it'll take a bit more tuple fiddling to arrange the ij elements and e.

So for a general N-d array:
dims = np.ix_(*[range(x) for x in u.shape[:-1]])
u[(*dims, e)]
Before Python 3.11, u[*dims, e] is a syntax error, so wrap the indices in a tuple as above (or call u.__getitem__((*dims, e)) directly).
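For an argmax taken along any axis, including a middle one, np.take_along_axis (available since NumPy 1.15) avoids building the index tuple by hand. The array shape and the seed below are assumptions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
u = rng.integers(0, 100, size=(3, 4, 5))
axis = 1  # any axis works, including one in the middle

e = u.argmax(axis=axis)

# take_along_axis needs the index array to have the same ndim as u,
# so the reduced axis is restored with length 1 and squeezed afterwards.
maxima = np.take_along_axis(u, np.expand_dims(e, axis=axis), axis=axis)
maxima = np.squeeze(maxima, axis=axis)
```

The result should match u.max(axis=axis), which is a convenient way to check the indexing.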

Most efficient way to do this slice based multiplication in Tensorflow

I'm trying to perform an operation of multiplying a slice of a 2D matrix by a constant.
For example, suppose I want to multiply everything but the first 2 columns by 2.
In numpy, one could do:
a = np.array([[0,7,4],
[1,6,4],
[0,2,4],
[4,2,7]])
a[:, 2:] = 2.0*a[:, 2:]
>> a
>> array([[ 0, 7, 8],
[ 1, 6, 8],
[ 0, 2, 8],
[ 4, 2, 14]])
However, at least from what I've found, tensorflow currently doesn't have a straightforward way to do this.
My current solution is to create a originally as two separate Tensors, a1 and a2, multiply the second one by 2.0, and then concatenate them along axis=1. The operation is simple enough that this is possible. However, I have two questions:
Is that the most efficient way to do this?
Is there a better (general/efficient) way to perform this to bring the functionality closer to numpy's slicing magic (perhaps https://www.tensorflow.org/api_docs/python/tf/scatter_

One option is to perform entrywise multiplication, as follows:
import tensorflow as tf
a = tf.Variable(initial_value=[[0,7,4],[1,6,4],[0,2,4],[4,2,7]])
b = tf.multiply(a, [1, 1, 2])  # tf.mul was renamed tf.multiply in TensorFlow 1.0
s=tf.InteractiveSession()
s.run(tf.global_variables_initializer())
b.eval()
This prints
array([[ 0, 7, 8],
[ 1, 6, 8],
[ 0, 2, 8],
[ 4, 2, 14]])
More generally, if a has more columns, you can do something like this:
import tensorflow as tf
a = tf.Variable(initial_value=[[0,7,4],[1,6,4],[0,2,4],[4,2,7]])
n_cols = a.get_shape().as_list()[1]  # static column count as a Python int
b = tf.multiply(a, [1, 1] + [2 for i in range(n_cols - 2)])
s = tf.InteractiveSession()
s.run(tf.global_variables_initializer())
b.eval()
Or, if your matrix has many columns, you could replace
b = tf.multiply(a, [1, 1] + [2 for i in range(n_cols - 2)])
with
import numpy as np
# dtype matches the int32 variable so the multiplication does not mix types
b = tf.multiply(a, np.concatenate((np.array([1, 1], dtype=np.int32), 2 * np.ones(n_cols - 2, dtype=np.int32))))
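The multiplier vector itself can be built and checked in plain NumPy before it is handed to TensorFlow. This sketch covers only the NumPy side; the names `first_scaled_col` and `factor` are hypothetical, and it is an assumption (not verified here) that the resulting vector behaves the same way when passed to the TensorFlow multiplication.

```python
import numpy as np

a = np.array([[0, 7, 4],
              [1, 6, 4],
              [0, 2, 4],
              [4, 2, 7]])

first_scaled_col = 2  # columns from this index onward are scaled
factor = 2

# One multiplier per column: 1 for untouched columns, `factor` for the rest.
multipliers = np.where(np.arange(a.shape[1]) >= first_scaled_col, factor, 1)

# Broadcasting applies the row vector to every row of `a`.
b = a * multipliers
```

Building the vector from a column index rather than a hand-written list keeps it correct when the number of columns changes.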
