I generated an input tensor A using the following code in TensorFlow:
import tensorflow as tf
A = tf.constant(1.0, shape = [10, 10])
with tf.Session() as sess:
    print(sess.run(A))
output = [[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
I want to set part of the entries to zero, say half or a quarter along either the columns or the rows, and I did the following:
import numpy as np
output = np.array(A)
A1 = output[:, output.shape[1]//2:] = 0
But I got the error 'tuple index out of range'. Please help.
Just create the single parts separately and then concatenate them:
A = tf.ones(shape=[10, 5])
B = tf.zeros(shape=[10,5])
AB = tf.concat((A,B), axis=1)
Same holds for a row-wise split.
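As for the error in the question: in graph mode, np.array(A) presumably wraps the symbolic tensor in a zero-dimensional object array instead of evaluating it, so output.shape is () and output.shape[1] raises 'tuple index out of range'. Below is a minimal NumPy sketch, assuming the tensor has first been evaluated to a real array (for example with sess.run(A)). The np.ones stand-in and the names wrapped and rows are illustrative, not from the original post.

```python
import numpy as np

# A 0-d object array: what np.array() yields for an object it cannot unpack.
wrapped = np.array(object())
print(wrapped.shape)  # an empty tuple, so shape[1] cannot be indexed

# Stand-in for the evaluated tensor, e.g. the result of sess.run(A).
output = np.ones((10, 10), dtype=np.float32)

# Zero the right half of the columns in place.
output[:, output.shape[1] // 2:] = 0

# Row-wise variant on a fresh array: zero the bottom quarter of the rows.
rows = np.ones((10, 10), dtype=np.float32)
rows[rows.shape[0] - rows.shape[0] // 4:, :] = 0
```

Note that the chained assignment A1 = output[...] = 0 in the question binds A1 to the integer 0, not to the modified array. The slice assignment changes output in place, so output itself is the result.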
I have a tensor T with shape (A,?,B,C). I also have an index tensor I defined as I=tf.argmax(something). I want to obtain T(I,?,:,:). The operation T(I,:,:,:) works well when the index I is an integer rather than a tensor object. How can I do this when I is a tensor returned by tf.argmax?
Slicing a tensor as T(I,:,:,:) seems to work fine in TensorFlow 2.1, whether I is a tensor or an integer.
The code below uses a tensor with shape (A, ?, B, C) and slices it as T(I,:,:,:), once with the tensor value of tf.argmax and once with its integer value.
%tensorflow_version 2.x
import tensorflow as tf
import numpy as np
b = np.ones(shape = (2,3,3,4))
a = tf.Variable(b, shape = (2,None, 3,4), dtype = tf.int32)
l = [166.32, 10, 26.9, 2.8, 1, 62.3]
b = tf.math.argmax(input = l)
c = tf.keras.backend.eval(b)
print('b = {} and c = {}'.format(b,c))
a = tf.ones(shape = (2,5,3,4))
Arg_Max_Tensor_Val = a[b,:,:,:]
Arg_Max_Int_Val = a[c,:,:,:]
print(Arg_Max_Tensor_Val)
print(Arg_Max_Int_Val)
The output is shown below:
tf.Tensor(
[[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]], shape=(5, 3, 4), dtype=float32)
tf.Tensor(
[[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]], shape=(5, 3, 4), dtype=float32)
Please let me know if this code resolves your problem. If it doesn't, please share the code you are using so that we can investigate further. Thanks!
I have a numpy array named "distances" which looks like this:
[[ 5. 1. 1. 1. 2. 1. 3. 1. 1. 1.]
[ 5. 4. 4. 5. 7. 10. 3. 2. 1. 1.]
[ 3. 1. 1. 1. 2. 2. 3. 1. 1. 0.]
[ 6. 8. 8. 1. 3. 4. 3. 7. 1. 1.]
[ 4. 1. 1. 3. 2. 1. 3. 1. 1. 1.]
[ 8. 10. 10. 8. 7. 10. 9. 7. 1. 1.]
[ 1. 1. 1. 1. 2. 10. 3. 1. 1. 0.]
[ 2. 1. 2. 1. 2. 1. 3. 1. 1. 0.]
[ 2. 1. 1. 1. 2. 1. 1. 1. 5. 2.]
[ 4. 2. 1. 1. 2. 1. 2. 1. 1. 1.]]
I want to build a new 3*9 numpy array by taking means as follows:
If the last column is 0, define an array c0 (1*9) in which each entry is the mean of the corresponding column over all rows whose last column is 0.
If the last column is 1, define an array c1 (1*9) in which each entry is the mean of the corresponding column over all rows whose last column is 1.
If the last column is 2, define an array c2 (1*9) in which each entry is the mean of the corresponding column over all rows whose last column is 2.
After this I use hstack to get the final 3*9 array. I am sure this is a long-winded approach, and it is nonetheless wrong.
final = np.hstack((c0,c1,c2))
Doing this I get a 1*10 array where each entry is the average of the corresponding column of the distances array. However, I am unable to find a way to take the average only over the rows whose last column is 0.
With pandas
This would be straightforward with pandas -
import pandas as pd
df = pd.DataFrame(distances)
df_out = df.groupby(df.shape[1]-1).mean()
df_out['ID'] = df_out.index
out = df_out.values
With NumPy
Using Custom-function
For a NumPy-specific one, we can use a custom groupbycol function (it performs group-based summations and is not defined in this post) and hence solve our case, like so -
sums = groupbycol(distances, assume_sorted_col=False, colID=-1)
out = sums/np.bincount(distances[:,-1].astype(int)).astype(float)[:,None]  # bincount needs integer labels
With matrix-multiplication
mask = distances[:,-1,None] == np.arange(distances[:,-1].max()+1)
out = mask.T.dot(distances)/mask.sum(0)[:,None].astype(float)
I was able to do it like this:
c0= (distances[distances[:,-1] == 0][:,0:9]).mean(axis=0)
c1 = (distances[distances[:,-1] == 1][:,0:9]).mean(axis=0)
c2 = (distances[distances[:,-1] == 2][:,0:9]).mean(axis=0)
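The boolean-mask approach above can be sketched end to end as follows. This assumes the distances array from the question and stacks the per-group means with np.vstack, since np.hstack would concatenate them into a single 27-element row rather than a 3*9 array. The names labels, features and groups are illustrative.

```python
import numpy as np

distances = np.array([
    [5, 1, 1, 1, 2, 1, 3, 1, 1, 1],
    [5, 4, 4, 5, 7, 10, 3, 2, 1, 1],
    [3, 1, 1, 1, 2, 2, 3, 1, 1, 0],
    [6, 8, 8, 1, 3, 4, 3, 7, 1, 1],
    [4, 1, 1, 3, 2, 1, 3, 1, 1, 1],
    [8, 10, 10, 8, 7, 10, 9, 7, 1, 1],
    [1, 1, 1, 1, 2, 10, 3, 1, 1, 0],
    [2, 1, 2, 1, 2, 1, 3, 1, 1, 0],
    [2, 1, 1, 1, 2, 1, 1, 1, 5, 2],
    [4, 2, 1, 1, 2, 1, 2, 1, 1, 1],
], dtype=float)

labels = distances[:, -1].astype(int)   # group id taken from the last column
features = distances[:, :-1]            # the 9 columns to average
groups = np.unique(labels)              # sorted distinct group ids: 0, 1, 2

# One row of column means per group, stacked vertically into a 3*9 array.
final = np.vstack([features[labels == g].mean(axis=0) for g in groups])
print(final.shape)
```

The list comprehension loops only over the distinct labels, so for a handful of groups it stays close to the fully vectorized versions in cost while being easier to read.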
I want to create an M*N tensor in which all elements are zero except for one random element per row, which should be one, but I don't know how.
This is one way to do that:
import tensorflow as tf
m = 4
n = 6
dt = tf.float32
random_idx = tf.random_uniform((m, 1), maxval=n, dtype=tf.int32)
result = tf.cast(tf.equal(tf.range(n)[tf.newaxis], random_idx), dtype=dt)
with tf.Session() as sess:
    print(sess.run(result))
[[ 0. 0. 0. 0. 0. 1.]
[ 0. 0. 1. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]]
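The same broadcasting trick can be tried without a TensorFlow session. The following is a rough NumPy analogue, not the TensorFlow code itself: it draws one column index per row and compares it against a row of column positions. The seeded generator rng is an assumption added here for reproducibility.

```python
import numpy as np

m = 4
n = 6

rng = np.random.default_rng(0)  # seeded only so that runs are repeatable

# One random column index per row, shaped (m, 1) so it broadcasts against (1, n).
random_idx = rng.integers(0, n, size=(m, 1))

# Comparing (1, n) column positions with (m, 1) indices yields an (m, n) mask
# with exactly one True per row.
result = (np.arange(n)[np.newaxis] == random_idx).astype(np.float32)
print(result)
```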
I have the following array (3 decks of 7 cards). They are sorted by row and I want to see if there are 5 consecutive numbers. The below code works but has a mistake: when there is a duplicate (like in row 1) the result is incorrect:
[[ 12. 6. 6. 5. 4. 2. 1.]
[ 12. 9. 6. 6. 1. 1. 1.]
[ 6. 6. 1. 1. 0. 0. 0.]]
has4 = cards[:, :3] - cards[:, 4:]  # difference between cards 5 positions apart
isStraight=np.any(has4 == 4, axis=1)
has4 shows whether there is a difference of 4 between any of the cards 5 positions apart:
[[ 8. 4. 5.]
[ 11. 8. 5.]
[ 6. 6. 1.]]
isStraight checks if any of the rows contains a 4, which means there is a straight. Result is incorrect for the first row because the duplicates are not ignored.
[ True False False]
The difficulty is that there is no way in numpy to do a np.unique with return_counts=True on a by row basis, as the results would have different lengths.
Any suggestions are appreciated. It has to be numpy only (or pandas if the speed is not compromised).
I think this is the solution. Is there a way to make it even simpler?
counts=(cards[:,:,None] == np.arange(12,0,-1)).sum(1) # occurrences of each card
s1=np.sum(present[:,0:5], axis=1)
s2=np.sum(present[:,1:6], axis=1)
s3=np.sum(present[:,2:7], axis=1)
s[s < 5] = -1
s[s == 6] = 5
s[s ==7] = 5
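One possible way to ignore duplicates is to work on rank presence instead of the sorted cards. The sketch below is not the poster's exact method. It assumes ranks are the integers 1 to 12 and that 0 marks an empty slot (as the np.arange(12,0,-1) above suggests), and it adds a hypothetical fourth hand containing a straight so that both outcomes are visible.

```python
import numpy as np

cards = np.array([
    [12, 6, 6, 5, 4, 2, 1],
    [12, 9, 6, 6, 1, 1, 1],
    [6, 6, 1, 1, 0, 0, 0],
    [9, 8, 8, 7, 6, 5, 1],   # hypothetical extra hand: 5-6-7-8-9 is a straight
], dtype=float)

ranks = np.arange(1, 13)  # assumption: valid ranks are 1..12, 0 means "no card"

# present[i, r] is True if hand i holds at least one card of rank ranks[r];
# duplicates collapse to a single True.
present = (cards[:, :, None] == ranks).any(axis=1)

# Sliding-window sums of width 5 via a zero-padded cumulative sum:
# windows[i, j] counts the distinct ranks present among ranks[j] .. ranks[j + 4].
csum = np.cumsum(np.pad(present.astype(int), ((0, 0), (1, 0))), axis=1)
windows = csum[:, 5:] - csum[:, :-5]

is_straight = (windows == 5).any(axis=1)
print(is_straight)
```

Because each rank is counted at most once, the duplicate sixes in the first hand no longer produce a false positive.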
I am trying to apply DBSCAN to a dataset of (Lng, Lat) points. The algorithm is very sensitive to its parameters, EPS and MinPts.
I would like to look at a histogram of the data to determine proper values. Unfortunately, Matplotlib's hist() takes only a 1D array.
When a 2D matrix is passed as the argument, hist() treats each column as a separate input.
[Figure: scatter plot and histograms]
Does anyone have a way to solve this?
If you follow the DBSCAN article, you only need the 4-nearest-neighbor distance for each object, not all pairwise distances, i.e., a one-dimensional array.
Instead of drawing a histogram, the authors sort the values and try to choose a knee in the resulting plot:
find the 4th nearest neighbor of each object
collect all 4NN distances in one array
sort this array in descending order
plot the resulting curve
look for a knee, often best at around 5%-10% of your x axis (so 95%-90% of objects are core points).
For details, see the original DBSCAN publication!
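The steps above can be sketched with plain NumPy. This is a brute-force illustration under stated assumptions, not a reference implementation: it uses Euclidean distance on random 2D points (for real latitude/longitude data a great-circle distance would likely be more appropriate), and the helper name k_distances is hypothetical.

```python
import numpy as np

def k_distances(points, k=4):
    """Return each point's distance to its k-th nearest neighbor, sorted descending."""
    # All pairwise Euclidean distances, shape (N, N); fine for small N only.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the distance to the point itself (0),
    # so column k is the distance to the k-th nearest other point.
    kth = np.sort(dists, axis=1)[:, k]
    return np.sort(kth)[::-1]

rng = np.random.default_rng(0)      # seeded stand-in data, not real coordinates
points = rng.random((100, 2))
curve = k_distances(points, k=4)
print(curve[:5])
```

Plotting curve against its index (for example with plt.plot(curve)) gives the sorted k-distance graph. The value at the knee is a candidate for EPS, with MinPts tied to the chosen k.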
You could use numpy.histogram2d:
import numpy as np
N = 100
arr = np.random.random((N, 2))
xedges = np.linspace(0, 1, 10)
yedges = np.linspace(0, 1, 10)
lat = arr[:, 0]
lng = arr[:, 1]
hist, xedges, yedges = np.histogram2d(lat, lng, (xedges, yedges))
print(hist)
[[ 0. 0. 5. 0. 3. 0. 0. 0. 3.]
[ 0. 3. 0. 3. 0. 0. 4. 0. 2.]
[ 2. 2. 1. 1. 1. 1. 3. 0. 1.]
[ 2. 1. 0. 3. 1. 2. 1. 1. 3.]
[ 3. 0. 3. 2. 0. 1. 0. 2. 0.]
[ 3. 2. 3. 1. 1. 2. 1. 1. 0.]
[ 2. 3. 0. 1. 0. 1. 3. 0. 0.]
[ 1. 1. 1. 1. 2. 0. 2. 1. 1.]
[ 0. 1. 1. 0. 1. 1. 2. 0. 0.]]
Or to visualize the histogram:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()