How to generate an array tensor in TensorFlow

I generated an input tensor A using the following code in TensorFlow:
import tensorflow as tf
A = tf.constant(1.0, shape = [10, 10])
with tf.Session() as sess:
    print(sess.run(A))
output = [[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
I want to set part of the entries to zero, say half or a quarter along either the columns or the rows, and I did the following:
import numpy as np
output = np.array(A)
A1 = output[:, output.shape[1]//2:] = 0
print(A1)
But I got the error 'tuple index out of range'. Please help.
print(sess.run(A1))

Just create the single parts separately and then concatenate them:
A = tf.ones(shape=[10, 5])
B = tf.zeros(shape=[10,5])
AB = tf.concat((A,B), axis=1)
The same holds for a row-wise split: build the parts with matching column counts and concatenate along axis=0 instead.
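As a rough sketch of the row-wise case, and of the slice assignment the question attempted, here is a NumPy-only version. It assumes the values are known up front so the array can be built in NumPy first; the names cols, rows and rows_concat are illustrative only. Note that in Python a chained assignment such as `A1 = output[...] = 0` binds A1 to the integer 0, not to the modified array, so the slice assignment is kept on its own line.

```python
import numpy as np

# Column-wise: ones on the left half, zeros on the right half
# (the same layout as concatenating ones and zeros along axis=1).
cols = np.ones((10, 10), dtype=np.float32)
cols[:, cols.shape[1] // 2:] = 0  # in-place slice assignment

# Row-wise: ones in the top half, zeros in the bottom half.
rows = np.ones((10, 10), dtype=np.float32)
rows[rows.shape[0] // 2:, :] = 0

# The same row-wise result built by concatenation along axis 0.
rows_concat = np.concatenate(
    (np.ones((5, 10), dtype=np.float32), np.zeros((5, 10), dtype=np.float32)),
    axis=0,
)

print(cols)
print(np.array_equal(rows, rows_concat))
```

Either array can then be passed to tf.constant if a tensor is needed.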

Related

Tensorflow access to a tensor with a tensorflow index

I have a tensor T with shape (A,?,B,C). I also have a tensor index I defined as I=tf.argmax(something). I want to define T(I,?,:,:). The operation T(I,:,:,:) works well when the index I is not a tensor object but an integer. How can I do this when I is a tensor returned by tf.argmax?
Slicing a tensor as T(I,:,:,:) seems to work fine in TensorFlow 2.1 whether I is a tensor or an integer.
The code below slices a tensor as T(I,:,:,:) using both the tensor value returned by tf.argmax and the corresponding integer value.
%tensorflow_version 2.x
import tensorflow as tf
import numpy as np
b = np.ones(shape = (2,3,3,4))
a = tf.Variable(b, shape = (2,None, 3,4), dtype = tf.int32)
l = [166.32, 10, 26.9, 2.8, 1, 62.3]
b = tf.math.argmax(input = l)
c = tf.keras.backend.eval(b)
print('b = {} and c = {}'.format(b,c))
a = tf.ones(shape = (2,5,3,4))
Arg_Max_Tensor_Val = a[b,:,:,:]
Arg_Max_Int_Val = a[c,:,:,:]
print(Arg_Max_Int_Val)
print(Arg_Max_Tensor_Val)
Mentioned below is the Output:
tf.Tensor(
[[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]], shape=(5, 3, 4), dtype=float32)
tf.Tensor(
[[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]], shape=(5, 3, 4), dtype=float32)
Please let me know if this code resolves your problem. If it doesn't, please share the code you are using so that we can investigate further. Thanks!

Conditional mean in numpy arrays?

I have a numpy array named "distances" which looks like this:
[[ 5. 1. 1. 1. 2. 1. 3. 1. 1. 1.]
[ 5. 4. 4. 5. 7. 10. 3. 2. 1. 1.]
[ 3. 1. 1. 1. 2. 2. 3. 1. 1. 0.]
[ 6. 8. 8. 1. 3. 4. 3. 7. 1. 1.]
[ 4. 1. 1. 3. 2. 1. 3. 1. 1. 1.]
[ 8. 10. 10. 8. 7. 10. 9. 7. 1. 1.]
[ 1. 1. 1. 1. 2. 10. 3. 1. 1. 0.]
[ 2. 1. 2. 1. 2. 1. 3. 1. 1. 0.]
[ 2. 1. 1. 1. 2. 1. 1. 1. 5. 2.]
[ 4. 2. 1. 1. 2. 1. 2. 1. 1. 1.]]
I want to make a new 3*9 numpy array by taking mean like this:
If last column is 0, define an array c0 (1*9) which is mean of all such rows where last column is 0 where each column is mean of the columns from such rows.
If last column is 1, define an array c1 (1*9) which is mean of all such rows where last column is 1 where each column is mean of the columns from such rows.
If last column is 2, define an array c2 (1*9) which is mean of all such rows where last column is 2 where each column is mean of the columns from such rows.
After doing this I use hstack to get the final 3*9 array. I am sure this is the long approach, and it is nonetheless wrong.
code:
c0=distances.mean(axis=1)
final = np.hstack((c0,c1,c2))
Doing this I get a 1*10 array where each entry is the average of the corresponding column of the distances array. However, I cannot find a way to take the average only over the rows whose last column is 0.
With pandas
This would be straightforward with pandas:
import pandas as pd
df = pd.DataFrame(distances)
df_out = df.groupby(df.shape[1]-1).mean()
df_out['ID'] = df_out.index
out = df_out.values
With NumPy
Using Custom-function
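For a NumPy-specific one, we can use the custom helper groupbycol (it performs group-based summations; its definition is not shown here) and hence solve our case, like so:
sums = groupbycol(distances, assume_sorted_col=False, colID=-1)
# np.bincount needs integer input, so cast the float label column first
out = sums/np.bincount(distances[:,-1].astype(int)).astype(float)[:,None]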
With matrix-multiplication
mask = distances[:,-1,None] == np.arange(distances[:,-1].max()+1)
out = mask.T.dot(distances)/mask.sum(0)[:,None].astype(float)
I was able to do it like this:
c0= (distances[distances[:,-1] == 0][:,0:9]).mean(axis=0)
c1 = (distances[distances[:,-1] == 1][:,0:9]).mean(axis=0)
c2 = (distances[distances[:,-1] == 2][:,0:9]).mean(axis=0)
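The per-label approach above can be generalized to any number of labels and stacked into the 3*9 result. The sketch below uses the array from the question and assumes the last column holds small non-negative integer labels; the names labels, features, out and out_mm are illustrative. It also computes the mask and matrix-product variant on the first nine columns, to show that both give the same result. Note that np.vstack is used here because np.hstack on three length-9 vectors would give one length-27 vector.

```python
import numpy as np

distances = np.array([
    [5, 1, 1, 1, 2, 1, 3, 1, 1, 1],
    [5, 4, 4, 5, 7, 10, 3, 2, 1, 1],
    [3, 1, 1, 1, 2, 2, 3, 1, 1, 0],
    [6, 8, 8, 1, 3, 4, 3, 7, 1, 1],
    [4, 1, 1, 3, 2, 1, 3, 1, 1, 1],
    [8, 10, 10, 8, 7, 10, 9, 7, 1, 1],
    [1, 1, 1, 1, 2, 10, 3, 1, 1, 0],
    [2, 1, 2, 1, 2, 1, 3, 1, 1, 0],
    [2, 1, 1, 1, 2, 1, 1, 1, 5, 2],
    [4, 2, 1, 1, 2, 1, 2, 1, 1, 1],
], dtype=float)

labels = distances[:, -1].astype(int)  # group id of each row
features = distances[:, :-1]           # the nine columns to average

# One mean row per label, stacked vertically into shape (3, 9).
out = np.vstack([features[labels == k].mean(axis=0) for k in np.unique(labels)])

# Same result via a boolean membership mask and a matrix product.
mask = labels[:, None] == np.arange(labels.max() + 1)  # shape (10, 3)
out_mm = mask.T.dot(features) / mask.sum(axis=0)[:, None]

print(out.shape)
print(np.allclose(out, out_mm))
```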

Creating all zeros except one nonzero element in tensorflow

I want to create an M*N tensor in which all elements are zero except for one randomly chosen element per row, which should be one, but I don't know how.
This is one way to do that:
import tensorflow as tf
m = 4
n = 6
dt = tf.float32
random_idx = tf.random_uniform((m, 1), maxval=n, dtype=tf.int32)
result = tf.cast(tf.equal(tf.range(n)[tf.newaxis], random_idx), dtype=dt)
with tf.Session() as sess:
    print(sess.run(result))
Output:
[[ 0. 0. 0. 0. 0. 1.]
[ 0. 0. 1. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0. 0.]]
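The comparison trick used above (one random column index per row, compared against a row of column positions via broadcasting) can be checked outside a TensorFlow session. The following is a NumPy analogue rather than the TensorFlow code itself; the seed and variable names are arbitrary choices for illustration.

```python
import numpy as np

m, n = 4, 6
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# One random column index per row, kept as a column vector of shape (m, 1).
random_idx = rng.integers(0, n, size=(m, 1))

# Broadcasting (1, n) against (m, 1) gives an (m, n) boolean matrix
# that is True exactly where the column position equals the row's index.
result = (np.arange(n)[np.newaxis, :] == random_idx).astype(np.float32)

print(result)
```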

Find 5 consecutive numbers in numpy array by row, ignore duplicates

I have the following array (3 decks of 7 cards). They are sorted by row and I want to see if there are 5 consecutive numbers. The below code works but has a mistake: when there is a duplicate (like in row 1) the result is incorrect:
cards=
[[ 12. 6. 6. 5. 4. 2. 1.]
[ 12. 9. 6. 6. 1. 1. 1.]
[ 6. 6. 1. 1. 0. 0. 0.]]
cardAmount=cards[0,:].size
has4=cards[:,np.arange(0,cardAmount-4)]-cards[:,np.arange(cardAmount-3,cardAmount)]
isStraight=np.any(has4 == 4, axis=1)
has4 (shows if there is a difference of 4 between any of the cards 5 positions apart)
[[ 8. 4. 5.]
[ 11. 8. 5.]
[ 6. 6. 1.]]
isStraight checks if any of the rows contains a 4, which means there is a straight. Result is incorrect for the first row because the duplicates are not ignored.
[ True False False]
The difficulty is that there is no way in numpy to do a np.unique with return_counts=True on a by row basis, as the results would have different lengths.
Any suggestions are appreciated. It has to be numpy only (or pandas if the speed is not compromised).
I think this is the solution. Is there a way to make it even simpler?
iterations=3
cardAmount=cards[0,:].size
counts=(cards[:,:,None] == np.arange(12,0,-1)).sum(1) # occurrences of each card
present=counts
present[present>1]=1
s1=np.sum(present[:,0:5], axis=1)
s2=np.sum(present[:,1:6], axis=1)
s3=np.sum(present[:,2:7], axis=1)
s=np.stack((s1,s2,s3)).T
s[s < 5] = -1
s[s == 6] = 5
s[s ==7] = 5
s_index=np.argmax(s,axis=1)
straight=s[np.arange(iterations),s_index]>=0
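One possibly simpler formulation of the same idea is to turn each row into a presence table over ranks (so duplicates collapse to a single True) and then test every window of five adjacent ranks. This is only a sketch under stated assumptions: ranks run from 1 to 12 as in np.arange(12,0,-1) above, 0 means "no card", and the function name has_straight and its parameters are hypothetical.

```python
import numpy as np

def has_straight(cards, max_rank=12, run=5):
    """Return one boolean per row: True if `run` consecutive ranks are present."""
    ranks = np.arange(max_rank, 0, -1)                  # assumed ranks: 12 .. 1
    present = (cards[:, :, None] == ranks).any(axis=1)  # (rows, max_rank), duplicates ignored
    n_windows = max_rank - run + 1
    # A window is a hit when all of its `run` adjacent ranks are present.
    hits = [present[:, i:i + run].all(axis=1) for i in range(n_windows)]
    return np.any(hits, axis=0)

cards = np.array([[12, 6, 6, 5, 4, 2, 1],
                  [12, 9, 6, 6, 1, 1, 1],
                  [6, 6, 1, 1, 0, 0, 0]], dtype=float)

print(has_straight(cards))

# Hypothetical hand with a duplicate inside a real straight (9, 8, 7, 6, 5).
print(has_straight(np.array([[9, 8, 7, 7, 6, 5, 1]], dtype=float)))
```

With the three rows from the question, none contains five consecutive ranks once duplicates are ignored, so all three entries come out False.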

Plotting a histogram of 2D numpyArray of (latitude, latitude), in order to determine the proper values for DBSCAN

I am trying to apply DBSCAN to a dataset of (Lan,Lat) points. The algorithm is very sensitive to its parameters, EPS and MinPts.
I would like to look at a histogram of the data to determine proper values. Unfortunately, Matplotlib's hist() takes only a 1D array.
Passing a 2D matrix as argument, Hist() treats each column as a separate input.
[Figure: scatter plot and histograms]
Does anyone have a way to solve this?
If you follow the DBSCAN article, you only need the 4-nearest-neighbor distance for each object, not all pairwise distances. I.e., a 1 dimensional array.
Instead of doing a histogram, they sort the values, and try to choose a knee in this plot.
1. Find the 4 nearest neighbors of each object.
2. Collect all 4NN distances in one array.
3. Sort this array in descending order.
4. Plot the resulting curve.
5. Look for a knee, often best at around 5%-10% of your x axis (so 95%-90% of objects are core points).
For details, see the original DBSCAN publication!
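The steps above can be sketched in plain NumPy as follows. This is a minimal illustration, not the procedure from the publication verbatim: it assumes plain Euclidean distance on the coordinate pairs (for real latitude/longitude data a great-circle distance may be more appropriate), it builds the full pairwise distance matrix and so only suits small datasets, and the function name k_distances and the random test points are made up for the example.

```python
import numpy as np

def k_distances(points, k=4):
    """Distance from each point to its k-th nearest neighbor, sorted descending."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))  # full (n, n) pairwise distance matrix
    dist_sorted = np.sort(dist, axis=1)      # column 0 is the zero self-distance
    kth = dist_sorted[:, k]                  # so column k is the k-th neighbor
    return np.sort(kth)[::-1]

rng = np.random.default_rng(2016)  # arbitrary seed
points = rng.random((100, 2))      # stand-in for the real coordinate pairs
curve = k_distances(points, k=4)

print(curve[:5])  # the largest 4NN distances, i.e. the left end of the curve
```

Plotting curve against its index (for example with plt.plot(curve)) gives the sorted k-distance curve in which to look for the knee.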
You could use numpy.histogram2d:
import numpy as np
np.random.seed(2016)
N = 100
arr = np.random.random((N, 2))
xedges = np.linspace(0, 1, 10)
yedges = np.linspace(0, 1, 10)
lat = arr[:, 0]
lng = arr[:, 1]
hist, xedges, yedges = np.histogram2d(lat, lng, (xedges, yedges))
print(hist)
yields
[[ 0. 0. 5. 0. 3. 0. 0. 0. 3.]
[ 0. 3. 0. 3. 0. 0. 4. 0. 2.]
[ 2. 2. 1. 1. 1. 1. 3. 0. 1.]
[ 2. 1. 0. 3. 1. 2. 1. 1. 3.]
[ 3. 0. 3. 2. 0. 1. 0. 2. 0.]
[ 3. 2. 3. 1. 1. 2. 1. 1. 0.]
[ 2. 3. 0. 1. 0. 1. 3. 0. 0.]
[ 1. 1. 1. 1. 2. 0. 2. 1. 1.]
[ 0. 1. 1. 0. 1. 1. 2. 0. 0.]]
Or to visualize the histogram:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.imshow(hist)
plt.show()