Facing an IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices - numpy

I have been working on a link prediction problem in which the data set, a numpy array, has to be parsed and stored in another numpy array. My attempt is below, but line 9 throws an IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices. I even tried casting the indices with int, but that does not help. What am I missing here?
1. train_edges, test_edges, = train_test_split(edgeL,test_size=0.3,random_state=16)
2. out_dim = int(W_out.shape[1])
3. in_dim = int(W_in.shape[1])
4. train_x = np.zeros((len(train_edges), (out_dim + in_dim) * 2))
5. train_y = np.zeros((len(train_edges), 1))
6. for i, edge in enumerate(train_edges):
7.     u = edge[0]
8.     v = edge[1]
9.     train_x[int(i), : int(out_dim)] = W_out[u]
10.     train_x[int(i), int(out_dim): int(out_dim + in_dim)] = W_in[u]
11.     train_x[i, out_dim + in_dim: out_dim * 2 + in_dim] = W_out[v]
12.     train_x[i, out_dim * 2 + in_dim:] = W_in[v]
13.     if edge[2] > 0:
14.         train_y[i] = 1
15.     else:
16.         train_y[i] = -1
EDIT:
For reference, W_out holds 64-dimensional entries that look like this:
print(W_out[0])
type(W_out.shape[1])
Output:
[[0.10160154 0. 0.70414263 0.6772633 0.07685234 0.75205046
0.421092 0.1776721 0.8622188 0.15669271 0. 0.40653425
0.5768579 0.75861764 0.6745151 0.37883565 0.18074909 0.73928916
0.6289512 0. 0.33160248 0.7441727 0. 0.8810399
0.1110919 0.53732747 0. 0.33330196 0.36220717 0.298112
0.10643011 0.8997948 0.53510064 0.6845873 0.03440218 0.23005858
0.8097505 0.7108275 0.38826624 0.28532124 0.37821335 0.3566149
0.42527163 0.71940386 0.8075657 0.5775364 0.01444144 0.21734199
0.47439903 0.21176265 0.32279345 0.00187511 0.43511534 0.4302601
0.39407462 0.20941389 0.199842 0.8710182 0.2160332 0.30246672
0.27159846 0.19009161 0.32349357 0.08938174]]
int
And edge is an entry from the training data set that holds the source, destination and sign. It looks like this:
train_edges, test_edges, = train_test_split(edgeL,test_size=0.3,random_state=16)
for i, edge in enumerate(train_edges):
    print(edge)
    print(i)
    type(i)
    type(edge)
Output:
Streaming output truncated to the last 5000 lines.
2936
['16936', '17031', '1']
2937
['15307', '14904', '1']
2938
['22852', '13045', '1']
2939
['14291', '96703', '1']
2940
Any help/suggestion is highly appreciated.

Your indexing is causing the error.
Accessing the edge object looks like the issue. Debug by printing type() and len() of edge and of its elements to see what the IndexError refers to.
Explicitly casting with int(i) is not needed, so the issue is either in the train_x[...] assignment or in your enumeration logic.

As mentioned by #indigo_4_alpha, the error is caused by the edge[0] element, which is a string.
Code for checking the train_edges
train_edges, test_edges, = train_test_split(edgeL,test_size=0.3,random_state=16)
for i, edge in enumerate(train_edges):
    print(edge)
    print(i)
    print(edge[0], edge[1],edge[2])
    print(type(edge[0]))
Output
['11635' '22046' '1']
2608
11635 22046 1
<class 'str'>
After observing the output, I noticed that edge[0] on its own is a string. Then I realized that casting the other indices with int() has no effect on W_out[u] when u itself is a string.
So I cast u and v to int, changing u = edge[0] to u = int(edge[0]) and v = edge[1] to v = int(edge[1]) in lines 7 and 8 of the code, as shown below.
Master code for Train and test data split
1. train_edges, test_edges, = train_test_split(edgeL,test_size=0.3,random_state=16)
2. out_dim = int(W_out.shape[1])
3. in_dim = int(W_in.shape[1])
4. train_x = np.zeros((len(train_edges), (out_dim + in_dim) * 2))
5. train_y = np.zeros((len(train_edges), 1))
6. for i, edge in enumerate(train_edges):
7.     u = int(edge[0])
8.     v = int(edge[1])
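As a minimal sketch of the same fix, the edge array can also be converted to integers once, before the loop, instead of casting each element. The small W_out matrix and the edge values below are made-up stand-ins, not the data from the question.

```python
import numpy as np

# Hypothetical stand-ins for the embedding matrix and the parsed edge list.
W_out = np.random.rand(5, 4)
train_edges = np.array([["0", "3", "1"], ["2", "4", "-1"]])  # strings, as in the printed output

# Indexing a numpy array with a string reproduces the error from the question.
try:
    W_out[train_edges[0][0]]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# Convert the whole edge array to integers once, then index as usual.
edges_int = train_edges.astype(int)
u, v, sign = edges_int[0]
row = W_out[u]  # now a valid integer index
```

Casting inside the loop, as in lines 7 and 8, gives the same result; converting up front just avoids repeating the cast for every edge.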
Thank you one and all for sparing your time and giving me your valuable suggestions.

Related

df.ix not working, what's the right iloc method?

This is my program-
# n = number of days
def ATR(df, n):
    df['H-L'] = abs(df['High'] - df['Low'])
    df['H-PC'] = abs(df['High'] - df['Close'].shift(1))
    df['L-PC'] = abs(df['Low'] - df['Close'].shift(1))
    df['TR'] = df[['H-L','H-PC','L-PC']].max(axis=1)
    df['ATR'] = np.nan
    df.ix[n-1,'ATR'] = df['TR'][:n-1].mean()
    for i in range(n, len(df)):
        df['ATR'][i] = (df['ATR'][i-1]*(n-1) + df['TR'][i])/n
    return
An error shows up:
AttributeError: 'DataFrame' object has no attribute 'ix'
I tried to replace it with iloc:
df.iloc[df.index[n-1],'ATR'] = df['TR'][:n-1].mean()
But this time another error pops up :
only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
How to fix this?
Converting code is a pain and we have all been there...
df.ix[n-1,'ATR'] = df['TR'][:n-1].mean()
should become
df['ATR'].iloc[n-1] = df['TR'][:n-1].mean()
Hope this fits the bill
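An alternative that avoids chained assignment is to give iloc integer positions for both the row and the column, since iloc rejects the 'ATR' label (which is what raises the IndexError above). This is only a sketch on a made-up five-row frame; the TR values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the true-range column.
df = pd.DataFrame({"TR": [1.0, 2.0, 3.0, 4.0, 5.0]})
df["ATR"] = np.nan
n = 3

# iloc needs integer positions on both axes, so look up the column position.
atr_col = df.columns.get_loc("ATR")
df.iloc[n - 1, atr_col] = df["TR"].iloc[: n - 1].mean()

# Label-based equivalent: df.loc[df.index[n - 1], "ATR"] = ...
```

Either form writes into df in a single indexing step, so it does not depend on how pandas treats intermediate copies.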

IndexError: too many indices for array in numpy

I'm working through an exercise.
I searched for this problem before but didn't find an answer that fits my case.
This code seems to work with trainX, but not with trainY.
trainY holds 1672 values in 1D, for a single output neuron.
batch_dim = trainX.shape[0]
input_dim = windowSize
hidden_dim = 6
output_dim = 1
Output: batch_dim has size 1 with value 1672
X = trainX[index:index+batch_dim,:]
Y = trainY[index:index+batch_dim,:]
index = index+batch_dim
The problem seems to be in the dimensions, so I tried to reshape it:
Y = np.reshape(trainY[index:index+batch_dim,:],-1,1)
but it doesn't solve anything. The code still produces output, but the error is still there.
I just want the error to go away.
The variable size output:
batch_dim = 1 (value = 1672)
index = 1 (value = 0)
X : (1672,3)
Y : (1672,)
Y = trainY[index:index+batch_dim,:]
IndexError: too many indices for array
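Given the shapes listed above, a likely cause is that trainY is 1-D, shape (1672,), so it accepts only one index, while the slice uses two. A minimal sketch, assuming made-up values for trainY: slice with a single index, then reshape with the target shape passed as a tuple (in np.reshape(a, -1, 1) the trailing 1 is taken as the order argument, not as part of the shape).

```python
import numpy as np

# Hypothetical 1-D targets with the shape reported in the question.
trainY = np.arange(1672, dtype=float)
index, batch_dim = 0, 1672

# Two indices on a 1-D array raise the IndexError from the question.
try:
    trainY[index:index + batch_dim, :]
    raised = False
except IndexError:
    raised = True

# One index for a 1-D array, then reshape to a column for a single output neuron.
Y = trainY[index:index + batch_dim].reshape(-1, 1)
```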

Sketch_RNN , ValueError: Cannot feed value of shape

I get the following error:
ValueError: Cannot feed value of shape (1, 251, 5) for Tensor u'vector_rnn_1/Placeholder_1:0', which has shape '(1, 117, 5)'
when running code from here
https://github.com/tensorflow/magenta-demos/blob/master/jupyter-notebooks/Sketch_RNN.ipynb
The error occurs in this method:
def encode(input_strokes):
    strokes = to_big_strokes(input_strokes).tolist()
    strokes.insert(0, [0, 0, 1, 0, 0])
    seq_len = [len(input_strokes)]
    draw_strokes(to_normal_strokes(np.array(strokes)))
    return sess.run(eval_model.batch_z, feed_dict={eval_model.input_data: [strokes], eval_model.sequence_lengths: seq_len})[0]
I have to mention I trained my own model following the instructions here:
https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn
Can someone help me understand and solve this issue?
Thanks
Regards
In my case, the problem was caused by the to_big_strokes() function. If you do not modify to_big_strokes() in sketch_rnn/utils.py, it will by default pad the input_strokes sequence to a length of 250.
All you need to do is modify the max_len parameter of that function. Change it to the maximum sequence length of your own dataset, which is 21 for me, as in the line marked with "change" below.
def to_big_strokes(stroke, max_len=21):  # change: 250 -> 21
    """Converts from stroke-3 to stroke-5 format and pads to given length."""
    # (But does not insert special start token).
    result = np.zeros((max_len, 5), dtype=float)
    l = len(stroke)
    assert l <= max_len
    result[0:l, 0:2] = stroke[:, 0:2]
    result[0:l, 3] = stroke[:, 2]
    result[0:l, 2] = 1 - result[0:l, 3]
    result[l:, 4] = 1
    return result
The problem was that the size of the strokes array did not match the array size expected by the algorithm.
Adapting the strokes array fixed the issue.
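To see where the two shapes in the error message come from, here is a small sketch that reuses the padding function quoted above on a made-up sketch and prepends the start token the way encode() does. The value 116 is only inferred from the placeholder shape (1, 117, 5) in the question; the ten-point stroke array is hypothetical.

```python
import numpy as np

def to_big_strokes(stroke, max_len=250):
    """Copy of the padding function quoted above, with the default length."""
    result = np.zeros((max_len, 5), dtype=float)
    l = len(stroke)
    assert l <= max_len
    result[0:l, 0:2] = stroke[:, 0:2]
    result[0:l, 3] = stroke[:, 2]
    result[0:l, 2] = 1 - result[0:l, 3]
    result[l:, 4] = 1
    return result

# Hypothetical stroke-3 sketch with 10 points.
stroke = np.zeros((10, 3))

def fed_shape(max_len):
    """Shape of the array that encode() would feed for a given padding length."""
    strokes = to_big_strokes(stroke, max_len=max_len).tolist()
    strokes.insert(0, [0, 0, 1, 0, 0])  # start token adds one row
    return np.array([strokes]).shape

default_shape = fed_shape(250)  # padding to 250 plus the start token
fitted_shape = fed_shape(116)   # padding to 116 plus the start token
```

The start token adds one row, so the padding length has to be one less than the second dimension of the placeholder.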

Add a constant variable to a cuda.FloatTensor

I have two questions:
1) How can I add/subtract a constant torch.FloatTensor of size 1 to all of the elements of a torch.FloatTensor of size 30?
2) How can I multiply each element of a torch.FloatTensor of size 30 by a random value (different or not for each)?
My code:
import torch
dtype = torch.cuda.FloatTensor
def main():
    pop, xmax, xmin = 30, 5, -5
    x = (xmax-xmin)*torch.rand(pop).type(dtype)+xmin
    y = torch.pow(x, 2)
    [miny, indexmin] = y.min(0)
    gxbest = x[indexmin]
    pxbest = x
    pybest = y
    v = torch.rand(pop)
    vnext = torch.rand()*v + torch.rand()*(pxbest - x) + torch.rand()*(gxbest - x)
main()
What is the best way to do it? I think I should somehow convert gxbest into a torch.FloatTensor of size 30, but how can I do that?
I tried to create a vector:
Variable(torch.from_numpy(np.ones(pop)))*gxbest
But it did not work. The multiplication does not work either.
RuntimeError: inconsistent tensor size
Thank you all for your help!
1) How can I add/subtract a constant torch.FloatTensor of size 1 to all of the elements of a torch.FloatTensor of size 30?
You can do it directly in pytorch 0.2.
import torch
a = torch.randn(30)
b = torch.randn(1)
print(a-b)
If you get an error due to a size mismatch, you can make a small change as follows.
print(a-b.expand(a.size(0))) # to make both a and b tensor of same shape
2) How can I multiply each element of a torch.FloatTensor of size 30 by a random value (different or not for each)?
In pytorch 0.2, you can do it directly as well.
import torch
a = torch.randn(30)
b = torch.randn(1)
print(a*b)
If you get an error due to a size mismatch, do as follows.
print(a*b.expand(a.size(0)))
So, in your case you can simply change the size of gxbest tensor from 1 to 30 as follows.
gxbest = gxbest.expand(30)

numpy concatenate function does not work when handling a large matrix

I was trying to concatenate a 3-by-n 3D coordinate matrix called VTrans with a 1-by-n vector of ones called lr, to augment the coordinate matrix to a 4-by-n homogeneous matrix. In my case n is the number of vertices, 141669, which is pretty big.
The code below does not work, although it does work on a very small dataset.
lr = np.ones(vertexNum).reshape((1, vertexNum))
VtransAppend = np.concatenate((VTrans, lr), axis=0)
Update 2:
Just found the problem: my vertexNum is wrong! It is actually 47223 instead of 141669; 141669 is the size of VTrans. All solutions work and I will accept the first one. Thank you all!
The error says "all the input array dimensions except for the concatenation axis must match exactly".
I further verified that lr and VTrans have the same length by printing their sizes.
print lr.size
print VTrans.size
Has anyone had the same weird problem before and knows how to solve it?
Here is the update:
My VTrans matrix is attached, where vertexNum is 141669.
This is the code following YXD's suggestion, but the issue still exists...
vertexNum = VTrans.size # Total vertex in current model
lr = np.ones(vertexNum)
VtransAppend = np.concatenate((VTrans, lr.reshape(1, -1)), axis=0)
You have to fiddle lr to have the same number of dimensions as vTrans
>>> n = 4
>>> vTrans = np.random.random_sample((3, n))
>>> lr = np.ones(n)
>>> np.concatenate((vTrans, lr.reshape(1, -1)), axis=0)
array([[ 0.65769116, 0.41008341, 0.66046706, 0.86501781],
[ 0.51584699, 0.60601466, 0.93800371, 0.25077702],
[ 0.16696658, 0.41839794, 0.0938594 , 0.48484606],
[ 1. , 1. , 1. , 1. ]])
>>>
i.e. after the reshape, the non-concatenation dimension matches vTrans
>>> lr.shape
(4,)
>>> lr.reshape(1, -1).shape
(1, 4)
>>>
Try vstack instead of concatenate:
a = np.random.random((3,5))
b = np.random.random(5)
np.vstack((a, b))
Alternatively:
np.concatenate((a, b[None,:]))
The None adds an axis to the 1D array b.
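The root cause reported in the question's second update can be sketched as follows: for a 3-by-n array, .size counts every element (3 * n), while the number of vertices is .shape[1]. The random matrix below is a stand-in with the dimensions given in the question, not the attached data.

```python
import numpy as np

# Stand-in for the 3-by-n coordinate matrix, with n = 47223 as in the update.
VTrans = np.random.random_sample((3, 47223))

total_elements = VTrans.size  # 3 * 47223, the value mistaken for the vertex count
vertex_num = VTrans.shape[1]  # number of columns, i.e. the actual vertex count

# A row of ones with a matching second dimension concatenates cleanly.
lr = np.ones((1, vertex_num))
VtransAppend = np.concatenate((VTrans, lr), axis=0)
```

Building lr from VTrans.shape[1] keeps the non-concatenation axis in sync with the data, so the mismatch cannot come back if the model changes.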