How to train a model in which labels are of shape [5, 30]? - tensorflow

How do I train on a dataset in which each label has shape [5, 30]? For example:
[
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 54, 55, 21, 56, 57, 3,
22, 19, 58, 6, 59, 4, 60, 1, 61, 62, 23, 63, 23, 64],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 65, 7, 66, 2, 67, 68, 3, 69, 70],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 11, 12, 5, 13, 14, 9, 10, 5, 15, 16, 17, 2, 8],
[ 0, 0, 0, 0, 0, 2, 71, 1, 72, 73, 74, 7, 75, 76, 77, 3,
20, 78, 18, 79, 1, 21, 80, 81, 3, 82, 83, 84, 6, 85],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2,
86, 87, 3, 88, 89, 1, 90, 91, 22, 92, 93, 4, 6, 94]
]
One way would be to reshape the labels to [150], but that would make the tokenized sentences lose their meaning. Please suggest how to arrange the Keras layers (and which layers to use) so that I can build this model. I want to be able to generate sentences later.
My current code for the model is this:
model = tf.keras.Sequential([
    feature_layer,
    layers.Dense(128, activation='relu'),
    layers.Dense(128, activation='relu'),
    layers.Dropout(.1),
    layers.Dense(5),
    layers.Dense(30, activation='softmax'),
])
opt = Adam(learning_rate=0.01)
model.compile(optimizer=opt, loss='mean_absolute_percentage_error', metrics=['accuracy'])
The actual data:

state     | district            | month    | rainfall | max_temp | min_temp | max_rh | min_rh | wind_speed | advice
----------|---------------------|----------|----------|----------|----------|--------|--------|------------|-------
Orissa    | Kendrapada          | february | 0.0      | 34.6     | 19.4     | 88.2   | 29.6   | 12.0       | chances of foot rot disease in paddy crop; apply urea at 3 weeks after transplanting at active tillering stage for paddy;......
Jharkhand | Saraikela Kharsawan | february | 0        | 35.2     | 16.6     | 29.4   | 11.2   | 3.6        | provide straw mulch and go for intercultural operations to avoid moisture losses from soil; chance of leaf blight disease in potato crop; .......
I need to be able to generate the advice texts.

If you do consider that the output needs to be in this shape (and not flattened), the easiest (and, in my opinion, also the correct) solution is to have a multi-output network, with each output being a layers.Dense(30, activation='softmax').
You would have something like:
def create_model():
    base_model = ...  # stacked Dense units + other layers; you can even create a multi-input, multi-output model if you really want that
    first_output = Dense(30, activation='softmax', name='output_1')(base_model)
    second_output = Dense(30, activation='softmax', name='output_2')(base_model)
    ...
    fifth_output = Dense(30, activation='softmax', name='output_5')(base_model)
    model = Model(inputs=input_layer,
                  outputs=[first_output, second_output, third_output, fourth_output, fifth_output])
    return model
optimizer = tf.keras.optimizers.Adam()
model.compile(optimizer=optimizer,
              loss={'output_1': 'sparse_categorical_crossentropy',
                    'output_2': 'sparse_categorical_crossentropy',
                    'output_3': 'sparse_categorical_crossentropy',
                    'output_4': 'sparse_categorical_crossentropy',
                    'output_5': 'sparse_categorical_crossentropy'},
              # SparseCategoricalAccuracy matches the sparse (integer) labels;
              # plain Accuracy() would not measure classification accuracy here
              metrics={'output_1': tf.keras.metrics.SparseCategoricalAccuracy(),
                       'output_2': tf.keras.metrics.SparseCategoricalAccuracy(),
                       'output_3': tf.keras.metrics.SparseCategoricalAccuracy(),
                       'output_4': tf.keras.metrics.SparseCategoricalAccuracy(),
                       'output_5': tf.keras.metrics.SparseCategoricalAccuracy()})
model.fit(X, y,
          epochs=100, batch_size=10, validation_data=(val_X, val_y))
Here, note that y (both for train and validation) is an array of length 5 (the number of outputs), and each element has length 30.
Again, ensure that you actually need such a configuration; I posted this answer as a demonstration of multi-output labels in TensorFlow and Keras and for the benefit of others, but I am not 100% sure you actually need this exact configuration (perhaps you can opt for something simpler).
Note the use of sparse_categorical_crossentropy, since your labels are not one-hot encoded (also, MAPE is a regression loss, not a classification loss).
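For completeness, here is a minimal, self-contained sketch of this multi-output pattern (my own illustration, not the asker's pipeline; the input width, class count, and random data are assumptions). Note that with sparse_categorical_crossentropy each head predicts one integer class per sample, so y is a list of 5 vectors of shape (n_samples,):

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

n_samples, n_features, n_classes = 256, 9, 30  # assumed sizes for the sketch

# Shared base, then five independent softmax heads
inputs = layers.Input(shape=(n_features,))
x = layers.Dense(128, activation='relu')(inputs)
x = layers.Dense(128, activation='relu')(x)
outputs = [layers.Dense(n_classes, activation='softmax', name=f'output_{i+1}')(x)
           for i in range(5)]
model = Model(inputs=inputs, outputs=outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss={f'output_{i+1}': 'sparse_categorical_crossentropy'
                    for i in range(5)},
              metrics={f'output_{i+1}': tf.keras.metrics.SparseCategoricalAccuracy()
                       for i in range(5)})

# Dummy data: one integer label in [0, n_classes) per head, per sample
X = np.random.rand(n_samples, n_features).astype('float32')
y = [np.random.randint(0, n_classes, size=(n_samples,)) for _ in range(5)]
model.fit(X, y, epochs=2, batch_size=32, verbose=0)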

Related

How to shape input for Ragged tensor with LSTM

I'm getting the following error when I try to pass multiple ragged tensors to my model:
ValueError: Layer "sequential_3" expects 1 input(s), but it received 2 input tensors
I suspect it has something to do with the "shape" argument in the Input layer of the model.
Yes, I've reviewed the ragged tensor documentation.
Yes, I've spent many hours scouring the Google-net.
Yes, I've browsed stackoverflow.com related articles.
Yes, I'm most likely noobing it up right now and there is a simple solution.
Below is a reproducible example
(tensorflow 2.7; keras 2.7; python 3.7)
dependencies
import tensorflow as tf
import pandas as pd
import keras
import keras.layers
Dummy Dataset
d0 = pd.DataFrame(data={
    "id":[
        1,
        2, 2,
        3, 3, 3,
        4, 4, 4, 4,
        5, 5, 5, 5, 5,
        6, 6, 6, 6, 6, 6
    ],
    "date":[
        pd.to_datetime("2008-03-31"),
        pd.to_datetime("2008-03-31"), pd.to_datetime("2008-06-30"),
        pd.to_datetime("2008-03-31"), pd.to_datetime("2008-06-30"), pd.to_datetime("2008-09-30"),
        pd.to_datetime("2008-03-31"), pd.to_datetime("2008-06-30"), pd.to_datetime("2008-09-30"), pd.to_datetime("2008-12-31"),
        pd.to_datetime("2008-03-31"), pd.to_datetime("2008-06-30"), pd.to_datetime("2008-09-30"), pd.to_datetime("2008-12-31"), pd.to_datetime("2009-03-31"),
        pd.to_datetime("2008-03-31"), pd.to_datetime("2008-06-30"), pd.to_datetime("2008-09-30"), pd.to_datetime("2008-12-31"), pd.to_datetime("2009-03-31"), pd.to_datetime("2009-06-30")
    ],
    "date2":[
        1,
        1, 2,
        1, 2, 3,
        1, 2, 3, 4,
        1, 2, 3, 4, 5,
        1, 2, 3, 4, 5, 6
    ],
    "input":[
        10,
        11, 12,
        13, 14, 15,
        16, 17, 18, 19,
        20, 21, 22, 23, 24,
        25, 26, 27, 28, 29, 30
    ],
    "target":[
        60,
        61, 62,
        63, 64, 65,
        66, 67, 68, 69,
        70, 71, 72, 73, 74,
        75, 76, 77, 78, 79, 80
    ],
})
inputs
inputs = []
inputs.append(tf.ragged.constant(d0.groupby("id")["input"].apply(list)))
inputs.append(tf.ragged.constant(d0.groupby("id")["date2"].apply(list)))
target
target = tf.ragged.constant(d0.groupby("id")["target"].apply(list))
model
mod1 = keras.Sequential([
    keras.layers.Input(shape=(None, 4), dtype=tf.float32, batch_size=1, ragged=True),
    keras.layers.LSTM(units=32, dtype=tf.float32, return_sequences=True, use_bias=False),
    keras.layers.Dense(units=32),
    keras.layers.Dense(units=1)
])
compile
mod1.compile(
    optimizer="adam",
    loss="mse",
    metrics=["accuracy"]
)
fit
history = mod1.fit(
    x=inputs,
    y=target,
    epochs=1
)
The error occurs during the fit step.
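A Sequential model accepts exactly one input, which is consistent with the "expects 1 input(s), but it received 2" message. Below is a minimal sketch of one possible fix (my assumption, not a verified answer): merge the two per-timestep features into a single ragged tensor of shape (6, None, 2) and declare a single ragged Input with a feature dimension of 2:

# Sketch: combine "input" and "date2" into one ragged tensor per id,
# keeping the innermost (feature) dimension dense via ragged_rank=1
merged = tf.ragged.constant(
    d0.groupby("id")
      .apply(lambda g: g[["input", "date2"]].values.tolist())
      .to_list(),
    ragged_rank=1,
    dtype=tf.float32
)

mod2 = keras.Sequential([
    keras.layers.Input(shape=(None, 2), dtype=tf.float32, ragged=True),
    keras.layers.LSTM(units=32, return_sequences=True),
    keras.layers.Dense(units=1)
])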

Fastest way to do vectorized reduce product with boolean mask

I have a 3D numpy array A and 2D numpy boolean mask B.
The first two dimensions of A match those of B.
I'm wondering if there is any fast way to, for each index along the first dimension of A, select entries along the second dimension based on B and perform a reduced product over that second dimension.
My expected output C would be a 2D numpy array whose first dimension is the first dimension of A and whose second dimension is the third dimension of A.
My current solution is C = np.prod(A*np.repeat(B[...,np.newaxis], A.shape[-1], 2), 1)
Is there any better alternative?
With a concrete example:
In [364]: A=np.arange(1,25).reshape(2,3,4); B=np.arange(1,7).reshape(2,3)
In [365]: C = np.prod(A*np.repeat(B[...,np.newaxis], A.shape[-1], 2), 1)
That repeat does:
In [366]: np.repeat(B[...,np.newaxis], A.shape[-1], 2)
Out[366]:
array([[[1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3]],

       [[4, 4, 4, 4],
        [5, 5, 5, 5],
        [6, 6, 6, 6]]])
In [367]: _.shape
Out[367]: (2, 3, 4)
In [368]: A*np.repeat(B[...,np.newaxis], A.shape[-1], 2)
Out[368]:
array([[[  1,   2,   3,   4],
        [ 10,  12,  14,  16],
        [ 27,  30,  33,  36]],

       [[ 52,  56,  60,  64],
        [ 85,  90,  95, 100],
        [126, 132, 138, 144]]])
But by the broadcasting rules, the repeat is not needed:
In [369]: A*B[...,np.newaxis]
Out[369]:
array([[[  1,   2,   3,   4],
        [ 10,  12,  14,  16],
        [ 27,  30,  33,  36]],

       [[ 52,  56,  60,  64],
        [ 85,  90,  95, 100],
        [126, 132, 138, 144]]])
In [371]: np.prod(_369, axis=1)
Out[371]:
array([[   270,    720,   1386,   2304],
       [556920, 665280, 786600, 921600]])
You could apply prod to A and B individually, but I don't know if that makes much of a difference:
In [373]: np.prod(A,1)*np.prod(B,1)[:,None]
Out[373]:
array([[   270,    720,   1386,   2304],
       [556920, 665280, 786600, 921600]])
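One caveat worth noting: B in the question is described as a boolean mask, and multiplying by it zeroes out the masked entries, which makes the whole product 0 whenever any entry along axis 1 is masked. A short sketch (my assumption about the likely intent) of a masked reduce product in which excluded entries contribute the multiplicative identity 1 instead:

import numpy as np

A = np.arange(1, 25).reshape(2, 3, 4)
B = np.array([[True, False, True],
              [True, True, False]])

# Masked-out slices contribute 1 (the identity for products), not 0
C = np.prod(np.where(B[..., np.newaxis], A, 1), axis=1)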

LSTM encoder decoder model training errors: ValueError

Resources
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
print(tf.version.VERSION)
print(keras.__version__)
#2.5.0
#2.5.0
LSTM Encoder Decoder Model with Attention
# batch_size, latent_dim and l_rate are used below but were not defined in the
# question; example values are assumed here so the snippet runs
batch_size = 128
latent_dim = 128
l_rate = 0.001

n_features = 129
type_max = 3
n_padded_in = 10
n_padded_out = 10

input_item = layers.Input(batch_input_shape=[None, n_padded_in],
                          name="item_input",
                          dtype=tf.int64)
input_type = layers.Input(batch_input_shape=[None, n_padded_in],
                          name="type_input",
                          dtype=tf.int64)
encoding_padding_mask = tf.math.logical_not(tf.math.equal(input_item, 0))

embedding_item = layers.Embedding(input_dim=n_features,
                                  output_dim=batch_size,
                                  name="item_embedding")(input_item)
embedding_type = layers.Embedding(input_dim=type_max+1,
                                  output_dim=batch_size,
                                  name="rec_embedding")(input_type)

concat_inputs = layers.Concatenate(name="concat_inputs")(
    [embedding_item, embedding_type])
concat_inputs = tf.keras.layers.BatchNormalization(
    name="batchnorm_inputs")(concat_inputs)

encoder_lstm = layers.LSTM(units=latent_dim,
                           return_state=True,
                           name="lstm_encoder")
encoder_output, hidden, cell = encoder_lstm(concat_inputs)
states = [hidden, cell]
decoder_output = hidden

decoder_lstm = layers.LSTM(units=latent_dim,
                           return_state=True,
                           name="lstm_decoder")
output_dense = layers.Dense(n_features, name="output")
att = layers.Attention(use_scale=False,
                       causal=True,
                       name="attention")

inputs = np.zeros((batch_size, 1, n_features))
all_outputs = []
for _ in range(n_padded_out):
    context_vector = att([decoder_output, encoder_output])
    context_vector = tf.expand_dims(context_vector, 1)
    inputs = tf.cast(inputs, tf.float32)
    inputs = tf.concat([context_vector, inputs], axis=-1)
    decoder_output, state_h, state_c = decoder_lstm(inputs, initial_state=states)
    output = output_dense(decoder_output)
    output = tf.expand_dims(output, 1)
    all_outputs.append(output)
    inputs = output
    states = [state_h, state_c]

all_outputs = layers.Lambda(lambda x: tf.concat(x, axis=1))(all_outputs)
type_encoder_model = keras.Model([input_item, input_type],
                                 all_outputs,
                                 name="type_encoder_model")
type_encoder_model.compile(loss=keras.losses.SparseCategoricalCrossentropy(),
                           optimizer=keras.optimizers.Adam(learning_rate=l_rate),
                           metrics=["sparse_categorical_accuracy"])
type_encoder_model.summary()
Data Preparation
#second input as sequence
type_seq_padded = keras.preprocessing.sequence.pad_sequences(
    data["product_type"].to_list(),
    maxlen=n_padded_in,
    padding="pre",
    value=0.0
)
#first input sequence
input_seq_padded = keras.preprocessing.sequence.pad_sequences(
    data["input_seq"].to_list(),
    maxlen=n_padded_in,
    padding="pre",
    value=0.0
)
#output sequence
output_seq_padded = keras.preprocessing.sequence.pad_sequences(
    data["output_seq"].to_list(),
    maxlen=n_padded_out,
    padding="pre",
    value=0.0
)
Data Samples
type_seq_padded
array([[0, 0, 0, ..., 1, 1, 1],
       [0, 0, 0, ..., 2, 3, 3],
       [0, 0, 0, ..., 3, 3, 3],
       ...,
       [0, 0, 0, ..., 1, 3, 3],
       [0, 0, 0, ..., 3, 3, 3],
       [0, 0, 0, ..., 3, 3, 3]], dtype=int32)
input_seq_padded
array([[  0,   0,   0, ..., 101,  58, 123],
       [  0,   0,   0, ...,  79,  95,  87],
       [  0,   0,   0, ...,  98, 109, 123],
       ...,
       [  0,   0,   0, ..., 123, 109,  98],
       [  0,   0,   0, ..., 109,  98, 123],
       [  0,   0,   0, ...,  95, 123,  95]], dtype=int32)
output_seq_padded
array([[  0,   0,   0, ...,  58, 123,  43],
       [  0,   0,   0, ...,  95,  87, 123],
       [  0,   0,   0, ..., 109, 123,  10],
       ...,
       [  0,   0,   0, ..., 109,  98, 123],
       [  0,   0,   0, ...,  98, 123,  43],
       [  0,   0,   0, ..., 123,  95,  95]], dtype=int32)
My LSTM encoder-decoder model takes 2 input sequences (items and item types) and 1 output sequence (items). The last dense layer calculates the purchase probability of each of 129 different items as the next item to be purchased. The model is trained with the code below:
hist = type_encoder_model.fit([input_seq_padded[:64000],
                               type_seq_padded[:64000]],
                              output_seq_padded[:64000],
                              epochs=1,
                              batch_size=128,
                              verbose=1)
And when I attempt to use the model for prediction with the code below:
y_pred = base_model_X.predict([input_seq_padded_test,
                               type_seq_padded_test])
Test Samples
type_seq_padded_test
array([[0, 0, 0, ..., 2, 3, 3],
       [0, 0, 0, ..., 3, 2, 2],
       [0, 0, 0, ..., 3, 3, 3],
       ...,
       [0, 0, 0, ..., 3, 2, 1],
       [0, 0, 0, ..., 3, 2, 3],
       [0, 0, 0, ..., 2, 3, 3]], dtype=int32)
input_seq_padded_test
array([[ 0,  0,  0, ..., 31, 10, 13],
       [ 0,  0,  0, ...,  9,  6,  6],
       [ 0,  0,  0, ..., 13, 13,  9],
       ...,
       [ 0,  0,  0, ..., 10, 51, 18],
       [ 0,  0,  0, ..., 12, 44, 12],
       [ 0,  0,  0, ...,  6,  9, 11]], dtype=int32)
I get an error like the one below:
ValueError: in user code:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1569 predict_function *
return step_function(self, iterator)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1559 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:1285 run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2833 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3608 _call_for_each_replica
return fn(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1552 run_step **
outputs = model.predict_step(data)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:1525 predict_step
return self(x, training=False)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py:1030 __call__
outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/functional.py:421 call
inputs, training=training, mask=mask)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/functional.py:556 _run_internal_graph
outputs = node.layer(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py:1030 __call__
outputs = call_fn(inputs, *args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/core.py:1363 _call_wrapper
return self._call_wrapper(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/core.py:1395 _call_wrapper
result = self.function(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
return target(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py:1768 concat
return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_array_ops.py:1228 concat_v2
"ConcatV2", values=values, axis=axis, name=name)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
attrs=attr_protos, op_def=op_def)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py:601 _create_op_internal
compute_device)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:3565 _create_op_internal
op_def=op_def)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:2042 __init__
control_input_ops, op_def)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:1883 _create_c_op
raise ValueError(str(e))
ValueError: Dimension 0 in both shapes must be equal, but are 32 and 128. Shapes are [32,1] and [128,1]. for '{{node base_model_X/tf.concat_40/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](base_model_X/tf.expand_dims_80/ExpandDims, base_model_X/148834, base_model_X/tf.concat_40/concat/axis)' with input shapes: [32,1,256], [128,1,137], [] and with computed input tensors: input[2] = <-1>.
Now I am looking for a solution to the error above. I don't know what the folks here think, but I am really starting to hate encoder-decoder LSTMs. Thanks in advance for your ideas or recommendations for a different model configuration.
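One plausible reading of the traceback, with a sketch of a fix under that assumption (not a confirmed answer): inputs = np.zeros((batch_size, 1, n_features)) bakes the training batch size (128) into the graph, so the final predict batch of 32 samples cannot be concatenated with it. Deriving the initial decoder input from the runtime batch dimension avoids the fixed shape:

# Sketch: build the initial decoder input from the dynamic batch size of the
# encoder output instead of a fixed (batch_size, 1, n_features) numpy array,
# so predict() works for any batch size, including the last partial batch
inputs = layers.Lambda(
    lambda enc: tf.zeros((tf.shape(enc)[0], 1, n_features), dtype=tf.float32)
)(encoder_output)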

bilstm and attention to find the topic representation of text

#Preprocessing of data
# (imports implied by the snippet; embedding_matrix, read_data and
# get_max_seq_len are the asker's own helpers, defined elsewhere)
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

df = pd.read_csv("small_quac.csv")
df = df.drop(['Unnamed: 0'], axis = 1)
shared_topic, section_title, for_tokenize = read_data(df)
# Define x_train and x_test
x_train = np.asarray(shared_topic)
y_train = np.asarray(section_title)
# Find max_seq_len
max_seq_len_x = get_max_seq_len(x_train, remove_stopwords=False)
max_seq_len_y = get_max_seq_len(y_train, remove_stopwords=False)
max_seq_len = max(max_seq_len_x, max_seq_len_y)
tokenizer = Tokenizer(filters='\n')
tokenizer.fit_on_texts(for_tokenize)
vocab_size = len(tokenizer.word_index) + 1
X = tokenizer.texts_to_sequences(x_train)
y = tokenizer.texts_to_sequences(y_train)
# print(X[0])
word2idx = tokenizer.word_index
idx2word = tokenizer.index_word
fdist = tokenizer.word_counts
X = pad_sequences(X, maxlen=max_seq_len_x, padding='post')
y = pad_sequences(y, maxlen=max_seq_len_y, padding='post')
# from here modelling starts
rnn_cell_size = 128
max_seq_len_y = 14
max_seq_len_x = 139
class Attention(tf.keras.Model):
    def __init__(self, units):
        super(Attention, self).__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, features, hidden):
        hidden_with_time_axis = tf.expand_dims(hidden, 1)
        score = tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time_axis))
        attention_weights = tf.nn.softmax(self.V(score), axis=1)
        context_vector = attention_weights * features
        context_vector = tf.reduce_sum(context_vector, axis=1)
        return context_vector, attention_weights
sequence_input = tf.keras.layers.Input(shape=(max_seq_len_x,))
embedded_sequences = tf.keras.layers.Embedding(
    vocab_size, 300, weights=[embedding_matrix],
    trainable=False, mask_zero=True,
    name='Encoder-Word-Embedding')(sequence_input)
lstm = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(rnn_cell_size,
                         dropout=0.3,
                         return_sequences=True,
                         return_state=True,
                         recurrent_activation='relu',
                         recurrent_initializer='glorot_uniform'),
    name="bi_lstm_0")(embedded_sequences)
lstm, forward_h, forward_c, backward_h, backward_c = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(rnn_cell_size,
                         dropout=0.2,
                         return_sequences=True,
                         return_state=True,
                         recurrent_activation='relu',
                         recurrent_initializer='glorot_uniform'))(lstm)
state_h = tf.keras.layers.Concatenate()([forward_h, backward_h])
state_c = tf.keras.layers.Concatenate()([forward_c, backward_c])
context_vector, attention_weights = Attention(32)(lstm, state_h)
output = keras.layers.Dense(max_seq_len_y, activation='softmax')(context_vector)
model = keras.Model(inputs=sequence_input, outputs=output)
# summarize layers
print(model.summary())
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(x=X, y=y, epochs=30)
Also, I am using GloVe embeddings with 300 dimensions.
Here my X is a matrix of shape (59, 139), where 59 is the number of samples and 139 is the length of the longest sentence in my text rows. These 139 values are filled with the word2idx indices of my vocabulary.
y is a matrix of shape (59, 14), where 59 is the same as above and 14 is the length of my longest title, likewise filled with word2idx indices of the vocabulary.
For example, I want this:
Input:
array([293,  40, 294, 129,  75, 130, 129, 131, 295, 296, 132, 297, 298,
         2, 299,  34,  12,  76, 300,  27, 301,  15,   1, 302, 133,   4,
        77, 303,   3, 134, 304,  78,  34, 305,  11, 306, 307,   4,   1,
       132, 135,  22,  10, 308,  11, 136,   4,   1, 309,  50,   4, 310,
        11,  78, 311, 312,   3,  77,   1, 313, 130,  10, 137,  11,  12,
       109,   7, 314, 315,   7,   1,  76, 316,   4, 317, 318,  34, 138,
       319, 139, 320,   3,  77, 321,  79, 322,   4,   1, 323, 324,   4,
         1, 325,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0])
Output:
array([1040, 1041, 2, 1042, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Please help me out; I have spent so many days trying to find the right approach, but I am unable to find it.
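As a sketch of where the mismatch may lie (my own illustration, not the asker's code): a single Dense(max_seq_len_y, activation='softmax') over the context vector produces one distribution across 14 positions, while generating a 14-token title needs a distribution over the vocabulary at each of the 14 timesteps. One common pattern, assuming y holds integer word indices of shape (59, 14), repeats the context vector and applies a per-timestep softmax:

# Hypothetical decoder head replacing the final Dense layer above
decoder_in = tf.keras.layers.RepeatVector(max_seq_len_y)(context_vector)
decoder_seq = tf.keras.layers.LSTM(rnn_cell_size, return_sequences=True)(decoder_in)
output = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(vocab_size, activation='softmax'))(decoder_seq)
model = tf.keras.Model(inputs=sequence_input, outputs=output)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # matches integer labels
              metrics=['accuracy'])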

2D Numpy Array Boolean Slicing [duplicate]

I've got a strange situation.
I have a 2D Numpy array, x:
x = np.random.random_integers(0,5,(20,8))
And I have 2 indexers: one with indices for the rows, and one with indices for the columns. In order to index x, I am having to do the following:
row_indices = [4,2,18,16,7,19,4]
col_indices = [1,2]
x_rows = x[row_indices,:]
x_indexed = x_rows[:,col_indices]
Instead of just:
x_new = x[row_indices,col_indices]
(which fails with: error, cannot broadcast (7,) with (2,))
I'd like to be able to do the indexing in one line using broadcasting, since that would keep the code clean and readable...also, I don't know all that much about Python under the hood, but as I understand it, doing it in one line should be faster (and I'll be working with pretty big arrays).
Test Case:
x = np.random.random_integers(0,5,(20,8))
row_indices = [4,2,18,16,7,19,4]
col_indices = [1,2]
x_rows = x[row_indices,:]
x_indexed = x_rows[:,col_indices]
x_doesnt_work = x[row_indices,col_indices]
Selections or assignments with np.ix_ using indexing or boolean arrays/masks
1. With indexing-arrays
A. Selection
We can use np.ix_ to get a tuple of indexing arrays that are broadcastable against each other, resulting in a higher-dimensional combination of indices. So, when that tuple is used to index into the input array, it gives us the same higher-dimensional array. Hence, to make a selection based on two 1D indexing arrays, it would be -
x_indexed = x[np.ix_(row_indices,col_indices)]
B. Assignment
We can use the same notation for assigning a scalar or a broadcastable array into those indexed positions. Hence, the following works for assignments -
x[np.ix_(row_indices,col_indices)] = # scalar or broadcastable array
2. With masks
We can also use boolean arrays/masks with np.ix_, similar to how indexing arrays are used. This can be used again to select a block off the input array and also for assignments into it.
A. Selection
Thus, with row_mask and col_mask boolean arrays as the masks for row and column selections respectively, we can use the following for selections -
x[np.ix_(row_mask,col_mask)]
B. Assignment
And the following works for assignments -
x[np.ix_(row_mask,col_mask)] = # scalar or broadcastable array
Sample Runs
1. Using np.ix_ with indexing-arrays
Input array and indexing arrays -
In [221]: x
Out[221]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88, 92, 46, 67, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, 76, 56, 72, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])
In [222]: row_indices
Out[222]: [4, 2, 5, 4, 1]
In [223]: col_indices
Out[223]: [1, 2]
Tuple of indexing arrays with np.ix_ -
In [224]: np.ix_(row_indices,col_indices) # Broadcasting of indices
Out[224]:
(array([[4],
        [2],
        [5],
        [4],
        [1]]), array([[1, 2]]))
Make selections -
In [225]: x[np.ix_(row_indices,col_indices)]
Out[225]:
array([[76, 56],
       [70, 47],
       [46, 95],
       [76, 56],
       [92, 46]])
As suggested by the OP, this is in effect the same as performing old-school broadcasting with a 2D array version of row_indices that has its elements/indices sent to axis=0, thus creating a singleton dimension at axis=1 and thereby allowing broadcasting with col_indices. So we would have an alternative solution like so -
In [227]: x[np.asarray(row_indices)[:,None],col_indices]
Out[227]:
array([[76, 56],
       [70, 47],
       [46, 95],
       [76, 56],
       [92, 46]])
As discussed earlier, assignments work the same way; we simply index with np.ix_ and assign.
Row, col indexing arrays -
In [36]: row_indices = [1, 4]
In [37]: col_indices = [1, 3]
Make assignments with scalar -
In [38]: x[np.ix_(row_indices,col_indices)] = -1
In [39]: x
Out[39]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88, -1, 46, -1, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, -1, 56, -1, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])
Make assignments with 2D block(broadcastable array) -
In [40]: rand_arr = -np.arange(4).reshape(2,2)
In [41]: x[np.ix_(row_indices,col_indices)] = rand_arr
In [42]: x
Out[42]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88,  0, 46, -1, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, -2, 56, -3, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])
2. Using np.ix_ with masks
Input array -
In [19]: x
Out[19]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [88, 92, 46, 67, 44, 81, 17, 67],
       [31, 70, 47, 90, 52, 15, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, 76, 56, 72, 43, 79, 53, 37],
       [74, 46, 95, 27, 81, 97, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])
Input row, col masks -
In [20]: row_mask = np.array([0,1,1,0,0,1,0],dtype=bool)
In [21]: col_mask = np.array([1,0,1,0,1,1,0,0],dtype=bool)
Make selections -
In [22]: x[np.ix_(row_mask,col_mask)]
Out[22]:
array([[88, 46, 44, 81],
       [31, 47, 52, 15],
       [74, 95, 81, 97]])
Make assignments with scalar -
In [23]: x[np.ix_(row_mask,col_mask)] = -1
In [24]: x
Out[24]:
array([[17, 39, 88, 14, 73, 58, 17, 78],
       [-1, 92, -1, 67, -1, -1, 17, 67],
       [-1, 70, -1, 90, -1, -1, 24, 22],
       [19, 59, 98, 19, 52, 95, 88, 65],
       [85, 76, 56, 72, 43, 79, 53, 37],
       [-1, 46, -1, 27, -1, -1, 93, 69],
       [49, 46, 12, 83, 15, 63, 20, 79]])
Make assignments with 2D block(broadcastable array) -
In [25]: rand_arr = -np.arange(12).reshape(3,4)
In [26]: x[np.ix_(row_mask,col_mask)] = rand_arr
In [27]: x
Out[27]:
array([[ 17,  39,  88,  14,  73,  58,  17,  78],
       [  0,  92,  -1,  67,  -2,  -3,  17,  67],
       [ -4,  70,  -5,  90,  -6,  -7,  24,  22],
       [ 19,  59,  98,  19,  52,  95,  88,  65],
       [ 85,  76,  56,  72,  43,  79,  53,  37],
       [ -8,  46,  -9,  27, -10, -11,  93,  69],
       [ 49,  46,  12,  83,  15,  63,  20,  79]])
What about:
x[row_indices][:,col_indices]
For example,
x = np.random.random_integers(0,5,(5,5))
## array([[4, 3, 2, 5, 0],
## [0, 3, 1, 4, 2],
## [4, 2, 0, 0, 3],
## [4, 5, 5, 5, 0],
## [1, 1, 5, 0, 2]])
row_indices = [4,2]
col_indices = [1,2]
x[row_indices][:,col_indices]
## array([[1, 5],
## [2, 0]])
import numpy as np
x = np.random.random_integers(0,5,(4,4))
x
array([[5, 3, 3, 2],
       [4, 3, 0, 0],
       [1, 4, 5, 3],
       [0, 4, 3, 4]])
# This indexes the elements 1,1 and 2,2 and 3,3
indexes = (np.array([1,2,3]),np.array([1,2,3]))
x[indexes]
# returns array([3, 5, 4])
Notice that numpy has very different rules depending on what kind of indexes you use. Indexing several elements should be done with a tuple of np.ndarray (see the indexing manual).
So you only need to convert your lists to np.ndarray and it should work as expected.
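That said, for the block selection in the question, converting the lists to arrays is not enough on its own; the two index arrays must also broadcast against each other. A short sketch restating the np.newaxis approach with explicit arrays:

import numpy as np

x = np.random.randint(0, 6, (20, 8))
rows = np.array([4, 2, 18, 16, 7, 19, 4])
cols = np.array([1, 2])

# (7, 1) row indices broadcast against (2,) column indices -> a (7, 2) block
block = x[rows[:, np.newaxis], cols]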
I think you are trying to do one of the following (equivalent) operations:
x_does_work = x[row_indices,:][:,col_indices]
x_does_work = x[:,col_indices][row_indices,:]
This will actually create a subset of x with only the selected rows, then select the columns from that, or vice versa in the second case. The first case can be thought of as
x_does_work = (x[row_indices,:])[:,col_indices]
Your first try would work if you write it with np.newaxis (converting the row list to an array first):
x_new = x[np.asarray(row_indices)[:, np.newaxis], col_indices]