Expected to see 2 array(s), but instead got the following list of 1 arrays - tensorflow

I want to create a model that receives one image and produces two softmax outputs. The code is:
base_model = InceptionV3(include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
# first softmax
x_1 = Dense(1024, activation='relu')(x)
predictions_1 = Dense(4, activation='softmax')(x_1)
# second Softmax
x_2 = Dense(1024, activation='relu')(x)
predictions_2 = Dense(4, activation='softmax')(x_2)
my_model = Model(inputs=base_model.input, outputs=[predictions_1,predictions_2])
# train
When training, I got this error:
ValueError: Expected to see 2 array(s), but instead got the following
list of 1 arrays: [array([[0., 0., 0., 1.], [0., 0., 0., 1.], [0., 0.,
0., 1.], [0., 1., 0., 0.], [1., 0., 0., 0.], [0., 1., 0., 0.], [0., 1., 0., 0.], [0., 1., 0., 0.],...
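The error message says Keras expected one target array per output but received a single array. Here is a minimal sketch of passing a list of two label arrays. It assumes a tiny stand-in model: the Dense layers, the names head_1/head_2, and the random data are placeholders for illustration, not the InceptionV3 setup from the question.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the two-headed model (hypothetical sizes and layer names).
inputs = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(16, activation="relu")(inputs)
out_1 = tf.keras.layers.Dense(4, activation="softmax", name="head_1")(x)
out_2 = tf.keras.layers.Dense(4, activation="softmax", name="head_2")(x)
model = tf.keras.Model(inputs=inputs, outputs=[out_1, out_2])

# One loss per output, in the same order as the outputs list.
model.compile(
    optimizer="adam",
    loss=["categorical_crossentropy", "categorical_crossentropy"],
)

# Random placeholder data: one feature array, two one-hot label arrays.
rng = np.random.default_rng(0)
features = rng.random((32, 8)).astype("float32")
labels_1 = tf.keras.utils.to_categorical(rng.integers(0, 4, size=32), num_classes=4)
labels_2 = tf.keras.utils.to_categorical(rng.integers(0, 4, size=32), num_classes=4)

# The targets are a list with one array per output.
model.fit(features, [labels_1, labels_2], epochs=1, batch_size=8, verbose=0)
preds = model.predict(features, verbose=0)
```

With a generator or dataset, the same idea would apply: each batch would need to yield the two label arrays together rather than one.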


Tensorflow: Reshape a 1-D tensor with given numbers of elements in each row

I want to convert a 1-D tensor to a 2-D tensor in TensorFlow 2, where the number of elements in each row is given. I found a solution using RaggedTensor, like this:
numbers = tf.constant([1,3,2,5,4])
c0 = tf.ones(shape=(15,)) # the tensor that needs to be reshaped
rc0 = tf.RaggedTensor.from_row_lengths(values=c0, row_lengths=numbers)
c1 = rc0.to_tensor()
The final value of c1 should be
<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[1., 0., 0., 0., 0.],
[1., 1., 1., 0., 0.],
[1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 0.]], dtype=float32)>
I found that it fails when the input tensor is large, and its performance is not good enough.
Is there a higher-performance solution?
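One alternative worth benchmarking is to build a boolean mask from the row lengths and scatter the values into it. This is only a sketch: I have not measured it against the RaggedTensor version, so whether it is faster for your sizes is an assumption to verify.

```python
import tensorflow as tf

numbers = tf.constant([1, 3, 2, 5, 4])
c0 = tf.ones(shape=(15,))

# True for the first numbers[i] positions of row i; width is max(numbers).
mask = tf.sequence_mask(numbers)
# Row-major coordinates of the True cells, one per value in c0.
indices = tf.where(mask)
# Place each value at its coordinate; untouched cells stay 0.
c1 = tf.scatter_nd(indices, c0, tf.shape(mask, out_type=tf.int64))
```

Because tf.where returns the coordinates in row-major order, the values of c0 fill each row from left to right, which matches the padded result of to_tensor().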

One-hot encode labels in keras

I have a set of integers from a label column in a CSV file - [1,2,4,3,5,2,..]. The number of classes is 5, i.e. the labels range from 1 to 5. I want to one-hot encode them using the code below.
y = df.iloc[:,10].values
y = tf.keras.utils.to_categorical(y, num_classes = 5)
But this code gives me an error
IndexError: index 5 is out of bounds for axis 1 with size 5
How can I fix this?
If you use tf.keras.utils.to_categorical to one-hot encode the label vector, the integers must lie in the range 0 to num_classes - 1. In your case, shift the labels down by one:
import tensorflow as tf
import numpy as np
a = np.array([1,2,4,3,5,2,4,2,1])
y_tf = tf.keras.utils.to_categorical(a-1, num_classes = 5)
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0.]], dtype=float32)
Or you can use pd.get_dummies:
import pandas as pd
import numpy as np
a = np.array([1,2,4,3,5,2,4,2,1])
a_pd = pd.get_dummies(a).astype('float32').values
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0.]], dtype=float32)
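For comparison, the same shift-by-one idea can be sketched with NumPy alone by indexing the rows of an identity matrix. The variable names here are just for illustration.

```python
import numpy as np

labels = np.array([1, 2, 4, 3, 5, 2, 4, 2, 1])
num_classes = 5

# Shift labels to start at 0, then pick the matching rows of the identity matrix.
one_hot = np.eye(num_classes, dtype="float32")[labels - 1]
```

Unlike pd.get_dummies, this always produces num_classes columns, even when some class does not occur in the data.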

Implementing BandRNN with pytorch and tensorflow

So I am trying to figure out how to train my matrix in a way that I will get a BandRNN.
BandRNN is a diagonalRNN model with a different number of connections per neuron.
For example:
C is the number of connections per neuron.
I found out that there is a way to turn off some of the gradients in a for loop, in a way that prevents them from being trained as follows:
for p in model.input.parameters():
    p.requires_grad = False
But I can't find a proper way to do so, in a way that will make my matrix become a BandRNN.
Hopefully, someone will be able to help me with this issue.
As far as I know you can only activate/deactivate requires_grad on a tensor, and not on distinct components of that tensor. Instead what you could do is zero out the values outside the band.
First create a mask for the band, you could use torch.ones with torch.diagflat:
>>> torch.diagflat(torch.ones(5), offset=1)
By setting the right dimension for torch.ones as well as the right offset you can generate offset diagonal matrices with consistent shapes.
>>> N = 5; i = -1
>>> torch.diagflat(torch.ones(N-abs(i)), offset=i)
tensor([[0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.]])
>>> N = 5; i = 0
>>> torch.diagflat(torch.ones(N-abs(i)), offset=i)
tensor([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
>>> N = 5; i = 1
>>> torch.diagflat(torch.ones(N-abs(i)), offset=i)
tensor([[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0.]])
You get the point: summing these matrices element-wise allows us to get a mask:
>>> N = 5; b = 3
>>> mask = sum(torch.diagflat(torch.ones(N-abs(i)), i) for i in range(-b//2,b//2+1))
>>> mask
tensor([[1., 1., 0., 0., 0.],
[1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0.],
[0., 1., 1., 1., 1.],
[0., 0., 1., 1., 1.]])
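The same mask can also be built without summing diagonals, by combining torch.triu and torch.tril. This is a sketch; band_mask and its lower/upper parameters are hypothetical names for the number of diagonals kept below and above the main one.

```python
import torch

def band_mask(n, lower, upper):
    """Return an n x n mask with ones on diagonals -lower..upper, zeros elsewhere."""
    ones = torch.ones(n, n)
    # triu drops everything below diagonal -lower; tril drops everything above diagonal upper.
    return torch.tril(torch.triu(ones, diagonal=-lower), diagonal=upper)

# Two diagonals below and one above the main diagonal, as in the 5 x 5 mask above.
mask = band_mask(5, lower=2, upper=1)
```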
Then you can zero out the values outside the band on your nn.Linear:
>>> m = nn.Linear(N, N)
>>> m.weight.data = m.weight * mask
>>> m.weight
Parameter containing:
tensor([[-0.3321, -0.3377, -0.0000, -0.0000, -0.0000],
[-0.4197, 0.1729, 0.2101, 0.0000, 0.0000],
[ 0.3467, 0.2857, -0.3919, -0.0659, 0.0000],
[ 0.0000, -0.4060, 0.0908, 0.0729, -0.1318],
[ 0.0000, -0.0000, -0.4449, -0.0029, -0.1498]], requires_grad=True)
Note, you might need to perform this on each forward pass as the parameters outside the band might get updated to non-zero values during the training. Of course, you can initialize mask once and keep it in memory.
It would be more convenient to wrap everything into a custom nn.Module.
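A rough sketch of such a module follows. BandLinear and its lower/upper arguments are names I made up for illustration, and this is one possible design rather than a reference BandRNN implementation. The mask is applied on every forward pass, so weights outside the band never contribute and receive zero gradient.

```python
import torch
from torch import nn

class BandLinear(nn.Module):
    """Square linear layer whose effective weight is restricted to a band (hypothetical helper)."""

    def __init__(self, size, lower, upper):
        super().__init__()
        self.linear = nn.Linear(size, size)
        # Sum of offset diagonals, built once and stored as a non-trainable buffer.
        mask = sum(
            torch.diagflat(torch.ones(size - abs(i)), i)
            for i in range(-lower, upper + 1)
        )
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Mask the weight at every call so out-of-band entries are always ignored.
        return nn.functional.linear(x, self.linear.weight * self.mask, self.linear.bias)

layer = BandLinear(5, lower=2, upper=1)
out = layer(torch.randn(3, 5))
out.sum().backward()
```

Registering the mask as a buffer means it moves with the module on .to(device) calls and is saved in the state dict, without being treated as a trainable parameter.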

How is the IoU calculated for multiple bounding box predictions in Tensorflow Object Detection API?

How is the IoU metric calculated for multiple bounding box predictions in Tensorflow Object Detection API?
Not sure exactly how TensorFlow does it but here is one way that I recently got it to work since I didn't find a good solution online. I used numpy matrices to get the IoU, & other metrics (TP, FP, TN, FN) for multi-object detection.
Let's say for this example that your image is 6x6.
import cv2
import numpy as np
empty = np.zeros(36).reshape([6, 6])
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
And you have the ground truth for 2 objects, one in the bottom left of the image and one smaller one in the top right.
bbox_actual_obj1 = [[0, 3], [2, 5]] # top left coord & bottom right coord
bbox_actual_obj2 = [[4, 0], [5, 1]]
Using OpenCV, you can add these objects to a copy of the empty image array.
actual = empty.copy()
actual = cv2.rectangle(
actual = cv2.rectangle(
array([[0., 0., 0., 0., 1., 1.],
[0., 0., 0., 0., 1., 1.],
[0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0.],
[1., 1., 1., 0., 0., 0.],
[1., 1., 1., 0., 0., 0.]])
Now let's say that below are our predicted bounding boxes:
bbox_pred_obj1 = [[1, 3], [3, 5]] # top left coord & bottom right coord
bbox_pred_obj2 = [[3, 0], [5, 2]]
Now we do the same thing as above but change the value we assign within the array.
pred = empty.copy()
pred = cv2.rectangle(
pred = cv2.rectangle(
array([[0., 0., 0., 2., 2., 2.],
[0., 0., 0., 2., 2., 2.],
[0., 0., 0., 2., 2., 2.],
[0., 2., 2., 2., 0., 0.],
[0., 2., 2., 2., 0., 0.],
[0., 2., 2., 2., 0., 0.]])
If we convert these arrays to matrices and add them, we get the following result
actual_matrix = np.matrix(actual)
pred_matrix = np.matrix(pred)
combined = actual_matrix + pred_matrix
matrix([[0., 0., 0., 2., 3., 3.],
[0., 0., 0., 2., 3., 3.],
[0., 0., 0., 2., 2., 2.],
[1., 3., 3., 2., 0., 0.],
[1., 3., 3., 2., 0., 0.],
[1., 3., 3., 2., 0., 0.]])
Now all we need to do is count the occurrences of each value in the combined matrix to get the TP, FP, TN, FN counts.
combined = np.squeeze(
pred_matrix + actual_matrix
unique, counts = np.unique(combined, return_counts=True)
zipped = dict(zip(unique, counts))
{0.0: 15, 1.0: 3, 2.0: 8, 3.0: 10}
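True Negative: cells with value 0 (15)
False Negative: cells with value 1 (3)
False Positive: cells with value 2 (8)
True Positive/Intersection: cells with value 3 (10)
Union: cells with value 1, 2 or 3 (3 + 8 + 10 = 21)
IoU: 10/(3 + 8 + 10) = 0.48
Precision: 10/(10 + 8) = 0.56
Recall: 10/(10 + 3) = 0.77
F1: 10/(10 + 0.5 * (3 + 8)) = 0.65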
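The whole walkthrough can be reproduced end to end with NumPy only. In this sketch, array slicing stands in for the cv2.rectangle calls (a swapped-in technique, assuming both corners are inclusive as in the arrays shown above), and rasterize is a hypothetical helper name.

```python
import numpy as np

def rasterize(boxes, value, shape=(6, 6)):
    """Fill each [[x1, y1], [x2, y2]] box (corners inclusive) with value."""
    grid = np.zeros(shape)
    for (x1, y1), (x2, y2) in boxes:
        grid[y1:y2 + 1, x1:x2 + 1] = value
    return grid

actual = rasterize([[[0, 3], [2, 5]], [[4, 0], [5, 1]]], value=1)
pred = rasterize([[[1, 3], [3, 5]], [[3, 0], [5, 2]]], value=2)
combined = actual + pred

# Count how often each value (0, 1, 2, 3) occurs in the combined grid.
values, counts = np.unique(combined, return_counts=True)
tally = dict(zip(values.tolist(), counts.tolist()))
tn, fn, fp, tp = (tally.get(k, 0) for k in (0.0, 1.0, 2.0, 3.0))

iou = tp / (tp + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = tp / (tp + 0.5 * (fp + fn))
```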
Each bounding box around an object has an IoU (intersection over union) with the ground-truth box of that object. It is calculated by dividing the common area (overlap) between the predicted bounding box and the ground-truth box by the area of their union, that is, the total area covered by the two boxes with the overlap counted only once. After calculating all the IoUs for the boxes around an object, the ones with the highest IoU are selected as the result. Here it is explained better.
Also you can print the IoU value after this line.
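The per-box definition can be sketched directly from corner coordinates. This is an illustration of the formula, not the Object Detection API's own code; box_iou is a hypothetical helper, and it assumes boxes are given as [[x1, y1], [x2, y2]] with continuous (not pixel-inclusive) coordinates.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes [[x1, y1], [x2, y2]]."""
    (ax1, ay1), (ax2, ay2) = box_a
    (bx1, by1), (bx2, by2) = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Subtract the overlap once so it is not counted twice.
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

overlap_score = box_iou([[0, 0], [2, 2]], [[1, 1], [3, 3]])
```

Here the overlap is a 1 x 1 square and the union is 4 + 4 - 1 = 7, so the score is 1/7.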

Keras: result of model.evaluate() stays high with all the weights and biases being 0

I created a VGG16 model using Keras Applications (TensorFlow backend). Then I wanted to change part of those weights and then test the accuracy of this modified model. To be direct and intuitive, I changed ALL the weights and biases in ALL layers to 0 like this:
model = VGG16(weights='imagenet', include_top=True)
# here is the test data and label containing 10 pictures I created.
data = np.load('./10_random_samples_array.npz')
data, label = data["X"], data["Y"]
# Modify the weights to zero
for z in [1, 2, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16, 17]: # Conv layers
    weight_bias = model.layers[z].get_weights()
    shape_weight = np.shape(weight_bias[0])
    shape_bias = np.shape(weight_bias[1])
    weight_bias[0] = np.zeros(shape=(shape_weight[0],shape_weight[1],shape_weight[2],shape_weight[3]))
    weight_bias[1] = np.zeros(shape=(shape_bias[0],))
for z in [20,21,22]: # FC layers
    weight_bias = model.layers[z].get_weights()
    shape_weight = np.shape(weight_bias[0])
    print(z, shape_weight)
    shape_bias = np.shape(weight_bias[1])
    weight_bias[0] = np.zeros(shape=(shape_weight[0],shape_weight[1],))
    weight_bias[1] = np.zeros(shape=(shape_bias[0],))
optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
# To check if the weights have been modified.
loss, acc = model.evaluate(data, label, verbose=1)
Then I got result like this:
[array([[[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]],
...(All zero, I omit them)
[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]]], dtype=float32),
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]
10/10 [==============================] - 2s 196ms/step
First, you can tell that all the weights and biases have already been changed to 0, but the accuracy still stays very high. That is unreasonable. (The original result returned by model.evaluate() is 0.9993000030517578.)
Second, I used only 10 pictures as my test dataset, so the result should be a decimal with only one digit after the point. But I got 0.9989999532699585.
I also tried to modify all weights only in Conv1-1 to zero and the result is also 0.9989999532699585. It seems that it is the minimum result. Is there something wrong with my model? Or the weights cannot be modified in this way? Or model.evaluate() doesn't work as I suppose?
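One way to see where a value like 0.999 could come from, as a hypothesis only: the compile call is not shown in full, so this assumes the accuracy is computed element-wise over all 1000 outputs with a 0.5 threshold (as a binary accuracy metric does) rather than per image. The labels below are random placeholders, not the data from the question.

```python
import numpy as np

num_samples, num_classes = 10, 1000
rng = np.random.default_rng(0)

# Placeholder one-hot labels: exactly one positive class per image.
y_true = np.zeros((num_samples, num_classes), dtype="float32")
y_true[np.arange(num_samples), rng.integers(0, num_classes, num_samples)] = 1.0

# With all weights and biases at zero, every logit is equal, so softmax is uniform.
y_pred = np.full((num_samples, num_classes), 1.0 / num_classes, dtype="float32")

# Element-wise accuracy: every output is below 0.5, so only the 10 positive
# entries out of 10 * 1000 are counted as wrong.
elementwise_acc = np.mean((y_pred > 0.5) == (y_true > 0.5))
```

Under that assumption the score is 9990/10000 = 0.999 regardless of which images are used, which would also explain why it is not a multiple of 0.1.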