Tensorflow compute image gradient loss - tensorflow

I am trying to optimize my network on the difference between the image gradients of the reconstructed image and the ground truth, but I am receiving this error:
InvalidArgumentError: Input is not invertible.
I think it is because TensorFlow wants to backpropagate through the image transformation. How do I fix this?
def image_gradient_loss(y_prediction, y):
    gradient_loss = tf.abs(tf.abs(y_prediction - tf.contrib.image.transform(y_prediction, [1, 0, 1, 0, 0, 0, 0, 0])) - tf.abs(y - tf.contrib.image.transform(y, [1, 0, 1, 0, 0, 0, 0, 0]))) + \
        tf.abs(tf.abs(y_prediction - tf.contrib.image.transform(y_prediction, [0, 0, 0, 0, 1, 1, 0, 0])) - tf.abs(y - tf.contrib.image.transform(y, [0, 0, 0, 0, 1, 1, 0, 0])))
    return tf.reduce_mean(gradient_loss)
loss = image_gradient_loss(y_pred, y)
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001).minimize(loss)

I computed the gradients with tf.image.image_gradients instead of the transform, and it worked for me (K is the Keras backend):
dy_true, dx_true = tf.image.image_gradients(y_true)
dy_pred, dx_pred = tf.image.image_gradients(y_pred)
term3 = K.mean(K.abs(dy_pred - dy_true) + K.abs(dx_pred - dx_true), axis=-1)
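To make it clearer what this loss measures, here is a minimal NumPy sketch of the same idea. It assumes, as I understand the TensorFlow docs, that tf.image.image_gradients returns forward differences with zeros in the last row and column. The names image_gradients_np and gradient_l1_loss_np are made up for this illustration, and unlike the snippet above the sketch averages over all axes rather than only the last one.

```python
import numpy as np

def image_gradients_np(images):
    """Forward differences for a batch shaped (batch, height, width, channels).

    The last row of dy and the last column of dx stay zero.
    """
    dy = np.zeros_like(images)
    dx = np.zeros_like(images)
    dy[:, :-1, :, :] = images[:, 1:, :, :] - images[:, :-1, :, :]
    dx[:, :, :-1, :] = images[:, :, 1:, :] - images[:, :, :-1, :]
    return dy, dx

def gradient_l1_loss_np(y_true, y_pred):
    """Mean absolute difference between the gradients of two image batches."""
    dy_true, dx_true = image_gradients_np(y_true)
    dy_pred, dx_pred = image_gradients_np(y_pred)
    return np.mean(np.abs(dy_pred - dy_true) + np.abs(dx_pred - dx_true))

y_true = np.arange(16, dtype=np.float64).reshape(1, 4, 4, 1)
# A constant offset does not change the gradients, so the loss is zero.
loss_same = gradient_l1_loss_np(y_true, y_true + 3.0)
# Scaling the image scales its gradients, so the loss is positive.
loss_diff = gradient_l1_loss_np(y_true, 2.0 * y_true)
```

Only slicing and subtraction are involved, so nothing has to be inverted, unlike the projective transform in the question.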

Related

How to use an input layer that also feeds on a previous layer of a neural network?

Let's say I want to predict the winner of a tag-team race, where some drivers usually place higher in certain weather conditions:
Race |Driver | Weather | Time
Dummy1 |D1 | Rain | 2:00
Dummy1 |D2 | Rain | 5:00
Dummy1 |D3 | Rain | 4:50
Dummy2 |D1 | Sunny | 3:00
Dummy2 |D2 | Sunny | 2:50
Dummy2 |D2 | Sunny | 2:30
...
The logic is that a team composed of D1 and D3 would outperform any other combination in rain, but wouldn't have the same luck in other weather. With that said, I thought about the following model:
Layer 1 | Layer 2 | Layer 3 (output)
Driver encoding | weather encoding | expected race time
----------------------------------------------------------------
Input of 0 or 1 | sum(Layer 1 * weights | sum(Layer 2 * weights)
| * Input of 0 or 1) |
This means that layer 2 uses layer 1 as well as input values to compute a value.
The reason I want this architecture instead of having every feature on layer 1 is that I want different features to multiply each other instead of their sum.
I could not find anything like this, but it is probably just me not knowing the name of this approach. Can someone point me to sources or explain how to replicate this in TensorFlow/PyTorch/any other library?
It turns out it was actually pretty simple. For anyone who might stumble upon this post and would like to test this approach, here's some rough code:
Racing dataset
# TEAM 1 TEAM 2 "Weather" "WON"
# "A","B","C","D","E", "A","B","C","D","E", W1 W2 W3 (combined times of team 1< combined times of team 2)
dataset=[[ 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1],
[ 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1],
[ 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1],
[ 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1],
[ 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[ 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0],
[ 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
[ 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0],
[ 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0],
[ 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1],
]
inputs=[[x[0:-4],x[-4:-1]] for x in dataset]
results=[[x[-1]] for x in dataset]
Typings to make code more readable
from typing import Iterator
import numpy as np
class InputLayer():
    def __init__(self, inputs,useBias=False):
        self.inputs=inputs
        self.useBias=useBias
    def __str__(self):
        return "Layer of size "+ str(self.inputs)
    def __repr__(self) -> str:
        return self.__str__()
class InputLayerValue():
    def __init__(self, values):
        self.values=np.array(values)
Actual model
import torch
from torch import nn
class MutipleInputModel(nn.Module):
    def __init__(self,input_layers:Iterator[InputLayer],output_size):
        super(MutipleInputModel, self).__init__()
        print(input_layers)
        self.nns=[]
        for i in range(len(input_layers)-1):
            current:InputLayer=input_layers[i]
            next:InputLayer=input_layers[i+1]
            il=nn.Linear(current.inputs,next.inputs,current.useBias)
            #To have hidden layers, you need to either use another model or create and attach multiple Linear models - nn.Linear(next.inputs,next.inputs)
            name="nn"+str(i)
            #models must be directly under self to be found by model.parameters()
            self.__setattr__(name,il)
            self.nns.append(name)
        il=nn.Linear(input_layers[-1].inputs,output_size,current.useBias)
        name="nnOutput"
        self.__setattr__(name,il)
        self.nns.append(name)
    def forward(self, inputs:Iterator[InputLayerValue]):
        inputsLen=len(inputs[0])
        if inputsLen != len(self.nns):
            raise Exception("Number of input values provided and input layers must be equal. Provided "+str(inputsLen)+" sets of inputs for a "+str(len(self.nns))+"-input-layer network")
        #Initialize first layer of inputs with ones which will then be multiplied by the actual input values
        lastOutput=torch.ones(len(inputs),len(inputs[0][0].values)) # Layer 1 Outputs | Layer 2 provided Inputs | Layer 2 actual Inputs
        for i in range(inputsLen): # lastOutput | multiplier | input
            multiplier=torch.from_numpy(np.array([x[i].values for x in inputs])).float() # 0.2 | 0 | 0
            input=lastOutput*multiplier # 1.5 | 1 | 1.5
            lastOutput=self.__getattr__(self.nns[i])(input) # 1.0 | 5 | 5
        return lastOutput
Training
# Define hyperparameters
model = MutipleInputModel(input_layers=[InputLayer(len(x)) for x in inputs[0]],output_size=1)
n_epochs = 1000
lr=0.001
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() # Clears existing gradients from previous epoch
    output = model([[InputLayerValue(y) for y in x] for x in inputs])
    loss = criterion(output, torch.from_numpy(np.array(results)).float())
    loss.backward()
    optimizer.step()
    print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
    print("Loss: {:.4f}".format(loss.item()))
Testing:
def predict(model, input):
    input = [[InputLayerValue(y) for y in input]]
    out = model(input)
    return nn.Sigmoid()(out[0][0]).item()
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [1, 0, 0]]))
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [0, 1, 0]]))
print(predict(model,[[1, 1, 0, 0, 0, 0, 0, 1, 1, 0], [0, 0, 1]]))
This is a really basic implementation, but it could easily be modified to have hidden layers.
It clearly needs further testing to see whether it is actually better than a traditional NN, but I would say it is great for NN explainability.
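For anyone who wants to see the multiplicative step in isolation, here is a small NumPy sketch of a single forward pass. It is only an illustration of the idea above, not the PyTorch model itself: the weights are random, there is no bias, and the names W1, W2 and forward are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes follow the dataset above: 10 driver flags, 3 weather flags.
n_drivers, n_weather = 10, 3
W1 = rng.normal(size=(n_weather, n_drivers))  # layer 1: drivers -> one unit per weather type
W2 = rng.normal(size=(1, n_weather))          # output layer: gated units -> one logit

def forward(drivers, weather):
    hidden = W1 @ drivers     # layer 1 output
    gated = hidden * weather  # the weather input multiplies the layer 1 output elementwise
    return (W2 @ gated)[0]    # single logit

drivers = np.array([1, 1, 0, 0, 0, 0, 0, 1, 1, 0], dtype=float)
rain = np.array([1, 0, 0], dtype=float)
logit = forward(drivers, rain)

# With a one-hot weather vector, only the matching hidden unit reaches the output.
expected = W2[0, 0] * (W1[0] @ drivers)
```

Because the weather vector is one-hot, each weather type effectively selects its own set of driver weights, which is the "features multiply each other" behaviour asked about in the question.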

How to one hot encode numpy array of arrays using keras to_categorical?

I have target data y_tr of shape (92107, 49), where each sample (row) is an array of length 49. I would like to loop over y_tr and one-hot encode each array. I have been using Keras to_categorical, but I am getting an IndexError.
Array in y_tr:
array([ 379, 379, 379, 252, 4166, 391, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0], dtype=int32)
The number of classes used to encode each integer index is 10,000 (I do not wish to encode zero values):
onehots=[]
for seq in y_tr[:,1:]:
    try:
        onehots.append(to_categorical(seq - 1, num_classes=tar_vocab_size, dtype=np.int32))
    except IndexError as e:
        raise e
76 n = y.shape[0]
77 categorical = np.zeros((n, num_classes), dtype=dtype)
---> 78 categorical[np.arange(n), y] = 1
79 output_shape = input_shape + (num_classes,)
80 categorical = np.reshape(categorical, output_shape)
IndexError: index 28350 is out of bounds for axis 1 with size 10000
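The traceback says that y_tr contains an id (28350) that does not fit into 10,000 classes, so the data and tar_vocab_size disagree. Going by the traceback line `categorical[np.arange(n), y] = 1`, it also looks as if seq - 1 would send the padding zeros to index -1, which NumPy treats as the last class; I have not checked this against every Keras version. Below is a minimal NumPy sketch, with a made-up helper name one_hot_skip_padding, that rejects out-of-range ids explicitly and leaves padding rows all zero.

```python
import numpy as np

def one_hot_skip_padding(seq, num_classes):
    """One-hot encode ids 1..num_classes; id 0 (padding) becomes an all-zero row."""
    seq = np.asarray(seq)
    if seq.min() < 0 or seq.max() > num_classes:
        raise ValueError(
            f"ids must lie in [0, {num_classes}], got min {seq.min()} and max {seq.max()}"
        )
    out = np.zeros((seq.size, num_classes), dtype=np.int32)
    keep = seq > 0
    # Row positions of the non-padding ids, shifted so that id 1 maps to column 0.
    out[np.flatnonzero(keep), seq[keep] - 1] = 1
    return out

encoded = one_hot_skip_padding([3, 1, 0, 0], num_classes=4)
```

Running a check like this over y_tr (or simply looking at y_tr.max()) should show which ids exceed the vocabulary size.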

Standard implementation of vectorize_sequences

The following function appears in François Chollet's Deep Learning with Python:
def vectorize_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.
    return results
I understand what this function does. This function is asked about in this question and in this question as well, and is also mentioned here, here, here, here, here & here. Despite being so widespread, this vectorization is, according to Chollet's book, done "manually for maximum clarity." I am interested in whether there is a standard, not "manual", way of doing it.
Is there a standard Keras / Tensorflow / Scikit-learn / Pandas / Numpy implementation of a function which behaves very similarly to the function above?
Solution with MultiLabelBinarizer
Assuming sequences is an array of integers with a maximum possible value of dimension-1, we can use MultiLabelBinarizer from sklearn.preprocessing to replicate the behaviour of the function vectorize_sequences:
from sklearn.preprocessing import MultiLabelBinarizer
mlb = MultiLabelBinarizer(classes=range(dimension))
mlb.fit_transform(sequences)
Solution with Numpy broadcasting
Assuming sequences is a rectangular array of integers with a maximum possible value of dimension-1:
(np.array(sequences)[:, :, None] == range(dimension)).any(1).view('i1')
Worked out example
>>> sequences
[[4, 1, 0],
[4, 0, 3],
[3, 4, 2]]
>>> dimension = 10
>>> mlb = MultiLabelBinarizer(classes=range(dimension))
>>> mlb.fit_transform(sequences)
array([[1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0, 0]])
>>> (np.array(sequences)[:, :, None] == range(dimension)).any(1).view('i1')
array([[1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
       [1, 0, 0, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]], dtype=int8)
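The broadcasting version needs all sequences to have the same length. If they are ragged, one possible alternative is to flatten them and do a single fancy-index assignment. This is only a sketch under the same assumption that every value is below dimension; vectorize_sequences_flat is a made-up name, not a library function, and it expects at least one sequence.

```python
import numpy as np

def vectorize_sequences_flat(sequences, dimension=10000):
    """Multi-hot encode possibly ragged integer sequences with one indexed assignment."""
    n = len(sequences)
    lengths = np.fromiter((len(s) for s in sequences), dtype=np.intp, count=n)
    rows = np.repeat(np.arange(n), lengths)  # row index for every flattened value
    cols = np.concatenate([np.asarray(s, dtype=np.intp) for s in sequences])
    results = np.zeros((n, dimension))
    results[rows, cols] = 1.0  # repeated (row, col) pairs simply set the same cell again
    return results

ragged = [[4, 1, 0], [4, 0], [3, 4, 2, 2]]
encoded = vectorize_sequences_flat(ragged, dimension=5)
```

The result has the same float dtype and shape as the loop version from the book.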

Counting zeros in a rolling - numpy array (including NaNs)

I am trying to find a way of counting zeros in a rolling window over a NumPy array.
Using pandas, I can get it with:
df['demand'].apply(lambda x: (x == 0).rolling(7).sum()).fillna(0)
or
df['demand'].transform(lambda x: x.rolling(7).apply(lambda x: 7 - np.count_nonzero(x))).fillna(0)
In NumPy, using the code from here:
def rolling_window(a, window_size):
    shape = (a.shape[0] - window_size + 1, window_size) + a.shape[1:]
    print(shape)
    strides = (a.strides[0],) + a.strides
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
arr = np.asarray([10, 20, 30, 5, 6, 0, 0, 0])
np.count_nonzero(rolling_window(arr==0, 7), axis=1)
Output:
array([2, 3])
However, I also need the first 6 positions (which pandas reports as NaN), filled with zeros:
Expected output:
array([0, 0, 0, 0, 0, 0, 2, 3])
I think an efficient one would be with 1D convolution -
def sum_occurences_windowed(arr, W):
    K = np.ones(W, dtype=int)
    out = np.convolve(arr==0,K)[:len(arr)]
    out[:W-1] = 0
    return out
Sample run -
In [42]: arr
Out[42]: array([10, 20, 30, 5, 6, 0, 0, 0])
In [43]: sum_occurences_windowed(arr,W=7)
Out[43]: array([0, 0, 0, 0, 0, 0, 2, 3])
Timings on varying length arrays and window of 7
Including count_rolling from @Quang Hoang's post.
Using benchit package (few benchmarking tools packaged together; disclaimer: I am its author) to benchmark proposed solutions.
import benchit
funcs = [sum_occurences_windowed, count_rolling]
in_ = {n:(np.random.randint(0,5,(n)),7) for n in [10,20,50,100,200,500,1000,2000,5000]}
t = benchit.timings(funcs, in_, multivar=True, input_name='Length')
t.plot(logx=True, save='timings.png')
Extending to generic n-dim arrays
from scipy.ndimage import convolve1d
def sum_occurences_windowed_ndim(arr, W, axis=-1):
    K = np.ones(W, dtype=int)
    out = convolve1d((arr==0).astype(int),K,axis=axis,origin=-(W//2))
    out.swapaxes(axis,0)[:W-1] = 0
    return out
So, on a 2D array, for counting along each row, use axis=1, for columns use axis=0, and so on.
Sample run -
In [155]: np.random.seed(0)
In [156]: a = np.random.randint(0,3,(3,10))
In [157]: a
Out[157]:
array([[0, 1, 0, 1, 1, 2, 0, 2, 0, 0],
[0, 2, 1, 2, 2, 0, 1, 1, 1, 1],
[0, 1, 0, 0, 1, 2, 0, 2, 0, 1]])
In [158]: sum_occurences_windowed_ndim(a, W=7)
Out[158]:
array([[0, 0, 0, 0, 0, 0, 3, 2, 3, 3],
[0, 0, 0, 0, 0, 0, 2, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 4, 3, 4, 3]])
# Verify with earlier 1D solution
In [159]: np.vstack([sum_occurences_windowed(i,7) for i in a])
Out[159]:
array([[0, 0, 0, 0, 0, 0, 3, 2, 3, 3],
[0, 0, 0, 0, 0, 0, 2, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 4, 3, 4, 3]])
Let's test out our original 1D input array -
In [187]: arr
Out[187]: array([10, 20, 30, 5, 6, 0, 0, 0])
In [188]: sum_occurences_windowed_ndim(arr, W=7)
Out[188]: array([0, 0, 0, 0, 0, 0, 2, 3])
I would modify the function as follows:
def count_rolling(a, window_size):
    shape = (a.shape[0] - window_size + 1, window_size) + a.shape[1:]
    strides = (a.strides[0],) + a.strides
    rolling = np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
    out = np.zeros_like(a)
    out[window_size-1:] = (rolling == 0).sum(1)
    return out
arr = np.asarray([10, 20, 30, 5, 6, 0, 0, 0])
count_rolling(arr,7)
Output:
array([0, 0, 0, 0, 0, 0, 2, 3])
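Another option that avoids both strides and convolution is a cumulative sum: the count of zeros in a window is the difference of two running totals. This is a minimal sketch for 1D input only, and count_zeros_rolling_cumsum is a name made up for this example.

```python
import numpy as np

def count_zeros_rolling_cumsum(a, window_size):
    """Count zeros in each trailing window; the first window_size-1 entries are 0."""
    a = np.asarray(a)
    is_zero = (a == 0).astype(np.int64)
    # csum[k] holds the number of zeros in a[:k]; the leading 0 makes the offsets line up.
    csum = np.concatenate(([0], np.cumsum(is_zero)))
    out = np.zeros(a.shape[0], dtype=np.int64)
    out[window_size - 1:] = csum[window_size:] - csum[:-window_size]
    return out

arr = np.asarray([10, 20, 30, 5, 6, 0, 0, 0])
result = count_zeros_rolling_cumsum(arr, 7)
```

The work done does not depend on the window length, which may matter for large windows; I have not benchmarked it against the solutions above.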

Is there a way to slice out multiple 2D numpy arrays from one 2D numpy array in one batch operation?

I have a numpy array heatmap of shape (img_height, img_width) and another array bboxes of shape (K, 4), where K is the number of bounding boxes.
Each bounding box is defined like so: [x_top_left, y_top_left, width, height].
Here's an example of such array:
bboxes = np.array([
[0, 0, 4, 7],
[3, 4, 3, 4],
[7, 2, 3, 7]
])
heatmap is initially filled with zeros.
What I need to do is to put the value 1 for each bounding box in its corresponding place.
The resulting heatmap should be:
heatmap = np.array([
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0],
[1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
])
Important things to note:
axis 0 corresponds to image height
axis 1 corresponds to image width
I've already solved this using a Python for loop, like so:
for bbox in bboxes:
    # y_top_left : y_top_left + height, x_top_left : x_top_left + width
    heatmap[bbox[1] : bbox[1] + bbox[3], bbox[0] : bbox[0] + bbox[2]] = 1
I would like to avoid using python for loops (if it's possible) and be able to do something like this:
heatmap[bboxes[:,1] : bboxes[:,1] + bboxes[:,3], bboxes[:,0]:bboxes[:,0] + bboxes[:,2]] = 1
Is there a way of doing such multiple slicing in numpy?
I am aware of numpy integer array indexing, but to generate such indices I am also unable to avoid python for loops.
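One loop-free possibility is to compare row and column coordinates against every box at once with broadcasting and then merge the per-box masks. The sketch below assumes the [x_top_left, y_top_left, width, height] layout described above; fill_boxes is a made-up helper name. It builds a temporary boolean array of shape (K, img_height, img_width), so for many boxes or large images the plain loop may well be the better choice.

```python
import numpy as np

def fill_boxes(shape, bboxes):
    """Return an int array of the given (height, width) with 1 inside any box."""
    height, width = shape
    # Each of x0, y0, w, h gets shape (K, 1, 1) so it broadcasts against the grids.
    x0, y0, w, h = (bboxes[:, i][:, None, None] for i in range(4))
    rows = np.arange(height)[None, :, None]  # shape (1, height, 1)
    cols = np.arange(width)[None, None, :]   # shape (1, 1, width)
    inside = (rows >= y0) & (rows < y0 + h) & (cols >= x0) & (cols < x0 + w)
    return inside.any(axis=0).astype(int)    # union over all K boxes

bboxes = np.array([
    [0, 0, 4, 7],
    [3, 4, 3, 4],
    [7, 2, 3, 7],
])
heatmap = fill_boxes((10, 11), bboxes)
```

For the three example boxes this reproduces the 10 x 11 heatmap shown in the question.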