I would like to add skip connections to the inner layers of a fully convolutional network in Keras. There is a keras.layers.Add option and a keras.layers.concatenate option.
What is the difference, and which one should I use?
What is the difference?
The Add layer sums its input tensors element-wise, while Concatenate joins them end to end along an axis (the last axis by default). You can refer to the documentation for more info.
Example:
import keras
import tensorflow as tf
import keras.backend as K
a = tf.constant([1,2,3])
b = tf.constant([4,5,6])
add = keras.layers.Add()
print(K.eval(add([a,b])))
#output: [5 7 9]
concat = keras.layers.Concatenate()
print(K.eval(concat([a,b])))
#output: array([1, 2, 3, 4, 5, 6], dtype=int32)
Which one should I use?
You can use Add for skip connections, provided the two tensors have the same shape.
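To make the shape consequences of each choice concrete, here is a minimal NumPy sketch. It does not use Keras itself; the arrays `x` and `skip` are hypothetical stand-ins for two feature maps in channels-last layout, and the shapes are assumptions chosen only for illustration.

```python
import numpy as np

# Stand-in feature maps in channels-last layout: (batch, height, width, channels).
x = np.ones((1, 4, 4, 8), dtype=np.float32)     # output of an inner block
skip = np.ones((1, 4, 4, 8), dtype=np.float32)  # earlier activation to reuse

# Add-style skip: element-wise sum, so both shapes must match exactly.
added = x + skip

# Concatenate-style skip: stack along the channel axis (axis=-1).
concatenated = np.concatenate([x, skip], axis=-1)

print(added.shape)         # (1, 4, 4, 8)
print(concatenated.shape)  # (1, 4, 4, 16)
```

With Add the channel count stays the same, so the next layer is unaffected. With Concatenate the channel count grows, so the next layer sees more input channels.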
When I use the Sequential API and the Functional API in TensorFlow to apply the same simple embedding layer, I see different results.
The result is as follows:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
inputs = np.random.randint(0, 99, [32, 100, 1])
myLayer = layers.Embedding(input_dim = 100, output_dim = 8)
# Sequential API
sm = keras.Sequential()
sm.add(myLayer)
sm_out = sm(inputs)
sm_out.shape # Shape of sm_out is: TensorShape([32, 100, 8])
# Functional API
fm_out = myLayer(inputs)
fm_out.shape # Shape of fm_out is: TensorShape([32, 100, 1, 8])
Is it intended or a bug?
First of all, your second call is not a functional API call. You need to wrap your layer output (with a tf.keras.layers.Input) in a tf.keras.models.Model for this to be a functional API call.
Secondly, when you call the Sequential model, it is smart enough to detect that the last dimension is 1 and ignore it when looking up embeddings (I'm not sure exactly where this is handled; maybe someone else can point to it). So when you pass in a tensor of shape [32, 100, 1], what the embedding layer really sees is a [32, 100] array. After the lookup, this becomes a [32, 100, 8] tensor.
In your second call, where you call the layer directly, this doesn't happen. It simply converts the [32, 100, 1] input to a [32, 100, 1, 8] output.
You can get the same result from both methods if you set the shape of your inputs to [32, 100] or [32, 100, 2] (last dimension != 1).
I guess the lesson here is to always use the input_shape argument (on the first layer of the Sequential model) to prevent such unexpected behavior.
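The two output shapes can be reproduced with a plain table lookup. This is only a NumPy sketch of the shape arithmetic, not how Keras implements the layer; `table` is a hypothetical stand-in for the embedding weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the weights of Embedding(input_dim=100, output_dim=8).
table = rng.normal(size=(100, 8))
# Integer ids with the same shape as `inputs` in the question.
ids = rng.integers(0, 99, size=(32, 100, 1))

# Direct lookup: the embedding axis is appended to the full input shape.
direct = table[ids]
# Lookup after dropping the trailing size-1 axis.
squeezed = table[ids[..., 0]]

print(direct.shape)    # (32, 100, 1, 8)
print(squeezed.shape)  # (32, 100, 8)
```

The values are identical in both cases; only the extra size-1 axis differs.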
I'm trying to re-implement TensorFlow code in PyTorch, but when I got to max pooling I looked at the documentation of the two frameworks and found that their behavior is not the same. Can someone please explain why they are different, and which one is more efficient (I ask because they give different results)?
import tensorflow
from tensorflow.keras.layers import GlobalMaxPool1D
tf_tensor = tensorflow.random.normal([8, 6, 5])
tf_maxpool = GlobalMaxPool1D()
print("output shape : ", tf_maxpool(tf_tensor).shape)
output shape : (8, 5)
import torch
import torch.nn as nn
torch_tensor = torch.tensor(tf_tensor.numpy())
maxpool = nn.MaxPool1d(kernel_size=2)
print("output shape : ", maxpool(torch_tensor).shape)
output shape : torch.Size([8, 6, 2])
MaxPool vs GlobalMaxPool
torch.nn.MaxPool1d pools every N adjacent values by taking their maximum.
For these values:
[1, 2, 3, 4, 5, 6, 7, 8]
with kernel_size=2 as you've specified, you would get the following values:
[2, 4, 6, 8]
which means a sliding window of size 2 gets the maximum value and moves on to the next pair.
Global pooling is a similar operation, but takes the maximum value of the whole list, as pointed out in Ivan's answer. In our case, we would simply get the single value 8.
This operation, in PyTorch, is called torch.nn.AdaptiveMaxPool1d (optionally followed by torch.nn.Flatten):
import torch
tensor = torch.randn(8, 6, 5)
global_max_pooling = torch.nn.Sequential(
torch.nn.AdaptiveMaxPool1d(1), # (8, 6, 1) shape
torch.nn.Flatten(), # (8, 6) after removing unnecessary 1 dimension
)
global_max_pooling(tensor) # (8, 6)
The explanation above is simplified, since the operation is carried out along a specific dimension.
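The one-dimensional list example can be sketched in plain Python. The helper names `max_pool_1d` and `global_max_pool_1d` are hypothetical, and the sketch assumes the stride equals the kernel size (the default for torch.nn.MaxPool1d) with no padding.

```python
def max_pool_1d(values, kernel_size):
    """Max over consecutive, non-overlapping windows (stride == kernel_size)."""
    last_start = len(values) - kernel_size
    return [
        max(values[i:i + kernel_size])
        for i in range(0, last_start + 1, kernel_size)
    ]


def global_max_pool_1d(values):
    """Max over the whole sequence: one window that covers everything."""
    return max(values)


values = [1, 2, 3, 4, 5, 6, 7, 8]
pooled = max_pool_1d(values, kernel_size=2)
global_pooled = global_max_pool_1d(values)

print(pooled)         # [2, 4, 6, 8]
print(global_pooled)  # 8
```

An incomplete trailing window is dropped, which is why a length-5 sequence with kernel_size=2 yields only two values in the question's PyTorch output.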
Tensorflow vs PyTorch shape difference
As one could notice, in the case of Tensorflow the output is of shape (8, 5), while in the case of PyTorch it is (8, 6).
This difference stems from different channels dimensions (see here for channels last in PyTorch), namely:
PyTorch assumes data layout of (batch, channels, sequence)
Tensorflow assumes data layout of (batch, sequence, channels) (a.k.a. channels last)
One has to permute the data in case of PyTorch to get exactly the same results:
tensor = tensor.permute(0, 2, 1) # (8, 5, 6)
global_max_pooling(tensor) # (8, 5)
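The layout point can be checked without either framework. In this NumPy sketch (the array names are hypothetical), the same data is reduced over its sequence axis in both layouts, and the results agree once the axes are permuted.

```python
import numpy as np

rng = np.random.default_rng(0)

channels_last = rng.normal(size=(8, 6, 5))         # (batch, sequence, channels)
channels_first = channels_last.transpose(0, 2, 1)  # (batch, channels, sequence)

# Global max pooling reduces the sequence axis, wherever it sits.
pooled_last = channels_last.max(axis=1)    # (8, 5)
pooled_first = channels_first.max(axis=2)  # (8, 5)

print(pooled_last.shape, pooled_first.shape)
print(np.array_equal(pooled_last, pooled_first))  # True
```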
Efficiency
Use torch.nn.AdaptiveMaxPool1d when you want to perform pooling with a specified output size (other than 1), as it skips some unnecessary operations that torch.nn.MaxPool1d performs (going over the same elements more than once, which is out of scope for this question).
In the general case, when we perform global pooling, both are roughly equivalent and perform the same number of operations.
Global maximum pooling has no window size: since it is global, it considers the whole sequence. The equivalent operator is simply torch.max applied per channel over the sequence axis, i.e. dim=1 for this (batch, sequence, channels) tensor:
>>> maxpool = torch_tensor.max(1).values
>>> maxpool.shape
torch.Size([8, 5])
I know this is basic and too easy for you people, but I'm a beginner who needs your help.
I'm struggling to make a binary classifier with a CNN.
My final goal is to reach an accuracy above 0.99.
I import both MNIST and FASHION_MNIST to identify whether an image is a number or clothing.
So there are 2 categories. I want to categorize 0-60000 as 0, and 60001-120000 as 1.
I will use binary_crossentropy.
But I don't know how to get started.
How can I use vstack or hstack to combine MNIST and FASHION_MNIST in the first place?
This is what I have tried so far:
import numpy as np
from keras.datasets import mnist
from keras.datasets import fashion_mnist
import keras
import tensorflow as tf
from keras.utils.np_utils import to_categorical
num_classes = 2
train_images = train_images.astype("float32") / 255
test_images = test_images.astype("float32") / 255
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
train_labels = to_categorical(train_labels, num_classes)
test_labels = to_categorical(test_labels, num_classes)
First of all
They're images, so it is better to treat them as images and not reshape them into vectors.
Now the answer to the question. Suppose you have mnist_train_image and fashion_train_image, both with shape (60000, 28, 28).
What you want to do consists of two parts: combining the inputs and making the targets.
First, the inputs
As you've already written in the question, you can use np.vstack like this
>>> train_image = np.vstack((fashion_train_image, mnist_train_image))
>>> train_image.shape
(120000, 28, 28)
But as you may have already noticed, remembering whether you need vstack, dstack, or hstack is kind of a pain. My preference is to use np.concatenate instead
>>> train_image = np.concatenate((fashion_train_image, mnist_train_image), axis=0)
>>> train_image.shape
(120000, 28, 28)
Now instead of remembering what the duck v, h, or d stand for, you just need to remember the axis (or dimension) you want to concatenate along; in this case it's the first axis, which means 0. This helps especially in a case like this one, where the "vertical" direction is the second axis because it's a stack of images and the first axis is "batch".
Next, the labels
Since you want to categorize 0-60000 as 0, and 60001-120000 as 1, there are a lot of fancy ways to do this.
But in a nutshell you can use np.zeros to create an array filled with 0, and np.ones to, you guessed it, create an array filled with 1. Both ones and zeros give you an array of floats by default, and I'm not sure whether that will become a problem, so I add .astype('uint8') at the end just in case. You can pass dtype='uint8' to the function instead.
Use concatenate as above
>>> train_labels = np.concatenate((np.zeros(60000), np.ones(60000))).astype('uint8')
>>> train_labels.shape
(120000,)
Or use ones or zeros for the whole size, then subtract, add, or reassign the rest
>>> train_labels = np.zeros(120000).astype('uint8')
>>> train_labels[60000:] = 1
#####
>>> train_labels = np.ones(120000, dtype='uint8')
>>> train_labels[:60000] -= 1
Important!!!!
There's a noticeable mistake in your example about the labels: indices start at 0, so the 60,000th element has index 59,999.
So what you actually want is to categorize 0-59999 as 0, and 60000-119999 as 1.
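Putting the two parts together, a minimal end-to-end sketch could look like the following. To keep it self-contained it uses small random arrays as stand-ins for the real datasets (an assumption for illustration; `n` plays the role of 60000), and the final shuffle is an optional extra so that each training batch mixes both classes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # per-dataset size; stands in for 60000

# Random stand-ins for the real image arrays (hypothetical data).
fashion_train_image = rng.integers(0, 256, size=(n, 28, 28), dtype=np.uint8)
mnist_train_image = rng.integers(0, 256, size=(n, 28, 28), dtype=np.uint8)

# Inputs: stack along the batch axis, same order as in the answer.
train_image = np.concatenate((fashion_train_image, mnist_train_image), axis=0)
# Targets: first n samples get label 0, the remaining n get label 1.
train_labels = np.concatenate(
    (np.zeros(n, dtype=np.uint8), np.ones(n, dtype=np.uint8))
)

# Scale to [0, 1] and add a channel axis so a CNN sees (28, 28, 1) images.
train_image = train_image.astype("float32")[..., None] / 255.0

# Shuffle images and labels with one permutation so pairs stay aligned.
order = rng.permutation(len(train_image))
train_image, train_labels = train_image[order], train_labels[order]

print(train_image.shape, train_labels.shape)  # (2000, 28, 28, 1) (2000,)
```

A single 0/1 label per image like this matches a one-unit sigmoid output with binary_crossentropy, so to_categorical is not needed.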
Does anyone know how to use map_fn or any other TensorFlow function to do a computation on every combination of two input tensors?
What I want is something like this:
Given two arrays ([1,2] and [4,5]), I want as a result a matrix with the output of the computation (e.g. add) on every possible combination of elements from the two arrays. So the result would be:
[[5,6],
[6,7]]
I used map_fn but this only takes the elements index-wise:
[[5]
[7]]
Does anyone have an idea how to implement this?
Thanks
You can add new unit dimensions to each Tensor, then rely on broadcasting addition:
import tensorflow as tf
# TensorFlow 2.x executes eagerly by default; on TensorFlow 1.x, call tf.enable_eager_execution() here.
first = tf.constant([1, 2])
second = tf.constant([4, 5])
print(first[None, :] + second[:, None])
Prints:
tf.Tensor(
[[5 6]
[6 7]], shape=(2, 2), dtype=int32)
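The same broadcasting trick works in NumPy with identical indexing, which makes it easy to try outside TensorFlow. This is only a sketch; the comparison with `np.add.outer` is an extra illustration, not part of the original answer.

```python
import numpy as np

first = np.array([1, 2])
second = np.array([4, 5])

# (1, 2) + (2, 1) broadcasts to (2, 2): every element of `second`
# is combined with every element of `first`.
pairwise = first[None, :] + second[:, None]

# For addition, NumPy also offers the outer form of the ufunc directly.
outer = np.add.outer(second, first)

print(pairwise)
# [[5 6]
#  [6 7]]
```

Any element-wise operation can replace `+` here, since broadcasting expands both operands to the full grid before the operation is applied.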
I am looking for a way to unpack bits in TF in the same way I can with np.unpackbits. That is, I want to reverse an operation like:
import numpy as np
import tensorflow as tf
original = np.random.choice(a=[1, 0], size=(100))
data = np.packbits(original.astype(bool), axis=None)
X = tf.constant(data)
Assuming I have access only to X, how can I convert it back to original in TF? Of course I can use numpy, but this will move data from TF to Python and then back to TF.
A few thoughts I had in mind (I have not implemented any of them):
use tf.map_fn
use tf.contrib.lookup
For both of them the idea is to map each number to a vector, concatenate all the vectors, reshape, and remove the unneeded elements.
Both approaches seem more complicated than they should be. Does anyone have an efficient way (in terms of speed) to achieve numpy's unpackbits in TensorFlow?
Perhaps something like this:
import tensorflow as tf
x = tf.constant((1, 2, 7, 0, 255), dtype=tf.uint8)
b = tf.constant((128, 64, 32, 16, 8, 4, 2, 1), dtype=tf.uint8)
unpacked = tf.reshape(tf.mod(tf.to_int32(x[:,None] // b), 2), [-1])
unpacked is in int32 because tf.mod does not accept bytes; you may want to cast it to uint8 again.
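Tensorflow 1.3 will have bitwise operations, so this last line could be replaced with
unpacked = tf.reshape(tf.cast(tf.bitwise.bitwise_and(x[:, None], b) > 0, tf.uint8), [-1])
which will hopefully be faster (and the result is in uint8). Note that x still needs the extra axis so it broadcasts against b, and the masked values have to be compared with 0 because the AND returns the mask value (128, 64, ...) rather than 1 for a set bit.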
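The masking idea can be checked against np.unpackbits in NumPy before porting it to TensorFlow. This is a sketch under the question's setup (100 random bits packed into bytes); the variable names are hypothetical. It also covers the "remove unneeded elements" step, since packing pads the last byte with zeros.

```python
import numpy as np

rng = np.random.default_rng(0)
original = rng.integers(0, 2, size=100).astype(np.uint8)
packed = np.packbits(original)  # 13 bytes; the last byte is zero-padded

# One mask per bit position, most significant bit first.
masks = np.array([128, 64, 32, 16, 8, 4, 2, 1], dtype=np.uint8)

# (13, 1) & (8,) broadcasts to (13, 8); a non-zero result means the bit is set.
bits = ((packed[:, None] & masks) > 0).astype(np.uint8).reshape(-1)

# Drop the padding bits to recover the original length.
restored = bits[:original.size]

print(bits.size)                                    # 104
print(np.array_equal(bits, np.unpackbits(packed)))  # True
print(np.array_equal(restored, original))           # True
```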