I am looking for a way to unpack bits in TensorFlow the same way I can with np.unpackbits, i.e. reverse an operation like:
import numpy as np
import tensorflow as tf
original = np.random.choice(a=[1, 0], size=(100))
data = np.packbits(original.astype(bool), axis=None)
X = tf.constant(data)
Assuming I have access only to X, how can I convert it back to original inside TensorFlow? Of course I could use numpy, but that would move the data from TensorFlow to Python and then back to TensorFlow.
A few thoughts I had in mind (I have not implemented any of them):
use tf.map_fn
use tf.contrib.lookup
For both of them the idea is to map each number to a vector, concatenate all the vectors, reshape, and remove the unneeded elements.
Both approaches seem more complicated than they should be. Does anyone have an efficient way (in terms of speed) to achieve numpy's unpackbits in TensorFlow?
Perhaps something like this:
import tensorflow as tf
x = tf.constant((1, 2, 7, 0, 255), dtype=tf.uint8)
b = tf.constant((128, 64, 32, 16, 8, 4, 2, 1), dtype=tf.uint8)
unpacked = tf.reshape(tf.mod(tf.to_int32(x[:,None] // b), 2), [-1])
unpacked comes out as int32 because tf.mod does not accept uint8; you may want to cast it back to uint8.
Tensorflow 1.3 will have bitwise operations, so this last line could be replaced with
unpacked = tf.reshape(tf.bitwise.bitwise_and(x[:, None], b), [-1])
which will hopefully be faster (and the result stays in uint8). Note that bitwise_and returns the masked bit value (0 or the corresponding power of two) rather than 0/1, so compare against zero (e.g. tf.cast(tf.not_equal(unpacked, 0), tf.uint8)) if you need exactly the np.unpackbits output.
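For completeness, here is a rough sketch of how the same thing could look on a newer TensorFlow (1.4+ / 2.x), where a right shift plus a mask yields the 0/1 output of np.unpackbits directly:
import tensorflow as tf

x = tf.constant((1, 2, 7, 0, 255), dtype=tf.uint8)
shifts = tf.constant((7, 6, 5, 4, 3, 2, 1, 0), dtype=tf.uint8)
one = tf.constant(1, dtype=tf.uint8)
# Shift every byte right by 7..0 and keep only the lowest bit, so each
# position yields 0 or 1 in most-significant-bit-first order, like np.unpackbits.
bits = tf.bitwise.bitwise_and(tf.bitwise.right_shift(x[:, None], shifts), one)
unpacked = tf.reshape(bits, [-1])  # uint8, length 8 * len(x)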
I am trying to code a custom metric for a U-Net model implemented using Keras/TensorFlow. In the metric, I need to use the OpenCV function cv2.dilate on the ground truth. When I tried to use it, I got an error because y_true is a tensor and cv2.dilate expects a numpy array.
Any idea on how to implement this?
I tried to convert the tensor to a numpy array but it is not working.
I searched for a TensorFlow implementation of cv2.dilate but couldn't find one.
One possibility, if you are using a simple rectangular kernel in your dilation, is to use tf.nn.max_pool2d as a replacement.
import numpy as np
import tensorflow as tf
import cv2
image = np.random.random((28,28))
kernel_size = 3
# OpenCV dilation works with grayscale image, with H,W dimensions
dilated_cv = cv2.dilate(image, np.ones((kernel_size, kernel_size), np.uint8))
# TensorFlow max pooling works with batch and channels: B,H,W,C dimensions
image_w_batch_and_channels = image[None,...,None]
dilated_tf = tf.nn.max_pool2d(image_w_batch_and_channels, kernel_size, 1, "SAME")
# checking that the results are equal
np.allclose(dilated_cv, dilated_tf[0,...,0])
However, given that you mention you are applying the dilation to the ground truth, it does not need to be differentiable. In that case, you can wrap your dilation in a tf.numpy_function:
from functools import partial
# Be sure to use the correct output type: tf.float64 works in this specific case
# because numpy defaults to float64, but it might be different in your case.
dilated_tf_npfunc = tf.numpy_function(
    partial(cv2.dilate, kernel=np.ones((kernel_size, kernel_size), np.uint8)),
    [image],
    tf.float64,
)
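As a rough sketch of how that could sit inside a custom metric (the name dilated_overlap is made up here, and it assumes y_true arrives as a single 2-D float32 mask; for a batched 4-D tensor you would map the dilation over the batch, e.g. with tf.map_fn):
import numpy as np
import tensorflow as tf
import cv2

kernel = np.ones((3, 3), np.uint8)

def dilate_np(y):
    # Runs on the host as a plain numpy array, so cv2 is happy here.
    return cv2.dilate(y.astype(np.float32), kernel)

def dilated_overlap(y_true, y_pred):
    # Hypothetical metric: dilate the ground truth, then measure overlap with the prediction.
    y_true_dilated = tf.numpy_function(dilate_np, [y_true], tf.float32)
    y_true_dilated.set_shape(y_true.shape)  # numpy_function drops static shape info
    y_pred = tf.cast(y_pred, tf.float32)
    intersection = tf.reduce_sum(y_true_dilated * y_pred)
    union = tf.reduce_sum(y_true_dilated) + tf.reduce_sum(y_pred) - intersection
    return intersection / (union + 1e-7)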
When I tried using the Sequential API and the Functional API in TensorFlow to apply the same simple embedding layer, I got different results.
The code and results are as follows:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras import layers
inputs = np.random.randint(0, 99, [32, 100, 1])
myLayer = layers.Embedding(input_dim = 100, output_dim = 8)
# Sequential API
sm = keras.Sequential()
sm.add(myLayer)
sm_out = sm(inputs)
sm_out.shape # Shape of sm_out is: TensorShape([32, 100, 8])
# Functional API
fm_out = myLayer(inputs)
fm_out.shape # Shape of fm_out is: TensorShape([32, 100, 1, 8])
Is it intended or a bug?
First of all, your second call is not a functional API call. You would need to create a tf.keras.layers.Input and wrap the layer's output in a tf.keras.models.Model for it to be a functional API call (see the sketch at the end of this answer).
Secondly, when you call the sequential model, it is smart enough to detect that the last dimension is 1 and ignore it when looking up embeddings (I'm not sure where exactly this is handled; maybe someone else can point to it). So when you pass in a tensor of shape [32, 100, 1], what the embedding layer really sees is a [32, 100] sized array. This, after the lookup, gets converted to a [32, 100, 8] sized tensor.
In your second call, where the layer is called directly, it doesn't do this. So it simply converts the [32, 100, 1] sized input to a [32, 100, 1, 8] sized output.
You can get the same result from both these methods if you set your inputs shape to [32, 100] or [32, 100, 2] (last dimension != 1).
I guess the lesson here is to always use the input_shape argument (in the first layer of the Sequential model) to prevent such unexpected behavior.
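For reference, a minimal sketch of what an actual functional API version could look like (reusing the embedding layer from the question); with an explicit Input of shape (100,) it produces the same [32, 100, 8] output as the Sequential model:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = np.random.randint(0, 99, [32, 100]).astype("int32")  # no trailing dimension of 1

inp = keras.Input(shape=(100,), dtype="int32")
out = layers.Embedding(input_dim=100, output_dim=8)(inp)
fm = keras.Model(inp, out)  # functional API: Input -> layer -> Model

print(fm(inputs).shape)  # TensorShape([32, 100, 8]), matching the Sequential model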
I'm trying to re-implement code written in TensorFlow in PyTorch, but I came across max pooling. I looked into the documentation of the two frameworks and found that their behavior is not the same. Can someone please explain to me why they are different, and which one is more efficient (I ask because they give different results)?
import tensorflow
from tensorflow.keras.layers import GlobalMaxPool1D
tf_tensor = tensorflow.random.normal([8, 6, 5])
tf_maxpool = GlobalMaxPool1D()
print("output shape : ", tf_maxpool(tf_tensor).shape)
output shape : (8, 5)
import torch
import torch.nn as nn
torch_tensor = torch.tensor(tf_tensor.numpy())
maxpool = nn.MaxPool1d(kernel_size=2)
print("output shape : ", maxpool(torch_tensor).shape)
output shape : torch.Size([8, 6, 2])
MaxPool vs GlobalMaxPool
torch.nn.MaxPool1d pools every N adjacent values by performing a max operation.
For these values:
[1, 2, 3, 4, 5, 6, 7, 8]
with kernel_size=2 as you've specified, you would get the following values:
[2, 4, 6, 8]
which means a sliding window of size 2 gets the maximum value and moves on to the next pair.
Global Pooling is a similar operation, but it takes the maximum value from the whole list, as pointed out in Ivan's answer. In our case, we would simply get the single value 8.
This operation, in PyTorch, is called torch.nn.AdaptiveMaxPool1d (optionally followed by torch.nn.Flatten):
import torch
tensor = torch.randn(8, 6, 5)
global_max_pooling = torch.nn.Sequential(
    torch.nn.AdaptiveMaxPool1d(1),  # (8, 6, 1) shape
    torch.nn.Flatten(),             # (8, 6) after removing the unnecessary 1 dimension
)
global_max_pooling(tensor) # (8, 6)
The above explanation is simplified, as this operation is carried out across a specific dimension.
Tensorflow vs PyTorch shape difference
As one could notice, in the case of Tensorflow the output is of shape (8, 5), while in the case of PyTorch it is (8, 6).
This difference stems from different channels dimensions (see here for channels last in PyTorch), namely:
PyTorch assumes data layout of (batch, channels, sequence)
Tensorflow assumes data layout of (batch, sequence, channels) (a.k.a. channels last)
One has to permute the data in case of PyTorch to get exactly the same results:
tensor = tensor.permute(0, 2, 1) # (8, 5, 6)
global_max_pooling(tensor) # (8, 5)
Efficiency
Use torch.nn.AdaptiveMaxPool1d when you want to perform pooling with a specified output size (other than 1), as it skips some unnecessary operations that torch.nn.MaxPool1d performs (going over the same elements more than once, which is out of the scope of this question).
In general case, when we perform global pooling both are roughly equivalent and perform the same number of operations.
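As a quick sanity check, here is a small sketch (shapes assume PyTorch's channels-first (batch, channels, sequence) layout) showing that the different formulations agree for global max pooling:
import torch

x = torch.randn(8, 6, 5)  # (batch, channels, sequence)

out_adaptive = torch.nn.AdaptiveMaxPool1d(1)(x).squeeze(-1)                # (8, 6)
out_maxpool = torch.nn.MaxPool1d(kernel_size=x.shape[-1])(x).squeeze(-1)   # (8, 6)
out_max = x.max(dim=-1).values                                             # (8, 6)

assert torch.equal(out_adaptive, out_maxpool) and torch.equal(out_adaptive, out_max)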
Global maximum pooling has no window size; since it is global, it considers the whole sequence. The equivalent operation is simply torch.max taken over the sequence dimension, which is axis=1 here because the tensor keeps TensorFlow's (batch, sequence, channels) layout:
>>> maxpool = torch_tensor.max(1).values
>>> maxpool.shape
torch.Size([8, 5])
I know it's basic and too easy for you people, but I'm a beginner who needs your help.
I'm struggling to make a binary classifier with a CNN.
My final goal is to reach an accuracy over 0.99.
I import both MNIST and FASHION_MNIST to identify whether an image is a number or clothing.
So there are 2 categories. I want to categorize 0-60000 as 0, and 60001-120000 as 1.
I will use binary_crossentropy.
But I don't know how to start from the beginning.
How can I use vstack/hstack to combine MNIST and FASHION_MNIST?
This is how I tried so far
import numpy as np
from keras.datasets import mnist
from keras.datasets import fashion_mnist
import keras
import tensorflow as tf
from keras.utils.np_utils import to_categorical
num_classes = 2
train_images = train_images.astype("float32") / 255
test_images = test_images.astype("float32") / 255
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
train_labels = to_categorical(train_labels, num_classes)
test_labels = to_categorical(test_labels, num_classes)
First of all
They're images, so it's better to treat them as images and not reshape them into vectors.
Now the answer of the question. Suppose you have mnist_train_image and fashion_train_image, both have (60000, 28, 28) input shape.
What you want to do is consist of 2 parts, combining inputs and making the targets.
First the inputs
As you already wrote in the question, you can use np.vstack like this:
>>> train_image = np.vstack((fashion_train_image, mnist_train_image))
>>> train_image.shape
(120000, 28, 28)
But as you should have already noticed, remembering whether you need vstack or dstack or hstack is kind of a pain. My preference is to use np.concatenate instead:
>>> train_image = np.concatenate((fashion_train_image, mnist_train_image), axis=0)
>>> train_image.shape
(120000, 28, 28)
Now instead of remembering what the duck v or h or d mean, you just need to remember the axis (or dimension) you want to concatenate along, in this case the first axis, which means 0. This helps especially in a case like this one, where the "vertical" is the second axis (because it's a stack of images) and the first axis is the "batch".
Next, the labels
Since you want to categorize 0-60000 as 0, and 60001-120000 as 1, there are a lot of fancy ways to do this.
But in a nutshell you can use np.zeros to create an array filled with 0, and np.ones to, you guessed it, create an array filled with 1. Since both ones and zeros give you a float array, and I'm not sure whether that would become a problem, I add .astype('uint8') at the end just in case. You could pass dtype='uint8' to the function instead.
Use the concatenate from above
>>> train_labels = np.concatenate((np.zeros(60000), np.ones(60000))).astype('uint8')
>>> train_labels.shape
(120000,)
Use ones or zeros for the whole size and subtract or add or reassign the rest
>>> train_labels = np.zeros(120000).astype('uint8')
>>> train_labels[60000:] = 1
# or, equivalently:
>>> train_labels = np.ones(120000, dtype='uint8')
>>> train_labels[:60000] -= 1
Important!!!!
There's a noticeable mistake in your example about the labels: indices start at 0, so the 60,000th element is at index 59,999.
So what you actually want is categorize 0-59999 as 0, and 60000-119999 as 1.
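Putting the pieces together, a minimal end-to-end sketch (assuming both training sets are loaded with keras.datasets; the only thing that matters is that the order of the images matches the order of the labels, here MNIST first as class 0 and Fashion-MNIST second as class 1):
import numpy as np
from keras.datasets import mnist, fashion_mnist

(mnist_train_image, _), _ = mnist.load_data()
(fashion_train_image, _), _ = fashion_mnist.load_data()

# Images: concatenate along the batch axis and keep them as 28x28 images
train_images = np.concatenate((mnist_train_image, fashion_train_image), axis=0)
train_images = (train_images.astype("float32") / 255)[..., None]  # (120000, 28, 28, 1) for a CNN

# Labels: 0 for the MNIST half, 1 for the Fashion-MNIST half
train_labels = np.concatenate((
    np.zeros(len(mnist_train_image), dtype='uint8'),
    np.ones(len(fashion_train_image), dtype='uint8'),
))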
I regularly use scikit-learn pipelines to streamline model processing, and I'm wondering about the easiest way to do something similar with Keras in TensorFlow 2.0.
What I'd like to do is deploy a Keras model as an API endpoint, and then submit a piece of text in a numpy array to it and have it tokenized, padded and predicted. But I don't know the shortest path to do this.
Here's some sample code:
from tensorflow import keras
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, Dense, Flatten
import numpy as np
sample_words = [
    'The sky is blue',
    'The sky delivers us many gifts',
    'Wise men appreciate gifts for what they are, not what they are not',
    'Wherever you go, there you are',
    'Don\'t pass judgment onto others, or you will quickly be judged yourself'
]
y = np.array([1, 0, 1, 1, 0])
tokenizer = Tokenizer(num_words=10)
tokenizer.fit_on_texts(sample_words)
train_sequences = tokenizer.texts_to_sequences(sample_words)
train_sequences = pad_sequences(train_sequences, maxlen=7)
mod = Sequential([
    Embedding(10, 2, input_length=7),
    Flatten(),
    Dense(3, activation='relu'),
    Dense(1, activation='sigmoid')
])
mod.compile(optimizer='adam', loss='binary_crossentropy')
mod.fit(train_sequences, y)
The idea is that if I have a web form and someone submits a form with the words 'The sky is pretty today', I can wrap it in a numpy array, send it to the endpoint (which will be setup on Google Cloud), and have it padded, tokenized, and predicted.
In scikit-learn it would be as simple as pipe = make_pipeline(tokenizer, mod), and then go from there.
I have a feeling there are some solutions involving tf.data.Dataset, but I was hoping Keras had something built in that was more user friendly.
Keras is easy in the sense that there is no need to explicitly build any pipelines.
The Keras model uses the TensorFlow backend to create a computation graph, which could loosely be seen as similar to scikit-learn's pipeline.
Thus your mod is in itself equivalent to a pipeline with the operations Embedding -> Flatten -> Dense -> Dense. The mod.compile() method generates the TensorFlow computation graph.
Then everything comes together in the model.fit() method, where you plug your inputs into the model (i.e. the pipeline) and it trains on your data.
In order to have the tokenization be a part of your model, the TextVectorization layer can be used.
This layer has basic options for managing text in a Keras model. It transforms a batch of strings (one sample = one string) into either a list of token indices (one sample = 1D tensor of integer token indices) or a dense representation (one sample = 1D tensor of float values representing data about the sample's tokens)
Code snapshot:
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization  # under layers.experimental.preprocessing in older TF 2.x

max_features = 5000  # maximum vocabulary size
max_len = 4          # pad/truncate output sequences to this length
vectorize_layer = TextVectorization(
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=max_len
)
# The layer must learn a vocabulary before it can map strings to indices
# (this toy vocabulary comes from the layer's documentation example).
vectorize_layer.adapt(tf.data.Dataset.from_tensor_slices(["foo", "bar", "baz"]).batch(64))

model = tf.keras.models.Sequential()
model.add(tf.keras.Input(shape=(1,), dtype=tf.string))  # one string per sample
model.add(vectorize_layer)

input_data = [["foo qux bar"], ["qux baz"]]
model.predict(input_data)
>>>
array([[2, 1, 4, 0],
       [1, 3, 0, 0]])
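And to connect this back to the question: a rough sketch (assuming TF 2.x with the TextVectorization layer, and reusing sample_words and y from the snippet above) of folding the tokenization and padding into the model itself, so the deployed endpoint can accept raw strings:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization, Embedding, Flatten, Dense

vectorize_layer = TextVectorization(max_tokens=10, output_mode='int', output_sequence_length=7)
vectorize_layer.adapt(sample_words)

mod = tf.keras.Sequential([
    tf.keras.Input(shape=(1,), dtype=tf.string),  # one raw string per sample
    vectorize_layer,                              # tokenize + pad inside the model
    Embedding(10, 2, input_length=7),
    Flatten(),
    Dense(3, activation='relu'),
    Dense(1, activation='sigmoid'),
])
mod.compile(optimizer='adam', loss='binary_crossentropy')
mod.fit(np.array(sample_words)[:, None], y)

# The endpoint now takes raw text, with no separate tokenizer object to ship:
mod.predict(np.array([['The sky is pretty today']]))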