What I mean is: can I, for example, construct two different sprite images and choose one of them while viewing embeddings in 2D/3D space using t-SNE/PCA?
In other words, when using the following code:
embedding.sprite.image_path = "Path/to/the/sprite_image.jpg"
Is there a way to add another sprite image?
So, when training a ConvNet to distinguish between MNIST digits, I don't just want to view the digits 0-9 in 2D/3D space; I want to see where the 1s gather in that space, and the same for the 2s, 3s, and so on. So I need a unique color for the 1s, another one for the 2s, and so on. I want a view like the following image:
[image: embedding points colored by digit class]
Any help is much appreciated!
There is an easier way to do this with filtering: you can select labels using the regex search box in the projector.
If this is not what you are looking for, you could create a sprite image that assigns the same plain-color image to each of your labels!
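If you go the sprite route, here is a minimal sketch (my own, not from the answer above) of building such a sprite: one flat-color thumbnail per embedded point, laid out on the square grid the projector expects. The labels variable and the 10-color palette are assumptions for the MNIST case:

import numpy as np
from PIL import Image

# Hypothetical palette: one RGB color per digit class 0-9
palette = np.array([
    [230,  25,  75], [ 60, 180,  75], [255, 225,  25], [  0, 130, 200],
    [245, 130,  48], [145,  30, 180], [ 70, 240, 240], [240,  50, 230],
    [210, 245,  60], [170, 110,  40],
], dtype=np.uint8)

def make_label_sprite(labels, thumb=32):
    # the projector expects a square grid of thumbnails, one per data point,
    # in the same order as the embedding rows
    n = len(labels)
    grid = int(np.ceil(np.sqrt(n)))
    sprite = np.zeros((grid * thumb, grid * thumb, 3), dtype=np.uint8)
    for idx, lab in enumerate(labels):
        row, col = divmod(idx, grid)
        sprite[row*thumb:(row+1)*thumb, col*thumb:(col+1)*thumb] = palette[int(lab)]
    return sprite

# labels: list of integer class ids, one per embedded point (assumed given)
Image.fromarray(make_label_sprite(labels)).save("Path/to/the/sprite_image.jpg")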
This functionality comes out of the box (no additional sprite images needed). See 'color by' in the left side panel of the projector. You can toggle the A button to switch sprite images on and off.
This run was produced with the example on the front page of the tensorboardX GitHub repo: https://github.com/lanpa/tensorboardX
You can also see a live demo with the MNIST dataset (images and colors) at http://projector.tensorflow.org/
import torch
import torchvision.utils as vutils
import numpy as np
import torchvision.models as models
from torchvision import datasets
from tensorboardX import SummaryWriter

resnet18 = models.resnet18(False)
writer = SummaryWriter()
sample_rate = 44100
freqs = [262, 294, 330, 349, 392, 440, 440, 440, 440, 440, 440]

for n_iter in range(100):
    dummy_s1 = torch.rand(1)
    dummy_s2 = torch.rand(1)
    # data grouping by `slash`
    writer.add_scalar('data/scalar1', dummy_s1[0], n_iter)
    writer.add_scalar('data/scalar2', dummy_s2[0], n_iter)
    writer.add_scalars('data/scalar_group', {'xsinx': n_iter * np.sin(n_iter),
                                             'xcosx': n_iter * np.cos(n_iter),
                                             'arctanx': np.arctan(n_iter)}, n_iter)

    dummy_img = torch.rand(32, 3, 64, 64)  # output from network
    if n_iter % 10 == 0:
        x = vutils.make_grid(dummy_img, normalize=True, scale_each=True)
        writer.add_image('Image', x, n_iter)

        dummy_audio = torch.zeros(sample_rate * 2)
        for i in range(dummy_audio.size(0)):  # fill the whole 2-second buffer
            # amplitude of sound should be in [-1, 1]
            dummy_audio[i] = np.cos(freqs[n_iter // 10] * np.pi * float(i) / float(sample_rate))
        writer.add_audio('myAudio', dummy_audio, n_iter, sample_rate=sample_rate)

        writer.add_text('Text', 'text logged at step:' + str(n_iter), n_iter)

        for name, param in resnet18.named_parameters():
            writer.add_histogram(name, param.clone().cpu().data.numpy(), n_iter)

        # needs tensorboard 0.4RC or later
        writer.add_pr_curve('xoxo', np.random.randint(2, size=100), np.random.rand(100), n_iter)

# add_embedding logs the 100 test images with their labels; the projector's
# 'color by' panel then colors points by label out of the box
dataset = datasets.MNIST('mnist', train=False, download=True)
images = dataset.test_data[:100].float()
label = dataset.test_labels[:100]
features = images.view(100, 784)
writer.add_embedding(features, metadata=label, label_img=images.unsqueeze(1))

# export scalar data to JSON for external processing
writer.export_scalars_to_json("./all_scalars.json")
writer.close()
There are some threads mentioning that this currently fails beyond a threshold number of data points; see the issues at https://github.com/lanpa/tensorboardX
Related
I am trying to do image colorization. I have 5000 images (256x256x3) and would like not to load all the data into my program at once (for memory reasons). I have found that it is possible to use ImageDataGenerator.flow_from_directory(), but I use LAB images and I would like to feed my model with a numpy array of the L component (256, 256, 1). My targets are the A and B components (256, 256, 2). To get my image back I then merge the input and output into a LAB image (256, 256, 3). The problem is that ImageDataGenerator.flow_from_directory() only works with image-type files (i.e. a 256x256x3 image), and I would like to know if there is a way to do the same thing with numpy arrays.
I tried using tf.data.Dataset.list_files(); I get all my file names, but I have not found how to load my numpy arrays to feed my model. I guess I need to use some sort of generator, but I do not really understand how to use one. This is what I have for now:
HEIGHT = 256
WIDTH = HEIGHT
Batch_size = 50

dir_X_train = 'data/X_train_np/train_black_resized/*.npy'
dir_X_test = 'data/X_test/test_black_resized/*.npy'
dir_y_train = 'data/y_train_np/train_color_resized/*.npy'
dir_y_test = 'data/y_test/test_color_resized/*.npy'

X_train_dataset = tf.data.Dataset.list_files(dir_X_train, shuffle=False).batch(Batch_size)
y_train_dataset = tf.data.Dataset.list_files(dir_y_train, shuffle=False).batch(Batch_size)

def process_path(file_path):
    return tf.io.read_file(file_path[0])

X_train_dataset = X_train_dataset.map(process_path)
y_train_dataset = y_train_dataset.map(process_path)

train_dataset = tf.data.Dataset.zip((X_train_dataset, y_train_dataset))

for image_black, image_color in train_dataset.take(1):
    print(image_black.numpy()[:100])
    print(type(image_black))
    print(image_color.numpy()[:100])
    print(type(image_color))
Output :
b"\x93NUMPY\x01\x00v\x00{'descr': '<f4', 'fortran_order': False, 'shape': (256, 256), } "
<class 'tensorflow.python.framework.ops.EagerTensor'>
b"\x93NUMPY\x01\x00v\x00{'descr': '<f4', 'fortran_order': False, 'shape': (256, 256, 2), } "
<class 'tensorflow.python.framework.ops.EagerTensor'>
The shape seems to be correct, but I don't know how to get the numpy array itself.
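One way to get actual arrays instead of raw file bytes (a sketch of mine, not a tested solution for this exact layout) is to wrap np.load in tf.numpy_function so each .npy file is decoded inside the pipeline. The shapes and dtypes below are taken from the printed headers; everything else is an assumption, including that file names in the X and y directories pair up when sorted:

import numpy as np
import tensorflow as tf

def load_npy_pair(x_path, y_path):
    # tf.numpy_function hands the function plain bytes, so decode to a str path
    x = np.load(x_path.decode())          # L channel, shape (256, 256)
    y = np.load(y_path.decode())          # A/B channels, shape (256, 256, 2)
    return x[..., np.newaxis].astype(np.float32), y.astype(np.float32)

def tf_load_npy_pair(x_path, y_path):
    x, y = tf.numpy_function(load_npy_pair, [x_path, y_path],
                             [tf.float32, tf.float32])
    x.set_shape((256, 256, 1))            # restore static shapes lost by numpy_function
    y.set_shape((256, 256, 2))
    return x, y

# assumes matching file names sort identically in both directories
paths = tf.data.Dataset.zip((
    tf.data.Dataset.list_files(dir_X_train, shuffle=False),
    tf.data.Dataset.list_files(dir_y_train, shuffle=False),
))
train_dataset = paths.map(tf_load_npy_pair).batch(Batch_size)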
I am trying to find the top 200 principal components of a dataset of 846 images (2048x2048x3 RGB) with sklearn.decomposition.IncrementalPCA.
The data are read with cv2 and reshaped into a 2D np array (shape [846, 2048*2048*3], float16).
To keep the memory cost small, I use partial_fit() and divide the original data into smaller chunks (batches) in both the partial_fit() and transform() steps,
just like the solution to this question:
Python PCA on Matrix too large to fit into memory
My code works well for relatively small computations, such as computing 20 components for 200 images of the dataset; it outputs correct results.
However, the task demands 200 components, which means my batch size must be at least 200 (according to sklearn's documentation and the messages in the terminal when running the code):
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.IncrementalPCA.html
With such a big chunk size I can finish fitting the IPCA model, but I always hit a MemoryError during partial_fit().
What's more, there is another problem:
I need to use inverse_transform later, and I am not sure whether I can apply it chunk-wise as well. (In the code below I did not.)
What can I do to avoid this MemoryError? Or should I replace IncrementalPCA with some other method? (Any alternative should provide something like inverse_transform().)
The total memory I can access is 131661572 kB (~127 GB).
My code:
from sklearn.decomposition import PCA, IncrementalPCA
import numpy as np
import cv2
import os

folder_path = "./output_img"
input = []
for i in range(1, 847):
    if i % 10 == 0: print("loading", i, "th image")
    # if i == 60: continue  # special case, should be skipped
    image_path = folder_path + f"/{i}neutral.jpg"
    img = cv2.imread(image_path)
    input.append(img.reshape(-1))
print("Loaded all", i, "images")

# stack into one numpy matrix, one flattened image per row
all_image = np.stack(input, axis=0)
# cast to float16 to halve the memory footprint (values stay in 0-255, not rescaled)
all_image = all_image.astype(np.float16)
### shape: #_of_imag x image_pixel_num (50331648 for img_normals case)
# print(all_image)
# print(all_image.shape)

# PCA, keep 200 components
COM_NUM = 200
pca = IncrementalPCA(n_components=COM_NUM)
print("finished IPCA model set")
saving_path = "./principle847"

element_num = all_image.shape[0]  # how many elements (rows) we have in the dataset
chunk_size = 220                  # how many elements we feed to IPCA at a time
for i in range(0, element_num // chunk_size):
    pca.partial_fit(all_image[i*chunk_size : (i+1)*chunk_size])
    print("finished PCA fit:", i*chunk_size, "to", (i+1)*chunk_size)
pca.partial_fit(all_image[(i+1)*chunk_size : element_num])  # tail
print("finished PCA fit:", (i+1)*chunk_size, "to", element_num)

for i in range(0, element_num // chunk_size):
    if i == 0:
        result = pca.transform(all_image[i*chunk_size : (i+1)*chunk_size])
    else:
        tmp = pca.transform(all_image[i*chunk_size : (i+1)*chunk_size])
        result = np.concatenate((result, tmp), axis=0)
    print("finished PCA transform:", i*chunk_size, "to", (i+1)*chunk_size)
tmp = pca.transform(all_image[(i+1)*chunk_size : element_num])  # tail
result = np.concatenate((result, tmp), axis=0)
print("finished PCA transform:", (i+1)*chunk_size, "to", element_num)

result = pca.inverse_transform(result)

print("PCA mean:", pca.mean_)
mean_img = pca.mean_
mean_img = mean_img.reshape(2048, 2048, 3)
mean_img = mean_img.astype(np.uint8)
cv2.imwrite(os.path.join(saving_path, "mean.png"), mean_img)

result = result.reshape(-1, 2048, 2048, 3)
# result shape: #_of_components x 2048 x 2048 x 3
dst = result
# dst = result / np.linalg.norm(result, axis=(3), keepdims=True)
for j in range(0, COM_NUM):
    reconImage = dst[j]
    # reconImage = reconImage.reshape(4096, 4096, 3)
    reconImage = np.clip(reconImage, 0, 255)
    reconImage = reconImage.astype(np.uint8)
    cv2.imwrite(os.path.join(saving_path, "p" + str(j) + ".png"), reconImage)
    print("Saved", j+1, "principle imgs")
The error looks like this:
File "model_generate.py", line 36, in <module>
    pca.partial_fit(all_image[i*chunk_size : (i+1)*chunk_size])
File "/root/anaconda3/envs/PCA/lib/python3.8/site-packages/sklearn/decomposition/_incremental_pca.py", line 299, in partial_fit
    U, V = svd_flip(U, V, u_based_decision=False)
File "/root/anaconda3/envs/PCA/lib/python3.8/site-packages/sklearn/utils/extmath.py", line 538, in svd_flip
    max_abs_rows = np.argmax(np.abs(v), axis=1)
File "/root/anaconda3/envs/PCA/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 1103, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
File "/root/anaconda3/envs/PCA/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 56, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
MemoryError
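Not part of the original post, but one way to cut peak memory is to never materialise all_image at all: load each chunk from disk right before partial_fit, and feed float32 (as far as I can tell, sklearn upcasts float16 input to float64 inside partial_fit, so float16 saves nothing there). A sketch reusing the file layout from the question:

import numpy as np
import cv2
from sklearn.decomposition import IncrementalPCA

folder_path = "./output_img"
COM_NUM = 200
chunk_size = 220                       # must stay >= n_components

def load_chunk(ids):
    # read and flatten only this chunk's images
    rows = [cv2.imread(f"{folder_path}/{i}neutral.jpg").reshape(-1) for i in ids]
    return np.stack(rows, axis=0).astype(np.float32)

pca = IncrementalPCA(n_components=COM_NUM)
all_ids = list(range(1, 847))
for start in range(0, len(all_ids), chunk_size):
    ids = all_ids[start:start + chunk_size]
    if len(ids) >= COM_NUM:            # partial_fit needs >= n_components rows; the short tail is skipped here
        pca.partial_fit(load_chunk(ids))

# transform and inverse_transform are per-row linear maps, so they can be
# applied chunk-wise as well, which answers the second question above
for start in range(0, len(all_ids), chunk_size):
    ids = all_ids[start:start + chunk_size]
    recon = pca.inverse_transform(pca.transform(load_chunk(ids)))
    # ... save or process `recon` for this chunk before loading the next one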
Torchvision's RandomResizedCrop is a tool I've found extremely handy when I'm working with datasets of high-resolution images at different sizes and aspect ratios and need to resize them down to a uniform size and aspect ratio without squashing or stretching.
Is there an equivalent to this in TensorFlow that can be mapped across a TensorFlow dataset, or a lambda function using TensorFlow operations that would achieve the same result?
I couldn't find any equivalent in the library, but a somewhat "official" solution is present in the NNCLR tutorial. It indeed relies on the tf.image.crop_and_resize function, as #TFer pointed out.
I modified it a bit to make sure it also has an equivalent of the size argument from the PyTorch implementation, which I called crop_shape, a name I find clearer:
import tensorflow as tf

class RandomResizedCrop(tf.keras.layers.Layer):
    # taken from
    # https://keras.io/examples/vision/nnclr/#random-resized-crops
    def __init__(self, scale, ratio, crop_shape):
        super(RandomResizedCrop, self).__init__()
        self.scale = scale
        self.log_ratio = (tf.math.log(ratio[0]), tf.math.log(ratio[1]))
        self.crop_shape = crop_shape

    def call(self, images):
        batch_size = tf.shape(images)[0]

        random_scales = tf.random.uniform(
            (batch_size,),
            self.scale[0],
            self.scale[1]
        )
        random_ratios = tf.exp(tf.random.uniform(
            (batch_size,),
            self.log_ratio[0],
            self.log_ratio[1]
        ))

        new_heights = tf.clip_by_value(
            tf.sqrt(random_scales / random_ratios),
            0,
            1,
        )
        new_widths = tf.clip_by_value(
            tf.sqrt(random_scales * random_ratios),
            0,
            1,
        )
        height_offsets = tf.random.uniform(
            (batch_size,),
            0,
            1 - new_heights,
        )
        width_offsets = tf.random.uniform(
            (batch_size,),
            0,
            1 - new_widths,
        )

        bounding_boxes = tf.stack(
            [
                height_offsets,
                width_offsets,
                height_offsets + new_heights,
                width_offsets + new_widths,
            ],
            axis=1,
        )
        images = tf.image.crop_and_resize(
            images,
            bounding_boxes,
            tf.range(batch_size),
            self.crop_shape,
        )
        return images
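Hypothetical usage (the scale and ratio values below are my own, chosen to mirror torchvision's defaults; dataset is assumed to yield batched float image tensors):

crop = RandomResizedCrop(scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3), crop_shape=(224, 224))

# the layer samples one box per image in the batch, so batch before mapping
dataset = dataset.batch(32).map(lambda images, labels: (crop(images), labels))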
You can use tf.image.crop_and_resize to crop and resize the image.
TensorFlow's documentation describes it as follows:
"Returns a tensor with crops from the input image at positions defined at the bounding box locations in boxes. The cropped boxes are all resized (with bilinear or nearest neighbor interpolation) to a fixed size = [crop_height, crop_width]."
For more details, see the documentation: https://www.tensorflow.org/api_docs/python/tf/image/crop_and_resize. Thanks
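A minimal standalone sketch of the call (the shapes and box values are made up): boxes are given in normalised [y1, x1, y2, x2] coordinates, and box_indices says which image in the batch each box crops from:

import tensorflow as tf

images = tf.random.uniform((1, 256, 256, 3))   # a dummy batch of one image
boxes = [[0.1, 0.1, 0.9, 0.9]]                 # crop the central 80% of the image
box_indices = [0]                              # the box applies to image 0
crops = tf.image.crop_and_resize(images, boxes, box_indices, crop_size=(224, 224))
print(crops.shape)                             # (1, 224, 224, 3)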
I need to access image shapes to build an augmentation pipeline, but when I access them through image.shape[0] and image.shape[1] inside the mapped function I'm unable to perform the augmentations, since the tensors' shapes come back as None.
Related issues: How to access Tensor shape in .map?
I'd appreciate it if anyone could help.
parsed_dataset = tf.data.TFRecordDataset(filenames=train_records_paths).map(parsing_fn) # Returns [image,label]
augmented_dataset = parsed_dataset.map(augment_pipeline)
augmented_dataset = augmented_dataset.unbatch()
The mapped function:
def augment_pipeline(original_image, label):
    """
    Returns:
        5 versions of the original image (4 corner crops + a central crop) and the respective labels.
    """
    central_crop = lambda image: tf.image.central_crop(image, 0.5)
    corner_crops = lambda image: tf.image.extract_patches(
        images=tf.expand_dims(image, 0),  # turn the image into a batch of a single sample
        sizes=[1, int(0.5 * image.shape[0]), int(0.5 * image.shape[1]), 1],  # 50% of the image's height and width
        rates=[1, 1, 1, 1],
        strides=[1, int(0.5 * image.shape[0]), int(0.5 * image.shape[1]), 1],
        padding="SAME")

    reshaped_patches = tf.reshape(corner_crops(original_image),
                                  [-1, int(0.5 * original_image.shape[0]), int(0.5 * original_image.shape[1]), 3])
    images = tf.concat([reshaped_patches, tf.expand_dims(central_crop(original_image), axis=0)], axis=0)
    label = tf.reshape(label, [1, 1])
    labels = tf.tile(label, [5, 1])
    return images, labels
After further research, I was able to manage by using py_func as suggested here and tf.shape(image)[0] as shown here.
Code:
"""
Returns:
5 Versions of the original image: 4 corner crops + a central crop and the respective labels.
"""
def augment_pipeline(original_image,label):
height = int(tf.shape(original_image)[0].numpy() * 0.5) # 50% of the image's height and width
width = int(tf.shape(original_image)[1].numpy() * 0.5)
central_crop = lambda image: tf.image.central_crop(image,0.5)
corner_crops = lambda image: tf.image.extract_patches(images=tf.expand_dims(image,0), # Transform image in a batch of single sample
sizes=[1, height, width, 1],
rates=[1, 1, 1, 1],
strides=[1, height, width, 1],
padding="SAME")
.
.
.
Then we use tf.py_function to allow access to numpy values inside the map function:
parsed_dataset = tf.data.TFRecordDataset(filenames=train_records_paths).map(parsing_fn)  # Returns [image, label]
augmented_dataset = parsed_dataset.map(lambda image, label: tf.py_function(func=augment_pipeline,
                                                                           inp=[image, label],
                                                                           Tout=[tf.float32, tf.int64]))
augmented_dataset = augmented_dataset.unbatch()
Every Dataset object is iterable. A Dataset object can be in either batched or unbatched form; here is how to get the shapes of its elements in both cases.
Case 1. The dataset is in unbatched form.
Method 1. Consuming its elements using iter:
it = iter(dataset)
element = next(it)
image, label = element
## element is a tuple
Method 2. Using take. Note that take(1) returns a new Dataset of one element, so it still has to be iterated:
element = next(iter(dataset.take(1)))
image, label = element
## element is a tuple
Case 2. The dataset is batched. I assume the dataset contains (image, label) tuples.
Method 1. Using iter:
it = iter(dataset)
batch = next(it)
images, labels = batch
## batch is a tuple; check it using type(batch)
Method 2. Using take:
batch = dataset.take(1)
## Note: here each element of the dataset is a batch, and each batch contains
## some number of (image, label) tuples
batch = next(iter(batch))
images, labels = batch
## batch is again a tuple
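If you only need the shapes and not the values, you can also inspect the dataset's static structure without pulling an element at all; element_spec is a standard tf.data attribute:

print(dataset.element_spec)
# e.g. for a batched (image, label) dataset:
# (TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name=None),
#  TensorSpec(shape=(None,), dtype=tf.int64, name=None))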
I am trying to display images from the CIFAR-10 TensorFlow tutorial. The images come out transformed, so that the values read are floats roughly between -1 and 3. I'm not sure what kind of transformation has been applied. How can I display them to see the original content?
Here is what the part of the image output looks like:
array([[ 1.24836731,  0.04940184, -1.49835348],
       [ 1.117571  ,  0.02760247, -1.56375158],
       [ 1.24836731,  0.18019807, -1.41115606],
       [ 1.18296909,  0.09300058, -1.47655416],
       [ 1.13937044,  0.02760247, -1.54195225],
       [ 1.13937044,  0.09300058, -1.52015293],
       ...
np.max(image)
2.9269187
np.min(image)
-1.759946
This is the link to the tutorial:
https://www.tensorflow.org/tutorials/deep_cnn/
Edit:
Rescaling does not seem to work for me.
Try scaling the image to be between 0 and 255: subtract the minimum, divide by the new maximum to get values in [0, 1], then multiply by 255.
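A minimal sketch of that rescaling, assuming image is the float numpy array from the question:

import numpy as np
import matplotlib.pyplot as plt

img = image - image.min()             # shift so the minimum becomes 0
img = img / img.max()                 # scale into [0, 1]
img = (img * 255).astype(np.uint8)    # expand to [0, 255] for display
plt.imshow(img)
plt.show()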
A couple of ways to do this. For the greyscale MNIST images:
import matplotlib.pyplot as plt
import matplotlib.cm as cm

tmp = mnist.train.images[0]
tmp = tmp.reshape((28, 28))
plt.imshow(tmp, cmap=cm.Greys)
plt.show()
Or, for CIFAR-10 images:
The code below is taken from this tutorial:
import numpy as np
import matplotlib.pyplot as plt

def visualize_sample(X_train, y_train, classes, samples_per_class=7):
    """Visualize some samples from the training dataset."""
    num_classes = len(classes)
    for y, cls in enumerate(classes):
        idxs = np.flatnonzero(y_train == y)  # get all the indexes of this class
        idxs = np.random.choice(idxs, samples_per_class, replace=False)
        for i, idx in enumerate(idxs):  # plot the images one by one
            plt_idx = i * num_classes + y + 1  # i*num_classes and y+1 determine the row and column respectively
            plt.subplot(samples_per_class, num_classes, plt_idx)
            plt.imshow(X_train[idx].astype('uint8'))
            plt.axis('off')
            if i == 0:
                plt.title(cls)
    plt.show()
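Hypothetical usage, assuming X_train holds the raw CIFAR-10 images and y_train the integer labels:

classes = ['plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
visualize_sample(X_train, y_train, classes)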