I have 60,000 images with 28x28 shape and 1 channel.
images.shape
# (60000, 300, 300, 1)
So, I want to reshape all the images, say, to 35x35. So, the shape should be:
# (60000, 500, 500, 1)
What I tried is I reshaped every single image one by one and than concatenated them:
images:
import cv2
images = np.reshape(x_test, (60000, 300, 300))
new_images = []
for img in images:
new_images.append(cv2.resize(img, (500, 500), interpolation=cv2.INTER_AREA))
new_images = np.array(new_images)
new_images = np.reshape(new_images, (60000, 500, 500, 1))
But it's taking long time. Is there any more effitient/faster/convenient way of doing this?
currently you resize your data every time after you load it. you should resize/resample your data once, then save that version of your data, and load that version instead.
you can save your whole array using numpy's np.save. that may go faster than loading individual images, which may need to be decoded too. with numpy you have the choice of saving compressed (.npz, np.savez) or uncompressed (.npy). uncompressed takes more space (so that can slow you down) but doesn't require compression/decompression.
Related
I noticed that the output from TensorFlow's image_dataset_from_directory is different than directly loading images (either by PIL, Keras' load_img, etc.). I set up an experiment: I have a single RGB image with dimensions 2400x1800x3, and tried comparing the resulting numpy arrays from the different methods:
from PIL import Image
from tensorflow.keras.utils import image_dataset_from_directory, load_img, img_to_array
train_set = image_dataset_from_directory(
'../data/',
image_size=(2400, 1800), # I'm using original image size
label_mode=None,
batch_size=1
)
for batch in train_set:
img_from_dataset = np.squeeze(batch.numpy()) # remove batch dimension
img_from_keras = img_to_array(load_img(img_path))
img_from_pil = img_to_array(Image.open(img_path))
print(np.all(img_from_dataset == img_from_keras)) # False
print(np.all(img_from_dataset == img_from_pil)) # False
print(np.all(img_from_keras == img_from_pil)) # True
So, even though all methods return the same shape numpy array, the values from image_dataset_from_directory are different. Why is this? And what can/should I do about it?
This is a particular problem during prediction time where I'm taking a single image (i.e. not using image_dataset_from_directory to load the image).
This is strange but I have not figured out exactly why but if you print out a pixel values from the img_from_dataset, img_from_keras and img_from_pil I found that the pixel values for img_from_data are sometimes lower by 1, that is it looks like some kind of rounding is going on. All 3 are supposed to return float32 so I can't see why they should be different. I also tried using
ImageDataGenerator().flow_from_directory and it matches the data for img_from_keras and img_from_pil. Now img_from_dataset return a A tf.data.Dataset object it yields float32 tensors of shape (batch_size, image_size[0], image_size[1], num_channels).
I used this code to detect the pixel value difference where I used a 224 X 224 X3 image
match=True
for i in range(224):
for j in range(224):
for k in range (3):
if img_from_dataset[i,j,k] != img_from_keras[i,j,k]:
match=False
print(img_from_dataset[i,j,k], img_from_keras[i,j,k], i, j, k)
break
if match==False:
break
if match == False:
break
print(match)
An example output of the code is
86.0 87.0 0 0 2
False
If you ever figured out why the difference let me know. I expect one will have to go through the detailed code. I took a quick look. Even though you specified the image size as being the same as the original image, image_dataset_from_directory still resizes the image using tf.image.resize with the iterpolation as interpolation='bilinear'. Maybe the load_img(img_path) and PIL image.open use different interpolations.
I'm currenly trying working with tensorflow dataset 'tf_flowers', and noticed that a lot of images consist mostly of black canvas, like this:
flower1
flower2
Is there any easy way to remove/or filter it out? Preferably it should work on batches, and compile into a graph with #tf.function, as I plan to use it also for bigger datasets with dataset.map(...)
The black pixels are just because of padding. This is a simple operation that allows you to have network inputs having the same size (i.e. you have batches containing images with the of size: 223x221 because smaller images are padded with black pixels).
An alternative to padding that removes the need of adding black pixels to the image, is that of preprocessing the images by:
removing padding via cropping operation
resizing the cropped images to the same size (e.g. 223x221)
You can do all of these operations in simple python, thanks to tensorflow map function. First, define your python function
def py_preprocess_image(numpy_image):
input_size = numpy_image.shape # this is (223, 221)
image_proc = crop_by_removing_padding(numpy_image)
image_proc = resize(image_proc, size=input_size)
return image_proc
Then, given your tensorflow dataset train_data, map the above python function on each input:
# train_data is your tensorflow dataset
train_data = train_data.map(
lambda x: tf.py_func(preprocess_image,
inp = [x], Tout=[tf.float32]),
num_parallel_calls=num_threads
)
Now, you only need to define crop_by_removing_padding and resize, which operate on ordinary numpy arrays and can thus be written in pure python code. For example:
def crop_by_removing_padding(img):
xmax, ymax = np.max(np.argwhere(img), axis=0)
img_crop = img[:xmax + 1, :ymax + 1]
return img_crop
def resize(img, new_size):
img_rs = cv2.resize(img, (new_size[1], new_size[0]), interpolation=cv2.INTER_CUBIC)
return img_rs
I have 100 large-ish (1000x1000) images which I want to use as a training data set for a texture analysis system. I want to randomly generate texture swatches of about 200x200. What is the best way to do this? I would prefer to not preprocess all of the swatches so that each epoch is trained with slightly different swatches.
My initial (naive?) implementation included preprocessing layers in the model that do random crops on the image and just do a ton of epochs to accommodate the small number of large pictures, however after about ~400 epochs TF would crash without exception (it would just exit).
I now find myself coding a data generator (tf.keras.utils.Sequence) that will return a batch of swatches on request, but I feel like I'm reinventing the wheel and it is getting clunky - making me think this can't be the best way.
What is the best way to handle such a situation where you have a somewhat small dataset that you dynamically create more samples from?
I have written a function that will segment an image. Code is below
import cv2
def image_segment( image_path, img_resize, crop_size):
image_list=[]
img=cv2.imread(image_path)
img=cv2.resize(img, img_resize)
shape=img.shape
xsteps =int( shape[0]/crop_size[0])
ysteps = int( shape[1]/crop_size[1])
print (xsteps, ysteps)
for i in range (xsteps):
for j in range (ysteps):
x= i * crop_size[0]
xend=x + crop_size[0]
y= j * crop_size[1]
yend = y + crop_size[1]
cropped_image = cropped_image=img[x: xend, y: yend]
image_list.append(cropped_image)
return image_list
below is an example of use
# This code provides input to the image_segment function
image_path=r'c:\temp\landscape.jpg' # location of image
width=1000 # width to resize input image
height= 800 # height to resize input image
image_resize=( width, height) # specify original image (width, height)
crop_width=200 # width of desired cropped images
crop_height=400 # height of desired cropped images
# Note to get full set of cropped images width/cropped_width and height/cropped_height should be integer values
crop_size=(crop_height, crop_width)
images=image_segment(image_path, image_resize, crop_size) # call the function
The code below will display the resized input image and the resultant cropped images
# this code will display the resized input image and the resultant cropped images
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
img=cv2.imread(image_path) # read in the image
input_resized_image=cv2.resize(img, image_resize) # resize the image
imshow(input_resized_image) # show the resized input image
r=len(images)
plt.figure(figsize=(20, 20))
for i in range(r):
plt.subplot(5, 5, i + 1)
image=images # scale images between 0 and 1 becaue pre-processor set them between -1 and +1
plt.imshow(image[i])
class_name=str(i)
plt.title(class_name, color='green', fontsize=16)
plt.axis('off')
plt.show()
I am new to lightgbm. I have big data (billions of rows constantly updated). The dataset prepared for training is also wide with around 400 columns.
I have 2 questions:
First, my kernel keeps dying after some thousands epochs even for such a small subset as 10 000 rows. Memory use keeps rising while training untill it fails. I have 126 gigabytes of memory.
I have tried training with different parameters, commented are the one that are tried as well
parameters = {
'histogram_pool_size': 5000,
'objective': 'regression',
'metric': 'l2',
'boosting': 'dart',#'gbdt
'num_leaves': 10, #100
'learning_rate': 0.01,
'verbose': 0,
'max_bin': 66,.
'force_col_wise':True, #default
'max_bin': 6, #60 #default
'max_depth': 10, #default
'min_data_in_leaf': 30, #default
'min_child_samples': 20,#default
'feature_fraction': 0.5,#default
'bagging_fraction': 0.8,#default
'bagging_freq': 40,#default
'bagging_seed': 11,#default
'lambda_l1': 2 #default
'lambda_l2': 0.1 #default }
Limiting number of columns seems to help, but I know that some columns that have low score with global feature importance would have significant importance in some local scope.
Second, what is the right way of training lightgbm with big data incrementally and updating lightgbm model with new data? I previously worked mainly with neural nets, which are trained incrementally by nature and I know that trees do not works this way and though it's technically possible to update the model it will not be the same as the model that is trained in a holistic way. How to deal with it?
full code:
# X is dataframe
cat_names = X.select_dtypes(['bool','category',object]).columns.tolist()
for c in cat_names: X[c] = X[c].astype('category')
cat_cols = [c for c, col in enumerate(cat_names)]
X[cat_names] = X[cat_names].apply(lambda x: x.cat.codes)
x = X.values
x_train, x_valid, y_train, y_valid = train_test_split(x, y, test_size=0.2, random_state=42)
train_ds = lightgbm.Dataset(x_train, label=y_train)
valid_ds = lightgbm.Dataset(x_valid, label=y_valid)
model = lightgbm.train(parameters,
train_ds,
valid_sets=valid_ds,
categorical_feature = cat_cols,
num_boost_round=2000,
early_stopping_rounds=50)
Changing data types to less verbose fixed the memory problem! If your dataset is pandas dataframe do something like this:
ds[ds.select_dtypes('float64').columns] = ds.select_dtypes('float64').astype('float32')
ds[ds.select_dtypes('int64').columns] = ds.select_dtypes('int64').astype('int32')
!!! caution Your data ranges may be out of the selected datatype range and pandas will mess up your data in that case. For example int8 dtype is ranges only within -128 to 127, so select the ones that are capable to handle your data.
You may check selected dtype range with
import numpy as np
np.iinfo('int32').min, np.iinfo('int32').max
Starting from the Tensorflow CNN example, I'm trying to modify the model to have multiple images as an input (so that the input has not just 3 input channels, but multiples of 3 by stacking images).
To augment the input, I try to use random image operations, such as flipping, contrast and brightness provided in TensorFlow.
My current solution to apply the same random distortion to all input images is to use a fixed seed value for these operations:
def distort_image(image):
flipped_image = tf.image.random_flip_left_right(image, seed=42)
contrast_image = tf.image.random_contrast(flipped_image, lower=0.2, upper=1.8, seed=43)
brightness_image = tf.image.random_brightness(contrast_image, max_delta=0.2, seed=44)
return brightness_image
This method is called multiple times for each image at graph construction time, so I thought for each image it will use the same random number sequence and consequently, it will result in have the same applied image operations for my image input sequence.
# ...
# distort images
distorted_prediction = distort_image(seq_record.prediction)
distorted_input = []
for i in xrange(INPUT_SEQ_LENGTH):
distorted_input.append(distort_image(seq_record.input[i,:,:,:]))
stacked_distorted_input = tf.concat(2, distorted_input)
# Ensure that the random shuffling has good mixing properties.
min_queue_examples = int(num_examples_per_epoch *
MIN_FRACTION_EXAMPLES_IN_QUEUE)
# Generate a batch of sequences and prediction by building up a queue of examples.
return generate_sequence_batch(stacked_distorted_input, distorted_prediction, min_queue_examples,
batch_size, shuffle=True)
In theory, this works fine. And after doing some test runs, this really seemed to solve my problem. But after a while, I found out that I'm having a race-condition, because I use the input pipeline of the CNN-example code with multiple threads (which is the suggested method in TensorFlow to improve performance and reduce memory consumption at runtime):
def generate_sequence_batch(sequence_in, prediction, min_queue_examples,
batch_size):
num_preprocess_threads = 8 # <-- !!!
sequence_batch, prediction_batch = tf.train.shuffle_batch(
[sequence_in, prediction],
batch_size=batch_size,
num_threads=num_preprocess_threads,
capacity=min_queue_examples + 3 * batch_size,
min_after_dequeue=min_queue_examples)
return sequence_batch, prediction_batch
Because multiple threads create my examples, it is not guaranteed anymore that all image operations are performed in the right order (in sense of the right order of random operations).
Here I came to a point where I got completely stuck. Does anyone know how to solve this problem to apply the same image distortion to multiple images?
Some thoughts of mine:
I thought about to do some synchronizations arround these image distortion methods, but I could find anything provided by TensorFlow
I tried to generate to generate a random number for e.g. the random brightness delta using tf.random_uniform() by myself and use this value for tf.image.adjust_contrast(). But the result of the TensorFlow random generator is always a tensor, and I have not found a way to use this tensor as a parameter for tf.image.adjust_contrast() which expects a simple float32 for its contrast_factor parameter.
A solution that would (partly) work would be to combine all images to a huge image using tf.concat(), apply random operations to change contrast and brightness, and split the image afterwards. But this would not work for random flipping, because this would (at least in my case) change the order of the images, and there is no way to detect whether tf.image.random_flip_left_right() has performed a flip or not, which would be required to fix the wrong order of images if necessary.
Here is what I came up with by looking at the code of random_flip_up_down and random_flip_left_right within tensorflow :
def image_distortions(image, distortions):
distort_left_right_random = distortions[0]
mirror = tf.less(tf.pack([1.0, distort_left_right_random, 1.0]), 0.5)
image = tf.reverse(image, mirror)
distort_up_down_random = distortions[1]
mirror = tf.less(tf.pack([distort_up_down_random, 1.0, 1.0]), 0.5)
image = tf.reverse(image, mirror)
return image
distortions = tf.random_uniform([2], 0, 1.0, dtype=tf.float32)
image = image_distortions(image, distortions)
label = image_distortions(label, distortions)
I would do something like this using tf.case. It allows you to specify what to return if certain condition holds https://www.tensorflow.org/api_docs/python/tf/case
import tensorflow as tf
def distort(image, x):
# flip vertically, horizontally, both, or do nothing
image = tf.case({
tf.equal(x,0): lambda: tf.reverse(image,[0]),
tf.equal(x,1): lambda: tf.reverse(image,[1]),
tf.equal(x,2): lambda: tf.reverse(image,[0,1]),
}, default=lambda: image, exclusive=True)
return image
def random_distortion(image):
x = tf.random_uniform([1], 0, 4, dtype=tf.int32)
return distort(image, x[0])
To check if it works.
import numpy as np
import matplotlib.pyplot as plt
# create image
image = np.zeros((25,25))
image[:10,5:10] = 1.
# create subplots
fig, axes = plt.subplots(2,2)
for i in axes.flatten(): i.axis('off')
with tf.Session() as sess:
for i in range(4):
distorted_img = sess.run(distort(image, i))
axes[i % 2][i // 2].imshow(distorted_img, cmap='gray')
plt.show()