I have successfully run Keras on BMP RGB files, but now I need to increase the number of channels in my data, so I've switched to NPY files.
In the process I discovered that the ImageDataGenerator only works with image files...
So I've decided to assemble my train/test data and train/test labels into an mnist-type file, because there are many Keras scripts out there that read directly from mnist.npz.
But I don't understand how to get from my directories of data to a mnist.npz file.
My data is organized as follows:
a train directory full of npy files
a test directory full of npy files
a txt file with labels (one hot encoding) for the train npy files
a txt file with labels (one hot encoding) for the test npy files
Each line in the label files looks like this: aaa.npy 100000000000000000000
Any suggestions are welcome.
Cheers!
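For reference, here is a minimal sketch of how a layout like the one described above could be packed into an mnist.npz-style archive. The directory names, the label-file names and the parsing of the one-hot string are assumptions based on the description; the key names match what Keras' own mnist.npz uses.

import os
import numpy as np

def load_split(data_dir, label_file):
    # Each line is "<filename>.npy <one-hot string>", e.g. "aaa.npy 1000...0"
    data, labels = [], []
    with open(label_file) as f:
        for line in f:
            name, onehot = line.split()
            data.append(np.load(os.path.join(data_dir, name)))
            labels.append([int(c) for c in onehot])
    return np.stack(data), np.array(labels)

# Placeholder paths; adjust to your own directories and label files.
x_train, y_train = load_split('train', 'train_labels.txt')
x_test, y_test = load_split('test', 'test_labels.txt')

np.savez('mnist.npz', x_train=x_train, y_train=y_train,
         x_test=x_test, y_test=y_test)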
I have a folder in which I have 100+ .npy files.
The path to this folder is '/content/drive/MyDrive/lung_cancer/subset0/trainImages'.
This folder contains the .npy files, as shown in the attached image.
The shape of each of these .npy files is (3,512,512)
I want to combine all of these files into one single file with the name trainImages.npy so that I can train my unet model with it.
My unet model takes input of the shape (1,512,512).
I will load the above trainImages.npy file into imgs_train as below to pass it as input into the unet model:
imgs_train = np.load(working_path+"trainImages.npy").astype(np.float32)
Can someone please tell me how I can concatenate all those .npy files into one single .npy file?
Thanks.
So I found the answer myself, and I am attaching the code below in case anyone needs it. Change it according to your needs.
import os
import numpy as np
path = '/content/drive/MyDrive/lung_cancer/subset0/trainImages/'
trainImages = []
for i in os.listdir(path):
    data = np.load(path + i)      # each file is a (3, 512, 512) array
    trainImages.append(data)
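The snippet above only collects the arrays into a Python list. A final step along these lines turns it into the single trainImages.npy file; the axis choice is an assumption based on the (3, 512, 512) file shape and the (1, 512, 512) model input mentioned in the question.

# Concatenate along axis 0: N files of shape (3, 512, 512) become one
# (N * 3, 512, 512) array of single-channel 512x512 slices, then save it.
trainImages = np.concatenate(trainImages, axis=0)
np.save(path + 'trainImages.npy', trainImages)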
I am training an auto-encoder (Keras) on Google Colab. However, I have 25,000 input images and 25,000 output images. I tried to:
1- Copy the large file from Google Drive to Colab each time (takes 5-6 hours).
2- Convert the set to a numpy array, but when normalizing the images the size gets a lot bigger (from 7GB to 24GB, for example) and then I cannot fit it into RAM.
3- I cannot zip and unzip my data.
So please, if anyone knows how to convert the images into a numpy array (and normalize them) without ending up with a huge file (24GB), let me know.
What I usually do:
Zip all the images and upload the .zip file to your Google Drive.
Unzip it in your Colab notebook:
from zipfile import ZipFile

with ZipFile('data.zip', 'r') as zip:
    zip.extractall()
All your images are now unzipped and stored on the Colab disk, so you have much faster access to them.
Use generators in Keras, like flow_from_directory, or create your own generator (see the sketch below).
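As a rough example, a pair of generators built with flow_from_directory might look like this; the directory names, image size and batch size are placeholders, and the generator names match the fit call below (class_mode='input' makes the targets equal to the inputs, which suits an auto-encoder):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

batch_size = 32                                # placeholder
datagen = ImageDataGenerator(rescale=1./255)   # normalizes batch by batch, never the whole set in RAM

train_generator = datagen.flow_from_directory(
    'data/train', target_size=(256, 256),
    batch_size=batch_size, class_mode='input')
val_generator = datagen.flow_from_directory(
    'data/validation', target_size=(256, 256),
    batch_size=batch_size, class_mode='input')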
Use your generator when you fit your model:
model.fit(train_generator, steps_per_epoch=ntrain // batch_size,
          epochs=epochs, validation_data=val_generator,
          validation_steps=nval // batch_size)
where ntrain and nval are the number of images in your training and validation datasets.
I have a trained model in 'h5' format. It contains some layers and their names. I want to read the weights and write them to a single text file.
I am trying to use h5py, but it requires specifying each layer name manually before the weights can be extracted and saved.
Is there another technique to write the weights to a text file automatically?
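In case it is useful, here is a minimal sketch of one possible approach: h5py's visititems walks every group and dataset in the file, so the layer names do not have to be typed by hand. The file and output names are placeholders, and the flattened one-row-per-tensor layout is just one choice among many.

import h5py
import numpy as np

with h5py.File('model.h5', 'r') as h5, open('weights.txt', 'w') as out:
    def dump(name, obj):
        if isinstance(obj, h5py.Dataset):          # only datasets hold weight arrays
            out.write('# %s %s\n' % (name, obj.shape))
            np.savetxt(out, np.array(obj).reshape(1, -1), fmt='%.8g')
    h5.visititems(dump)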
I've been looking for a clear answer, but couldn't find one until now.
In TensorFlow, after training executes, four files are generated:
.meta,
.data,
.index and
checkpoint
What is the utility of the .index file?
Thanks!
The .index file holds an immutable key-value table that maps each serialized tensor name to the location of its data in the .data file(s).
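You normally never read the .index file directly; TensorFlow's checkpoint utilities consult it when you look variables up. A small illustration (the checkpoint prefix 'model.ckpt' is a placeholder, i.e. the common part of the .index/.data file names):

import tensorflow as tf

for name, shape in tf.train.list_variables('model.ckpt'):
    print(name, shape)
    value = tf.train.load_variable('model.ckpt', name)  # located via the .index table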
I have an image dataset in which each image is 600 x 400, and I have converted each of the images to TFRecord format. But I am unable to figure out how to use this data. I have looked at the ImageNet dataset and found only a single binary file (when extracted from here).
Is there supposed to be only one TFRecord for an image dataset, or will each individual image have its own TFRecord file?
TensorFlow doesn't look for a single TFRecord file, so feel free to point your "data directory" and "train directory" to the location that holds your set of TFRecord files.
Also, keep in mind that the files should sit in the respective directories based on their names, e.g. TRAIN-*.tfrecord files in the "train directory".
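For instance, with the tf.data API (TensorFlow 2 style) you can point a dataset at every TFRecord file in a directory at once. The paths and the feature description below are placeholders; they depend on how the records were actually written.

import tensorflow as tf

files = tf.data.Dataset.list_files('train_dir/TRAIN-*.tfrecord')
dataset = tf.data.TFRecordDataset(files)

# The feature spec must match the features used when the records were written.
feature_spec = {
    'image': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64),
}

def parse(example_proto):
    return tf.io.parse_single_example(example_proto, feature_spec)

dataset = dataset.map(parse).batch(32)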
The answer could be more specific if you mentioned which TF model you are planning to run on these TFRecord files.
Hope it helps.