I want to retrain the Inception model on TIFF images. I followed the steps in https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0. However, it seems TIFF images are not supported, because I received the following output:
2017-06-22 16:52:56.712653: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
Looking for images in 'Type 1'
No files found
Looking for images in 'Type 2'
No files found
No valid folders of images found at Myfolder
Is there any way to handle this issue?
You're right in saying that TensorFlow does not support TIFF images.
See here: No Tensorflow decoder for TIFF images?
If you want to use TIFF images, you could use a library like PIL or Pillow which can read TIFF images and convert them into a numpy array to feed into TensorFlow.
See Working with TIFFs (import, export) in Python using numpy for an example.
If you have a large number of TIFF files, this approach will slow training: most of the time goes to reading and decoding TIFF files, which starves the GPU of data.
In this case, take a look at https://www.tensorflow.org/extend/new_data_formats on how to support custom file formats.
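Before converting anything, it can help to confirm which extensions are actually present in each class folder. The sketch below is a minimal, standard-library-only check; it assumes the retraining script selects files by extension (which the "No files found" messages suggest), and `extension_census` is a hypothetical helper name, not part of TensorFlow.

```python
import os
from collections import Counter


def extension_census(image_dir):
    """Return {class_folder: Counter({extension: file_count})}."""
    census = {}
    for entry in sorted(os.listdir(image_dir)):
        class_path = os.path.join(image_dir, entry)
        if not os.path.isdir(class_path):
            continue  # only sub-folders are treated as classes
        counts = Counter()
        for name in os.listdir(class_path):
            if os.path.isfile(os.path.join(class_path, name)):
                # Lower-case the extension so ".TIF" and ".tif" are counted together.
                counts[os.path.splitext(name)[1].lower()] += 1
        census[entry] = counts
    return census
```

Calling `extension_census("Myfolder")` and printing the result shows, per class folder, how many files carry each extension, so a folder that contains only `.tif` files is easy to spot.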
If you would like to go with the conversion route, this code, which I adapted with slight modification from Lipin Yang's website, worked nicely to convert TIFF to JPEG for a recent TensorFlow project.
import os
from PIL import Image
current_path = os.getcwd()
for root, dirs, files in os.walk(current_path, topdown=False):
    for name in files:
        print(os.path.join(root, name))
        #if os.path.splitext(os.path.join(root, name))[1].lower() == ".tiff":
        if os.path.splitext(os.path.join(root, name))[1].lower() == ".tif":
            if os.path.isfile(os.path.splitext(os.path.join(root, name))[0] + ".jpg"):
                print ("A jpeg file already exists for %s" % name)
            # If a jpeg with the name does *NOT* exist, convert one from the tif.
            else:
                outputfile = os.path.splitext(os.path.join(root, name))[0] + ".jpg"
                try:
                    im = Image.open(os.path.join(root, name))
                    print ("Converting jpeg for %s" % name)
                    im.thumbnail(im.size)
                    im.save(outputfile, "JPEG", quality=100)
                except Exception as e:
                    print(e)
To save .jpg files in another directory (Extending Beau Hilton's answer)
import os
from PIL import Image
main_path = "your/main/path"
data_folder = os.path.join(main_path, "Images_tiff")
data_folder_jpg = os.path.join(main_path, "Images_jpg")
if not os.path.isdir(data_folder_jpg):
    os.mkdir(data_folder_jpg)
for root, dirs, files in os.walk(data_folder, topdown=False):
    # Output folder is named after the last component of the source folder.
    new_folder = os.path.join(data_folder_jpg, os.path.split(root)[1])
    if (not os.path.exists(new_folder)) and files:
        os.mkdir(new_folder)
    for name in files:
        print(os.path.join(root, name))
        #if os.path.splitext(os.path.join(root, name))[1].lower() == ".tiff":
        if os.path.splitext(os.path.join(root, name))[1].lower() == ".tif":
            if os.path.isfile(os.path.splitext(os.path.join(new_folder, name))[0] + ".jpg"):
                print ("A jpeg file already exists for %s" % name)
            # If a jpeg with the name does *NOT* exist, convert one from the tif.
            else:
                outputfile = os.path.splitext(os.path.join(new_folder, name))[0] + ".jpg"
                try:
                    im = Image.open(os.path.join(root, name))
                    print ("Converting jpeg for %s" % name)
                    im.thumbnail(im.size)
                    im.save(outputfile, "JPEG", quality=100)
                except Exception as e:
                    print(e)
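The script above names each output folder after the last path component of the source folder, so more deeply nested folders would be flattened into one level. If the full sub-folder layout should be preserved instead, the output path can be derived from the path relative to the source root. This is a minimal sketch; `mirrored_jpg_path` is a hypothetical helper name and the folder names are only illustrative.

```python
from pathlib import Path


def mirrored_jpg_path(tiff_path, src_root, dst_root):
    """Map src_root/a/b/img.tif to dst_root/a/b/img.jpg."""
    # Path of the image relative to the source root, e.g. "a/b/img.tif".
    relative = Path(tiff_path).relative_to(src_root)
    # Re-root it under the destination and swap the extension.
    return Path(dst_root) / relative.with_suffix(".jpg")
```

Before saving, the parent folder can be created with `out_path.parent.mkdir(parents=True, exist_ok=True)`, which also covers nested class folders.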
Related
I am training a model to classify images into 10 different labels. To load data I'm using ImageDataGenerator.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_dir = '/content/drive/MyDrive/Colab Notebooks/EuroSAT/Train/'
train_datagen = ImageDataGenerator(rescale=1./255,
                                   horizontal_flip=True, vertical_flip=True)
train_generator = train_datagen.flow_from_directory(train_dir, batch_size=16,
                                                    class_mode='categorical', target_size=(64, 64),
                                                    subset='training', shuffle=False)
But there are almost 3000 images in each category while ImageDataGenerator loads only 5443 images in total.
Found 5827 images belonging to 10 classes.
How can I work around this?
It may be that some of your files are in unsupported formats or are corrupted. This often happens when, for example, you download images via Google or Bing. Because I do this often, I developed the function below. It checks a directory whose images are held in subdirectories (the class directories, if you are using ImageDataGenerator().flow_from_directory). It verifies that the files are valid image files and have extensions from a user-defined list of acceptable extensions. The code is a bit lengthy because it does a lot of input checking. Note that if it finds a file with the extension jfif, it renames it to jpg, since they are the same format. The parameter convert_ext can be set to convert all images to a new format based on the extension specified, for example 'bmp'. If left as None, the images retain their original format.
import os
import shutil
import cv2
def check_file_extension (source_dir, good_ext_list, delete=False, convert_ext=None):
    # source_dir is the directory containing the class sub directories that hold the images
    # good_ext_list is a list of strings you specify as good extensions for the ImageDataGenerator
    # this list should be ['jpg', 'jpeg', 'bmp', 'png', 'tiff']
    # delete is a boolean, if set to True image files that have invalid extensions or are not valid
    # image files will be deleted.
    # the function returns a list. If delete=False this is a list of all files that have invalid
    # extensions or are not valid image files
    # if convert_ext is set to other than None, it should be a string indicating the new image format
    # the files will be converted to, for example "jpg"
    processed_count=0 # will be total number of files found
    good_count=0 # will be total number of valid image files found
    bad_file_list=[] # will be a list of all files processed that had invalid extensions
    removed_count=0 # will be the number of files deleted if delete is set to true
    class_list=os.listdir(source_dir)
    if len(class_list)==0:
        print('directory ', source_dir, ' is empty *** Program Terminating')
        return None
    print('{0:^20s}{1}{2:^17s}{1}{3:^14s}{1}{4:^15s}'.format('Class Directory',' ', 'Files Processed', 'Files Verified', 'Files Removed'))
    for klass in class_list:
        class_path=os.path.join(source_dir, klass)
        if os.path.isdir(class_path)==False:# check if this is a directory if it is not print a warning
            print ('*** Warning *** there are files in ', source_dir, ' it should only contain sub directories' )
        else:
            class_file_count=0 # will be number of files found in the class directory
            class_good_count=0 # will be the number of good files found in the class directory
            class_removed_count =0
            f_list=os.listdir(class_path) # get a list of files in the class directory
            for f in f_list:
                f_path=os.path.join(class_path,f)
                if os.path.isfile(f_path)==False: # check if it is a file if it is a directory print a warning
                    print ('*** Warning *** there is a directory in ', class_path, ' there should only be files there')
                else:
                    class_file_count +=1 #increment class file counter
                    index=f.rfind('.')
                    fname=f[:index]
                    fext=f[index+1:].lower()
                    if fext not in good_ext_list and fext !='jfif':
                        if delete:
                            os.remove(f_path)
                            class_removed_count +=1 # increment removed file counter
                        else:
                            bad_file_list.append(f_path) # don't delete but put the path in list of files with bad extensions
                    else:
                        if fext =='jfif': # if ext= jfif change it to jpg
                            fnew_path=os.path.join(class_path, fname + '.' + 'jpg')
                            shutil.copy(f_path,fnew_path )
                            os.remove(f_path)
                        else:
                            try:
                                img=cv2.imread(f_path) # returns None if the file cannot be decoded
                                shape=img.shape # raises an exception when img is None
                                if convert_ext !=None:
                                    fnew_path=os.path.join(class_path, fname + '.' + convert_ext)
                                    cv2.imwrite(fnew_path,img)
                                    # only remove the original if the converted file has a different path,
                                    # otherwise the file that was just written would be deleted
                                    if os.path.normcase(fnew_path) != os.path.normcase(f_path):
                                        os.remove (f_path)
                                class_good_count +=1
                            except:
                                if delete:
                                    os.remove(f_path)
                                    class_removed_count +=1
                                else:
                                    bad_file_list.append(f_path)
            print('{0:^20s}{1}{2:^17s}{1}{3:^14s}{1}{4:^15s}'.format(klass,' ', str(class_file_count),str(class_good_count), str(class_removed_count)) )
            processed_count=processed_count + class_file_count
            good_count=good_count + class_good_count
            removed_count=removed_count+ class_removed_count
    print('processed ', processed_count, ' files ', good_count, 'files were verified ', removed_count, ' files were removed')
    return bad_file_list
Below is an example of use
source_dir=r'c:\temp\people\storage'
good_ext_list=['jpg', 'jpeg', 'bmp', 'tiff', 'png']
new_ext='bmp'
bad_file_list=check_file_extension (source_dir, good_ext_list, delete=False,convert_ext=new_ext )
print (bad_file_list)
Below is the typical output:
  Class Directory     Files Processed  Files Verified  Files Removed
       savory               20               20              0
      unsavory              21               20              0
processed 41 files 40 files were verified 0 files were removed
['c:\\temp\\people\\storage\\unsavory\\040.xyz']
I tried to use ImageDataGenerator to build image generators to train my model, but I am unable to do so because of a PIL.UnidentifiedImageError. I tried different datasets, and the problem occurs only with my dataset.
Unfortunately, I can't delete all the training/testing images, as one answer suggested, but I can remove the files causing the problem. How can I detect the files that cause the error?
This is a common problem, particularly if you download images from, say, Google. I developed a function that, given a directory, goes through all of its subdirectories and checks the files in each one to ensure they have proper extensions and are valid image files. The code is provided below. It returns two lists: good_list is a list of valid image files and bad_list is a list of invalid image files. You will need to have OpenCV installed. If you do not have it installed, use pip install opencv-contrib-python.
def test_images(dir):
    import os
    import cv2
    bad_list=[]
    good_list=[]
    good_exts=['jpg', 'png', 'bmp','tiff','jpeg', 'gif'] # make a list of acceptable image file types
    for klass in os.listdir(dir) : # iterate through the sub directories
        class_path=os.path.join (dir, klass) # create path to sub directory
        if os.path.isdir(class_path):
            for f in os.listdir(class_path): # iterate through image files
                f_path=os.path.join(class_path, f) # path to image files
                ext=f[f.rfind('.')+1:].lower() # get the file's extension, lower-cased so 'JPG' matches 'jpg'
                if ext not in good_exts:
                    print(f'file {f_path} has an invalid extension {ext}')
                    bad_list.append(f_path)
                else:
                    try:
                        img=cv2.imread(f_path) # returns None if the file cannot be decoded
                        size=img.shape # raises an exception when img is None
                        good_list.append(f_path)
                    except:
                        print(f'file {f_path} is not a valid image file ')
                        bad_list.append(f_path)
        else:
            print(f'** WARNING ** directory {dir} has files in it, it should only contain sub directories')
    return good_list, bad_list
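If installing OpenCV is not an option, a rough first pass can be done with the standard library alone by comparing each file's leading bytes against well-known format signatures. The sketch below is only a filter under that assumption: a file can carry a valid signature and still be truncated or otherwise unreadable, so it does not replace actually decoding the image. The names `SIGNATURES`, `sniff_image_format` and `find_suspect_files` are hypothetical.

```python
import os

# Leading bytes ("magic numbers") of some common image formats.
SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
    "bmp": (b"BM",),
    "tiff": (b"II*\x00", b"MM\x00*"),
}


def sniff_image_format(path):
    """Return the format name implied by the file's first bytes, or None."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for fmt, prefixes in SIGNATURES.items():
        if head.startswith(prefixes):
            return fmt
    return None


def find_suspect_files(root):
    """Walk root and list files whose leading bytes match no known signature."""
    suspects = []
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            if sniff_image_format(path) is None:
                suspects.append(path)
    return suspects
```

Files returned by `find_suspect_files` are good candidates to inspect or move aside first, for example HTML error pages saved with a `.jpg` extension.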
I am a PyTorch user and I have come across data stored in .tfrec files. I want to convert them to JPEG/PNG format so that I can read them in my PyTorch code.
I have searched Google but found nothing.
Any help on how PyTorch users handle tfre
If I read them directly like this:
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
transform_train = T.Compose([
    T.RandomCrop(128, padding_mode="reflect"),
    T.RandomHorizontalFlip(),
    T.ToTensor()
])
train_ds = ImageFolder(
    root=path_to_folder,
    transform=transform_train
)
it throws this error:
RuntimeError: Found 0 files in subfolders Supported extensions are:
.jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp
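The error arises because ImageFolder only looks for the image extensions listed, so the .tfrec files have to be unpacked first. As a starting point, the records in a TFRecord file can be split apart without TensorFlow, since each record is stored as a little-endian uint64 length, a uint32 CRC of the length, the payload bytes, and a uint32 CRC of the payload. The sketch below is a minimal reader under that layout; it does not verify the CRCs, and `iter_tfrecord_payloads` is a hypothetical name. Each payload is normally a serialized tf.train.Example whose feature names depend on the dataset, so turning it into a JPEG/PNG file still needs a protobuf decoder (for example TensorFlow itself).

```python
import struct


def iter_tfrecord_payloads(path):
    """Yield the raw bytes of each record in a TFRecord file (CRCs are not verified)."""
    with open(path, "rb") as fh:
        while True:
            header = fh.read(12)  # uint64 length + uint32 CRC of the length
            if len(header) == 0:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = fh.read(length)
            footer = fh.read(4)  # uint32 CRC of the payload
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record")
            yield payload
```

Counting the payloads, for example with `sum(1 for _ in iter_tfrecord_payloads(path))`, is a quick way to confirm that a file is readable before writing a full converter.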
I am looking for a way to create a 3D PDF from results in ABAQUS/Viewer. This would make it easy to share the results with others who are interested in the simulation but do not have access to ABAQUS.
The best way is to export a VRML file and convert it using Tetra4D or pdf3D and Adobe Acrobat Professional. The resulting 3D PDFs can look very good. However, the commercial software would cost over £800 per year. I created a Python script that produces a 3D PDF directly from Abaqus/CAE and Abaqus/Viewer using two open source tools: 1) MeshLab (http://www.meshlab.net/) to create a U3D file, and 2) MiKTeX (https://miktex.org/) to convert the U3D file into a PDF. The output is not as polished as Tetra4D's, but it works. I have not tried this with the latest version of MeshLab. Just run this script from Abaqus/CAE or Abaqus/Viewer.
# Abaqus CAE/Viewer Python Script to create a 3D pdf directly from Abaqus/CAE or Abaqus/Viewer.
# You must first install meshlab (meshlabserver.exe)and MiKTeX (pdflatex.exe)
# Edit this script to reflect the installed locations of meshlabserver.exe and pdflatex.exe
# It will export a stl or obj file the mesh of current viewport and convert into 3D pdf
# Or run in Abaqus/viewer and it will create a VRML file and convert to 3D pdf.
# If contours are displayed in Abaqus Viewer, then it will create a contour 3D pdf
from abaqus import *
from abaqusConstants import *
from viewerModules import *
import os
import subprocess
import sys
# -----------------------------------------------------------------------------
pdfName='try'
meshlab_path="C:/Program Files/VCG/MeshLab/meshlabserver.exe"
pdfLatex_path="C:/Program Files (x86)/MiKTeX 2.9/miktex/bin/pdflatex.exe"
# -----------------------------------------------------------------------------
currView=session.viewports[session.currentViewportName]
try: # for Abaqus Viewer
    cOdbD=currView.odbDisplay
    odb = session.odbs[cOdbD.name]
    name=odb.name.split(r'/')[-1].replace('.odb','')
    module='Vis'
except: # Abaqus CAE
    #name=currView.displayedObject.modelName
    import stlExport_kernel
    name = repr(currView.displayedObject).split('[')[-1].split(']')[0][1:-1] # allows for either main or visulation modules
    module='CAE'
print module
if module=='CAE':
    #All instances must be meshed
    cOdbD=None
    directory=(os.getcwd()) # set before the export so the OBJ fallback below can use it
    try:
        ext='.stl'
        stlExport_kernel.STLExport(moduleName='Assembly', stlFileName=pdfName + ext, stlFileType='BINARY')
    except:
        try:
            ext='.obj'
            session.writeOBJFile(fileName=os.path.join(directory,pdfName + ext), canvasObjects= (currView, ))
        except:
            print 'Either your assembly is not fully meshed or something else'
else: # Abaqus/Viewer
    if cOdbD.viewCut:
        session.graphicsOptions.setValues(antiAlias=OFF) # Better with anti aliasing off
    odb = session.odbs[cOdbD.name]
    directory=odb.path.replace(odb.path.split('/')[-1],'').replace('/','\\')
    # Turn off most of the stuff in the viewport
    currView.viewportAnnotationOptions.setValues(triad=OFF,
        legend=OFF, title=OFF, state=OFF, annotations=OFF, compass=OFF)
    ext='.wrl'
    session.writeVrmlFile(fileName=os.path.join(directory,pdfName + ext),
        compression=0, canvasObjects= (currView, ))
pdfFilePath=os.path.join(directory,pdfName+'-out.pdf')
if os.path.isfile(pdfFilePath):
    os.remove(pdfFilePath)
    #Check file was deleted
    if os.path.isfile(pdfFilePath):
        print "Aborted because pdf file of same name cant be deleted. Please close programs which it might be open in"
        1/0 #a dodgy way to exit program
# Invoke meshlab to convert to a .u3d file
if cOdbD: #If in Abaqus/viewer
    if 'CONTOURS' in repr(cOdbD.display.plotState[0]): # If contours are displayed. Output contoured pdf
        p=subprocess.Popen([meshlab_path,'-i',pdfName + ext, '-o',pdfName + '.u3d','-m','vc']) #'vn fn fc vt'
    else:
        p=subprocess.Popen([meshlab_path,'-i',pdfName + ext, '-o',pdfName + '.u3d'])
else:
    p=subprocess.Popen([meshlab_path,'-i',pdfName + ext, '-o',pdfName + '.u3d'])
p.communicate() # Wait for meshlab to finish
file_fullPathName=os.path.join(directory, pdfName + '.tex')
#Read the .tex file which meshlab has just created
with open(file_fullPathName, 'r') as texFile:
    lines = texFile.read()
#Edit the .tex file
lines=lines.replace("\usepackage[3D]{movie15}","\\usepackage[3D]{movie15}\n\\usepackage[margin=-2.2in]{geometry}")
if cOdbD:
    if 'CONTOURS' in repr(cOdbD.display.plotState[0]):
        lines=lines.replace("3Dlights=CAD,","3Dlights=CAD,\n\t3Drender=SolidWireframe,")
lines=lines.replace("\n\end{document}","{---------------------------------------------------------------------------------Click above! MB1 - rotate, MB2 wheel or MB3 - zoom, Ctrl-MB1 - pan--------------}\n\\end{document}")
file_fullPathName=os.path.join(directory, pdfName + '-out.tex')
with open(file_fullPathName, "w") as outp:
    outp.write(lines)
p=subprocess.Popen([
    pdfLatex_path,
    pdfName + '-out.tex',
    ])
p.communicate()
print 'Conversion to pdf complete'
print file_fullPathName
The simplest way to print Abaqus *.odb results is to use Tecplot 360, which reads Abaqus *.odb files. You can export *.tif and *.png images at any resolution, rotate the model in 3D, and change the fonts and other settings as needed.
I have noticed that Tensorflow provides standard procedures for decoding jpeg, png and gif images after reading files. For instance for png:
import tensorflow as tf
filename_queue = tf.train.string_input_producer(['/Image.png']) # list of files to read
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
decoded_image = tf.image.decode_png(value) # use png or jpg decoder based on your files.
However, the tiff format decoder seems to be missing.
So what solutions exist for tiff files? Surely, I could convert my input images to png, but this doesn't seem to be a very smart solution.
There's currently no decoder for TIFF images. Look in tensorflow/core/kernels and you see
decode_csv_op.cc
decode_gif_op.cc
decode_jpeg_op.cc
decode_png_op.cc
decode_raw_op.cc
No decode_tiff_op.cc. This could be a good target for community contribution.
As of February 2019, some (limited & experimental) TIFF support has been added as part of the Tensorflow I/O library:
Added a very preliminary TIFF support. TIFF format is rather complex so compressions such as JPEG have not been supported yet, but could be added if needed.
The following methods are currently available:
tfio.experimental.image.decode_tiff
Decode a TIFF-encoded image to a uint8 tensor.
tfio.experimental.image.decode_tiff_info
Decode a TIFF-encoded image meta data.
An example usage from a Tensorflow tutorial:
import tensorflow as tf
import tensorflow_io as tfio
...
def parse_image(img_path: str) -> dict:
    ...
    image = tf.io.read_file(img_path)
    tfio.experimental.image.decode_tiff(image)
    ...
If tfio.experimental.image.decode_tiff() won't work for you (it won't work with my 32-bit TIFF files), you could try using cv2 as described in the answer to this post.
Other options are to use the .map() function with (a) rasterio, (b) skimage, or (c) pillow packages.
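Since support depends on properties such as compression and bit depth, it can help to inspect a TIFF file's header before choosing a decoder. The sketch below reads the first image file directory (IFD) of a classic TIFF with the standard library only and returns its SHORT and LONG tags; tag 258 is BitsPerSample and tag 259 is Compression in the TIFF 6.0 specification. It is a minimal illustration, not a full parser: BigTIFF, other field types, and further IFDs are not handled, and `read_tiff_tags` is a hypothetical name.

```python
import struct

# Tag numbers from the TIFF 6.0 specification.
TAG_BITS_PER_SAMPLE = 258
TAG_COMPRESSION = 259


def read_tiff_tags(path):
    """Return {tag: tuple of values} for SHORT/LONG tags in the first IFD."""
    with open(path, "rb") as fh:
        data = fh.read()  # simple but memory-hungry for very large files

    # Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file (BigTIFF is not handled)")

    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(entry_count):
        entry_pos = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry_pos)
        # Only SHORT (type 3) and LONG (type 4) fields are decoded here.
        fmt = {3: "H", 4: "I"}.get(field_type)
        if fmt is None:
            continue
        size = struct.calcsize(endian + fmt) * count
        if size <= 4:
            value_pos = entry_pos + 8  # values are stored inline
        else:
            # The field holds an offset to where the values are stored.
            (value_pos,) = struct.unpack_from(endian + "I", data, entry_pos + 8)
        tags[tag] = struct.unpack_from(endian + fmt * count, data, value_pos)
    return tags
```

For example, `read_tiff_tags(path).get(TAG_BITS_PER_SAMPLE)` returning `(32,)` would indicate a single-channel 32-bit image, and a Compression value other than 1 means the pixel data is compressed.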