I have a set of wav files that I want to generate a spectrogram of. But when I use the tf.audio.decode_wav function, I get the following error:
InvalidArgumentError: Bad audio format for WAV: Expected 1 (PCM), but got7 [Op:DecodeWav]
How do I circumvent this error? Are there any other ways to generate a log mel spectrogram for wav files using tensorflow?
I am aware of the librosa package, but I would prefer TensorFlow.
The code is:
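import tensorflow as tf

def decode_audio(audio_binary):
    # Decode 16-bit PCM WAV bytes into a float32 tensor of shape [samples, channels].
    audio, _ = tf.audio.decode_wav(audio_binary)
    # Drop the channel axis (this only works for mono audio).
    return tf.squeeze(audio, axis=-1)

def get_waveform_and_label(file_path):
    audio_binary = tf.io.read_file(file_path)
    waveform = decode_audio(audio_binary)
    return waveform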
The error means that the header of your files declares WAV format code 7, that is, samples encoded as 8-bit mu-law, rather than format code 1 (PCM).
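To confirm which files are affected before decoding, the format tag can be read straight from the WAV header. This is a minimal sketch using only the standard library; the helper name wav_format_tag and the FORMAT_NAMES table are illustrative and not part of TensorFlow:

```python
import struct

# A few common WAVE format tags; 7 is the one reported in the error above.
FORMAT_NAMES = {1: "PCM", 3: "IEEE float", 6: "A-law", 7: "mu-law"}

def wav_format_tag(data: bytes) -> int:
    """Return the audio format tag stored in the 'fmt ' chunk of a WAV file."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (chunk_size,) = struct.unpack("<I", data[pos + 4:pos + 8])
        if chunk_id == b"fmt ":
            # The first two bytes of the 'fmt ' chunk body hold the format tag.
            (tag,) = struct.unpack("<H", data[pos + 8:pos + 10])
            return tag
        # Chunks are word-aligned: skip a pad byte after odd-sized chunks.
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no 'fmt ' chunk found")
```

Calling wav_format_tag(open(path, "rb").read()) on one of the failing files should return 7 if the files really are mu-law encoded; FORMAT_NAMES.get(tag, "unknown") gives a readable label.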
As described in the TensorFlow documentation for tf.audio.decode_wav, only 16-bit PCM WAV is supported by this method.
You would need to re-encode your wave files prior to passing them to tensorflow. Something like ffmpeg could help here.
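As one possible way to do the re-encoding, ffmpeg can be driven from Python. This is a sketch that assumes ffmpeg is installed and on the PATH; the function names and the folder layout are hypothetical:

```python
import subprocess
from pathlib import Path

def ffmpeg_to_pcm16_cmd(src, dst):
    """Build an ffmpeg command that re-encodes src as a 16-bit PCM WAV file."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", str(src),       # input file (here: the mu-law WAV)
        "-c:a", "pcm_s16le",  # signed 16-bit little-endian PCM
        str(dst),
    ]

def convert_folder(src_dir, dst_dir):
    """Re-encode every .wav file in src_dir into dst_dir (assumed flat layout)."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(src_dir).glob("*.wav")):
        # check=True raises CalledProcessError if ffmpeg fails on a file.
        subprocess.run(ffmpeg_to_pcm16_cmd(src, dst_dir / src.name), check=True)
```

After conversion, the files written to dst_dir should be in the 16-bit PCM format that tf.audio.decode_wav expects.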
Related
I am new to TensorFlow and hope someone can help me.
I've read the documentation at https://www.tensorflow.org/api_docs/python/tf/audio/decode_wav and executed tf.audio.decode_wav( contents, desired_channels=-1, desired_samples=-1, name=None )
The only thing I changed is contents, which I set to the path of my file.
But I still get an error!
Is there a way to convert it? I want to output something like this: .tfrecord-00000-of-00008
Read the .wav file into a string of bytes and then decode it:
import tensorflow as tf
wav_contents = tf.io.read_file("file.wav")
audio, sample_rate = tf.audio.decode_wav(contents=wav_contents)
audio.shape
This example was borrowed from the TensorFlow tutorial on reading audio files.
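The output name in the question (.tfrecord-00000-of-00008) follows a common shard naming pattern. Below is a rough sketch of producing such names and spreading records across the shards. Both shard_paths and write_shards are hypothetical helpers, and write_shards assumes each record has already been serialized to bytes (for example as a serialized tf.train.Example):

```python
def shard_paths(prefix, num_shards):
    """Return names such as 'prefix.tfrecord-00000-of-00008'."""
    return [
        f"{prefix}.tfrecord-{i:05d}-of-{num_shards:05d}"
        for i in range(num_shards)
    ]

def write_shards(serialized_records, prefix, num_shards):
    """Distribute already-serialized records (bytes) round-robin over the shards."""
    # Imported lazily so that shard_paths can be used without TensorFlow.
    import tensorflow as tf

    writers = [tf.io.TFRecordWriter(p) for p in shard_paths(prefix, num_shards)]
    try:
        for i, record in enumerate(serialized_records):
            writers[i % num_shards].write(record)
    finally:
        for writer in writers:
            writer.close()
```

How each decoded waveform is turned into a serialized record is left open here, since it depends on the features the downstream model needs.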
I am a PyTorch user and I have encountered data stored as .tfrec files. I want to convert them to JPEG/PNG format so that I can read them in my PyTorch code.
I have searched Google but found nothing.
Any help on how PyTorch users handle tfre
If I read them directly like this:
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
transform_train = T.Compose([
    T.RandomCrop(128, padding_mode="reflect"),
    T.RandomHorizontalFlip(),
    T.ToTensor()
])
train_ds = ImageFolder(
    root=path_to_folder,
    transform=transform_train
)
it throws this error:
RuntimeError: Found 0 files in subfolders Supported extensions are:
.jpg,.jpeg,.png,.ppm,.bmp,.pgm,.tif,.tiff,.webp
I've been trying to use a hyperspectral image dataset stored in .mat files. I found that with the loadmat function from the scipy library I can load the hyperspectral images and select some bands to view them as an RGB image.
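from scipy.io import loadmat

def RGBread(image):
    # Load the hyperspectral cube and keep three bands as an RGB view.
    images = loadmat(image).get('new_image')
    return abs(images[:,:,(12,6,4)])

def SIread(image):
    # Load the full hyperspectral cube with all bands.
    images = loadmat(image).get('new_image')
    return abs(images[:,:,:])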
After trying to implement the pix2pix architecture, I ran into an unexpected error. The names of the dataset files are passed to a function that is responsible for loading the data (which are still .mat files). TensorFlow has no direct method for reading or decoding this format, so I load the data with my RGBread and SIread functions and then convert the results to tensors.
def load_image(filename, augment=True):
    inimg = tf.cast(tf.convert_to_tensor(RGBread(ImagePATH+'/'+filename),
                                         dtype=tf.float32), tf.float32)[...,:3]
    tgimg = tf.cast(tf.convert_to_tensor(SIread(ImagePATH+'/'+filename),
                                         dtype=tf.float32), tf.float32)[...,:12]
    inimg, tgimg = resize(inimg, tgimg, IMG_HEIGH, IMG_WIDTH)
    if augment:
        inimg, tgimg = random_jitter(inimg, tgimg)
    return inimg, tgimg
When I load an image with the load_image method, passing the name and path of a single .mat file (a hyperspectral image) from my dataset as the argument, the method works perfectly.
plt.imshow(load_train_image(tr_urls[1])[0])
The problem started when I created my dataset tensor: my RGBread function cannot take a tensor as a parameter, since loadmat('.mat') expects a string. I get the following error.
train_dataset = tf.data.Dataset.from_tensor_slices(tr_urls)
train_dataset = train_dataset.map(load_train_image,
                                  num_parallel_calls=tf.data.experimental.AUTOTUNE)
TypeError: expected str, bytes or os.PathLike object, not Tensor
After reading a lot about loading .mat files, I found a user who recommended converting the data to TFRecord format. I've been trying to do that but haven't managed to. Could someone help me?
Rasterio may be useful here.
https://rasterio.readthedocs.io/en/latest/
It can read hyperspectral .tif which can be passed to tf.data using a tf.keras data-generator. It may be a bit slow and perhaps should be done before training rather than at runtime.
An alternative is to ask whether you need the geotiff metadata. If not, you can preprocess and save as numpy arrays for tfrecords.
I am trying to deploy a model based on Object Detection example to do some tests and I am getting this error:
"Expects arg[0] to be uint8 but float is provided"
In that case I am using this to load my data:
request.inputs['inputs'].CopyFrom(
    tf.contrib.util.make_tensor_proto({FLAGS.input_image}))
where FLAGS.input_image is my image data in bytes.
I was thinking that maybe I should convert my image bytes to something that this input understands, but I haven't found out how yet.
What could I do to fix this issue?
Thanks !!!!
To read the image as bytes, use the following in the client code (Python):
with open(FLAGS.image, 'rb') as f:
    data = f.read()
A sample Python client (for the Inception model) is also available at https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/inception_client.py
I have noticed that Tensorflow provides standard procedures for decoding jpeg, png and gif images after reading files. For instance for png:
import tensorflow as tf
filename_queue = tf.train.string_input_producer(['/Image.png']) # list of files to read
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
decoded_image = tf.image.decode_png(value) # use png or jpg decoder based on your files.
However, the tiff format decoder seems to be missing.
So what solutions exist for tiff files? Surely, I could convert my input images to png, but this doesn't seem to be a very smart solution.
There's currently no decoder for TIFF images. Look in tensorflow/core/kernels and you will see:
decode_csv_op.cc
decode_gif_op.cc
decode_jpeg_op.cc
decode_png_op.cc
decode_raw_op.cc
No decode_tiff_op.cc. This could be a good target for community contribution.
As of February 2019, some (limited & experimental) TIFF support has been added as part of the Tensorflow I/O library:
Added a very preliminary TIFF support. TIFF format is rather complex so compressions such as JPEG have not been supported yet, but could be added if needed.
The following methods are currently available:
tfio.experimental.image.decode_tiff
Decode a TIFF-encoded image to a uint8 tensor.
tfio.experimental.image.decode_tiff_info
Decode a TIFF-encoded image meta data.
An example usage from a Tensorflow tutorial:
import tensorflow as tf
import tensorflow_io as tfio
...
def parse_image(img_path: str) -> dict:
    ...
    image = tf.io.read_file(img_path)
    image = tfio.experimental.image.decode_tiff(image)
    ...
If tfio.experimental.image.decode_tiff() doesn't work for you (it doesn't work with my 32-bit TIFF files), you could try using cv2 as described in the answer to this post.
Other options are to use the .map() function with (a) rasterio, (b) skimage, or (c) pillow packages.