How to use python pydub to convert mp3 data(bytes) to wav data(bytes) without storing data to file? - pydub

How can I use pydub to convert MP3 data (bytes) to WAV data (bytes) without writing anything to a file? This is what I tried:
seg=AudioSegment(data=mp3_data)
seg.set_frame_rate(16000)
seg.set_channels(1)
# no function named set_format
# seg.set_format("wav")
return seg.raw_data
Updated:
Oh, I see. BytesIO can be used like this:
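from io import BytesIO
from pydub import AudioSegment

def mp3_to_wav_bytes(mp3_data, sample_rate=16000):
    # sample_rate was vosk_sample_rate in my code; 16000 is the rate from the question
    # Decode the MP3 bytes from an in-memory buffer instead of a file
    seg = AudioSegment.from_mp3(BytesIO(mp3_data))
    # set_frame_rate and set_channels return new segments, so reassign
    seg = seg.set_frame_rate(sample_rate)
    seg = seg.set_channels(1)
    # Export into a second in-memory buffer and return its contents
    wav_io = BytesIO()
    seg.export(wav_io, format="wav")
    return wav_io.getvalue()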
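One way to sanity-check the returned bytes is to parse them with the standard-library wave module, again through BytesIO. The sketch below is only an illustration: describe_wav and make_silent_wav are hypothetical helper names, and the silent one-second clip is generated in memory as a stand-in for real pydub output.

```python
import io
import wave


def describe_wav(wav_bytes):
    """Return (channels, sample_width_bytes, frame_rate, n_frames) for WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        return (wf.getnchannels(), wf.getsampwidth(),
                wf.getframerate(), wf.getnframes())


def make_silent_wav(frame_rate=16000, seconds=1):
    """Build a mono 16-bit silent WAV in memory (stand-in for exported audio)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(frame_rate)
        wf.writeframes(b"\x00\x00" * frame_rate * seconds)
    return buf.getvalue()


wav_bytes = make_silent_wav()
print(describe_wav(wav_bytes))
```

If the frame rate or channel count reported here does not match what was requested, the conversion step is the place to look.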

Related

How to convert the .wav file to tfrecord file?

I am new to TensorFlow and hope someone can help me.
I've read https://www.tensorflow.org/api_docs/python/tf/audio/decode_wav and called tf.audio.decode_wav(contents, desired_channels=-1, desired_samples=-1, name=None).
The only thing I changed was contents, which I set to the path of my file.
But I still get an error.
Is there a way to convert it? I want the output to look something like .tfrecord-00000-of-00008.
decode_wav expects the contents of the file, not its path, so read the .wav file into a string of bytes and then decode it:
import tensorflow as tf
wav_contents = tf.io.read_file("file.wav")
audio, sample_rate = tf.audio.decode_wav(contents=wav_contents)
audio.shape
This example was borrowed from the TensorFlow tutorial on reading audio files.
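The question also asks for output files named like .tfrecord-00000-of-00008. As a rough sketch of just the naming and sharding part, the helpers below build such names and spread a list of WAV paths across the shards round-robin. The names shard_paths and assign_round_robin, the "audio" prefix and the clip file names are all hypothetical; actually writing the records would be done with TensorFlow (for example tf.io.TFRecordWriter), which is not shown here.

```python
def shard_paths(prefix, num_shards):
    """Return names such as 'prefix.tfrecord-00000-of-00008'."""
    return [f"{prefix}.tfrecord-{i:05d}-of-{num_shards:05d}"
            for i in range(num_shards)]


def assign_round_robin(wav_files, num_shards):
    """Map each shard index to the list of WAV files it would hold."""
    shards = {i: [] for i in range(num_shards)}
    for n, path in enumerate(wav_files):
        shards[n % num_shards].append(path)
    return shards


paths = shard_paths("audio", 8)
plan = assign_round_robin([f"clip_{n}.wav" for n in range(20)], 8)
print(paths[0])
print(plan[0])
```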

Is there a way to convert numpy array to PNG/JPG... payload without saving it as a file?

Suppose there exists a numpy array, data. I am trying to do the equivalent of the following
cv2.imwrite(filename, data)
with open(filename, 'rb') as fp:
    data_compressed = fp.read()
without having to write to a file. Is there a way to convert numpy array to its equivalent PNG/JPG... representation without having to write to a file and read it as binary?
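As Miki pointed out, cv2.imencode(...) is the solution: it returns a success flag and the encoded buffer, and calling tobytes() on that buffer gives the PNG/JPG payload without touching the disk.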

Load pytorch model from S3 bucket

I want to load a pytorch model (model.pt) from a S3 bucket. I wrote the following code:
from smart_open import open as smart_open
import io
load_path = "s3://serial-no-images/yolo-models/model4/model.pt"
with smart_open(load_path) as f:
    buffer = io.BytesIO(f.read())
model.load_state_dict(torch.load(buffer))
This results in the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte
One solution would be to download the model locally, but I want to avoid this and load the model directly from S3. Unfortunately, I couldn't find a good solution for that online. Can someone help me out here?
According to the documentation, the following works:
from smart_open import open as smart_open
import io
load_path = "s3://serial-no-images/yolo-models/model4/model.pt"
# 'rb' opens the object in binary mode, so no UTF-8 decoding is attempted
with smart_open(load_path, 'rb') as f:
    buffer = io.BytesIO(f.read())
model.load_state_dict(torch.load(buffer))
I had tried this before, but hadn't noticed that I have to pass 'rb' as the mode argument.
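The difference between the two modes can be reproduced without S3 or PyTorch. In this sketch a local temporary file stands in for the S3 object and the built-in open stands in for smart_open (both are assumptions made only to keep the example self-contained); the payload is arbitrary bytes that include 0x80.

```python
import io
import os
import tempfile

# Hypothetical stand-in for the S3 object: a local file with non-UTF-8 bytes.
payload = b"PK\x03\x04" + b"\x80\x02" + bytes(range(256))

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Text mode: the bytes are decoded as UTF-8 and 0x80 is rejected.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

    # Binary mode: the bytes come back untouched and can be wrapped in BytesIO.
    with open(path, "rb") as f:
        buffer = io.BytesIO(f.read())
finally:
    os.remove(path)

print(text_mode_failed, buffer.getvalue() == payload)
```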

Remove header from .MAT files loaded by loadmat() from scipy.io library

I am new to both python and tensorflow.
I am trying to build an input pipeline for a generative adversarial network whose input is complex-valued data in .mat format, which I load with loadmat() from the scipy.io library. To prepare the data as input for my network I tried from_tensor_slices(), but the data cannot be converted into a tensor because of the headers in it. I looked up how to remove headers from files in Python and found some techniques that apply to .csv files, but nothing for .mat files. How can I remove the header from .mat files? Also, I think the loadmat() function returns a list of dictionaries. How can I extract the data from the file in that case? Thank you.
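For what it's worth, scipy.io.loadmat returns a single dict, and the "header" shows up as metadata keys such as __header__, __version__ and __globals__ alongside the saved variables. A minimal sketch of skipping those keys follows; the dict is a hand-written stand-in for a real loadmat result so the example runs without SciPy, and "signal" and data_variables are hypothetical names.

```python
# Hand-written stand-in for the dict that scipy.io.loadmat returns.
# "signal" is a hypothetical variable name; real files use whatever
# names were saved from MATLAB.
mat = {
    "__header__": b"MATLAB 5.0 MAT-file",
    "__version__": "1.0",
    "__globals__": [],
    "signal": [[1 + 2j, 3 - 1j], [0.5j, 2 + 0j]],
}


def data_variables(mat_dict):
    """Keep only the saved variables, dropping the '__...__' metadata keys."""
    return {k: v for k, v in mat_dict.items() if not k.startswith("__")}


variables = data_variables(mat)
print(list(variables))
```

The values left in variables are the arrays themselves, so nothing needs to be stripped from the file; only the array of interest is passed on to the pipeline.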

Accessing carray of pointcloud using pytables

I am having a hard time understanding how to access the data in a carray.
http://carray.pytables.org/docs/manual/index.html
I have a carray that I can view in a group structure using vitables, but how to open it and retrieve the data is beyond me.
The data are a point cloud, three levels down, that I want to plot as a scatter plot and export as an .obj file.
I then have to loop through (many) clouds and do the same thing.
Is there anyone that can give me a simple example of how to do this?
This was my attempt:
import carray as ca
fileName = 'hdf5_example_db.h5'
a = ca.open(rootdir=fileName)
print a
I managed to solve my issue. I wasn't treating the carray differently to the rest of the hierarchy. I needed to first load the entire db, then refer to the data I needed. I ended up not having to use carray, and just stuck to h5py:
from __future__ import print_function
import h5py
import numpy as np
# read the hdf5 format file
fileName = 'hdf5_example_db.h5'
f = h5py.File(fileName, 'r')
# full path of the carray-type data (which is in ply format)
dataspace = '/objects/object_000/object_model'
# view the data
print(f[dataspace])
# print to ply file
with open('object_000.ply', 'w') as fo:
    for line in f[dataspace]:
        fo.write(line+'\n')
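For the "loop through (many) clouds" part, one possible sketch is to walk the children of the objects group and pick out each object_model dataset. It assumes every cloud follows the same /objects/<name>/object_model layout as the single path above, which the original post does not confirm. iter_object_models is a hypothetical helper, and nested dicts stand in for the open h5py file here, since h5py groups offer the same items(), in and indexing operations.

```python
def iter_object_models(objects_group, dataset_name="object_model"):
    """Yield (object_name, dataset) for every child group that has dataset_name."""
    for name, group in objects_group.items():
        if dataset_name in group:
            yield name, group[dataset_name]


# Nested dicts stand in for f['/objects'] from an open h5py.File.
fake_objects = {
    "object_000": {"object_model": ["ply", "format ascii 1.0"]},
    "object_001": {"object_model": ["ply", "format ascii 1.0"]},
    "object_002": {"other_data": []},
}

names = [name for name, _ in iter_object_models(fake_objects)]
print(names)
```

With a real file, each yielded dataset could be written out in the same way as object_000.ply, using the object name to build the output file name.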