Convert .bin to .mp4 and transfer it over UART - OpenMV H7, MicroPython camera

I am using an OpenMV H7 with MicroPython. I am trying to create a program that begins recording after confirmation received via Bluetooth over UART. Then, after recording is done, it saves the video and transfers it back to the other device.
I have successfully recorded a video; however, I do not know how to transcode the .bin file into an .mp4 (or any other video format) from within the program and transfer it back to the terminal on my computer.
My question is: is this possible? And if so, how can it be done?
Here is a copy of the current version of my code.
# Video Capture - By: isaias - Sat Mar 5 2022
# Using OpenMV IDE and MicroPython
# Processor: ARM STM32H743
import sensor, image, pyb, time
from pyb import UART

uart = UART(3, 115200, timeout_char=1000)
uart.write('Program Started.\n\r')

boolean = 0
record_time = 10000  # 10 seconds in milliseconds

while 1:
    boolean = uart.read()  # reads from the terminal
    if boolean:  # when the user enters 1 in the terminal, begin recording for 10 seconds
        sensor.reset()
        sensor.set_pixformat(sensor.RGB565)
        sensor.set_framesize(sensor.QVGA)
        sensor.skip_frames(time=2000)
        clock = time.clock()
        stream = image.ImageIO("/stream.bin", "w")
        # Red LED on means we are capturing frames.
        pyb.LED(1).on()
        start = pyb.millis()
        while pyb.elapsed_millis(start) < record_time:
            clock.tick()
            img = sensor.snapshot()
            # Modify the image if you feel like here...
            stream.write(img)
            print(clock.fps())
        stream.close()
        break  # Once done recording, leave the while loop

# Blue LED on means we are done.
pyb.LED(1).off()
pyb.LED(3).on()

# Convert the file from .bin to a readable .mp4 file and transfer it back to the terminal to be downloaded.
# We are unsure if this can be done via UART. If not, what other ways make this possible?
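One possible direction (a sketch only, based on the standard OpenMV mjpeg module, not a tested answer for this exact setup): record straight to a Motion-JPEG file instead of an ImageIO .bin stream. The resulting .mjpeg file can be played directly or converted to .mp4 on the PC with ffmpeg; sending it back over a 115200-baud UART link is possible in principle, but it will be slow for anything more than a few hundred kilobytes.

# Sketch: record Motion JPEG instead of a raw ImageIO stream.
# Assumes the standard OpenMV 'mjpeg' module; file names are illustrative.
import sensor, mjpeg, pyb, time

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)

clock = time.clock()
m = mjpeg.Mjpeg("/stream.mjpeg")  # playable MJPEG container on the flash filesystem

start = pyb.millis()
while pyb.elapsed_millis(start) < 10000:  # record for 10 seconds
    clock.tick()
    m.add_frame(sensor.snapshot())

m.close(clock.fps())
# On the PC, after copying the file off the camera (USB mass storage, or chunked over UART):
#   ffmpeg -i stream.mjpeg output.mp4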

Related

DVB-S2 communication between two USRP B200

Thank you for reading this.
I'm having difficulties with DVB-S2 communication between two USRP B200 SDR boards connected with an SMA cable.
For the hardware set-up, I'm using a Raspberry Pi 4 (4 GB) to run GNU Radio Companion, with a USB 3.0 port and cable connecting the RPi to the USRP B200. I also connected a DC block at the Tx port, as described in the USRP manual for DVB-S2. (So the chain is RPi 4 - USB 3.0 - USRP (Tx) - DC block - SMA cable (1 m) - USRP (Rx) - USB 3.0 - RPi 4.)
I have attached pictures of my hardware set-up below.
I am trying to send a sample video over the DVB-S2 link. I got the DVB-S2 GRC flowgraphs from the links below; I've attached the screenshots, too.
https://github.com/drmpeg/gr-dvbs2
https://github.com/drmpeg/gr-dvbs2rx
On my last trial, transmission was successful with the RF options set as follows:
-Constellation: QPSK
-Code rate: 2/5
-Center Freq.: 1 GHz
-1 Msym/s (symbol rate) * 2 sps (samples per symbol) = 2 Msps (sample rate, roughly 2 MHz of bandwidth)
-Tx relative gain: 40 dB
(With this code rate and bandwidth, the video was received at a data rate of about 0.8 Mbps; a quick sanity check of that number is shown below.)
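A quick sanity check of that figure (my own arithmetic, ignoring DVB-S2 framing and pilot overhead):

# Approximate net data rate = symbol_rate * bits_per_symbol * code_rate
symbol_rate = 1e6       # 1 Msym/s
bits_per_symbol = 2     # QPSK
code_rate = 2.0 / 5.0
print(symbol_rate * bits_per_symbol * code_rate / 1e6, "Mbps")  # -> 0.8 Mbps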
But the problem is:
-This connection is very unstable; it often fails even when the RF settings are the same.
-I need to raise the data rate as high as possible, but it is too low for me now. As far as I know, the USRP B200 supports up to ~61.44 Msps, but when I go above roughly 4 MHz of bandwidth, the log shows Us (underflows) at the Tx and Os (overflows) at the Rx. I confirmed that the master clock rate is set correctly to 56 MHz.
-So I tried other constellation, code rate, and sample rate combinations, but they failed. For the 8PSK option, I set the sps variable to 3 on the Rx side (since 8PSK carries 3 bits per symbol), but the Rx flowgraph rejected it, saying 'sps needs to be even integer >= 2'. 16APSK and higher constellations were not usable at all with this USRP or with this flowgraph.
I guess I am missing something.
Is there any way I can make the connection stable and raise the data rate?
I would really appreciate it if you could help me.

Reading an avi file, extracting single frames with cv2.VideoCapture and video.read(); the png files of the single frames are much bigger than the avi

I'm reading an avi file of approx. 2 MB, with 301 frames at 20 frames/sec (a 15-second video) and a resolution of 1024 x 1096 per frame.
When I read the single frames with cv2 and re-save them at the original size as png, I get approx. 600 KB per picture/frame. So in total I have 301 * 600 KB = 181 MB (the original avi had 2 MB).
Any idea why this is happening, and how to reduce the file size of the single frames without changing the resolution? The idea is to generate single frames from the original video, run detections with a CNN, and re-save the video with the detections included. The output video should be very similar to the input video (approx. the same file size; it does not have to be avi format).
PNG files of single frames are in most cases much larger than the original video file, which is usually compressed by a codec (https://www.fourcc.org/codecs.php). Use, for example, the following command on Linux to create a compressed avi from the frames:
ffmpeg -i FramePicName%d.png -vcodec libx264 -f avi aviFileName
You can get the codec that was used to create the original video file with the following Python cv2 code:
cap = cv2.VideoCapture(videoFile)
fourcc = cap.get(cv2.CAP_PROP_FOURCC)  # or cv2.cv.CV_CAP_PROP_FOURCC on old OpenCV versions
"".join([chr((int(fourcc) >> 8 * i) & 0xFF) for i in range(4)])  # decode the FourCC integer into its 4-character code

How to add speech training data to TensorFlow

I have labelled .wav files for training a convolutional neural network (CNN). They are recordings of Bengali phones, for which no standard dataset is available. I want to feed these .wav files into TensorFlow to train my CNN model, using grayscale spectrograms generated from the .wav files as the input. I need help with how to do this. If there is more than one alternative, what are their strengths and weaknesses?
Also, the recordings are of variable length, e.g. some are 70 ms and some are 160 ms. Is there a way to divide them into 20 ms segments?
I have done something similar in my research. I used the Linux utility SoX to manipulate the audio wave files and to create the spectrograms.
For the audio file length, you can use the "trim" effect in SoX to split a file into 20 ms segments, each written to its own numbered output file, along the lines of the following:
sox myaudio.wav segment.wav trim 0 0.02 : newfile : restart
Using the "spectrogram" option of SOX, you can then create the spectrogram.
sox myaudio.wav -n spectrogram -m -x 256 -y 256 -o myspectrogram.png
The command will create a monochrome spectrogram of size 256x256 and store it in the file "myspectrogram.png".
In my research, I did not split the file into smaller chunks. I found that using the whole wave file of the word was sufficient to get good recognition. But, it depends on what your long term goal is.
You can also look at the ffmpeg ops in TensorFlow for loading audio files, though we don't yet have a built-in spectrogram:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ffmpeg
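If you prefer to stay in Python rather than shelling out to SoX, here is a minimal sketch (my own example, not from the answers above; it assumes mono 16-bit .wav files and the scipy/matplotlib packages) that slices a recording into 20 ms segments and saves a grayscale spectrogram for each:

import numpy as np
from scipy.io import wavfile
from scipy import signal
import matplotlib.pyplot as plt

rate, samples = wavfile.read("myaudio.wav")   # assumes a mono 16-bit file
seg_len = int(0.02 * rate)                    # number of samples in 20 ms

for i, start in enumerate(range(0, len(samples) - seg_len + 1, seg_len)):
    chunk = samples[start:start + seg_len]
    f, t, Sxx = signal.spectrogram(chunk, fs=rate, nperseg=64, noverlap=48)
    plt.imsave("segment_%03d.png" % i,
               10 * np.log10(Sxx + 1e-10),    # log-power spectrogram
               cmap="gray")                   # grayscale image for the CNN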

How to slow down a file source in GNU Radio?

I'm attempting to unpack bytes from an input file in GNU Radio Companion into a binary bitstream. My problem is that the Unpack K Bits block works at the same sample rate as the file source. So by the time the first bit of byte 1 is clocked out, byte 2 has already been loaded. How do I either slow down the file source or speed up the Unpack K Bits block? Is there a way I can tell GNU Radio Companion to repeat each byte from the file source 8 times?
Note that "after pack" is displaying 4 times as much data as "before pack".
My problem is that the Unpack K Bits block works at the same sample rate as the file source
No, it doesn't. Unpack K Bits is an interpolator block. In your case the interpolation factor is 8: for every input byte, 8 new bytes are produced.
The result is right, but the time scale of your sink is wrong. You have to change the sampling rate of the second GUI Time Sink to match the true sampling rate of the flowgraph after the Unpack K Bits block.
So instead of 32e3 it should be 8*32e3.
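To see the interpolation concretely, here is a minimal sketch (my own illustration, assuming a standard GNU Radio Python environment with the gr-blocks module) that feeds two bytes into Unpack K Bits and shows that sixteen items come out:

from gnuradio import gr, blocks

class UnpackDemo(gr.top_block):
    def __init__(self):
        gr.top_block.__init__(self, "unpack_demo")
        src = blocks.vector_source_b([0xA5, 0x3C], repeat=False)  # two input bytes
        unpack = blocks.unpack_k_bits_bb(8)  # one input byte -> 8 output items
        self.sink = blocks.vector_sink_b()
        self.connect(src, unpack, self.sink)

tb = UnpackDemo()
tb.run()
print(len(tb.sink.data()))  # 16 items out for 2 items in: interpolation by 8
print(tb.sink.data())       # the unpacked bits, one bit per output byte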
Manos' answer is very good, but I want to add to this:
This is a common misunderstanding for people who have just started doing digital signal processing down at the sample level:
GNU Radio itself has no notion of sampling rate. The term sampling rate is only used by certain blocks, e.g. to calculate the period of a sine (in the case of the signal source, the period in samples is f_sample/f_signal), or to calculate the times and frequencies written on display axes (as in your case).
"Slowing down" means "making the computer process samples slower", but doesn't change the signal.
All you need to do is match what you want the displaying sink to show as time units with what you configure it to do.

How can I change signal amplitude in pyaudio using numpy?

I'm currently using python 3.3 in combination with pyaudio and numpy. I took the example from the pyaudio website to play a simple wave file and send that data onto the default sound card.
Now I would like to change the volume of the audio, but when I multiply the array by 0.5, I get a lot of noise and distortion.
Here is a code sample:
while data != '':
    decodeddata = numpy.fromstring(data, numpy.int16)
    newdata = (decodeddata * 0.5).astype(numpy.int16)
    stream.write(newdata.tostring())
    data = wf.readframes(CHUNK)
How should I handle multiplication or division on this array without ruining the waveform?
Thanks,
It turned out that the source file's bit depth (24-bit) was not compatible with PortAudio. After exporting to a 16-bit PCM file, the multiplication no longer resulted in distortion.
To handle files with different sample formats, check the bit depth and rescale accordingly.
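A minimal sketch of that check (my own example, not part of the original answer; file names are illustrative), using the wave module to read the sample width before scaling:

import wave
import numpy as np
import pyaudio

CHUNK = 1024
wf = wave.open("test.wav", "rb")          # hypothetical input file

if wf.getsampwidth() != 2:                # 2 bytes per sample -> 16-bit PCM
    raise ValueError("Expected 16-bit PCM; convert 24-bit files first (e.g. with sox or ffmpeg).")

pa = pyaudio.PyAudio()
stream = pa.open(format=pa.get_format_from_width(wf.getsampwidth()),
                 channels=wf.getnchannels(),
                 rate=wf.getframerate(),
                 output=True)

data = wf.readframes(CHUNK)
while data:
    samples = np.frombuffer(data, dtype=np.int16)
    quieter = (samples * 0.5).astype(np.int16)   # halve the amplitude without clipping artefacts
    stream.write(quieter.tobytes())
    data = wf.readframes(CHUNK)

stream.stop_stream()
stream.close()
pa.terminate()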