Given an audio file 'coolsound.aif', how might I approach the task of retrieving the sound data chunks (SSND chunks) and iterating over them to do some arbitrary processing? I hope to be able to achieve something like the following:
/*
 * Pseudocode of what I'd like to do
 */
// get SSND chunks out of audio file somehow
Array soundDataChunks = getSSNDChunksFromSoundFile("coolsound.aif");
// iterate over each chunk
foreach(soundDataChunks as chunk){
    // Now iterate over each element in the waveForm data array
    foreach(chunk.waveForm as w){
        // Just log it to debug console for now
        Log(w);
    }
}
Other info:
- My aim is to use the waveform data to visualize the audio file graphically.
- The audio file was recorded using AudioToolbox in this manner.
- SSND chunk has the structure as appears in this source:
typedef struct {
    ID chunkID;
    long chunkSize;
    unsigned long offset;
    unsigned long blockSize;
    unsigned char WaveformData[];
} SoundDataChunk;
There are a few different APIs you can use; which one fits depends on how much control you want and what you plan to do with the audio data. The Extended Audio File API (ExtAudioFile) is made for basic operations like getting the audio data and drawing it.
It may seem like a lot of code to make this happen, since you have to create an AudioStreamBasicDescription and configure it just right, but it lets you read an entire audio file very quickly and access all the samples for drawing.
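If you do want to look at the raw chunk layout yourself, the idea can be sketched outside of Core Audio. The following is a minimal Python sketch, not the ExtAudioFile approach: it walks the chunks of a FORM/AIFF container and decodes the SSND payload using the offset and blockSize fields from the struct in the question. The helper names (iter_aiff_chunks, ssnd_samples) are made up for illustration, and it assumes 16-bit big-endian samples; in a real file the sample size and channel count come from the COMM chunk. Note that an AIFF file normally carries at most one SSND chunk.

```python
import struct

def iter_aiff_chunks(data):
    """Yield (chunk_id, payload) for each chunk inside a FORM/AIFF container."""
    form_id, form_size, form_type = struct.unpack(">4sI4s", data[:12])
    if form_id != b"FORM" or form_type not in (b"AIFF", b"AIFC"):
        raise ValueError("not an AIFF file")
    pos = 12
    end = min(len(data), 8 + form_size)
    while pos + 8 <= end:
        chunk_id, size = struct.unpack(">4sI", data[pos:pos + 8])
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length

def ssnd_samples(payload, bytes_per_sample=2):
    """Decode an SSND payload (offset, blockSize, then sample bytes)."""
    offset, _block_size = struct.unpack(">II", payload[:8])
    raw = payload[8 + offset:]
    count = len(raw) // bytes_per_sample
    return [
        int.from_bytes(raw[i * bytes_per_sample:(i + 1) * bytes_per_sample],
                       "big", signed=True)
        for i in range(count)
    ]

# Build a tiny in-memory AIFF with a single SSND chunk holding three samples.
ssnd_payload = struct.pack(">II", 0, 0) + struct.pack(">3h", 1, -2, 300)
ssnd_chunk = b"SSND" + struct.pack(">I", len(ssnd_payload)) + ssnd_payload
body = b"AIFF" + ssnd_chunk
demo_file = b"FORM" + struct.pack(">I", len(body)) + body

demo_samples = []
for chunk_id, payload in iter_aiff_chunks(demo_file):
    if chunk_id == b"SSND":
        demo_samples = ssnd_samples(payload)
```

For a real file you would read the bytes with open("coolsound.aif", "rb").read() and de-interleave the channels before drawing.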
I am trying to record audio using a 12 bit resolution ADC, take the sample buffer and send it through CAN FD to another device, which takes samples of this audio and creates a .wav and plays it. The problem is that I see the data of the microphone being sent through CAN FD to the other device, but I am not able to transform this data into a .wav file properly and hear what I say through the microphone. I only hear beeps.
I'm creating a new .wav every 4 CAN FD messages to get something close to real-time communication and reduce the delay, but I don't know whether this is possible or whether I am approaching it the right way.
In this thread I take each message received over CAN FD and append it to a buffer that is then written to a .wav file. I have tried bigger buffers, but that doesn't change the outcome.
How can I take the data from CAN FD and hear it?
Clarification: I know CAN FD isn't the proper way to transmit audio, but this is for a master's project.
struct canfd_frame frame;
CAN_MSG msg;
int trama_can[72];
int nbytes;
while (status_libreria == 0)
    ;
unsigned char buffer[256];
// FILE * fPtr;
int i=0,x=0;
//fPtr = fopen("Test.txt", "w");
while (1) {
    do {
        nbytes = read(s, &frame, sizeof(struct canfd_frame));
    } while (nbytes == 0);
    msg.id.ext = frame.can_id;
    msg.dlc = frame.len;
    if (msg.dlc > 8)
        msg.dlc = 8; // Guard until AC3LIB is adapted to CAN FD
    Numas_memcpy(&(msg.data.bdata), &(frame.data), msg.dlc);
    can_frame_2_ac3lib(&msg, BUS_VERTICAL);
    for(x=0;x<64;x++) buffer[i*64+x] = frame.data[x];
    printf("%d \r\n",frame.data[x]);
    printf("i:%d \r\n",i);
    // Copy the data to a .wav file and play it at the same time
    if (i == 3) {
        printf("Datos IN\r\n");
        write_wav("prueba.wav",256 , (short int *)buffer, 16000);
        //fwrite(buffer,1,sizeof(buffer),fPtr);
        //fclose(fPtr);
        system("aplay prueba.wav -f cd");
        i = 0;
        system("rm prueba.wav");
    }
    i++;
}
32 first bytes of the audio file being recorded
In the picture you can see that the data is being recorded. Moreover, this data is the same as the data at the ADC, but when I play it I only hear noise.
Simplify the problem first. Make sure you can transmit known data from one end to the other at low rates. I'm sure the suggestions below will sound far too trivial, but until you are absolutely confident you understand it all, I predict you will have many struggles.
Slowly - one frame per second, or even slower.
Learn to send one 0x55 byte from one end to the other and verify at the receiver.
Learn to send a few 0x55 in one frame and verify.
Learn to send 0x12345678 - verify it ends up with the bytes in the right order at the other end
Learn to send a counter. Check it at the receiver, make sure you do not drop any data.
Now do it all again but 10x faster.
Continue until you can send a counter at 10x the rate you need to for the audio without dropping any frames at all, for minutes and then hours.
Stress the rest of the system to make sure it still works under stress.
Only now can you start to learn about sending audio.
Trust me, you will learn a lot!
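Two of the checks in the list above (byte order and dropped frames) can be sketched on the receiving side like this. It is a minimal Python sketch under stated assumptions: the sender puts a wrapping sequence counter in each frame, and find_gaps is a hypothetical helper name, not part of any CAN library.

```python
import struct

def find_gaps(counters, modulo=2**32):
    """Return (expected, received) pairs wherever the sequence counter jumped."""
    gaps = []
    expected = None
    for received in counters:
        if expected is not None and received != expected:
            gaps.append((expected, received))
        expected = (received + 1) % modulo
    return gaps

# Byte-order check: a little-endian sender puts 0x12345678 on the wire as 78 56 34 12.
wire_bytes = struct.pack("<I", 0x12345678)
decoded_ok = struct.unpack("<I", wire_bytes)[0]      # receiver agrees on byte order
decoded_swapped = struct.unpack(">I", wire_bytes)[0]  # receiver assumes the wrong order

# Dropped-frame check: counter value 3 never arrived.
demo_gaps = find_gaps([0, 1, 2, 4, 5])
# An 8-bit counter wrapping from 255 to 0 is not a gap.
demo_wrap = find_gaps([254, 255, 0, 1], modulo=256)
```

If decoded_swapped is what shows up at the receiver, the two ends disagree about byte order, which on 16-bit audio samples sounds like noise rather than speech.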
WebCodecs was released in Chrome 86, but there are no real code examples showing how to use it yet. Given a video URL, how can I extract video frames as ImageData using WebCodecs?
What you describe is the entire complex process of acquiring raw bitmap-like data (e.g. something you can dump on a canvas), from a formatted file or a stream of data chunks.
In case of files (including the case where your URL points to a complete file, such as an .mp4 file), this is generally made of 2 steps:
Parsing the container file into individual chunks of encoded video and/or audio
Decoding these chunks of encoded video/audio
WebCodecs only facilitates step 2 of this process, i.e. what is called decoding. The reasoning behind this decision was that parsing the container is computationally trivial, so you can efficiently do this with the File APIs already, but you still need to implement parsing/processing the container yourself.
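To give a feel for what step 1 involves, here is a minimal Python sketch that walks the top-level boxes of an MP4 (ISO BMFF) file. It only reads box headers; turning the sample tables inside the moov box into individual encoded chunks is considerably more work, and that is the part a library handles for you. The function name iter_boxes is made up for illustration.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" field follows the type.
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the enclosing data.
            size = end - pos
        if size < header:
            break  # malformed box, stop rather than loop forever
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size

# Tiny synthetic file: an 'ftyp' box followed by a 4-byte 'mdat' box.
demo_mp4 = (
    struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 0)
    + struct.pack(">I4s", 12, b"mdat") + b"\x00" * 4
)
demo_boxes = list(iter_boxes(demo_mp4))
```

Container boxes such as moov can be walked the same way by calling iter_boxes again on their payload range.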
Luckily, plenty of libraries exist already, many of which ironically existed long before the emergence of the WebCodecs API.
MP4Box is one example, helping you acquire encoded video and audio chunks, which you can then feed into a VideoDecoder or AudioDecoder.
With MP4Box, the key piece of your code will be centered around the onSamples callback you provide, and it'll look something like this:
mp4BoxFile.onSamples = (trackId, user, chunks) =>
{
    for (let i = 0; i < chunks.length; i++)
    {
        let chunk = chunks[i];
        let encodedChunk = new EncodedVideoChunk({
            // you'll need to deep-inspect chunk to figure these out
            type: "key", // or "delta"
            timestamp: ...,
            duration: ...,
            data: chunk.data
        });
        // pass encodedChunk to a VideoDecoder instance's decode method
    }
};
This is just a rough sketch of how your code will probably look. It probably won't work without more inspection, and it will take a lot of trial and error, because this is very low-level stuff.
WebCodecs is not the silver bullet you probably expected, but it can help you build one.
How can I easily construct a raw byte-by-byte InputRange/ForwardRange/RandomAccessRange from a file?
file.byChunk(4096).joiner
This reads a file in 4096-byte chunks and lazily joins the chunks together into a single ubyte input range.
joiner is from std.algorithm, so you'll have to import it first.
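For readers who want to see the chunk-then-join idea spelled out, here is a rough Python equivalent (not D): read fixed-size chunks lazily and flatten them into one stream of byte values. The name bytes_of is made up for illustration.

```python
import io
from itertools import chain

def bytes_of(stream, chunk_size=4096):
    """Lazily yield every byte of a binary stream, reading chunk_size bytes at a time."""
    # iter(callable, sentinel) keeps calling read() until it returns b"" at end of file.
    chunks = iter(lambda: stream.read(chunk_size), b"")
    # chain.from_iterable flattens the chunks without loading the whole file.
    return chain.from_iterable(chunks)

# An in-memory stream stands in for an open file here.
demo_bytes = list(bytes_of(io.BytesIO(b"abcdef"), chunk_size=4))
```

Like the D version, this gives a one-pass input sequence only; it cannot be rewound or indexed.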
The easiest way to make a raw byte range from a file is to just read it all right into memory:
import std.file;
auto data = cast(ubyte[]) read("filename");
// data is a full-featured random access range of the contents
If the file is too large for that to be reasonable, you could try a memory-mapped file http://dlang.org/phobos/std_mmfile.html and use the opSlice to get an array off it. Since it is an array, you get full range features, but since it is memory mapped by the operating system, you get lazy reading as you touch the file.
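The memory-mapping idea is not specific to D. As a minimal sketch in Python (using the standard mmap module, with a temporary file standing in for a large input), indexing and slicing work as on an array while the operating system pages the data in on demand:

```python
import mmap
import os
import tempfile

# Create a 16 KiB scratch file to map; a real use would open an existing file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 64)
    path = tmp.name

with open(path, "rb") as f:
    # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        first = view[0]            # random access to a single byte
        window = view[4096:4100]   # slicing copies only the requested bytes
        size = len(view)

os.remove(path)
```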
For a simple InputRange, there's LockingTextReader (undocumented) in Phobos, or you could construct one yourself over byChunk or even fgetc, the C function. fgetc would be the easiest to write:
import core.stdc.stdio; // FILE, fgetc, feof
struct FileByByte {
    ubyte front;
    void popFront() { front = cast(ubyte) fgetc(fp); }
    bool empty() { return feof(fp) != 0; } // feof returns an int, so compare explicitly
    FILE* fp;
    this(FILE* fp) { this.fp = fp; popFront(); /* prime it */ }
}
I haven't actually tested that, but I'm pretty sure it'd work. (BTW the file open and close is separate from this because ranges are supposed to be just views into data, not managed containers. You wouldn't want the file closed just because you passed this range into a function.)
This is neither a forward nor a random access range, though. Those are trickier to do on streams without a lot of buffering code, and I think that'd be a mistake to try to write: generally, ranges should be cheap, not emulating features the underlying container doesn't natively support.
EDIT: The other answer has a non-buffering way! https://stackoverflow.com/a/30278933/1457000 That's awesome.
For example, there are QR scanners which scan video stream in real time and get QR codes info.
I would like to check from the video whether a light source is on or off. The light is quite powerful, so detecting it should not be a problem.
I will probably take a video stream as input, and either extract images from it and analyze them, or analyze the stream in real time for the presence of the light source (maybe by counting the pixels of a certain color in the image?).
How do I approach this problem? Is there a library or some sample source for this?
It sounds like you are asking for information about several discrete steps. There are a multitude of ways to do each of them, and if you get stuck on any individual step it would be a good idea to post a question about it individually.
1: Get video Frame
Like chaitanya.varanasi said, the AVFoundation framework is the best way of getting access to a video frame on iOS. If you want something less flexible and quicker, try looking at OpenCV's video capture. The goal of this step is to get access to a pixel buffer from the camera. If you have trouble with this, ask about it specifically.
2: Put pixel buffer into OpenCV
This part is really easy. If you get it from OpenCV's video capture you are already done. If you get it from AVFoundation you will need to put it into OpenCV like this:
//Buffer is of type CVImageBufferRef, which is what AVFoundation should be giving you
//I assume it is BGRA or RGBA formatted, if it isn't, change CV_8UC4 to the appropriate format
CVPixelBufferLockBaseAddress( Buffer, 0 );
int bufferWidth = (int)CVPixelBufferGetWidth(Buffer);
int bufferHeight = (int)CVPixelBufferGetHeight(Buffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(Buffer); //rows may be padded, so pass the real stride
unsigned char *pixel = (unsigned char *)CVPixelBufferGetBaseAddress(Buffer);
cv::Mat image = cv::Mat(bufferHeight,bufferWidth,CV_8UC4,pixel,bytesPerRow); //put buffer in open cv, no memory copied
//Process image Here
//End processing
CVPixelBufferUnlockBaseAddress( Buffer, 0 );
Note: I am assuming you plan to do this in OpenCV since you used its tag. I also assume you can get the OpenCV framework to link to your project. If that is an issue, ask a specific question about it.
3: Process Image
This part is by far the most open-ended. All you have said about your problem is that you are trying to detect a strong light source. One very quick and easy way of doing that would be to look at the mean pixel value of a greyscale image. If you get the image in colour you can convert it with cvtColor. Then just call cv::mean on it to get the mean value. Hopefully you can tell whether the light is on by how that value fluctuates.
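To make the mean-brightness idea concrete without any OpenCV dependency, here is a minimal pure-Python sketch operating on a raw BGRA byte buffer. The luma weights are the usual 0.299/0.587/0.114 ones used for colour-to-grey conversion; the threshold of 128 and the function names are assumptions you would tune for your camera and scene.

```python
def mean_luma_bgra(pixels):
    """Mean greyscale value (0..255) of a tightly packed BGRA byte buffer."""
    count = len(pixels) // 4
    if count == 0:
        return 0.0
    total = 0.0
    for i in range(0, count * 4, 4):
        b, g, r = pixels[i], pixels[i + 1], pixels[i + 2]  # alpha is ignored
        total += 0.114 * b + 0.587 * g + 0.299 * r
    return total / count

def light_is_on(pixels, threshold=128.0):
    """Guess that the light is on when the frame is brighter than the threshold."""
    return mean_luma_bgra(pixels) > threshold

# Two synthetic 2x2 frames: one dark, one bright.
dark_frame = bytes([10, 10, 10, 255]) * 4
bright_frame = bytes([250, 250, 250, 255]) * 4
dark_result = light_is_on(dark_frame)
bright_result = light_is_on(bright_frame)
```

A fixed threshold is the crudest option; comparing each frame's mean against a running average of recent frames copes better with changing ambient light.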
chaitanya.varanasi suggested another option, you should check it out too.
OpenCV is a very large library that can do a wide variety of things. Without knowing more about your problem I don't know what else to tell you.
Look at the AVFoundation Framework from Apple.
Hope it helps!
You can try this method: start by sending all images to an AVCaptureVideoDataOutput. In the delegate method captureOutput:didOutputSampleBuffer:fromConnection: you can sample or calculate every pixel. Source: answer
Also, you can take a look at this SO question where they check if a pixel is black. If it's such a powerful light source, you can take the inverse of the pixel and then determine whether it counts as black using a set threshold.
The above sample code only provides access to the pixel values stored in the buffer; you cannot run any other commands but those that change those values on a pixel-by-pixel basis:
for ( uint32_t y = 0; y < height; y++ )
{
    for ( uint32_t x = 0; x < width; x++ )
    {
        bgraImage.at<cv::Vec<uint8_t,4> >(y,x)[1] = 0;
    }
}
This—to use your example—will not work with the code you provided:
cv::Mat bgraImage = cv::Mat( (int)height, (int)extendedWidth, CV_8UC4, base );
cv::Mat grey = bgraImage.clone();
cv::cvtColor(grey, grey, 44);
I'm trying to write a program to log the float values of samples in an audio file. My steps are as follows:
Open an Extended Audio File
Set up audio format (AudioStreamBasicDescription)
Apply audio format to my Extended Audio File
Set up an AudioBufferList
Read Extended Audio File into AudioBufferList with ExtAudioFileRead()
Log float values of AudioBufferList
I've put the entire 90 line class in this gist: https://gist.github.com/792630
The trouble is, if I apply an audio format to my Extended Audio File (Step 3), I get an error trying to read the file (Step 5). If I comment out step 3, the file reads fine, but I will not have enforced my format for the read and I don't get floats when logging.
Any suggestions would be greatly appreciated. Thanks!
One thing I noticed right away is that you are allocating an AudioBufferList on the stack but setting mNumberBuffers to 2. It's fine to use ABLs on the stack, but if you do that they can only ever contain one buffer. But since you've set the client format to mono that isn't your real problem.
The real problem is that you aren't passing the address of fileRef to ExtAudioFileOpenURL. You're passing the value, so there is no way that the call can initialize the variable properly.
The call should look like this:
CheckResult(ExtAudioFileOpenURL(inputFileURL, &fileRef), "ExtAudioFileOpenURL failed");
I did that, compiled your code and everything worked fine.
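For intuition about what the float client format gives you in step 6, here is a small Python sketch using the standard wave module rather than ExtAudioFile: it reads 16-bit PCM samples and scales them into the range -1.0 to 1.0. It assumes a mono 16-bit WAV, and read_floats is a hypothetical helper name.

```python
import io
import struct
import wave

def read_floats(wav_file):
    """Read a 16-bit PCM WAV and return its samples as floats in [-1.0, 1.0)."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    ints = struct.unpack("<%dh" % count, frames)  # WAV PCM is little-endian
    return [s / 32768.0 for s in ints]

# Write a four-sample mono WAV into memory, then read it back as floats.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(44100)
    out.writeframes(struct.pack("<4h", 0, 16384, -16384, -32768))
buf.seek(0)
demo_floats = read_floats(buf)
```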