File (.wav) duration while writing PCM data #16KBps - objective-c

I am writing some silent PCM data on a file #16KBps. This file is of .wav format. For this I have the following code:
#define DEFAULT_BITRATE 16000
long LibGsmManaged:: addSilence ()
{
char silenceBuf[DEFAULT_BITRATE];
if (fout) {
for (int i = 0; i < DEFAULT_BITRATE; i++) {
silenceBuf[i] = '\0';
}
fwrite(silenceBuf, sizeof(silenceBuf), 1, fout);
}
return ftell(fout);
}
Updated:
Here is how I write the header
void LibGsmManaged::write_wave_header( )
{
if(fout) {
fwrite("RIFF", 4, 1, fout);
total_length_pos = ftell(fout);
write_int32(0);
fwrite("WAVE", 4, 1, fout);
fwrite("fmt ",4, 1, fout);
write_int32(16);
write_int16(1);
write_int16(1);
write_int32(8000);
write_int32(16000);
write_int16(2);
write_int16(16);
fwrite("data",4,1,fout);
data_length_pos = ftell(fout);
write_int32(0);
}
else {
std::cout << "File pointer not correctly initialized";
}
}
void LibGsmManaged::write_int32( int value)
{
if(fout) {
fwrite( (const char*)&value, sizeof(value), 1, fout);
}
else {
std::cout << "File pointer not correctly initialized";
}
}
I run this code on my iOS device using NSTimer with interval 1.0 sec. So AFAIK, if I run this for 60 sec, I should get a file.wav that when played should show 60 sec as its duration (again AFAIK). But in actual test it displays almost double duration i.e. 2 min. (approx). I have also tested that when I change the DEFAULT_BITRATE to 8000, then the file duration is almost correct.
I am unable to identify what is going on here. Am I missing something bad here? I hope my code is not wrong.

What you're trying to do (write your own WAV files) should be totally doable. That's the good news. However, I'm a bit confused about your exact parameters and constraints, as are many others in the comments, which is why they have been trying to flesh out the details.
You want to write raw, uncompressed, silent PCM to a WAV file. Okay. How wide does the PCM data need to be? You are creating an array of chars that you are writing to the file. A char is an 8-bit byte. Is that what you want? If so, then you need to use a silent center point of 0x80 (128). 8-bit PCM in WAV files is unsigned, i.e., 0..255, and 128 is silent.
If you intend to store silent 16-bit data, that will be signed data, so the center point (between -32768 and 32767) is 0. Also, it will be stored in little endian byte format. But since it's silence (all 0s), that doesn't matter.
The title of your question indicates (and the first sentence reiterates) that you want to write data at 16 kbps. Are you sure you want raw 16 kbps audio? That's 16 kiloBITs per second, or 16000 bits per second. Depending on whether you are writing 8- or 16-bit PCM samples, that only allows for 2000 or 1000 Hz audio, which is probably not what you want. Did you mean 16 kHz audio? 16 kHz audio translates to 16000 audio samples per second, which more closely aligns with your code. Then again, your code mentions GSM (LibGsmManaged), so maybe you are looking for 16 kbps audio. But I'll assume we're proceeding along the raw PCM route.
Do you know in advance how many seconds of audio you need to write? That makes this process really easy. As you may have noticed, the WAV header needs length information in a few spots. You either write it in advance (if you know the values) or fill it in later (if you are writing an indeterminate amount).
Let's assume you are writing 2 seconds of raw, monophonic, 16000 Hz, 16-bit PCM to a WAV file. The center point is 0x0000.
WAV writing process:
Write 'RIFF'
Write 32-bit file size, which will be 36 (header size - first 8 bytes) + 64000 (see step 12 about that number)
Write 'WAVEfmt ' (with space)
Write 32-bit format header size (16)
Write 16-bit audio format (1 indicating raw PCM audio)
Write 16-bit channel count (1 because it's monophonic)
Write 32-bit sample rate (number of audio sample per second = 16000)
Write 32-bit byte rate (number of bytes per second = 32000)
Write 16-bit block alignment (2 bytes per sample * 1 channel = 2)
Write 16-bit bits per sample (16)
Write 'data'
Write 32-bit length of audio payload data (16000 samples/second * 2 bytes/sample * 2 seconds = 64000 bytes)
Write 64000 bytes, all 0 values
If you need to write a dynamic amount of audio data, leave the length field from steps 2 and 12 as 0, then seek back after you're done writing and fill those in. I'm not convinced that your original code was writing the length fields correctly. Some playback software might ignore those, others might not, so you could have gotten varying results.
Hope that helps! If you know Python, here's another question I answered which describes how to write a WAV file using Python's struct library (I referred to that code fragment a lot while writing the steps above).

Related

Adafruit Trinket M0 (SAMD21) analog read rate slow

Why is the analog read rate seemingly slow (46 ksamples/s) when it should be fast (250 ksamples/s) for my Adafruit Trinket M0? See this simple Arduino code for details; why is PointCount only 46?
//TrinketReadRateTest
//27Nov2022
//Running on Adafruit Trinket M0, SAMD21
//Measures read times of analog reads on Trinket M0
//nothing at all connected to the Trinket
//according to the settings in this wiring.c file lines 160-173, samples per second should be = 250,000:
//C:\Users\<MyUserName>\AppData\Local\Arduino15\packages\adafruit\hardware\samd\1.7.11\cores\arduino\wiring.c
//in this loop, every PointCount is 2 samples, so in 2 millisecs, number of PointCounts should be:
//(.002 secs)(250000 samples/sec)(PointCounts/ 2 samples) = 250
//however, this routine gives a value of 46 WHY?
//if line 170 prescaler is set to DIV16 instead of DIV32, PointCounts gets to 66 (accuracy ???) so this wiring.c is being loaded
#define INPUT1 A3 //ATSAMD21G PA04
#define INPUT2 A4 //ATSAMD21G PA05
unsigned int Input1[1000];
unsigned int Input2[1000];
unsigned int PointCount = 0;
void setup() {
pinMode(INPUT1, INPUT);
pinMode(INPUT2, INPUT);
}
void loop() {
PointCount = 0;
unsigned long StartTime = micros();
do {
Input1[PointCount] = analogRead(INPUT1);
Input2[PointCount] = analogRead(INPUT2);
PointCount++;
} while (micros() - StartTime < 2000); //read 2 millisecs of data points as fast as they come
Serial.begin(9600); //keep serial off during data reads to avoid the question...
delay(1000);
Serial.println(PointCount);
Serial.end();
delay(1000);
}
I tried reading analog samples as fast as they would come. I expected to receive samples at a rate of 250000 per second. What actually resulted was a rate of 46000 samples per second.
Added 28Nov: the wiring.c file is not easy to find. If you want it:
download the tar.bz2 file:
https://adafruit.github.io/arduino-board-index/boards/adafruit-samd-1.7.11.tar.bz2
extract the tar file using 7-zip or whatever
goto cores\arduino\wiring.c
Here are the relevant lines of wiring.c:
//set to 1/(1/(48000000/32) * 6) = 250000 SPS
while(GCLK->STATUS.reg & GCLK_STATUS_SYNCBUSY);
GCLK->CLKCTRL.reg = GCLK_CLKCTRL_ID( GCM_ADC ) | // Generic Clock ADC
GCLK_CLKCTRL_GEN_GCLK0 | // Generic Clock Generator 0 is source
GCLK_CLKCTRL_CLKEN ;
while( ADC->STATUS.bit.SYNCBUSY == 1 ); // Wait for synchronization of registers between the clock domains
ADC->CTRLB.reg = ADC_CTRLB_PRESCALER_DIV32 | // Divide Clock by 32.
ADC_CTRLB_RESSEL_10BIT; // 10 bits resolution as default
ADC->SAMPCTRL.reg = 5; // Sampling Time Length
Adding this additional question 8Dec2022:
wiring.analog.c (in same folder as wiring.c) executes the analog routines. Line 369 of wiring.analog.c says the same thing that the SAMD21 data sheet says: "The first conversion after the reference is changed must not be used."
In lines 371-394, the analogRead routine for SAMD21, two reads are always made; the first to account for the statement above. But why do two reads for every analogRead? The analog reference is not changed with every read and is set prior to any reads. So why not just do one conversion after the reference is set? That way, there only needs to be one conversion per analogRead.
I moved the first conversion routine to the very end of analogReference. It speeds things up to PointCount = 79. Is this a problem? It does not seem to reduce accuracy.
Your second question is easier to answer than your first. The reason there are two ADC reads in the Arduino code is because there is a bug in the ADC hardware on the SAMD21. In the past, Arduino provided a calibration method that allowed you to correct for this instead of adding in the second read and throwing out the first garbage data. This was problematic for a number of reasons and eventually library was modified. There's an old hackaday article that provides a little more detail.
As for the ADC reads being slow, the limitation you're running into is a limitation of the SAMD library for Arduino. For reference, I am using the SAMD21 datasheet and the code from Arduino SAMD on GitHub. To start out with, the Clock speed should be 48Mhz. Using the DIV32 predivider, the ADC clock frequency is 1.5Mhz. Each ADC conversion from the SAMD21 library takes 63 clock cycles. Leaving you with ~23.8Khz. 23.8Khz * 2ms = 47.619 Conversions. Add on top of that the overhead caused by switching between the two input pins (I don't know the exact characterization but likely 1-2 clock pulses) and you'd end up with closer to 46 Conversions in 2ms.
63 clock pulses per conversion is comically high. Typically, the first read is closer to 20 pulses and subsequent ones are 13.5. There is another post on the electrical engineering Stack Exchange where someone tackles this and posts a link to their own library for improving the conversion speeds.

Audio through CAN FD into headphones

I am trying to record audio using a 12 bit resolution ADC, take the sample buffer and send it through CAN FD to another device, which takes samples of this audio and creates a .wav and plays it. The problem is that I see the data of the microphone being sent through CAN FD to the other device, but I am not able to transform this data into a .wav file properly and hear what I say through the microphone. I only hear beeps.
I'm creating a new .wav every 4 CAN FD messages in order to make some kind of real time communication and decrease the delay, but I don't think this is possible or if I am thinking it the proper way.
In this thread I take the message sent by the CAN FD and concatenate it in a buffer in order to introduce it in a .wav file. I have tried bigger buffers but it doesn't change the outcome.
How could I be able to take the data from the CAN FD and hear it?
Clarification: I know using CAN FD to transmit audio isn't the proper way, but it is for a master project.
struct canfd_frame frame;
CAN_MSG msg;
int trama_can[72];
int nbytes;
while (status_libreria == 0)
;
unsigned char buffer[256];
// FILE * fPtr;
int i=0,x=0;
//fPtr = fopen("Test.txt", "w");
while (1) {
do {
nbytes = read(s, &frame, sizeof(struct canfd_frame));
} while (nbytes == 0);
msg.id.ext = frame.can_id;
msg.dlc = frame.len;
if (msg.dlc > 8)
msg.dlc = 8; //Protecci�n hasta adaptar AC3LIB a CANFD
Numas_memcpy(&(msg.data.bdata), &(frame.data), msg.dlc);
can_frame_2_ac3lib(&msg, BUS_VERTICAL);
for(x=0;x<64;x++) buffer[i*64+x] = frame.data[x];
printf("%d \r\n",frame.data[x]);
printf("i:%d \r\n",i);
// Copiar datos a fichero.wav y reproducirlo simultaneamente
if (i == 3) {
printf("Datos IN\r\n");
write_wav("prueba.wav",256 , (short int *)buffer, 16000);
//fwrite(buffer,1,sizeof(buffer),fPtr);
//fclose(fPtr);
system("aplay prueba.wav -f cd");
i = 0;
system("rm prueba.wav");
}
i++;
}
32 first bytes of the audio file being recorded
In the picture, as you can see, the data is being recorded. moreover, this data is the same data as in the ADC, but when I play it, I only hear noise.
Simplify the problem first. Make sure you can transmit known data from one end to the other first at low rates. I'm sure the suggestion below will sound far too trivial. But until you are absolutely confident you understand it all, I predict you sill have many struggles.
Slowly - one frame per second, or even slower.
Learn to send one 0x55 byte from one end to the other and verify at the receiver.
Learn to send a few 0x55 in one frame and verify.
Learn to send 0x12345678 - verify it ends up with the bytes in the right order at the other end
Learn to send a counter. Check it at the receiver, make sure you do not drop any data.
Now do it all again but 10x faster.
Continue until you can send a counter at 10x the rate you need to for the audio without dropping any frames at all, for minutes and then hours.
Stress the rest of the system to make sure it still works under stress.
Only now, can you start to learn about sending audio.
Trust me, you will learn a lot!

Creating a WAV file with an arbitrary bits per sample value?

Do WAV files allow any arbitrary number of bitsPerSample?
I have failed to get it to work with anything less than 8. I am not sure how to define the blockAlign for one thing.
Dim ss As New Speech.Synthesis.SpeechSynthesizer
Dim info As New Speech.AudioFormat.SpeechAudioFormatInfo(AudioFormat.EncodingFormat.Pcm, 5000, 4, 1, 2500, 1, Nothing) ' FAILS
ss.SetOutputToWaveFile("TEST4bit.wav", info)
ss.Speak("I am 4 bit.")
My.Computer.Audio.Play("TEST4bit.wav")
AFAIK no, 4-bit PCM format is undefined, it wouldn't make much sense to have 16 volume levels of audio; quality would be horrible.
While technically possible, I know no decent software (e.g. Wavelab) that supports it, your very own player could though.
Formula: blockAlign = channels * (bitsPerSample / 8)
So for a mono 4-bit it would be : blockAlign = 1 * ((double)4 / 8) = 0.5
Note the usage of double being necessary to not end up with 0.
But if you look at the block align definition below, it really does not make much sense to have an alignment of 0.5 bytes, one would have to work at the bit-level (painful and useless because at this quality, non-compressed PCM would just sound horrible):
wBlockAlign
The block alignment (in bytes) of the waveform data. Playback
software needs to process a multiple of wBlockAlign bytes of data at
a time, so the value of wBlockAlign can be used for buffer
alignment.
Reference:
http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Docs/riffmci.pdf page 59
Workaround:
If you really need 4-bit, switch to ADPCM format.

Publishing a stream using librtmp in C/C++

How to publish a stream using librtmp library?
I read the librtmp man page and for publishing , RTMP_Write() is used.
I am doing like this.
//Code
//Init RTMP code
RTMP *r;
char uri[]="rtmp://localhost:1935/live/desktop";
r= RTMP_Alloc();
RTMP_Init(r);
RTMP_SetupURL(r, (char*)uri);
RTMP_EnableWrite(r);
RTMP_Connect(r, NULL);
RTMP_ConnectStream(r,0);
Then to respond to ping/other messages from server, I am using a thread to respond like following:
//Thread
While (ThreadIsRunning && RTMP_IsConnected(r) && RTMP_ReadPacket(r, &packet))
{
if (RTMPPacket_IsReady(&packet))
{
if (!packet.m_nBodySize)
continue;
RTMP_ClientPacket(r, &packet); //This takes care of handling ping/other messages
RTMPPacket_Free(&packet);
}
}
After this I am stuck at how to use RTMP_Write() to publish a file to Wowza media server?
In my own experience, streaming video data to an RTMP server is actually pretty simple on the librtmp side. The tricky part is to correctly packetize video/audio data and read it at the correct rate.
Assuming you are using FLV video files, as long as you can correctly isolate each tag in the file and send each one using one RTMP_Write call, you don't even need to handle incoming packets.
The tricky part is to understand how FLV files are made.
The official specification is available here: http://www.adobe.com/devnet/f4v.html
First, there's a header, that is made of 9 bytes. This header must not be sent to the server, but only read through in order to make sure the file is really FLV.
Then there is a stream of tags. Each tag has a 11 bytes header that contains the tag type (video/audio/metadata), the body length, and the tag's timestamp, among other things.
The tag header can be described using this structure:
typedef struct __flv_tag {
uint8 type;
uint24_be body_length; /* in bytes, total tag size minus 11 */
uint24_be timestamp; /* milli-seconds */
uint8 timestamp_extended; /* timestamp extension */
uint24_be stream_id; /* reserved, must be "\0\0\0" */
/* body comes next */
} flv_tag;
The body length and timestamp are presented as 24-bit big endian integers, with a supplementary byte to extend the timestamp to 32 bits if necessary (that's approximatively around the 4 hours mark).
Once you have read the tag header, you can read the body itself as you now know its length (body_length).
After that there is a 32-bit big endian integer value that contains the complete length of the tag (11 bytes + body_length).
You must write the tag header + body + previous tag size in one RTMP_Write call (else it won't play).
Also, be careful to send packets at the nominal frame rate of the video, else playback will suffer greatly.
I have written a complete FLV file demuxer as part of my GPL project FLVmeta that you can use as reference.
In fact, RTMP_Write() seems to require that you already have the RTMP packet formed in buf.
RTMPPacket *pkt = &r->m_write;
...
pkt->m_packetType = *buf++;
So, you cannot just push the flv data there - you need to separate it to packets first.
There is a nice function, RTMP_ReadPacket(), but it reads from the network socket.
I have the same problem as you, hope to have a solution soon.
Edit:
There are certain bugs in RTMP_Write(). I've made a patch and now it works. I'm going to publish that.

Reading SWF Header with Objective-C

I am trying to read the header of an SWF file using NSData.
According to SWF format specification I need to access movie's width and height reading bits, not bytes, and I couldn't find a way to do it in Obj-C
Bytes 9 thru ?: Here is stored a RECT (bounds of movie). It must be read in binary form. First of all, we will transform the first byte to binary: "01100000"
The first 5 bits will tell us the size in bits of each stored value: "01100" = 12
So, we have 4 fields of 12 bits = 48 bits
48 bits + 5 bits (header of RECT) = 53 bits
Fill to complete bytes with zeroes, till we reach a multiple of 8. 53 bits + 3 alignment bits = 56 bits (this RECT is 7 bytes length, 7 * 8 = 56)
I use this formula to determine all this stuff:
Where do I start?
ObjC is a superset of C: You can run C code alongside ObjC with no issues.
Thus, you could use a C-based library like libming to read bytes from your SWF file.
If you need to shuffle bytes into an NSData object, look into the -dataWithBytes:length: method.
Start by looking for code with a compatible license that already does what you want. C libraries can be used from Obj-C code simply by linking them in (or arranging for them to be dynamically linked in) and then calling their functions.
Failing that, start by looking at the Binary Data Programming Guide for Cocoa and NSData Class Reference. You'd want to pull out the bytes that contain the bits you're interested in, then use bit masking techniques to extract the bits you care about. You might find the BitTst(), BitSet(), and BitClr() functions and their friends useful, if they're still there in Snow Leopard; I'm not sure whether they ended up in the démodé parts of Carbon or not. There are also the Posix setbit(), clrbit(), isset(), and isclr() macros defined in . Then, finally, there are the C bitwise operators: ^, |, &, ~, <<, and >>.