I'm trying to produce low bitrate opus files with L/R stereo. What decides if opusenc will use L/R stereo instead of joint stereo? Is there are flag I can pass? Is it related to bitrate?
opusenc input.wav output.opus //produces L/R stereo
opusenc input.wav output.opus --bitrate 8 //produces joint stereo
It looks like it is determined here:
if (st->force_channels!=OPUS_AUTO && st->channels == 2)
{
st->stream_channels = st->force_channels;
} else {
#ifdef FUZZING
/* Random mono/stereo decision */
if (st->channels == 2 && (rand()&0x1F)==0)
st->stream_channels = 3-st->stream_channels;
#else
/* Rate-dependent mono-stereo decision */
if (st->channels == 2)
{
opus_int32 stereo_threshold;
stereo_threshold = stereo_music_threshold + ((voice_est*voice_est*(stereo_voice_threshold-stereo_music_threshold))>>14);
if (st->stream_channels == 2)
stereo_threshold -= 4000;
else
stereo_threshold += 4000;
st->stream_channels = (equiv_rate > stereo_threshold) ? 2 : 1;
} else {
st->stream_channels = st->channels;
}
#endif
}
Just breifly reading through the opusenc source code, it looks like setting force_channels to 2 on the struct OpusEncoder will make it work. However, looking through the opusenc.c source code, no where is that field set. You could easily modify the source however to always force channels to be two. For the future, it looks like opus calls it "dual stereo" rather than "L/R stereo".
Opus by default attempts to make the best decision possible based on the current bitrate. The decision is made on the following table (20 ms frame size):
8-12 kbit/s for NB speech,
16-20 kbit/s for WB speech,
28-40 kbit/s for FB speech,
48-64 kbit/s for FB mono music, and
64-128 kbit/s for FB stereo music.
This is because opus assume that, if the bitrate is too low, it cannot encode stereo with sufficient quality.
Actually the documentation says that it is possible to change the number of channels BUT it doesn't explain how. I'll take a look later anyway on how to do this.
You can find these informations on rfc6716
Related
I'm interfacing my G2553 to a standard LCD screen using UART. I got it working like a dream at 1MHz, but need the system to run at 16MHz for certain other peripherals. I got it very close to working at 16MHz but there is what I believe to be a baudrate error, as the screen will show almost what I sent it, but miss a character here or there or get a character incorrect. I'll take you through how I wrote my code to see if I made any errors.
The screen expects a baudrate of 9600, so I went to the table in my user guide:
So UCBRx = 1666
and UCBRSx = 6.
The expected % errors on this are actually better than what I was using at the 1MHz clock so I thought this would be OK.
This means UCBR0 = 130 and UCBR1 = 6 (This is just the HEX value of UCBRx spread over two 8-bit registers) and UCBRSx = 6.
Now to enter these I looked at what the user guide tells us about the UART modulation register:
(was going to make this pretty but I don't have the reputation to post images)
So my initialisation code ended up being:
if (CALBC1_16MHZ==0xFF) // If calibration constant erased
{
while(1); // do not load, trap CPU!!
}
DCOCTL = 0; // Select lowest DCOx and MODx settings
//setup DCO to 16MHZ
BCSCTL1 = CALBC1_16MHZ;
DCOCTL = CALDCO_16MHZ;
P1SEL |= BIT1 + BIT2 ; // P1.1 = RXD, P1.2=TXD
P1SEL2 |= BIT1 + BIT2;
UCA0CTL1 |= UCSSEL_2; // SMCLK
UCA0BR0 = 130; // 16MHz 9600
UCA0BR1 = 6; // 16MHz 9600
UCA0MCTL = UCBRS2 + UCBRS1; // modulation is 110b or 6
UCA0CTL1 &= ~UCSWRST; // **Initialize USCI state machine**
However as I stated above this is almost there, for example if I try to display the string "Initialising..." which worked for my 1MHz 9600 baudrate settings, with 16MHz at these settings it might display "Initiali ng..." - something close but obviously some error in there that I don't want.
Thank you for any help you can provide on this.
According to the baud rate calculator at http://mspgcc.sourceforge.net/baudrate.html, UCA0MCTL is supposed to be 0x5B
I am using the SoXR library's variable rate feature to dynamically change the sampling rate of an audio stream in real time. Unfortunately I have have noticed that an unwanted clicking noise is present when changing the rate from 1.0 to a larger value (ex: 1.01) when testing with a sine wave. I have not noticed any unwanted artifacts when changing from a value larger than 1.0 to 1.0. I looked at the wave form it was producing and it appeared as if a few samples right at rate change are transposed incorrectly.
Here's a picture of an example of a stereo 440Hz sinewave stored using signed 16bit interleaved samples:
I also was unable to find any documentation covering the variable rate feature beyond the fifth code example. Here's is my initialization code:
bool DynamicRateAudioFrameQueue::intialize(uint32_t sampleRate, uint32_t numChannels)
{
mSampleRate = sampleRate;
mNumChannels = numChannels;
mRate = 1.0;
mGlideTimeInMs = 0;
// Intialize buffer
size_t intialBufferSize = 100 * sampleRate * numChannels / 1000; // 100 ms
pFifoSampleBuffer = new FiFoBuffer<int16_t>(intialBufferSize);
soxr_error_t error;
// Use signed int16 with interleaved channels
soxr_io_spec_t ioSpec = soxr_io_spec(SOXR_INT16_I, SOXR_INT16_I);
// "When creating a var-rate resampler, q_spec must be set as follows:" - example code
// Using SOXR_VR makes sense, but I'm not sure if the quality can be altered when using var-rate
soxr_quality_spec_t qualitySpec = soxr_quality_spec(SOXR_HQ, SOXR_VR);
// Using the var-rate io-spec is undocumented beyond a single code example which states
// "The ratio of the given input rate and ouput rates must equate to the
// maximum I/O ratio that will be used: "
// My tests show this is not true
double inRate = 1.0;
double outRate = 1.0;
mSoxrHandle = soxr_create(inRate, outRate, mNumChannels, &error, &ioSpec, &qualitySpec, NULL);
if (error == 0) // soxr_error_t == 0; no error
{
mIntialized = true;
return true;
}
else
{
return false;
}
}
Any idea what may be causing this to happen? Or have a suggestion for an alternative library that is capable of variable rate audio resampling in real time?
After speaking with the developer of the SoXR library I was able to resolve this issue by adjusting the maximum ratio parameters in the soxr_create method call. The developer's response can be found here.
If I open an audio file with extended audio file services, using the following client data format...
AudioStreamBasicDescription audioFormat;
memset(&audioFormat, 0, sizeof(audioFormat));
audioFormat.mSampleRate = 44100.0;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsBigEndian |
kAudioFormatFlagIsSignedInteger |
kAudioFormatFlagIsPacked;
audioFormat.mBytesPerPacket = 4;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 2;
audioFormat.mBytesPerFrame = 4;
audioFormat.mBitsPerChannel = 16;
And configure an AudioBufferList like so....
AudioBufferList bufferList;
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0].mDataByteSize = bufferSize;
bufferList.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame;
bufferList.mBuffers[0].mData = buffer; //malloc(sizeof(UInt8) * 1024 * audioFormat.mBytesPerPacket)
How, then, is the data arranged in mData? If I iterate through the data like so
for (int i = 0; i < frameCount; i++) {
UInt8 somePieceOfAudioData = buffer[i];
}
then what is somePieceOfAudioData.
Is it a sample or a frame (left and right channels together)? If it's a sample then what channel is it a sample for? If for example it's a sample from the right channel, will buffer[i + 1] be a sample for the left channel?
Any ideas, links? Thank you!
Audio data is expected to be interleaved unless the kAudioFormatFlagIsNonInterleaved is set. I've found that for Core Audio questions the best source of documentation is usually the headers. CoreAudioTypes.h contains the following comment:
Typically, when an ASBD is being used, the fields describe the
complete layout of the sample data in the buffers that are represented
by this description - where typically those buffers are represented by
an AudioBuffer that is contained in an AudioBufferList.
However, when an ASBD has the kAudioFormatFlagIsNonInterleaved flag,
the AudioBufferList has a different structure and semantic. In this
case, the ASBD fields will describe the format of ONE of the
AudioBuffers that are contained in the list, AND each AudioBuffer in
the list is determined to have a single (mono) channel of audio data.
Then, the ASBD's mChannelsPerFrame will indicate the total number of
AudioBuffers that are contained within the AudioBufferList - where
each buffer contains one channel. This is used primarily with the
AudioUnit (and AudioConverter) representation of this list - and won't
be found in the AudioHardware usage of this structure.
In your particular case, the buffer will consist of interleaved shorts starting with the left channel.
Yeah, you're reading a frame and it's two 16-bit samples, Left and Right. (Actually, I'm not certain which is Left and which is Right. Hmmm.)
In addition to the header files, the class references built into Xcode are helpful. I find I'm using "option-click" and "command-click" in my code a lot when I'm sorting out these kinds of details. (for those new to Xcode.. these clicks get you the info and docs, and jump-to-source location, respectively.)
The upcoming book "Learning Core Audio: A Hands-on Guide to Audio Programming for Mac and iOS" by Kevin Avila and Chris Adamson does a nice job of explaining how all this works. It's available in "Rough Cut" form now at Safari Books Online:
http://my.safaribooksonline.com/book/audio/9780321636973
i am trying to take the audio buffer samples in real time( resolution of ms)
i am using this function, but it gives me error.
AudioBufferList *bufferList = NULL;
AudioBuffer audioBuffer = bufferList->mBuffers[0];
int bufferSize = audioBuffer.mDataByteSize / sizeof(SInt32);
SInt32 *frame = audioBuffer.mData;
SInt32 signalInput[22050];
for( int i=0; i<bufferSize; i++ )
{
SInt32 currentSample = frame[i];
*(signalInput +i) = currentSample;
NSLog(#"Total power was: %ld ",currentSample);
}
what am i doning wrong here ?
i only need to get the audio samples .i dont want 2 pages code(such as in the app doc)
thanks .
What you want is inconsistent with what you are trying to do. A NULL bufferlist can produce no samples.
You need the two+ pages of code to properly configure the Audio Session and the RemoteIO Audio Unit (etc.) in order to get what you are trying to get. Otherwise there are no samples. The phone won't even turn on audio recording or know how to set up the recording (there a bunches of options) before turning it on. Study the docs and deal with it.
I am new to iPhone. Could you please help me to modify the SpeakHere app from Apple to record in mono format. What should I have to set for mChannelsPerFrame and what else should I set?
I already change some part for record on linearPCM WAVE format.
Here is link to speakHere.
Here is what I think they allow me to change but I don't quite understand on sound:
void ChangeNumberChannels(UInt32 nChannels, bool interleaved)
// alter an existing format
{
Assert(IsPCM(), "ChangeNumberChannels only works for PCM formats");
UInt32 wordSize = SampleWordSize(); // get this before changing ANYTHING
if (wordSize == 0)
wordSize = (mBitsPerChannel + 7) / 8;
mChannelsPerFrame = nChannels;
mFramesPerPacket = 1;
if (interleaved) {
mBytesPerPacket = mBytesPerFrame = nChannels * wordSize;
mFormatFlags &= ~kAudioFormatFlagIsNonInterleaved;
} else {
mBytesPerPacket = mBytesPerFrame = wordSize;
mFormatFlags |= kAudioFormatFlagIsNonInterleaved;
}
}
On iPhone you will only be able to record in mono.
You shouldn't need to do anything to set this up in the SpeakHere example. It's done automatically. For example in AQRecorder::SetupAudioFormat:
size = sizeof(mRecordFormat.mChannelsPerFrame);
XThrowIfError(AudioSessionGetProperty( kAudioSessionProperty_CurrentHardwareInputNumberChannels,
&size,
&mRecordFormat.mChannelsPerFrame), "couldn't get input channel count");
That gets the supported hardware input channels and sets it as an ivar. Elsewhere, the buffer size calculations will factor that in.