Decreasing speed decreases sound quality - just-audio

Decreasing the playback speed of AudioPlayer severely decreases the quality of the audio being played; the audio becomes very "noisy".
Is there any way to fix this or is it an issue with the just_audio implementation?
Reproduce:
final AudioPlayer player = AudioPlayer(); // Create audio player
player.setAsset("..."); // Load audio file
player.setSpeed(0.5); // Halve speed
player.play(); // Start playing

Just to preface this answer, time stretching is a difficult thing to do in real-time because it has to stretch time without stretching the sound waves (stretching the sound waves would lower the frequency and hence the pitch, so it has to stretch time while filling the gaps with fabricated extensions to the existing sound waves). As a result, the very best real time algorithm will still introduce artifacts and distortions.
Now to answer your question, just_audio doesn't provide any options to change the time stretching algorithm, but it does use the best available algorithms for each platform, for general purpose usage. The Android implementation uses Sonic which is better quality than Android's own built-in algorithm. On iOS/macOS, AVAudioTimePitchAlgorithmTimeDomain is used which seems to produce the least distortion at speeds below 1.0 out of the different algorithms Apple provides, although newer iPhones/iOS versions may produce higher quality output. On web browsers, it uses whatever algorithm that web browser provides.
If you need to try out alternatives, you would need to make a copy of just_audio and edit the code that selects the algorithm. You are unlikely to find better options for Android and web, but you might like to experiment with the different iOS/macOS algorithms by searching for AVAudioTimePitchAlgorithmTimeDomain in the code and changing it to one of the other options listed in Apple's documentation. You may find one of the other algorithms works better if you have a specialised use case.

Related

Premiere export settings for background video

I'm not sure if it's allowed ask these questions here, but looks important for us webdevelopers (even bad dev like me :p ).
The question is about export setting videons on Premiere. I'm looking for a background video 30s like airbnb or paypal. Yesterday, I check paypal size and it's only 10/15 Mb for more than a minute. How did they do?
Obviously you want a low average bit rate. Things that can help with that are: keep the resolution low (you can scale it up a bit on the client); use H.264 High Profile (for the H.264 version); use 2-pass encoding; use variable bit rate. You can try increasing the GOP length too.
I assume there's no audio, so that shouldn't be an issue. (Can't remember if Adobe has an option for no sound track, but you can set the audio to a very low bit rate, or post-process it with ffmpeg or something to remove the audio track.)
If you have any control over the video content, you can try to keep it compressible. For example, avoid video with lots of detail or rapid motion. You might be able to selectively blur parts in a way that doesn't look bad. If it doesn't move too fast, you might be able to decrease the frame rate.
If you really want to optimize, you'll probably need to experiment a lot.

Check Device Type For GPUImage

GPUImage requires, for iPhone 4 and below, images smaller than 2048 pixels. The 4S and above can handle much larger. How can I check to see which device my app is currently running on? I haven't found anything in UIDevice that does what I'm looking for. Any suggestions/workarounds?
For this, you don't need to check device type, you simply need to read the maximum texture size supported by the device. Luckily, there is a built-in method within GPUImage that does this for you:
GLint maxTextureSize = [GPUImageContext maximumTextureSizeForThisDevice];
The above will give you the maximum texture size for the device you're running this on. That will determine the largest image size that GPUImage can work with on that device, and should be future-proof against whatever iOS devices come next.
This method works by caching the results of this OpenGL ES query:
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);
if you're curious.
I should also note that you can provide images larger than the max texture size to the framework, but they get scaled down to the largest size supported by the GPU before processing. At some point, I may complete my plan for tiling subsections of these images in processing so that larger images can be supported natively. That's a ways off, though.
This is among the best device-detection libraries I've come across: https://github.com/erica/uidevice-extension
EDIT: Although the readme seems to suggest that the more up-to-date versions are in her "Cookbook" sources. Perhaps this one is more current.
Here is a useful class that I have used several times in the past that is very simple and easy to implement,
https://gist.github.com/Jaybles/1323251

control speed of sound xcode

I'm wondering whether it's possible to slow down a sound in xcode. I mean I'll add some .mp3 file in my supporting files in xcode and I'll create app which will be able to speed it up or slow down. For example with slider. Is it even possible? If yes, could anyone help me with some idea? Thanks
AVAudioPlayer has a rate property which should be able to help you accomplish your goal.
http://developer.apple.com/library/IOS/#documentation/AVFoundation/Reference/AVAudioPlayerClassReference/Reference/Reference.html
The audio player’s playback rate. #property float rate
Discussion
This property’s default value of 1.0 provides normal playback rate.
The available range is from 0.5 for half-speed playback through 2.0
for double-speed playback.
To set an audio player’s playback rate, you must first enable rate
adjustment as described in the enableRate property description.
I also found a good SO post on the AVAudioPlayer's rate:
AVAudioPlayer rate
Seems like as you'd mentioned you could set a slider with values from 0.5 to 2.0 and on valuechanged modify the audio players rate by using
- (IBAction)changeValue:(UISlider *)sender
{
//made up assumed ivar names
if ([_audioPlayer respondsToSelector:#selector(setEnableRate:)])
_audioPlayer.enableRate = YES;
if ([_audioPlayer respondsToSelector:#selector(setRate:)])
_audioPlayer.rate = [NSNumber numberWithFloat:slideValue];
}
Playing PCM audio at a faster or slower rate than its sample rate changes its pitch and also introduces considerable artefacts. If you're OK with this, the approach you would use is to decode the MP3 into PCM audio and then use Direct Digital Synthesis Oscillator to control the playback rate.
If you want to maintain pitch but change speed, you need a audio time-stretching algorithm.
Dirac3 from DSP Dimension is a commercial product that can do it, and which is available for licensing for use in iOS application. Other commercial solutions exist.
DSP Dimension's blog provides a helpful tutorial on the basics of how to implement pitch-shifting
using a FFT. Time stretching is essentially the same process. However, there's a fair bit of secret sauce in the DIRAC plug-in they don't tell you about.
Be warned that unless you're a Electronics engineering, physics or maths graduate, you'll probably it tough going to fill in the blanks.

How to detect heart pulse rate without using any instrument in iOS sdk?

I am right now working on one application where I need to find out user's heartbeat rate. I found plenty of applications working on the same. But not able to find a single private or public API supporting the same.
Is there any framework available, that can be helpful for the same? Also I was wondering whether UIAccelerometer class can be helpful for the same and what can be the level of accuracy with the same?
How to implement the same feature using : putting the finger on iPhone camera or by putting the microphones on jaw or wrist or some other way?
Is there any way to check the blood circulation changes ad find the heart beat using the same or UIAccelerometer? Any API or some code?? Thank you.
There is no API used to detect heart rates, these apps do so in a variety of ways.
Some will use the accelerometer to measure when the device shakes with each pulse. Other use the camera lens, with the flash on, then detect when blood moves through the finger by detecting the light levels that can be seen.
Various DSP signal processing techniques can be used to possibly discern very low level periodic signals out of a long enough set of samples taken at an appropriate sample rate (accelerometer or reflected light color).
Some of the advanced math functions in the Accelerate framework API can be used as building blocks for these various DSP techniques. An explanation would require several chapters of a Digital Signal Processing textbook, so that might be a good place to start.

ZyXEL ADPCM codec

I have a ZyXEL USB Omni56K Duo modem and want to send and receive voice streams on it, but to reach adequate quality I probably need to implement some "ZyXEL ADPCM" encoding because plain PCM provides too small sampling rate to transmit even medium quality voice, and it doesn't work through USB either (probably because even this bitrate is too high for USB-Serial converter in it).
This mysterious codec figures in all Microsoft WAV-related libraries as one of many codecs theoretically supported by it, but I found no implementations.
Can someone offer an implementation in any language or maybe some documentation? Writing a custom mu-law decoding algorithm won't be a problem for me.
Thanks.
I'm not sure how ZyXEL ADPCM varies from other flavors of ADPCM, but various ADPCM implementations can be found with some google searches.
However, the real reason for my post is why the choice of ADPCM. ADPCM is adaptive differential pulse-code modulation. This means that the data being passed is the difference in samples, not the current value (which is also why you see such great compression). In a clean environment with no bit loss (ie disk drive), this is fine. However, in a streaming environment, its generally assumed that bits may be periodically mangled. Any bit damage to the data and you'll be hearing static or other audio artifacts very quickly and usually, fairly badly.
ADPCM's reset mechanism isn't framed based, which means the audio problems can go on for an extended period of time depending on the encoder. The reset code is a usually a set of 0s (16 comes to mind, but its been years since I wrote my own ports).
ADPCM in the telephony environment usually converts a 12 bit PCM sample to a 4 bit ADPCM sample (not bad). As for audio quality...not bad for phone conversations and the spoken word, but most people, in a blind test, can easily detect the quality drop.
In your last sentence, you throw a curve ball into the question. You start mentioning muLaw. muLaw is a PCM implementation that takes a 12 bit sample and transforms it using a logarithmic scale to an 8 bit sample. This is the typical compression mechanism for TDM (phone) networkworks in North America (most of the rest of the world uses a similar algorithm called ALaw).
So, I'm confused what you are actually trying to find.
You also mentioned Microsft and WAV implementations. You probably know, but just in case, that WAV is just a wrapper around the audio data that provides format, sampling information, channel, size and other useful information. Without WAV, AU or other wrappers involved, muLaw and ADPCM are usually presented as raw data.
One other tip if you are implementing ADPCM. As I indicated, they use 4 bits to represent a 12 bit sample. They get away with this by both sides having a multiplier table. Your position in the table changes based on the 4 bit value (in other words, the value is both multiple against a step size and used to figure out the new step size). I've seen a variety of algorithms use slightly different tables (no idea why, but you typically see the sent and received signals slowly stray off the bias). One of the older, popular sound packages was different than what I typically saw from the telephony hardware vendors.
And, for more useless trivia, there are multiple flavors of ADPCM. The variances involve the table, source sample size and destination sample size, but I've never had a need to work with them. Just documented flavors that I've found when I did my internet search for specifications for the various audio formats used in telephony.
Piping your pcm through ffmpeg -f u16le -i - -f wav -acodec adpcm_ms - will likely work.
http://ffmpeg.org/