Frame-by-frame decode using Media Source Extensions - html5-video

I've been digging through Media Source Extensions examples on the internet and haven't quite figured out a way to adapt them to my needs.
I'm looking to take a locally cached MP4/WebM video (with 100% keyframes and a 1:1 ratio of clusters/atoms to keyframes) and decode/display its frames non-sequentially (i.e. frame 10, 400, 2, 100, etc.), and to be able to render these non-sequential frames on demand at rates from 0-60 fps. The simple non-MSE approach using the currentTime property fails due to the latency between setting this property and getting a frame displayed.
I realize this is totally outside normal expectations for video playback, but my application requires this type of non-sequential, high-speed playback. Ideally I could do this with H.264 for GPU acceleration, but I realize there could be some platform-specific GPU buffers to contend with, though it seems that a zero-frame buffer should be possible (see here). I am hoping that MSE can accomplish this non-sequential, high-framerate, low-latency playback, but I know I'm asking for a lot.
Questions:
Will appendBuffer accept a single WebM cluster / MP4 atom made up of a single keyframe, and will the browser be able to decode at a high frequency (60 fps)?
Do you think what I'm trying to do is possible in the browser?
Any help, insight, or code suggestions/examples would be much appreciated.
Thanks!
Update 4/5/16
I was able to get MSE mostly working with single-frame MP4 fragments in Firefox, Edge, and Chrome. However, Chrome seems to be running into the frame buffer issue linked above, and I haven't found a way to pre-process an MP4 to invoke this "low delay" mode. Anyone have any clues whether it's possible to create such a file with an existing tool like MP4Box?
Firefox and Edge decode/display the individual frames immediately with very little latency, but of course something breaks once I load this video into a Three.js WebGL project (no video output, no errors). I'm ignoring this for now, as I'd much rather have things working in Chrome since I'll be targeting Android as well.
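For reference, the pattern I'm using to feed single-frame fragments is roughly the following (a TypeScript sketch; the codec string and the fetchFragment() helper are placeholders for however you slice up the cached file):

// Minimal sketch: feed single-frame MP4 fragments to MSE on demand.
const video = document.querySelector('video') as HTMLVideoElement;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const sb = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');

  // Append the init segment (ftyp+moov) once.
  await appendAndWait(sb, await fetchFragment('init'));

  // Then append any frame's fragment (moof+mdat) in whatever order you like.
  for (const frameIndex of [10, 400, 2, 100]) {
    await appendAndWait(sb, await fetchFragment(frameIndex));
  }
});

// Resolve once the SourceBuffer has finished ingesting the appended bytes.
function appendAndWait(sb: SourceBuffer, data: ArrayBuffer): Promise<void> {
  return new Promise((resolve) => {
    sb.addEventListener('updateend', () => resolve(), { once: true });
    sb.appendBuffer(data);
  });
}

// Hypothetical helper: returns the init segment or the single-frame fragment
// for a given frame, e.g. sliced out of the locally cached file.
declare function fetchFragment(which: 'init' | number): Promise<ArrayBuffer>;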

I was able to get this working pretty well. The key was getting Chrome to enter its "low delay" mode by muxing a specially crafted MP4 file using modified mp4box sources. I added one line in movie_fragments.c so it read:
if (movie->moov->mvex->mehd && movie->moov->mvex->mehd->fragment_duration) {
    trex->track->Header->duration = 0;
    Media_SetDuration(trex->track);
    movie->moov->mvex->mehd->fragment_duration = 0;   /* the added line */
}
Now every MP4 created will have the MEHD fragment duration set to 0, which causes Chrome to process it as a live stream.
I still have one remaining issue related to the timestampOffset property, which, in combination with the FPS set in the media fragments, controls the playback speed. Since I'm looking to control the FPS directly, I don't want any added delay from the MSE playback engine. I'll post a separate question here to address that.
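Roughly what I mean (a sketch; frameDuration is an assumed value such as 1/60 s): the offset set before each append, together with the fragment timing, determines when each frame is presented:

// Sketch: place each appended single-frame fragment at an explicit point on
// the timeline. frameDuration is an assumed value (e.g. 1/60 s).
const frameDuration = 1 / 60;
let appendedCount = 0;

function queueFrame(sb: SourceBuffer, fragment: ArrayBuffer): void {
  // timestampOffset may only be changed while the SourceBuffer is not updating.
  if (sb.updating) throw new Error('previous append still in progress');
  sb.timestampOffset = appendedCount * frameDuration;
  sb.appendBuffer(fragment);
  appendedCount += 1;
}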
Thanks,
Dustin

Related

Real-time camera input to Julia-lang

TLDR: How can I achieve low-latency, low-CPU-impact webcam acquisition in Julia?
edit: I also posted this on the Julia devs forum
I am new to Julia. I am interested in processing the video feed from a connected webcam, and seeing what kind of performance I can get out of Julia.
I am working on Ubuntu Linux 16.04.
The only way I have found to get webcam input through video4linux is through VideoIO, which is working on my system. The video has unacceptable lag, however, of up to 4 seconds. I assume this is caused by the buffering of frames by the driver and/or libav (or is it FFmpeg? I'm not sure). With any camera API worth its name, I should be able to access the latest camera frame acquired... or at least set the size of the queue that I'm popping frames from. There seems to be no such option in VideoIO, or maybe I am missing it.
It really is important for me to be able to showcase Julia as a high-performance language to non-techies... so this lag will ruin the demo I am hoping to put together.
edit: here is some of the code I have:
module myViewCam
export myView
import VideoIO, ImageView;
function myView()
    camera = VideoIO.opencamera();       # open the default camera via video4linux
    buf = VideoIO.read(camera);          # grab the first frame into a reusable buffer
    guidict = ImageView.imshow(buf);     # open the display window once
    while !eof(camera)
        VideoIO.read!(camera, buf);      # overwrite buf with the next frame from the queue
        ImageView.imshow(guidict["gui"]["canvas"], buf);  # redraw onto the existing canvas
        sleep(0.00001);                  # yield briefly so the GUI can update
    end
end
end
Assuming the above is the content of myViewCam.jl, at the Julia prompt (the "REPL") I type:
include("myViewCam.jl");
myViewCam.myView();
Note that this is a fix for the function VideoIO.viewcam(), which does not seem to work out of the box.
On my system, this brings the Julia thread up to about 100% CPU usage. At the beginning of the video stream there is about 4 seconds of lag, but this evens out over time until it lands at about 0.5 seconds of lag. There obviously is some queue that frames are popped from.
Also see this Video4Linux wrapper in Julia, which works well with Images.jl:
https://github.com/Affie/Video4Linux.jl
It's not registered yet, but it has been around for a while. It is possible to make this process multithreaded in Julia using SharedArrays.jl, or likely the new composable threading model available since Julia 1.3.
PS: this vendor-specific camera interface package exists too: https://github.com/JuliaCameras/RealSense.jl

WebRTC - Peerconnection constraints

I've been working on a WebRTC videoconferencing app which is working great, taking into account the current state of WebRTC.
However, I have been exploring the possibilities for adding constraints to the video and audio streams being sent over the PeerConnection.
More specifically, I'm interested in improving the performance of the video.
When videoconferencing on old (slow) laptops, we noticed that the quality of the image is really high but the frames per second are low; the stream is choppy.
As for the audio quality, we'd give it an 8.5 for Chrome but only a 5.5 to 6 for Firefox.
I am not really interested in applying constraints to getUserMedia, since this stream is being shown to the user as well, and we don't want to change anything about this local output (unless there isn't another way).
I have found a lot of information in the W3C's drafts about MediaStreams and WebRTC itself.
These define certain constraints like default fps, min fps, min width and min height of the image. On webrtc.org there is also a lot of information available, like choosing a codec, etc.
But these settings can only be made "under the hood". It seems these settings cannot be addressed from the RTCPeerConnection API level?
Certain examples on the net manipulate the SDP strings in the Offer/Answer part of the WebRTC handshake; is this the way to go?
TL;DR: How do I apply constraints (and what is the best way to apply them) to WebRTC, such as min fps, max fps, default fps, min width, max width, image DPI, video and audio bandwidth, audio sample rate (kHz), and any other setting that could improve the performance or quality of the stream(s)?
Big thanks in advance!
Right now, most of those can't be set in Firefox or Chrome. A few can be adjusted (with care/pain) in the SDP, but even if there's an SDP option defined for something it doesn't mean that the browsers look at it.
Both Mozilla and Google are looking to improve CPU overload detection and reaction (reduce frame size dynamically, etc). Right now, this effectively isn't being done. Upcoming releases of FF (FF24) will adapt to the capture resolution (as a maximum), but we don't have constraints for that yet, just about:config prefs (see media.*). That would allow you to set a different default resolution for Firefox.
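For what it's worth, here's the kind of SDP munging those examples do, sketched in TypeScript (the b=AS value of 500 kbit/s is just an example number, and there's no guarantee a given browser honours it):

// Sketch: cap video bandwidth by inserting a b=AS line into the SDP
// before it is applied with setLocalDescription.
function capVideoBandwidth(sdp: string, kbps: number): string {
  const lines = sdp.split('\r\n');
  const out: string[] = [];
  let inVideoSection = false;
  for (const line of lines) {
    out.push(line);
    if (line.startsWith('m=')) {
      inVideoSection = line.startsWith('m=video');
    }
    // Place the bandwidth line right after the c= line of the video section.
    if (inVideoSection && line.startsWith('c=')) {
      out.push(`b=AS:${kbps}`);
    }
  }
  return out.join('\r\n');
}

// Usage with an RTCPeerConnection offer (pc is assumed to exist):
// const offer = await pc.createOffer();
// offer.sdp = capVideoBandwidth(offer.sdp!, 500);
// await pc.setLocalDescription(offer);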

h264 high profile video: playback from specified point

What is the proper, fast way to start streaming/playback of an H.264 High Profile HDTV video dump from a specific point?
A huge sample of the real-life stream: this file.
According to 'ffprobe -show_frames', this sample (a 10 GB, 105-minute video dump) has only 28 video frames marked 'key_frame=1' and 10 I-frames.
The application I am trying to improve uses such frames as a kind of index, allowing it to rewind and play from any keyframe or I-frame.
It works perfectly with other streams, but not in this case, as you can easily understand: only 28 playback starting points in 100+ minutes of the show is far too few.
I've checked for the presence of packets with the 'random-access-indicator' flag enabled, but such packets in this stream aren't on frame boundaries (they don't have the 'frame begin' bit enabled), so I can't rely on them.
Is there a way at all to implement a 'rewind/pause/play from the specified time point' feature for this codec?
Solved by treating as index frames the ones that contain the NAL sequences 'nal slice idr' and 'nal slice pps'.
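In case it helps someone, here is a rough sketch of that idea (TypeScript; it assumes the video has already been demuxed to an Annex-B H.264 elementary stream): scan for start codes and record the offsets of IDR slices (nal_unit_type 5) and SPS/PPS units (types 7 and 8) to use as the seek index.

// Sketch: build a seek index from an Annex-B H.264 elementary stream by
// recording the byte offsets of IDR slices (5), SPS (7) and PPS (8) NAL units.
interface NalIndexEntry { offset: number; nalType: number; }

function indexNalUnits(data: Uint8Array): NalIndexEntry[] {
  const entries: NalIndexEntry[] = [];
  for (let i = 0; i + 3 < data.length; i++) {
    // Match a 3-byte start code 00 00 01 (a 4-byte 00 00 00 01 also ends here).
    if (data[i] === 0 && data[i + 1] === 0 && data[i + 2] === 1) {
      const nalType = data[i + 3] & 0x1f;   // low 5 bits of the NAL header byte
      if (nalType === 5 || nalType === 7 || nalType === 8) {
        entries.push({ offset: i, nalType });
      }
      i += 3; // skip past the start code
    }
  }
  return entries;
}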

How to turn off Video Acceleration programmatically

I'm using the Windows Media Player OCX in a program run on hundreds of dedicated computers.
I have found out that when video acceleration is set to "Full", on some computers the video fails to play correctly, with green squares between movies and so on. Turn the acceleration to "None" and everything is fine.
This program runs on ~800 computers that auto-update it, so I want to add a step to my program's startup that turns off video acceleration.
The question is: how do I turn off video acceleration programmatically?
All computers are running Windows XP with at least Service Pack 2.
It would take me ages to manually log in to all those computers and change that setting, so that's why I want the program to be able to do it automagically for me.
Using the suggested process of running Procmon and filtering out the unnecessary data, I was able to determine the changes made in the registry when this value changed:
Full Video Acceleration:
[HKEY_CURRENT_USER\Software\Microsoft\MediaPlayer\Preferences\VideoSettings]
"PerformanceSettings"=dword:00000002
"UseVMR"=dword:00000001
"UseVMROverlay"=dword:00000001
"UseRGB"=dword:00000001
"UseYUV"=dword:00000001
"UseFullScrMS"=dword:00000000
"DontUseFrameInterpolation"=dword:00000000
"DVDUseVMR"=dword:00000001
"DVDUseVMROverlay"=dword:00000001
"DVDUseVMRFSMS"=dword:00000001
"DVDUseSWDecoder"=dword:00000001
No Video Acceleration:
[HKEY_CURRENT_USER\Software\Microsoft\MediaPlayer\Preferences\VideoSettings]
"PerformanceSettings"=dword:00000000
"UseVMR"=dword:00000000
"UseVMROverlay"=dword:00000000
"UseRGB"=dword:00000000
"UseYUV"=dword:00000000
"UseFullScrMS"=dword:00000001
"DontUseFrameInterpolation"=dword:00000001
"DVDUseVMR"=dword:00000000
"DVDUseVMROverlay"=dword:00000000
"DVDUseVMRFSMS"=dword:00000000
"DVDUseSWDecoder"=dword:00000000
So, in short, set
PerformanceSettings
UseVMR
UseVMROverlay
UseRGB
UseYUV
DVDUseVMR
DVDUseVMROverlay
DVDUseVMRFSMS
DVDUseSWDecoder
to 0, and set
UseFullScrMS
DontUseFrameInterpolation
to 1.
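If it helps, here is the same change expressed as code: a sketch in TypeScript on Node that shells out to reg.exe purely to illustrate the writes (your program would presumably use the registry API of whatever language it's written in).

// Sketch: apply the "no acceleration" values by invoking reg.exe,
// which ships with Windows XP. Values mirror the summary above.
import { execFileSync } from 'node:child_process';

const KEY = String.raw`HKCU\Software\Microsoft\MediaPlayer\Preferences\VideoSettings`;

const values: Record<string, number> = {
  PerformanceSettings: 0,
  UseVMR: 0,
  UseVMROverlay: 0,
  UseRGB: 0,
  UseYUV: 0,
  DVDUseVMR: 0,
  DVDUseVMROverlay: 0,
  DVDUseVMRFSMS: 0,
  DVDUseSWDecoder: 0,
  UseFullScrMS: 1,
  DontUseFrameInterpolation: 1,
};

for (const [name, data] of Object.entries(values)) {
  // reg add <key> /v <name> /t REG_DWORD /d <value> /f overwrites without prompting.
  execFileSync('reg', ['add', KEY, '/v', name, '/t', 'REG_DWORD', '/d', String(data), '/f']);
}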
It seems you're not the only one with this problem. Here's a link to a blog where the author solves his problem by lowering the hardware acceleration level. It was tested on Media Player 9, 10, and 11, with a REG script to set the appropriate settings.
http://thebackroomtech.com/2009/04/15/global-fix-windows-media-player-audio-works-video-does-not/
As well as applying this fix, you might check that the affected machines have the latest drivers and codec versions. Finally, if possible, you might consider re-encoding the content to a format that doesn't produce the display problems (if the bug is codec-related).
Using hardware acceleration is certainly more energy-efficient: according to this Intel report, almost twice as much energy is used without acceleration, and with 800 machines there's reason to seek out a green solution.

How to programmatically test for audio sync

I have a multimedia application that, among other things, converts video using FFmpeg. Video conversion being the pain that it is, I have in my test suites some tests that check our ability to convert various video formats, with emphasis on sample videos known not to work.
A common problem we've noticed from users is that some videos end up with their audio desynched after being processed, and I am looking for a way to check this in my tests.
Extracting the audio portion of the resulting videos is not a problem.
My best idea so far would be to check the offset of the first non-silence at both the beginning and the end and compare those offsets between the two videos, but I'm hoping someone smart has a better idea.
The application language/environment is Java, but since this is for testing, I'm free to use any toolset.
The basic problem is likely that the video and audio are different lengths. Extract the audio and test its length vs. the video length. If they are significantly different (more than maybe .05 sec, I'm not really sure what is detectable as "off"), then there's a problem.
To fix it, re-encode the audio to match the video length, and then put the audio and video back into a container format.
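For the test-suite side, a minimal sketch of that length comparison (TypeScript on Node, shelling out to ffprobe; the 0.05-second threshold is just the guess mentioned above):

// Sketch: compare audio and video stream durations with ffprobe and flag
// files whose difference exceeds a threshold.
import { execFileSync } from 'node:child_process';

function streamDuration(file: string, stream: 'v:0' | 'a:0'): number {
  const out = execFileSync('ffprobe', [
    '-v', 'error',
    '-select_streams', stream,
    '-show_entries', 'stream=duration',
    '-of', 'csv=p=0',
    file,
  ]).toString().trim();
  return parseFloat(out);
}

function audioLooksDesynced(file: string, thresholdSec = 0.05): boolean {
  const drift = Math.abs(streamDuration(file, 'v:0') - streamDuration(file, 'a:0'));
  return drift > thresholdSec;
}

// Example: audioLooksDesynced('converted/sample.mp4')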