I'd like to be able to seek to an arbitrary frame in an MPEG-2 file (from DVD; I believe this is called an MPEG-2 Program Stream). So far I had been using OpenCV 2.1 to access those frames, but that only worked frame by frame (forward seeking only). When I later installed OpenCV 2.3.1 that possibility was lost, i.e. it was limited to AVI. Anyway, I'd like to do this without OpenCV. I've managed to seek to keyframes (I suppose), or to every so-many-th frame (e.g. every 12th frame).
Now, looking at VirtualDub, frame-accurate seeking is clearly possible. It says: "parsing interleaved MPEG-2 file". What exactly does that mean, and where would I have to start to do the same? Is it even legal? I remember reading something about that somewhere, though I can't really remember where.
I'm programming in C++ using DirectShow. As far as I know, DirectShow won't do it. I then looked into CBaseFilter, the stream time methods, etc., but before I dive into that complex topic I'd like to know whether that's the right way to go. Looking forward to your answers, thanks!
Edit for Geraint: code snippet of the filter graph:
// Create the filter graph, the MPEG-2 demultiplexer and the video decoder
CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC, IID_IGraphBuilder, (LPVOID *)&pGraphBuilder);
CoCreateInstance(CLSID_MPEG2Demultiplexer, NULL, CLSCTX_INPROC, IID_IBaseFilter, (LPVOID *)&pib);
CoCreateInstance(CLSID_CMPEG2VidDecoderDS, NULL, CLSCTX_INPROC, IID_IBaseFilter, (LPVOID *)&pib2);
pGraphBuilder->AddFilter(pib, L"Sample Splitter");
pGraphBuilder->AddFilter(pib2, L"Sample Decoder");

// Media type for the demultiplexer's video output pin
ZeroMemory(&am_media_type, sizeof(am_media_type));
am_media_type.majortype  = MEDIATYPE_Video;
am_media_type.subtype    = MEDIASUBTYPE_MPEG2_VIDEO;
am_media_type.formattype = FORMAT_MPEG2Video;

// Query the control, seeking, frame-stepping, event and video interfaces
pGraphBuilder->QueryInterface(IID_IMediaControl, (LPVOID *)&pMediaControl);
pGraphBuilder->QueryInterface(IID_IMediaSeeking, (void **)&pMediaSeeking);
pGraphBuilder->QueryInterface(__uuidof(IVideoFrameStep), (PVOID *)&fst);
pGraphBuilder->QueryInterface(IID_IMediaEvent, (void **)&imev);
pGraphBuilder->QueryInterface(IID_IBasicVideo, (LPVOID *)&ibv);

// Build the rest of the graph from the source file
pGraphBuilder->RenderFile(FILENAME, 0);
Then I use IMediaSeeking to seek within the video. I've also tried frame stepping (hence the extra interfaces queried above).
DirectShow is capable of delivering frame-accurate seeking. However, without an index, this is based on a time offset from file start, not a frame count.
Use IMediaSeeking to set the start time. The demux will begin delivery of compressed frames some way before that. The decoder will start decoding at the previous key frame but will discard any frames that are before your chosen start point.
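A hedged sketch of what that might look like with the interfaces from the snippet above (frameIndex and frameRate are assumed values for converting a frame number into 100-ns media time; error handling is mostly omitted):
// Convert a zero-based frame number into a 100-ns time offset (assumed values).
REFERENCE_TIME rtStart = (REFERENCE_TIME)frameIndex * 10000000LL / frameRate;
REFERENCE_TIME rtStop  = 0;

pMediaSeeking->SetTimeFormat(&TIME_FORMAT_MEDIA_TIME);
HRESULT hr = pMediaSeeking->SetPositions(
    &rtStart, AM_SEEKING_AbsolutePositioning,   // jump to the computed offset
    &rtStop,  AM_SEEKING_NoPositioning);        // leave the stop time unchanged
if (SUCCEEDED(hr))
    pMediaControl->Pause();                     // Pause() cues and displays the frame at rtStart; Run() plays from there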
G
I'm going to start off by saying that I'm very new to SDR and GNU Radio. This may be a dumb question, but I have been googling and testing things for about two months now trying to get this to work without success. Any help or pointers would be appreciated!
I'm attempting to use GNU Radio 3.8 to transfer a file using differential QPSK. I've tried to follow the tutorials on the wiki as well as several similar academic papers I found on the internet (which also seem to be based on the wiki tutorial). None of them worked on their own, but by combining what actually works from each one, I managed to create a flowgraph, sans hardware, that does indeed send and receive the data from a file. Here's the flowgraph and here is a screenshot of the results. The results show the four constellation points, and the data from the file source matches up perfectly with the data having gone through the entire transmit+receive chain. In the simulation I have a throttle block and a channel model block where the LimeSDR Source and LimeSDR Sink blocks would be. So far so good (at least as far as I can tell).
When I actually start transmitting this signal with the SDR, the received data no longer matches up with what is transmitted. Here's the flowgraph I've been using for the transmission. I added a protocol formatter and some FEC blocks that I could have removed for this illustration, but the point is that simply comparing the bits going into the modulator with the bits being recovered, the two do not match up. The constellation looks good (as far as I can tell) but the bits are all wrong. Here's a screenshot showing the bits being transmitted. You'll notice in the screenshot of the transmitted signal that it has a repeating series of three flat-top "1"s surrounded on both sides by a period of "0"s (at times 1.5 ms and 3.5 ms). This is a screenshot of the received bits. At times 1 ms and 3 ms you can see that it has significantly more transitions between 1 and 0 than it should.
So at this point I'm stumped. The simulation works but the real-world test does not. I've messed around with the RRC filter properties a significant amount. I have no clue whether the values I have chosen are correct, as I have not found a tutorial or explanation on how to choose them. I just looked at some of the example flowgraphs, made some guesses as to how those values were derived, and applied those guesses to my use case. It worked well in the simulation, so I thought it would be fine in the real-world test. I've tried a variety of samples-per-symbol values, but my goal is a 4800 bit per second transfer speed, and using different samples per symbol didn't help anyway. What should I change in order to get this to work?
Bonus question: The constellation object has QPSK and DQPSK, and the constellation modulator has a differential checkbox. What is the best practice combination of selections to get a differential QPSK modulation?
I have a BGR 24-bit image in memory as a contiguous buffer (represented by cv::Mat, in case that is of any help). I would like to load it into an ID2D1Bitmap1 bitmap for 2D rendering. I have the following working code (shown as pseudo-code here):
IWICImagingFactory::CreateBitmapFromMemory(GUID_WICPixelFormat24bppBGR);
IWICFormatConverter::Initialize(GUID_WICPixelFormat32bppRGB);
ID2D1DeviceContext::CreateBitmapFromWicBitmap;
This works fine, the main issue being the time it takes: 20-40 milliseconds, which is too long for my application. I am looking for ways to optimize the process.
I can probably save the creation time of the ID2D1Bitmap1 by doing this once and then loading the converted image from memory using CopyFromMemory, but the conversion itself still takes a large amount of time. One option could be to load the raw BGR buffer into GPU memory and convert it to the native RGBA format on the GPU itself, but I have no idea how to start with that.
Your second idea is exactly the direction you should go. Create the ID2D1Bitmap(s) once, convert the buffer (more on that below), and use CopyFromMemory. I do something very similar in an app which provides a preview of a connected webcam (which may have one of various formats). Some cameras will deliver the images in MJPEG, YUY2, etc.
That app uses Media Foundation and an IMFTransform to convert the buffer. The IMFTransform in this case is an instance of CLSID_CColorConvertDMO (which uses SIMD registers/instructions when possible). Before settling on that, I tested my own conversion code (which was CPU-bound and performed so-so) and another solution with HLSL and DirectCompute (which performed well but handled only one format). In the end I chose CLSID_CColorConvertDMO to handle the various types of input, but if you only have one known type, you may choose the HLSL solution (although it will require you to write the conversion code and set up the 'views' of the data).
However, if you choose the Media Foundation route, you can use an IMFTransform without the rest of a pipeline (source, sink, etc.). After creating the CLSID_CColorConvertDMO instance and setting the input and output types (format, frame size, etc.), create an IMFSample (using MFCreateSample), add an IMFMediaBuffer (created with MFCreateMemoryBuffer) to the sample (using IMFSample::AddBuffer), and then all that is necessary is to call ProcessInput and ProcessOutput to convert the buffer (create all items up front). This may sound like a lot, but if done correctly your CPU utilization will not take a noticeable hit, and you will achieve the performance you are looking for, even for capture cards that deliver large frames at 60+ FPS (I have used DataPath and Blackmagic capture cards in the past).
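As a rough sketch of that path (not a drop-in implementation): it assumes MFStartup and CoInitializeEx have already been called, width and height are known, the input is 24-bit BGR (MFVideoFormat_RGB24) and the output is 32-bit BGRX (MFVideoFormat_RGB32), with all error checking omitted:
#include <mfapi.h>
#include <mftransform.h>
#include <wmcodecdsp.h>   // CLSID_CColorConvertDMO

// Create the color converter MFT and describe its input/output types.
IMFTransform* converter = nullptr;
CoCreateInstance(CLSID_CColorConvertDMO, nullptr, CLSCTX_INPROC_SERVER,
                 IID_PPV_ARGS(&converter));

IMFMediaType* inType = nullptr;
MFCreateMediaType(&inType);
inType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
inType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB24);          // 24-bit BGR
MFSetAttributeSize(inType, MF_MT_FRAME_SIZE, width, height);
converter->SetInputType(0, inType, 0);

IMFMediaType* outType = nullptr;
MFCreateMediaType(&outType);
outType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
outType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);         // 32-bit BGRX
MFSetAttributeSize(outType, MF_MT_FRAME_SIZE, width, height);
converter->SetOutputType(0, outType, 0);

// Create the sample/buffer pairs once, up front, and reuse them every frame.
IMFSample* inSample = nullptr;   IMFMediaBuffer* inBuffer  = nullptr;
IMFSample* outSample = nullptr;  IMFMediaBuffer* outBuffer = nullptr;
MFCreateSample(&inSample);
MFCreateMemoryBuffer(width * height * 3, &inBuffer);
inSample->AddBuffer(inBuffer);
MFCreateSample(&outSample);
MFCreateMemoryBuffer(width * height * 4, &outBuffer);
outSample->AddBuffer(outBuffer);

// Per frame: Lock/memcpy/Unlock the BGR data into inBuffer, set its current
// length, then run the conversion and hand the result to CopyFromMemory.
converter->ProcessInput(0, inSample, 0);
MFT_OUTPUT_DATA_BUFFER outData = {};
outData.pSample = outSample;
DWORD status = 0;
converter->ProcessOutput(0, 1, &outData, &status);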
Good luck. I am certain you will crush it.
Is there a programmatic way to convert two images into an animation sequence (e.g., an animated GIF) like the following example?
This image sequence, taken from a http://memrise.com course, doesn't seem to have manually edited frames; it appears to have been transformed automatically using some kind of shape-morphing algorithm. Is there a common term used to describe such an animation or algorithm? Is there a feature in ImageMagick or Photoshop/GIMP that generates such animations, given a pair of images?
Ideally the technique could be scriptable so I could create animations for several pairs of start-end images.
Edit: I have just been told about GIMP's tool under Filters->Animation->Blend, which appears to do the same thing as a jQuery morph: each frame i is start + (finish - start)/N*i. In other words, each pixel transitions independently from its start value to its finish value, without any shape morphing (see the sketch below). The example given above is more complicated, as it modifies the contours of both images to achieve its compelling effect.
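For reference, a small illustrative sketch of that per-pixel blend (a plain cross-dissolve, not shape morphing) using C++ and OpenCV's addWeighted; the file names and frame count are assumptions:
#include <opencv2/opencv.hpp>

// Writes N+1 frames that fade linearly from 'start' to 'finish'
// (frame i = start + (finish - start)/N * i, applied per pixel).
void writeBlendFrames(const cv::Mat& start, const cv::Mat& finish, int N)
{
    for (int i = 0; i <= N; ++i) {
        const double t = (double)i / N;          // 0.0 -> 1.0
        cv::Mat frame;
        cv::addWeighted(start, 1.0 - t, finish, t, 0.0, frame);
        cv::imwrite(cv::format("frame_%03d.png", i), frame);
    }
}

// The resulting PNGs can then be assembled into a GIF, e.g. with ImageMagick:
//   convert -delay 5 -loop 0 frame_*.png blend.gif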
Other examples:
http://static.memrise.com/uploads/mems/32000121024054535.gif
http://static.memrise.com/uploads/mems/225428000121109232837.gif
I have written a tool that doesn't require setting keypoints manually and is not restricted to one domain (like faces). However, the images have to be similar (e.g. two faces, or two cars seen from the same perspective).
https://github.com/kallaballa/Poppy
There is also a web-version created with emscripten.
I generated the above animation using the following command line:
poppy flame.png glyph.png flame.png
Although this is an old question, since ImageMagick is mentioned: for anyone who comes here from Google, it may be worth looking at the ImageMagick plugin called shapemorph.
GIMP can't do that directly, but over the years a series of (now poorly maintained) third-party plug-ins to do it were released. The keyword to search for is "morph" - you should also find a bunch of stand-alone programs for this, from "gratis" to full-fledged Free Software such as xmorph.
Given pairs of vector files (.wmf extension), it is possible to use linear interpolation of shape nodes in Visual Basic for Applications to create frames for GIF animations, though this would take a long time to explain. For some examples see
http://www.giless.co.uk/animatorMorphGIFs.htm (it is like a slideshow)
I have made some improvements since then, as well!
I'm trying to make an OpenGL renderer that mashes various shapes into one large mesh and stores these in two VBOs, one GL_ARRAY_BUFFER and one GL_ELEMENT_ARRAY_BUFFER. I'm aiming for it to work on both OpenGL ES 2 and OpenGL 3.2 core. I am currently trying to find the best way to handle deleting shapes from within this mesh and my current approach is to periodically rebuild the entire thing, possibly on a background thread.
The problem is that in order to rebuild the new and clean mesh, I need access to the vertices / indices that have been written to the buffers using glMapBuffer. According to the documentation for GL_OES_mapbuffer, WRITE_ONLY_OES is the only acceptable parameter for 'access'.
So I don't think the data pointed to there can reliably be read from in order to create my new buffers. I know there are other functions in core GL that allow you to copy the buffer data, but these also seem to be missing in ES 2.0.
Can anyone verify that this is not possible on ES 2.0 or give some approach for achieving buffer reading? My current solution is to keep a shadow copy of all the data, which is obviously not ideal.
I think that keeping a shadow copy of the GPU data in main memory is much better than reading that data back from GPU memory. It is recommended to discard the previous data before using glMapBuffer anyway. Read this for more information (it will not give you a direct answer to your question, but it might be useful).
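For what it's worth, a minimal sketch of that shadow-copy approach on ES 2.0 (the Vertex layout, the usage hint and the names are assumptions): the CPU-side vector stays the authoritative copy and the VBO is only ever written.
#include <vector>
// #include <GLES2/gl2.h> on ES 2.0, or your usual GL loader on desktop

struct Vertex { float position[3]; float uv[2]; };   // illustrative layout

std::vector<Vertex> shadowVertices;   // rebuilt whenever shapes are removed

void uploadMesh(GLuint vbo)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);

    // Orphan the old storage first so the driver need not stall
    // if the previous frame is still reading it.
    glBufferData(GL_ARRAY_BUFFER,
                 shadowVertices.size() * sizeof(Vertex),
                 NULL, GL_DYNAMIC_DRAW);

    // Then upload the freshly rebuilt mesh from the CPU-side copy.
    glBufferSubData(GL_ARRAY_BUFFER, 0,
                    shadowVertices.size() * sizeof(Vertex),
                    shadowVertices.data());
}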
I am new to CoreAudio, and I would like to output a simple sine wave and square wave with a given frequency and amplitude through the speakers using CA. I don't want to use sound files as I want to synthesize the sound.
What do I need to do this? And can you give me an example or tutorial? Thanks.
There are a number of errors in the previous answer. I, the legendary :-) James McCartney, not James Harkins, wrote the sinewavedemo; I also wrote SuperCollider, which is what the audiosynth.com website is about. I also now work at Apple on CoreAudio. The sinewavedemo DOES use CoreAudio, since it uses AudioHardware.h from CoreAudio.framework as its way to play the sound.
You should not use the sinewavedemo. It is very old code and it makes dangerous assumptions about the buffer layout of the audio hardware. The easiest way nowadays to play a sound that you are generating is to use the AudioQueue, or to use an output audio unit with a render callback set.
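As an illustration only (not the sinewavedemo), here is a hedged sketch of such a render callback for an output audio unit; creating the unit, registering the callback with kAudioUnitProperty_SetRenderCallback, and a float, non-interleaved stream format are all assumed to be done elsewhere:
#include <AudioUnit/AudioUnit.h>
#include <math.h>

struct SineState { double phase; double frequency; double sampleRate; };

static OSStatus RenderSine(void *inRefCon,
                           AudioUnitRenderActionFlags *ioActionFlags,
                           const AudioTimeStamp *inTimeStamp,
                           UInt32 inBusNumber,
                           UInt32 inNumberFrames,
                           AudioBufferList *ioData)
{
    SineState *state = (SineState *)inRefCon;
    const double phaseIncrement = 2.0 * M_PI * state->frequency / state->sampleRate;

    for (UInt32 frame = 0; frame < inNumberFrames; ++frame) {
        const float sample = sinf((float)state->phase);   // amplitude 1.0
        state->phase += phaseIncrement;
        if (state->phase > 2.0 * M_PI) state->phase -= 2.0 * M_PI;

        // Write the same sample to every (non-interleaved) channel buffer.
        for (UInt32 buf = 0; buf < ioData->mNumberBuffers; ++buf)
            ((float *)ioData->mBuffers[buf].mData)[frame] = sample;
    }
    return noErr;
}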
The best and easiest way to do that without files is to prepare a single-cycle buffer containing one cycle of the wave (technically called a wavetable).
In the playback function called by the CoreAudio thread, fill the output buffer with samples read from the wave buffer.
Note, however, that you will quickly face two problems:
- For the sine wave, if the playback frequency is not an integer multiple of the desired sine frequency, you will probably need to implement an interpolator if you want good quality. Using only integer indices into the table will generate a significant level of harmonic noise.
- For the square wave, avoid simply filling an array with +1/-1 values. Such a signal is not bandlimited and will alias a lot. Do not forget that the spectrum of a square wave is virtually infinite!
To find good algorithms for signal generation, take a look at musicdsp.org; it's probably one of the best resources for that.
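To illustrate the interpolation point above, a small sketch of a wavetable read with linear interpolation (the names and the way state is passed around are assumptions):
// Returns the next sample from a single-cycle table, advancing 'phase' by
// tableSize * frequency / sampleRate table positions per output sample.
float nextWavetableSample(const float *table, int tableSize,
                          double &phase, double frequency, double sampleRate)
{
    const int    i0   = (int)phase;
    const int    i1   = (i0 + 1) % tableSize;       // wrap at the table end
    const double frac = phase - i0;

    // Linear interpolation between neighbouring entries avoids the harmonic
    // noise you get from plain integer indexing.
    const float sample = (float)((1.0 - frac) * table[i0] + frac * table[i1]);

    phase += (double)tableSize * frequency / sampleRate;
    if (phase >= tableSize) phase -= tableSize;
    return sample;
}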
Are you new to audio programming in general? As a starting point, I would check out
http://www.audiosynth.com/sinewavedemo.html
This is a minimal OS X sine wave implementation by the legendary James Harkins. Note that it doesn't use CoreAudio at all.
If you specifically want to use CoreAudio for your sine wave, you need to create an output unit (RemoteIO on the iPhone, AUHAL on OS X) and supply an input callback, where you can pretty much use the code from the above example. Check out
http://developer.apple.com/mac/library/technotes/tn2002/tn2091.html
The chief benefits of CoreAudio are: chaining other effects with your sine wave, writing plugins for hosts like Logic and providing the interfaces for them, or writing a host (like Logic) for plugins that can be chained together.
If you don't want to write a plugin or host plugins, then CoreAudio might not actually be for you. But one of the best things about using CoreAudio is that once you get your sine wave callback working, it is easy to add effects or mix multiple sines together.
To do this you need to put your output unit in a graph, to which you can add effects, mixers, etc.
Here is some help on setting up graphs http://timbolstad.com/2010/03/16/core-audio-getting-started-pt2/
It isn't as difficult as it looks. Apple provides C++ helper classes for many things (/Developer/Examples/CoreAudio/PublicUtility) and even if you don't want to use C++ (you don't have to!) they can be a useful guide to the CoreAudio API.
If you are not doing this in real time, using the sin() function from math.h is not a bad idea. Just fill however many samples you need with sin() beforehand, and when it is time to play them, send them to the audio buffer. sin() can be quite slow when called once per sample in real time; an interpolated wavetable lookup is much faster, but the resulting sound will not be as spectrally pure.
There is a good and well documented sine wave player code example in Chapter 7 of the Adamson/Avila "Learning Core Audio" book, published by Addison-Wesley Professional (ISBN-10: 0-321-63684-8 ):
http://www.informit.com/store/learning-core-audio-a-hands-on-guide-to-audio-programming-9780321636843
It is a rather new publication (2012) and addresses precisely the issue of this question. It's only a starting point, but it's a valuable starting point.
BTW, don't jump to graphs before you have this basic lesson (which involves some math) behind you.
Concerning example code: a quick and efficient method I often use is a pre-filled sine wave lookup table with as many entries as the sample rate; for 44100 Hz the table has a size of 44100. In other words, the cycle length equals the sample rate. This gives an acceptable trade-off between speed and quality in many cases. You can initialize it at program start.
If you generate floating-point samples (which is the default on OS X) and use math functions, use sinf() rather than (float)sin(). Promotions in the inner loop of a render callback are always expensive, as are repeated multiplications of constants such as 2.0*M_PI, which can too often be found in code examples.
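To make those last two points concrete, here is a small sketch (names are illustrative) of a table whose length equals the sample rate, filled once with sinf() at startup so that no trig calls and no repeated 2.0*M_PI products appear in the render loop:
#include <math.h>

enum { kTableSize = 44100 };                       // == sample rate, one full cycle
static float       gSineTable[kTableSize];
static const float kTwoPi = 2.0f * (float)M_PI;    // hoisted out of all loops

void initSineTable(void)
{
    for (int i = 0; i < kTableSize; ++i)
        gSineTable[i] = sinf(kTwoPi * (float)i / (float)kTableSize);
}

// In the render callback, a tone of integer frequency f Hz is then simply
//   sample = gSineTable[(index += f) % kTableSize];
// per output sample, with no sin()/sinf() calls and no promotions to double.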