The Kinect sensor raises many events per second, and if you cannot process them quickly enough (for example, when trying to animate a full 3D character), you fall behind within a few frames.
What is the best approach to handling only a reasonable number of events without blocking the user interface?
Thanks.
I would suggest requesting frames in a loop instead of using the event model.
To do this, in your animation loop just call:
sensor.DepthStream.OpenNextFrame(millisecondsWait);
Or:
sensor.SkeletonStream.OpenNextFrame(millisecondsWait);
Or:
sensor.ColorStream.OpenNextFrame(millisecondsWait);
Event-driven programming is great, but when you run into problems like the one you describe it is better to just call these functions when you need the data.
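A rough sketch of that polling pattern for the skeleton stream (the 15 ms timeout and the UpdateCharacter method are placeholders, not part of the SDK):

// Poll once per render frame instead of subscribing to SkeletonFrameReady.
void OnRenderFrame(KinectSensor sensor)
{
    // Wait at most a few milliseconds; returns null if no new frame arrived.
    using (SkeletonFrame frame = sensor.SkeletonStream.OpenNextFrame(15))
    {
        if (frame != null)
        {
            Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];
            frame.CopySkeletonDataTo(skeletons);
            UpdateCharacter(skeletons);   // your animation update
        }
    }
}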
I'd say that if you're animating something really quick and elaborate (e.g. a complex 60 fps 3D scene), the time it takes to fetch a frame from the camera synchronously might create hitches in your rendering.
I'd try splitting the rendering and the Kinect processing/polling into separate threads; with that approach you could even keep using the 30 fps event-driven model.
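If you go the separate-thread route, one possible sketch (the running flag and UpdateFromSkeletons are placeholders) is to poll on a worker thread and only publish the latest result for the renderer to pick up:

Skeleton[] latestSkeletons;          // read by the render thread each frame
volatile bool running = true;

// Worker thread: poll the sensor and publish the newest skeleton data.
void KinectWorker(KinectSensor sensor)
{
    while (running)
    {
        using (SkeletonFrame frame = sensor.SkeletonStream.OpenNextFrame(100))
        {
            if (frame == null) continue;
            var skeletons = new Skeleton[frame.SkeletonArrayLength];
            frame.CopySkeletonDataTo(skeletons);
            System.Threading.Interlocked.Exchange(ref latestSkeletons, skeletons);
        }
    }
}

// Render thread: use whatever was published last, never waiting on the sensor.
void OnRenderFrame()
{
    Skeleton[] skeletons = latestSkeletons;
    if (skeletons != null) UpdateFromSkeletons(skeletons);
}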
I'm trying to use WebGL to speed up computations in a simulation of a small quantum circuit, like what the Quantum Computing Playground does. The problem I'm running into is that readPixels takes ~10ms, but I want to call it several times per frame while animating in order to get information out of gpu-land and into javascript-land.
As an example, here's my exact use case. The following circuit animation was created by computing things about the state between each column of gates, in order to show the inline-with-the-wire probability-of-being-on graphing:
The way I'm computing those things now, I'd need to call readPixels eight times for the above circuit (once after each column of gates). This is waaaaay too slow at the moment, easily taking 50ms when I profile it (bleh).
What are some tricks for speeding up readPixels in this kind of use case?
Are there configuration options that significantly affect the speed of readPixels? (e.g. the pixel format, the size, not having a depth buffer)
Should I try to make the readPixel calls all happen at once, after all the render calls have been made (maybe allows some pipelining)?
Should I try to aggregate all the textures I'm reading into a single megatexture and sort things out after a single big read?
Should I be using a different method to get the information back out of the textures?
Should I be avoiding getting the information out at all, and doing all the layout and rendering gpu-side (urgh...)?
Should I try to make the readPixel calls all happen at once, after all the render calls have been made (maybe allows some pipelining)?
Yes, yes, yes. readPixels is fundamentally a blocking, pipeline-stalling operation, and it is always going to kill your performance wherever it happens, because it's sending a request for data to the GPU and then waiting for it to respond, which normal draw calls don't have to do.
Do readPixels as few times as you can (use a single combined buffer to read from). Do it as late as you can. Everything else hardly matters.
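As an illustration of that advice (the framebuffer layout, sizes, and drawColumnState are placeholders for your own rendering code): draw every column's result side by side into one buffer, then pay the readback stall exactly once.

// Render all per-column results into a single framebuffer first...
for (var col = 0; col < columnCount; col++) {
  gl.viewport(col * colWidth, 0, colWidth, colHeight);
  drawColumnState(col);                      // your existing per-column render pass
}
// ...then read everything back with one blocking call.
var pixels = new Uint8Array(columnCount * colWidth * colHeight * 4);
gl.readPixels(0, 0, columnCount * colWidth, colHeight,
              gl.RGBA, gl.UNSIGNED_BYTE, pixels);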
Should I be avoiding getting the information out at all, and doing all the layout and rendering gpu-side (urgh...)?
This will get you immensely better performance.
If your graphics are all like you show above, you shouldn't need to do any “layout” at all (which is good, because it would be very awkward to implement). Everything but the text is some kind of color or boundary animation which could easily be done in a shader, and all the layout can be just a static vertex buffer (each vertex has attributes that point at the simulation-state texel it should depend on).
The text will be more tedious merely because you need to load all the digits into a texture to use as a spritesheet and do the lookups into that, but that's a standard technique. (Oh, and divide/modulo to get the digits.)
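As a purely illustrative fragment-shader sketch of that digit trick (the uniform and varying names are invented; assumes the glyphs '0'..'9' sit left to right in one texture):

precision mediump float;
uniform sampler2D uDigits;     // '0'..'9' spritesheet, 10 glyphs wide
uniform float uDigitIndex;     // 0.0 = ones place, 1.0 = tens place, ...
varying vec2 vCellUv;          // 0..1 within this glyph cell
varying float vValue;          // e.g. probability * 100, fetched from the state texture

void main() {
    float digit = mod(floor(vValue / pow(10.0, uDigitIndex)), 10.0);
    vec2 uv = vec2((digit + vCellUv.x) / 10.0, vCellUv.y);
    gl_FragColor = texture2D(uDigits, uv);
}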
I don't know enough about your use case, but just guessing: why do you need to call readPixels at all?
First, you don't need to draw text or the static parts of your diagram in WebGL. Put another canvas, svg, or img over the WebGL canvas and set the CSS so they overlap. Let the browser composite them; then you don't have to do it.
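A minimal sketch of that layering (ids and sizes are arbitrary):

<div style="position: relative;">
  <canvas id="gl"      width="800" height="400"></canvas>
  <!-- text/static diagram drawn here with 2D canvas or svg; the browser composites it -->
  <canvas id="overlay" width="800" height="400"
          style="position: absolute; left: 0; top: 0; pointer-events: none;"></canvas>
</div>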
Second, let's assume you have a texture that has your computed results in it. Can't you just make some geometry that matches the places in your diagram that need to have colors, and use texture coordinates to look up the results from the correct places in the results texture? Then you don't need to call readPixels at all. That shader can use a ramp-texture lookup or any other technique to convert the results to other colors to shade the animated parts of your diagram.
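For example, a hedged fragment-shader sketch of that lookup (uniform and varying names are invented):

precision mediump float;
uniform sampler2D uResults;    // texture your simulation rendered into
uniform sampler2D uRamp;       // 1 x N gradient used as a color map
varying vec2 vResultCoord;     // texcoord baked into the diagram geometry

void main() {
    float value = texture2D(uResults, vResultCoord).r;
    gl_FragColor = texture2D(uRamp, vec2(value, 0.5));
}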
If you want to draw numbers based on the result, you can use a similar technique: make a shader that reads a result value from the results texture and then indexes glyphs in another texture based on that value.
Am I making any sense?
I’m writing an interactive application using wxPython and matplotlib. I have a single graph embedded in the window as a FigureCanvasWxAgg instance. I’m using this graph to display some large data sets, between 64,000 and 512,000 data points, so matplotlib’s rendering takes a while. New data can arrive every 1–2 seconds so rendering speed is important to me.
Right now I have an update_graph_display method that does all of the work of updating the graph. It handles updating the actual data as well as things like changing the y axis scale from linear to logarithmic in response to a user action. All in all, this method calls quite a few methods on my axes instance: set_xlim, set_ylabel, plot, annotate, and a handful of others.
The update_graph_display method is wrapped in a decorator that forces it to run on the main thread in order to prevent the UI from being modified from multiple threads simultaneously. The problem is that all of this graph computation and drawing takes a while, and since all of the work happens on the main thread the application is unresponsive for noticeable periods of time.
To what extent can the computation of the graph contents be done on some other thread? Can I call set_xlim, plot, and friends on a background thread, deferring just the final canvas.draw() call to the main thread? Or are there some axes methods which will themselves force the graph to redraw itself?
I will reproduce @tcaswell's comment:
No Axes methods should force a re-draw (and if you find any that do please report it as a bug), but I don't know enough about threading to tell you they will be safe. You might get some traction using blitting and/or re-using artists as much as possible (via set_data calls), but you will have to write the logic to manage that yourself. Take a look at how the animation code works.
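Building on that, here is one hedged way to split the work: do the heavy data preparation on a background thread and defer every matplotlib/wx call to the main loop with wx.CallAfter. Everything except the matplotlib and wx calls (decimate, self.line, self.axes, self.canvas) is a placeholder for your own code.

import threading
import wx

class GraphPanel:
    def on_new_data(self, raw_data):
        # Heavy number crunching happens off the main thread...
        threading.Thread(target=self._prepare, args=(raw_data,), daemon=True).start()

    def _prepare(self, raw_data):
        x, y = decimate(raw_data)           # placeholder for the expensive part
        wx.CallAfter(self._apply, x, y)     # ...and only the UI update is deferred

    def _apply(self, x, y):
        self.line.set_data(x, y)            # reuse the existing artist instead of re-plotting
        self.axes.relim()
        self.axes.autoscale_view()
        self.canvas.draw()                  # FigureCanvasWxAgg redraw, main thread only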
I need to get access to the actual camera on the iPhone and take a 'snapshot' of it every .X seconds.
On each update, I'll run image processing on it (a modified version of OpenCVCircles) and if conditions are met, exit out of the camera while taking certain actions. If conditions are not met, then I'll simply stay in the camera. The conditions will be a custom configuration of a series of circles that the user has to look at through the camera.
I know that this could be easy by forcing the user to take a picture, and I grab that from the UIImagePicker. However, I think it would be much better to do it automatically for the user, if they are in view of the image.
Is it possible to do this without completely writing my own camera with the AVCapture classes?
Look into AVCaptureSession, AVCaptureDevice, and similar classes. They should provide what you need; we are using them here to provide a live video feed over the network.
Edit
From your question edit, I see this answer does not apply directly to your use case. Yet it is, to my knowledge, the only means that will allow you to accomplish what you seek.
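For reference, a hedged sketch of that setup (property and method names such as session and processFrame: are placeholders; error handling omitted):

// Set up an AVCaptureSession whose video output calls us back for every frame.
- (void)startCapture
{
    self.session = [[AVCaptureSession alloc] init];
    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
    [self.session addInput:input];

    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    [output setSampleBufferDelegate:self
                              queue:dispatch_queue_create("camera.frames", NULL)];
    [self.session addOutput:output];
    [self.session startRunning];
}

// Called for every captured frame; throttle here to your "every X seconds" rate,
// then hand the pixel buffer to the OpenCV circle detection.
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef frame = CMSampleBufferGetImageBuffer(sampleBuffer);
    [self processFrame:frame];   // placeholder for your processing
}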
I am setting out for a visualization project that will generate 1000+ sprites from dynamic data. The toolkit I am using (Flare) requires some optimization. I am trying to figure out some optimization techniques for Flash. How can I make Flash run fast when there are so many sprites on the stage, or maybe there is an optimization technique that doesn't involve generating so many sprites?
One good approach is to freeze animations that are not visible to the user. The complication is that you need to remember the state from which each animation has to resume, or recompute it from the current state of the whole application. Since you are generating so many sprites, make sure you group them logically; that makes the freezing logic much easier to implement.
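Purely as an illustration of the idea with plain MovieClips (Flare's own sprite classes would need their own freeze hooks; the names here are invented):

import flash.display.MovieClip;
import flash.geom.Rectangle;

// Freeze clips that fall outside the viewport; a MovieClip keeps its currentFrame,
// so play() later resumes exactly where the clip was stopped.
function freezeOffscreen(clips:Vector.<MovieClip>, viewport:Rectangle):void {
    for each (var clip:MovieClip in clips) {
        var onScreen:Boolean = viewport.intersects(clip.getBounds(clip.stage));
        clip.visible = onScreen;             // hidden clips are not rendered at all
        if (onScreen) clip.play(); else clip.stop();
    }
}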
I am new to CoreAudio, and I would like to output a simple sine wave and square wave with a given frequency and amplitude through the speakers using CA. I don't want to use sound files as I want to synthesize the sound.
What do I need to do this? And can you give me an example or tutorial? Thanks.
There are a number of errors in another answer here (the one recommending the sinewavedemo). I, the legendary :-) James McCartney, not James Harkins, wrote the sinewavedemo. I also wrote SuperCollider, which is what the audiosynth.com website is about. I also now work at Apple on CoreAudio. The sinewavedemo DOES use CoreAudio, since it uses AudioHardware.h from CoreAudio.framework as its way to play the sound.
You should not use the sinewavedemo. It is very old code and it makes dangerous assumptions about the buffer layout of the audio hardware. The easiest way nowadays to play a sound that you are generating is to use the AudioQueue, or to use an output audio unit with a render callback set.
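A minimal sketch of the render-callback route (assumes a default output audio unit has already been created and its stream format is mono 32-bit float; the frequency, amplitude, and sample rate are hard-coded only for illustration):

#include <AudioUnit/AudioUnit.h>
#include <math.h>

// Render callback: writes a sine wave into the output buffer on every pull.
static OSStatus RenderSine(void *inRefCon,
                           AudioUnitRenderActionFlags *ioActionFlags,
                           const AudioTimeStamp *inTimeStamp,
                           UInt32 inBusNumber,
                           UInt32 inNumberFrames,
                           AudioBufferList *ioData)
{
    static double phase = 0.0;
    const double freq = 440.0, sampleRate = 44100.0;
    const double phaseIncrement = 2.0 * M_PI * freq / sampleRate;

    float *out = (float *)ioData->mBuffers[0].mData;
    for (UInt32 i = 0; i < inNumberFrames; i++) {
        out[i] = 0.25f * sinf((float)phase);      // 0.25 = amplitude
        phase += phaseIncrement;
        if (phase > 2.0 * M_PI) phase -= 2.0 * M_PI;
    }
    return noErr;
}

// Attach it with something like:
//   AURenderCallbackStruct cb = { RenderSine, NULL };
//   AudioUnitSetProperty(outputUnit, kAudioUnitProperty_SetRenderCallback,
//                        kAudioUnitScope_Input, 0, &cb, sizeof(cb));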
The best and easiest way to do that without files is to prepare a single-cycle buffer containing one cycle of the wave (technically this is called a wavetable).
In the playback function called by the CoreAudio thread, fill the output buffer with samples read from the wave buffer.
Note, however, that you will face two problems very quickly:
- For the sine wave, if the playback frequency is not an integer multiple of the desired sine frequency, you will probably need to implement an interpolator if you want good quality. Using only an integer table index will generate a significant amount of harmonic noise.
- For the square wave, avoid simply filling an array with +1/-1 values. Such a signal is not bandlimited and will alias a lot. Do not forget that the spectrum of a square wave is virtually infinite!
To get good algorithms for signal generation, take a look at musicdsp.org; it's probably one of the best resources for that.
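A sketch of that wavetable approach with linear interpolation (the table size, frequency handling, and function names are illustrative; the fill function would be called from your render callback):

#include <math.h>

#define TABLE_SIZE 2048
static float gSine[TABLE_SIZE];            /* one cycle, filled once at startup */

void InitWavetable(void)
{
    const float twoPi = 2.0f * (float)M_PI;
    for (int i = 0; i < TABLE_SIZE; i++)
        gSine[i] = sinf(twoPi * i / TABLE_SIZE);
}

/* Fill `out` with `frames` samples; `phase` is carried across calls. */
void FillFromTable(float *out, int frames, double *phase, double freq, double sampleRate)
{
    double step = freq * TABLE_SIZE / sampleRate;    /* table positions per sample */
    for (int i = 0; i < frames; i++) {
        int    idx  = (int)*phase;
        double frac = *phase - idx;
        float  a = gSine[idx];
        float  b = gSine[(idx + 1) % TABLE_SIZE];
        out[i] = (float)(a + frac * (b - a));        /* linear interpolation */
        *phase += step;
        if (*phase >= TABLE_SIZE) *phase -= TABLE_SIZE;
    }
}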
Are you new to audio programming in general? As a starting point I would check out
http://www.audiosynth.com/sinewavedemo.html
This is a minimal OS X sine wave implementation by the legendary James Harkins. Note, it doesn't use CoreAudio at all.
If you specifically want to use CoreAudio for your sine wave you need to create an output unit (RemoteIO on the iPhone, AUHAL on OS X) and supply an input callback, where you can pretty much use the code from the above example. Check out
http://developer.apple.com/mac/library/technotes/tn2002/tn2091.html
The benefits of CoreAudio are chiefly that you can chain other effects with your sine wave, write plugins for hosts like Logic and provide the interfaces for them, or write a host (like Logic) for plugins that can be chained together.
If you don't want to write a plugin or host plugins, then CoreAudio might not actually be for you. But one of the best things about using CoreAudio is that once you get your sine wave callback working, it is easy to add effects or mix multiple sines together.
To do this you need to put your output unit in a graph, to which you can add effects, mixers, etc.
Here is some help on setting up graphs: http://timbolstad.com/2010/03/16/core-audio-getting-started-pt2/
It isn't as difficult as it looks. Apple provides C++ helper classes for many things (/Developer/Examples/CoreAudio/PublicUtility) and even if you don't want to use C++ (you don't have to!) they can be a useful guide to the CoreAudio API.
If you are not doing this in real time, using the sin() function from math.h is not a bad idea. Just fill however many samples you need with sin() beforehand, and when it is time to play them, send them to the audio buffer. If you are doing this in real time, sin() can be quite slow to call once per sample; an interpolated wavetable lookup is much faster, but the resulting sound will not be as spectrally pure.
There is a good and well documented sine wave player code example in Chapter 7 of the Adamson/Avila "Learning Core Audio" book, published by Addison-Wesley Professional (ISBN-10: 0-321-63684-8):
http://www.informit.com/store/learning-core-audio-a-hands-on-guide-to-audio-programming-9780321636843
It is a rather new publication (2012) and addresses precisely the issue of this question. It's only a starting point, but it's a valuable starting point.
BTW: don't jump to graphs before you have this basic lesson (which involves some math) behind you.
Concerning example code, a quick and efficient method I often use is a pre-filled sine-wave lookup table with as many entries as the sample rate; for 44100 Hz the table has a size of 44100. In other words, the cycle length equals the sample rate. This gives an acceptable trade-off between speed and quality in many cases. You can initialize it when the program starts.
If you generate floating-point samples (which is the default in OS X) and use math functions, use sinf() rather than (float)sin(). Promotions in the inner loop of a render callback are always expensive. So are repeated multiplications of constants, such as 2.0*M_PI, which can too often be found in code examples.
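For example, such a table can be initialized once at startup, with the 2*pi constant hoisted out of the loop and sinf() used throughout (the function name is arbitrary):

#include <math.h>
#include <stdlib.h>

/* One full sine cycle spread over sampleRate entries, as described above. */
float *MakeSineTable(int sampleRate)
{
    float *table = malloc(sizeof(float) * sampleRate);
    const float step = 2.0f * (float)M_PI / (float)sampleRate;   /* computed once */
    for (int i = 0; i < sampleRate; i++)
        table[i] = sinf(step * (float)i);
    return table;
}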