Update: For anyone who stumbles upon this, it seems like SceneKit has a threshold for the maximum number of objects it can render. Using [SCNNode flattenedClone] is a great way to increase the number of objects it can handle. As @Hal suggested, I'll file a bug report with Apple describing the performance issues discussed below.
I'm somewhat new to iOS and I'm currently working on my first OS X project for a class. I'm essentially creating random geometric graphs (random points in space connected to one another if the distance between them is ≤ a given radius) and I'm using SceneKit to display the results. I already know I'm pushing SceneKit to its limits, but if the number of objects I'm trying to graph is too large, the whole thing just crashes and I don't know how to interpret the results.
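The connection rule described above (two points are linked when their distance is at most the radius) can be sketched in plain C. This is only an illustration under assumptions: the Point3 type and the count_edges name are hypothetical and not taken from the project, and the brute-force pair loop is just the simplest way to express the rule.

```c
#include <assert.h>
#include <stddef.h>

/* A point in 3D space (hypothetical type for this sketch). */
typedef struct { double x, y, z; } Point3;

/* Counts the pairs (i, j) with i < j whose distance is <= radius.
   Squared distances are compared, so no sqrt call is needed. */
size_t count_edges(const Point3 *pts, size_t n, double radius)
{
    assert(pts != NULL || n == 0);
    size_t edges = 0;
    double r2 = radius * radius;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double dx = pts[i].x - pts[j].x;
            double dy = pts[i].y - pts[j].y;
            double dz = pts[i].z - pts[j].z;
            if (dx * dx + dy * dy + dz * dz <= r2) {
                edges++;
            }
        }
    }
    return edges;
}
```

Counting the edges before building any geometry gives a quick estimate of how many connection nodes a given radius would ask the renderer to handle.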
My SceneKit scene consists of the default camera, 2 lighting nodes, approximately 5,000 SCNSpheres, each within an SCNNode (the nodes of the graph), and about 50,0000 connections of type SCNGeometryPrimitiveTypeLine, which are also within SCNNodes. All of these nodes are then added to one large node, which is added to my scene.
The code works for smaller numbers of spheres and connections.
When I run my app with these specifications, everything seems to work fine, then 5-10 seconds after executing the following lines:
dispatch_async(dispatch_get_main_queue(), ^{
[self.graphSceneView.scene.rootNode addChildNode:graphNodes];
});
the app crashes with this resulting screen: [screenshot not included].
Given that I'm sort of new to Xcode and used to more verbose output upon crashing, I'm in a bit over my head. What can I do to get more information on this crash?
That's terse output for sure. You can attack it by simplifying until you don't see the crash anymore.
First, do you ever see anything on screen?
Second, your call to
dispatch_async(dispatch_get_main_queue(), ^{
[self.graphSceneView.scene.rootNode addChildNode:graphNodes];
});
still runs on the main queue, so I would expect it to make no difference in perceived speed or responsiveness. So take addChildNode: out of the GCD block and invoke it directly. Does that make a difference? At the least, you'll see your crash immediately, and might get a better stack trace.
Third, creating your own geometry from primitives like SCNGeometryPrimitiveTypeLine is trickier than using the SCNGeometry subclasses. Memory mismanagement in that step could trigger mysterious crashes. What happens if you remove those connection lines? What happens if you replace them with long, skinny SCNBox instances? You might end up using SCNBox by choice because it's easier to style in SceneKit than a primitive line.
Fourth, take a look at @rickster's answer to this question about optimization: SceneKit on OS X with thousands of objects. It sounds like your project would benefit from node flattening (flattenedClone), and possibly the use of SCNLevelOfDetail. But these suggestions fall into the category of premature optimization, the root of all evil.
It would be interesting to hear what results from toggling between the Metal and OpenGL renderers. That's a setting on the SCNView in IB ("preferred renderer" I think), and an optional entry in Info.plist.
I have an image of about 7000x6000px. I need this to be in a scrollview/imageView in my app, but it is way too huge for display. It is supposed to be a kind of map. I was hoping to keep the size of the app to a minimum, and the image is just about 13 MB as a .jpg. As a .png it is over 100 MB, which is unacceptable. Many have suggested CATiledLayer as an option, but I believe this would result in even bigger file sizes. Anyway, I tried to do it with CATiledLayer and created my own tiles in TileCutter (tiles in .jpg), and the size wasn't too bad. But I am getting errors all over the place. The iOS version of CATiledLayer is a mystery to me, and I can't find a way to solve this. I get an error that is the equivalent of Java's "array index out of bounds", even though the array has content at that specific index.
It has a method which returns an array. The array contains data of a .plist. Before the return I log out the content of the array, giving me good data. The call is trying to access
[array objectAtIndex:0]
and put it in a dictionary, but throws OutOfBounds. When I log the whole array, I can clearly see the content, but when I log
NSLog(@"%@", [array objectAtIndex:0]); I get the same exception.
Anyway, CATiledLayer has given me nothing but problems. I have been reverse-engineering the PhotoScroller project with no luck. Anyone have any other solutions?
Thanks.
Apple has this really neat project, PhotoScroller, that uses CATiledLayer and lets you scroll through several images and zoom them. This seemed really neat until I found out that Apple "cheated" and pre-tiled the images (about 800 tiles saved as files in the bundle!)
I had need for a similar capability, but had to download the images from the network. Thus came about PhotoScrollerNetwork. With the TiledImageBuilder you can download (or read from disk) massive images - I even tested an 18000x18000 image - and it works.
What this class does is start tiling the image as it downloads (when using libjpegturbo) or can save the file then tile (takes longer). The class figures out how many levels of detail are needed to show the image at full resolution and sized to fit in the containing view (a scrollview).
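As a rough illustration of how a level-of-detail count could be worked out, here is a minimal C sketch. It assumes each level halves both dimensions and that levels are added until the image fits inside the view; this is an assumption made for the example, not the actual TiledImageBuilder code, and levels_of_detail is a hypothetical name.

```c
#include <assert.h>

/* Returns the number of detail levels: level 1 is full resolution, and each
   further level halves both dimensions (rounding up) until the image fits
   inside the view. */
unsigned levels_of_detail(unsigned imgW, unsigned imgH,
                          unsigned viewW, unsigned viewH)
{
    assert(viewW > 0 && viewH > 0);
    unsigned levels = 1;              /* the full-resolution level */
    while (imgW > viewW || imgH > viewH) {
        imgW = (imgW + 1) / 2;        /* halve, rounding up */
        imgH = (imgH + 1) / 2;
        levels++;
    }
    return levels;
}
```

Under these assumptions, an 18000x18000 image shown in a 1024x768 view would need 6 levels (18000, 9000, 4500, 2250, 1125, 563 pixels per side).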
The class uses the disk cache to hold the tiles, but uses an old Unix trick of creating a file, opening it, then unlinking it, so that the tiles never really get saved. Once the class is dealloced (and the file descriptor closed), the tiles are freed; they are also freed if your app crashes.
Someone had problems on an iPad 1 with its quite limited memory, so the class now throttles its use of the file system when concurrently loading multiple images. I had a discussion with the iOS kernel manager at WWDC this year, and after explaining the throttling technique to him, he said the algorithm (on managing the amount of disk cache usage) was probably the best technique that could be used.
I think those who suggested CATiledLayer are right. You should really use it! If you need a sample project that displays a huge bitmap using that technology, look here: http://www.cimgf.com/2011/03/01/subduing-catiledlayer/
Many technologies we use as Cocoa/Cocoa Touch developers stand
untouched by the faint of heart because often we simply don’t
understand them and employing them can seem a daunting task. One of
those technologies is found in Core Animation and is referred to as
the CATiledLayer. It seems like a magical sort of technology because
so much of its implementation is a bit of a black box and this fact
contributes to it being misunderstood. CATiledLayer simply provides a
way to draw very large images without incurring a severe memory hit.
This is important no matter where you’re deploying, but it especially
matters on iOS devices as memory is precious and when the OS tells you
to free up memory, you better be able to do so or your app will be
brought down. This blog post is intended to demonstrate that
CATiledLayer works as advertised and implementing it is not as hard as
it may have once seemed.
In my upcoming iPhone game different scene elements are split up into their own CCNode.
My Obstacle node contains many nodes, each representing an obstacle. Inside every obstacle node are the images that make up the obstacle (1 - 4 images), and there are only ~10 obstacles at a time. Every update my game calls the update function in the Obstacle node, which moves every obstacle to the left. But this slows down my game quite a bit.
At the same time, I have a particle node that just contains images and moves them all every frame exactly the same way the Obstacle node does, but it has no noticeable effect on performance. But it has hundreds of images at a time.
My question is why do the obstacles slow it down so much but the particles don't? I have even tried replacing the images used in the obstacles with the ones in the particles and it makes no (noticeable) difference. Would it be that there is another level of child nodes?
You will dramatically increase the app's performance, run speed, frame rate and more if you put all your images in a texture atlas and render them once as a batch using the CCSpriteBatchNode class. If you are moving lots of objects around on the screen a lot, this makes the hardware work a lot less.
Using this class is easy. Create the class with a texture atlas that contains all your images, and then add this class as a child to your layer, just as you would a sprite.
However, when you create sprites, add them as children to this batch node, not as children to the layer.
It's very easy and will probably help you quite a lot here.
From what I recall of the Cocos2d documentation, particles are intended to be VERY lightweight so you can have many, many of them on screen at once. Nodes are heavier and require more processing between frames, as they interact with the physics system and require node-specific rendering. The last time I looked at the render loop code, it was basically O(n) in the number of CCNodes you had in a scene. Using NSTimers versus Cocos' built-in run loop also makes quite a bit of difference in performance.
Could you provide an example of something that slows down a lot? Exactly how do you update these Obstacles?
The cocos2d documentation has some best practices that all, in one way or another, touch on performance. There's a LOT you can do to optimize your frames per second.
In general, when your code is slow, it helps to use Instruments.app to figure out where your code is spending so much time. Since you're using a framework this will be less helpful because you'll end up finding out what functions your code spends a lot of time in, and then figure out how to reduce that via the framework's best practices or other optimizations. There are a few good blog posts on improving performance, I found this one very helpful.
Good day, I'm new to cocos2d, objective-c and stack overflow.
I would like to know if it's possible to share a texture atlas instance among multiple tiled maps. I'm working on a map system, which is supposed to be able to use a really, really huge map but since it needs to run on an iPhone, I have to slice that map into many small ones to be able to cull them so I have multiple CCTMXTiledMaps in my scene which get constantly allocated and deallocated.
This works fine but on every allocation of a tiled map there is a CCTextureAtlas generated which freezes the app during the loading time and uses up a lot of memory even if the tile graphics are everywhere the same.
I looked around but all tutorials just cover the case with only one tiled map. I also tried some awful hacking, but that just caused crashes. I think, I have to setup a tiled map instance manually (not with the loadFromFile function) so I have more control of the initialization but I have no clue of what I have to consider during that.
If you have loaded the textures before, the tilemaps shouldn't freeze the game significantly.
Summary
New to iPhone programming, I'm having trouble picking the right optimization strategy to filter a set of view components in a scrollview with huge content. In what area would my app gain the most performance?
Introduction
My current iPad app-in-progress lets users explore fairly large binary tree structures. The trees contain between 30 and 900 nodes, and when drawing inside a scrollview (with limited zoom) it looks like this.
The nodes' contents are stored in a SQLite backed Core Data model. It's a binary tree and if a node has children, there are always exactly two. The x and y positions are part of the model, as are the dimensions of the node connections, shown as dotted lines.
Optimization
Only about 50 nodes fit the screen at any given time. With the largest trees containing up to 900 nodes, it's not possible to put everything in a scrollview controlled and zooming UIView, that's a recipe for crashes. So I have to do per frame filtering of the nodes.
And that's where my troubles start. I don't have the experience to make a well-founded decision between the possible filtering options, and in addition I probably don't know about that really fast special magic buried deep in Objective-C or Cocoa Touch. Because the backing store is close to 200 MB in size (some 90,000 nodes in hundreds of trees), it's very time-consuming to test every single tree on the iPad device. Which is why I'd like to ask you guys for advice.
For all my attempts I'm putting a filter method in the scrollViewDidScroll: and scrollViewDidZoom:. I'm also blocking the main thread with the filter, because I can't show the content without the nodes anyway. But maybe someone has an idea in that area?
Because all the positioning is present in the Core Data model, I might use NSFetchRequest to do the filtering. Is that really fast though? I have the idea it's not a very optimized method.
From what I've tried, the faulted managed objects seem to fit in memory at once, but it might be tricky for the larger trees once their contents start firing faults. Is it a good idea to loop over the NSSet of nodes and see what items should be on screen?
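The "loop over the nodes and see what should be on screen" option can be sketched as a plain rectangle-intersection filter. This is a minimal C illustration, not code from the app: NodeRect, rects_intersect and visible_indices are hypothetical names, and the node frames are assumed to be already available as simple x/y/width/height values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Axis-aligned rectangle (hypothetical stand-in for a node's frame). */
typedef struct { double x, y, width, height; } NodeRect;

/* True when the two rectangles overlap with non-zero area. */
bool rects_intersect(NodeRect a, NodeRect b)
{
    return a.x < b.x + b.width  && b.x < a.x + a.width &&
           a.y < b.y + b.height && b.y < a.y + a.height;
}

/* Writes the indices of all frames overlapping `visible` into `out`
   (which must have room for n entries) and returns how many were written. */
size_t visible_indices(const NodeRect *frames, size_t n,
                       NodeRect visible, size_t *out)
{
    assert(out != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (rects_intersect(frames[i], visible)) {
            out[count++] = i;
        }
    }
    return count;
}
```

A linear pass like this costs one rectangle test per node, so for 900 nodes it may be cheap enough to run on every scroll event; whether it actually is would need measuring on the device.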
Are there other tricks to gain performance? Would you see ways where I could use multi threading to get the display set faster, even though the model's context was created on the main thread?
Thanks for your advice,
EP.
Ironically, your binary tree could be divided using Binary Space Partitioning in 2D, so rendering would be very fast, with the number of visibility checks close to the minimum necessary.
This is what happens:
The drawGL function is called at the exact end of the frame thanks to a usleep, as suggested. This already maintains a steady framerate.
The actual presentation of the renderbuffer takes place with drawGL(). Measuring the time it takes to do this, gives me fluctuating execution times, resulting in a stutter in my animation. This timer uses mach_absolute_time so it's extremely accurate.
At the end of my frame, I measure timeDifference. Yes, it's on average 1 millisecond, but it deviates a lot, ranging from 0.8 milliseconds to 1.2 with peaks of up to more than 2 milliseconds.
Example:
// Called at a fixed interval (every fraction of a second)
- (void)tick
{
    [self drawGL];
}

- (void)drawGL
{
    uint64_t startTime = mach_absolute_time();      // needs <mach/mach_time.h>
    glBindRenderbufferOES(GL_RENDERBUFFER_OES, viewRenderbuffer);
    [context presentRenderbuffer:GL_RENDERBUFFER_OES];
    uint64_t endTime = mach_absolute_time();
    uint64_t timeDifference = endTime - startTime;  // in Mach ticks, not yet nanoseconds
}
My understanding is that once the framebuffer has been created, presenting the renderbuffer should always take the same effort, regardless of the complexity of the frame? Is this true? And if not, how can I prevent this?
By the way, this is an example from an iPhone app, so we're talking OpenGL ES here, though I don't think it's a platform-specific problem. If it is, then what is going on? Shouldn't this not be happening? And again, if so, how can I prevent it?
The deviations you encounter may be caused by a lot of factors, including the OS scheduler kicking in and giving the CPU to another process, or similar issues. In fact, a normal human won't notice the difference between 1 and 2 ms render times. Motion pictures run at 25 fps, which means each frame is shown for roughly 40 ms, and it looks fluid to the human eye.
As for animation stuttering you should examine how you maintain constant animation speed. Most common approach I've seen looks roughly like this:
while (loop)
{
    // lastFrameTime: time (in seconds) it took the last frame to render
    timeSinceLastUpdate += lastFrameTime;
    if (timeSinceLastUpdate > (1.0 / DESIRED_UPDATES_PER_SECOND))
    {
        updateAnimation(timeSinceLastUpdate);
        timeSinceLastUpdate = 0;
    }
    // do the drawing
    presentScene();
}
Or you could just pass lastFrameTime to updateAnimation every frame and interpolate between animation states. The result will be even more fluid.
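A minimal C sketch of that second approach, assuming a one-dimensional animation state: the Sprite type, update_animation and lerp_state are hypothetical names made up for illustration. The idea is that movement is scaled by the elapsed time, so the animation speed stays the same even when individual frame times fluctuate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical animation state: a position moving at a fixed velocity
   (units per second). */
typedef struct { double position; double velocity; } Sprite;

/* Advances the sprite by the elapsed time dt (in seconds), so the distance
   covered per second does not depend on the frame rate. */
void update_animation(Sprite *s, double dt)
{
    assert(s != NULL && dt >= 0.0);
    s->position += s->velocity * dt;
}

/* Linear interpolation between two animation states, with t in [0, 1]. */
double lerp_state(double from, double to, double t)
{
    return from + (to - from) * t;
}
```

With a velocity of 60 units per second, a frame that took 0.5 s moves the sprite 30 units, and a frame that took 0.25 s moves it 15 units, so a slow frame covers proportionally more ground instead of slowing the animation down.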
If you're already using something like the above, maybe you should look for culprits in other parts of your render loop. In Direct3D the costly things were calls for drawing primitives and changing render states, so you might want to check around OpenGL analogues of those.
My favorite OpenGL expression of all times: "implementation specific". I think it applies here very well.
A quick search for mach_absolute_time results in this article: Link
Looks like precision of that timer on an iPhone is only 166.67 ns (and maybe worse).
While that may explain the large difference, it doesn't explain that there is a difference at all.
The three main reasons are probably:
Different execution paths during renderbuffer presentation. A lot can happen in 1ms and just because you call the same functions with the same parameters doesn't mean the exact same instructions are executed. This is especially true if other hardware is involved.
Interrupts/other processes, there is always something else going on that distracts the CPU. As far as I know the iPhone OS is not a real-time OS and so there's no guarantee that any operation will complete within a certain time limit (and even a real-time OS will have time variations).
If there are any other OpenGL calls still being processed by the GPU that might delay presentRenderbuffer. That's the easiest to test, just call glFinish() before getting the start time.
It is best not to rely on a high constant frame rate for a number of reasons, the most important being that the OS may do something in the background that slows things down. Better to sample a timer and work out how much time has passed each frame, this should ensure smooth animation.
Is it possible that the timer is not accurate to the sub ms level even though it is returning decimals 0.8->2.0?
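One way to reason about the timer's granularity is to convert raw mach_absolute_time ticks to nanoseconds. On a device, the numer/denom ratio comes from mach_timebase_info; in the sketch below it is passed in as plain parameters so the code runs anywhere. The ticks_to_nanos name is hypothetical, and the 500/3 ratio mentioned afterwards is only an illustrative value that yields the 166.67 ns per tick quoted in an earlier answer, not a measured one.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a tick count to nanoseconds using a timebase ratio of
   numer/denom nanoseconds per tick. Suitable for short intervals; the
   multiplication can overflow for very large tick counts. */
uint64_t ticks_to_nanos(uint64_t ticks, uint32_t numer, uint32_t denom)
{
    assert(denom != 0);
    return ticks * numer / denom;
}
```

With a ratio of 500/3, a 1 ms interval corresponds to 6000 ticks, so differences between 0.8 ms and 1.2 ms would be well above the resolution of such a timer.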