Cocos2d moving nodes is choppy - objective-c

In my upcoming iPhone game different scene elements are split up into their own CCNode.
My Obstacle node contains many nodes, each representing an obstacle. Inside every obstacle node are the images that make up the obstacle (1 - 4 images), and there are only ~10 obstacles at a time. Every update my game calls the update function in the Obstacle node, which moves every obstacle to the left. But this slows down my game quite a bit.
At the same time, I have a particle node that just contains images and moves them all every frame exactly the same way the Obstacle node does. It has no noticeable effect on performance, even though it holds hundreds of images at a time.
My question is why do the obstacles slow it down so much but the particles don't? I have even tried replacing the images used in the obstacles with the ones in the particles and it makes no (noticeable) difference. Would it be that there is another level of child nodes?

You can dramatically improve the app's frame rate if you put all your images in a texture atlas and render them as a single batch using the CCSpriteBatchNode class. If you are moving lots of objects around on the screen, this makes the hardware work a lot less, because the batch node draws all of its children in one draw call instead of one call per sprite.
Using this class is easy. Create a batch node from a texture atlas that contains all your images, and then add it as a child to your layer, just as you would a sprite.
However, when you create sprites, add them as children to this batch node, not as children to the layer.
It's very easy and will probably help you quite a lot here.

From what I recall of the Cocos2d documentation, particles are intended to be VERY lightweight so you can have many, many of them on screen at once. Nodes are heavier: they require more processing between frames as they interact with the physics system, and they require node-specific rendering. The last time I looked at the render loop code, it was basically O(n) in the number of CCNodes you had in a scene. Using NSTimers versus Cocos' built-in run loop also makes quite a bit of difference in performance.
Could you provide an example of something that slows down a lot? Exactly how do you update these Obstacles?
The cocos2d documentation has some best practices that all, in one way or another, touch on performance. There's a LOT you can do to optimize your frames per second.
In general, when your code is slow, it helps to use Instruments.app to figure out where your code is spending so much time. Since you're using a framework this will be less direct: you'll end up finding out which functions your code spends a lot of time in, and then have to figure out how to reduce that via the framework's best practices or other optimizations. There are a few good blog posts on improving performance; I found this one very helpful.

Related

How to reduce DrawCount in UE4 project? any optimizing professional?

I have a big project to optimize a lot of buildings, trees, and assets. I have a very high BasePass, PrePass, ShadowDepth, and Translucency. See the Image ScreenShot
Any Advice?
Ryzen 7 4800H + RTX 2060 + 16GB RAM
If we're going to reduce draw calls, we're talking about making the engine render fewer objects at once with fewer materials.
Your go-to methods for this are:
-Setting up HLODs to combine distant meshes
https://docs.unrealengine.com/4.26/en-US/BuildingWorlds/HLOD/Overview/
-Setting up HISMs/ISMs (as long as you are using DirectX 12 and not 11; with 11 it will do this by itself). Remember to only do this on objects that are beside each other or the problem can get worse.
https://www.unrealengine.com/marketplace/en-US/product/instance-tool
-Reducing the number of material slots on meshes that don't need so many or combining small meshes with similar materials. Actor merging can be great for this, just be careful of going overboard because it can make light baking & lightmap memory usage a pain.
https://docs.unrealengine.com/4.26/en-US/Basics/Actors/Merging/
-Reducing the max draw distance on some of your smaller meshes that are close to the ground. You can find this in the mesh's rendering settings.
Any of these things would reduce draw calls, just be careful with it. Too much optimization by any one method can always make the problem worse by creating bottlenecks elsewhere. When we're reducing draw calls we're also risking slowing down occlusion calculation times or potentially creating a memory bandwidth bottleneck.
Once you get that draw thread time down, the next thing I'd look at is reducing the number of movable lights, objects casting dynamic shadows, and translucent objects casting dynamic shadows. Those are common culprits behind other optimization issues.

SKPhysicsBody optimization

I have a 2D sidescrolling game. Right now, in order to jump, the player must be touching the ground. Therefore, I have a boolean, isOnGround, that is set to YES when the player collides with a tile object, and NO when the player jumps. This generates a LOT of calls to the didBeginContact method, slowing down the game.
Firstly, how can I optimise this by using one big physics body for the tiles on the floor (for example clustering multiple adjacent tiles into one single physics body)?
Secondly, is this even efficient? Is there a better way to detect if the player is on the ground? My current method opens up a lot of bugs, for example wall jumping. If a player collides with a wall, isOnGround becomes YES and allows the player to jump.
Having didBeginContact called numerous times should in no way slow down your game. If you are having performance issues, I suspect the problem is probably elsewhere. Are you testing on device or simulator?
If you are using the Tiled app to create your game map, you can use the Objects Layer to create individual objects in your map, which your code can translate into physics bodies later on.
Using physics and collisions is probably the easiest way for you to determine your player's state in relation to ground contact. To solve your wall issue, simply give walls a different contact category than the ground. This will prevent isOnGround from being set to YES on a wall contact.
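The category idea can be sketched independently of SpriteKit. The following is a minimal illustration in plain C, not SpriteKit code: the category names and the `on_ground_after_contact` helper are hypothetical, and the two arguments stand in for the category bit masks of the two bodies reported in a contact callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical category bits, one bit per kind of body. */
enum {
    CATEGORY_PLAYER = 1u << 0,
    CATEGORY_GROUND = 1u << 1,
    CATEGORY_WALL   = 1u << 2
};

/* Returns the new isOnGround value after a contact begins between two
 * bodies whose category masks are a and b. Only a player/ground pair
 * sets the flag; a player/wall pair leaves it unchanged. */
bool on_ground_after_contact(uint32_t a, uint32_t b, bool is_on_ground)
{
    assert(a != 0 && b != 0); /* every body is expected to carry a category */

    uint32_t both = a | b;
    if ((both & CATEGORY_PLAYER) && (both & CATEGORY_GROUND))
        return true;
    return is_on_ground;
}
```

With this split, a contact between the player and a wall no longer enables jumping, while a contact with the ground still does.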
You could use the physics engine to detect when jumping is enabled (this is what I used to do in my game). However, I too have noticed significant overhead when using the physics engine to detect whether a unit is on a surface, because contact detection in SpriteKit is, for whatever reason, expensive even when collisions are already enabled. Even the documentation notes:
"For best performance, only set bits in the contacts mask for interactions you are interested in."
So I found a better solution for my game (which has 25+ simultaneous units that all need surface detection). Instead of going through the physics engine, I do my own surface calculation and cache the result each game update. Something like this:
final class func getSurfaceID(nodePosition: CGPoint) -> SurfaceID {
    // Loop through the surface rects and return the ID of the one that contains nodePosition.
}
What I ended up doing was handling my own surface detection by checking if the bottom point of my unit was inside any of the surface frames. And if your frames are axis-aligned (your rectangles are not rotated) you can perform even faster checks to see if the point is inside the frame.
This is more work in terms of level design because you will need to build an array of surface frames either dynamically from your tiles or manually place surface frames in your world (this is what I did).
Making this change reduced the CPU time spent on surface detection from over 20% to 0.1%. It also allows me to check if any arbitrary point lies on a surface rather than needing to create a physics body (which is unnecessary overhead). However, this solution obviously won't work for you if you need to use contact detection.
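As a rough illustration of the axis-aligned check, here is a minimal sketch in plain C rather than the Swift used in the game. The `Point` and `Rect` types and the `surface_index_at` function are made up for this example, and the index it returns merely stands in for whatever surface ID the game uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { float x, y; } Point;
typedef struct { float x, y, w, h; } Rect; /* x, y is the minimum corner */

/* Axis-aligned containment test: four comparisons, no rotation handling. */
static bool rect_contains(const Rect *r, Point p)
{
    return p.x >= r->x && p.x <= r->x + r->w &&
           p.y >= r->y && p.y <= r->y + r->h;
}

/* Returns the index of the first surface frame containing p, or -1 if the
 * point is not on any surface. */
int surface_index_at(const Rect *surfaces, size_t count, Point p)
{
    assert(surfaces != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (rect_contains(&surfaces[i], p))
            return (int)i;
    }
    return -1;
}
```

Each unit would call this once per update with its bottom point and cache the result, which is the pattern described above.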
Now regarding your point about creating one large physics body from smaller ones. You could group adjacent floor tiles using a container node and recreate a physics body that fits the nodes that are grouped. Depending on how your nodes are grouped and how you recycle tiles this can get complicated. A better solution would be to create large physics bodies that just overlap your tiles. This would reduce the number of total physics bodies, as well as the number of detections. And if used in combination with the surface frames solution you could really reduce your overhead.
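One way to derive those larger bodies from a tile map is to merge each horizontal run of solid tiles into a single rectangle. The sketch below is in plain C and assumes a row is stored as an array of bytes where non-zero means solid; the `TileRun` type and `merge_row_runs` function are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int start_col; /* column of the first solid tile in the run */
    int length;    /* number of consecutive solid tiles */
} TileRun;

/* Scans one row of tiles (non-zero = solid) and merges consecutive solid
 * tiles into runs. Stores at most out_cap runs in out and returns the total
 * number of runs found, which may exceed out_cap. */
size_t merge_row_runs(const unsigned char *row, size_t cols,
                      TileRun *out, size_t out_cap)
{
    assert(row != NULL || cols == 0);

    size_t runs = 0;
    size_t col = 0;
    while (col < cols) {
        if (!row[col]) {
            col++;
            continue;
        }
        size_t start = col;
        while (col < cols && row[col])
            col++;
        if (runs < out_cap) {
            out[runs].start_col = (int)start;
            out[runs].length = (int)(col - start);
        }
        runs++;
    }
    return runs;
}
```

Each run can then become one physics body (or one surface frame) whose width is `length` times the tile width, instead of one body per tile.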
I'm not sure how your game is designed and what its requirements are. I'm just giving you some possible solutions I looked at when developing surface detection in my game. If you haven't already, you should definitely profile your game in Instruments to see if contact detection is indeed the source of your overhead. If your game doesn't have a lot of contacts, I doubt that this is where the overhead is coming from.

Per frame optimization for large datasets

Summary
New to iPhone programming, I'm having trouble picking the right optimization strategy to filter a set of view components in a scrollview with huge content. In what area would my app gain the most performance?
Introduction
My current iPad app-in-progress lets users explore fairly large binary tree structures. The trees contain between 30 and 900 nodes, and when drawing inside a scrollview (with limited zoom) it looks like this.
The nodes' contents are stored in a SQLite backed Core Data model. It's a binary tree and if a node has children, there are always exactly two. The x and y positions are part of the model, as are the dimensions of the node connections, shown as dotted lines.
Optimization
Only about 50 nodes fit the screen at any given time. With the largest trees containing up to 900 nodes, it's not possible to put everything in a scrollview-controlled, zooming UIView; that's a recipe for crashes. So I have to do per-frame filtering of the nodes.
And that's where my troubles start. I don't have the experience to make a well-founded decision between the possible filtering options, and in addition I probably don't know about that really fast special magic buried deep in Objective-C or Cocoa Touch. Because the backing store is close to 200 MB in size (some 90,000 nodes in hundreds of trees), it's very time-consuming to test every single tree on the iPad device. Which is why I'd like to ask you guys for advice.
For all my attempts I'm putting a filter method in the scrollViewDidScroll: and scrollViewDidZoom:. I'm also blocking the main thread with the filter, because I can't show the content without the nodes anyway. But maybe someone has an idea in that area?
Because all the positioning is present in the Core Data model, I might use NSFetchRequest to do the filtering. Is that really fast though? I have the idea it's not a very optimized method.
From what I've tried, the faulted managed objects seem to fit in memory at once, but it might be tricky for the larger trees once their contents start firing faults. Is it a good idea to loop over the NSSet of nodes and see what items should be on screen?
Are there other tricks to gain performance? Would you see ways where I could use multi threading to get the display set faster, even though the model's context was created on the main thread?
Thanks for your advice,
EP.
Ironically, your binary tree could itself be divided using binary space partitioning in 2D, so that rendering is fast and the number of visibility checks stays close to the minimum necessary.
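Before building a spatial partition, it may be worth measuring the plain linear scan the question asks about. The sketch below is that baseline, not binary space partitioning: it tests every node rectangle against the visible rectangle. It is written in plain C with made-up `Rect` and `collect_visible` names; in the app the rectangles would come from the x and y positions and dimensions stored in the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { float x, y, w, h; } Rect; /* x, y is the minimum corner */

/* True when the two axis-aligned rectangles overlap with non-zero area. */
static bool rects_intersect(const Rect *a, const Rect *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Writes the indices of all nodes overlapping the viewport into out_idx
 * (at most cap entries) and returns how many indices were written. */
size_t collect_visible(const Rect *nodes, size_t count, const Rect *viewport,
                       size_t *out_idx, size_t cap)
{
    assert(viewport != NULL);

    size_t written = 0;
    for (size_t i = 0; i < count && written < cap; i++) {
        if (rects_intersect(&nodes[i], viewport))
            out_idx[written++] = i;
    }
    return written;
}
```

For a tree of 900 nodes this is at most 900 rectangle tests per scroll event. If profiling shows that to be too slow, a 2D partition as suggested above would reduce the number of tests.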

Optimizing Actionscript performance

I am setting out for a visualization project that will generate 1000+ sprites from dynamic data. The toolkit I am using (Flare) requires some optimization. I am trying to figure out some optimization techniques for Flash. How can I make Flash run fast when there are so many sprites on the stage, or maybe there is an optimization technique that doesn't involve generating so many sprites?
One good approach is to freeze animations that are not visible to the user. The complication is that you need to remember the state from which each animation has to resume, or refresh the animation based on the current state of the whole application. Since you have so many sprites generated, make sure that you group them logically. This makes the freezing logic easier to implement.

Planning a 2D tile engine - Performance concerns

As the title says, I'm fleshing out a design for a 2D platformer engine. It's still in the design stage, but I'm worried that I'll be running into issues with the renderer, and I want to avoid them if they will be a concern.
I'm using SDL for my base library, and the game will be set up to use a single large array of Uint16 to hold the tiles. These index into a second array of "tile definitions" that are used by all parts of the engine, from collision handling to the graphics routine, which is my biggest concern.
The graphics engine is designed to run at a 640x480 resolution, with 32x32 tiles. There are 21x16 tiles drawn per layer per frame (to handle the extra tile that shows up when scrolling), and there are up to four layers that can be drawn. Layers are simply separate tile arrays, but the tile definition array is common to all four layers.
What I'm worried about is that I want to be able to take advantage of transparencies and animated tiles with this engine, and as I'm not too familiar with designs I'm worried that my current solution is going to be too inefficient to work well.
My target FPS is a flat 60 frames per second, and with all four layers being drawn, I'm looking at 21x16x4x60 = 80,640 separate 32x32px tiles needing to be drawn every second, plus however many odd-sized blits are needed for sprites, and this seems just a little excessive. So, is there a better way to approach rendering the tilemap setup I have? I'm looking towards possibilities of using hardware acceleration to draw the tilemaps, if it will help to improve performance much. I also want to hopefully be able to run this game well on slightly older computers as well.
If I'm looking for too much, then I don't think that reducing the engine's capabilities is out of the question.
I think the thing that will be an issue is the sheer number of draw calls, rather than the total "fill rate" of all the pixels you are drawing. Remember, that is over 80,000 calls per second that you must make. I think your biggest improvement will be to batch these together somehow.
One strategy to reduce the fill rate of the tiles and layers would be to composite static areas together. For example, if you know an area doesn't need updating, it can be cached. A lot depends on whether the layers are scrolled independently (parallax style).
Also, have a look on Google for "dirty rectangles" and see if any schemes may fit your needs.
Personally, I would just try it and see. This probably won't affect your overall game design, and if you have good separation between logic and presentation, you can optimise the tile drawing til the cows come home.
Make sure to use alpha transparency only on tiles that actually use alpha, and skip drawing blank tiles. Make sure the tile surface color depth matches the screen color depth when possible (not really an option for tiles with an alpha channel), and store tiles in video memory, so SDL will use hardware acceleration when it can. Color key transparency will be faster than having a full alpha channel, for simple tiles where partial transparency or blending antialiased edges with the background isn't necessary.
On a 500 MHz system you'll get about 6.8 CPU cycles per pixel per layer, or 27 per screen pixel, which (I believe) isn't going to be enough if you have full alpha channels on every tile of every layer, but should be fine if you take shortcuts like those mentioned where possible.
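The figures in this thread can be reproduced with two small helpers. This is only a worked version of the arithmetic in plain C, with function names invented for the example; it assumes every screen pixel is touched once per layer per frame and ignores sprites.

```c
#include <assert.h>

/* Tile blits issued per second for cols x rows tiles on each layer. */
long tile_blits_per_second(int cols, int rows, int layers, int fps)
{
    assert(cols > 0 && rows > 0 && layers > 0 && fps > 0);
    return (long)cols * rows * layers * fps;
}

/* CPU cycles available per screen pixel per frame at the given clock rate.
 * Divide by the number of layers for a per-layer budget. */
double cycles_per_screen_pixel(double cpu_hz, int width, int height, int fps)
{
    assert(width > 0 && height > 0 && fps > 0);
    return cpu_hz / ((double)width * height * fps);
}
```

With the numbers from the question, `tile_blits_per_second(21, 16, 4, 60)` gives 80,640, and `cycles_per_screen_pixel(500e6, 640, 480, 60)` gives roughly 27 cycles, or about 6.8 per layer when divided across four layers.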
I agree with Kombuwa. If this is just a simple tile-based 2D game, you really ought to lower the standards a bit as this is not Crysis. 30FPS is very smooth (research Command & Conquer 3 which is limited to 30FPS). Even still, I had written a remote desktop viewer that ran at 14FPS (1900 x 1200) using GDI+ and it was still pretty smooth. I think that for your 2D game you'll probably be okay, especially using SDL.
Could you buffer each complete layer at the size of its view plus one extra tile on each of the four edges (if you have vertical scrolling), and then, when scrolling, reuse that buffer to build the new one by dropping the first column and drawing only the new end column?
This would reduce a lot of needless redrawing.
Additionally, if you want 60 fps, you can look up ways to implement frame skipping for slower systems, skipping every other or every third draw phase.
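One common way to get frame skipping is to run game logic at a fixed step and draw only once per pass through the main loop, so a slow machine runs two or three logic updates per draw. The sketch below shows only the bookkeeping, in plain C with hypothetical names (`FrameClock`, `logic_updates_due`); the elapsed time would come from whatever timer the engine uses.

```c
#include <assert.h>

typedef struct {
    unsigned long accumulated_us; /* time not yet consumed by logic updates */
    unsigned long step_us;        /* fixed logic step, e.g. 16667 for 60 Hz */
    int max_updates;              /* cap on logic updates per drawn frame */
} FrameClock;

/* Adds the time elapsed since the last call and returns how many fixed
 * logic steps should run before the next draw. A return value of 2 or 3
 * means one or two draw phases were effectively skipped. */
int logic_updates_due(FrameClock *clock, unsigned long elapsed_us)
{
    assert(clock->step_us > 0 && clock->max_updates > 0);

    int updates = 0;
    clock->accumulated_us += elapsed_us;
    while (clock->accumulated_us >= clock->step_us &&
           updates < clock->max_updates) {
        clock->accumulated_us -= clock->step_us;
        updates++;
    }
    /* If the cap was hit, drop the remaining backlog so the game slows
     * down instead of trying to catch up forever. */
    if (updates == clock->max_updates &&
        clock->accumulated_us >= clock->step_us)
        clock->accumulated_us = 0;
    return updates;
}
```

The main loop would call this once per iteration, run the returned number of logic updates, and then draw a single frame.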
I think you will be pleasantly surprised by how many of these tiles you can draw a second. Modern graphics hardware can fill a 1600x1200 framebuffer numerous times per frame at 60 fps, so your 640x480 framebuffer will be no problem. Try it and see what you get.
You should definitely take advantage of hardware acceleration. This will give you 1000x performance for very little effort on your part.
If you do find you need to optimise, then the simplest way is to only redraw the areas of the screen that have changed since the last frame. Sounds like you would need to know about any animating tiles, and any tiles that have changed state each frame. Depending on the game, this can be anywhere from no benefit at all, to a massive saving - it really depends on how much of the screen changes each frame.
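A simple variant of this idea tracks changes per tile rather than per arbitrary rectangle. The sketch below is plain C with made-up names (`DirtyGrid`, `dirty_grid_mark`, `dirty_grid_redraw`); the 21x16 grid matches the visible tile count from the question, and the drawing itself is left to a caller-supplied callback. It assumes the view is not scrolling on that frame, since scrolling changes every tile on screen.

```c
#include <assert.h>
#include <string.h>

#define GRID_COLS 21
#define GRID_ROWS 16

typedef struct {
    unsigned char dirty[GRID_ROWS][GRID_COLS]; /* 1 = needs redrawing */
} DirtyGrid;

typedef void (*DrawTileFn)(int col, int row, void *user);

/* Marks every tile as clean. */
void dirty_grid_clear(DirtyGrid *grid)
{
    memset(grid->dirty, 0, sizeof grid->dirty);
}

/* Flags one visible tile as changed (animated, or state changed). */
void dirty_grid_mark(DirtyGrid *grid, int col, int row)
{
    assert(col >= 0 && col < GRID_COLS && row >= 0 && row < GRID_ROWS);
    grid->dirty[row][col] = 1;
}

/* Redraws only the flagged tiles, clears their flags, and returns how many
 * tiles were redrawn. draw_tile may be NULL to just count and clear. */
int dirty_grid_redraw(DirtyGrid *grid, DrawTileFn draw_tile, void *user)
{
    int redrawn = 0;
    for (int row = 0; row < GRID_ROWS; row++) {
        for (int col = 0; col < GRID_COLS; col++) {
            if (!grid->dirty[row][col])
                continue;
            if (draw_tile)
                draw_tile(col, row, user);
            grid->dirty[row][col] = 0;
            redrawn++;
        }
    }
    return redrawn;
}
```

On a frame where only a handful of animated tiles change, this redraws those tiles instead of all 336 per layer.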
You might consider merging neighbouring tiles with the same texture into a larger polygon with texture tiling (sort of a build process).
What about decreasing the frame rate to 30 fps? I think it will be good enough for a 2D game.