I have a mesh model and several viewpoints. I'm going to determine which faces are visible from the viewpoint.
Now I'm using bpy.context.scene.ray_cast() method to calculate if there is any obstacle between a viewpoint and a face (a triangle in the mesh), but this method is running on cpu and will cost a lot of time. Is there any way to run the function on GPU, or is there any other faster method to solve the problem?
Related
I have to optimize the result of a process that depends on a large number of variables, i.e. a laser engraving system where the engraving depth depends on the laser speed, distance, power and so on.
The final objective is the minimization of the engraving time, or the maximization of the laser speed. All the other parameters can vary, but must stay within safe bounds.
I have never used any machine learning tools, but to my very limited knowledge this seems like a good use case for TensorFlow or any other machine learning library.
I would experimentally gather data points to train the algorithm, test it and then use a gradient descent optimizer to find the parameters (within bounds) that maximize the laser travel velocity.
Does this sound feasible? How would you approach such a problem? Can you link to any examples available online?
Thank you,
Riccardo
I’m not quite sure if I understood the problem correctly, would you add some example data and a desired output?
As far as I understood, It could be feasible to use TensorFlow, but I believe there are better solutions to that problem. Let me expand on this.
TensorFlow is a framework focused in the development of Deep Learning models. These usually require lots of data (the number really depends on the problem) but I don’t believe that just you manually gathering this data would be enough unless your team is quite big or already have some data gathered.
Also, as you have a minimization (or maximization) problem given variables that lay within a known range, I think this can be a case of Operations Research optimization instead of Machine Learning. Check this example of OR.
I am writing a code to simulate particle movement. (currently 2D soon 3D hopefully)
The thing is, if I use a relatively large timestep particles end up passing through each other.
Do you have any suggestion that would allow me to correct that without using a really small step?
(it is in C++ if that makes much difference).
The use of timestep to advance the clock introduces model artifacts which can destroy the model validity, as is happening in your case. Use discrete event scheduling instead. This paper from Winter Simulation Conference 2005 describes how to implement movement in a discrete event framework. Your model will not only be more accurate, it will probably run much faster as well.
So you will have to do some sort of collision detection to see if two objects would collide.
Depending on your data structure the detection could take many forms. If you just have a list of points you would have to check all against each other in N^2 each step for the particle (adding the movement vector to create a larger spacial foot print). This could be done by the GJK algorithm.
Using some spacial data structure could reduce the complexity by only running the GJK on a pruned set of particles, i.e. no need to check if they impossible could overlap.
It took me the last 30 minutes to remove vertices inside straight lines and to merge faces which connect at the same angle (additionally to the inbuild "lower polygon-count" function). Now I'm wondering if this process is necessary.
Do I need to minimize the amount of vertices in Maya or is it enough to minimize the amount of vertex faces to achieve an optimal rendering
performance?
If both is necessary, which gives the bigger boost?
Im looking specifically into applications, where very high performance is necessary (games).
In my upcoming iPhone game different scene elements are split up into their own CCNode.
My Obstacle node contains many nodes, each representing an obstacle. Inside every obstacle node are the images that make up the obstacle (1 - 4 images), and there are only ~10 obstacles at a time. Every update my game calls the update function in the Obstacle node, which moves every obstacle to the left. But this slows down my game quite a bit.
At the same time, I have a particle node that just contains images and moves them all every frame exactly the same way the Obstacle node does, but it has no noticeable effect on performance. But it has hundreds of images at a time.
My question is why do the obstacles slow it down so much but the particles don't? I have even tried replacing the images used in the obstacles with the ones in the particles and it makes no (noticeable) difference. Would it be that there is another level of child nodes?
You will dramatically increase the app's performance, run speed, frame rate and more if you put all your images in a texture atlas and rendering them once as a batch using the CCSpriteBatchNode class. If you are moving lots of objects around on the screen a lot, this makes the hardware work a lot less.
Using this class is easy. Create the class with a texture atlas that contains all your images, and then add this class as a child to your layer, just as you would a sprite.
However, when you create sprites, add them as children to this batch node, not as children to the layer.
It's very easy and will probably help you quite a lot here.
From what I recall of the Cocos2d documentation, particles are intended to be VERY lightweight so you can have many, many of them on screen at once. Nodes are heavier, require more processing between frames as they interact with the physics system and requiring node-specific rendering. The last time I looked at the render loop code, it was basically O(n) based on the number of CCnodes you had in a scene. Using NSTimers versus Cocos' built in run loop also makes quite a bit of difference in performance.
Could you provide an example of something that slows down a lot? Exactly how do you update these Obstacles?
The cocos2d documentation has some best practices that all, in one way or another, touch on performance. There's a LOT you can do to optimize your frames per second.
In general, when your code is slow, it helps to use Instruments.app to figure out where your code is spending so much time. Since you're using a framework this will be less helpful because you'll end up finding out what functions your code spends a lot of time in, and then figure out how to reduce that via the framework's best practices or other optimizations. There are a few good blog posts on improving performance, I found this one very helpful.
I've got some OpenGL drawing code that I'm trying to optimize. It's currently testing all drawing objects for visibility client-side before deciding whether or not to send rendering data to OpenGL. (This is easier than it sounds. It's drawing a 2D scene so clipping is trivial: just test against the current coordinates of the viewport rectangle.)
It occurs to me that the entire model could be greatly simplified by passing the entire scene to OpenGL and letting the GPU take care of the clipping. But sometimes the total can be very, very complex, involving up to 100,000 total sprites, most of which never get rendered because they're off-camera, and I'd prefer to not end up killing the framerate in the name of simplicity.
I'm using OpenGL 2.0, and I've got a pretty simple vertex shader and a much more complicated fragment shader. Is there any guarantee that says that if the vertex shader runs and determines coordinates that are completely off-camera for all vertices of a polygon, that a clipping test will be applied somewhere between there and the fragment shader and prevent the fragment shader from ever running for that polygon? And if so, is this automatic or is there something I need to do to enable it? I've looked around online for information on this but I haven't found anything conclusive...
Clipping happens after the vertex transform stage before and after the NDC space; clip planes are applied in clip space, viewport clipping is done in NDC space. That is one step before rasterizing. Clipping means, that a face only partially visible is "cut" by inserting new vertices at the visibility border, or fragments outside the viewport discarded. What you mean is usually called culling. Faces completely outside the viewport are culled, at the same stage like clipping.
From a performance point of view, the best code is code never executed, and the best data is data never accessed. So in your case sending off a single drawing call that makes the GPU process a large batch of vertices clearly takes load off the CPU, but it consumes GPU processing power. Culling those vertices before sending the drawing command consumes CPU power, but takes load off the GPU. The goal is to find the right balance. If the number of vertices is low, a simple brute force approach (just render the whole thing) may easily outperform ever other scheme.
However using a simple, yet effective data management scheme can greatly improve performance on both ends. For example a spatial subdivision structure like a Kd tree is easily built (you don't have to balance it). Sorting the vertices into the Kd tree you can omit (cull) large portions of the tree if one branch near to the root is completely outside the viewport. Preparing drawing a frame you iterate through the visible parts of the tree, building the list of vertices to draw, then you pass this list to the rendering command. Kd trees can be traversed on average in O(n log n) time.
It's important to understand the difference between clipping and culling. You appear to be talking about the latter.
Clipping means taking a triangle and literally cutting it into pieces to fit into the viewport. The OpenGL specification defines this process to happen post-vertex shader, for any triangle that is only partially in view.
Culling means throwing something away entirely. If a triangle is not entirely in view, it can therefore be culled. OpenGL does not say that culling has to happen. Remember: the OpenGL specification defines behavior, not performance.
That being said, hardware makers are not stupid. Obvious efforts like not rasterizing triangles that are outside of the viewport are easily implemented and improve performance. Pretty much any hardware that exists will do this.
Similarly, clipping is typically implemented (where possible) with rasterizer tricks, rather than by creating new triangles. Fragments that would be outside of the viewport simply aren't generated by the rasterizer. This is also legal according to OpenGL, because the spec defines apparent behavior. It doesn't really care if you actually cut the triangle into pieces as long as it looks indistinguishable form if you did.
Your question is essentially one of, "How much work should I do to not render off-screen objects?" That really depends on what your scene is and how you're rendering it. You say you're rendering 100,000 sprites. Are you making 100,000 draw calls, or are these sprites part of larger structures that you render with larger granularity? Do you stream the vertex data to the GPU every frame, or is the vertex data static?
Clipping and culling happen before fragment processing. http://www.opengl.org/wiki/Rendering_Pipeline_Overview
However, you will still be passing 100000 * 4 vertices (assuming you're rendering the sprites with quads and not point sprites) to the card if you don't do culling yourself. Depending on the card's memory performance this can be an issue.