Search optimization problem

Suppose you have a list of 2D points with an orientation assigned to them. Let the set S be defined as:
S={ (x,y,a) | (x,y) is a 2D point, a is an orientation (an angle) }.
Given an element s of S, we will indicate with s_p the point part and with s_a the angle part. I would like to know whether there exists an efficient data structure that, given a query element q, returns all the elements s in S such that
(dist(q_p, s_p) < threshold_1) AND (angle_diff(q_a, s_a) < threshold_2) (1)
where dist(p1, p2), with p1, p2 2D points, is the Euclidean distance, and angle_diff(a1, a2), with a1, a2 angles, is the difference between the angles (taken to be the smallest one). The data structure should support efficient insertion/deletion of elements as well as the search defined above. The number of elements can grow to 10,000 and more, but take this with a grain of salt.
Now suppose we change the above requirement: instead of using condition (1), given a single distance function d, we want all elements of S such that d(q, s) < threshold. If I remember correctly, this last setup is called range search. I don't know whether the first case can be transformed into the second.
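For reference, here is a minimal sketch (Python, with names chosen purely for illustration) of the naive linear scan implementing condition (1); the only subtle point is the wrap-around in angle_diff:

import math

def angle_diff(a1, a2):
    # Smallest difference between two angles, in [0, pi].
    d = (a1 - a2) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def dist(p1, p2):
    # Euclidean distance between two 2D points.
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

def naive_query(S, q, threshold_1, threshold_2):
    # S is a list of ((x, y), a) pairs; q is a query element of the same form. O(|S|).
    return [s for s in S
            if dist(q[0], s[0]) < threshold_1
            and angle_diff(q[1], s[1]) < threshold_2]

Any indexed solution should return exactly the same set, just faster.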

For the distance search I believe the accepted best method is a Binary Space Partitioning (BSP) tree. This can be stored as a series of bits: every two bits (for a 2D tree) or three bits (for a 3D tree) subdivide the space one more level, increasing resolution.
Using a BSP, locating a set of objects to compare distances with is pretty easy. Just find the smallest set of squares or cubes which contain the edges of your distance box.
For the angle, I don't know of anything. I suppose that you could store each object in a second list or tree sorted by its angle. Then you would find every object at the proper distance using the BSP, every object at the proper angles using the angle tree, then do a set intersection.
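A rough sketch of that idea (Python; the class and parameter names are mine, and a uniform grid stands in for the BSP): candidates are gathered by position and by angle separately, intersected, and then checked exactly against condition (1).

import math
from collections import defaultdict

class GridAngleIndex:
    # Toy index: a uniform grid over (x, y) plus an angle-bucketed table.
    def __init__(self, cell_size, angle_bucket=math.radians(5)):
        self.cell = cell_size
        self.abucket = angle_bucket
        self.nbuckets = int(math.ceil(2 * math.pi / angle_bucket))
        self.grid = defaultdict(set)     # (ix, iy) -> ids
        self.buckets = defaultdict(set)  # angle bucket -> ids
        self.items = {}                  # id -> ((x, y), angle)

    def insert(self, key, p, a):
        self.items[key] = (p, a)
        self.grid[(int(p[0] // self.cell), int(p[1] // self.cell))].add(key)
        self.buckets[int((a % (2 * math.pi)) // self.abucket)].add(key)

    def remove(self, key):
        p, a = self.items.pop(key)
        self.grid[(int(p[0] // self.cell), int(p[1] // self.cell))].discard(key)
        self.buckets[int((a % (2 * math.pi)) // self.abucket)].discard(key)

    def query(self, q, qa, t1, t2):
        # Candidate ids by position: every grid cell overlapping the distance box.
        r = int(math.ceil(t1 / self.cell))
        cx, cy = int(q[0] // self.cell), int(q[1] // self.cell)
        near = set()
        for ix in range(cx - r, cx + r + 1):
            for iy in range(cy - r, cy + r + 1):
                near |= self.grid.get((ix, iy), set())
        # Candidate ids by angle: every bucket overlapping [qa - t2, qa + t2],
        # plus one extra bucket on each side to guard the wrap-around seam.
        b = int(math.ceil(t2 / self.abucket)) + 1
        qb = int((qa % (2 * math.pi)) // self.abucket)
        similar = set()
        for k in range(qb - b, qb + b + 1):
            similar |= self.buckets.get(k % self.nbuckets, set())
        # Intersect the two candidate sets, then verify condition (1) exactly.
        result = []
        for key in near & similar:
            p, a = self.items[key]
            d = math.hypot(p[0] - q[0], p[1] - q[1])
            da = abs((a - qa + math.pi) % (2 * math.pi) - math.pi)
            if d < t1 and da < t2:
                result.append(key)
        return result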

You have effectively described a "three-dimensional cylindrical space", i.e. a space that is locally three dimensional but where one dimension is topologically cyclic. In other words, it is locally flat and may be modeled as the boundary of a four-dimensional object C4 in (x, y, z, w) defined by
z^2 + w^2 = 1
where
a = atan2(w, z) (that is, arctan(w/z) adjusted for the quadrant)
With this model, the space defined by your constraints is a 2-dimensional cylinder wrapped "lengthwise" around a cross-section wedge, where the wedge wraps around the 4D cylindrical space with an angle of 2 * threshold_2. This can be modeled using a "modified k-d tree" approach (a modified 3-d tree), in which the data structure is not a tree but actually a graph (it has cycles). You can still partition this space into cells with hyperplane separation, but traveling along the curve defined by (z, w) in the positive direction may reach a point that is also reachable in the negative direction. The tree should be modified to lead to these nodes from both directions, so that the edges are bidirectional in the z-w curve direction (the other directions are obviously still unidirectional).
These cycles do not change the effectiveness of the data structure in locating nearby points or allowing your constraint search. In fact, for the most part, those algorithms are only slightly modified (the simplest approach being to hold a visited node data structure to prevent cycles in the search - you test the next neighbors about to be searched).
This will work especially well for your criteria, since the region you define is effectively bounded by the axis-aligned, hyperplane-bounded cells of a k-d tree, and so the cells the search ends up visiting are, on average, filled to roughly pi / 4 (about 79%) by the query region itself.
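If pulling in a library is an option, a simpler alternative to the graph-structured k-d tree described above is to map the angle onto the unit circle and hand the result to an ordinary k-d tree. This is a sketch using scipy's cKDTree; the weight w and the final exact filter are my own additions, not part of the answer above.

import numpy as np
from scipy.spatial import cKDTree

def build_index(points, angles, w=1.0):
    # Map each (x, y, a) to (x, y, w*cos a, w*sin a): the cyclic dimension
    # becomes the unit circle, so no special wrap-around handling is needed.
    pts = np.asarray(points, dtype=float)
    ang = np.asarray(angles, dtype=float)
    data = np.column_stack([pts[:, 0], pts[:, 1], w * np.cos(ang), w * np.sin(ang)])
    return cKDTree(data), pts, ang

def query(tree, pts, ang, q, qa, t1, t2, w=1.0):
    # Superset ball query in the embedded 4D space, then an exact filter.
    # The chord 2*sin(t2/2) is a monotone function of the angular difference,
    # so a ball of radius sqrt(t1^2 + (w*chord)^2) contains every match.
    chord = 2.0 * np.sin(min(t2, np.pi) / 2.0)
    radius = np.hypot(t1, w * chord)
    qe = [q[0], q[1], w * np.cos(qa), w * np.sin(qa)]
    result = []
    for i in tree.query_ball_point(qe, radius):
        d = np.hypot(pts[i, 0] - q[0], pts[i, 1] - q[1])
        da = abs((ang[i] - qa + np.pi) % (2.0 * np.pi) - np.pi)
        if d < t1 and da < t2:
            result.append(i)
    return result

Note that cKDTree is static, so frequent insertion/deletion would require periodic rebuilds (or a different container); the point of the sketch is only the embedding of the cyclic dimension.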

In CGAL, can one convert a triangulation in more than three dimensions to a polytope?

If this question would be more appropriate on a related site, let me know, and I'd be happy to move it.
I have 165 vertices in ℤ^11, all of which are at a distance of √8 from the origin and are extreme points of their corresponding convex hull. CGAL is able to calculate their d-dimensional triangulation in only 133 minutes on my laptop using just under a gigabyte of RAM.
Magma manages a similar 66 vertex case quite quickly, and, crucially for my application, it returns an actual polytope instead of a triangulation. Thus, I can view each d-dimensional face as a single object which can be bounded by an arbitrary number of vertices.
Additionally, although less essential to my application, I can also use Graph : TorPol -> GrphUnd to calculate all the topological information regarding how those faces are connected, and then AutomorphismGroup : Grph -> GrpPerm, ... to find the corresponding automorphism group of that cell structure.
Unfortunately, when applied to the original polytope, Magma's AutomorphismGroup : TorPol -> GrpMat only returns subgroups of GL_d(ℤ), instead of the full automorphism group G, which is what I'm truly hoping to calculate. As a matrix group, G is not a subgroup of GL_11(ℤ), but rather of GL_11(𝔸), where 𝔸 represents the algebraic numbers. In general, I won't need the full algebraic closure of the rationals, ℚ̅, but just some field extension. However, I could make use of any non-trivially powerful representation of G.
With two days of calculation, Magma can manage the 165 vertex case, but is only able to provide information about the polytope's original 165 vertices, 10-facets, and volume. However, attempting to enumerate the d-faces, for any 2 ≤ d < 10, quickly consumes the 256 GB of RAM I have at my disposal.
CGAL's triangulation, on the other hand, only calculates collections of d-simplices, all of which have d + 1 vertices. It seems possible to derive the same facial information from such a triangulation, but I haven't thought of an easy way to code that up.
Am I missing something obvious in CGAL? Do you have any suggestions for alternative ways to calculate the polytope's face information, or to find the full automorphism group of my set of points?
You can use the Combinatorial maps package in CGAL, which is able to represent polytopes in nD. A combinatorial map describes all cells and all incidence and adjacency relations between the cells.
In this package, there is an undocumented method are_cc_isomorphic allowing you to test whether an isomorphism exists from two given starting points. I think you can call this method for all possible pairs of starting points to find all automorphisms.
Unfortunately, there is no method to build a combinatorial map from a dD triangulation. Such a method exists in 3D (cf. this file); it could be extended to dD.

Looking for an efficient structure for checking which circles enclose a point

I have a large set of overlapping circles each at a random location with a specific radius.
type Circle =
    struct
        val x: float
        val y: float
        val radius: float
    end
Given a new point with type
type Point =
    struct
        val x: float
        val y: float
    end
I would like to know which circles in my set enclose the new point. A linear search is trivial. I'm looking for a structure that can hold the circles and return the enclosing circles with better than O(N) for the presented point.
Ideally the structure should be fast for insertion of new circles and removal of circles as well.
I would like to implement this in F# but ideas in any language are fine.
For your information I'm looking to implement
http://takisword.wordpress.com/2009/08/13/bowyerwatson-algorithm/
but it would be an O(N^2) if I use the naive approach of scanning all circles for every new point.
If we assume that the circles are distributed over some rectangle with area 1 and the average area of a circle is a, then descending m levels of a quadtree leaves you in a cell of area 1/4^m. This leaves
O(N * a / 4^m)
as the expected number of circles overlapping the remaining cell.
However, we have done O(m) comparisons to get to this point. This leaves the total number of comparisons as
O(m) + O(N * a / 4^m)
The second term becomes constant if m is proportional to log(N).
This suggests that a quadtree can cut things down to O(log n)
A quadtree is a structure for efficient search in the plane. You can use it to hold a subdivision of the plane.
For example, you can create a quadtree with the following properties:
1. Every cell of the quadtree contains the indices of the circles overlapping it.
2. Every cell contains no more than K circles (for example 10) // this bound may be broken, see 3
3. The height of the tree is bounded by M (usually O(log n))
You can construct the quadtree by iterating over the overlapped cells: if the number of circles inside a cell exceeds K, subdivide that cell into four (unless the maximum height has been reached). Cells that lie entirely inside a circle need special consideration, because subdividing them is pointless.
When querying with a point, first locate the quadtree cell containing it, then iterate through the circles stored there and keep those that contain the point.
If the circle distribution is sparse, the search will be very efficient.
In my bachelor thesis I adapted a quadtree for closest-segment location with expected time O(log n); I think a similar approach could be used here.
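A compact sketch of the quadtree just described (Python; K and MAX_DEPTH are illustrative choices, and the special case of a cell lying entirely inside a circle is left out for brevity):

class CircleQuadtree:
    # Each node keeps the circles overlapping its square; overfull nodes split into four.
    K, MAX_DEPTH = 10, 16

    def __init__(self, x, y, size, depth=0):
        self.x, self.y, self.size, self.depth = x, y, size, depth
        self.circles = []        # (cx, cy, r) tuples overlapping this cell
        self.children = None

    def _overlaps(self, cx, cy, r):
        # Squared distance from the circle centre to this square, compared with r^2.
        dx = max(self.x - cx, 0.0, cx - (self.x + self.size))
        dy = max(self.y - cy, 0.0, cy - (self.y + self.size))
        return dx * dx + dy * dy <= r * r

    def insert(self, circle):
        if not self._overlaps(*circle):
            return
        if self.children is not None:
            for child in self.children:
                child.insert(circle)
            return
        self.circles.append(circle)
        if len(self.circles) > self.K and self.depth < self.MAX_DEPTH:
            half = self.size / 2.0
            self.children = [CircleQuadtree(self.x + dx, self.y + dy, half, self.depth + 1)
                             for dx in (0.0, half) for dy in (0.0, half)]
            for c in self.circles:
                for child in self.children:
                    child.insert(c)
            self.circles = []

    def enclosing(self, px, py):
        # Descend to the leaf containing (px, py), then test only the circles stored there.
        node = self
        while node.children is not None:
            half = node.size / 2.0
            ix = 1 if px >= node.x + half else 0
            iy = 1 if py >= node.y + half else 0
            node = node.children[ix * 2 + iy]
        return [(cx, cy, r) for (cx, cy, r) in node.circles
                if (px - cx) ** 2 + (py - cy) ** 2 <= r * r]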
Actually you search for triangles whose circumcircles include the new point p. Thus your Delaunay triangulation is already the data structure you need: First search for the triangle t which includes p (google for 'delaunay walk'). The circumcircle of t certainly includes p. Then start from t and grow the (connected) area of triangles whose circumcircles include p.
Implementing this in a fast and reliable way is a lot of work. Unless you want to create a new library, you may want to use an existing one. My choice for C++ is Fade2D [1], but there are also many others; it depends on your specific needs.
[1] http://www.geom.at/fade2d/html/
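To illustrate the idea (not the Fade2D API), here is a rough sketch on top of scipy's Delaunay triangulation, where find_simplex plays the role of the walk and neighbors drives the region growing; points are the vertices of the existing triangulation and p is the point about to be inserted:

import numpy as np
from scipy.spatial import Delaunay

def circumcircle_contains(tri_pts, p):
    # Standard in-circle determinant test; the orientation factor makes it
    # independent of whether the triangle is listed clockwise or counter-clockwise.
    a, b, c = tri_pts
    m = np.array([[a[0] - p[0], a[1] - p[1], (a[0] - p[0])**2 + (a[1] - p[1])**2],
                  [b[0] - p[0], b[1] - p[1], (b[0] - p[0])**2 + (b[1] - p[1])**2],
                  [c[0] - p[0], c[1] - p[1], (c[0] - p[0])**2 + (c[1] - p[1])**2]])
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return np.linalg.det(m) * orient > 0

def grow_circumcircle_region(points, p):
    # Returns the indices of the triangles whose circumcircle contains p.
    points = np.asarray(points, dtype=float)
    p = np.asarray(p, dtype=float)
    tri = Delaunay(points)
    start = int(tri.find_simplex(p))     # triangle containing p (the "walk" step)
    if start == -1:
        return []                        # p outside the convex hull: not handled here
    found, stack, seen = [], [start], {start}
    while stack:
        t = stack.pop()
        if circumcircle_contains(points[tri.simplices[t]], p):
            found.append(t)
            for n in tri.neighbors[t]:   # grow only through adjacent triangles
                if n != -1 and n not in seen:
                    seen.add(n)
                    stack.append(n)
    return found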

Most efficient way to check if a point is in or on a convex quad polygon

I'm trying to figure out the most efficient/fast way to add a large number of convex quads (four given x,y points) into an array/list and then to check against those quads if a point is within or on the border of those quads.
I originally tried using ray casting but thought that it was a little overkill since I know that all my polygons will be quads and that they are also all convex.
Currently, I am splitting each quad into two triangles that share an edge and then checking whether the point is on or in each of those two triangles using their areas.
For example, with triangle ABC and test point P:
if (areaPAB + areaPAC + areaPBC == areaABC) { return true; }
This seems like it may run a little slow since I need to calculate the area of 4 different triangles to run the check and if the first triangle of the quad returns false, I have to get 4 more areas. (I include a bit of an epsilon in the check to make up for floating point errors)
I'm hoping that there is an even faster way that might involve a single check of a point against a quad rather than splitting it into two triangles.
I've attempted to reduce the number of checks by putting the polygons into an array[,]. When adding a polygon, it checks the minimum and maximum x and y values and then, using those, places the same poly into the proper array positions. When checking a point against the available polygons, it retrieves the proper list from the array of lists.
I've been searching through similar questions and I think what I'm using now may be the fastest way to figure out if a point is in a triangle, but I'm hoping that there's a better method to test against a quad that is always convex. Every polygon test I've looked up seems to be testing against a polygon that has many sides or is an irregular shape.
Thanks for taking the time to read my long-winded question to what's probably a simple problem.
I believe the fastest methods are:
1: Find the mutual orientation of all vector pairs (DirectedEdge, CheckedPoint) through cross product signs. If all four signs are the same, the point is inside.
Addition: for every edge
EV[i] = V[i+1] - V[i], where V[] are the vertices in order
PV[i] = P - V[i]
Cross[i] = CrossProduct(EV[i], PV[i]) = EV[i].X * PV[i].Y - EV[i].Y * PV[i].X
Cross[i] is positive if point P lies in the left half-plane relative to the i-th edge (from V[i] to V[i+1]), and negative otherwise. If all the Cross[] values are positive, then point P is inside the quad and the vertices are in counter-clockwise order. If all the Cross[] values are negative, then point P is inside the quad and the vertices are in clockwise order. If the values have different signs, then the point is outside the quad.
If the quad set is the same for many point queries, then dmuir suggests precalculating the implicit line equation for every edge. The line equation is a * x + b * y + c = 0, where (a, b) is the normal vector to the edge. This equation has an important property: the sign of the expression
(a * P.x + b * P.y + c) determines the half-plane where point P lies (just as the cross products do)
2: Split the quad into 2 triangles and use the vector (barycentric) method for each: express the CheckedPoint vector in terms of the triangle's edge vectors as a basis.
P = a*V1 + b*V2
The point is inside when a, b >= 0 and a + b <= 1.
Both methods require about 10-15 additions, 6-10 multiplications and 2-7 comparisons (I don't consider floating point error compensation)
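A compact sketch of method 1 (Python; it handles both vertex orderings by requiring all cross products to share a sign):

def point_in_convex_quad(quad, p):
    # quad: four (x, y) vertices in order (clockwise or counter-clockwise); p: (x, y).
    crosses = []
    for i in range(4):
        vx, vy = quad[i]
        wx, wy = quad[(i + 1) % 4]
        # Cross[i] = CrossProduct(EV[i], PV[i]), exactly as in the formulas above.
        crosses.append((wx - vx) * (p[1] - vy) - (wy - vy) * (p[0] - vx))
    # All non-negative or all non-positive: inside or on the border.
    return all(c >= 0 for c in crosses) or all(c <= 0 for c in crosses)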
If you could afford to store, with each quad, the equation of each of its edges then you could save a little time over MBo's answer.
For example, if you have an inward-pointing normal vector N for each edge of the quad, and a constant d (which is N.p for one of the vertices p on the edge), then a point x is in the quad if and only if N.x >= d for each edge. So that's 2 multiplications, one addition and one comparison per edge, and you'll need to perform up to 4 tests per point. This technique works for any convex polygon.
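A sketch of that precomputation (Python; it assumes the quad's vertices are given in counter-clockwise order so the left edge normal points inward):

def precompute_edges(quad):
    # For each edge store (N, d) with N the inward normal and d = N . p for a vertex p on the edge.
    edges = []
    for i in range(4):
        (x1, y1), (x2, y2) = quad[i], quad[(i + 1) % 4]
        n = (-(y2 - y1), x2 - x1)          # left normal of the directed edge (inward for CCW)
        edges.append((n, n[0] * x1 + n[1] * y1))
    return edges

def point_in_quad(edges, p):
    # Two multiplications, one addition and one comparison per edge.
    return all(n[0] * p[0] + n[1] * p[1] >= d for n, d in edges)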

How to calculate continuous effect of gravitational pull between simulated planets

So I am making a simple simulation of different planets, each with its own velocity, flying around space and orbiting each other.
I plan to simulate their pull on each other by considering each planet as projecting its own "gravity vector field." Each time step I'm going to add up the vectors output by each planet's individual vector field equation (V = -xj + (-yj) or some notation like it), excluding the planet being affected by the calculation, and use the affected planet's position as input to the equations.
However, this would be inaccurate, since it does not treat the gravitational pull as continuous. How do I calculate the movement of my planets if each is continuously affecting the others?
Thanks!
In addition to what Blender writes about using Newton's equations, you need to consider how you will be integrating over your "acceleration field" (as you call it in the comment to his answer).
The easiest way is to use Euler's Method. The problem with that is it rapidly diverges, but it has the advantage of being easy to code and to be reasonably fast.
If you are looking for better accuracy, and are willing to sacrifice some performance, one of the Runge-Kutta methods (probably RK4) would ordinarily be a good choice. I'll caution you that if your "acceleration field" is dynamic (i.e. it changes over time ... perhaps as a result of planets moving in their orbits) RK4 will be a challenge.
Update (Based on Comment / Question Below):
If you want to calculate the force vector Fi(tn) applied to a specific object i at some time step tn, then you need to compute the force contributed by all of the other objects within your simulation using the equation Blender references. That is, for each object i, you figure out how all of the other objects pull on it (apply force), and those vectors, when summed, give the aggregate force vector applied to i. Algorithmically this looks something like:
for each object i
    Fi(tn) = 0
    for each object j ≠ i
        Fi(tn) = Fi(tn) + G * mi * mj * (pj(tn) - pi(tn)) / |pi(tn) - pj(tn)|^3
Where pi(tn) and pj(tn) are the positions of objects i and j at time tn respectively, and | | is the standard Euclidean (l2) norm, i.e. the Euclidean distance between the two objects. Also, G is the gravitational constant. The extra power of the distance in the denominator just normalizes the direction vector pj(tn) - pi(tn), so the magnitude of each term is still G * mi * mj / r^2; writing it this way makes the sum a proper vector sum.
Euler's Method breaks the simulation into discrete time slices. It looks at the current state and in the case of your example, considers all of the forces applied in aggregate to all of the objects within your simulation and then applies those forces as a constant over the period of the time slice. When using
ai(tn) = Fi(tn)/mi
(ai(tn) = acceleration vector at time tn applied to object i, Fi(tn) is the force vector applied to object i at time tn, and mi is the mass of object i), the force vector (and therefore the acceleration vector) is held constant for the duration of the time slice. In your case, if you really have another method of computing the acceleration, you won't need to compute the force, and can instead directly compute the acceleration. In either event, with the acceleration being held as constant, the position at time tn+1, p(tn+1) and velocity at time tn+1, v(tn+1), of the object will be given by:
pi(tn+1) = 0.5*ai(tn)*(tn+1 - tn)^2 + vi(tn)*(tn+1 - tn) + pi(tn)
vi(tn+1) = ai(tn)*(tn+1 - tn) + vi(tn)
The RK4 method samples the driving function of your system at several points within the time slice and combines them into a higher-order approximation of its behavior. The details are at the wikipedia site I referenced above, and there are a number of other resources you should be able to locate on the web. The basic idea is that instead of picking a single force value for a particular time slice, you compute four force vectors at specific times within the slice and combine them into a much better estimate of the true motion. That's fine if your field of force vectors doesn't change between time slices. If you're using gravity to derive the vector field, and the objects which are the gravitational sources move, then you need to compute their positions at each of the four sub-intervals in order to compute the force vectors. It can be done, but your performance is going to be quite a bit poorer than using Euler's method. On the plus side, you get more accurate motion of the objects relative to each other. So it's a challenge in the sense that it's computationally expensive, and it's a bit of a pain to figure out where all the objects are supposed to be for your four samples during the time slice of your iteration.
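A minimal Euler-step sketch of the loop described above (Python; the Planet class, the value of G and the time step dt are placeholders for whatever your simulation already uses):

import math

G = 6.674e-11  # gravitational constant (use a scaled value for a toy simulation)

class Planet:
    def __init__(self, x, y, vx, vy, mass):
        self.x, self.y = x, y
        self.vx, self.vy = vx, vy
        self.mass = mass

def step(planets, dt):
    # One Euler time slice: first sum the force on each planet from every other
    # planet, then update velocities and positions with the acceleration held constant.
    forces = []
    for i, a in enumerate(planets):
        fx = fy = 0.0
        for j, b in enumerate(planets):
            if i == j:
                continue
            dx, dy = b.x - a.x, b.y - a.y
            r = math.hypot(dx, dy)
            f = G * a.mass * b.mass / (r * r)   # magnitude from Newton's law
            fx += f * dx / r                    # project onto the direction towards b
            fy += f * dy / r
        forces.append((fx, fy))
    for (fx, fy), p in zip(forces, planets):
        ax, ay = fx / p.mass, fy / p.mass       # a = F / m
        p.x += p.vx * dt + 0.5 * ax * dt * dt   # position update from the formulas above
        p.y += p.vy * dt + 0.5 * ay * dt * dt
        p.vx += ax * dt
        p.vy += ay * dt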
There is no such thing as "continuous" when dealing with computers, so you'll have to approximate continuity with very small intervals of time.
That being said, why are you using a vector field? What's wrong with Newton?
Newton's law of gravitation gives the force between any two objects, and F = m*a gives the net force on an object in terms of its acceleration. Equate the two and solve for a.
So you'll just have to loop over all the objects one by one and find the acceleration on it.
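Spelling that out: setting mi * ai equal to the summed gravitational forces and cancelling mi gives, in vector form, ai = sum over all j ≠ i of G * mj * (pj - pi) / |pj - pi|^3, which is exactly the quantity that loop accumulates for each object.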

How to depict multidimensional vectors on a two-dimensional plot?

I have a set of vectors in a multidimensional space (possibly several thousand dimensions). In this space, I can calculate the distance between 2 vectors (as the cosine of the angle between them, if it matters). What I want is to visualize these vectors while keeping the distances. That is, if vector a is closer to vector b than to vector c in the multidimensional space, it must also be closer to b on the 2-dimensional plot. Is there any kind of diagram that can clearly depict this?
I don't think so. Imagine any two-dimensional picture of a tetrahedron. There is no way of depicting the four vertices in two dimensions with equal distances from each other. So you will have a hard time trying to depict more than three n-dimensional vectors in 2 dimensions while preserving their mutual distances.
(But right now I can't think of a rigorous proof.)
Update:
OK, second idea, maybe it's dumb: if you can find clusters of closely associated objects/texts, calculate the center or mean vector of each cluster. Then you can reduce the problem space: first find a 2D arrangement of the clusters that preserves their relative distances, then insert the individual vectors, accounting only for their relative distances within a cluster and their distances to the centers of the two or three closest clusters.
This approach will be OK for a large number of vectors, but it will not be exact: there will always be some fairly similar vectors that end up far apart.
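A rough sketch of that two-stage idea (Python; the choice of k-means for the clustering and metric MDS for the 2D placement is mine, not part of the answer above, and the within-cluster placement is deliberately crude):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import MDS

def rough_2d_layout(vectors, n_clusters=10):
    # Stage 1: lay out the cluster centers in 2D, roughly preserving their mutual distances.
    X = np.asarray(vectors, dtype=float)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    center_xy = MDS(n_components=2).fit_transform(km.cluster_centers_)
    # Stage 2: place each vector near its cluster's 2D position, offset by a
    # per-cluster 2D embedding of the members (keeps only local structure).
    out = np.zeros((len(X), 2))
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        if len(idx) == 0:
            continue
        if len(idx) == 1:
            out[idx] = center_xy[c]
            continue
        local = MDS(n_components=2).fit_transform(X[idx])
        local -= local.mean(axis=0)
        out[idx] = center_xy[c] + local
    return out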