How to index and query complex spatial types in CosmosDB?

I have a CosmosDB database/collection with the partition key on /id and spatial indexing enabled using the Geography configuration. When I query for objects with a LineString property within a given LineString or Polygon, the query retrieves all of the documents in the collection before returning those that are within the LineString/Polygon (the retrieved document count is greater than the output document count). The RUs consumed grow with the number of items in the collection, which suggests that the query is doing a full scan and is not using the index.
CosmosDB documentation states the following:
Azure Cosmos DB supports indexing of Points, LineStrings, Polygons, and MultiPolygons
However the documentation does not have any examples that don't use the Point type and I am unable to query using permutations of exclusively non-Point types and hit the index.
To test spatial indexing is working I have an additional Start property on the item with the value of the first Point in the LineString, and I can query if this is within the Polygon at a constant RU consumption.
Here is the index:
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        }
    ],
    "spatialIndexes": [
        {
            "path": "/*",
            "types": [
                "Point",
                "LineString",
                "Polygon",
                "MultiPolygon"
            ]
        }
    ]
}
Here is the needle. The haystack is about 1,000 objects with random LineStrings.
{
    "id": "test",
    "Start": {
        "type": "Point",
        "coordinates": [ 1, 3 ]
    },
    "Points": {
        "type": "LineString",
        "coordinates": [ [ 1, 3 ], [ 1, 4 ], [ 1, 5 ] ]
    }
}
Here is the search within a Polygon:
SELECT *
FROM items i
WHERE ST_WITHIN(i.Points, {
"type":"Polygon",
"coordinates": [[[0, 10], [0, 0], [2, 0], [2, 10], [0, 10]]]
})
---
Request Charge: 127.4 RUs
Retrieved document count: 992
Retrieved document size: 1219980 bytes
Output document count: 1
Output document size: 441 bytes
Index hit document count: 0
Index lookup time: 3.77 ms
Here is the search within a LineString:
SELECT *
FROM items i
WHERE ST_WITHIN(i.Points, {
"type":"LineString",
"coordinates": [[1, 3], [1, 4], [1, 5]]
})
---
Request Charge: 122.53 RUs
Retrieved document count: 992
Retrieved document size: 1219980 bytes
Output document count: 1
Output document size: 441 bytes
Index hit document count: 0
Index lookup time: 3.0100000000000002 ms
Here is the search for a Start within the same Polygon as above, showing that spatial indexing is enabled and working:
SELECT *
FROM items i
WHERE ST_WITHIN(i.Start, {
"type":"Polygon",
"coordinates": [[[0, 10], [0, 0], [2, 0], [2, 10], [0, 10]]]
})
---
Request Charge: 8.1 RUs
Retrieved document count: 1
Retrieved document size: 343 bytes
Output document count: 1
Output document size: 392 bytes
Index hit document count: 1
Index lookup time: 2.79 ms

I created a container and added your sample document, but my results are different from yours.
First sql result:
SELECT *
FROM items i
WHERE ST_WITHIN(i.Points, {
"type":"Polygon",
"coordinates": [[[0, 10], [0, 0], [2, 0], [2, 10], [0, 10]]]
})
---
Request Charge: 10.53 RUs
Retrieved document count: 1
Retrieved document size: 349 bytes
Output document count: 1
Output document size: 398 bytes
Index hit document count: 1
Index lookup time: 1.6800000000000002 ms
Second sql result:
SELECT *
FROM items i
WHERE ST_WITHIN(i.Points, {
"type":"LineString",
"coordinates": [[1, 3], [1, 4], [1, 5]]
})
---
Request Charge: 7.24 RUs
Retrieved document count: 1
Retrieved document size: 349 bytes
Output document count: 1
Output document size: 398 bytes
Index hit document count: 1
Index lookup time: 1.1399000000000001 ms
Third sql result:
SELECT *
FROM items i
WHERE ST_WITHIN(i.Start, {
"type":"Polygon",
"coordinates": [[[0, 10], [0, 0], [2, 0], [2, 10], [0, 10]]]
})
---
Request Charge: 10.53 RUs
Retrieved document count: 1
Retrieved document size: 349 bytes
Output document count: 1
Output document size: 398 bytes
Index hit document count: 1
Index lookup time: 1.6500000000000001 ms
According to my test, each query hits the index.
By the way, my index is the same as yours and the geospatial configuration is Geography. Please try again, and if the result is still similar to what you posted above, let me know more details, such as the SDK you use or more about your documents (I tested this in the Azure portal).

Related

Linear Diophantine Equations with Restriction in the GAP System

I am searching for a way to use the GAP System to find a solution of a linear Diophantine equation over the non-negative integers. Explicitly, I have a list L of positive integers for each of which there exists a solution of the linear Diophantine equation s = 11*a + 7*b such that a and b are non-negative integers. I would like to have the GAP System return for each element s of L the ordered pair(s) [a, b] corresponding to the above solution(s).
I am familiar already with the command SolutionIntMat in the GAP System; however, this produces only some solution of the linear Diophantine equation s = 11*a + 7*b. Particularly, it is possible (and far more likely) that one of the coefficients a and b is negative. For instance, I obtain the solution [-375, 600] when I use the aforementioned command on the linear Diophantine equation 75 = 11*a + 7*b.
For additional context, this query arises when working with numerical semigroups generated by generalized arithmetic sequences. Use the command LoadPackage("numericalsgps"); to implement computations with such objects. For instance, if S := NumericalSemigroup(11, 29, 36, 43, 50, 57, 64, 71);, then each of the minimal generators of S other than 11 is of the form 2*11 + 7*i for some integer i in [1..7]. One can ask the GAP System for the SmallElements(S);, and the GAP System will return all elements of S up to FrobeniusNumber(S) + 1. Clearly, every element of S is of the form 11*a + 7*b for some non-negative integers a and b; I would like to investigate what coefficients a and b arise. In fact, the answer is known (cf. Proposition 2.5 of this paper); I am just trying to get an understanding of the intuition behind the proof.
Thank you in advance for your time and consideration.

Dylan, thank you for your query and for using GAP and numericalsgps.
In this setting you can probably use Factorizations from the package numericalsgps. It internally rewrites the output of RestrictedPartitions.
For instance, in your example, you can get all possible "factorizations" of the small elements of S, with respect to the generators of S, by typing List(SmallElements(S), x->[x,Factorizations(x,S)]). A particular example:
gap> Factorizations(104,S);
[ [ 1, 0, 0, 1, 1, 0, 0, 0 ], [ 1, 0, 1, 0, 0, 1, 0, 0 ],
[ 1, 1, 0, 0, 0, 0, 1, 0 ], [ 3, 0, 0, 0, 0, 0, 0, 1 ] ]
If you want to see the factorizations of the elements of S in terms of 11 and 7, then you can do the following:
gap> FactorizationsIntegerWRTList(29,[11,7]);
[ [ 2, 1 ] ]
So, for all minimal generators of S you would do
gap> List(MinimalGenerators(S), g-> FactorizationsIntegerWRTList(g,[11,7]));
[ [ [ 1, 0 ] ], [ [ 2, 1 ] ], [ [ 2, 2 ] ], [ [ 2, 3 ] ],
[ [ 2, 4 ] ], [ [ 2, 5 ] ], [ [ 2, 6 ] ], [ [ 2, 7 ] ] ]
For the set of small elements of S, try List(SmallElements(S), g-> FactorizationsIntegerWRTList(g,[11,7])). If you only want up to some integer, just replace SmallElements(S) with Intersection([1..200], S); or if you want the first, say 200, elements of S, use S{[1..200]}.
You may want to have a look at Chapter 9 of the manual, and in particular at FactorizationsElementListWRTNumericalSemigroup.
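For intuition outside GAP, the non-negative solutions of s = 11*a + 7*b can also be enumerated by brute force. The following is a minimal Python sketch, not part of GAP or numericalsgps; the function name nonneg_solutions is hypothetical. It tries every admissible a and keeps those for which the remainder is divisible by 7.

```python
def nonneg_solutions(s, x=11, y=7):
    """Return all [a, b] with a, b >= 0 and x*a + y*b == s."""
    solutions = []
    for a in range(s // x + 1):      # a can be at most s // x
        remainder = s - x * a
        if remainder % y == 0:       # remainder must be a multiple of y
            solutions.append([a, remainder // y])
    return solutions

print(nonneg_solutions(29))   # [[2, 1]]
print(nonneg_solutions(75))   # [[3, 6]]
```

For 75 this gives the non-negative pair [3, 6], in contrast to the [-375, 600] returned by SolutionIntMat in the question.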
I hope this helps.

Why does Polyhedron render well on its own but not in combination with complete model

The code below is an attempt to make a simple 3D triangle to serve as a side support for a larger model.
It works well on its own, but when I add it to a larger model, one of the sides of the triangle does not render and I get the warning "UI-WARNING: Object may not be a valid 2-manifold and may need repair!"
To make it even stranger, when I click "save", the model is redrawn and shows up complete, including the missing side.
I am using OpenSCAD 2019.05.
I am working around the problem by making a few small objects and wrapping them in hull(). I would prefer this code to work, however.
//For some odd reason, this module works well on its own.
//It does not render correctly when used as part of a larger model.
//Then it will miss a side.
//It shows up correctly when saving though.
module supportTriangle(height=10, length=10, thickness=10){
trianglePoints = [
[ 0, 0, 0 ],
[ thickness, 0, 0 ],
[ 0, 0, height ],
[ thickness, 0, height],
[ 0, length, 0],
[ thickness, length, 0]];
triangleFaces = [
[ 0, 1, 5, 4 ],
[ 0, 1, 3, 2 ],
[ 2, 3, 5, 4 ],
[ 0, 4, 2 ],
[ 1, 3, 5 ]];
polyhedron(trianglePoints, triangleFaces);
}
I am getting warnings of "UI-WARNING: Object may not be a valid 2-manifold and may need repair!" when rendering in combination with larger model

Try this:
module supportTriangle(height=10, length=10, thickness=10){
trianglePoints = [
[ 0, 0, 0 ],
[ thickness, 0, 0 ],
[ 0, 0, height ],
[ thickness, 0, height],
[ 0, length, 0],
[ thickness, length, 0]];
triangleFaces = [
[ 0, 1, 5, 4 ],
[ 2, 3, 1, 0 ], // reversed to keep the points clockwise when viewed from outside
[ 4, 5, 3, 2 ], // reversed to keep the points clockwise when viewed from outside
[ 0, 4, 2 ],
[ 1, 3, 5 ]];
polyhedron(trianglePoints, triangleFaces);
}
supportTriangle(10,10,10);
cube(5,center=true); // just an extra thing to make it error if order is wrong
see:
https://en.wikibooks.org/wiki/OpenSCAD_User_Manual/Primitive_Solids#polyhedron
All faces must have points ordered in the same direction. OpenSCAD prefers clockwise when looking at each face from outside inwards. The back is viewed from the back, the bottom from the bottom, etc.
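The winding rule can be checked mechanically. In a closed mesh whose faces are all ordered the same way, every edge is traversed once in each direction by its two neighbouring faces. Below is a minimal Python sketch of that check; consistently_wound is a hypothetical helper, not an OpenSCAD function, and it verifies only that the faces agree with each other, not whether the common direction is clockwise.

```python
def consistently_wound(faces):
    """True if every directed edge occurs once and its reverse occurs too."""
    seen = set()
    for face in faces:
        # Pair each vertex with its successor, wrapping around to the first.
        for edge in zip(face, face[1:] + face[:1]):
            if edge in seen:
                return False  # two faces run along this edge in the same direction
            seen.add(edge)
    return all((b, a) in seen for a, b in seen)

# Face lists taken from the question and from the corrected module.
original = [[0, 1, 5, 4], [0, 1, 3, 2], [2, 3, 5, 4], [0, 4, 2], [1, 3, 5]]
fixed = [[0, 1, 5, 4], [2, 3, 1, 0], [4, 5, 3, 2], [0, 4, 2], [1, 3, 5]]

print(consistently_wound(original))  # False
print(consistently_wound(fixed))     # True
```

In the original face list, the faces [0, 1, 5, 4] and [0, 1, 3, 2] both run from point 0 to point 1, which is the inconsistency the reversal removes.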

Lookup smallest value greater than current

I have an objects table and a lookup table. In the objects table, I'm looking to add the smallest value from the lookup table that is greater than the object's number.
I found this similar question but it's about finding a value greater than a constant, rather than changing for each row.
In code:
import pandas as pd
objects = pd.DataFrame([{"id": 1, "number": 10}, {"id": 2, "number": 30}])
lookup = pd.DataFrame([{"number": 3}, {"number": 12}, {"number": 40}])
expected = pd.DataFrame(
[
{"id": 1, "number": 10, "smallest_greater": 12},
{"id": 2, "number": 30, "smallest_greater": 40},
]
)

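First, compare each value of lookup['number'] with each value of objects['number'] by broadcasting, which gives a 2D boolean mask. Then take the cumulative sum along each row and compare it with 1 to mark the first match, and use numpy.argmax to get its position, which selects the value from lookup['number']. This assumes that lookup['number'] is sorted in ascending order, so that the first match is also the smallest one.
numpy.where then overwrites the rows that have no match with NaN.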
import numpy as np
objects = pd.DataFrame([{"id": 1, "number": 10}, {"id": 2, "number": 30},
{"id": 3, "number": 100},{"id": 4, "number": 1}])
print (objects)
id number
0 1 10
1 2 30
2 3 100
3 4 1
# strict comparison, because the question asks for values greater than the object's number
m1 = lookup['number'].values > objects['number'].values[:, None]
m2 = np.cumsum(m1, axis=1) == 1
m3 = np.any(m1, axis=1)
out = lookup['number'].values[m2.argmax(axis=1)]
objects['smallest_greater'] = np.where(m3, out, np.nan)
print (objects)
id number smallest_greater
0 1 10 12.0
1 2 30 40.0
2 3 100 NaN
3 4 1 3.0

smallest_greater = []
for i in objects['number']:
    # index label of the smallest lookup value that is greater than i
    idx = lookup[lookup['number'] > i].sort_values(by='number').index[0]
    smallest_greater.append(lookup['number'][idx])
objects['smallest_greater'] = smallest_greater
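If the lookup values can be sorted once, a binary search finds the smallest strictly greater value for each row without building a 2D mask or filtering the table per row. A minimal sketch using the standard library's bisect module on plain lists follows; the helper name find_smallest_greater is an assumption for illustration, and rows without a match get None instead of raising an error.

```python
from bisect import bisect_right

def find_smallest_greater(values, lookup_values):
    """For each value, return the smallest lookup value strictly greater than it, or None."""
    ordered = sorted(lookup_values)
    result = []
    for v in values:
        pos = bisect_right(ordered, v)  # index of the first element > v
        result.append(ordered[pos] if pos < len(ordered) else None)
    return result

print(find_smallest_greater([10, 30, 100, 1], [3, 12, 40]))  # [12, 40, None, 3]
```

With the data frames from the question, the inputs would be objects['number'] and lookup['number'], and the returned list can be assigned to objects['smallest_greater'].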

Remapping numpy arrays to dictionary

I want to remap a numpy array according to a dictionary.
Let us assume I have a numpy array with N rows and 3 columns. Now I want to remap the values according to their indices, which are given as tuples in a dictionary.
This works fine:
import numpy as np
a = np.arange(6).reshape(2,3)
b = np.zeros(6).reshape(2,3)
print(a)
print(b)
dictt = { (0,0):(0,2), (0,1):(0,1), (0,2):(0,0), (1,0):(1,2), (1,1):(1,1), (1,2):(1,0) }
# copy each source element a[dictt[key]] to its target position b[key]
for key in dictt:
    b[key] = a[dictt[key]]
print(b)
a = [[0 1 2]
[3 4 5]]
b = [[ 2. 1. 0.]
[ 5. 4. 3.]]
Let us assume I have N rows, where N is an even number. Now I want to apply the same mapping (which is valid for the 2 rows in the example above) to all the other pairs of rows.
Hence I want to have an array from:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
to:
b = [[ 2. 1. 0.]
[ 5. 4. 3.]]
[[ 8. 7. 6.]
[ 11. 10. 9.]]
Any ideas? I would like to do it fast since these are 192000 entries in each array which should be remapped.

For simplicity I would just use [::-1].
a = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
b = [item[::-1] for item in a]
>>> b
[[2, 1, 0], [5, 4, 3], [8, 7, 6]]
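For a general dictionary that does more than reverse each row, the same 2-row mapping can be applied to every pair of rows at once with NumPy fancy indexing. The sketch below assumes that N is even and that the dictionary covers every position of a 2x3 block; the variable names dst, src and blocks are illustrative only.

```python
import numpy as np

dictt = {(0, 0): (0, 2), (0, 1): (0, 1), (0, 2): (0, 0),
         (1, 0): (1, 2), (1, 1): (1, 1), (1, 2): (1, 0)}

a = np.arange(12).reshape(4, 3)

# Turn the dictionary into index arrays once.
dst = np.array(list(dictt.keys()))    # shape (6, 2): target (row, col) in a block
src = np.array(list(dictt.values()))  # shape (6, 2): source (row, col) in a block

# View the array as blocks of 2 rows: shape (N/2, 2, 3).
blocks = a.reshape(-1, 2, 3)
out = np.empty_like(blocks, dtype=float)

# Apply the mapping to all blocks in a single assignment.
out[:, dst[:, 0], dst[:, 1]] = blocks[:, src[:, 0], src[:, 1]]
b = out.reshape(a.shape)
print(b)
```

For this particular dictionary the result equals a[:, ::-1], the NumPy counterpart of the list comprehension above, which avoids the dictionary altogether.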

Python folium GeoJSON map not displaying

I'm trying to use a combination of geopandas, Pandas and Folium to create a polygon map that I can embed into a web page.
For some reason, it's not displaying.
The steps I've taken:
Grabbed a .shp from the UK's OS for Parliamentary boundaries.
I've then used geopandas to change the projection to epsg=4326 and then exported as GeoJSON which takes the following format:
{ "type": "Feature", "properties": { "PCON13CD": "E14000532", "PCON13CDO": "A03", "PCON13NM": "Altrincham and Sale West" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -2.313999519326579, 53.357408280545918 ], [ -2.313941776174758, 53.358341455420039 ], [ -2.31519699483377, 53.359035664493433 ], [ -2.317953152796459, 53.359102954309151 ], [ -2.319855973429864, 53.358581917200119 ],... ] ] ] } },...
Then what I'd like to do is mesh this with a dataframe of constituencies in the following format, dty:
constituency count
0 Burton 667
1 Cannock Chase 595
2 Cheltenham 22
3 Cheshire East 2
4 Congleton 1
5 Derbyshire Dales 1
6 East Staffordshire 4
import folium
mapf = folium.Map(width=700, height=370, tiles = "Stamen Toner", zoom_start=8, location= ["53.0219392","-2.1597434"])
mapf.geo_json(geo_path="geo_json_shape2.json",
data_out="data.json",
data=dty,
columns=["constituency","count"],
key_on="feature.properties.PCON13NM.geometry.type.Polygon",
fill_color='PuRd',
fill_opacity=0.7,
line_opacity=0.2,
reset="True")
The output from mapf looks like:
mapf.json_data
{'../../Crime_data/staffs_data92.json': [{'Burton': 667,
'Cannock Chase': 595,
'Cheltenham': 22,
'Cheshire East': 2,
'Congleton': 1,
'Derbyshire Dales': 1,
'East Staffordshire': 4,
'Lichfield': 438,
'Newcastle-under-Lyme': 543,
'North Warwickshire': 1,
'Shropshire': 17,
'South Staffordshire': 358,
'Stafford': 623,
'Staffordshire Moorlands': 359,
'Stoke-on-Trent Central': 1053,
'Stoke-on-Trent North': 921,
'Stoke-on-Trent South': 766,
'Stone': 270,
'Tamworth': 600,
'Walsall': 1}]}
Although the mapf.create_map() function successfully creates a map, the polygons don't render.
What debugging steps should I take?

@elksie5000, try mplleaflet; it is extremely straightforward.
pip install mplleaflet
in Jupyter/Ipython notebook:
import mplleaflet
ax = geopandas_df.plot(column='variable_to_plot', scheme='QUANTILES', k=9, colormap='YlOrRd')
mplleaflet.show(fig=ax.figure)
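Independently of the plotting library, one debugging step is to check that the constituency names in the data frame actually occur as property values in the GeoJSON, since polygons can only be matched to data through such a shared key. The sketch below uses a tiny inline stand-in for the real file; the helper unmatched_names and the sample dictionary are hypothetical, and only the property name PCON13NM comes from the question.

```python
def unmatched_names(geojson, names, prop="PCON13NM"):
    """Return the names that have no feature with a matching property value."""
    available = {f["properties"].get(prop) for f in geojson["features"]}
    return sorted(set(names) - available)

# Stand-in for the exported GeoJSON (assumed to be a FeatureCollection).
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"PCON13NM": "Burton"},
         "geometry": None},
    ],
}

print(unmatched_names(sample, ["Burton", "Cheshire East"]))  # ['Cheshire East']
```

Any name reported here has no polygon to attach to. It may also be worth checking the key_on argument, which as far as I know is normally a dotted path to a single property, such as feature.properties.PCON13NM.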
