SQL Server 2008+ : Best method for detecting if two polygons overlap? - sql

We have an application that has a database full of polygons (currently stored as points) that a .net app pulls out and checks if they overlap.
I occurred to me that it would be much nicer to convert these point arrays to polygon / polyline objects within the database and use sql to get a bool of weather they overlap or not.
I have seen different methods suggested to do this but non of the examples given were quite in-line with my needs.
I would be very happy to receive input from those kind enough to offer their experience.
Additional:
In response to questions: It is indeed 2D. and yes any crossover of the two is considered true. The polygons have n points and can be concave. The polygons will be saved as 1 per row (after data conversion task) as polygons (i.e. the polygon type .. it might be called something else spatial / geom my memory is not on my side right now)

You can use .STIntersection with .STAsText() to test for overlapping polygons. (I really hate the terminology Microsoft has used (or whoever set the standard terms). "Touching," in my mind, should be a test for whether or not two geometry/geography shapes overlap at all, not just share a border.)
Anyway....
If #RadiusGeom is a geometry representing a radius from a point, the following will return a list of any two polygons where an intersection (a geometry that represents the area where two geometries overlap) is not empty.
SELECT CT.ID AS CTID, CT.[Geom] AS CensusTractGeom
FROM CensusTracts CT
WHERE CT.[Geom].STIntersection(#RadiusGeom).STAsText() <> 'GEOMETRYCOLLECTION EMPTY'
If your geometry field is spatially indexed, this runs pretty quickly. I ran this on 66,000 US CT records in about 3 seconds. There may be a better way, but since no one else had an answer, this was my attempt at an answer for you. Hope it helps!

Calculate and store the bounding rectangle of each polygon in a set of new fields within the row which is associated with that polygon. (I assume you have one; if not, create one.) When your dotnet app has a polygon and is looking for overlapping polygons, it can fetch from the database only those polygons whose bounding rectangles overlap, using a relatively simple SQL SELECT statement. Those polygons should be relatively few, so this will be efficient. Then, your dotnet app can perform the finer polygon overlap calculations in order to determine which ones of those really overlap.

Okay, I got another idea, so I am posting it as a different answer. I think my previous answer with the bounding polygons probably has some merit on its own, even if it was to reduce the number of polygons fetched from the database by a small percentage, but this one is probably better.
MSSQL supports integration with the CLR since version 2005. This means that you can define your own data type in an assembly, register the assembly with MSSQL, and from that moment on MSSQL will be accepting your user-defined data type as a valid type for a column, and it will be invoking your assembly to perform operations with your user-defined data type.
An example article for this technique on the CodeProject: Creating User-Defined Data Types in SQL Server 2005
I have never used this mechanism, so I do not know details about it, but I presume that you should be able to either define a new operation on your data type, or perhaps overload some existing operation like "less-than", so that you can check if one polygon intersects another. This is likely to speed things up a lot.

Related

Checking if a Coordinate is Within a Range - BigQuery GIS

I'm looking at the freely available Solar potential dataset on Google BigQuery that may be found here: https://bigquery.cloud.google.com/table/bigquery-public-data:sunroof_solar.solar_potential_by_censustract?pli=1&tab=schema
Each record on the table has the following border definitions:
lat_max - maximum latitude for that region
lat_min - minimum latitude for that region
lng_max - maximum longitude for that region
lng_min - minimum longitude for that region
Now I have a coordinate (lat/lng pair) and I would like to query to see whether or not that coordinate is within the above range. How do I do that with BQ Standard SQL?
I've seen the Geo Functions here: https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions
But I'm still not sure how to write this query.
Thanks!
Assuming the points are just latitude and longitude as numbers, why can't you just do a standard numerical comparison?
Note: The first link doesn't work without a google account, so I can't see the data.
But if you want to become spatial, I'd suggest you're going to need to take the border coordinates that you have and turn them into a polygon using one of: ST_MAKEPOLYGON, ST_GEOGFROMGEOJSON, or ST_GEOGFROMTEXT. Then create a point using the coords you wish to test ST_MAKEPOINT.
Now you have two geographies you can compare them both using ST_INTERSECTION or ST_DISJOINT depending on what outcome you want.
If you want to get fancy and see how far aware from the border you are (which I guess means more efficient?) you can use ST_DISTANCE.
Agree with Jonathan, just checking if each of the lat/lon value is within the bounds is simplest way to achieve it (unless there are any issues around antimeridian, but most likely you can just ignore them).
If you do want to use Geography objects for that, you can construct Geography objects for these rectangles, using
ST_MakePolygon(ST_MakeLine(
[ST_GeogPoint(lon_min, lat_min), ST_GeogPoint(lon_max, lat_min),
ST_GeogPoint(lon_max, lat_max), ST_GeogPoint(lon_min, lat_max),
ST_GeogPoint(lon_min, lat_min)]))
And then check if the point is within particular rectangle using
ST_Intersects(ST_GeogPoint(lon, lat), <polygon-above>)
But it will likely be slower and would not provide any benefit for this particular case.

Determining which polygon contains the majority of a line - Oracle Spatial

I have an oracle database (11g spatial) that includes a series of area polygons and water mains. I'm trying to attribute each of these mains to the area in which it is contained and for the most part this is straightforward enough (using the SDO_CONTAINS function) but I'm not sure how to deal with mains that straddle multiple polygons due to errors in digitisation.
In cases like this what I'd ideally like to do is attribute a main to an area polygon if the majority of it's length (>50%) is contained within onit. I know that I can use the SDO_RELATE function to determine every polygon that any given main interacts with, but I don't know how to then go about determining how much of it's length is contained within each area.
The principle is like this:
Correlate mains and areas. Assuming you have many mains and many areas, the most efficient approach is to use SDO_JOIN
For each couple (main/area) returned, compute their intersection (SDO_GEM.SDO_INTERSECTION) and measure the length of that intersection (SDO_GEOM.SDO_LENGTH).
From those results, retain the area for each main where the length is the maximum
If you want a full SQL example, allow me a bit of time to write that using sample data.

Geographic or projected data based on SRID in Oracle

Background:
My application needs to display the MBR of the spatial data (geometry) stored in Oracle. For this, I'm currently using Oracle's SDO_AGGR_MBR() function, but it is very slow. On researching a little, I found a function SDO_TUNE.EXTENT_OF() which also computes MBR and is much faster than SDO_AGGR_MBR. It has 2 problems though. It works only with 2D data in projection coordinates. To tap into the performance benefits of EXTENT_OF, I decided to use it for projected data and fallback to SDO_AGGR_MBR for geographic data.
The Problem:
I started out with an assumption that all data with SRID between 4000 to 5000 is geographic but that is not entirely true. I found a table/view named MDSYS.CS_SRS which stores the coordinate system information.
I'm planning to find the SRID using query:
select a.COLUMN_NAME.SDO_SRID from TABLE_NAME a where rownum = 1;
and then using this SRID to query MDSYS.CS_SRS to find out whether the data is geographic or projected. It has column named WKTEXT whose rows start with either PROJCS or GEOGCS.
I could prototype this and it seems to work but entirely confident this is the right approach. The query above fetches the SRID of the first row of the data. I don't know if SRID can be different in a single column. Another assumption I'm making is the text in WKTEXT column. I'll be in a lot of trouble if it's not PROJCS/GEOGCS in all cases and if the values change between different releases of Oracle. In fact, right now, I'm just assuming that PROJCS means projection CS and GEOGCS means geographic CS and I'm not sure if it's right..
I wonder if there's an easier way to find out whether the spatial data in an Oracle DB is projection or geographic.
You did not say which version of the database you are looking at. I am assuming 10gR2 or later.
The easiest is to check table SDO_COORD_REF_SYS:
SQL> select srid, coord_ref_sys_kind from sdo_coord_ref_sys where srid in (4326, 4327, 27700, 7405);
SRID COORD_REF_SYS_KIND
---------- --------------------
4326 GEOGRAPHIC2D
4327 GEOGRAPHIC3D
7405 COMPOUND
27700 PROJECTED
4 rows selected.

What is SRID 0 for geometry columns?

So I added geometry columns to a spatial table and using some of the msdn references I ended up specifying the SRID as 0 like so:
update dbo.[geopoint] set GeomPoint = geometry::Point([Longitude], [Latitude], 0)
However, I believe this was a mistake, but before having to update the column, is 0 actually the default = 4326? The query works as long as I specify the SRID as 0 on the query, but I'm getting weird results in comparison to the geography field I have... SRID 0 does not exist in sys.spatial_reference_systems and I haven't been able to dig up any information on it. Any help would be appreciated.
A SRID of 0 doesn't technically exist, it just means no SRID -- ie, the default if you forget to set it. So, technically, you can still perform distance, intersection and all other queries, so long as both sets of geometries have a SRID of 0. If you have one field of geometries with a SRID of 0 and another set with a SRID that actually exists, you will most likely get very strange results. I remember scratching my head once when not getting any results from a spatial query in exactly this situation and SQL Server did not complain, just 0 results (for what is is worth Postgis will actually fail, with a warning about non-matching SRIDs).
In my opinion, you should always explicitly set the SRID of your geometries (or geographies, which naturally will always be 4326), as not only does it prevent strange query results, but it means you can convert from one coordinate system to another. Being able to convert on the fly from lat/lon (4326), to Spherical Mercator (3857), as used in Google Maps/Bing, which is in meters, or some local coordinate system, such as 27700, British National Grid, also in meters, can be very useful. SQL Server does not to my knowledge support conversion from one SRID to another, but as spatial types are essentially CLR types, there are .NET libraries available should you ever need to do so, see Transform/ Project a geometry from one SRID to another for an example.
If you do decide to change you geometries, you can do something like:
UPDATE your_table SET newGeom = geometry::STGeomFromWKB(oldGeom.STAsBinary(), SRID);
which will create a new column or to do it in place:
UPDATE geom SET geom.STSrid=4326;
where 4326 is just an example SRID.
There is a good reference for SRIDs at http://spatialreference.org/, though this is essentially the same information as you find in sys.spatial_reference_systems.
SRIDs are a way to take into account that the distances that you're measuring on aren't on a flat, infinite plane but rather an oblong spheroid. They make sense for the geography data type, but not for geometry. So, if you're doing geographic calculations (as your statement of "in comparison to the geography field I have"), create geography points instead of geometry points. In order to do calculations on any geospatial data (like "find the distance from this point to this other point"), the SRID of all the objects involved need to be the same.
TL;DR: Is the point on the Cartesian plane? Use geometry. Is the point on the globe? Use geography.

Finding posts where position distance is less or equal to dynamic value

I need help with a architectural problem that im working with. The user enters a position and a radius (e.g. distance). The software searches in a (giant = couple of 100k posts) database table for posts where the users location and the "posts" distance to each other is less than the entered distance.
It's kind of hard for me to explain, but imagine a table with two posts, point a and point c, point U is the user location. The user has entered a position and a radius, and the position and radius for a and c is predefined (stored in a database).
In this case i would only be interested in the point A, because the two areas intersect with each other. How should i transform this into doing in a database with a couple of hundred thousand posts in an effective way? In the database i shall store longitude,latitude and radius.
Depends on which database server you're using, but look into the GIS capabilities that might be included. For example, MS SQL Server 2008 has a built-in geometry type, and PostgreSQL has PostGIS. Oracle has something like this too. Anyhow - these native GIS formats come with spacial querying functions that do the sort of thing you're talking about - searching for matches within given distances, etc... It is pretty simple to accomplish once to switch to the proper datatype.
edit
Since you're using SQL 2008, and your data is lat/long, I suggest the "geography" rather than the "geometry" datatype. Take a look here: http://msdn.microsoft.com/en-us/library/cc280766.aspx