Why is a POINT a geography but a collection is a Geometry? - google-bigquery

Why is the classification of multiple geographies a geometry type? For example:
SELECT ST_GeogFromText('POINT(0 1)') AS geography UNION ALL
SELECT ST_GeogFromText('GEOMETRYCOLLECTION(MULTIPOINT(-1 2, 0 12), LINESTRING(-2 4, 0 6))')
In other words, the string itself is GEOMETRYCOLLECTION instead of GEOGRAPHYCOLLECTION or FEATURECOLLECTION. Why is that so?

These things come from different standards / industry practices.
First, the spatial concepts and their WKT names like POINT and GEOMETRYCOLLECTION come from OGC and SQL/MM standards:
https://www.ogc.org/standards/sfs
https://www.iso.org/standard/60343.html
These standards describe spatial objects, and define WKT representation for POINT, LINESTRING, POLYGON, GEOMETRYCOLLECTION.
Second, once users realized the planar world of the GEOMETRY type is not enough, various databases invented ways to handle spherical / geodesic geometries. The most popular way the industry uses to define spherical geometry is a different SQL type, GEOGRAPHY. Note that GEOGRAPHY is not a standard thing, but rather a common industry practice. Some databases, like MySQL, do not use a GEOGRAPHY type, but use spherical math with a geographic SRID. Anyway, when a GEOGRAPHY value is written to WKT, it still uses the GEOMETRYCOLLECTION name for compatibility. So GEOGRAPHY comes from the SQL type, GEOMETRY from the WKT standard.
Finally, FeatureCollection is sometimes used to represent sets of objects that have a geometry property and additional properties, e.g. in the GeoJSON format. You'll rarely see it used with SQL databases: a FeatureCollection simply corresponds to a table.

Related

What is SRID 0 for geometry columns?

So I added geometry columns to a spatial table and using some of the msdn references I ended up specifying the SRID as 0 like so:
update dbo.[geopoint] set GeomPoint = geometry::Point([Longitude], [Latitude], 0)
However, I believe this was a mistake, but before having to update the column, is 0 actually the default = 4326? The query works as long as I specify the SRID as 0 on the query, but I'm getting weird results in comparison to the geography field I have... SRID 0 does not exist in sys.spatial_reference_systems and I haven't been able to dig up any information on it. Any help would be appreciated.
An SRID of 0 doesn't technically exist; it just means no SRID, i.e. the default if you forget to set it. So, technically, you can still perform distance, intersection and all other queries, so long as both sets of geometries have an SRID of 0. If you have one field of geometries with an SRID of 0 and another set with an SRID that actually exists, you will most likely get very strange results. I remember scratching my head once when not getting any results from a spatial query in exactly this situation: SQL Server did not complain, it just returned 0 results (for what it is worth, PostGIS will actually fail, with a warning about non-matching SRIDs).
In my opinion, you should always explicitly set the SRID of your geometries (or geographies, which in SQL Server default to 4326), as not only does it prevent strange query results, but it also means you can convert from one coordinate system to another. Being able to convert on the fly from lat/lon (4326) to Spherical Mercator (3857), as used in Google Maps/Bing, which is in meters, or to some local coordinate system, such as 27700, British National Grid, also in meters, can be very useful. SQL Server does not, to my knowledge, support conversion from one SRID to another, but as spatial types are essentially CLR types, there are .NET libraries available should you ever need to do so; see Transform/Project a geometry from one SRID to another for an example.
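To make the idea of converting between SRIDs concrete, here is a minimal Python sketch of the forward projection from lat/lon (4326) to Spherical Mercator (3857). It is not SQL Server functionality and not one of the .NET libraries mentioned above; the function name lonlat_to_web_mercator is made up for illustration, and a real project would normally rely on a projection library instead.

```python
import math

# Sphere radius used by Spherical Mercator (equal to the WGS 84 semi-major axis).
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS 84 lon/lat in degrees to Spherical Mercator x/y in metres.

    Hypothetical helper for illustration only.
    """
    if not -90.0 < lat_deg < 90.0:
        raise ValueError("latitude must be strictly between -90 and 90 degrees")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


if __name__ == "__main__":
    # Degrees in, metres out.
    print(lonlat_to_web_mercator(-0.1276, 51.5072))
```

The output is in metres on a flat plane, which is why planar geometry functions give meaningful distances on 3857 data only over small areas and away from the poles.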
If you do decide to change your geometries, you can do something like:
UPDATE your_table SET newGeom = geometry::STGeomFromWKB(oldGeom.STAsBinary(), SRID);
which populates a new column, or, to do it in place:
UPDATE your_table SET geom.STSrid = 4326;
where 4326 is just an example SRID.
There is a good reference for SRIDs at http://spatialreference.org/, though this is essentially the same information as you find in sys.spatial_reference_systems.
SRIDs are a way to take into account that the distances you're measuring aren't on a flat, infinite plane but rather on an oblate spheroid. They make sense for the geography data type, but not for geometry. So, if you're doing geographic calculations (as your statement "in comparison to the geography field I have" suggests), create geography points instead of geometry points. In order to do calculations on any geospatial data (like "find the distance from this point to this other point"), the SRID of all the objects involved needs to be the same.
TL;DR: Is the point on the Cartesian plane? Use geometry. Is the point on the globe? Use geography.

Determine "reach" by geographic distribution

I have a large collection of checkins for products manufactured at a distinct geographic location. I'd like to create a summary metric used to rank these products by how far, globally, they have traveled from their point of origin. For example, a product produced in Maine that is found in California, Florida, and Dublin, Ireland should rank higher than a product made in California that hasn't been seen outside of California.
What kind of algorithms should I be looking at? How would you approach this?
MS SQL Server (which I've just spotted may not be relevant to you) includes spatial data types that allow you to calculate (among other things) the distance between two points defined by their latitude and longitude. So this code:-
DECLARE @p1 geography = geography::Point(@lat1, @long1, 4326);
SELECT @distance = @p1.STDistance(geography::Point(@lat2, @long2, 4326));
would load @distance with the distance in metres between the two points. I lifted the code from a scalar-valued inline function that I wrote, but it could also target table columns directly. The 4326 magic number is a reference to the Spatial Reference System Identifier (SRID) that provides answers in metres. This calculation doesn't take into account altitude and the distortion of the globe (other functions/SRIDs are available for this) but it's probably accurate enough for most purposes.
Unfortunately, if you are restricted to PostgreSQL, this answer is of no use (though it may point you in a direction for further investigation).
A reference for SQL Server can be found here: http://technet.microsoft.com/en-us/library/bb933790.aspx
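For anyone without SQL Server, the same idea can be sketched in plain Python. This is only one possible take on the "reach" metric, under stated assumptions: it swaps STDistance for the haversine formula on a sphere of mean Earth radius, it defines reach as the mean distance from the origin to every check-in, and the names haversine_m and reach_score as well as the approximate city coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; the Earth is treated as a sphere


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def reach_score(origin, checkins):
    """Mean distance in metres from origin to each check-in; 0.0 if there are none."""
    if not checkins:
        return 0.0
    total = sum(haversine_m(origin[0], origin[1], lat, lon) for lat, lon in checkins)
    return total / len(checkins)


if __name__ == "__main__":
    # Approximate (lat, lon) pairs, for illustration only.
    maine = (43.66, -70.26)
    california = (34.05, -118.24)
    maine_checkins = [(34.05, -118.24), (25.76, -80.19), (53.35, -6.26)]
    california_checkins = [(37.77, -122.42), (34.05, -118.24)]
    print(reach_score(maine, maine_checkins) > reach_score(california, california_checkins))
```

Mean distance is just one choice; the maximum distance, or a sum weighted by check-in count, would rank products differently, so the aggregate should match what "reach" means for your data.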

SQL Server 2008+ : Best method for detecting if two polygons overlap?

We have an application that has a database full of polygons (currently stored as points) that a .net app pulls out and checks if they overlap.
It occurred to me that it would be much nicer to convert these point arrays to polygon / polyline objects within the database and use SQL to get a bool of whether they overlap or not.
I have seen different methods suggested to do this, but none of the examples given were quite in line with my needs.
I would be very happy to receive input from those kind enough to offer their experience.
Additional:
In response to questions: it is indeed 2D, and yes, any crossover of the two is considered true. The polygons have n points and can be concave. The polygons will be saved one per row (after the data conversion task) as polygons (i.e. the polygon type; it might be called something else, spatial / geom, my memory is not on my side right now).
You can use .STIntersection with .STAsText() to test for overlapping polygons. (I really hate the terminology Microsoft has used (or whoever set the standard terms). "Touching," in my mind, should be a test for whether or not two geometry/geography shapes overlap at all, not just share a border.)
Anyway....
If @RadiusGeom is a geometry representing a radius from a point, the following will return every polygon whose intersection with it (a geometry that represents the area where two geometries overlap) is not empty.
SELECT CT.ID AS CTID, CT.[Geom] AS CensusTractGeom
FROM CensusTracts CT
WHERE CT.[Geom].STIntersection(@RadiusGeom).STAsText() <> 'GEOMETRYCOLLECTION EMPTY'
If your geometry field is spatially indexed, this runs pretty quickly. I ran this on 66,000 US CT records in about 3 seconds. There may be a better way, but since no one else had an answer, this was my attempt at an answer for you. Hope it helps!
Calculate and store the bounding rectangle of each polygon in a set of new fields within the row which is associated with that polygon. (I assume you have one; if not, create one.) When your dotnet app has a polygon and is looking for overlapping polygons, it can fetch from the database only those polygons whose bounding rectangles overlap, using a relatively simple SQL SELECT statement. Those polygons should be relatively few, so this will be efficient. Then, your dotnet app can perform the finer polygon overlap calculations in order to determine which ones of those really overlap.
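The bounding-rectangle prefilter described above can be sketched as follows. This is an illustrative Python version rather than the SQL SELECT or the .NET code; the names bounding_box, boxes_overlap and candidates are hypothetical, and polygons are assumed to be lists of (x, y) tuples on a flat plane.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs, ys = zip(*polygon)
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """True if two rectangles share at least one point (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidates(target, others):
    """IDs of polygons in `others` whose bounding box overlaps the target's box."""
    target_box = bounding_box(target)
    return [pid for pid, poly in others.items()
            if boxes_overlap(target_box, bounding_box(poly))]


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (2, 2), (0, 2)]
    stored = {
        "a": [(1, 1), (3, 1), (3, 3), (1, 3)],  # box overlaps the square's box
        "b": [(5, 5), (6, 5), (6, 6)],          # box is far away
    }
    print(candidates(square, stored))
```

Overlapping boxes are necessary but not sufficient for overlapping polygons, especially concave ones, so the survivors still need the exact polygon test.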
Okay, I got another idea, so I am posting it as a different answer. I think my previous answer with the bounding polygons probably has some merit on its own, even if it was to reduce the number of polygons fetched from the database by a small percentage, but this one is probably better.
MSSQL supports integration with the CLR since version 2005. This means that you can define your own data type in an assembly, register the assembly with MSSQL, and from that moment on MSSQL will be accepting your user-defined data type as a valid type for a column, and it will be invoking your assembly to perform operations with your user-defined data type.
An example article for this technique on the CodeProject: Creating User-Defined Data Types in SQL Server 2005
I have never used this mechanism, so I do not know details about it, but I presume that you should be able to either define a new operation on your data type, or perhaps overload some existing operation like "less-than", so that you can check if one polygon intersects another. This is likely to speed things up a lot.

GEOMETRY and GEOGRAPHY difference SQL Server 2008

What is difference between GEOMETRY and GEOGRAPHY in SQL Server 2008?
GEOMETRY is for planar spatial data (that is, data on a flat surface)
GEOGRAPHY is for terrestrial spatial data (that is, data on the (curved) surface of the Earth)
See eg here, here for more.
Read the wonderful manual at:
http://msdn.microsoft.com/en-us/library/bb933790.aspx
The geometry Data Type
The geometry data type (planar) supported by SQL Server conforms to the Open Geospatial Consortium (OGC) Simple Features for SQL Specification version 1.1.0.
For more information on OGC specifications, see the following:
* OGC Specifications, Simple Feature Access Part 1 - Common Architecture
* OGC Specifications, Simple Feature Access Part 2 - SQL Options
The geography Data Type
The geography data type (geodetic) stores ellipsoidal (round-earth) data, such as GPS latitude and longitude coordinates.
It comes down to what model of the earth you're using - a planar or geodetic one.
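A small numeric illustration of the two models, as a hedged Python sketch rather than SQL Server code: the planar function treats coordinates as x/y on a flat surface, while the geodetic one uses the spherical law of cosines on a sphere of mean Earth radius (a simplification of the ellipsoid that geography uses). The function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical simplification


def planar_distance(x1, y1, x2, y2):
    """Euclidean distance on a flat plane, in whatever units the inputs use."""
    return math.hypot(x2 - x1, y2 - y1)


def great_circle_m(lat1, lon1, lat2, lon2):
    """Distance in metres along the sphere, via the spherical law of cosines."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlam))
    # Clamp to the valid acos domain to absorb rounding error.
    return EARTH_RADIUS_M * math.acos(max(-1.0, min(1.0, cos_angle)))


if __name__ == "__main__":
    for lat in (0.0, 60.0):
        # One degree of longitude apart, at the same latitude.
        flat = planar_distance(0.0, lat, 1.0, lat)
        curved = great_circle_m(lat, 0.0, lat, 1.0)
        print(lat, flat, round(curved))
```

The planar result is one "unit" at both latitudes, while the round-earth result at 60 degrees north is roughly half the equatorial one, which is the kind of discrepancy you see when lat/lon data is stored as geometry.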

SQL Server units question

This may be a really dumb question, but...
What units does Geography.STLength return? The official MSDN page doesn't say anything about the units returned, and this blog entry here says STLength() returns a float indicating the length of the instance in units. Yes, that's right, it says it returns it in units.
Can anyone shed some light on what units STLength returns? Feet? Meters? Inches? Help!
The units are entirely dependent on the Spatial Reference ID (SRID) of the geography/geometry data being used. By convention, you would generally use an SRID of "0" for geometry types if all the data is in the same unit system.
However, usually the geography type uses an SRID of 4326, which is the reference ID of the latitude/longitude ellipsoidal earth coordinate system known as WGS 84. When you specify point coordinates in this system, it is in degrees of angle of latitude and longitude, rather than some distance from an origin. Length and area calculations on points in this reference system will return completely different results from geometric calculations on the exact same point positions (for a great example see Differences between Geography and Geometry here, and as for why this happens, see here).
So if your data columns were created with an SRID of "0", then the system is defined to be unitless and you would need some metadata about the data model to figure out the units. If they were defined with a real SRID, then you can use this query:
SELECT spatial_reference_id
, well_known_text
, unit_of_measure
, unit_conversion_factor
FROM sys.spatial_reference_systems
to check what units the SRID represents. Most are in metres, but a few are in feet.