Difference between Geometry and Geometry collections - sql

Can anyone explain the difference between
$table->geometry('positions'); and $table->geometryCollection('positions'); in Laravel?

I'm assuming your database is MySQL. If so, please add it as a tag to your post.
According to the manual:
https://dev.mysql.com/doc/refman/5.7/en/spatial-type-overview.html
Some spatial data types hold single geometry values:
GEOMETRY
GEOMETRY can store geometry values of any type. The other single-value types (POINT, LINESTRING, and POLYGON) restrict their values to a particular geometry type.
and
The other spatial data types hold collections of values:
GEOMETRYCOLLECTION
GEOMETRYCOLLECTION can store a collection of objects of any type. The other collection types (MULTIPOINT, MULTILINESTRING, and MULTIPOLYGON) restrict collection members to those having a particular geometry type.
The difference is that a GEOMETRYCOLLECTION value can hold several geometry objects (points, lines, polygons, in any mix) in a single column, whereas a GEOMETRY value holds exactly one object. This is useful if you want to store, for instance, a location together with a route as one value. Note that a single shape such as a square is just one POLYGON, so it fits in a GEOMETRY column too.
A practical example for the first one would be a location, which is a single set of coordinates (lat + lng).
Sadly, I can't give you much information on how to use them within Laravel. However, in raw SQL it basically looks like this:
-- Add data to a GEOMETRY column
SET @g = 'POINT(1 1)';
INSERT INTO geo VALUES (ST_GeomFromText(@g));
-- Add data to a GEOMETRYCOLLECTION column
SET @g = 'GEOMETRYCOLLECTION(POINT(1 1),LINESTRING(0 0,1 1,2 2,3 3,4 4))';
INSERT INTO geocol VALUES (ST_GeomFromText(@g));
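To make the "one object versus several objects" distinction concrete, here is a minimal Python sketch that builds the same two WKT strings used in the SQL above. The class names (Point, LineString, GeometryCollection) are hypothetical and only model the idea; they are not part of MySQL or Laravel.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Point:
    x: float
    y: float

    def wkt(self) -> str:
        # A single geometry: one coordinate pair.
        return f"POINT({self.x:g} {self.y:g})"


@dataclass
class LineString:
    points: List[Point]

    def wkt(self) -> str:
        # Still a single geometry, even though it has many vertices.
        coords = ",".join(f"{p.x:g} {p.y:g}" for p in self.points)
        return f"LINESTRING({coords})"


@dataclass
class GeometryCollection:
    # A collection wraps any number of geometries of any type.
    members: List[Union[Point, LineString]]

    def wkt(self) -> str:
        return "GEOMETRYCOLLECTION(" + ",".join(m.wkt() for m in self.members) + ")"


single = Point(1, 1)
collection = GeometryCollection(
    [Point(1, 1), LineString([Point(i, i) for i in range(5)])]
)

print(single.wkt())
print(collection.wkt())
```

The first value is what a GEOMETRY column would hold, the second what a GEOMETRYCOLLECTION column would hold.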

Related

Compute minimum of a nested struct column while preserving the schema in Spark/Scala

Suppose I have a dataframe df with a particular column c that is a struct with several nested fields inside it (these could be other structs, integers, strings, etc.). I do not know these fields beforehand, so I want a general solution.
I want to compute the minimum of this column. Currently I am doing this - val min_df = df.agg(min(c).as("min_col"))
This returns a dataframe min_df with one row and one column. Unfortunately, the schema of min_df ends up being a subset of the original schema of df(c), since some fields and values do not exist in the minimum value. I want it to be the same as the schema of df(c), since I want to compare this minimum value with some other quantities later on.
I already tried something like spark.createDataFrame(min_df.rdd, schema=df.select('c').schema), but this isn't working.
How can I go about computing the minimum/maximum so that the schema is preserved in this case?

How to find city name from latitude and longitude in Postgresql?

I have a database which has latitudes and longitudes of various properties stored. I want to find out, which city does each of these properties belong to (all properties are in the US).
With PostgreSQL (and the PostGIS extension), you first need to get a shapefile of US city boundaries. Possible sources are
https://www.census.gov/geo/maps-data/data/tiger.html
https://www.census.gov/geo/maps-data/data/tiger-cart-boundary.html
https://catalog.data.gov/dataset?tags=cities
After that, import the data into Postgres. I am assuming that your properties data is already stored in Postgres. Make sure the SRID of the city boundary geometries is 4326; if not, you can convert them easily with the ST_Transform function.
Finally, to check which city a specific lat/long falls in, you need to convert the lat/long into a point geometry and check it against the cities data, e.g. something like this:
SELECT c.city_name FROM cities_boundaries AS c, properties AS p
WHERE ST_CONTAINS(c.geom, ST_SetSRID(ST_MakePoint(p.longitude, p.latitude), 4326))
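For intuition about what the containment test does, here is a minimal pure-Python sketch of a point-in-polygon check using ray casting. It is a planar approximation, not PostGIS's actual implementation, and the city names and rectangular boundaries are hypothetical, not real data.

```python
from typing import Dict, List, Optional, Tuple

Ring = List[Tuple[float, float]]  # polygon vertices as (longitude, latitude)


def contains(ring: Ring, lon: float, lat: float) -> bool:
    """Ray casting: toggle 'inside' for every edge crossed by a ray going east."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def city_for(cities: Dict[str, Ring], lon: float, lat: float) -> Optional[str]:
    """Return the first city whose boundary contains the point, else None."""
    for name, ring in cities.items():
        if contains(ring, lon, lat):
            return name
    return None


# Hypothetical rectangular "city" boundaries.
cities = {
    "Squareville": [(-100.0, 40.0), (-99.0, 40.0), (-99.0, 41.0), (-100.0, 41.0)],
    "Boxtown": [(-98.0, 40.0), (-97.0, 40.0), (-97.0, 41.0), (-98.0, 41.0)],
}

print(city_for(cities, -99.5, 40.5))
print(city_for(cities, -90.0, 40.5))
```

As in the SQL, note the argument order: longitude is x and latitude is y. In the database, a spatial index on the boundary geometries would avoid scanning every city for every property.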

Storing and querying location (coords) in PostgreSQL

In my user schema I have a location column which will hold the last updated coordinates of the user. Is there a PostgreSQL data type I could use to allow me to run SQL queries to search, for example something like: SELECT username FROM Users WHERE currentLocation="20 Miles Away From $1"
If your "location" column holds GPS coordinates then basically your only option is to use the PostGIS extension. With PostGIS you store your locations in a column of the geometry or geography data type, and then you can apply a rich set of functions to it. Your query would become something like:
SELECT username
FROM users
WHERE ST_Distance(currentlocation, $1) <= 20 * 1609.344; -- 20 miles in meters, assuming the geography type
This assumes that $1 is of the same data type as currentlocation. You have to convert the miles to whatever unit the function returns, which depends on the data type and the coordinate reference system (geography produces meters, geometry produces whatever unit the data is in).
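To show the unit handling outside the database, here is a minimal Python sketch of a "within 20 miles" filter using the haversine formula on a spherical Earth. It is an approximation of what a geography distance computes, not the PostGIS algorithm, and the user names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def users_within(users, lat, lon, max_miles=20.0):
    """Names of users whose last known location is within max_miles of (lat, lon)."""
    return [
        name
        for name, (ulat, ulon) in users.items()
        if haversine_miles(lat, lon, ulat, ulon) <= max_miles
    ]


# Hypothetical users with (latitude, longitude) of their last update.
users = {
    "alice": (40.7128, -74.0060),   # New York City
    "bob": (40.7357, -74.1724),     # Newark, NJ
    "carol": (34.0522, -118.2437),  # Los Angeles
}

print(users_within(users, 40.7128, -74.0060))
```

The comparison is "at most 20 miles" rather than "exactly 20", since an equality test on a floating-point distance would almost never match.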

Column that shows number of elements in another col (Int Array) SQL (postgres 8.3)

I have a column of Int Array. I want to add another column to the table that always shows the number of elements in that array for that row. It should update this value automatically. Is there a way to embed a function as the default value? If so, how would this function know where to pick its argument (the int array column/row number)?
In a normalized table you would not include this functionally dependent and redundant information as a separate column.
It is easy and fast enough to compute it on the fly:
SELECT array_dims ('{1,2,3}'::int[]);
Or:
SELECT array_length('{1,2,3}'::int[], 1);
array_length() was introduced in PostgreSQL 8.4. Maybe an incentive to upgrade? 8.3 is going out of service soon.
With Postgres 8.3 you can use:
SELECT array_upper('{1,2,3}'::int[], 1);
But that's inferior, because the array index can start with any number, if entered explicitly. array_upper() would not tell the actual length then, you would have to subtract array_lower() first. Also note, that in PostgreSQL arrays can always contain multiple dimensions, regardless of how many dimensions have been declared. I quote the manual here:
The current implementation does not enforce the declared number of
dimensions either. Arrays of a particular element type are all
considered to be of the same type, regardless of size or number of
dimensions. So, declaring the array size or number of dimensions in
CREATE TABLE is simply documentation; it does not affect run-time
behavior.
(True for 8.3 and 9.1 alike.) That's why I mentioned array_dims() first, to give a complete picture.
Details about array functions in the manual.
You may want to create a view to include that functionally dependent column:
CREATE VIEW v_tbl AS
SELECT arr_col, array_length(arr_col, 1) AS arr_len
FROM tbl;

MySQL command to search CSV (or similar array)

I'm trying to write an SQL query that would search within a CSV (or similar) array in a column. Here's an example:
insert into properties set
bedrooms = 1,2,3 (or 1-3)
title = nice property
price = 500
I'd like to then search where bedrooms = 2+. Is this even possible?
The correct way to handle this in SQL is to add another table for a multi-valued property. It's against the relational model to store multiple discrete values in a single column. Since it's intended to be a no-no, there's little support for it in the SQL language.
The only workaround for finding a given value in a comma-separated list is to use regular expressions, which are in general ugly and slow. You have to deal with edge cases like when a value may or may not be at the start or end of the string, as well as next to a comma.
SELECT * FROM properties WHERE bedrooms RLIKE '[[:<:]]2[[:>:]]';
There are other types of queries that are easy when you have a normalized table, but hard with the comma-separated list. The example you give, of searching for a value that is equal to or greater than the search criteria, is one such case. Also consider:
How do I delete one element from a comma-separated list?
How do I ensure the list is in sorted order?
What is the average number of rooms?
How do I ensure the values in the list are even valid entries? E.g. what's to prevent me from entering "1,2,banana"?
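With a second table, the questions above turn into ordinary queries. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table name property_bedrooms and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE properties (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        price INTEGER NOT NULL
    );
    -- One row per (property, bedroom count) instead of a CSV string.
    CREATE TABLE property_bedrooms (
        property_id INTEGER NOT NULL REFERENCES properties(id),
        bedrooms    INTEGER NOT NULL,
        PRIMARY KEY (property_id, bedrooms)
    );
    INSERT INTO properties VALUES (1, 'nice property', 500), (2, 'studio', 300);
    INSERT INTO property_bedrooms VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")


def properties_with_at_least(conn, min_bedrooms):
    """Titles of properties offering at least min_bedrooms bedrooms."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.title
        FROM properties AS p
        JOIN property_bedrooms AS b ON b.property_id = p.id
        WHERE b.bedrooms >= ?
        ORDER BY p.title
        """,
        (min_bedrooms,),
    ).fetchall()
    return [title for (title,) in rows]


print(properties_with_at_least(conn, 2))
```

Deleting one value from the "list" becomes a single-row DELETE on property_bedrooms, and the primary key rules out duplicate entries.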
If you don't want to create a second table, then come up with a way to represent your data with a single value.
More accurately, I should say I recommend that you represent your data with a single value per column, and Mike Atlas' solution accomplishes that.
Generally, this isn't how you should be storing data in a relational database.
Perhaps you should have a MinBedroom and a MaxBedroom column, e.g. to find properties that offer 2 bedrooms:
SELECT * FROM properties WHERE MinBedroom <= 2 AND MaxBedroom >= 2;