OpenStreetMap Nominatim schema: neighborhood, suburb, city, state_district, etc - gps

I am trying out the OpenStreetMap Nominatim reverse-geocoder RESTful API.
Is there a definitive explanation of the address schema returned from the API? Some locations have different attributes in the 'addressparts' block.
For example, for Seattle, Nominatim includes "suburb" and "city" attributes.
<reversegeocode timestamp="Tue, 19 Nov 13 01:48:51 +0000" attribution="Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright" querystring="format=xml&lat=47.60647&lon=-122.32644&zoom=18&addressdetails=1">
<result place_id="3681763473" osm_type="node" osm_id="2159323135" lat="47.6065166" lon="-122.3262919">
725, 9th Avenue, First Hill, Seattle, King, Washington, 98104, United States of America
</result>
<addressparts>
<house_number>725</house_number>
<road>9th Avenue</road>
<suburb>First Hill</suburb>
<city>Seattle</city>
<county>King</county>
<state>Washington</state>
<postcode>98104</postcode>
<country>United States of America</country>
<country_code>us</country_code>
</addressparts>
</reversegeocode>
However, for New York City, it returns "neighbourhood" and "state_district" attributes instead.
<reversegeocode timestamp="Tue, 19 Nov 13 01:50:16 +0000" attribution="Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright" querystring="format=xml&lat=40.71812&lon=-73.98298&zoom=18&addressdetails=1">
<result place_id="5989088711" osm_type="node" osm_id="2273010097" ref="N.Y. Grill & Deli" lat="40.7184546" lon="-73.9828337">
N.Y. Grill & Deli, 208, Rivington Street, Lower East Side, New York City, New York, 10002, United States of America
</result>
<addressparts>
<address29>N.Y. Grill & Deli</address29>
<house_number>208</house_number>
<road>Rivington Street</road>
<neighbourhood>Lower East Side</neighbourhood>
<state_district>New York City</state_district>
<county>New York</county>
<state>New York</state>
<postcode>10002</postcode>
<country>United States of America</country>
<country_code>us</country_code>
</addressparts>
</reversegeocode>

As far as I know, there is currently little documentation apart from the Nominatim wiki page, its subpages, and the source code. These categories are generated from the admin_level key found on boundary=administrative relations, and from the place and addr tags. The wiki page for each of those keys contains some information about its possible values.
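Since the set of elements differs between locations, client code probably should not rely on any single one being present. Here is a minimal Python sketch of one way to cope, using trimmed copies of the two responses from the question. The function name pick_address_fields and the two fallback orders are my own assumptions for illustration; Nominatim itself does not define a preference order.

```python
import xml.etree.ElementTree as ET

# Fallback orders are assumptions for illustration; Nominatim does not define them.
LOCALITY_KEYS = ("city", "town", "village", "state_district")
DISTRICT_KEYS = ("neighbourhood", "suburb")

def pick_address_fields(xml_text):
    """Return a best-effort (district, locality) pair from a reverse-geocode response."""
    root = ET.fromstring(xml_text)
    parts = root.find("addressparts")
    if parts is None:
        return None, None
    # Map each child element name to its text, e.g. {"suburb": "First Hill", ...}
    found = {child.tag: (child.text or "").strip() for child in parts}
    district = next((found[k] for k in DISTRICT_KEYS if found.get(k)), None)
    locality = next((found[k] for k in LOCALITY_KEYS if found.get(k)), None)
    return district, locality

# Trimmed versions of the two responses shown in the question.
SEATTLE = """<reversegeocode>
<addressparts>
<suburb>First Hill</suburb>
<city>Seattle</city>
<county>King</county>
</addressparts>
</reversegeocode>"""

NEW_YORK = """<reversegeocode>
<addressparts>
<neighbourhood>Lower East Side</neighbourhood>
<state_district>New York City</state_district>
<county>New York</county>
</addressparts>
</reversegeocode>"""

print(pick_address_fields(SEATTLE))
print(pick_address_fields(NEW_YORK))
```

Treating state_district as a last-resort locality happens to work for the New York example, but it is a guess and may not be appropriate elsewhere.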

Related

Best way to add info/description to my items?

I made a geo game a while back where the player has to guess an item from an image (what I call an item is basically a SQL row). For example, the bot sends the flag of the Netherlands and you have to type "Netherlands" to win.
Items can be the flag of a country, a capital city, a French department...
I made an info tab that gives information about an item (i.e. region, former name, capital city, etc.).
What I would like to do is store this information properly. I don't really know if I should store it in files such as JSON, because I would also like to provide stats (win rate per region, number of games played per region, etc.).
Also, these attributes are not fixed: some items have regions, capital cities and so on, and some don't.
Item examples:
(For a flag)
Column        Attribute
ID            1
Name          United Kingdom
Former name   United Kingdom of Great Britain and Northern Ireland
Code          GB
Continent     Europe
Subregion     Northern Europe
Capital city  London
...
(For a U.S. State)
Column        Attribute
ID            1
Name          Arizona
Capital city  Phoenix
Largest city  Phoenix
...
Neither solution (adding every attribute as a column, or storing JSON) is the right approach here.
I think the best design is to have a key-value table.
CREATE TABLE tableName (ID INT, [Key] SYSNAME, [Value] NVARCHAR(MAX))
And the data will look like:
ID  Key           Value
1   Name          Arizona
1   Capital City  Phoenix
1   Largest City  Phoenix
2   Name          United Kingdom
2   Former name   United Kingdom of Great Britain and Northern Ireland
The main benefit: no storage is wasted on columns that would be NULL for a large number of rows.
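To make the idea concrete, here is a rough sketch of the key-value layout. It uses SQLite through Python's sqlite3 module rather than SQL Server, purely so that it runs standalone. The table name item_attribute, the column names, and the two helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (item, attribute); items simply omit attributes they do not have.
conn.execute(
    "CREATE TABLE item_attribute ("
    " item_id INTEGER, attr_name TEXT, attr_value TEXT,"
    " PRIMARY KEY (item_id, attr_name))"
)
conn.executemany(
    "INSERT INTO item_attribute VALUES (?, ?, ?)",
    [
        (1, "Name", "Arizona"),
        (1, "Capital City", "Phoenix"),
        (1, "Largest City", "Phoenix"),
        (2, "Name", "United Kingdom"),
        (2, "Former name", "United Kingdom of Great Britain and Northern Ireland"),
    ],
)

def attributes_for(item_id):
    """Collect every attribute of one item into a dict (for the info tab)."""
    cur = conn.execute(
        "SELECT attr_name, attr_value FROM item_attribute WHERE item_id = ?",
        (item_id,),
    )
    return dict(cur.fetchall())

def items_with(attr_name, attr_value):
    """Find the ids of items that carry a given attribute value."""
    cur = conn.execute(
        "SELECT item_id FROM item_attribute"
        " WHERE attr_name = ? AND attr_value = ? ORDER BY item_id",
        (attr_name, attr_value),
    )
    return [row[0] for row in cur]

print(attributes_for(1))
print(items_with("Capital City", "Phoenix"))
```

Per-region stats would then be a join between this table (filtered on the region attribute) and whatever table records the games, grouped by the attribute value.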

Filtering out records in a SQL table using rules in another SQL table

Data Table
Name Company Continent country state district
Tom HP Asia India Assam Kdk
George SAP Africa Sudan Chak ksk
Bill EBAY Europe Denmark Lekh Sip
Charles WM Asia India Haryana Jhat
Chip WM Asia India Punjab Chista
Chia WM Asia India Punjab Mast
Rule Table
Continent country state district Pass
Asia India ALL ALL Yes
Asia India Punjab ALL NO
Asia India Punjab Mast Yes
I have two tables in Hive. Depending on the rule I have to filter out the data in the data table.
In the rule table there is a column called pass which determines whether a record in data table needs to be filtered or not.
In this example there are different kinds of rules: some apply at a broader level and some at a narrower level.
The rules at the narrower level should not affect the rules at the broader level. In other words, a rule at the narrower level is an exception to a rule at the broader level.
For example, the rules table has 3 records. The first record is the rule at the broader level. The other two are at narrower levels.
The first rule says to pass all records that have country India, state any/all and district any/all.
The second rule says not to pass any record that has country India, state Punjab and district any/all.
The third rule says to pass all records that have country India, state Punjab and district Mast.
The second rule is an exception to the first rule. The third rule is an exception to the second rule.
Considering the data in the data table and the rules in the rules table, the Pass column will be as follows for the records with country India.
Name Company Continent country state district Pass
Tom HP Asia India Assam Kdk Yes
Charles WM Asia India Haryana Jhat Yes
Chia WM Asia India Punjab Mast Yes
Chip WM Asia India Punjab Chista No
This is just an example. In production the data will be different.
How do I implement this using SQL/Sql script?
Help is much appreciated.
You want the most specific rule. In Hive, you can use multiple left joins:
select d.*, coalesce(r1.pass, r2.pass, r3.pass)
from data d
left join rules r1
    on r1.Continent = d.Continent and
       r1.country = d.country and
       r1.state = d.state and
       r1.district = d.district
left join rules r2
    on r2.Continent = d.Continent and
       r2.country = d.country and
       r2.state = d.state and
       r2.district = 'ALL'
left join rules r3
    on r3.Continent = d.Continent and
       r3.country = d.country and
       r3.state = 'ALL' and
       r3.district = 'ALL';
You might want to continue with the LEFT JOINs if 'ALL' is allowed for continent and country.
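To check the three-join approach against the sample data from the question, here is a small sketch that runs the same query shape in SQLite through Python's sqlite3 module instead of Hive. That substitution is only so the example is self-contained; the evaluate helper and the lower-cased column names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data  (name TEXT, company TEXT, continent TEXT, country TEXT, state TEXT, district TEXT);
CREATE TABLE rules (continent TEXT, country TEXT, state TEXT, district TEXT, pass TEXT);
""")
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)", [
    ("Tom", "HP", "Asia", "India", "Assam", "Kdk"),
    ("George", "SAP", "Africa", "Sudan", "Chak", "ksk"),
    ("Bill", "EBAY", "Europe", "Denmark", "Lekh", "Sip"),
    ("Charles", "WM", "Asia", "India", "Haryana", "Jhat"),
    ("Chip", "WM", "Asia", "India", "Punjab", "Chista"),
    ("Chia", "WM", "Asia", "India", "Punjab", "Mast"),
])
conn.executemany("INSERT INTO rules VALUES (?, ?, ?, ?, ?)", [
    ("Asia", "India", "ALL", "ALL", "Yes"),
    ("Asia", "India", "Punjab", "ALL", "NO"),
    ("Asia", "India", "Punjab", "Mast", "Yes"),
])

# r1 = exact district match, r2 = state-level rule, r3 = country-level rule.
# COALESCE picks the most specific rule that exists for each data row.
QUERY = """
SELECT d.name, COALESCE(r1.pass, r2.pass, r3.pass) AS pass
FROM data d
LEFT JOIN rules r1
  ON r1.continent = d.continent AND r1.country = d.country
 AND r1.state = d.state AND r1.district = d.district
LEFT JOIN rules r2
  ON r2.continent = d.continent AND r2.country = d.country
 AND r2.state = d.state AND r2.district = 'ALL'
LEFT JOIN rules r3
  ON r3.continent = d.continent AND r3.country = d.country
 AND r3.state = 'ALL' AND r3.district = 'ALL'
"""

def evaluate():
    """Return {name: pass value}; None means no rule covers the row."""
    return {name: verdict for name, verdict in conn.execute(QUERY)}

for name, verdict in sorted(evaluate().items()):
    print(name, verdict)
```

Rows that no rule covers (George and Bill here) come back as NULL, so you still have to decide whether those should pass or be filtered out.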
@TomG: Please see the code below, in case it helps.
select * from TEMP_TESTING where country = 'India' and district <> 'Chista'
union
(select * from TEMP_TESTING where country = 'India' except
select * from TEMP_TESTING where country = 'India' and state = 'Punjab')
union
select * from TEMP_TESTING where country = 'India' and state = 'Punjab' and district = 'Mast'

spatial database query (the closest shop to a specific address)

Database name: landmarks
landmark(landmarkId(PK),landmarkName,type,GPSCoordinates,locationId)
Comments(commentId(PK),comment,landmarkId)
Location(locationId(PK),streetName,streetNumber,buildingNumber,suburbId)
Suburb(suburbId(PK),suburbName,cityId)
City(cityId(PK),cityName,stateId)
State(stateId(PK),stateName,countryId)
Country(countryId(PK),countryName)
I'm trying to write a query that returns the closest landmark of type "coffee shop" to a specific address.
For example:
The closest coffee shop to 54 George Street, Los Angeles, California, USA.
I'm new to spatial databases. What is the best optimised SQL query for this example?

Joining multiple fields between the same tables

I have a table called 'Resources' that looks like this:
Country City Street Headcount
UK Halifax High Street 20
United Kingdom Oxford High Street 30
Canada Halifax North St 40
Because of the nature of the location fields, I need to map them to a single 'Address' field, and so I also have the following table called 'Addresses':
Country City Street Address
UK Halifax High Street High Street, Halifax, UK
Canada Halifax North St North Street, Halifax, Canada
United Kingdom Oxford High Street High Street, Oxford, UK
(In reality the Address field does add information rather than just combining what is already there.)
I am currently using the following SQL to produce the query:
SELECT Resources.Country, Resources.City, Resources.Street, Addresses.Address,
Resources.Headcount
FROM Resources
INNER JOIN Addresses ON Resources.Country = Addresses.Country
AND Resources.City = Addresses.City
AND Resources.Street = Addresses.Street
This works for me, but I have not seen people use this many ANDs in a single join elsewhere, so I don't know whether it is a bad idea. (This is a simplified version; I may need up to 8 ANDs in a single join in another case.) Is this the best way to approach the problem, or is there a better solution?
Thanks
Joining on multiple columns is fine. You don't have to "fear" this.
As for "a better way": I would suggest creating some table variables, putting some sample data in them, and posting that T-SQL (DDL and DML) here. Then you can get some possible alternatives. At present, the "is there a better way" part of your question is too vague to answer.
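As a quick illustration that a multi-column join behaves as expected, here is a sketch with the sample rows from the question. It uses SQLite through Python's sqlite3 module rather than SQL Server so that it runs on its own. The composite primary key on Addresses is my own assumption (it is not in the question): it guarantees each Resources row matches at most one address and gives the join an index to use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Resources (Country TEXT, City TEXT, Street TEXT, Headcount INTEGER);
CREATE TABLE Addresses (
    Country TEXT, City TEXT, Street TEXT, Address TEXT,
    PRIMARY KEY (Country, City, Street)  -- assumption: one address per combination
);
""")
conn.executemany("INSERT INTO Resources VALUES (?, ?, ?, ?)", [
    ("UK", "Halifax", "High Street", 20),
    ("United Kingdom", "Oxford", "High Street", 30),
    ("Canada", "Halifax", "North St", 40),
])
conn.executemany("INSERT INTO Addresses VALUES (?, ?, ?, ?)", [
    ("UK", "Halifax", "High Street", "High Street, Halifax, UK"),
    ("Canada", "Halifax", "North St", "North Street, Halifax, Canada"),
    ("United Kingdom", "Oxford", "High Street", "High Street, Oxford, UK"),
])

def joined_rows():
    """Run the three-column join from the question, ordered by headcount."""
    cur = conn.execute("""
        SELECT r.Country, r.City, r.Street, a.Address, r.Headcount
        FROM Resources r
        INNER JOIN Addresses a
            ON r.Country = a.Country
           AND r.City = a.City
           AND r.Street = a.Street
        ORDER BY r.Headcount
    """)
    return cur.fetchall()

for row in joined_rows():
    print(row)
```

Note that the two Halifax rows are kept apart only because Country and Street are part of the join condition, which is exactly why all the ANDs are needed.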

Is there a publicly available list of the US States in machine readable form?

Where can I find a list of the US States in a form for importing into my database?
SQL would be ideal, otherwise CSV or some other flat file format is fine.
Edit: Complete with the two letter state codes
I needed this a few weeks ago and put it on my blog as SQL and tab-delimited text. The data was sourced from Wikipedia in early January, so it should be up to date.
US States: http://www.john.geek.nz/index.php/2009/01/sql-tips-list-of-us-states/
I use the World's Simplest Code Generator if I need to add columns or remove some of the fields - http://secretgeek.net/wscg.asp
I've also done Countries of the world and International Dialling Codes too.
Countries: http://www.john.geek.nz/index.php/2009/01/sql-tips-list-of-countries/
IDC's: http://www.john.geek.nz/index.php/2009/01/sql-tips-list-of-international-dialling-codes-idcs/
Edit: New: Towns and cities of New Zealand
Depending on why you need the states, it is worth keeping in mind that there are more than 50 valid state codes. For someone deployed outside the USA, it is annoying to come across websites that do not allow address entry with perfectly valid state codes like AE and AP. A better resource would be USPS.
Cut/Paste these into notepad and then import..should be easy enough - there are only 50 after all:
Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
Florida
Georgia
Hawaii
Idaho
Illinois
Indiana
Iowa
Kansas
Kentucky
Louisiana
Maine
Maryland
Massachusetts
Michigan
Minnesota
Mississippi
Missouri
Montana
Nebraska
Nevada
New Hampshire
New Jersey
New Mexico
New York
North Carolina
North Dakota
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Vermont
Virginia
Washington
West Virginia
Wisconsin
Wyoming
Out of interest: as there are only 50 and they rarely change, couldn't you just manually create such a list from a source and put it on a public webspace?
In response to @cspoe7's astute observation, here is a query with all valid states and their abbreviations according to USPS. I have sorted them here by category (official US states, District of Columbia, US territories, military "states") and then alphabetically.
INSERT INTO State (Name, Abbreviation)
VALUES
('Alabama','AL'), -- States
('Alaska','AK'),
('Arizona','AZ'),
('Arkansas','AR'),
('California','CA'),
('Colorado','CO'),
('Connecticut','CT'),
('Delaware','DE'),
('Florida','FL'),
('Georgia','GA'),
('Hawaii','HI'),
('Idaho','ID'),
('Illinois','IL'),
('Indiana','IN'),
('Iowa','IA'),
('Kansas','KS'),
('Kentucky','KY'),
('Louisiana','LA'),
('Maine','ME'),
('Maryland','MD'),
('Massachusetts','MA'),
('Michigan','MI'),
('Minnesota','MN'),
('Mississippi','MS'),
('Missouri','MO'),
('Montana','MT'),
('Nebraska','NE'),
('Nevada','NV'),
('New Hampshire','NH'),
('New Jersey','NJ'),
('New Mexico','NM'),
('New York','NY'),
('North Carolina','NC'),
('North Dakota','ND'),
('Ohio','OH'),
('Oklahoma','OK'),
('Oregon','OR'),
('Pennsylvania','PA'),
('Rhode Island','RI'),
('South Carolina','SC'),
('South Dakota','SD'),
('Tennessee','TN'),
('Texas','TX'),
('Utah','UT'),
('Vermont','VT'),
('Virginia','VA'),
('Washington','WA'),
('West Virginia','WV'),
('Wisconsin','WI'),
('Wyoming','WY'),
('District of Columbia','DC'),
('American Samoa','AS'), -- Territories
('Federated States of Micronesia','FM'),
('Guam','GU'),
('Marshall Islands','MH'),
('Northern Mariana Islands','MP'),
('Palau','PW'),
('Puerto Rico','PR'),
('Virgin Islands','VI'),
('Armed Forces Africa','AE'), -- Armed Forces
('Armed Forces Americas','AA'),
('Armed Forces Canada','AE'),
('Armed Forces Europe','AE'),
('Armed Forces Middle East','AE'),
('Armed Forces Pacific','AP');
If you need to memorize them, let Wakko help you :)
You can download a lot of lists on http://www.freebase.com/ .
http://www.geonames.org/export/
The GeoNames geographical database is available for download free of charge under a creative commons attribution license. It contains over eight million geographical names and consists of 6.5 million unique features whereof 2.2 million populated places and 1.8 million alternate names. All features are categorized into one out of nine feature classes and further subcategorized into one out of 645 feature codes. (more statistics ...).
The data is accessible free of charge through a number of webservices and a daily database export.
You could use Google Sets to make a list of all the states, as well as lists of more or less anything else.
If you only need a SQL Server script for the 50 states plus the District of Columbia and Puerto Rico (52 rows in total), you can use the following query:
INSERT INTO
States ( StateName )
VALUES
( 'Alabama'),
( 'Alaska'),
( 'Arizona'),
( 'Arkansas'),
( 'California'),
( 'Colorado'),
( 'Connecticut'),
( 'Delaware'),
( 'District of Columbia'),
( 'Florida'),
( 'Georgia'),
( 'Hawaii'),
( 'Idaho'),
( 'Illinois'),
( 'Indiana'),
( 'Iowa'),
( 'Kansas'),
( 'Kentucky'),
( 'Louisiana'),
( 'Maine'),
( 'Maryland'),
( 'Massachusetts'),
( 'Michigan'),
( 'Minnesota'),
( 'Mississippi'),
( 'Missouri'),
( 'Montana'),
( 'Nebraska'),
( 'Nevada'),
( 'New Hampshire'),
( 'New Jersey'),
( 'New Mexico'),
( 'New York'),
( 'North Carolina'),
( 'North Dakota'),
( 'Ohio'),
( 'Oklahoma'),
( 'Oregon'),
( 'Pennsylvania'),
( 'Puerto Rico'),
( 'Rhode Island'),
( 'South Carolina'),
( 'South Dakota'),
( 'Tennessee'),
( 'Texas'),
( 'Utah'),
( 'Vermont'),
( 'Virginia'),
( 'Washington'),
( 'West Virginia'),
( 'Wisconsin'),
( 'Wyoming');
I'm just going to put this list of the US states here in a bash/Linux-friendly, pipe-delimited format, to save someone some time:
alabama|alaska|arizona|arkansas|california|colorado|connecticut|delaware|florida|georgia|hawaii|idaho|illinois|indiana|iowa|kansas|kentucky|louisiana|maine|maryland|massachusetts|michigan|minnesota|mississippi|missouri|montana|nebraska|nevada|newhampshire|newjersey|newmexico|newyork|northcarolina|northdakota|ohio|oklahoma|oregon|pennsylvania|rhodeisland|southcarolina|southdakota|tennessee|texas|utah|vermont|virginia|washington|westvirginia|wisconsin|wyoming
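If you ever need to regenerate a string like this, or use it as a regex alternation, something along these lines should work. It is only a sketch: STATE_NAMES is a short illustrative subset rather than the full list, and the helper names to_pattern and is_state are made up.

```python
import re

# A short subset for illustration; substitute the full list of state names.
STATE_NAMES = ["New Hampshire", "New York", "Ohio", "West Virginia"]

def normalize(name):
    """Lower-case a name and drop spaces, matching the format of the list above."""
    return name.lower().replace(" ", "")

def to_pattern(names):
    """Join the normalized names with '|' so they form a regex alternation."""
    return "|".join(re.escape(normalize(name)) for name in names)

PATTERN = to_pattern(STATE_NAMES)
STATE_RE = re.compile(PATTERN)

def is_state(token):
    """True only if the whole token is one of the names, not merely a substring."""
    return STATE_RE.fullmatch(normalize(token)) is not None

print(PATTERN)
print(is_state("New York"), is_state("York"))
```

Using fullmatch rather than search matters here, because otherwise "virginia" would also match inside "westvirginia".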