spatial database query (the closest shop to a specific address) - sql

Database name: landmarks
landmark(landmarkId(PK),landmarkName,type,GPSCoordinates,locationId)
Comments(commentId(PK),comment,landmarkId)
Location(locationId(PK),streetName,streetNumber,buildingNumber,suburbId)
Suburb(suburbId(PK),suburbName,cityId)
City(cityId(PK),cityName,stateId)
State(stateId(PK),stateName,countryId)
Country(countryId(PK),countryName)
I'm trying to write a query that returns the closest landmark of type "coffee shop" to a specific address.
For example:
The closest coffee shop to 54 George Street, Los Angeles, California, USA.
I'm new to spatial databases. What is the best optimised SQL query for this example?
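A minimal sketch of the nearest-landmark lookup, using Python's built-in sqlite3 so it can be run as-is. It assumes things the question does not state: that GPSCoordinates is split into hypothetical lat and lon columns, that the address has already been geocoded to a latitude/longitude pair, and that a haversine distance is accurate enough. The rows are made-up sample data. A real spatial database would normally use its own geography type and a spatial index instead of scanning every row.

```python
import math
import sqlite3

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same name.
conn.create_function("haversine_km", 4, haversine_km)
conn.executescript("""
CREATE TABLE landmark (
    landmarkId   INTEGER PRIMARY KEY,
    landmarkName TEXT,
    type         TEXT,
    lat          REAL,  -- assumption: GPSCoordinates split into lat/lon
    lon          REAL
);
INSERT INTO landmark VALUES
    (1, 'Cafe A',   'coffee shop', 34.0500, -118.2500),
    (2, 'Cafe B',   'coffee shop', 34.1000, -118.3000),
    (3, 'Museum C', 'museum',      34.0501, -118.2501);
""")

def closest_landmark(conn, lat, lon, landmark_type):
    """Return (name, distance_km) of the nearest landmark of the given type."""
    return conn.execute(
        """SELECT landmarkName, haversine_km(?, ?, lat, lon) AS km
           FROM landmark
           WHERE type = ?
           ORDER BY km
           LIMIT 1""",
        (lat, lon, landmark_type),
    ).fetchone()

# Hypothetical coordinates standing in for the geocoded address.
address_lat, address_lon = 34.0522, -118.2437
nearest = closest_landmark(conn, address_lat, address_lon, "coffee shop")
print(nearest)
```

Filtering on type before ordering by distance keeps the museum out of the result even though it is nearly as close as the nearest cafe.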

Related

Geocoding for Data Manipulation

Use data from KY’s Department of Alcoholic Beverage Control (ABC) to calculate the availability of alcohol in Fayette county’s neighborhoods. In particular, for each neighborhood, calculate the rate of liquor licenses per capita. Show the top 20 neighborhoods with the highest rate of alcohol availability. Show the top 20 neighborhoods with the highest number of licenses. Discuss whether or not these two top-20 lists differ and how. Define a neighborhood as a US Census Bureau tract.
I was trying to manipulate the ABC data to group the licenses by neighbourhood, converting Census tracts to neighbourhoods so that I can perform the calculations. But I could only get the longitude and latitude.
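Once each license has been assigned to a tract (for instance by testing its latitude/longitude against the tract boundaries, a step not shown here), the two rankings reduce to a count and a division. The sketch below uses invented tract IDs and populations purely for illustration; `license_tracts`, `tract_population` and `top_n` are hypothetical names.

```python
from collections import Counter

# Hypothetical inputs: one tract ID per license, and population per tract.
license_tracts = ["T1", "T1", "T2", "T3", "T3", "T3"]
tract_population = {"T1": 1000, "T2": 250, "T3": 6000}

def top_n(scores, n=20):
    """Return the n (tract, score) pairs with the highest score."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Number of licenses in each tract.
license_counts = Counter(license_tracts)

# Licenses per capita; tracts with zero population are skipped.
rates = {
    tract: license_counts[tract] / pop
    for tract, pop in tract_population.items()
    if pop > 0
}

top_by_count = top_n(dict(license_counts))
top_by_rate = top_n(rates)
print(top_by_count)
print(top_by_rate)
```

In this toy data the two lists are ordered differently: the tract with the most licenses also has the largest population, so it ranks last by rate.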

Best way to add info/description to my items?

I made a geo game a while back where the player has to guess an item from an image (what I call an item is basically a SQL row). For example, the bot sends the flag of the Netherlands and you have to type "Netherlands" to win.
Items can be the flag of a country, a capital city, a French department...
I made an info tab that gives information about an item (i.e. region, former name, capital city, etc.).
What I would like to do is save this information properly. I don't really know if I should store it in files like JSON, because I would also like to give stats (win rate per region, number of games played per region, etc.).
Also, these elements are not fixed: some items have regions, capital cities and so on, and some don't.
Item examples:
(For a flag)
Column       | Attribute
-------------+-----------------------------------------------------
ID           | 1
Name         | United Kingdom
Former name  | United Kingdom of Great Britain and Northern Ireland
Code         | GB
Continent    | Europe
Subregion    | Northern Europe
Capital city | London
...
(For a U.S. State)
Column       | Attribute
-------------+-----------
ID           | 1
Name         | Arizona
Capital city | Phoenix
Largest city | Phoenix
...
Neither solution (adding everything as columns, or storing JSON) is the proper way.
I think the best design is to have a key-value table.
CREATE TABLE tableName (ID INT, [Key] SYSNAME, [Value] NVARCHAR(MAX))
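And the data will look like:
ID | Key          | Value
---+--------------+-----------------------------------------------------
1  | Name         | Arizona
1  | Capital City | Phoenix
1  | Largest City | Phoenix
2  | Name         | United Kingdom
2  | Former name  | United Kingdom of Great Britain and Northern Ireland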
The most valuable benefit: no storage is wasted on columns that would be NULL for a large number of rows.
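The key-value layout above can be tried out with a quick sketch in Python's built-in sqlite3. The table name `item_attribute`, the composite primary key and the helper `item_info` are illustrative choices, not part of the answer; the rows are the ones listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_attribute (
    ID      INTEGER,
    "Key"   TEXT,
    "Value" TEXT,
    PRIMARY KEY (ID, "Key")  -- assumption: one value per attribute per item
);
INSERT INTO item_attribute VALUES
    (1, 'Name',         'Arizona'),
    (1, 'Capital City', 'Phoenix'),
    (1, 'Largest City', 'Phoenix'),
    (2, 'Name',         'United Kingdom'),
    (2, 'Former name',  'United Kingdom of Great Britain and Northern Ireland');
""")

def item_info(conn, item_id):
    """Collect all attributes of one item into a dict for the info tab."""
    cur = conn.execute(
        'SELECT "Key", "Value" FROM item_attribute WHERE ID = ?', (item_id,)
    )
    return dict(cur.fetchall())

# How many items have each attribute; absent attributes simply have no row.
attribute_counts = dict(
    conn.execute(
        'SELECT "Key", COUNT(*) FROM item_attribute GROUP BY "Key"'
    ).fetchall()
)

print(item_info(conn, 1))
print(attribute_counts)
```

Item 2 has no "Capital City" row at all, which is the point of the design: a missing attribute costs nothing, and per-attribute statistics become a plain GROUP BY.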

Dataset interpretation Continuous vs Categorical for House Prices

I'm working with the UK house price dataset and want to create an ML model to predict the price of a house based on the city (plus some other categories).
As a newb to all of this, I am stumped. I am fine creating models with continuous variables, or even carrying out one-hot encoding (dummy variables) for some of the other categories which have 4 different options (type of house for example).
However, when it comes to cities, there are about 1200 different cities in the data set and so I am not sure how to engineer the data to deal with this.
Would greatly appreciate anyone having any idea about this!
No matter how much I search, I can't find an answer to this, but this could perhaps be due to not knowing exactly what to search for.
In my view, you need a grade for every city and a base price for each house.
For example:
City        | City Grade
------------+------------
Los Angeles | 1
New York    | 4

House       | Price
------------+------------
Option1     | $200,000
Option2     | $300,000
Then calculate the house price for a city by multiplying the house price by the city grade.
So the Option1 house in Los Angeles would still be $200,000, but in New York it would be $800,000 (and Option2 in New York would be $1,200,000).
You don't need to worry about the 1200 cities; they are easy to query in a database.
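The answer does not say where the city grades come from. One possibility, shown here only as a sketch, is to derive them from the data as each city's mean price divided by the overall mean price. This is a form of what is usually called target (mean) encoding, and it replaces the 1200 city categories with a single numeric column. The function name and the sample sales are invented for illustration; in a real model the grades should be computed from the training split only, so that test prices do not leak into the feature.

```python
from collections import defaultdict

def city_grades(sales):
    """Map each city to (mean price in that city) / (overall mean price).

    sales: list of (city, price) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # city -> [sum of prices, count]
    for city, price in sales:
        totals[city][0] += price
        totals[city][1] += 1
    overall_mean = sum(price for _, price in sales) / len(sales)
    return {
        city: (total / count) / overall_mean
        for city, (total, count) in totals.items()
    }

# Hypothetical training rows.
sales = [
    ("Leeds", 100_000),
    ("Leeds", 200_000),
    ("London", 450_000),
    ("London", 450_000),
]
grades = city_grades(sales)
print(grades)
```

A grade above 1 marks a city that is more expensive than average, and the grade can be fed to the model as an ordinary continuous variable.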

Joining multiple fields between the same tables

I have a table called 'Resources' that looks like this:
Country        | City    | Street      | Headcount
---------------+---------+-------------+----------
UK             | Halifax | High Street | 20
United Kingdom | Oxford  | High Street | 30
Canada         | Halifax | North St    | 40
Because of the nature of the location fields, I need to map them to a single 'Address' field, and so I also have the following table called 'Addresses':
Country        | City    | Street      | Address
---------------+---------+-------------+------------------------------
UK             | Halifax | High Street | High Street, Halifax, UK
Canada         | Halifax | North St    | North Street, Halifax, Canada
United Kingdom | Oxford  | High Street | High Street, Oxford, UK
(In reality the Address field does add information rather than just combining what is already there.)
I am currently using the following SQL to produce the query:
SELECT Resources.Country, Resources.City, Resources.Street, Addresses.Address,
Resources.Headcount
FROM Resources
INNER JOIN Addresses ON Resources.Country = Addresses.Country
AND Resources.City = Addresses.City
AND Resources.Street = Addresses.Street
This works for me, but I am worried because I have not seen people use this many ANDs in a single join elsewhere, so I don't know if it is a bad idea. (This is a simplified version. I may need up to 8 ANDs in a single join in another case.) Is this the best way to approach the problem, or is there a better solution?
Thanks
Joining on multiple columns is fine. You don't have to "fear" this.
As for "a better way": I would suggest creating some table variables, putting some sample data in them, and posting that T-SQL (DDL and DML) here. Then you can get some possible alternatives. At present, the "is there a better way" part of your question is too vague to answer.
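In the spirit of that suggestion, here is a small reproducible sketch of the question's two tables and its three-column join, run through Python's built-in sqlite3 rather than T-SQL so it needs no server. The composite primary key on Addresses is an assumption added for illustration: it guarantees at most one address per (Country, City, Street), so the join cannot multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Resources (Country TEXT, City TEXT, Street TEXT, Headcount INTEGER);
CREATE TABLE Addresses (
    Country TEXT, City TEXT, Street TEXT, Address TEXT,
    PRIMARY KEY (Country, City, Street)  -- assumption: one address per key
);
INSERT INTO Resources VALUES
    ('UK',             'Halifax', 'High Street', 20),
    ('United Kingdom', 'Oxford',  'High Street', 30),
    ('Canada',         'Halifax', 'North St',    40);
INSERT INTO Addresses VALUES
    ('UK',             'Halifax', 'High Street', 'High Street, Halifax, UK'),
    ('Canada',         'Halifax', 'North St',    'North Street, Halifax, Canada'),
    ('United Kingdom', 'Oxford',  'High Street', 'High Street, Oxford, UK');
""")

# Same join as in the question, with table aliases to shorten the conditions.
rows = conn.execute("""
    SELECT r.Country, r.City, r.Street, a.Address, r.Headcount
    FROM Resources AS r
    INNER JOIN Addresses AS a
        ON  r.Country = a.Country
        AND r.City    = a.City
        AND r.Street  = a.Street
    ORDER BY r.Headcount
""").fetchall()

for row in rows:
    print(row)
```

All three columns are needed here: joining on City alone would match the UK Halifax row to the Canadian address as well.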

SQL: update all values of X to Y based on a translation table

I have a Microsoft SQL server DB that imports some data which needs a bit of cleanup; some fields need to be remapped based on a second table. For example:
Table: Data
User  | Country
------+----------
Alice | Australia
Bob   | Sydney
Carol | London
Dave  | London
Table: Translations
From   | To
-------+---------------
Sydney | Australia
London | United Kingdom
Unfortunately cleaning up the source data is not an option, and this import happens daily so manually changing it is not practical.
What is the easiest way to iterate through the Translations table, so that for each pair it runs something that is effectively "UPDATE Data SET Country = $TO WHERE Country = $FROM"? If this can be done with a stored procedure, that would be ideal. I have a feeling there is a nicely simple way to do this with SQL, but it's beyond my SQL skills and I can't find an answer by searching (probably because it has a really trivial name I don't know :-) )
Update Data
Set data.Country = Translations.[To]
From Data
Inner Join Translations
On data.Country = Translations.[from]
Haven't tried it live, but this may work:
UPDATE Data SET Country = (SELECT [To] FROM Translations WHERE [From] = Data.Country)
WHERE Country IN (SELECT [From] FROM Translations)
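The set-based update can be checked against the question's sample data with a short sketch in Python's built-in sqlite3. SQLite quotes reserved words with double quotes where SQL Server uses square brackets, and it is used here only so the example is runnable without a server. The WHERE clause matters: without it, rows with no matching translation (Alice) would have Country set to NULL by the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Data ("User" TEXT, Country TEXT);
CREATE TABLE Translations ("From" TEXT PRIMARY KEY, "To" TEXT);
INSERT INTO Data VALUES
    ('Alice', 'Australia'),
    ('Bob',   'Sydney'),
    ('Carol', 'London'),
    ('Dave',  'London');
INSERT INTO Translations VALUES
    ('Sydney', 'Australia'),
    ('London', 'United Kingdom');
""")

# One statement remaps every row that has a translation; no loop is needed.
conn.execute("""
    UPDATE Data
    SET Country = (SELECT "To" FROM Translations WHERE "From" = Data.Country)
    WHERE Country IN (SELECT "From" FROM Translations)
""")

result = dict(conn.execute('SELECT "User", Country FROM Data').fetchall())
print(result)
```

Because the statement is idempotent on this data, it can be rerun after each daily import.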