Retrieve closest road when given (lat, long) using OSM in Postgres with Postgis using SQL query - sql

Given a set (lat, long) I am trying to find the maximum speed using "max_speed" and street type using "highway".
I have loaded my database (Postgres and Postgis) as follows:
$ osm2pgsql -c -d gis --slim -C 50000 /var/lib/postgresql/data/germany-latest.osm.pbf
The closest related question I could find was How to query all shops around a certain longitude/latitude using osm-postgis?. I have taken the query, and plugged in a (lat, long) that I found in google maps for the city center of Munich (as the post was also related to city center Munich and I have the map for Germany). The result turns up empty.
gis=# SELECT name, shop FROM planet_osm_point WHERE ST_DWithin(way ,ST_SetSrid(ST_Point(48.137969, 11.573829), 900913), 100);
name | shop
------+------
(0 rows)
Also when looking into the planet_osm_nodes, which contains (lat, long) pairs directly, I end up with no results:
gis=# SELECT * FROM planet_osm_nodes WHERE ((lat BETWEEN 470000000 AND 490000000) AND (lon BETWEEN 100000000 AND 120000000)) LIMIT 10;
id | lat | lon | tags
----+-----+-----+------
(0 rows)
I verified the data is in my database:
gis=# SELECT COUNT(*) FROM planet_osm_point;
count
---------
9924531
(1 row)
and
gis=# SELECT COUNT(*) FROM planet_osm_nodes;
count
-----------
288597897
(1 row)
So ideally, my question would be
Q: How can I find the "max speed" and "highway" given a set (lat, lon)
alternatively, my questions is:
Q: How do I get the query from the other stack overflow post to work?
My best guess is that I need to transform my (lat, lon) in some way, or that I simply have the wrong data for whatever reason.
Edit: added sample data as requested:
gis=# SELECT * FROM planet_osm_point LIMIT 1;
osm_id | access | addr:housename | addr:housenumber | addr:interpolation | admin_level | aerialway | aeroway | amenity | area | barrier | bicycle | brand | bridge | boundary | building | capital | construction | covered | culvert |
cutting | denomination | disused | ele | embankment | foot | generator:source | harbour | highway | historic | horse | intermittent | junction | landuse | layer | leisure | lock | man_made | military | motorcar | name | natural | off
ice | oneway | operator | place | poi | population | power | power_source | public_transport | railway | ref | religion | route | service | shop | sport | surface | toll | tourism | tower:type | tunnel | water | waterway | wetland | wi
dth | wood | z_order | way
-----------+--------+----------------+------------------+--------------------+-------------+-----------+---------+---------+------+---------+---------+-------+--------+----------+----------+---------+--------------+---------+---------+
---------+--------------+---------+-----+------------+------+------------------+---------+----------+----------+-------+--------------+----------+---------+-------+---------+------+----------+----------+----------+------+---------+----
----+--------+----------+-------+-----+------------+-------+--------------+------------------+---------+-----+----------+-------+---------+------+-------+---------+------+---------+------------+--------+-------+----------+---------+---
----+------+---------+----------------------------------------------------
304070863 | | | | | | | | | | | | | | | | | | | |
| | | | | | | | crossing | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | | | | | |
| | | 010100002031BF0D0048E17A94F19F2941CDCCCCDCC60D5741
(1 row)
and
gis=# SELECT * FROM planet_osm_nodes LIMIT 1;
id | lat | lon | tags
--------+-----------+----------+------
234100 | 666501948 | 80442755 |
(1 row)
Edit 2: There was a mention regarding "SRID", so I added example data from another table:
gis=# SELECT * FROM spatial_ref_sys LIMIT 1;
srid | auth_name | auth_srid | srtext
| proj4text
------+-----------+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------
3819 | EPSG | 3819 | GEOGCS["HD1909",DATUM["Hungarian_Datum_1909",SPHEROID["Bessel 1841",6377397.155,299.1528128,AUTHORITY["EPSG","7004"]],TOWGS84[595.48,121.69,515.35,4.115,-2.9383,0.853,-3.408],AUTHORITY["EPSG","1024"]],PR
IMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","3819"]] | +proj=longlat +ellps=bessel +towgs84=595.48,121.69,515.35,4.115,-2.9383,0.853,-3.408 +no_defs
(1 row)

Geometry in PostGIS has a different ordering of (lat long) first is going longitude then latitude.
Also if you want to transform a point from one SRID to another use st_transfrom(), not ST_SetSrid.
ST_Transform relly transform your data from one coordinates system to another.
select st_astext(st_transform(ST_SetSrid(ST_Point(11.573829,48.137969), 4326),900913))
ST_SetSrid - just change SRID for the object.
select st_astext((ST_SetSrid(ST_Point(11.573829,48.137969),900913)
So, you have to change your SQL that way
SELECT name, shop
FROM planet_osm_point
WHERE ST_DWithin(way,st_transform(ST_SetSrid(ST_Point(11.573829,48.137969), 4326),900913), 100);

Related

Loop to find multiple minimum and maximum values

I have a table (tblProduct) with a field (SerialNum).
I am trying to find multiple minimum and maximum values from the field SerialNum, or better put: ranges of sequential serial numbers.
The serial numbers are 5 digits and a letter. Most of the values are sequential, but NOT all!
I need the output for a report to look something like:
00001A - 00014A
00175A - 00180A
00540A - 00549A
12345A - 12349A
04500B - 04503B
04522B - 04529B
04595B
04627B - 04631B
If the values in-between are present.
I tried a loop, but I realized I was using record sets. I need one serial num to be compared to ALL the ranges. Record sets were looking at one range.
I have been able to determine the max and min of the entire series, but not of each sequential group.
| SerialNum |
| -------- |
| 00001A|
| 00002A|
| 00003A|
| 00004A|
| 00005A|
| 00006A|
| 00007A|
| 00008A|
| 00009A|
| 00010A|
| 00011A|
| 00012A|
| 00013A|
| 00014A|
| 00175A|
| 00176A|
| 00177A|
| 00178A|
| 00179A|
| 00180A|
| 00540A|
| 00541A|
| 00542A|
| 00543A|
| 00544A|
| 00545A|
| 00546A|
| 00547A|
| 00548A|
| 00549A|
| 12345A|
| 12346A|
| 12347A|
| 12348A|
| 12349A|
| 04500B|
| 04501B|
| 04502B|
| 04503B|
| 04522B|
| 04523B|
| 04524B|
| 04525B|
| 04526B|
| 04527B|
| 04528B|
| 04529B|
| 04595B|
| 04627B|
| 04628B|
| 04629B|
| 04630B|
| 04631B|
Try to group by the number found with Val:
Select
Min(SerialNum) As MinimumSerialNum,
Max(SerialNum) As MaximumSerialNum
From
tblProduct
Group By
Val(SerialNum)

How to get location from latitude and longitude in T-SQL?

I have a mapping table of locationId along with their center latitude and center longitude value like below-
| Location Id | Center_lat | Center_long |
|-------------|------------|-------------|
| 1 | 50.546 | 88.344 |
| 2 | 48.546 | 86.344 |
| 3 | 52.546 | 89.344 |
I have another table where I am getting continuous location data with latitude and longitude for user like below -
+---------+------------+-------------+
| User Id | Center_lat | Center_long |
+---------+------------+-------------+
| 101 | 50.446 | 88.314 |
| 102 | 48.446 | 86.314 |
| 103 | 52.446 | 89.314 |
+---------+------------+-------------+
I want to get the locationId of all users if their latitude and longitude values lies within 1000 meters of lat-long values corresponding to location id. How can I get it done in T-SQL?
Final table should like below -
+---------+------------+-------------+------------+
| User Id | Center_lat | Center_long | LocationId |
+---------+------------+-------------+------------+
| 101 | 50.546 | 88.344 | 1 |
| 102 | 48.546 | 86.344 | 2 |
| 103 | 52.546 | 89.344 | 3 |
+---------+------------+-------------+------------+
You can use convert the latitude/longitude pairs to geography objects, and then use stdistance():
select u.*, l.location_id
from users u
inner join locations l
on geography::point(u.center_lat, u.center_long, 4326).stdistance(geography::point(l.center_lat, l.center_long, 4326)) < 1000
Note that it would be much more efficient to store this information as points to start with.

PostgreSQL: show trips within a bounding box

I have a trips table containing user's trip information, like so:
select * from trips limit 10;
trip_id | daily_user_id | session_ids | seconds_start | lat_start | lon_start | seconds_end | lat_end | lon_end | distance
---------+---------------+-------------+---------------+------------+------------+-------------+------------+------------+------------------
594221 | 16772 | {170487} | 1561324555 | 41.1175475 | -8.6298934 | 1561325119 | 41.1554091 | -8.6283493 | 5875.39697884959
563097 | 7682 | {128618} | 1495295471 | 41.1782829 | -8.5950303 | 1495299137 | 41.1783908 | -8.5948965 | 5364.81067787512
596303 | 17264 | {172851} | 1578011699 | 41.5195598 | -8.6393526 | 1578012513 | 41.4614024 | -8.717709 | 11187.7956426909
595648 | 17124 | {172119} | 1575620857 | 41.1553116 | -8.6439528 | 1575621885 | 41.1621821 | -8.6383042 | 1774.83365424607
566061 | 8720 | {133624} | 1509005051 | 41.1241975 | -8.5958988 | 1509006310 | 41.1424158 | -8.6101461 | 3066.40306678979
566753 | 8947 | {134662} | 1511127813 | 41.1887996 | -8.5844238 | 1511129839 | 41.2107519 | -8.5511712 | 5264.64026582458
561179 | 7198 | {125861} | 1493311197 | 41.1776935 | -8.5947254 | 1493311859 | 41.1773815 | -8.5947254 | 771.437257541019
541328 | 2119 | {46950} | 1461103381 | 41.1779 | -8.5949738 | 1461103613 | 41.1779129 | -8.5950202 | 177.610819150637
535519 | 908 | {6016} | 1460140650 | 41.1644658 | -8.6422775 | 1460141201 | 41.1642646 | -8.6423309 | 1484.61552373019
548460 | 3525 | {102026} | 1462289206 | 41.177689 | -8.594679 | 1462289843 | 41.1734476 | -8.5916326 | 1108.05119077308
(10 rows)
The task is to filter trips that start and end within the bounding box defined by upper left: 41.24895, -8.68494 and lower right: 41.11591, -8.47569.
If I understand correctly, you can just compare that starting and ending coordinates:
select t.*
from trips t
where lat_start >= 41.11591 and lat_start <= 41.24895 and
lat_end >= 41.11591 and lat_end <= 41.24895 and
long_start >= -8.68494 and long_start <= -8.47569 and
long_end >= -8.68494 and long_end <= -8.47569
Since your coordinates are stored in x,y columns, you have to use ST_MakePoint to create a proper geometry. After that, you can create a BBOX using the function ST_MakeEnvelope and check if start and end coordinates are inside the BBOX using ST_Contains, e.g.
WITH bbox(geom) AS (
VALUES (ST_MakeEnvelope(-8.68494,41.24895,-8.47569,41.11591,4326))
)
SELECT * FROM trips,bbox
WHERE
ST_Contains(bbox.geom,ST_SetSRID(ST_MakePoint(lon_start,lat_start),4326)) AND
ST_Contains(bbox.geom,ST_SetSRID(ST_MakePoint(lon_end,lat_end),4326));
Note: the CTE isn't really necessary and is in the query just for illustration purposes. You can repeat the ST_MakeEnvelope function on both conditions in the WHERE clause instead of bbox.geom. This query also assumes the SRS WGS84 (4326).

Get similar employees based on their attribute values

Consider the following sample table("Customer") with these records
=========
Customer
=========
-----------------------------------------------------------------------------------------------
| customer-id | att-a | att-b | att-c | att-d | att-e | att-f | att-g | att-h | att-i | att-j |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-1 | att-a-7 | att-b-3 | att-c-10 | att-d-10 | att-e-15 | att-f-11 | att-g-2 | att-h-7 | att-i-5 | att-j-14 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-2 | att-a-9 | att-b-7 | att-c-12 | att-d-4 | att-e-10 | att-f-4 | att-g-13 | att-h-4 | att-i-1 | att-j-13 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-3 | att-a-10 | att-b-6 | att-c-1 | att-d-1 | att-e-13 | att-f-12 | att-g-9 | att-h-6 | att-i-7 | tt-j-4 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| customer-19 | att-a-7 | att-b-9 | att-c-13 | att-d-5 | att-e-8 | att-f-5 | att-g-12 | att-h-14 | att-i-13 | att-j-15 |
--------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
I have these records and many more records dumped into SQL database and wanted to find top 10 similar customer based on the attribute value. For example customer-1 and customer-19 have atleast one column value matching .i.e "att-a-7" so the output should give me 2 customer-id's or top similar customer that are customer-1 and customer-19.
P.S - there can be one or more columns similar across rows.
I'm using windowing technique to find top 10 similar customer and im not sure if I'm correct.
following is my approach I used in my query :
row_number() over (partition by att-a, att-b,..,att-j order by customer-id) as customers
is this correct. ?

Query from secondary index on aerospike

I'm considering aerospike for one of our projects. So I currently created a 3 node cluster and loaded some data on it.
Sample data
ns: imei
set: imei_data
+-------------------+-----------------------+-----------------------+----------------------------+--------------+--------------+
| imsi | fcheck | lcheck | msc | fcheck_epoch | lcheck_epoch |
+-------------------+-----------------------+-----------------------+----------------------------+--------------+--------------+
| "413010324064956" | "2017-03-01 14:30:26" | "2017-03-01 14:35:30" | "13d20b080011044917004100" | 1488358826 | 1488359130 |
| "413012628090023" | "2016-09-21 10:06:49" | "2017-09-16 13:54:40" | "13dc0b080011044917006100" | 1474432609 | 1505550280 |
| "413010130130320" | "2016-12-29 22:05:07" | "2017-10-09 16:17:10" | "13d20b080011044917003100" | 1483029307 | 1507546030 |
| "413011330114274" | "2016-09-06 01:48:06" | "2017-10-09 11:53:41" | "13d20b080011044917003100" | 1473106686 | 1507530221 |
| "413012629781993" | "2017-08-16 16:03:01" | "2017-09-13 18:10:48" | "13dc0b080011044917004100" | 1502879581 | 1505306448 |
Then I created a secondary index on lcheck_epoch using AQL since I want to query based on date.
create index idx_lcheck on imei.imei_data (lcheck_epoch) NUMERIC
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
| ns | bin | indextype | set | state | indexname | path | type |
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
| "imei" | "lcheck_epoch" | "NONE" | "imei_data" | "RW" | "idx_lcheck" | "lcheck_epoch" | "NUMERIC" |
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
When I execute
select imsi from imei.imei_data where idx_lcheck=1476165806
I'm getting
Error: (204) AEROSPIKE_ERR_INDEX
Please explain.
You're using the index name, not the bin name, in your query. Try this:
SELECT imsi FROM imei.imei_data WHERE lcheck_epoch=1476165806
Or
SELECT imsi FROM imei.imei_data WHERE lcheck_epoch BETWEEN 1490000000 AND 1510000000
Just a note, you can do much more complex queries using predicate filtering through several of the language clients (Java, C, C#, Go). For example the PredExp class of the Java client (see examples.)