I am doing an SQL JOIN ... ON in which the join column of the second table is an array of values rather than a single value, and I get the error shown below. Specifically, I am joining the following two tables.
TABLE: location
+------------+------------+---------+----------+
| session_id | gpstime | lat | lon |
+------------+------------+---------+----------+
| 49 | 1458203595 | 39.7449 | -8.8052 |
| 59 | 1458203601 | 39.7438 | -8.8057 |
| 95 | 1458203602 | 39.7438 | -8.8056 |
| 49 | 1458203602 | 39.7438 | -8.8057 |
+------------+------------+---------+----------+
TABLE: trips
+-------------+-----------+---------+-----------+---------+-------------+
| session_ids | lat_start | lat_end | lon_start | lon_end | travel_mode |
+-------------+-----------+---------+-----------+---------+-------------+
| {49} | 39.7449 | 41.1782 | -8.8053 | -8.5946 | car |
| {59,60} | 41.1551 | 41.1542 | -8.6294 | -8.6247 | foot |
| {94,95} | 41.1545 | 40.7636 | -8.6273 | -8.1729 | bike |
+-------------+-----------+---------+-----------+---------+-------------+
Here's the query I used:
SELECT gpstime, lat, lon, travel_mode
FROM location
INNER JOIN trips
ON session_id = session_ids
WHERE (lat BETWEEN SYMMETRIC lat_start AND lat_end)
AND (lon BETWEEN SYMMETRIC lon_start AND lon_end);
Error:
ERROR: operator does not exist: integer = integer[]
LINE 4: ON session_id = session_ids
How do I fix the issue?
The = operator can only compare two values of compatible types. Here you are trying to compare an integer value with an integer array, so the value 1 cannot equal a value that looks like {1,2}.
You can use the = ANY(...) construct, which checks whether the left value is an element of the right array:
demo:db<>fiddle
ON session_id = ANY(session_ids)
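For readers who think in Python terms, the matching rule of = ANY(...) can be sketched roughly as follows. This is plain Python over the sample rows from the question, not PostgreSQL itself; the helper name join_on_any is made up for illustration, and only the ON condition is modelled (the BETWEEN SYMMETRIC filter from the WHERE clause is left out).

```python
# Sample rows copied from the question's tables (only the columns needed here).
location = [
    {"session_id": 49, "gpstime": 1458203595},
    {"session_id": 59, "gpstime": 1458203601},
    {"session_id": 95, "gpstime": 1458203602},
    {"session_id": 49, "gpstime": 1458203602},
]
trips = [
    {"session_ids": [49], "travel_mode": "car"},
    {"session_ids": [59, 60], "travel_mode": "foot"},
    {"session_ids": [94, 95], "travel_mode": "bike"},
]


def join_on_any(location_rows, trip_rows):
    """Pair each location row with every trip whose array contains its session_id."""
    return [
        (loc["gpstime"], trip["travel_mode"])
        for loc in location_rows
        for trip in trip_rows
        # Python analogue of: ON session_id = ANY(session_ids)
        if loc["session_id"] in trip["session_ids"]
    ]


# A scalar never equals a whole list, which mirrors the
# "integer = integer[]" mismatch behind the error message.
print(49 == [49])
print(join_on_any(location, trips))
```

Each location row is matched by membership in the array, so a trip with several session ids can pick up rows from any of them.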
S-Man is correct, although you can also use the ANY function as described here.
For more information on the differences between using IN and ANY/ALL, read this question.
I am working with a table in which one column is defined as something like a list type (if you come from Python, as I do). I retrieved its CREATE statement via pg_dump:
CREATE TABLE sensemyfeup.trips (
trip_id integer NOT NULL,
daily_user_id integer,
session_ids integer[],
seconds_start integer,
lat_start double precision,
lon_start double precision,
seconds_end integer,
lat_end double precision,
lon_end double precision,
distance double precision
);
I am referring to the column session_ids. Its contents look like:
SELECT * FROM trips LIMIT 5;
trip_id | daily_user_id | session_ids | seconds_start | lat_start | lon_start | seconds_end | lat_end | lon_end | distance
---------+---------------+---------------+---------------+------------+------------+-------------+------------+------------+------------------
540797 | 2169 | {43350} | 1461056108 | 41.1250659 | -8.5993936 | 1461056424 | 41.1221733 | -8.6004883 | 412.658565594423
546128 | 3096 | {84659,84663} | 1461847953 | 41.1787939 | -8.6078294 | 1461849730 | 41.1840573 | -8.6033242 | 3469.92906971906
536069 | 1080 | {9837} | 1460293763 | 41.1836186 | -8.6001802 | 1460294099 | 41.1836725 | -8.6001787 | 47.7817179218928
537711 | 1373 | {17641,17689} | 1460590761 | 41.1477454 | -8.611109 | 1460593908 | 41.1477451 | -8.6111093 | 1081.61337507529
542407 | 2254 | {53112} | 1461173383 | 40.9853811 | -8.5205261 | 1461173677 | 40.9873266 | -8.5003848 | 2224.13368208515
As we can see in the session_ids column, some records have 1 value, some multiple.
How do I get summary statistics of how many rows have 1 session_ids value, how many have 2, and so on?
We can use cardinality to count the number of elements in session_ids and then group by that count.
select cardinality(session_ids) as number_of_session_id_values
,count(*)
from t
group by cardinality(session_ids)
 number_of_session_id_values | count
-----------------------------+-------
                           2 |     2
                           1 |     3
Fiddle
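Since the question mentions a Python background, the same grouping can be sketched in plain Python to show what the query computes. This is only an illustration over the five sample rows shown in the question; the variable names are invented here.

```python
from collections import Counter

# session_ids values of the five sample rows shown in the question.
session_ids_column = [
    [43350],
    [84659, 84663],
    [9837],
    [17641, 17689],
    [53112],
]

# len(ids) plays the role of cardinality(session_ids);
# Counter performs the GROUP BY ... count(*) step.
counts = Counter(len(ids) for ids in session_ids_column)

for n_values, n_rows in sorted(counts.items()):
    print(n_values, n_rows)
```

On the sample data this gives three rows with one value and two rows with two values, matching the query result above.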
I have a record of users' trips with begin/end positions and time in a table like this:
CREATE TABLE trips(id integer, start_timestamp timestamp with time zone,
session_id integer, start_lat double precision,
start_lon double precision, end_lat double precision,
end_lon double precision, mode integer);
INSERT INTO trips (id, start_timestamp, session_id, start_lat,start_lon,end_lat,end_lon,mode)
VALUES (563097015,'2017-05-20 17:47:12+01', 128618, 41.1783308,-8.5949878, 41.1784478, -8.5948463, 0),
(563097013, '2017-05-20 17:45:29+01', 128618, 41.1781344, -8.5951169, 41.1782919, -8.5950689, 0),
(563097011, '2017-05-20 17:43:41+01', 128618, 41.1781196, -8.5954075, 41.1782139, -8.5950689, 0),
(563097009, '2017-05-20 17:41:48+01', 128618, 41.1782497, -8.595197, 41.1781101, -8.5954124, 0),
(563097003, '2017-05-20 17:10:29+01', 128618, 41.1832512, -8.6081606, 41.1782561, -8.5950259, 0)
The second table holds the raw GPS traces for all the trips, similar to:
CREATE TABLE gps_traces (session_id integer, seconds integer, lat double precision,
lon double precision, speed double precision);
INSERT INTO gps_traces (session_id, seconds , lat , lon , speed )
VALUES (128618,1495296443,41.1844471,-8.6065158,1.35148),
(128618,1495296444,41.1844482,-8.6065303,1.28004),
(128618,1495296445,41.1844572,-8.6065503,1.46086),
(128618,1495296446,41.1844541,-8.6065691,1.23),
(128618,1495296446,41.1844589,-8.6065861, 1.22919),
(128618,1495296447,41.1844587, -8.6066043, 1.30188),
(128618, 1495296448, 41.1844604, -8.6066261, 1.43126),
(128618, 1495296449, 41.184471, -8.6066412, 1.55003),
(128618,1495296450, 41.1844715, -8.6066572, 1.29062),
(128618,1495296450, 41.1844707, -8.6066736, 1.3618)
From this I want to create a new table mytable containing the GPS data, by joining these tables on session_id, like so:
CREATE TABLE mytable AS SELECT id, seconds, lat, lon, speed, mode
FROM trips t
JOIN gps_traces g
ON t.session_id=g.session_id
However, in the new table, I want to ensure that for rows recorded twice at the same Unix timestamp in a trip, only one is selected. For example, in this case:
SELECT * FROM mytable WHERE id = 563097003;
+-----------+------------+------------+------------+---------+------+
| id | seconds | lat | lon | speed | mode |
+-----------+------------+------------+------------+---------+------+
| 563097003 | 1495296443 | 41.1844471 | -8.6065158 | 1.35148 | 0 |
| 563097003 | 1495296444 | 41.1844482 | -8.6065303 | 1.28004 | 0 |
| 563097003 | 1495296445 | 41.1844572 | -8.6065503 | 1.46086 | 0 |
| 563097003 | 1495296446 | 41.1844541 | -8.6065691 | 1.23 | 0 |
| 563097003 | 1495296446 | 41.1844589 | -8.6065861 | 1.22919 | 0 |
| 563097003 | 1495296447 | 41.1844587 | -8.6066043 | 1.30188 | 0 |
| 563097003 | 1495296448 | 41.1844604 | -8.6066261 | 1.43126 | 0 |
| 563097003 | 1495296449 | 41.184471 | -8.6066412 | 1.55003 | 0 |
| 563097003 | 1495296450 | 41.1844715 | -8.6066572 | 1.29062 | 0 |
| 563097003 | 1495296450 | 41.1844707 | -8.6066736 | 1.3618 | 0 |
| 10 rows | | | | | |
+-----------+------------+------------+------------+---------+------+
Column seconds is the Unix timestamp. As shown, the timestamps 1495296446 and 1495296450 each appear in two rows. I would like to ensure that, for each trip, only rows with a unique timestamp are selected into the new table (so in the case above, only one record per duplicated timestamp should be selected). I illustrate that in this db<>fiddle.
EDIT
Expected output:
+-----------+------------+------------+------------+---------+------+
| id | seconds | lat | lon | speed | mode |
+-----------+------------+------------+------------+---------+------+
| 563097003 | 1495296443 | 41.1844471 | -8.6065158 | 1.35148 | 0 |
| 563097003 | 1495296444 | 41.1844482 | -8.6065303 | 1.28004 | 0 |
| 563097003 | 1495296445 | 41.1844572 | -8.6065503 | 1.46086 | 0 |
| 563097003 | 1495296446 | 41.1844541 | -8.6065691 | 1.23 | 0 |
| 563097003 | 1495296447 | 41.1844587 | -8.6066043 | 1.30188 | 0 |
| 563097003 | 1495296448 | 41.1844604 | -8.6066261 | 1.43126 | 0 |
| 563097003 | 1495296449 | 41.184471 | -8.6066412 | 1.55003 | 0 |
| 563097003 | 1495296450 | 41.1844715 | -8.6066572 | 1.29062 | 0 |
| 8 rows | | | | | |
+-----------+------------+------------+------------+---------+------+
Use DISTINCT ON:
CREATE TABLE mytable AS
SELECT DISTINCT ON (t.session_id, seconds) id, seconds, lat, lon, speed, mode
FROM trips t JOIN
gps_traces g
ON t.session_id = g.session_id
ORDER BY t.session_id, seconds;
Note: I would expect you to include session_id in the new table as well.
Thanks to @Abelisto, it turns out that the following modification to this answer works as intended.
CREATE TABLE mytable AS
SELECT DISTINCT ON (id, seconds) id, seconds, lat, lon, speed, mode
FROM trips t
JOIN gps_traces g
ON t.session_id = g.session_id
ORDER BY id, seconds;
Here is a db<>fiddle.
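To make the effect of DISTINCT ON (id, seconds) concrete, here is a rough plain-Python sketch of the same idea: keep one row per (id, seconds) key. It uses the ten GPS rows of trip 563097003 from the question; the function name is hypothetical. One caveat worth stating: in PostgreSQL, which of the duplicate rows survives is unpredictable unless the ORDER BY lists further columns after the DISTINCT ON keys, whereas this sketch always keeps the first row in input order.

```python
# (seconds, lat, lon, speed) for the GPS rows of trip 563097003 from the question.
gps_rows = [
    (1495296443, 41.1844471, -8.6065158, 1.35148),
    (1495296444, 41.1844482, -8.6065303, 1.28004),
    (1495296445, 41.1844572, -8.6065503, 1.46086),
    (1495296446, 41.1844541, -8.6065691, 1.23),
    (1495296446, 41.1844589, -8.6065861, 1.22919),
    (1495296447, 41.1844587, -8.6066043, 1.30188),
    (1495296448, 41.1844604, -8.6066261, 1.43126),
    (1495296449, 41.184471, -8.6066412, 1.55003),
    (1495296450, 41.1844715, -8.6066572, 1.29062),
    (1495296450, 41.1844707, -8.6066736, 1.3618),
]


def distinct_on_id_seconds(trip_id, rows):
    """Keep the first row seen for each (trip_id, seconds) key, sorted by key."""
    first_seen = {}
    for seconds, lat, lon, speed in rows:
        # setdefault stores a row only if the key has not been seen yet.
        first_seen.setdefault((trip_id, seconds), (trip_id, seconds, lat, lon, speed))
    return [first_seen[key] for key in sorted(first_seen)]


deduplicated = distinct_on_id_seconds(563097003, gps_rows)
print(len(deduplicated))
```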
I have a trips table containing users' trip information, like so:
select * from trips limit 10;
trip_id | daily_user_id | session_ids | seconds_start | lat_start | lon_start | seconds_end | lat_end | lon_end | distance
---------+---------------+-------------+---------------+------------+------------+-------------+------------+------------+------------------
594221 | 16772 | {170487} | 1561324555 | 41.1175475 | -8.6298934 | 1561325119 | 41.1554091 | -8.6283493 | 5875.39697884959
563097 | 7682 | {128618} | 1495295471 | 41.1782829 | -8.5950303 | 1495299137 | 41.1783908 | -8.5948965 | 5364.81067787512
596303 | 17264 | {172851} | 1578011699 | 41.5195598 | -8.6393526 | 1578012513 | 41.4614024 | -8.717709 | 11187.7956426909
595648 | 17124 | {172119} | 1575620857 | 41.1553116 | -8.6439528 | 1575621885 | 41.1621821 | -8.6383042 | 1774.83365424607
566061 | 8720 | {133624} | 1509005051 | 41.1241975 | -8.5958988 | 1509006310 | 41.1424158 | -8.6101461 | 3066.40306678979
566753 | 8947 | {134662} | 1511127813 | 41.1887996 | -8.5844238 | 1511129839 | 41.2107519 | -8.5511712 | 5264.64026582458
561179 | 7198 | {125861} | 1493311197 | 41.1776935 | -8.5947254 | 1493311859 | 41.1773815 | -8.5947254 | 771.437257541019
541328 | 2119 | {46950} | 1461103381 | 41.1779 | -8.5949738 | 1461103613 | 41.1779129 | -8.5950202 | 177.610819150637
535519 | 908 | {6016} | 1460140650 | 41.1644658 | -8.6422775 | 1460141201 | 41.1642646 | -8.6423309 | 1484.61552373019
548460 | 3525 | {102026} | 1462289206 | 41.177689 | -8.594679 | 1462289843 | 41.1734476 | -8.5916326 | 1108.05119077308
(10 rows)
The task is to filter trips that start and end within the bounding box defined by upper left: 41.24895, -8.68494 and lower right: 41.11591, -8.47569.
If I understand correctly, you can just compare the starting and ending coordinates against the box limits (note that the columns are named lon_start and lon_end):
select t.*
from trips t
where lat_start >= 41.11591 and lat_start <= 41.24895 and
lat_end >= 41.11591 and lat_end <= 41.24895 and
lon_start >= -8.68494 and lon_start <= -8.47569 and
lon_end >= -8.68494 and lon_end <= -8.47569
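The comparison logic can be checked outside the database with a small plain-Python sketch. It assumes the bounding box from the question and uses two of the sample trips; the function names in_bbox and trip_in_bbox are invented for this illustration.

```python
# Bounding box from the question: upper left (41.24895, -8.68494),
# lower right (41.11591, -8.47569).
LAT_MIN, LAT_MAX = 41.11591, 41.24895
LON_MIN, LON_MAX = -8.68494, -8.47569


def in_bbox(lat, lon):
    """True if the point lies inside the box (edges included)."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def trip_in_bbox(trip):
    """True only if both the start and the end of the trip are inside the box."""
    return (in_bbox(trip["lat_start"], trip["lon_start"])
            and in_bbox(trip["lat_end"], trip["lon_end"]))


# Two rows taken from the sample output in the question.
sample_trips = [
    {"trip_id": 563097, "lat_start": 41.1782829, "lon_start": -8.5950303,
     "lat_end": 41.1783908, "lon_end": -8.5948965},
    {"trip_id": 596303, "lat_start": 41.5195598, "lon_start": -8.6393526,
     "lat_end": 41.4614024, "lon_end": -8.717709},
]

kept = [t["trip_id"] for t in sample_trips if trip_in_bbox(t)]
print(kept)
```

Trip 596303 starts north of the box, so only trip 563097 passes the filter.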
Since your coordinates are stored in x,y columns, you have to use ST_MakePoint to create a proper geometry. After that, you can create a BBOX using the function ST_MakeEnvelope and check if start and end coordinates are inside the BBOX using ST_Contains, e.g.
WITH bbox(geom) AS (
VALUES (ST_MakeEnvelope(-8.68494,41.24895,-8.47569,41.11591,4326))
)
SELECT * FROM trips,bbox
WHERE
ST_Contains(bbox.geom,ST_SetSRID(ST_MakePoint(lon_start,lat_start),4326)) AND
ST_Contains(bbox.geom,ST_SetSRID(ST_MakePoint(lon_end,lat_end),4326));
Note: the CTE isn't really necessary and is in the query just for illustration purposes. You can repeat the ST_MakeEnvelope function on both conditions in the WHERE clause instead of bbox.geom. This query also assumes the SRS WGS84 (4326).
I'm trying to convert lat/lon to linestring. Basically, grouping the columns lat and lon, making a point, and creating a linestring.
Table:
+------------+----------+-----------+------------+---------+--------+
| link_id | seq_num | lat | lon | z_coord | zlevel |
+------------+----------+-----------+------------+---------+--------+
| "16777220" | "0" | "4129098" | "-7192948" | | 0 |
| "16777220" | "999999" | "4129134" | "-7192950" | | 0 |
| "16777222" | "0" | "4128989" | "-7193030" | | 0 |
| "16777222" | "1" | "4128975" | "-7193016" | | 0 |
| "16777222" | "2" | "4128940" | "-7193001" | | 0 |
| "16777222" | "3" | "4128917" | "-7192998" | | 0 |
| "16777222" | "4" | "4128911" | "-7193002" | | 0 |
+------------+----------+-----------+------------+---------+--------+
My code:
select link_id, ST_SetSRID(ST_MakeLine(ST_MakePoint((lon::double precision / 100000), (lat::double precision / 100000))),4326) as geometry
from public.rdf_link_geometry
group by link_id
limit 50
geometry output column example:
"0102000020E6100000020000004F92AE997CFB51C021E527D53EA54440736891ED7CFB51C021020EA14AA54440"
^^ What is this? How did it get formatted this way? I expected a linestring, something like
geometry
7.123 50.123,7.321 50.321
7.321 50.321,7.321 50.321
The data type of link_id is bigint, and for geometry it says geometry
SOLUTION:
select link_id, ST_AsText(ST_SetSRID(ST_MakeLine(ST_MakePoint(
(lon::double precision / 100000), (lat::double precision / 100000))),4326)) as geometry
from public.rdf_link_geometry
group by link_id
limit 50
The output is a geometry, which you can display as text using st_asText
select st_asText('0102000020E6100000020000004F92AE997CFB51C021E527D53EA54440736891ED7CFB51C021020EA14AA54440');
st_astext
--------------------------------------------------
LINESTRING(-71.92948 41.29098,-71.9295 41.29134)
That being said, should you have more than 2 points, you could order them to create a meaningful line:
select st_makeline(geom ORDER BY seqID) from tbl;
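For anyone curious what the hex string actually contains: it is the geometry in PostGIS's extended well-known binary (EWKB) form, printed as hex. The sketch below decodes it by hand in Python. It is a minimal illustration that assumes a little-endian, two-dimensional LineString with an optional SRID, and the function name is made up here; in practice ST_AsText does this for you.

```python
import struct

SRID_FLAG = 0x20000000   # EWKB flag: an SRID follows the type word
ZM_FLAGS = 0xC0000000    # EWKB flags for Z and M dimensions (not handled here)


def parse_ewkb_linestring(hex_string):
    """Decode a little-endian 2D EWKB LineString into (srid, [(x, y), ...])."""
    data = bytes.fromhex(hex_string)
    if data[0] != 1:
        raise ValueError("this sketch only handles little-endian input")
    (geom_type,) = struct.unpack_from("<I", data, 1)
    if geom_type & ZM_FLAGS:
        raise ValueError("this sketch only handles 2D geometries")
    if geom_type & 0xFF != 2:
        raise ValueError("not a LineString")
    offset = 5
    srid = None
    if geom_type & SRID_FLAG:
        (srid,) = struct.unpack_from("<I", data, offset)
        offset += 4
    (num_points,) = struct.unpack_from("<I", data, offset)
    offset += 4
    points = []
    for _ in range(num_points):
        # Each point is two 8-byte doubles: x (longitude), then y (latitude).
        x, y = struct.unpack_from("<dd", data, offset)
        points.append((x, y))
        offset += 16
    return srid, points


hex_from_question = (
    "0102000020E6100000020000004F92AE997CFB51C021E527D53EA54440"
    "736891ED7CFB51C021020EA14AA54440"
)
srid, points = parse_ewkb_linestring(hex_from_question)
print(srid, points)
```

The decoded SRID and coordinates should agree with the ST_AsText output shown above, which is a quick way to confirm that the geometry was built correctly.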
Given a (lat, lon) pair, I am trying to find the maximum speed using "max_speed" and the street type using "highway".
I have loaded my database (Postgres and Postgis) as follows:
$ osm2pgsql -c -d gis --slim -C 50000 /var/lib/postgresql/data/germany-latest.osm.pbf
The closest related question I could find was How to query all shops around a certain longitude/latitude using osm-postgis?. I have taken the query, and plugged in a (lat, long) that I found in google maps for the city center of Munich (as the post was also related to city center Munich and I have the map for Germany). The result turns up empty.
gis=# SELECT name, shop FROM planet_osm_point WHERE ST_DWithin(way ,ST_SetSrid(ST_Point(48.137969, 11.573829), 900913), 100);
name | shop
------+------
(0 rows)
Also when looking into the planet_osm_nodes, which contains (lat, long) pairs directly, I end up with no results:
gis=# SELECT * FROM planet_osm_nodes WHERE ((lat BETWEEN 470000000 AND 490000000) AND (lon BETWEEN 100000000 AND 120000000)) LIMIT 10;
id | lat | lon | tags
----+-----+-----+------
(0 rows)
I verified the data is in my database:
gis=# SELECT COUNT(*) FROM planet_osm_point;
count
---------
9924531
(1 row)
and
gis=# SELECT COUNT(*) FROM planet_osm_nodes;
count
-----------
288597897
(1 row)
So ideally, my question would be
Q: How can I find the "max speed" and "highway" given a set (lat, lon)
Alternatively, my question is:
Q: How do I get the query from the other stack overflow post to work?
My best guess is that I need to transform my (lat, lon) in some way, or that I simply have the wrong data for whatever reason.
Edit: added sample data as requested:
gis=# SELECT * FROM planet_osm_point LIMIT 1;
osm_id | access | addr:housename | addr:housenumber | addr:interpolation | admin_level | aerialway | aeroway | amenity | area | barrier | bicycle | brand | bridge | boundary | building | capital | construction | covered | culvert |
cutting | denomination | disused | ele | embankment | foot | generator:source | harbour | highway | historic | horse | intermittent | junction | landuse | layer | leisure | lock | man_made | military | motorcar | name | natural | off
ice | oneway | operator | place | poi | population | power | power_source | public_transport | railway | ref | religion | route | service | shop | sport | surface | toll | tourism | tower:type | tunnel | water | waterway | wetland | wi
dth | wood | z_order | way
-----------+--------+----------------+------------------+--------------------+-------------+-----------+---------+---------+------+---------+---------+-------+--------+----------+----------+---------+--------------+---------+---------+
---------+--------------+---------+-----+------------+------+------------------+---------+----------+----------+-------+--------------+----------+---------+-------+---------+------+----------+----------+----------+------+---------+----
----+--------+----------+-------+-----+------------+-------+--------------+------------------+---------+-----+----------+-------+---------+------+-------+---------+------+---------+------------+--------+-------+----------+---------+---
----+------+---------+----------------------------------------------------
304070863 | | | | | | | | | | | | | | | | | | | |
| | | | | | | | crossing | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | | | | | | | |
| | | 010100002031BF0D0048E17A94F19F2941CDCCCCDCC60D5741
(1 row)
and
gis=# SELECT * FROM planet_osm_nodes LIMIT 1;
id | lat | lon | tags
--------+-----------+----------+------
234100 | 666501948 | 80442755 |
(1 row)
Edit 2: There was a mention regarding "SRID", so I added example data from another table:
gis=# SELECT * FROM spatial_ref_sys LIMIT 1;
srid | auth_name | auth_srid | srtext
| proj4text
------+-----------+-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------
3819 | EPSG | 3819 | GEOGCS["HD1909",DATUM["Hungarian_Datum_1909",SPHEROID["Bessel 1841",6377397.155,299.1528128,AUTHORITY["EPSG","7004"]],TOWGS84[595.48,121.69,515.35,4.115,-2.9383,0.853,-3.408],AUTHORITY["EPSG","1024"]],PR
IMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","3819"]] | +proj=longlat +ellps=bessel +towgs84=595.48,121.69,515.35,4.115,-2.9383,0.853,-3.408 +no_defs
(1 row)
PostGIS geometries use (x, y) axis order, so longitude comes first and latitude second, not (lat, lon).
Also, if you want to convert a point from one SRID to another, use ST_Transform(), not ST_SetSRID.
ST_Transform actually reprojects your data from one coordinate system to another:
select st_astext(st_transform(ST_SetSrid(ST_Point(11.573829,48.137969), 4326),900913));
ST_SetSRID only changes the SRID label attached to the object; the coordinates stay as they are:
select st_astext(ST_SetSrid(ST_Point(11.573829,48.137969),900913));
So you have to change your SQL like this:
SELECT name, shop
FROM planet_osm_point
WHERE ST_DWithin(way,st_transform(ST_SetSrid(ST_Point(11.573829,48.137969), 4326),900913), 100);
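To see numerically why the original query found nothing, here is a sketch of the spherical Mercator forward formula on which SRID 900913 is based. It only approximates what ST_Transform computes through the projection library and is not a replacement for it; the function name is invented for this example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by spherical ("web") Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS84 degrees to spherical Mercator metres (x, y)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Munich city centre from the question; note the (lon, lat) argument order.
x, y = lonlat_to_web_mercator(11.573829, 48.137969)
print(round(x), round(y))
```

The projected values are in metres and run into the millions, while the untransformed point (48.137969, 11.573829) labelled as 900913 sits a few metres from the projection origin, far from any data in Germany. That is why a 100-unit search radius around it matched no rows.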