LOG function in Redshift - SQL

I am trying to run the following query.
CREATE TEMP TABLE tmp_variables AS SELECT
0.99::numeric(10,8) AS y ;
select y, log(y) from tmp_variables
It gives me the following error. Is there a way to get around this?
[Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
Warnings:
Function "log(numeric,numeric)" not supported.

A workaround is to use "float" instead.
CREATE TEMP TABLE tmp_variables AS SELECT
0.99::float AS y ;
select y, log(y) from tmp_variables
works fine and returns
y log
0.99 -0.004364805402450088

The LOG function requires an argument of data type "double precision". Your code is passing in a value of type "numeric", which is why you are getting the error.
This will work:
CREATE TEMP TABLE tmp_variables AS
SELECT 0.99::numeric(10,8) AS y ;
select y, log(cast(y as double precision)) from tmp_variables;
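As a side note, Redshift's LOG returns the base-10 logarithm (use LN for the natural log). The value -0.004364805402450088 shown above matches base 10, which can be sanity-checked outside the database; this Python snippet is only an illustration:

```python
import math

# Redshift's LOG is base 10; LN is the natural log.
# The float workaround above returned -0.004364805402450088 for log(0.99).
base10 = math.log10(0.99)
natural = math.log(0.99)
print(base10)   # -0.004364805402450088
print(natural)  # ≈ -0.0100503358535
```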

Related

ST_Area does not exist: Heroku PostgreSQL + PostGIS

I have a Postgres database in Heroku, extended with PostGIS version 2.5.
I want to use the function:
ST_Area( a_polygon )
Specifically I want a generated column in my table:
alter table buildings add building_area float generated always as ( st_area( base_polygon ) ) stored;
Where base_polygon is of type polygon.
However, I am getting this error:
ERROR: function st_area(polygon) does not exist Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Aren't these commands supposed to be available after I run CREATE EXTENSION postgis?
Or, is there something else I have to do?
It seems your polygon column uses PostgreSQL's built-in polygon type.
ST_Area expects a PostGIS geometry type as its parameter,
as in this example from the docs: https://postgis.net/docs/ST_Area.html
select ST_Area(geom) sqft,
ST_Area(ST_Transform(geom, 26986)) As sqm
from (
select
'SRID=2249;POLYGON((743238 2967416,743238 2967450,
743265 2967450,743265.625 2967416,743238 2967416))' :: geometry
geom
) subquery;
If this example works, it means the ST_Area function exists.
You can add a column with the PostGIS geometry type: https://postgis.net/docs/AddGeometryColumn.html
SELECT AddGeometryColumn ('my_schema','my_spatial_table','geom',4326,'POLYGON',2);
Then convert your polygons into PostGIS format using PostGIS functions, for example ST_MakePolygon: https://postgis.net/docs/ST_MakePolygon.html
SELECT ST_MakePolygon( ST_GeomFromText('LINESTRING(75 29,77 29,77 29, 75 29)'));
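Putting the pieces together for the original goal, a sketch of the generated column over a proper geometry column might look like this. This is only a sketch, assuming the asker's buildings table and an SRID of 4326; the step that populates the geometry column from the native polygon values is left out, since it depends on how the coordinates are stored:

```sql
-- Sketch only: add a PostGIS geometry column, then a generated area column.
-- ST_Area(geometry) is immutable, so it is allowed in a generated column.
ALTER TABLE buildings ADD COLUMN geom geometry(Polygon, 4326);
-- ...populate geom from the existing data, e.g. via ST_GeomFromText(...)...
ALTER TABLE buildings
  ADD COLUMN building_area float
  GENERATED ALWAYS AS ( ST_Area(geom) ) STORED;
```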

Divide operator error in PostgreSQL: operator does not exist: unknown /

I have a Trip table in PostgreSQL DB, there is a column called meta in the table.
An example of meta in one row looks like:
meta = {"runTime": 3922000, "distance": 85132, "duration": 4049000, "fuelUsed": 19.595927498516176}
To select the trip with the largest value of "distance" divided by "runTime", I run this query:
select MAX(tp."meta"->>'distance'/tp."meta"->>'runTime') maxkph FROM "Trip" tp
but I get ERROR:
ERROR: operator does not exist: unknown / jsonb LINE 1: MAX(tp."meta"->>'distance'/tp."meta"...
I also tried:
select MAX((tp."meta"->>'distance')/(tp."meta"->>'runTime')) maxkph FROM "Trip" tp
but get another ERROR:
ERROR: operator does not exist: text / text LINE 1: ...MAX((tp."meta"->>'distance')/(tp."meta...
Could you please help me to solve this problem?
There is no division operator for jsonb values. You have to cast the values on both sides to a numeric type first:
MAX( ((tp."meta"->>'distance')::numeric) / ((tp."meta"->>'runTime')::numeric) ) maxkph
Try using parentheses:
MAX( (tp."meta"->>'distance') / (tp."meta"->>'runTime') ) as maxkph
Your second problem suggests that these values are stored as strings. So convert them:
MAX( (tp."meta"->>'distance')::numeric / (tp."meta"->>'runTime')::numeric ) as maxkph
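The two errors mirror each other: without parentheses, the / is applied to a jsonb value; with parentheses, ->> returns text, and text still cannot be divided. A rough Python analogy of the fix, with string values standing in for what ->> returns:

```python
import json

# The ->> operator returns text, so dividing its results directly fails,
# just as dividing two strings fails in Python.
meta = json.loads('{"runTime": 3922000, "distance": 85132}')
distance, run_time = str(meta["distance"]), str(meta["runTime"])

try:
    distance / run_time          # analogous to text / text
except TypeError:
    pass                         # must cast to a numeric type first

ratio = float(distance) / float(run_time)
print(ratio)
```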

Cannot have map type columns in DataFrame which calls set operations

I have a Hive table with a column of type MAP<Float, Float>. When I try to do an insertion on this table in a Spark context, I get the error below. The insertion works fine without the 'distinct'.
create table test_insert2(`test_col` string, `map_col` MAP<INT,INT>)
location 's3://mybucket/test_insert2';
insert into test_insert2
select distinct 'a' as test_col, map(0,0) as map_col
org.apache.spark.sql.AnalysisException: Cannot have map type columns in DataFrame which calls set operations(intersect, except, etc.), but the type of column map_col is map
Try converting the DataFrame to an RDD and then applying the .distinct function.
Example:
spark.sql("""select 'a' test_col, map(0,0) map_col
union all
select 'a' test_col, map(0,0) map_col""").rdd.distinct.collect
Result:
Array[org.apache.spark.sql.Row] = Array([a,Map(0 -> 0)])
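Another way to keep the work in SQL, if you are on Spark 2.4 or later, is to make the column distinct-friendly: arrays of structs support set operations while maps do not, so you can deduplicate on map_entries(...) and rebuild the map with map_from_entries(...). This is only a sketch, not tested against the asker's table:

```sql
-- Sketch: convert the map to an array of entries, take the distinct,
-- then rebuild the map before inserting.
INSERT INTO test_insert2
SELECT test_col, map_from_entries(entries) AS map_col
FROM (
  SELECT DISTINCT 'a' AS test_col, map_entries(map(0, 0)) AS entries
) t
```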

Zip parallel arrays in Hive

I have parallel arrays in a hive table, like this:
with tbl as ( select array(1,2,3) as x, array('a','b','c') as y)
select x,y from tbl;
x y
[1,2,3] ["a","b","c"]
1 row selected (0.108 seconds)
How can I zip them together (like the python zip function), so that I get back a list of structs, like
[(1, "a"), (2, "b"), (3,"c")]
You can use posexplode, which gives the positions in the array; these can then be used for filtering.
select x,y,collect_list(struct(val1,val2))
from tbl
lateral view posexplode(x) t1 as p1,val1
lateral view posexplode(y) t2 as p2,val2
where p1=p2
group by x,y
Here was my attempt at avoiding a double-explode:
with tbl as (select array(1,2,3,4,5) as x, array('a','b','c','d','e') as y)
select collect_list(struct(xi, y[i-1]))
from tbl lateral view posexplode(x) tbl2 as xi, i;
However, I ran into a strange error:
Error: Error while compiling statement: FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection (state=42000,code=40000)
I was able to work around it using
set hive.execution.engine=mr;
which is not as fast or as optimized as using Spark or Tez as the back end.
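For reference, the behaviour both queries emulate is Python's built-in zip, which the question mentions; a minimal illustration:

```python
# The Hive queries above emulate Python's zip over parallel arrays.
x = [1, 2, 3]
y = ['a', 'b', 'c']
pairs = list(zip(x, y))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```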

ERROR: column mm.geom does not exist in PostgreSQL execution using R

I am trying to run a model in R that calls functions from GRASS GIS (version 7.0.2) and PostgreSQL (version 9.5) to complete the task. I created a database in PostgreSQL, created the PostGIS extension, then imported the required vector layers into the database using the PostGIS shapefile importer. Every time I try to run it from R (run as administrator), it returns an error like:
Error in fetch(dbSendQuery(con, q, n = -1)) :
error in evaluating the argument 'res' in selecting a method for function 'fetch': Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: column mm.geom does not exist
LINE 5: (st_dump(st_intersection(r.geom, mm.geom))).geom as geom,
^
HINT: Perhaps you meant to reference the column "r.geom".
QUERY:
insert into m_rays
with os as (
select r.ray, st_endpoint(r.geom) as s,
(st_dump(st_intersection(r.geom, mm.geom))).geom as geom,
mm.legend, mm.hgt as hgt, r.totlen
from rays as r,bh_gd_ne_clip as mm
where st_intersects(r.geom, mm.geom)
)
select os.ray, os.geom, os.hgt, l.absorb, l.barrier, os.totlen,
st_length(os.geom) as shape_length, st_distance(os.s, st_endpoint(os.geom)) as near_dist
from os left join lut as l
on os.legend = l.legend
CONTEXT: PL/pgSQL function do_crtn(text,text,text) line 30 at EXECUTE
I have checked over and over again; the geometry column does exist under Schema > Public > Views in PostgreSQL. Any advice on how to resolve this error?
Add quotes and then use r."geom" instead of r.geom.
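If quoting does not resolve it, it may also be worth confirming what the geometry column of bh_gd_ne_clip (the table aliased as mm) is actually called, since the error and hint say mm.geom does not exist. A standard, generic way to check:

```sql
-- List the columns of the table behind the mm alias,
-- to confirm the geometry column's actual name.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'bh_gd_ne_clip';
```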