I'm trying to call the LEAST function in Calcite SQL in my Apache Beam pipeline:
...
,LEAST(12.5 + (25 * Quartile), 100) AS PlayedPercentage
...
Where Quartile is an int32 column. I get the below error:
Caused by: org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.validate.SqlValidatorException: No match found for function signature LEAST(<NUMERIC>, <NUMERIC>)
I've also tried casting both arguments to FLOAT, but I get the same result. How am I supposed to call the function?
According to Calcite's SQL Reference, LEAST is a dialect-specific operator that is only enabled in the Oracle operator table.
If you were connecting to Calcite directly, I would suggest including fun=oracle in the JDBC connect string, which enables the Oracle operator table. I'm not sure what the steps are if you are using Calcite from within Beam.
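For reference, a minimal sketch of what that connect string might look like (the fun parameter comes from Calcite's documented connect-string properties; the rest is illustrative):

```properties
# Enable the Oracle operator table (LEAST, GREATEST, NVL, DECODE, ...)
jdbc:calcite:fun=oracle

# Libraries can also be combined with commas, e.g. Oracle plus PostgreSQL
jdbc:calcite:fun=oracle,postgresql
```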
Related
I want to parse SQL queries using Calcite to do some SQL equivalence verification, but I found that Calcite's default settings don't support dialect-specific operators such as TO_TIMESTAMP. The error is below:
No match found for function signature TO_TIMESTAMP(<CHARACTER>, <CHARACTER>)
The answer here says I can change this Calcite setting via the JDBC connect string, but I cannot find where that string should go. Should I use some Calcite API to pass the JDBC connect string in?
If you're not invoking Calcite via JDBC, how are you calling it? Other APIs to Calcite have an equivalent of the JDBC fun parameter in the answer you reference, but it's difficult to answer your question without more specifics.
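For example, if you are using Calcite's parser/validator programmatically via the Frameworks API, the rough equivalent of the JDBC fun parameter is to build the operator table yourself. This is only a sketch under the assumption of a reasonably recent Calcite (SqlLibrary and SqlLibraryOperatorTableFactory exist from around 1.21); it needs the Calcite jars on the classpath and is not runnable standalone:

```java
import java.util.EnumSet;

import org.apache.calcite.sql.SqlOperatorTable;
import org.apache.calcite.sql.fun.SqlLibrary;
import org.apache.calcite.sql.fun.SqlLibraryOperatorTableFactory;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;

public class OperatorTableExample {
  public static FrameworkConfig buildConfig() {
    // Standard operators plus a dialect library; add SqlLibrary.ORACLE,
    // SqlLibrary.POSTGRESQL, etc. depending on which functions you need.
    SqlOperatorTable opTab = SqlLibraryOperatorTableFactory.INSTANCE
        .getOperatorTable(EnumSet.of(SqlLibrary.STANDARD, SqlLibrary.POSTGRESQL));

    // Hand the combined operator table to the planner/validator config.
    return Frameworks.newConfigBuilder()
        .operatorTable(opTab)
        .build();
  }
}
```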
I've got a PL/pgSQL function, and I'm connecting to PostgreSQL 12.x using the Scala library doobie, which uses JDBC underneath. I'd like to understand whether the whole call to the function is treated as a single transaction. I'm using the default autocommit setting.
The call to the function is simply:
select * from next_work_item();
All functions in PostgreSQL always run inside a single transaction, no matter what.
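To illustrate why this matters (the function body and table below are hypothetical, since the original function wasn't shown): in autocommit mode, the SELECT that calls the function is its own transaction, and everything the function does happens inside it, so a failure anywhere rolls back all of it.

```sql
-- Hypothetical work-queue function: claims one pending item.
-- The UPDATE runs in the same transaction as the SELECT that calls
-- the function; if anything fails, the claim is rolled back too.
CREATE FUNCTION next_work_item() RETURNS SETOF work_items AS $$
BEGIN
    RETURN QUERY
        UPDATE work_items
        SET status = 'taken'
        WHERE id = (SELECT id FROM work_items
                    WHERE status = 'pending'
                    ORDER BY id
                    FOR UPDATE SKIP LOCKED
                    LIMIT 1)
        RETURNING *;
END;
$$ LANGUAGE plpgsql;

-- In autocommit mode this single statement is one transaction:
SELECT * FROM next_work_item();
```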
Short version:
How can I get the difference in seconds between 2 timestamps, via the ODBC driver?
Long version:
Using ODBC for a simple query (note that I use cast (... as timestamp) to have a standalone example; the actual query runs against a table with timestamp data):
select unix_timestamp(cast('2019-02-01 01:02:03' as timestamp)) as tto
I got the error message:
unix_timestamp is not a valid scalar function or procedure call
I could not find any configuration option that would change this. Native queries are disabled (because I am using prepared statements), and other functions work fine. My guess is that unix_timestamp() (without a parameter) is deprecated, and the driver is a bit overzealous about preventing use of the function.
I tried to work around the problem, and I cast the timestamp as bigint instead of using the unix_timestamp function:
select cast(cast('2019-02-01 01:02:03' as timestamp) as bigint)
This works fine! But when I try to get the diff of 2 timestamps:
select cast(cast('2019-02-01 01:02:03' as timestamp) as bigint) - cast(cast('2019-02-01 01:02:03' as timestamp) as bigint)
I got the message
Operand types SQL_WCHAR and SQL_WCHAR are incompatible for the binary minus operator
(but then only for complex queries, not if the query consists only of this select).
The driver will accept a diff between 2 timestamps, but then I end up with an interval type, which I cannot convert back to seconds.
I would consider these to be bugs in the ODBC driver, but I cannot contact Hortonworks or Simba about them because I am not a paying customer of either.
On a side note, if I try to use the floor function, I get the message:
‘floor’ is a reserved keyword.
Yes, I know it's reserved, and I am actually trying to use it.
Any idea how I could get around this?
In short, the official Hive ODBC driver is really, really bad if you cannot use native statements (i.e. if you need parameterised queries).
My suggested workarounds are to either get a paid one (e.g. https://www.progress.com/datadirect-connectors - I tried it and it works very well) or to just use a JDBC one if your application can support it. All ODBC drivers I found for Hive are wrappers around the JDBC one anyway, bundling a JRE.
When I run a query using an UNNEST statement, I get the error "unnest is not a recognized built-in function name". Is UNNEST a built-in SQL function name? If so, which versions support it?
I have a requirement to pass the server address, server port, and the count from a table in one query in AWS Redshift. i.e.
select inet_server_addr(), inet_server_port(), count(*) from my_table;
ERROR: 0A000: Specified types or functions (one per INFO message) not supported on Redshift tables.
I understand that this query does not work because I am trying to execute a Leader Node-Only Function in conjunction with a query which needs to access the compute nodes.
I am wondering, however, if there is a work around available to get the information that I need in one query execution.
Note: Editing the above query to use common table expressions (CTEs), subqueries, views, scalar functions, etc. results in the same error message.
Note 2: PostgreSQL System information functions like inet_server_addr() are currently unsupported in AWS Redshift, however, they work when called without a table reference.
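One practical workaround (a sketch only; I'm not aware of a documented way to combine leader-node-only functions with compute-node queries in a single statement) is to run two statements and combine the results client-side:

```sql
-- Leader-node-only functions work when no user table is referenced:
SELECT inet_server_addr(), inet_server_port();

-- The table scan runs on the compute nodes as a separate statement:
SELECT count(*) FROM my_table;
```

The application then pairs the server address/port with the count itself, which also avoids re-running the leader-node functions for every table query.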