Common syntax for creating named_struct in hive and presto - hive

I am trying to define an SQL view that picks a subset of elements from a source struct data type and creates a new struct. In Hive I can do this:
create view myview as
select
id,
named_struct("cnt", bkg.cnt, "val", bkg.val) as bkg
from mybkgtable
This works. The trouble is, when this view is invoked from Presto, it fails with: Function named_struct not registered
I found that Presto has no struct data type, but has ROW instead. It works with this syntax:
select
id,
CAST(ROW(bkg.cnt, bkg.val) as row(cnt integer, val double)) as bkg
from mybkgtable
However, this syntax isn't understood by Hive.
The question is: is it possible to have one view definition that works on both Hive and Presto?

The question is: is it possible to have one view definition that works on both Hive and Presto?
Sadly, no.

You can use Coral to write the view definition in HiveQL and translate it to Presto.
https://github.com/linkedin/coral
https://engineering.linkedin.com/blog/2020/coral

Related

Querying data with type not supported by Trino

I use Trino to consume data from a MariaDB table.
I have a specific column in this table with geographical data (point data, https://mariadb.com/kb/en/geometry-types/). Querying the source directly, the data appears like this:
SELECT location FROM x.y.z
location
---------------------------
POINT (51.566682 83.32865)
POINT (46.77708 16.32856)
POINT (84.857691 4.295681)
But this kind of data isn't supported by Trino (https://trino.io/docs/current/connector/mariadb.html)
I just want the values (x, y) inside POINT(x,y).
The documentation has a flag unsupported-type-handling=CONVERT_TO_VARCHAR, but when I use it the data comes back like this:
location
-------------------------
�Q�GHJk ���*#
�{���GMg'���(#
0�Z¶nK#�B< / #
I tested a lot of conversions on this varchar but none of them worked. So how can I get this kind of data using Trino?
The data type is not text, so converting it will not help.
You can always use functions native to MariaDB, as long as they return data types that Trino supports.
Table functions
The connector provides specific table functions to access MariaDB.
query(varchar) -> table
The query function allows you to query the underlying database directly. It requires syntax native to MariaDB, because the full query is pushed down and processed in MariaDB. This can be useful for accessing native features which are not available in Trino or for improving query performance in situations where running a query natively may be faster.
So a query like this will work:
SELECT
  X, Y
FROM
  TABLE(
    mariadb.system.query(
      query => 'SELECT
                  ST_X(location) AS X,
                  ST_Y(location) AS Y
                FROM
                  mytable'
    )
  );
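If you would rather keep the result as readable text, another option (a sketch, assuming the same hypothetical mytable table and location column) is to let MariaDB return the point as WKT with ST_AsText, which arrives in Trino as a plain varchar:
-- ST_AsText in MariaDB returns WKT text such as 'POINT(51.566682 83.32865)'
SELECT wkt
FROM
  TABLE(
    mariadb.system.query(
      query => 'SELECT ST_AsText(location) AS wkt FROM mytable'
    )
  );
From there you can pull the two numbers out with Trino string functions such as regexp_extract.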

How to view a User Defined Function in BigQuery

I wrote a user-defined function in BigQuery SQL, and the documentation doesn't show how to view the code of an existing function. In other engines, the syntax is something like SHOW FUNCTION ..., but that doesn't work here.
Thanks.
You can use the INFORMATION_SCHEMA views to see the source code and other parameters of the UDF:
#standardSQL
select *
from <project_id>.<dataset_name>.INFORMATION_SCHEMA.ROUTINES;
Pay attention to the location of your UDF, because INFORMATION_SCHEMA view results depend on the location of the objects.
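For example, to pull just the definition of one function (routine_name and ddl are columns of BigQuery's ROUTINES view; the project, dataset, and function names are placeholders):
#standardSQL
select routine_name, ddl
from <project_id>.<dataset_name>.INFORMATION_SCHEMA.ROUTINES
where routine_name = 'myFunction';  -- name of the UDF you want to inspect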
You can create a function like this:
CREATE FUNCTION `project.dataset.Function`(x INT64) AS (x * 3);
and use it like this:
select `project.dataset.Function`(Coll) from `project.dataset.Table`
To view the function, click on it under its dataset in the BigQuery UI and you will get the entire code.

How to save a view in BigQuery - Standard SQL Dialect

I am trying to use BigQuery's Web UI to save a view that was created in the Standard SQL dialect, but I am getting this error:
Failed to save view. Bad table reference "myDataset.myTable"; table references in standard SQL views require explicit project IDs
Why is this error showing up? How can I fix it? Should the "Table ID" field of the "Save view" dialog include the project ID? Or does this error appear because of the query itself? Just in case: the query itself runs without any problems.
Thanks for your help.
Your view has a reference to myDataset.myTable, which is OK when you just run it as a query (for example in the Web UI).
But to save it as a view, you must fully qualify that reference, as below:
myProject.myDataset.myTable
So, just add the project to that reference.
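A minimal sketch of what that looks like in the view body (myView is a placeholder name; the same project-qualified FROM clause applies whether you save from the Web UI or with CREATE VIEW DDL):
#standardSQL
CREATE VIEW `myProject.myDataset.myView` AS
SELECT *
FROM `myProject.myDataset.myTable`;  -- fully qualified: project.dataset.table, in backticks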
Same reply, in other words
The issue is in this part of the query: FROM com.table
When running the query, it's fine not to fully specify the table name, like this:
com_company_app_beta_IOS.app_events_20180619
But to save the query as a view, the FROM has to be like this:
`company-prod`.com_company_app_beta_IOS.app_events_20180619
You need the backticks around `company-prod` because the dash character is not allowed in unquoted identifiers.
The structure in BigQuery looks like this:
(screenshot: BigQuery UI)
I had the same problem.
You'll need to use backticks around the whole project.dataset.view (or table) string in both the create and select statements:
create view `company-prod.com_company_app_beta_IOS.YOUR_VIEW` as
select * from `company-prod.com_company_app_beta_IOS.app_events_20180619`
Use backticks around the whole project.dataset.view string.

SparkJob file name

I'm using an HQL query that contains something similar to...
INSERT OVERWRITE TABLE ex_tb.ex_orc_tb
select *, SUBSTR(INPUT__FILE__NAME,60,4), CONCAT_WS('-', SUBSTR(INPUT__FILE__NAME,71,4), SUBSTR(INPUT__FILE__NAME,75,2), SUBSTR(INPUT__FILE__NAME,77,2))
from ex_db.ex_ext_tb
When I go into Hive and run that command, it works fine.
When I run it through a PySpark HiveContext instead, I get the error...
pyspark.sql.utils.AnalysisException: u"cannot resolve 'INPUT__FILE__NAME' given input columns: [list_name, name, day, link_params, id, template]; line 2 pos 17"
Any ideas why this might be?
INPUT__FILE__NAME is a Hive-specific virtual column and is not supported in Spark.
Spark provides the input_file_name function, which works in a similar way:
SELECT input_file_name() FROM df
but it requires Spark 2.0 or later to work correctly with PySpark.
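Applied to the statement from the question, the rewrite would look roughly like this (a sketch: the same table names and substring offsets as above, with the Hive virtual column swapped for the Spark function):
INSERT OVERWRITE TABLE ex_tb.ex_orc_tb
select *,
       SUBSTR(input_file_name(), 60, 4),  -- input_file_name() replaces INPUT__FILE__NAME
       CONCAT_WS('-', SUBSTR(input_file_name(), 71, 4),
                      SUBSTR(input_file_name(), 75, 2),
                      SUBSTR(input_file_name(), 77, 2))
from ex_db.ex_ext_tb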

Does SQLite support common table expressions?

Does SQLite support common table expressions?
I'd like to run query like that:
with temp (ID, Path)
as (
select ID, Path from Messages
) select * from temp
As of version 3.8.3, SQLite supports common table expressions.
Change log
Instructions
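The same release also added recursive CTEs; a minimal sketch that counts from 1 to 5 in any SQLite 3.8.3+ shell:
WITH RECURSIVE cnt(x) AS (
  SELECT 1                              -- anchor row
  UNION ALL
  SELECT x + 1 FROM cnt WHERE x < 5     -- recursive step
)
SELECT x FROM cnt;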
Another solution is to integrate a "CTE to SQLite" translation layer in your application:
"with w as (y) z" => "create temp view w as y;z"
"with w(x) as (y) z" => "create temp table w(x);insert into w y;z"
As an (ugly, desperate, but working) example:
http://nbviewer.ipython.org/github/stonebig/baresql/blob/master/examples/baresql_with_cte_code_included.ipynb
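Applying the second rule (the CTE in the question names its columns) to that query, such a layer would emit something like this sketch:
-- "with temp (ID, Path) as (select ID, Path from Messages) select * from temp" becomes:
create temp table temp (ID, Path);
insert into temp select ID, Path from Messages;
select * from temp;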
SQLite (before version 3.8.3) doesn't support CTEs, window functions, or anything of the like. You can, however, write your own user functions and call them inside SQLite by registering them with the database through the SQLite API using sqlite3_create_function(). Once registered with the database, they can be used from your own application code. For example, you could make an aggregate function that computes the sum of a series of averages based on the individual column values: for each value, a step-type callback function is called that lets you perform some calculation on the data, and a pointer for holding state data is also available.