Just like in SQL, do we have a UDF or function in Hive to print a statement? Also, is there a way to know which queries are currently running in the background in Hive?
I am trying to check whether a partition has been added to an Athena table, using Lambda function code. But when I use
SELECT * FROM "table$partitions" WHERE batch_date = 'yyyymmdd'
it shows an error, because the Lambda function does not accept the " characters.
But without the " my code does not work either.
Can someone suggest anything on this?
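A minimal sketch of one workaround, assuming the Lambda is written in Python and submits the query through boto3's Athena client (the database name and S3 output location below are placeholders): delimit the Python string with single quotes, or backslash-escape the inner double quotes, so the double quotes Athena needs reach the service intact.
import boto3

athena = boto3.client("athena")

# Single-quote the Python string (and backslash-escape the SQL string
# literal's quotes) so the double quotes required by Athena survive unchanged.
query = 'SELECT * FROM "table$partitions" WHERE batch_date = \'yyyymmdd\''

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "my_database"},  # placeholder
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},  # placeholder
)
print(response["QueryExecutionId"])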
I have a SQL script with multiple drop & create DDL statements (CREATE TABLE AS SELECT *), and I want to run them in one go. I am quite new to Informatica PowerCenter; can someone describe the process of using a SQL transformation for BigQuery in Informatica?
Sample query:
DROP TABLE IF EXISTS sellout.account_table;
CREATE TABLE sellout.account_table AS
SELECT *
FROM sellout.account_src
WHERE UPPER(account_name) IN ('RANDOM');
Similar to the above query, I have around 24 SQL statements in the script.
I want to run them all at once and later make them part of an Informatica job.
If the "PowerExchange Google BigQuery" server and client are installed and after executing the infasetup.bat(sh) validateandregisterallfeatures, the mappings would be opened/exported successfully.
Here are some FAQs that might be handy for you:
Q: Why are the output fields in the SQL Transformation not visible?
A: The Stored Procedure selected in the SQL Transformation must have output parameters declared; otherwise it will have no output fields other than the default Return Code column.
Q: A set of columns is displayed as the result when running the Stored Procedure, yet you still do not see the same columns as output in the SQL Transformation. Why?
A: The columns seen in the output might not be defined/declared as output parameters in the Stored Procedure. The procedure might contain a statement like 'SELECT * FROM', which retrieves the data when the procedure is run from the DB UI, and a similar result can be seen when the procedure is run programmatically.
However, to call the same procedure from the SQL Transformation, explicitly declared output parameters must be present, because the transformation imports the procedure's metadata when it is selected. Unless you declare the output parameters explicitly in the procedure, they cannot be seen as output in the transformation.
Q: Is it necessary to have input/output parameters in the Stored Procedure to call it from the SQL Transformation?
A: Yes, it is necessary if the procedure does not have default ones. These parameters appear as input/output fields in the SQL Transformation; without them the mapping becomes invalid.
Q: I have a SELECT statement in the procedure; can the SQL Transformation push its result to the next transformation?
A: Appropriate output parameters are required for this to work; a sketch follows below.
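To illustrate the requirement, here is a minimal sketch of a stored procedure with an explicitly declared OUT parameter, submitted through the google-cloud-bigquery Python client; the procedure name and the counting logic are hypothetical, and only the sellout.account_src table comes from the question above.
from google.cloud import bigquery

client = bigquery.Client()

# Only parameters declared as OUT like this surface as output fields
# when the procedure is selected in a SQL Transformation.
ddl = """
CREATE OR REPLACE PROCEDURE sellout.count_accounts(
  IN account_filter STRING,
  OUT account_count INT64)
BEGIN
  SET account_count = (
    SELECT COUNT(*)
    FROM sellout.account_src
    WHERE UPPER(account_name) = UPPER(account_filter));
END
"""
client.query(ddl).result()  # creates or replaces the procedure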
Recently I have been working on a POC in Databricks, where I need to move my R script to a notebook in Databricks.
For running any SQL expression I need to point to the %sql interpreter and then write the query, which works fine.
However, is there any way I can save the query result to an object?
%sql
a <- SHOW databases
This is not working; the following error is shown:
Please let me know if anything like this is possible or not; as of now I can run queries using library(DBI)
and then save the result using dbGetQuery(....)
I would recommend using the spark.sql interface, since you are working in a Databricks notebook. Below is code that will work inside a Python Databricks notebook, for reference.
from pyspark.sql.functions import col
# execute and store query result in data frame, collect results to use
mytabs = (spark.sql("show databases")
          .select("databaseName")
          .filter(col("databaseName") == "<insert your database here, for example>"))
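# collect() returns a list of Row objects; take the first column of the first row as a string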
str(mytabs.collect()[0][0])
Just to add to Ricardo's answer, the first line in a command cell is parsed for an optional directive (beginning with a percentage symbol).
If no directive is supplied, then the default language (scala, python, sql, r) of the notebook is assumed. In your example, the default language of the notebook is Python.
When you supply %sql (it must be on the first parsed line), it assumes that everything in that command cell is a SQL command.
The command that you listed:
%sql
a <- SHOW databases
is actually mixing SQL with R.
If you want to return the result of a SQL query to an R variable, you would need to do something like the following:
%r
library(SparkR)
a <- sql("SHOW DATABASES")
You can find more such examples in the SparkR docs here:
https://docs.databricks.com/spark/latest/sparkr/overview.html#from-a-spark-sql-query
I have seen many similar questions, but they are not specific to Google Cloud Datalab AND using a UDF at the same time, e.g. "query execution shows Unknown TVS error" and "creating table from query result via python API".
I managed to create the table when no UDF was used, but when one was, the query returned the error "Unknown TVF: myFunc".
Edit
Here is the code I'm using:
%%bigquery udf --module transform_field
...udf function...
Then I used the UDF above in a SQL query:
%%sql --module querymodule
...complex SELECT query FROM transform_field(table)...
Then I'd like to use that query to create another table as shown below:
%%bigquery execute --target project:dataset.tablename --query querymodule
But it kept showing this error instead:
Unknown TVF: TRANSFORM_FIELD
Alright, I found it. It turns out you need to pass the query through a Python cell before using it in a %%bigquery execute cell:
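# Assumes the Datalab BigQuery module is available as bq
# (e.g. "import gcp.bigquery as bq" in the Datalab releases that use these magics)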
bq_query = bq.Query(querymodule, udf=transform_field)
Thus the entire process should go as follows:
%%bigquery udf --module transform_field
...udf function...
Then use the UDF above in a SQL query:
%%sql --module querymodule
...complex SELECT query FROM transform_field(table)...
Then use the query and udf function above to create a bq.Query object.
bq_query = bq.Query(querymodule, udf=transform_field)
Then use bq_query in table creation.
%%bigquery execute --target project:dataset.tablename --query bq_query
I keep being amazed at what a good night's sleep does.
I have a program (Flex) that queries a database and gets back a dataset (value, timestamp). The program then puts each value in the set through an algorithm, transforming all of the values. Instead of doing this transformation in my program, I would like MySQL to do it and send the results back. Essentially, I would like a SELECT statement that returns the modified dataset.
Assuming the computation can only be done in a stored procedure or function, yes, as long as you are using MySQL 5.0+. I recommend reading this article on MySQL stored procedures & functions.
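As an illustration, here is a minimal sketch assuming MySQL 5.0+ and the mysql-connector-python driver; the table (readings), its columns (value, ts), and the transformation itself (a simple linear scaling) are placeholders for your own.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="user",
                               password="secret", database="mydb")
cur = conn.cursor()

# Define the transformation once, server-side, as a stored function.
cur.execute("DROP FUNCTION IF EXISTS transform_value")
cur.execute(
    "CREATE FUNCTION transform_value(v DOUBLE) RETURNS DOUBLE "
    "DETERMINISTIC RETURN v * 1.5 + 10"
)

# The SELECT now returns the already-transformed dataset.
cur.execute("SELECT transform_value(value), ts FROM readings")
for new_value, ts in cur:
    print(new_value, ts)

cur.close()
conn.close()
Because the function runs inside MySQL, the client receives the transformed values directly and no longer needs to post-process the result set.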