How can I use PL/SQL in Hive using Spark?

import org.apache.spark.sql.hive.HiveContext
val hiveContext = new HiveContext(sc)
val s = hiveContext.sql("SELECT * FROM Test")
But I don't know how to use PL/SQL in Hive. Please help me.

It does not make sense to pass PL/SQL code to hiveContext.sql(), because it expects a query string, not a procedure.
The method returns a new DataFrame and does not perform the kind of operations usually done in PL/SQL code.
https://spark.apache.org/docs/1.3.0/api/java/org/apache/spark/sql/hive/HiveContext.html

It appears the answer is "yes", which I found in about 20 seconds by googling "hive spark pl/sql". The tool is called HPL/SQL, and it has a reference manual. Its documentation describes it as follows:
HPL/SQL is an open source tool (Apache License 2.0) that implements
procedural SQL language for Apache Hive, SparkSQL, Impala as well as
any other SQL-on-Hadoop implementation, any NoSQL and any RDBMS.
HPL/SQL is a hybrid and heterogeneous language that understands
syntaxes and semantics of almost any existing procedural SQL dialect,
and you can use with any database, for example, running existing
Oracle PL/SQL code on Apache Hive and Microsoft SQL Server, or running
Transact-SQL on Oracle, Cloudera Impala or Amazon Redshift.
HPL/SQL language is compatible to a large extent with Oracle PL/SQL, ANSI/ISO
SQL/PSM (IBM DB2, MySQL, Teradata i.e), PostgreSQL PL/pgSQL (Netezza),
Transact-SQL (Microsoft SQL Server and Sybase) that allows you
leveraging existing SQL/DWH skills and familiar approach to implement
data warehouse solutions on Hadoop. It also facilitates migration of
existing business logic to Hadoop. HPL/SQL is an efficient way to
implement ETL processes in Hadoop.

Related

Where can I find which SQL dialect MarkLogic TDE-based SQL supports?

MarkLogic TDE enables SQL-like access to document data.
Through a common ODBC driver, other BI tools can therefore access a MarkLogic database as if it were a relational database. The challenge I have is finding out which SQL dialect MarkLogic supports.
For example, I want to fetch the first 10 records to get a sample of the data. I could do that with
select top 10 * from book (ms sql)
or
select * from book where rownum <= 10 (oracle sql)
How do I do the same with MarkLogic SQL?
There are many SQL syntax questions of this kind. I need to find the equivalents of what I normally use with MS SQL.
Is there a wiki page that shows the differences between MarkLogic SQL and MS SQL?
In general, MarkLogic supports the syntax from the SQL92 standard.
Supported SQL Statements, Functions and Types
This section describes the SQL statements and functions supported in MarkLogic. The topics are:
Supported Statements
Supported Functions
Supported Types
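For the specific "first 10 rows" case, many non-Microsoft dialects spell it with a trailing LIMIT clause. Whether MarkLogic accepts LIMIT is an assumption here and should be checked against the Supported Statements topic. The sketch below only demonstrates the query shape, using Python's built-in sqlite3 module as a stand-in engine (not MarkLogic) and a hypothetical book table:

```python
import sqlite3

# In-memory stand-in database; the "book" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO book (id, title) VALUES (?, ?)",
    [(i, f"Title {i}") for i in range(1, 26)],
)

# LIMIT caps the number of rows returned; ORDER BY makes "first" well defined.
first_ten = conn.execute("SELECT * FROM book ORDER BY id LIMIT 10").fetchall()
```

Without an ORDER BY, which 10 rows come back is up to the engine, so a sample query may return different rows on different runs.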

Does any RDBMS provide the feature of prepared statements?

I learned the concept of prepared statements through JDBC in Java, so I think a prepared statement is a JDBC concept rather than an RDBMS concept.
To check whether my guess is right: does any major RDBMS provide prepared statements in its PL/PSM-like language, such as PL/SQL, PL/pgSQL, MySQL, or Transact-SQL?
If so, are prepared statements provided in SQL itself, or in the PL/PSM-like language?
I read "Difference Between Stored Procedures and Prepared Statements..?", but I could not find which systems provide prepared statements. I still think a prepared statement is a JDBC concept, not an RDBMS one, and that a stored procedure is an RDBMS-only concept.
Every implementation of SQL-compliant RDBMS should support an API for server-side prepared statements. I can't think of one RDBMS that doesn't support prepared statements.
JDBC has a class for PreparedStatement. The implementation varies with each brand of JDBC driver, but all those that I have used just delegate to the RDBMS API. The JDBC driver sends an SQL query string to the database server, and the SQL may contain parameter placeholders, for example ? (some brands, like Oracle, support named parameters).
Some database implementations provide packages or functions you can use to execute a prepared statement, so you can create a query at runtime within a stored procedure.
Oracle: https://docs.oracle.com/cd/A57673_01/DOC/api/doc/PAD18/ch8.htm
Microsoft SQL Server: https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-prepare-transact-sql?view=sql-server-2017
Some database implementations also support PREPARE and EXECUTE statements that you can call as a query. This allows you to use prepared statements in a stored procedure or an SQL script.
MySQL: https://dev.mysql.com/doc/refman/8.0/en/sql-syntax-prepared-statements.html
PostgreSQL: https://www.postgresql.org/docs/10/static/sql-prepare.html
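The driver-side flow described above (an SQL string with ? placeholders, with the values supplied separately) can be sketched in Python. This is only an illustration: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for a server RDBMS, and the table and helper names (book, find_titles) are hypothetical. Python's sqlite3 API has no separate prepare call, so the sketch shows parameter binding rather than an explicit PREPARE/EXECUTE pair.

```python
import sqlite3

def find_titles(conn, min_price):
    # The SQL text contains a ? placeholder; the value is passed separately,
    # so it is bound as data and never spliced into the query string.
    sql = "SELECT title FROM book WHERE price >= ? ORDER BY title"
    return [row[0] for row in conn.execute(sql, (min_price,))]

# In-memory stand-in database with a hypothetical "book" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO book (title, price) VALUES (?, ?)",
    [("Hive Basics", 20.0), ("Spark in Depth", 45.0), ("SQL Notes", 10.0)],
)

# The same SQL text is reused with different bound values.
cheap_and_up = find_titles(conn, 15.0)
expensive = find_titles(conn, 40.0)
```

The point of the pattern is that the query text stays constant while only the bound values change, which is what lets a server parse the statement once and execute it many times.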

Hana Column Store dialect to Oracle 12c SQL

While trying to benchmark Oracle Database In-Memory, we were looking for a publicly available benchmarking data set and tools. CH-benCHmark suits our requirement exactly, but its source files are written in the HANA column store dialect.
So our requirement is to convert these HANA column store dialect SQL statements to Oracle 12c SQL. A Google search returned conversions from the Oracle dialect to the HANA dialect, not the reverse.
Has anyone come across this requirement? Is there a simple or direct way to do the conversion?
Any pointers would be very helpful.
Yes, I have done this exercise. There is no direct way to convert from the HANA dialect to the Oracle dialect, but you can use ORACLE_LOADER and its semantics to effectively produce the Oracle equivalent. The only problem you may face is the flow: HANA's flow is quite different from Oracle's schema creation flow.
For example:
you can simply use the LOAD FROM FILE... syntax in HANA, but in Oracle you need an externally organized table.

What is the name of the SQL variant used by FirebirdSQL?

I have been researching the names of the SQL versions used by different DBMSs.
So far I have:
Microsoft SQL -> Transact SQL
PostgreSQL -> PL/pgSQL
MySQL -> standard SQL (ANSI)
Oracle -> PL/SQL
Firebird -> ?
I haven't found anything about this. I read somewhere that it's PSQL, but I'm not sure if that is true, since the search results for it return many pages about postgres...
Firebird simply has SQL, which is very close to standard SQL (probably closer than MySQL). It distinguishes a number of variants:
SQL, the basic variant (although some of the old InterBase documentation seems to use this to refer to ESQL as well)
ESQL (or Embedded SQL) which allows use of SQL directly in code (using a preprocessor), not used much these days
DSQL (or Dynamic SQL), this is what you usually use when executing queries against Firebird from a programming language
PSQL (or Procedural SQL) is the extension for stored procedures, stored functions, triggers and execute block

ANSI SQL PORTABILITY TO HADOOP HIVE conversion tool or macro

I am working on Hadoop Hive solutions. My requirement is to convert ANSI SQL queries to Hive queries using a tool or an Excel macro. Does any such tool or macro exist? If yes, which ones? If not, I need suggestions on how to implement one. Is this possible? Does Hive have alternative queries for DML statements (such as INSERT and UPDATE)? What are the pros and cons?
Any suggestions are highly appreciated.
I do not think that all of ANSI SQL can be ported to Hive, because Hive does not support joins other than equi-joins. SQL that relies on such joins cannot be ported as-is.
Another point: there are no updates in Hive; the data is read-only.
The rest looks very similar to ANSI SQL, and I would suggest trying to run the queries as-is.
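One commonly used workaround for the equi-join restriction is to move the non-equality comparison out of the join condition and into a WHERE filter over a cross join. Whether that is acceptable depends on table sizes, since the cross join is formed before the filter applies. The sketch below only checks that the two forms return the same rows; it uses Python's sqlite3 as a stand-in engine (not Hive), and the orders and tiers tables are hypothetical:

```python
import sqlite3

# In-memory stand-in database with hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, amount INTEGER);
    CREATE TABLE tiers (tier TEXT, min_amount INTEGER, max_amount INTEGER);
    INSERT INTO orders VALUES (1, 5), (2, 50), (3, 500);
    INSERT INTO tiers VALUES ('small', 0, 9), ('medium', 10, 99), ('large', 100, 999);
""")

# Non-equi join: the ON clause uses a range comparison, not equality.
non_equi = conn.execute("""
    SELECT o.order_id, t.tier
    FROM orders o JOIN tiers t
      ON o.amount BETWEEN t.min_amount AND t.max_amount
    ORDER BY o.order_id
""").fetchall()

# Rewritten form: cross join, with the range comparison moved to WHERE.
rewritten = conn.execute("""
    SELECT o.order_id, t.tier
    FROM orders o CROSS JOIN tiers t
    WHERE o.amount BETWEEN t.min_amount AND t.max_amount
    ORDER BY o.order_id
""").fetchall()
```

A conversion tool would need to apply rewrites of this kind query by query, which is one reason a purely mechanical ANSI-to-Hive translation is hard.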