PrestoDB Hive Catalog: no viable alternative at input 'CREATE EXTERNAL' - hive

I am running the following query against the Hive catalog in the Teradata distribution of PrestoDB:
CREATE EXTERNAL TABLE hive.default.mydata (
id INT, datetime timestamp, latitude FLOAT,
longitude FLOAT, bookingid VARCHAR, pre_lat FLOAT,
pre_long FLOAT, time_hour decimal(6, 1), dist_kms decimal(6, 1),
ma6_dist_kms decimal(6, 1), istationary INT, quality_overall VARCHAR,
quality_nonstationary VARCHAR, cartype VARCHAR, isbigloss INT,
bookregion VARCHAR, iho_road VARCHAR)
STORED AS PARQUET
LOCATION "s3://sb.mycompany.com/someFolder/anotherFolder";
It throws the following exception:
Query 20180316_022346_00001_h9iie failed: line 1:8: no viable alternative at input 'CREATE EXTERNAL'
Even when I run use hive followed by a show tables command, I get the error "Schema is set but catalog is not":
presto> use hive;
presto:hive> show tables;
Error running command:
Error starting query at http://localhost:8080/v1/statement returned HTTP response code 400.
Response info:
JsonResponse{statusCode=400, statusMessage=Bad Request, headers={Content-Length=[32], Date=[Fri, 16 Mar 2018 02:25:25 GMT], Content-Type=[text/plain]}, hasValue=false, value=null}
Response body:
Schema is set but catalog is not
Any help would be appreciated. Thanks.

Presto has no CREATE EXTERNAL TABLE statement. To create a Hive external table from Presto, use CREATE TABLE with the external_location table property, for example:
CREATE TABLE hive.web.request_logs (
request_time timestamp,
url varchar,
ip varchar,
user_agent varchar
)
WITH (
format = 'TEXTFILE',
external_location = 's3://my-bucket/data/logs/'
)
Please visit this page to see how to interact with Hive from Presto: https://docs.starburstdata.com/latest/connector/hive.html?highlight=hive
use hive; sets only the current schema for the session, not the catalog, which is why you get "Schema is set but catalog is not". I think you wanted something like USE hive.default; instead. For more details, see: https://docs.starburstdata.com/latest/sql/use.html

Related

DateTime Conversion Error - Excel to SQL

I have two data sets on two different SQL servers. I exported dataset 1 to Excel and now want to load it into a temp table so I can query it against the data on server 2. Here is the SQL code that I have created:
Create table #JWTemp1 (AutoID varchar(10), IDNumber varchar(20), AltIDNumber varchar(20), AdmitDTTM datetime, AdmitDay varchar(15), AdmitWeekNo int)
Insert into #JWTemp1 Values('ID001','BCC445567','ABC445567','42510.7326388889','Friday','21')
Each time I try to run the code, I get the following error:
Conversion failed when converting date and/or time from character string.
I know this is a common error, but I've tried all manner of solutions and got nowhere. Any suggestions?
You have to format the string. I'm not sure which DB you are using, but here is the syntax for MySQL.
DATE_FORMAT(colName, '%Y-%m-%d') DATEONLY
DATE_FORMAT(colName,'%H:%i:%s') TIMEONLY
Unfortunately, none of the answers provided seemed to work. However, I solved the issue using the following logic:
Create table #JWTemp1 (AutoID varchar(10), IDNumber varchar(20), AltIDNumber varchar(20), AdmitDTTM datetime, AdmitDay varchar(15), AdmitWeekNo int)
Insert into #JWTemp1 Values('ID001','BCC445567','ABC445567','42510.7326388889','Friday','21')
Select
convert(datetime, Convert(float,AdmitDTTM))
From #JWTemp1
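The value 42510.7326388889 in the insert is how Excel stores a date-time: whole days plus a fraction of a day. As a cross-check on the converted result, here is a minimal Python sketch of that arithmetic. The function name is illustrative, and it assumes the workbook uses Excel's default 1900 date system and that the serial falls after February 1900, so that serial 0 lines up with 1899-12-30.

```python
from datetime import datetime, timedelta

# Assumption: the workbook uses Excel's default 1900 date system and the
# serial is later than February 1900, so serial 0 lines up with 1899-12-30.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel serial date (days plus fractional day) to a datetime.

    The value is rounded to the nearest second to absorb floating-point noise.
    """
    seconds = round(serial * 86400)
    return EXCEL_EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    value = excel_serial_to_datetime(42510.7326388889)
    print(value, value.strftime("%A"))
```

For the sample row this gives 2016-05-20 17:35:00, a Friday, which agrees with the AdmitDay value. SQL Server counts a float converted to datetime as days from 1900-01-01, so converting an Excel serial directly lands two days later than Excel's own date; subtracting 2 from the float before the conversion is one way to compensate.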

Vora Modeler View preview fails with com.sap.spark.vora.client.jdbc.VoraJdbcException

I'm running Vora 1.3 on HDP 2.4.3 with Spark 1.6.2.
I've got two tables with data of the same schema: one resides in a HANA db, the other is stored as a CSV file in HDFS.
I created both tables in Vora using Zeppelin:
CREATE TABLE flights_2006 (Year int, Month_ int, DayofMonth int, DayOfWeek int, DepTime int, CRSDepTime int, ArrTime int, CRSArrTime int, UniqueCarrier string, FlightNum int,
TailNum string, ActualElapsedTime int, CRSElapsedTime int, AirTime int, ArrDelay int, DepDelay int, Origin string, Dest string, Distance int, TaxiIn int, TaxiOut int,
Cancelled int, CancellationCode int, Diverted int, CarrierDelay int, WeatherDelay int, NASDelay int, SecurityDelay int, LateAircraftDelay int)
USING com.sap.spark.vora
OPTIONS (
files "/exch/flights_filtered/part-00000,/exch/flights_filtered/part-00001,/exch/flights_filtered/part-00002,/exch/flights_filtered/part-00003,/exch/flights_filtered/part-00004",
csvdelimiter ","
)
Q1. By the way, when is it going to be possible to supply just directory names, not list all files in a directory, when creating Vora tables from file sources? It's very impractical, as one cannot predict how many part-files are going to be in a directory.
CREATE TABLE flights_2007
USING com.sap.spark.hana
OPTIONS (
tablepath "XXXXXXXXXXXX",
dbschema "XXXXXXXXXX",
host "XXXXXXXXXXX",
instance "00",
user "XXXXXXXXXXX",
passwd "XXXXXXXXXX"
)
And I was able to produce a result from the table join for these two (business meaning of such join set aside):
select f7.MONTH, f7.DAYOFMONTH, f7.UNIQUECARRIER, f7.FLIGHTNUM, f7.YEAR, f7.DEPTIME, f6.year, f6.DepTime
from flights_2007 as f7 inner join flights_2006 as f6
on f7.MONTH = f6.Month_ and f7.DAYOFMONTH = f6.DayofMonth and f7.UNIQUECARRIER = f6.UniqueCarrier and f7.FLIGHTNUM = f6.FlightNum
where f7.MONTH = 1 and f7.DAYOFMONTH = 2 and f7.UNIQUECARRIER = 'WN'
Then I tried to do the very same steps in Vora Modeler.
Q2. How come REGISTER TABLE in Zeppelin doesn't make the tables available in Vora Modeler?
So, I executed the same two table creation statements in Vora Modeler, using all capitals in table names, as I remember Vora had some issues with that earlier. Then I created a Vora View as a join of the two tables with this condition:
FLIGHTS_2007.MONTH = FLIGHTS_2006.MONTH_ and
FLIGHTS_2007.DAYOFMONTH = FLIGHTS_2006.DAYOFMONTH and
FLIGHTS_2007.UNIQUECARRIER = FLIGHTS_2006.UNIQUECARRIER and
FLIGHTS_2007.FLIGHTNUM = FLIGHTS_2006.FLIGHTNUM
.. and used the where-condition:
FLIGHTS_2007.MONTH = 1 and
FLIGHTS_2007.DAYOFMONTH = 2 and
FLIGHTS_2007.UNIQUECARRIER = 'WN'
Expected result for that View preview would be the same as for Zeppelin-based select. Actual result (first few lines of):
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 2165.0 failed 4 times, most recent failure: Lost task 2.3 in stage 2165.0 (TID 78743, eba165.extendtec.com.au): com.sap.spark.vora.client.jdbc.VoraJdbcException: [Vora [eba165.extendtec.com.au:34530.1615085]] Unknown error when executing SELECT "FLIGHTS_2006"."FLIGHTNUM", "FLIGHTS_2006"."DEPTIME", "FLIGHTS_2006"."UNIQUECARRIER", "FLIGHTS_2006"."MONTH_", "FLIGHTS_2006"."YEAR" FROM "FLIGHTS_2006": HL(9): Runtime error. (schema error: could not resolve column "FLIGHTS_2006"."YEAR" (sql parse error))
at com.sap.spark.vora.client.jdbc.VoraJdbcClient.liftedTree1$1(VoraJdbcClient.scala:210)
at com.sap.spark.vora.client.jdbc.VoraJdbcClient.generateAutocloseableIteratorFromQuery(VoraJdbcClient.scala:187)
at com.sap.spark.vora.client.VoraClient$$anonfun$generateAutocloseableIteratorFromQuery$1.apply(VoraClient.scala:363)
at com.sap.spark.vora.client.VoraClient$$anonfun$generateAutocloseableIteratorFromQuery$1.apply(VoraClient.scala:363)
at scala.util.Try$.apply(Try.scala:161)
at com.sap.spark.vora.client.VoraClient.handleExceptions(VoraClient.scala:775)
at com.sap.spark.vora.client.VoraClient.generateAutocloseableIteratorFromQuery(VoraClient.scala:362)
at com.sap.spark.vora.VoraRDD.compute(voraRDD.scala:54)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
Q3. Did I do anything wrong in Vora Modeler? Or is it actually a bug?
You mention that you used all caps for table names when running your CREATE statements. In my experience with the 1.3 Modeler, you must use all uppercase for your column names as well.
schema error: could not resolve column "FLIGHTS_2006"."YEAR"
For example, if you used "CREATE TABLE FLIGHTS_2006 (Year int, ...", try changing that to "CREATE TABLE FLIGHTS_2006 (YEAR int, ..."
Regarding your Q1: yes, this is something we are currently reviewing as a feature request.
Regarding your Q2: is your Zeppelin connected to the same Vora Thriftserver as your Vora Modeler (aka Vora Tools)?
Regarding your Q3: the other reply from Ryan is correct, column names are also case-sensitive in Vora 1.3.
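On Q1, until directory paths are accepted, one possible workaround is to generate the value of the files option from a directory listing instead of typing it by hand. The Python sketch below only does the string assembly. The function name is hypothetical, and the listing itself (for example, the names returned by an HDFS listing) is assumed to be supplied by the caller.

```python
from typing import Iterable


def build_files_option(directory: str, names: Iterable[str]) -> str:
    """Return a comma-separated list of part-file paths for the files option.

    Hypothetical helper: `names` is whatever a directory listing returned.
    Only entries starting with "part-" are kept, sorted for a stable order.
    """
    base = directory.rstrip("/")
    parts = sorted(n for n in names if n.startswith("part-"))
    return ",".join(f"{base}/{n}" for n in parts)


if __name__ == "__main__":
    listing = ["_SUCCESS", "part-00001", "part-00000"]
    print(build_files_option("/exch/flights_filtered", listing))
```

The resulting string can be pasted into the files "..." entry of the CREATE TABLE statement, so the statement no longer depends on knowing the number of part-files in advance.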

SQL Server 2008 varchar and char not working

I get an error saying varchar or char is not a recognized data type in SQL Server. However, I use the same software, SQL Server 2008, in college, so I do not want to switch to a different version of SQL Server. Is there any way I can fix this?
Major Error 0x80040E14, Minor Error 26302
create table abc
(
id int,
name varchar(15)
)
Error is:
The specified data type is not valid. [ Data type (if known) = varchar ]
varchar(n) is not supported in SQL Server Compact. Here is the full list of types supported.
Following are the types that you can use.
nvarchar(n)
nchar(n)
ntext
So you might need to change your columns to nvarchar(10), nvarchar(5), nchar(1) and so on.
Really? I tried it, and I also tried creating a temp table; both seem to work.
create table abc (id int, name varchar(15))
create table #tmp (id int, name varchar(15))
http://sqlfiddle.com/#!3/66b44

BIGINT in SPARK SQL

We have configured Spark SQL (1.3.2) to work on top of Hive, and we use Beeline to create the tables.
I was trying to create a table with the BIGINT datatype. However, I see that the table is created with the INT datatype when I use the command below:
CREATE TEMPORARY TABLE cars (blank bigint)
USING com.databricks.spark.csv
OPTIONS (path "cars.csv", header "false")
However, when I use the command below, I am able to create a table with the bigint datatype:
CREATE TABLE cars(blank bigint)
Can you let me know how I can create a table with the BIGINT datatype using the first method?
Is it because of this?
"Integral literals are assumed to be INT by default, unless the number exceeds the range of INT in which case it is interpreted as a BIGINT, or if one of the following postfixes is present on the number."
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-IntegralTypes(TINYINT,SMALLINT,INT,BIGINT)

Bulk inserting data gives error

Attempting to bulk insert into a table and I am getting the error:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 31, column 4 (Birthday).
Below is the code I am trying to use to insert the data:
Bulk Insert Dzt.dbo.Player
From 'A:\New Folder\Seed_Files\Player.csv'
With
(
FieldTerminator=',',
RowTerminator='\n',
FirstRow=2
)
Here is the code I used when making the table:
Use Dzt
Create Table Player
(
Player_ID int,
FirstName varchar(255),
LastName varchar(255),
Birthday date,
Email varchar(255),
L_Flag varchar(255)
);
This is my first attempt at making a table and inserting data, so I think it is likely a datatype error for the Birthday field, but I have been unable to find anything online that I can wrap my head around at this time. I have also tried using the datatype datetime instead of date, but I received the same error.
I am using SSMS 2012 to create and insert the data onto a 2012 SQL Server.
Let me know if there is anything else I can provide that might help.
As you suspect it could be a date format error, I would suggest importing the csv into a table with Birthday column set to varchar type. Then use this query to filter the erroneous records.
select birthday from temptable where isdate(birthday) = 0
You could then correct those records and then insert them into your old table.
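A similar check can also be made on the CSV itself before loading it. The Python sketch below reports records whose Birthday field does not parse as a date. It assumes dates are written as YYYY-MM-DD and that Birthday is the fourth column, as in the table definition. The function name and the date format are assumptions to adjust to the actual file.

```python
import csv
import io
from datetime import datetime
from typing import Iterable, List, Tuple


def find_bad_dates(lines: Iterable[str], column: int = 3,
                   fmt: str = "%Y-%m-%d") -> List[Tuple[int, str]]:
    """Return (record_number, value) pairs whose date column fails to parse.

    Record numbers are 1-based and count the header as record 1.
    Assumes `fmt` is the date format used in the file.
    """
    bad = []
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for record_number, row in enumerate(reader, start=2):
        # A short row has no date field at all; treat it as an empty value.
        value = row[column].strip() if len(row) > column else ""
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            bad.append((record_number, value))
    return bad


if __name__ == "__main__":
    sample = io.StringIO(
        "Player_ID,FirstName,LastName,Birthday,Email,L_Flag\n"
        "1,Ann,Lee,1990-04-12,ann@example.com,Y\n"
        "2,Bob,Ray,12/31/1988,bob@example.com,N\n"
    )
    print(find_bad_dates(sample))
```

To run it against the real file, pass an open file handle (opened with newline="") instead of the in-memory sample.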