Getting Exception while executing the Hive Create Table statement - hive

I am getting the "SemanticException [Error 10002]: Invalid column reference" while executing the below statement.
CREATE TABLE IF NOT EXISTS default.employee_details_3(FirstName VARCHAR(20),LastName VARCHAR(20)) COMMENT 'This is a test table mod' PARTITIONED BY(Emp_id INT,Gender VARCHAR(15),EmailAddress VARCHAR(40)) CLUSTERED BY(Emp_id,Gender,EmailAddress) INTO 14 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS SEQUENCEFILE ;
I have used the following link for reference
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable

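The error occurs because the CLUSTERED BY clause references the partition columns. CLUSTERED BY may only name columns from the table's regular column list; partition columns are declared separately in PARTITIONED BY and are not part of that list, so Hive reports "Invalid column reference". In other words, you cannot use the same column in both the PARTITIONED BY and the CLUSTERED BY clause.
Bucket on a regular column instead and it will work.
Try the query below: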
CREATE TABLE IF NOT EXISTS default.employee_details_3
(FirstName VARCHAR(20),
LastName VARCHAR(20)) COMMENT 'This is a test table mod'
PARTITIONED BY(Emp_id INT,Gender VARCHAR(15),EmailAddress VARCHAR(40))
CLUSTERED BY(FirstName) INTO 14 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS SEQUENCEFILE ;
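As a rough illustration of how the corrected table might then be loaded, the sketch below writes one partition. The staging table default.employee_staging and the partition values are made up for this example; only employee_details_3 comes from the question. The partition columns are supplied in the PARTITION clause rather than in the SELECT list, which reflects the same separation that makes them unavailable to CLUSTERED BY.

```sql
-- Assumption: on Hive versions before 2.0 only, run this first so that
-- inserts honour the bucket definition:
--   SET hive.enforce.bucketing = true;

-- default.employee_staging is a hypothetical table holding the two regular columns.
INSERT INTO TABLE default.employee_details_3
PARTITION (Emp_id = 1, Gender = 'F', EmailAddress = 'jane@example.com')
SELECT FirstName, LastName
FROM default.employee_staging;
```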

Related

Error when trying to create a new hive table

I am trying to create a new Hive table, but I am getting the following error
hive> create table salary(id int,name string,salary int,promoted string)
fields terminated by ','
lines terminated by '\n'
stored as textfile;
FAILED: ParseException line 1:67 missing EOF at 'fields' near ')'
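Your statement is missing the ROW FORMAT DELIMITED keywords that must precede FIELDS TERMINATED BY, which is why the parser stops at 'fields'. LINES TERMINATED BY '\n' is the default and can be omitted. The statement to create a table with comma-separated fields in Hive would be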
CREATE TABLE salary(id int,name string,salary int,promoted string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

columns has 2 elements while hbase.columns.mapping has 3 elements error while creating hive table from hbase

I'm getting the following error when I run the command below to create a Hive table.
sample is the Hive table I'm trying to create; hloan is my existing HBase table. Please help.
create external table sample(id int, name string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES("hbase.columns.mapping"=":key,hl:id,hl:name")
TBLPROPERTIES ("hbase.table.name"="hloan","hbase.mapred.output.outputtable"="sample");
ERROR:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: columns has 2 elements while hbase.columns.mapping has 3 elements (counting the key if implicit))
As the error says, your CREATE EXTERNAL TABLE statement declares 2 columns: id, name.
The HBase mapping, however, lists 3 entries: :key, hl:id, hl:name. The two counts must match.
Either create the table with 3 columns:
hive> create external table sample(key int, id int, name string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES("hbase.columns.mapping"=":key,hl:id,hl:name")
TBLPROPERTIES ("hbase.table.name"="hloan","hbase.mapred.output.outputtable"="hloan");
(or)
if the key and id columns hold the same data, you can drop hl:id from the mapping.
Create the table with 2 columns:
hive> create external table sample(id int, name string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES("hbase.columns.mapping"=":key,hl:name")
TBLPROPERTIES ("hbase.table.name"="hloan","hbase.mapred.output.outputtable"="hloan");

how to alter schema by inserting a new column in hive

I have a Hive table stored on the cluster. I want to modify it by adding a new column, keeping the data of the existing columns and filling the new column with data from another table. Is there a way to do it without recreating the table?
The old schema looks like:
create external table XXX
(item_id bigint,
start_dt string,
end_dt string,
title string,
subtitle string,
description string)
row format delimited fields terminated by '\t' lines terminated by '\n'
stored as textfile
location '/user/me/XXX';
You should be able to do it using the syntax below.
ALTER TABLE table_name
[PARTITION partition_spec] -- (Note: Hive 0.14.0 and later)
ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)
[CASCADE|RESTRICT] -- (Note: Hive 1.1.0 and later; this release was originally planned as 0.15.0)
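Applied to the table in the question, this might look like the sketch below. The column name category and the source table item_categories (joined on item_id) are assumptions for illustration only. ADD COLUMNS changes just the metadata, so existing rows read the new column as NULL until the data itself is rewritten.

```sql
-- Step 1: append the new column to the schema (metadata-only change).
ALTER TABLE XXX ADD COLUMNS (category STRING COMMENT 'filled from item_categories');

-- Step 2: rewrite the data so the new column is populated from the other table.
-- Try this on a copy first: INSERT OVERWRITE replaces the files under the table location.
-- item_categories is a hypothetical table assumed to hold one row per item_id.
INSERT OVERWRITE TABLE XXX
SELECT x.item_id, x.start_dt, x.end_dt, x.title, x.subtitle, x.description,
       c.category
FROM XXX x
LEFT OUTER JOIN item_categories c
  ON x.item_id = c.item_id;
```

If item_categories can hold several rows per item_id, the join would duplicate rows, so it may need to be deduplicated first.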

Unable to create table in hive

I am creating a table in Hive like this:
CREATE TABLE SEQUENCE_TABLE(
SEQUENCE_NAME VARCHAR2(225) NOT NULL,
NEXT_VAL NUMBER NOT NULL
);
But the result is a parse exception: Hive is unable to read VARCHAR2(225) NOT NULL.
Can anyone guide me on how to create a table like the one above, or suggest any other way to achieve this?
Hive has no VARCHAR2 or NUMBER type, and a NOT NULL clause is not accepted before Hive 3.0 (VARCHAR with a length exists from Hive 0.12.0 on). The simplest equivalent is:
CREATE TABLE SEQUENCE_TABLE( SEQUENCE_NAME string, NEXT_VAL bigint);
Please read this for CREATE TABLE syntax:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
Anyway, Hive is "SQL-like" but it is not SQL. I wouldn't use it for things such as a sequence table, since you don't have the support for transactions, locking, keys and everything else you are familiar with from Oracle (though I think newer versions have simple support for transactions, updates, deletes, etc.).
I would consider using a normal OLTP database for whatever you are trying to achieve.
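If the declared length matters, a closer translation of the Oracle DDL is possible on Hive 0.12.0 or later, where VARCHAR with a length is available. The sketch below is one possible mapping rather than the only one; choosing BIGINT for NUMBER assumes the values are whole numbers that fit in 64 bits.

```sql
CREATE TABLE SEQUENCE_TABLE (
  SEQUENCE_NAME VARCHAR(225),  -- stands in for Oracle VARCHAR2(225)
  NEXT_VAL      BIGINT         -- stands in for Oracle NUMBER (assumed integral)
);
```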
The only option you have here is something like:
CREATE TABLE SEQUENCE_TABLE(SEQUENCE_NAME String,NEXT_VAL bigint) row format delimited fields terminated by ',' stored as textfile;
PS: Again, this depends on the types of data you are going to load into Hive.
Use the following syntax:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format]
And an example of a Hive CREATE TABLE statement:
CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
salary String, destination String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

Create External Table atop pre-partitioned data

I have data that looks like this:
/user/me/output/
key1/
part_00000
part_00001
key2/
part_00000
part_00001
key3/
part_00000
part_00001
The data is pre-partitioned by "key_", and the "part_*" files contain my data in the form "a,b,key_". I create an external table:
CREATE EXTERNAL TABLE tester (
a STRING,
b INT
)
PARTITIONED BY (key STRING)
ROW FORMAT
DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/me/output/';
But a SELECT * gives no output. How can I create an external table that will read in this partitioned data?
You will have to change your directory structure so that Hive recognizes the folders as partitions, which means naming each folder column=value. It should be something like this.
/user/me/output/
key=key1/
part_00000
part_00001
key=key2/
part_00000
part_00001
key=key3/
part_00000
part_00001
Once this is done you can create a table on top of this using the query you mentioned.
CREATE EXTERNAL TABLE tester (
a STRING,
b INT
)
PARTITIONED BY (key STRING)
ROW FORMAT
DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/me/output/';
You will also have to explicitly add the partitions, or run msck repair on the table, to register the partitions in the Hive metastore. Either of these would do:
msck repair table tester;
OR
Alter table tester ADD PARTITION (key = 'key1');
Alter table tester ADD PARTITION (key = 'key2');
Alter table tester ADD PARTITION (key = 'key3');
Once you have done this, queries will return the data present in your folders.
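If renaming the directories is not an option, partitions can also be pointed at the existing folders explicitly. The sketch below assumes the original key1/, key2/, key3/ layout from the question is left untouched; msck repair would not discover these folders, because they do not follow the key=value naming.

```sql
-- Register each existing folder as a partition by giving its location explicitly.
ALTER TABLE tester ADD PARTITION (key = 'key1') LOCATION '/user/me/output/key1';
ALTER TABLE tester ADD PARTITION (key = 'key2') LOCATION '/user/me/output/key2';
ALTER TABLE tester ADD PARTITION (key = 'key3') LOCATION '/user/me/output/key3';

-- Check what the metastore now knows about, then query a single partition.
SHOW PARTITIONS tester;
SELECT a, b, key FROM tester WHERE key = 'key1';
```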