I want to copy the structure of a full-load temp table and add additional table properties such as PARTITIONED BY (partition_col) and FORMAT='ORC'.
Temp table :
Create table if not exists tmp.temp_table( id int,
name string,
datestr string )
The temp table was created successfully.
Final table :
CREATE TABLE IF NOT EXISTS tmp.{final_table_name} (
LIKE tmp.temp_table
)
WITH (
FORMAT = 'ORC'
partitioned by('datestr')
)
But I get the error "Error: Error while compiling statement: FAILED: ParseException line 1:63 missing EOF at 'WITH' near 'temp_table' (state=42000,code=40000)"
Is there a way to achieve this?
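Do not use LIKE here; use CREATE TABLE AS SELECT (CTAS) with a predicate that is never true, such as WHERE 1=2. Hive has no WITH (...) clause for table properties (that is Presto/Athena syntax), so the partitioning and storage format go before AS SELECT. Note that a partitioned CTAS needs Hive 3.2.0 or later.
CREATE TABLE IF NOT EXISTS tmp.{final_table_name}
PARTITIONED BY (datestr)
STORED AS ORC
AS SELECT * FROM tmp.temp_table WHERE 1=2;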
LIKE creates an empty table with exactly the same definition as the source. CTAS takes the column order and data types from the SELECT and combines them with the properties you specify; because 1=2 is never true, the new table is created empty.
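On Hive versions where CTAS cannot create a partitioned table, one fallback is to declare the final table explicitly. The sketch below assumes a hypothetical table name tmp.final_table and copies the column list by hand from tmp.temp_table; the partition column moves out of the column list and into PARTITIONED BY.

```sql
-- Hypothetical table name; columns copied by hand from tmp.temp_table.
CREATE TABLE IF NOT EXISTS tmp.final_table (
  id   INT,
  name STRING
)
-- The partition column is declared here, not in the column list above.
PARTITIONED BY (datestr STRING)
STORED AS ORC;
```

The trade-off is that the column list has to be kept in sync with the temp table manually.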
I created an external table with a wrong (non-existent) path:
create external table IF NOT EXISTS ds_user_id_csv
(
type string,
imei string,
imsi string,
idfa string,
msisdn string,
mac string
)
PARTITIONED BY(prov string,day string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
stored as textfile
LOCATION 'hdfs://cdh0:8020/user/hive/warehouse/test.db/ds_user_id';
Now I cannot drop the table:
[cdh1:21000] > drop table ds_user_id_csv
> ;
Query: drop table ds_user_id_csv
ERROR:
ImpalaRuntimeException: Error making 'dropTable' RPC to Hive Metastore:
CAUSED BY: MetaException: java.lang.IllegalArgumentException: Wrong FS: hdfs://cdh0:8020/user/hive/warehouse/test.db/ds_user_id, expected: hdfs://nameservice1
So how to solve this? Thank you.
Use the following command to change the location, then drop the table:
ALTER TABLE ds_user_id_csv SET LOCATION '{new location}';
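As a concrete sketch of that fix: the error message says the metastore expects paths under hdfs://nameservice1, so one plausible new location is the same directory addressed through that nameservice. That path is an assumption; any valid location on the expected filesystem should do.

```sql
-- Assumption: same directory as before, addressed via the nameservice
-- named in the error message instead of the cdh0:8020 host.
ALTER TABLE ds_user_id_csv
  SET LOCATION 'hdfs://nameservice1/user/hive/warehouse/test.db/ds_user_id';

-- With a location on the expected filesystem, the drop can be retried.
DROP TABLE ds_user_id_csv;
```

If partitions were already added with the old host in their paths, they may need the same treatment before the drop succeeds.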
I have a very basic question: how can I add a very simple table to Hive? My table is saved in a text file (.txt) stored in HDFS. I have tried to create an external table in Hive that points to this file, but when I run an SQL query (select * from table_name) I don't get any output.
Here is an example code:
create external table Data (
dummy INT,
account_number INT,
balance INT,
firstname STRING,
lastname STRING,
age INT,
gender CHAR(1),
address STRING,
employer STRING,
email STRING,
city STRING,
state CHAR(2)
)
LOCATION 'hdfs:///KibTEst/Data.txt';
KibTEst/Data.txt is the path of the text file in HDFS.
The rows in the table are separated by carriage returns, and the columns are separated by commas.
Thanks for your help!
You just need to create an external table pointing to your file location in HDFS, with delimiter properties as below:
create external table Data (
dummy INT,
account_number INT,
balance INT,
firstname STRING,
lastname STRING,
age INT,
gender CHAR(1),
address STRING,
employer STRING,
email STRING,
city STRING,
state CHAR(2)
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
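LOCATION 'hdfs:///KibTEst/';
Note that LOCATION should name the directory that holds the file, not the file itself; every file in that directory is read as table data. You do not need to load anything, because the file is already in HDFS and an external table reads directly from the location given in the CREATE statement. Test it with the SELECT statement below: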
SELECT * FROM Data;
create external table Data (
dummy INT,
account_number INT,
balance INT,
firstname STRING,
lastname STRING,
age INT,
gender CHAR(1),
address STRING,
employer STRING,
email STRING,
city STRING,
state CHAR(2)
)
row format delimited
FIELDS TERMINATED BY ','
stored as textfile
LOCATION 'Your hdfs location for external table';
If the data is already in HDFS, use:
LOAD DATA INPATH 'hdfs_file_or_directory_path' INTO TABLE tablename;
Then use select * from table_name.
create external table Data (
dummy INT,
account_number INT,
balance INT,
firstname STRING,
lastname STRING,
age INT,
gender CHAR(1),
address STRING,
employer STRING,
email STRING,
city STRING,
state CHAR(2)
)
row format delimited
FIELDS TERMINATED BY ','
stored as textfile
LOCATION '/Data';
Then load file into table
LOAD DATA INPATH '/KibTEst/Data.txt' INTO TABLE Data;
Then
select * from Data;
I hope the points below help answer the question asked by #mshabeen.
There are several ways to load data into a Hive table that is created as an external table.
When creating the external table, you can use the LOCATION option to specify the HDFS, S3 (in the case of AWS) or file location that already holds the data. Or, after creating the table, you can use LOAD DATA INPATH to load data from HDFS, S3 or a file.
Alternatively, you can use the ALTER TABLE command to attach data to Hive partitions.
Some details follow.
Using LOCATION - Used while creating the Hive table. In this case the data is already in place and can be queried through the Hive table immediately.
Using the LOAD DATA INPATH option - This Hive command loads data from the specified location. The point to remember here is that the data is MOVED from the input path to the Hive warehouse path.
Example -
LOAD DATA INPATH 'hdfs://cluster-ip/path/to/data/location/' INTO TABLE table_name;
Using the ALTER TABLE command - This is mostly used to add data from other locations to Hive partitions. It requires that the partition columns are already defined on the table and that the partition values are known in advance. With dynamic partitioning this command is not required.
Example -
ALTER TABLE table_name ADD PARTITION (date_col='2018-02-21') LOCATION 'hdfs/path/to/location/'
The above statement maps the partition to the specified data location (in this case HDFS). However, the data will NOT be MOVED to the Hive internal warehouse location.
I am trying to insert data into a Hive table using dynamic partitioning. The table is:
CREATE EXTERNAL TABLE target_tbl_wth_partition(
booking_id string,
code string,
txn_date timestamp,
logger string,
)
partition by (txn_date date,txn_hour int)
Values
txn_date=20160216
txn_hour=12
CREATE EXTERNAL TABLE stg_target_tbl_wth_partition(
booking_id string,
code string,
txn_date timestamp,
logger string,
)
insert overwrite table target_tbl_wth_partition partition(txn_date,hour(txn_date))
select booking_id,code,txn_date,logger from stg_target_tbl_wth_partition;
I am not able to insert with derived columns in a dynamic partition. Any help on how to proceed in such a case would be appreciated.
Regards,
Rakesh
I suggest you start from something like this...
CREATE TABLE blahblah (...)
PARTITIONED BY (aaa STRING, bbb STRING)
;
SET hive.exec.dynamic.partition = true
;
SET hive.exec.dynamic.partition.mode = nonstrict
;
INSERT INTO TABLE blahblah PARTITION (aaa, bbb)
SELECT ...,
SUBSTRING(aaabbb,1,5) as aaa,
SUBSTRING(aaabbb,7,2) as bbb
FROM sthg
;
...and make it work; then you can start experimenting with weird and unsupported syntax and see what works and what does not.
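Applied to the tables in the question, that pattern might look like the sketch below. The partition column name txn_day is hypothetical: a partition column cannot share a name with a regular column, so the question's second txn_date is renamed here. The derived expressions go in the SELECT list, last and in partition order, rather than inside the PARTITION clause.

```sql
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Hypothetical rework of the target table: txn_day replaces the
-- duplicated txn_date partition column from the question.
CREATE EXTERNAL TABLE target_tbl_wth_partition (
  booking_id STRING,
  code       STRING,
  txn_date   TIMESTAMP,
  logger     STRING
)
PARTITIONED BY (txn_day STRING, txn_hour INT);

-- PARTITION lists column names only; the values come from the
-- trailing SELECT columns, matched by position.
INSERT OVERWRITE TABLE target_tbl_wth_partition PARTITION (txn_day, txn_hour)
SELECT booking_id,
       code,
       txn_date,
       logger,
       CAST(to_date(txn_date) AS STRING) AS txn_day,
       hour(txn_date)                    AS txn_hour
FROM stg_target_tbl_wth_partition;
```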
We have configured Spark SQL (1.3.2) to work on top of Hive, and we use Beeline to create the tables.
I was trying to create a table with a BIGINT column. However, the column is created as INT when I use the command below:
CREATE TEMPORARY TABLE cars (blank bigint)
USING com.databricks.spark.csv
OPTIONS (path "cars.csv", header "false")
However, when I use the command below, I am able to create a table with a BIGINT column:
CREATE TABLE cars(blank bigint)
How can I create a table with a BIGINT column using the first method?
Is it because of this?
"Integral literals are assumed to be INT by default, unless the number exceeds the range of INT in which case it is interpreted as a BIGINT, or if one of the following postfixes is present on the number."
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-IntegralTypes(TINYINT,SMALLINT,INT,BIGINT)