My data format is:
1::Toy Story (1995)::Animation|Children's|Comedy
When I try to load this data into Hive, the 3rd column is not read correctly from the file.
I created table as follows:
hive> create table movies(mid int,mname string,gn string)
row format delimited
fields terminated by '::'
lines terminated by '\n'
stored as TEXTFILE;
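If the table won't read the data, try replacing the field delimiter with the relevant Unicode escape for '::'.
To read the genres as an array, declare the third column as array<string> and set the collection item delimiter: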
hive> create table movies(mid int,mname string,gn array<string>)
row format delimited
fields terminated by '::'
collection items terminated by '|'
lines terminated by '\n'
stored as TEXTFILE;
Now you can load your dataset.
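One caveat, as far as I know: the default ROW FORMAT DELIMITED handling (LazySimpleSerDe) only honours a single-character field delimiter, so '::' is effectively read as ':' and an empty field shows up between each pair of real fields. That would explain the 3rd column being misread. A minimal sketch of a workaround, assuming your Hive installation ships the contrib MultiDelimitSerDe (the class name below is the Hive 0.14 to 3.x one; in Hive 4 it lives under org.apache.hadoop.hive.serde2, and older setups may need the hive-contrib jar added first). The file path is hypothetical:

```sql
-- Multi-character field delimiter via MultiDelimitSerDe (assumed to be on the classpath)
CREATE TABLE movies (mid INT, mname STRING, gn ARRAY<STRING>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES (
  "field.delim" = "::",
  "collection.delim" = "|"
)
STORED AS TEXTFILE;

-- Hypothetical local path to the '::'-delimited file
LOAD DATA LOCAL INPATH 'movies.dat' INTO TABLE movies;
```

If the SerDe is not available, a RegexSerDe table or pre-processing the file to a single-character delimiter are the usual alternatives.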
Related
I am trying to create a new Hive table, but I am getting the following error:
hive> create table salary(id int,name string,salary int,promoted string)
fields terminated by ','
lines terminated by '\n'
stored as textfile;
FAILED: ParseException line 1:67 missing EOF at 'fields' near ')'
In your example, the ROW FORMAT DELIMITED clause is missing before FIELDS TERMINATED BY. The statement to create a table with comma-separated fields in Hive would be:
CREATE TABLE salary(id int,name string,salary int,promoted string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
I am trying to load a local file with "|"-delimited values into a Hive table. We usually create the table with the option ROW FORMAT DELIMITED FIELDS TERMINATED BY '|', but I want to create a normal table and specify the delimiter when loading the data. What is the right syntax to use? Please suggest.
Working code:
CREATE TABLE IF NOT EXISTS testdb.TEST_DATA_TABLE
( column1 string,
column2 bigint
)ROW FORMAT DELIMITED FIELDS TERMINATED BY '|';
LOAD DATA LOCAL INPATH 'xxxxx.csv' INTO TABLE testdb.TEST_DATA_TABLE;
But I want to do:
CREATE TABLE IF NOT EXISTS testdb.TEST_DATA_TABLE
( column1 string,
column2 bigint
);
LOAD DATA LOCAL INPATH 'xxxxx.csv' INTO TABLE testdb.TEST_DATA_TABLE FIELDS TERMINATED BY '|';
The reason being: if I create the table the first way, HDFS will store the table's data with the "|" delimiter.
With the second DDL you provided, Hive will create a table in its default format (text, ORC, Parquet, etc., depending on your configuration), and a text table expects files delimited by Ctrl+A (\001), Hive's default field delimiter. LOAD DATA only moves the file into the table's location and does not accept a FIELDS TERMINATED BY clause.
If you want the HDFS file to stay pipe-delimited, you need to create the Hive table as a text table with the | delimiter.
(or)
You can also write the result of a SELECT query to a local or HDFS path with a pipe delimiter.
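The second option can be sketched as follows. The output directory is a hypothetical path, and the column names assume the table from the question:

```sql
-- Write query results as pipe-delimited text files.
-- '/tmp/test_data_pipe' is a hypothetical directory; its contents are overwritten.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/test_data_pipe'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
SELECT column1, column2
FROM testdb.TEST_DATA_TABLE;
```

Dropping the LOCAL keyword writes to an HDFS directory instead of the local filesystem.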
I used the sqoop-import command to bring data into Hive from Teradata. The sqoop-import command creates a text file with a comma (,) as the delimiter.
After the Sqoop import, I created an external table as shown below:
CREATE EXTERNAL TABLE IF NOT EXISTS employee ( eid int, name String,
salary String, description String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
But the description column has values like "abc,xyz,mnl", so the data does not load into the Hive table properly. How can I create a text file with a delimiter other than a comma while Sqooping?
And how do I then delimit the fields when creating the Hive external table?
Use --fields-terminated-by in your Sqoop job if you want to avoid the default delimiter.
--fields-terminated-by: sets the field separator character in the output.
Example: --fields-terminated-by '|'
Then change the field separator in the CREATE TABLE statement to FIELDS TERMINATED BY '|'.
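For illustration, the external table from the question with the separator switched to a pipe might look like this. The LOCATION is a hypothetical path and is assumed to be the directory the Sqoop job wrote to:

```sql
-- Same columns as in the question; only the field delimiter changes.
-- The LOCATION below is hypothetical: point it at the Sqoop target directory.
CREATE EXTERNAL TABLE IF NOT EXISTS employee (
  eid INT,
  name STRING,
  salary STRING,
  description STRING
)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/hive/external/employee';
```

This only helps if the description values never contain a pipe themselves; otherwise pick a character that cannot occur in the data.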
I tried to create table in hive as below:
create table IF NOT EXISTS department(deptid int, deptname(1) string, deptname(2) string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;
I am getting this error:
Error while compiling statement: FAILED: ParseException line 1:58 cannot recognize input near '(' '1' ')' in column type
Is there any other way to create column names that contain "("?
Use backticks (`) to quote column names that contain round brackets.
Backticks can be used for both table names and field names.
Try:
create table IF NOT EXISTS department(`deptid` int, `deptname(1)` string, `deptname(2)` string) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;
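The same quoting is needed whenever these columns are referenced later. A short sketch, assuming the department table above and Hive's default setting hive.support.quoted.identifiers=column:

```sql
-- Column names with brackets must be backtick-quoted in queries too.
SELECT `deptid`, `deptname(1)`, `deptname(2)`
FROM department
WHERE `deptname(1)` IS NOT NULL;
```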
I have a CSV file with the format
(id,name,courses)
and the data looks like
1,"ABC","C,C++,DS"
2,"DEF","Java"
How can I load this type of data into Hive?
First, create a table:
hive> create table tablename(text STRING, count INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Then load the data into Hive:
hive> LOAD DATA INPATH '/hdfspath' OVERWRITE INTO TABLE tablename;
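Note that a plain comma-delimited table will also split on the commas inside the quoted courses field ("C,C++,DS"). A sketch of an alternative that respects the quotes, using Hive's OpenCSVSerde (available since Hive 0.14). The table name students is hypothetical, and the SerDe reads every column as STRING, so numeric columns need a cast when queried:

```sql
-- OpenCSVSerde handles quoted fields that contain the separator.
-- All columns come back as STRING regardless of the declared type.
CREATE TABLE students (id STRING, name STRING, courses STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\"
)
STORED AS TEXTFILE;

LOAD DATA INPATH '/hdfspath' OVERWRITE INTO TABLE students;

-- Cast the id and split the course list into an array at query time.
SELECT CAST(id AS INT) AS id, name, split(courses, ',') AS course_list
FROM students;
```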