Load data into HIVE table - hive

My data format is:
1::Toy Story (1995)::Animation|Children's|Comedy
When I try to load the data into Hive, the 3rd column is not read correctly from the file.
I created the table as follows:
hive> create table movies(mid int,mname string,gn string)
row format delimited
fields terminated by '::'
lines terminated by '\n'
stored as TEXTFILE;

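If the table won't read the data, try specifying the field delimiter by its Unicode escape instead of the literal '::'. To keep the genres as separate values, declare the third column as an array and set a collection delimiter: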

hive> create table movies(mid int,mname string,gn array<string>)
row format delimited
fields terminated by '::'
collection items terminated by '|'
lines terminated by '\n'
stored as TEXTFILE;
Now you can load your dataset.
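One caveat worth adding: as far as I know, the built-in delimited row format only honors a single-character field delimiter, so '::' is effectively treated as ':' and every row gets empty fields between the real ones. Below is a minimal sketch of an alternative using MultiDelimitSerDe. It assumes the hive-contrib jar is available on your cluster and that your Hive version (0.14 through 3.x) ships the class under the contrib package shown; the file path is a hypothetical placeholder.

```sql
-- Assumption: hive-contrib is on the classpath (add it with ADD JAR if it is not).
-- In Hive 4.x the class lives at org.apache.hadoop.hive.serde2.MultiDelimitSerDe instead.
CREATE TABLE movies (
  mid   INT,
  mname STRING,
  gn    ARRAY<STRING>
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES (
  "field.delim"      = "::",  -- multi-character field delimiter
  "collection.delim" = "|"    -- separates the genres inside gn
)
STORED AS TEXTFILE;

-- Hypothetical local path; replace it with the location of your file.
LOAD DATA LOCAL INPATH '/tmp/movies.dat' INTO TABLE movies;

-- Each genre becomes one array element.
SELECT mid, mname, gn[0] AS first_genre FROM movies LIMIT 5;
```

If the contrib SerDe is not an option, preprocessing the file to replace '::' with a single character such as a tab is another route.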

Related

Error when trying to create a new hive table

I am trying to create a new Hive table, but I am getting the following error:
hive> create table salary(id int,name string,salary int,promoted string)
fields terminated by ','
lines terminated by '\n'
stored as textfile;
FAILED: ParseException line 1:67 missing EOF at 'fields' near ')'
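The ROW FORMAT DELIMITED clause is missing, so the parser does not expect 'fields' right after the column list. In your example, the statement to create a table with comma-separated fields in Hive would be: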
CREATE TABLE salary(id int,name string,salary int,promoted string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
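To confirm which delimiter a table actually ended up with, you can inspect its storage properties. A small sketch using the salary table from this answer; the file path and the sample row in the comment are hypothetical:

```sql
-- Shows the SerDe and storage parameters, including field.delim.
DESCRIBE FORMATTED salary;

-- Hypothetical local CSV; each line is assumed to look like: 1,Alice,50000,yes
LOAD DATA LOCAL INPATH '/tmp/salary.csv' INTO TABLE salary;

-- Rows that come back mostly NULL usually mean the delimiter does not match the file.
SELECT * FROM salary LIMIT 5;
```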

How to load a "|" delimited file into hive without creating a hive table with "ROW FORMAT DELIMITED"

I am trying to load a local file with "|" delimited values into a Hive table. We usually create the table with the option ROW FORMAT DELIMITED FIELDS TERMINATED BY '|', but I want to create a normal table and then load the data. What is the right syntax to use? Please suggest.
Working code:
CREATE TABLE IF NOT EXISTS testdb.TEST_DATA_TABLE
( column1 string,
column2 bigint
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|';
LOAD DATA LOCAL INPATH 'xxxxx.csv' INTO TABLE testdb.TEST_DATA_TABLE;
But I want to do:
CREATE TABLE IF NOT EXISTS testdb.TEST_DATA_TABLE
( column1 string,
column2 bigint
);
LOAD DATA LOCAL INPATH 'xxxxx.csv' INTO TABLE testdb.TEST_DATA_TABLE FIELDS TERMINATED BY '|';
The reason being: if I create the table the first way, HDFS will store the table's data with the "|" delimiter.
LOAD DATA has no FIELDS TERMINATED BY clause. With the second DDL you provided, Hive creates the table in its default format (text, ORC, Parquet, etc., depending on your configuration); for a text table, the default field delimiter is Ctrl+A (\001).
If you want the HDFS file to stay pipe-delimited, you need to create the Hive table as a text table with the | delimiter.
(or)
You can also write the result of a SELECT query to a local or HDFS path with a pipe delimiter.
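LOAD DATA only moves the file into the table's directory without parsing or converting it, so the delimiter has to be declared on some table. Here is a minimal sketch of the two workarounds mentioned above. The staging table name TEST_DATA_STAGE and the export directory are hypothetical; the file name is kept as given in the question.

```sql
-- 1) Staging table whose layout matches the pipe-delimited file.
CREATE TABLE IF NOT EXISTS testdb.TEST_DATA_STAGE (
  column1 STRING,
  column2 BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'xxxxx.csv' INTO TABLE testdb.TEST_DATA_STAGE;

-- 2) "Normal" table in the default format, filled from the staging table.
CREATE TABLE IF NOT EXISTS testdb.TEST_DATA_TABLE (
  column1 STRING,
  column2 BIGINT
);

INSERT INTO TABLE testdb.TEST_DATA_TABLE
SELECT column1, column2 FROM testdb.TEST_DATA_STAGE;

-- Writing query results back out with a pipe delimiter (Hive 0.11 or later).
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/test_data_export'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
SELECT column1, column2 FROM testdb.TEST_DATA_TABLE;
```

The INSERT ... SELECT step is where Hive actually parses the pipe-delimited rows and rewrites them in the target table's own format.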

How to create an external Hive table if the field value has comma separated values

I used the sqoop-import command to import data from Teradata into Hive. Sqoop-import creates a text file with a comma (,) as the delimiter.
After the import, I created an external table as shown below:
CREATE EXTERNAL TABLE IF NOT EXISTS employee ( eid int, name String,
salary String, description String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
But the description column has values like "abc,xyz,mnl", so the data does not load into the Hive table correctly. How can I make Sqoop write the text file with a delimiter other than a comma, and how should I then delimit the fields when creating the external Hive table?
Use --fields-terminated-by in your Sqoop job if you want to avoid the default delimiter.
--fields-terminated-by sets the field separator character in the output.
Example: --fields-terminated-by '|'
Then change the field separator in the CREATE TABLE statement to FIELDS TERMINATED BY '|'.
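Put together, the table definition might look like the sketch below. It assumes the Sqoop job was run with a pipe as the field terminator; the LOCATION path is a hypothetical directory standing in for wherever Sqoop wrote its files.

```sql
-- Assumes the Sqoop job used --fields-terminated-by '|'
-- and wrote its files under the hypothetical HDFS directory below.
CREATE EXTERNAL TABLE IF NOT EXISTS employee (
  eid         INT,
  name        STRING,
  salary      STRING,
  description STRING
)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/hive/external/employee';

-- description should now keep values such as "abc,xyz,mnl" intact.
SELECT eid, description FROM employee LIMIT 5;
```

Pick a delimiter that cannot occur inside any column value; if pipes can appear in the data, a control character may be a safer choice.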

Creating column names with "(" in hive 1.1.0

I tried to create a table in Hive as below:
create table IF NOT EXISTS department(deptid int, deptname(1) string, deptname(2) string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;
I am getting this error:
Error while compiling statement: FAILED: ParseException line 1:58 cannot recognize input near '(' '1' ')' in column type
Is there any other way to create columns with "(" in the name?
Wrap the column name in backticks (`) to escape the parentheses.
Backticks can be used for both table names and column names.
Try:
create table IF NOT EXISTS department(`deptid` int, `deptname(1)` string, `deptname(2)` string) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;
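The same quoting is needed whenever the column is referenced later, not only in the DDL. A short sketch, assuming hive.support.quoted.identifiers is left at its default value of column (available since Hive 0.13):

```sql
-- Backticks are required every time a column with special characters is referenced.
SELECT deptid, `deptname(1)`, `deptname(2)`
FROM department
WHERE `deptname(1)` IS NOT NULL;
```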

How to insert multivalued field into one column in hive

I have a CSV file with the format
(id,name,courses)
and the data looks like this:
1,"ABC","C,C++,DS"
2,"DEF","Java"
How can I load this kind of data into Hive?
First, create a table:
hive> create table tablename(text STRING, count INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
Then load the data into Hive:
hive> LOAD DATA INPATH '/hdfspath' OVERWRITE INTO TABLE tablename;
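Note that a plain delimited table does not understand quotes, so a value like "C,C++,DS" would be split at its inner commas. A minimal sketch of one alternative, using the OpenCSVSerde that ships with Hive 0.14 and later; the table name students_raw and the column names are assumptions based on the (id,name,courses) layout in the question:

```sql
-- OpenCSVSerde understands quoted fields; it reads every column as STRING.
CREATE TABLE students_raw (
  id      STRING,
  name    STRING,
  courses STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\""
)
STORED AS TEXTFILE;

LOAD DATA INPATH '/hdfspath' OVERWRITE INTO TABLE students_raw;

-- Turn the comma-separated courses string into an array at query time.
SELECT CAST(id AS INT) AS id,
       name,
       split(courses, ',') AS course_list
FROM students_raw;
```

Keeping courses as a single string in the raw table and splitting it in the query avoids having to pick a collection delimiter that conflicts with the field delimiter.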