Hive showing NULL value

I'm trying to get my HDFS data into Hive, but it shows NULL values.
Here is the sequence of commands that I'm executing:
hive> LOAD DATA INPATH '/user/hadoop/oriondemo/data/Floor1_Floor/Floor1_Floor.txt' OVERWRITE INTO TABLE new;
Loading data to table default.new
OK
Time taken: 0.538 seconds
hive> select * from new;
OK
NULL NULL
NULL NULL
NULL NULL
NULL NULL
NULL NULL
Time taken: 0.321 seconds, Fetched: 5 row(s)
hive>

What is the column separator you are using in this text file? If it is a comma (,), then check the values: some values may contain an additional comma, e.g. the values of an address column.
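As a minimal sketch of that check, declaring the delimiter explicitly when recreating the table (the two column names and the tab delimiter are assumptions, since the definition of table new isn't shown):

hive> DROP TABLE new;
hive> CREATE TABLE new (col1 STRING, col2 STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t' -- must match the separator actually used in Floor1_Floor.txt
> STORED AS TEXTFILE;
hive> LOAD DATA INPATH '/user/hadoop/oriondemo/data/Floor1_Floor/Floor1_Floor.txt' OVERWRITE INTO TABLE new;

Note that the earlier LOAD DATA INPATH already moved the file out of /user/hadoop/oriondemo/data/Floor1_Floor/, so it may need to be copied back to that path before reloading.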

Related

Hive shows NULL value

I have a question related to Hive: Cygnus creates an external table in Hive, and it creates the table in the default database. But when I fetch data from Hive using the query below, it shows me NULL values:
hive> select * from hadoop_abcdx002ftestsinkx002fx0052oom1_x0052oom_row
> ;
OK
NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL NULL
NULL
Time taken: 1.495 seconds, Fetched: 12 row(s)
Can anyone please help me with how to proceed further with this issue?
Just run the command below:
SHOW CREATE TABLE hadoop_abcdx002ftestsinkx002fx0052oom1_x0052oom_row;
Check the LOCATION and go to the path where the underlying CSV file is present.
Check whether there is actually data in it.
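The location can be inspected without leaving the Hive CLI by using its built-in dfs command. A sketch, where the path is a placeholder to be replaced by the LOCATION printed by SHOW CREATE TABLE:

hive> SHOW CREATE TABLE hadoop_abcdx002ftestsinkx002fx0052oom1_x0052oom_row;
hive> dfs -ls /path/from/show/create/table;
hive> dfs -cat /path/from/show/create/table/*;

If dfs -cat prints nothing, the table is empty; if it prints rows, the NULLs usually mean the file's delimiter or format does not match the table definition.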

Insert values in Hive tables with primitive and complex data types

If I have only one table, such as student, and the table definition and schema are as follows:
hive> create table student1(S_Id int,
> S_name Varchar(100),
> Address Struct<a:int, b:String, c:int>,
> marks Map<String, Int>);
OK
Time taken: 0.439 seconds
hive>
hive> Describe Student1;
OK
s_id int
s_name varchar(100)
address struct<a:int,b:string,c:int>
marks map<string,int>
Time taken: 0.112 seconds, Fetched: 4 row(s)
Now I am trying to insert values into that Student1 table, like this:
hive> insert into table student1 values(1, 'Afzal', Struct(42, 'nelson Ave NY', 08309),MAP("MATH", 89));
I am getting that error
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
How do I insert values for one record in one go? Can anyone please help me?
It works when using an INSERT ... SELECT statement. Create a dummy table with a single row, or use some existing table and add LIMIT 1. Also use the named_struct function:
Demo:
hive> insert into table student1
select 1 s_id,
'Afzal' s_name,
named_struct('a',42, 'b','nelson Ave NY', 'c',08309) address,
MAP('MATH', 89) marks
from default.dual limit 1; -- this is a dummy table
Loading data to table dev.student1
Table dev.student1 stats: [numFiles=1, numRows=1, totalSize=48, rawDataSize=37]
OK
Time taken: 27.175 seconds
Check data:
hive> select * from student1;
OK
1 Afzal {"a":42,"b":"nelson Ave NY","c":8309} {"MATH":89}
Time taken: 0.125 seconds, Fetched: 1 row(s)
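The demo above assumes a one-row helper table default.dual already exists. A hedged sketch for creating one on Hive 0.14+, where INSERT ... VALUES does work for primitive columns:

hive> CREATE TABLE default.dual (dummy STRING);
hive> INSERT INTO TABLE default.dual VALUES ('X'); -- a single row; the demo's LIMIT 1 guards against extras

Any existing non-empty table plus LIMIT 1 works just as well, as noted above.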

Not able to remove a column from hive table

I am not able to remove a column from an existing Hive external table using ALTER.
The Hive version is 1.1, inside CDH 5.5.
hive> create external table alter_test(id int,name string)
> row format delimited
> fields terminated by ','
> location '/user/cloudera/conf_files';
OK
Time taken: 0.132 seconds
hive> select * from alter_test;
OK
100 surender
101 raja
Time taken: 0.141 seconds, Fetched: 2 row(s)
hive> alter table alter_test ADD COLUMNS (deviceid string,mode string,channels int,action_name string,data_countt int);
OK
Time taken: 0.2 seconds
hive> show create table alter_test;
OK
CREATE EXTERNAL TABLE `alter_test`(
`id` int,
`name` string,
`deviceid` string,
`mode` string,
`channels` int,
`action_name` string,
`data_countt` int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://nameservice1/user/cloudera/conf_files'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='false',
'last_modified_by'='build',
'last_modified_time'='1500048081',
'numFiles'='0',
'numRows'='-1',
'rawDataSize'='-1',
'totalSize'='0',
'transient_lastDdlTime'='1500048081')
Time taken: 0.049 seconds, Fetched: 25 row(s)
hive> select * from alter_test;
OK
100 surender NULL NULL NULL NULL NULL
101 raja NULL NULL NULL NULL NULL
Time taken: 0.123 seconds, Fetched: 2 row(s)
hive> alter table alter_test drop deviceid;
MismatchedTokenException(26!=187)
at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(HiveParser_IdentifiersParser.java:10571)
at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:44608)
at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(HiveParser.java:11198)
at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.java:7748)
at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:6960)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2409)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1586)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1062)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1158)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1047)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1037)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:28 mismatched input 'deviceid' expecting PARTITION near 'drop' in drop partition statement
hive> alter table alter_test drop column deviceid;
MismatchedTokenException(57!=187)
... (stack trace identical to the one above)
FAILED: ParseException line 1:28 mismatched input 'column' expecting PARTITION near 'drop' in drop partition statement
Is there any workaround for this issue?
There is currently no command in Hive for dropping a column.
A column can be dropped implicitly by defining a new set of columns:
alter table alter_test
replace columns
(id int,name string,mode string,channels int,action_name string,data_countt int)
;
However, in your case it raises an exception:
Unable to alter table.
The following columns have types incompatible
with the existing columns in their respective positions :
channels,data_countt,
Query: alter table alter_test replace columns
(id int,name string,mode string,channels int,action_name
string,data_countt int).
We can work around it by doing it in two phases:
1.
alter table alter_test
replace columns
(id int,name string)
;
2.
alter table alter_test
add columns
(mode string,channels int,action_name string,data_countt int)
;
P.S.
Just to make it clear: all the changes are done at the metadata level only.
The data itself is not being changed.
P.S. 2
And of course, you can drop the external table and recreate it...
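Since alter_test is EXTERNAL, dropping it removes only the metadata and leaves the files under /user/cloudera/conf_files untouched, so that route is safe. A sketch, recreating the table without deviceid:

hive> DROP TABLE alter_test;
hive> CREATE EXTERNAL TABLE alter_test (id int, name string, mode string, channels int, action_name string, data_countt int)
> row format delimited
> fields terminated by ','
> location '/user/cloudera/conf_files';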
It's better to create a new table a, then insert into table a from the old table; after this, drop the old table, and finally rename a to the old table name.
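A sketch of that approach (the intermediate name a is the answer's placeholder):

hive> CREATE TABLE a AS
> SELECT id, name, mode, channels, action_name, data_countt
> FROM alter_test;
hive> DROP TABLE alter_test;
hive> ALTER TABLE a RENAME TO alter_test;

Note that CREATE TABLE ... AS SELECT produces a managed table, so the result is no longer EXTERNAL and the data is physically rewritten under the warehouse directory.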

INSERT into a Hive table

I have created the following table in hive:
hive> CREATE TABLE IF NOT EXISTS Sensorreading ( recvtime String, nodeid int, sensorid int, systemid int, value float);
OK
Time taken: 3.007 seconds
hive> describe Sensorreading;
OK
recvtime string
nodeid int
sensorid int
systemid int
value float
Time taken: 0.381 seconds
hive>
And now I need to insert data into it. I have tried this, but it doesn't work:
INSERT INTO TABLE Sensorreading (recvtime, nodeid, sensorid, systemid, value) VALUES ('2015-05-29 11:10:00',1,1,1,-45.4);
What is the correct syntax for INSERT? Thanks.
INSERT...VALUES is available starting in Hive 0.14.
Check if your Hive version is 0.14 or later.
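A hedged sketch of the working statement on Hive 0.14+; note that the explicit column list is also rejected on some releases, so the safer form lists the values in table-declaration order:

hive> INSERT INTO TABLE Sensorreading
> VALUES ('2015-05-29 11:10:00', 1, 1, 1, -45.4);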
INSERT is possible in Hive 0.14. But if you need to insert something on an earlier version, there are two ways to do it (manual methods, not any particular command):
1. You can load it from a text file, after editing the file to include your new rows (a sketch follows below).
2. You can copy the part file to the local filesystem, make your changes there, and then put it back at the regular path.
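A sketch of workaround 1, assuming a local file /tmp/new_rows.txt whose fields are separated by Hive's default Ctrl-A (\001) delimiter, since Sensorreading was created without a ROW FORMAT clause:

hive> LOAD DATA LOCAL INPATH '/tmp/new_rows.txt' INTO TABLE Sensorreading; -- INTO without OVERWRITE appends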

How does the HIVE table format look for the given data input?

I have the data in the following format:
6856437950 11/16/2008 22:36:38 8204208990 1001004006044273
6715281120 11/16/2008 15:29:42 8132862237 1001004005059895
The Hive table I have created is the following:
CREATE TABLE t2 (session_id STRING, date_time STRING, customer_id STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
When I load the data into the table and display the contents, it shows the following:
6856437950 11/16/2008 22:36:38 8204208990 1001004006044273 NULL NULL
6715281120 11/16/2008 15:29:42 8132862237 1001004005059895 NULL NULL
It shows that all the elements in the row are assigned to the session_id column, while the rest, date_time and customer_id, are NULL.
I believe I made a mistake in the FIELDS TERMINATED BY clause, but I am not sure what value to assign to it.
hive (default)> CREATE TABLE t2 (session_id STRING, date_time STRING, customer_id STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
OK
Time taken: 9.343 seconds
hive (default)> desc t2;
OK
col_name data_type comment
session_id string
date_time string
customer_id string
Time taken: 0.319 seconds
hive (default)> LOAD DATA LOCAL INPATH '/tmp/input.txt' INTO table t2;
Copying data from file:/tmp/input.txt
Copying file: file:/tmp/input.txt
Loading data to table default.t2
OK
Time taken: 0.766 seconds
hive (default)> select * from t2;
OK
session_id date_time customer_id
6856437950 11/16/2008 22:36:38 8204208990 1001004006044273
6715281120 11/16/2008 15:29:42 8132862237 1001004005059895
Time taken: 0.494 seconds
hive (default)>
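The transcript above works because /tmp/input.txt is genuinely tab-separated. If a SELECT instead dumps the whole line into session_id followed by NULLs, the file is most likely space-separated; a hedged sketch for that case (column names are guesses, and note the timestamp then splits across two columns):

hive> CREATE TABLE t2_space (session_id STRING, date_str STRING, time_str STRING, caller_id STRING, customer_id STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ' '
> STORED AS TEXTFILE;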