SemanticException adding partition to Hive table - hive

Attempting to create a partition on a Hive table with the following:
> alter table stock_ticker add if not exists
> partition(stock_symbol='ASP')
> location 'data/stock_ticker_sample/stock_symbol=ASP/'
This produces the following output:
FAILED : SemanticException table is not partitioned but partition spec exists: {stock_symbol=ASP}
There are no partitions on this table prior to this attempt. Running
> show partitions stock_ticker;
results in
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
Table stock_ticker_sample is not a partitioned table
There is no question that the stock_symbol column exists and is of type string.
The question is: what steps need to be taken in order to add this partition?

The solution is to add the partitioning information to the definition of the stock_ticker table:
CREATE EXTERNAL TABLE stock_ticker (
...
)
PARTITIONED BY (stock_symbol STRING);
Then you can easily add external data to your table with:
> alter table stock_ticker add if not exists
> partition(stock_symbol='ASP')
> location 'data/stock_ticker_sample/stock_symbol=ASP/'
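Note that stock_symbol has to be removed from the regular column list when you do this: in Hive a partition column is a virtual column and cannot also be declared as a data column. A fuller sketch, with hypothetical data columns (price, trade_time) standing in for the real ones that are not shown above:
CREATE EXTERNAL TABLE stock_ticker (
    price      DOUBLE,
    trade_time STRING
)
PARTITIONED BY (stock_symbol STRING);

ALTER TABLE stock_ticker ADD IF NOT EXISTS
    PARTITION (stock_symbol = 'ASP')
    LOCATION 'data/stock_ticker_sample/stock_symbol=ASP/';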
Good luck!

Related

Copied data column is not partitioned in target table in Hive

I have created a table in Hive from an existing partitioned table using the command:
create table new_table As select * from old_table;
Record counts match in both tables, but when I run DESC on the new table, I can see that it is not partitioned on that column.
You should explicitly specify partition columns when creating the table.
create table new_table partitioned by (col1 datatype,col2 datatype,...) as
select * from old_table;
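Depending on your Hive version, PARTITIONED BY may not be accepted in a CTAS statement. If yours rejects it, a two-step sketch with dynamic partitioning does the same job; col1, col2 and part_col below are hypothetical placeholders for old_table's real schema:
create table new_table (col1 string, col2 int)
partitioned by (part_col string);

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- the partition column must come last in the select list
insert overwrite table new_table partition (part_col)
select col1, col2, part_col from old_table;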

Table partitioning with procedure input parameter

I'm trying to partition my table on an ID that I get from a procedure parameter.
For example, my table DDL:
CREATE TABLE bigtable
( ID number )
As the procedure's input parameter I get, e.g., the number 130, so I'm trying to create a partition:
Alter table bigtable
add partition part_random_number values(random number);
Of course, by random number I mean e.g. 120, 56, etc. :)
But I got an error that the object is not partitioned. So I tried to first define the partition clause in the CREATE TABLE statement:
CREATE TABLE bigtable
( ID number )
PARTITION BY list (ID)
But it doesn't work. It only works when I define some partition, e.g.
CREATE TABLE bigtable
( ID number )
PARTITION BY list (ID)
( partition type values (130)
);
But I would like to avoid it... Is there any other solution?
As a result, I would like to have the table partitioned by the procedure's input parameters.
A partitioned table has to have at least one partition. Just create it with a dummy partition and add the ones you actually need using your procedure.
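A minimal sketch of that idea, assuming Oracle (the syntax in the question looks like Oracle list partitioning) and a hypothetical procedure name add_id_partition; partition DDL cannot be parameterised with bind variables, so the procedure builds the statement dynamically:
CREATE TABLE bigtable
( ID number )
PARTITION BY LIST (ID)
( PARTITION p_dummy VALUES (0)   -- placeholder partition, never loaded
);

CREATE OR REPLACE PROCEDURE add_id_partition (p_id IN NUMBER) IS
BEGIN
  EXECUTE IMMEDIATE 'ALTER TABLE bigtable ADD PARTITION p_' || p_id
                 || ' VALUES (' || p_id || ')';
END;
/
Calling add_id_partition(130) then creates a partition p_130 for the value 130.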

Hive error while creating partitioned view

I have a 'log' table which is currently partitioned by year, month and day. I'm looking to create a partitioned view on top of the 'log' table, but I am running into this error:
hive> CREATE VIEW log_view PARTITIONED ON (pagename,year,month,day) AS SELECT pagename, year,month,day,uid,properties FROM log;
FAILED: SemanticException [Error 10093]: Rightmost columns in view output do not match PARTITIONED ON clause
What's the right way to create a partitioned view?
Try this:
CREATE VIEW log_view PARTITIONED ON (pagename,year,month,day) AS SELECT uid,properties,pagename, year,month,day FROM log;
The reason is that the partition columns must come last in the view's SELECT statement.
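Note that a partitioned view does not pick up partitions on its own; after creating it you register them with ALTER VIEW, for example (the values here are hypothetical):
ALTER VIEW log_view ADD IF NOT EXISTS
PARTITION (pagename='home', year=2014, month=9, day=9);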

External table does not return the data in its folder

I have created an external table in Hive at this location:
CREATE EXTERNAL TABLE tb
(
...
)
PARTITIONED BY (datehour INT)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/cloudera/data';
The data is present in the folder, but when I query the table, it returns nothing. The table is structured so that it matches the structure of the data.
SELECT * FROM tb LIMIT 3;
Is there some kind of permission issue with Hive tables: do specific users need permission to query some tables?
Do you know of any solutions or workarounds?
You have created your table as a partitioned table based on the column datehour, but you are putting your data directly in /user/cloudera/data. Hive will look for the data in /user/cloudera/data/datehour=(some int value). Since it is an external table, Hive will not update the metastore automatically; you need to run an ALTER statement to register the partition.
So here are the steps for external tables with partitions:
1.) In your external location /user/cloudera/data, create a directory datehour=0909201401
OR
Load the data using: LOAD DATA [LOCAL] INPATH '/path/to/data/file' INTO TABLE tb PARTITION (datehour=0909201401);
2.) After creating your table, run an ALTER statement:
ALTER TABLE tb ADD PARTITION (datehour=0909201401);
Hope it helps...!!!
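Putting both steps together for the table tb from the question (the file name localfile.json and the datehour value are just examples):
$ hdfs dfs -mkdir -p /user/cloudera/data/datehour=0909201401
$ hdfs dfs -put localfile.json /user/cloudera/data/datehour=0909201401/

hive> ALTER TABLE tb ADD IF NOT EXISTS PARTITION (datehour=0909201401);
hive> SELECT * FROM tb LIMIT 3;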
When we create an EXTERNAL TABLE with partitions, we have to ALTER the EXTERNAL TABLE with the data location for each given partition. However, it need not be the same path as we specify while creating the EXTERNAL TABLE.
hive> ALTER TABLE tb ADD PARTITION (datehour=0909201401)
    > LOCATION '/user/cloudera/data/somedatafor_datehour';
When we specify LOCATION '/user/cloudera/data' (though it's optional) while creating an EXTERNAL TABLE, we can take advantage of repair operations on that table. So when we copy files into that directory through some process like ETL, we can sync the partitions with the EXTERNAL TABLE instead of writing an ALTER TABLE statement for each new partition.
If we already know the directory structure of the partitions that Hive would create, we can simply place the data file in that location, e.g. '/user/cloudera/data/datehour=0909201401/data.txt', and run the statement shown below:
hive> MSCK REPAIR TABLE tb;
The above statement will sync the partitions into the Hive metastore for the table "tb".
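You can then check that the partition was registered:
hive> SHOW PARTITIONS tb;
The output should include datehour=0909201401.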

How to Update/Drop a Hive Partition?

After adding a partition to an external table in Hive, how can I update/drop it?
You can update a Hive partition by, for example:
ALTER TABLE logs PARTITION(year = 2012, month = 12, day = 18)
SET LOCATION 'hdfs://user/darcy/logs/2012/12/18';
This command does not move the old data, nor does it delete the old data. It simply sets the partition to the new location.
To drop a partition, you can do
ALTER TABLE logs DROP IF EXISTS PARTITION(year = 2012, month = 12, day = 18);
In addition, you can drop multiple partitions in one statement (see Dropping multiple partitions in Impala/Hive).
Extract from the above link:
hive> alter table t drop if exists partition (p=1),partition (p=2),partition(p=3);
Dropped the partition p=1
Dropped the partition p=2
Dropped the partition p=3
OK
EDIT 1:
Also, you can drop partitions in bulk using a comparison operator (>, <, <>), for example:
Alter table t
drop partition (PART_COL>1);
Alter table table_name drop partition (partition_column = 'value');
You can either copy files into the folder where the external partition is located, or use an
INSERT OVERWRITE TABLE tablename1 PARTITION (partcol1=val1, partcol2=val2...)...
statement.
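For example, a sketch that loads one partition of the logs table from the answer above; staging_logs and its columns are hypothetical:
-- staging_logs stands in for whatever unpartitioned source table you have
INSERT OVERWRITE TABLE logs PARTITION (year = 2012, month = 12, day = 18)
SELECT host, request, status
FROM staging_logs
WHERE log_date = '2012-12-18';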
You may also need to make the database containing the table active:
use [dbname]
Otherwise you may get an error (even if you specify the database, i.e. dbname.table):
FAILED Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter partition. Unable to alter partitions because table or database does not exist.