How to delete multiple partitions in HIVE table

I would like to delete multiple partitions in a Hive table. I am able to delete a specific partition using an ALTER statement as follows:
ALTER TABLE table_name DROP IF EXISTS PARTITION (partition_col= val)
But it does not work with comparison operators such as > or <.
The error thrown is:
mismatched input '<' expecting {')', ','}(line 1, pos 81)
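If the engine's parser rejects comparison operators in the partition spec (as in the error above), one possible workaround is to filter the existing partition values in client code and emit a single ALTER TABLE statement with several comma-separated PARTITION clauses. The following is a minimal sketch in Python, not a definitive implementation. It assumes the partition values were already fetched (for example from SHOW PARTITIONS) and are trusted strings; build_drop_partitions and the sample values are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable


def build_drop_partitions(table: str, column: str, values: Iterable[str],
                          should_drop: Callable[[str], bool]) -> str:
    """Build one ALTER TABLE statement that drops every selected partition.

    Returns an empty string when no partition value is selected.
    Assumption: values are trusted and contain no quote characters.
    """
    selected = [v for v in values if should_drop(v)]
    if not selected:
        return ""
    # One PARTITION (...) clause per value, separated by commas.
    specs = ", ".join(f"PARTITION ({column}='{v}')" for v in selected)
    return f"ALTER TABLE {table} DROP IF EXISTS {specs}"


# Hypothetical partition values; the comparison replaces "partition_col < val".
existing = ["2021-12-30", "2021-12-31", "2022-01-01"]
stmt = build_drop_partitions("table_name", "partition_col", existing,
                             should_drop=lambda v: v < "2022-01-01")
print(stmt)
```

The comparison happens in Python, so the generated statement contains only equality specs, which is the form that already works in the question.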

Related

Removing specific rows from table without using DELETE

I work at a company and need to find a way to delete specific rows from a table without using the DELETE statement.
So I was thinking of using a partition and then removing it with DROP IF EXISTS PARTITION:
select *, count(validity_date) over(partition by another_column) as indicator from schema.table
Which worked, but when I try dropping the partition using
ALTER TABLE schema.table DROP IF EXISTS PARTITION(year(validity_date) = '2022');
I get an error saying
mismatched input '(' expecting set null in drop partition statement
So my question is: is there any other way to remove specific rows from a table without using DELETE?
Thank you!

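The parentheses in your query are balanced, so this is not a typo. The statement fails because a partition spec in DROP PARTITION takes a partition column compared with a literal value, not a function call such as year(validity_date). Note also that PARTITION BY inside a window function, as in your SELECT, does not partition the table itself.
If the table were partitioned by a column holding the year (validity_year is a hypothetical name here), the statement would look like this:
ALTER TABLE schema.table DROP IF EXISTS PARTITION (validity_year = '2022');
The usual options for deleting the data are DROP PARTITION or a DELETE query.
P.S. Hive supports ACID operations such as row-level DELETE and UPDATE only from version 0.14 onwards, and only on transactional tables.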

Syntax error while using multiple RENAME expressions in PostgreSQL

I'm trying to write a query that RENAMEs multiple table columns at once. According to the documentation, the syntax is:
ALTER TABLE table_name
RENAME old_col_a AS new_col_a
, RENAME old_col_b AS new_col_b...;
However, in doing so I get a syntax error located on the comma after the first RENAME clause:
ERROR: syntax error at or near ","
LINE 3: , RENAME
^
SQL state: 42601
Character: 1
The query works for multiple DROP/ALTER/ADD columns and for single RENAMEs. I just can't for the life of me figure out why this error is occurring.

You need to use multiple ALTER statements:
ALTER TABLE table_name
RENAME COLUMN old_col_a TO new_col_a;
ALTER TABLE table_name
RENAME COLUMN old_col_b TO new_col_b;
As the PostgreSQL ALTER TABLE documentation says:
All the forms of ALTER TABLE that act on a single table, except RENAME, SET SCHEMA, ATTACH PARTITION, and DETACH PARTITION can be combined into a list of multiple alterations to be applied together. For example, it is possible to add several columns and/or alter the type of several columns in a single command. This is particularly useful with large tables, since only one pass over the table need be made.
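Since each rename needs its own statement, the statements can be generated from a mapping of old to new column names. The sketch below is one way to do that in Python. It wraps the renames in BEGIN/COMMIT so they apply together (PostgreSQL DDL is transactional). build_rename_script is a hypothetical helper, and it assumes the identifiers are trusted and need no quoting.

```python
from typing import Mapping


def build_rename_script(table: str, renames: Mapping[str, str]) -> str:
    """Return a SQL script with one ALTER TABLE ... RENAME COLUMN per entry.

    Assumption: table and column names are trusted, unquoted identifiers.
    """
    lines = ["BEGIN;"]
    for old_name, new_name in renames.items():
        lines.append(
            f"ALTER TABLE {table} RENAME COLUMN {old_name} TO {new_name};"
        )
    lines.append("COMMIT;")
    return "\n".join(lines)


script = build_rename_script(
    "table_name",
    {"old_col_a": "new_col_a", "old_col_b": "new_col_b"},
)
print(script)
```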

SQLite drop table when row in another table is deleted

I've been wrestling with setting up a trigger and keep getting the error:
SQL logic error near "DROP": syntax error
I have several tables: main_table, other_one, other_two, etc.
main_table has several columns, with the primary key column named filehash.
The values in the primary key column of main_table are also the names of the other_* tables.
So, if I delete a row in main_table with a primary key of other_one, I want the trigger to DROP the table other_one too.
Here's the trigger statement that is producing the error:
CREATE TRIGGER remove_other_one AFTER DELETE ON 'main_table'
WHEN (OLD.filehash == 'other_one')
BEGIN
DROP TABLE IF EXISTS 'other_one' ;
END remove_other_one;
EDIT: the 'fuller' error I get when I run the trigger statement in SQLite DB Browser is:
near "DROP": syntax error: CREATE TRIGGER remove_other_one AFTER DELETE ON 'main_table' WHEN (OLD.filehash == 'other_one') BEGIN DROP

Based on the SQLite CREATE TRIGGER documentation, I believe this is not possible:
a trigger body may contain only INSERT, UPDATE, DELETE, and SELECT statements, so there is no option for DDL or dynamic SQL inside a trigger.
I guess you wanted to achieve something like what PostgreSQL allows (DBFiddle Demo 1 and Demo 2).
You could handle your case in application code instead. In any case, a table per date/customer/hash almost always indicates poor design and will cause more problems in the long run.
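Handling this in application code could look roughly like the sketch below, which uses Python's built-in sqlite3 module with an in-memory database. It first shows that SQLite rejects the trigger, then deletes the row and drops the matching table in one transaction. delete_and_drop is a hypothetical helper name, and the table layout is reduced to the single filehash column for illustration.

```python
import sqlite3


def delete_and_drop(conn: sqlite3.Connection, filehash: str) -> bool:
    """Delete the main_table row and drop the table named after it.

    Returns False if no such row exists, so arbitrary tables cannot be dropped.
    """
    rows = conn.execute(
        "SELECT 1 FROM main_table WHERE filehash = ?", (filehash,)
    ).fetchall()
    if not rows:
        return False
    # Identifiers cannot be bound as parameters, so quote the name manually.
    quoted = '"' + filehash.replace('"', '""') + '"'
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM main_table WHERE filehash = ?", (filehash,))
        conn.execute(f"DROP TABLE IF EXISTS {quoted}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_table (filehash TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE other_one (payload TEXT)")
conn.execute("INSERT INTO main_table VALUES ('other_one')")
conn.commit()

# The trigger from the question is rejected when it is created.
try:
    conn.execute(
        "CREATE TRIGGER remove_other_one AFTER DELETE ON main_table "
        "BEGIN DROP TABLE IF EXISTS other_one; END;"
    )
except sqlite3.OperationalError as exc:
    print("trigger rejected:", exc)

delete_and_drop(conn, "other_one")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)
```

Checking that the name exists in main_table before using it as an identifier keeps the dynamic DROP TABLE limited to tables the application itself registered.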

Dynamic partition cannot be the parent of a static partition

I'm trying to aggregate data from 1 table (whose data is re-calculated monthly) in another table (holding the same data but for all time) in Hive. However, whenever I try to combine the data, I get the following error:
FAILED: SemanticException [Error 10094]: Line 3:74 Dynamic partition cannot be the parent of a static partition 'category'
The code I'm using to create the tables is below:
create table my_data_by_category (views int, submissions int)
partitioned by (category string)
row format delimited
fields terminated by ','
escaped by '\\'
location '${hiveconf:OUTPUT}/${hiveconf:DATE_DIR}/my_data_by_category';
create table if not exists my_data_lifetime_total_by_category
like my_data_by_category
row format delimited
fields terminated by ','
escaped by '\\'
stored as textfile
location '${hiveconf:OUTPUT}/lifetime-totals/my_data_by_category';
The code I'm using to populate the tables is below:
insert overwrite table my_data_by_category partition(category)
select mdcc.col1, mdcc2.col2, pcc.category
from my_data_col1_counts_by_category mdcc
left outer join my_data_col2_counts_by_category mdcc2 where mdcc.category = mdcc2.category
group by mdcc.category, mdcc.col1, mdcc2.col2;
insert overwrite table my_data_lifetime_total_by_category partition(category)
select mdltc.col1 + mdc.col1 as col1, mdltc.col2 + mdc.col2, mdc.category
from my_data_lifetime_total_by_category mdltc
full outer join my_data_by_category mdc on mdltc.category = mdc.category
where mdltc.col1 is not null and mdltc.col2 is not null;
The frustrating part is that I have this data partitioned on another column, and repeating this same process with that partition works without a problem. I've tried Googling the "Dynamic partition cannot be the parent of a static partition" error message, but I can't find any guidance on what causes it or how it can be fixed. I'm pretty sure there's an issue with the way one or more of my tables is set up, but I can't see what. What's causing this error, and what can I do to resolve it?

There is no PARTITIONED BY clause in this script. The insert fails because you are trying to insert into a non-partitioned table while using a PARTITION clause in the INSERT statement.
create table if not exists my_data_lifetime_total_by_category
like my_data_by_category
row format delimited
fields terminated by ','
escaped by '\\'
stored as textfile
location '${hiveconf:OUTPUT}/lifetime-totals/my_data_by_category';

No. You don't need to add a partition clause.
You are doing a GROUP BY mdcc.category in insert overwrite table my_data_by_category partition(category)..., but you are not using any aggregate function (UDAF).
Are you sure you can do this?

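I think that if you change your second create statement to the following (with the column list copied from the first table, since it no longer uses LIKE):
create table if not exists my_data_lifetime_total_by_category (views int, submissions int)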
partitioned by (category string)
row format delimited
fields terminated by ','
escaped by '\\'
stored as textfile
location '${hiveconf:OUTPUT}/lifetime-totals/my_data_by_category';
you should then be free of errors.

SemanticException Partition spec {col=null} contains non-partition columns

I am trying to create dynamic partitions in Hive using the following code.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
create external table if not exists report_ipsummary_hourwise(
ip_address string,imp_date string,imp_hour bigint,geo_country string)
PARTITIONED BY (imp_date_P string,imp_hour_P string,geo_coutry_P string)
row format delimited
fields terminated by '\t'
stored as textfile
location 's3://abc';
insert overwrite table report_ipsummary_hourwise PARTITION (imp_date_P,imp_hour_P,geo_country_P)
SELECT ip_address,imp_date,imp_hour,geo_country,
imp_date as imp_date_P,
imp_hour as imp_hour_P,
geo_country as geo_country_P
FROM report_ipsummary_hourwise_Temp;
where the report_ipsummary_hourwise_Temp table contains the following columns:
ip_address, imp_date, imp_hour, geo_country.
I am getting this error:
SemanticException Partition spec {imp_hour_p=null, imp_date_p=null,
geo_country_p=null} contains non-partition columns.
Can anybody suggest why this error occurs?

Your INSERT statement uses the partition column geo_country_P, but the target table defines it as geo_coutry_P (the n in country is missing). The two names must match.
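A typo like this is easy to miss by eye. As a rough illustration, the partition names from the table definition and from the INSERT can be compared in a few lines of Python. find_partition_mismatches is a hypothetical helper, and lower-casing the names before comparing is an assumption made for this sketch.

```python
import difflib
from typing import Dict, List


def find_partition_mismatches(table_cols: List[str],
                              insert_cols: List[str]) -> Dict[str, List[str]]:
    """Map each INSERT partition name missing from the table to its closest match."""
    known = [name.lower() for name in table_cols]
    problems = {}
    for name in insert_cols:
        if name.lower() not in known:
            # Suggest the most similar defined partition column, if any.
            problems[name] = difflib.get_close_matches(name.lower(), known, n=1)
    return problems


table_cols = ["imp_date_P", "imp_hour_P", "geo_coutry_P"]    # from CREATE TABLE
insert_cols = ["imp_date_P", "imp_hour_P", "geo_country_P"]  # from INSERT
mismatches = find_partition_mismatches(table_cols, insert_cols)
print(mismatches)
```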

I was facing the same error. In my case it was caused by extra characters present in the file.
The best solution is to remove all the blank characters and reinsert if you want.

It could also be https://issues.apache.org/jira/browse/HIVE-14032
(INSERT OVERWRITE command failed with case sensitive partition key names).
There is a bug in Hive which makes partition column names case-sensitive.
For me the fix was that the column names in the table and in the PARTITIONED BY clause of the table definition both had to be lower-case. (They can both be upper-case too; because of HIVE-14032 the case just has to match.)

The error says that, while copying the result files to HDFS, the job could not recognize the partition location. I suspect that your table has the partitions (imp_date_P, imp_hour_P, geo_country_P), whereas the job is trying to write to imp_hour_p=null, imp_date_p=null, geo_country_p=null, which does not match. Try checking the HDFS location. I would also suggest not duplicating the same data as both a regular column and a partition column.

insert overwrite table report_ipsummary_hourwise PARTITION (imp_date_P,imp_hour_P,geo_country_P)
SELECT ip_address,imp_date,imp_hour,geo_country,
imp_date as imp_date_P,
imp_hour as imp_hour_P,
geo_country as geo_country_P
FROM report_ipsummary_hourwise_Temp;
The partition column names (imp_date_P, imp_hour_P, geo_country_P) must be the same as the names defined in the report_ipsummary_hourwise table.
