How to truncate a partitioned external table in Hive?

I'm planning to truncate a Hive external table which has one partition. I used the following command to truncate the table:
hive> truncate table abc;
But it throws an error: Cannot truncate non-managed table abc.
Can anyone suggest how to handle this?

Make your table MANAGED first:
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='FALSE');
Then truncate:
truncate table abc;
And finally you can make it external again:
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='TRUE');
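Since the question involves a partitioned table: while the table is temporarily managed, you can also truncate a single partition instead of the whole table. A minimal sketch, assuming a partition column named dt (the column name and value are illustrative):
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='FALSE');
-- truncate only one partition; the partition spec is hypothetical
TRUNCATE TABLE abc PARTITION (dt='2020-01-01');
ALTER TABLE abc SET TBLPROPERTIES('EXTERNAL'='TRUE');
Note that the property value is case-sensitive in some Hive versions; use TRUE/FALSE in uppercase (see the MANAGED/EXTERNAL question further down and HIVE-20057).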

By default, TRUNCATE TABLE is supported only on managed tables. Attempting to truncate an external table results in the following error:
Error: org.apache.spark.sql.AnalysisException: Operation not allowed: TRUNCATE TABLE on external tables
Action required: either change applications so they do not run TRUNCATE TABLE on external tables, or set the table property external.table.purge to true, which allows truncating an external table:
ALTER TABLE mytable SET TBLPROPERTIES ('external.table.purge'='true');
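With the property set, TRUNCATE works on the external table and also removes the underlying data files. A minimal sketch:
ALTER TABLE mytable SET TBLPROPERTIES ('external.table.purge'='true');
-- now allowed; with external.table.purge=true the data files are deleted as well
TRUNCATE TABLE mytable;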

There is an even simpler solution, which is basically a one-liner:
insert overwrite table table_xyz select * from table_xyz where 1=2;
Because the predicate 1=2 matches no rows, the overwrite deletes all existing files and leaves an empty file at the external location, with zero records.

See https://issues.apache.org/jira/browse/HIVE-4367: use
truncate table my_ext_table force;

Related

I can't run inserts in firebird 2.5, Table Unknown

I'm trying to create a temporary table to save some codes, but when I try to insert a code, it throws the following error, as if the table did not exist:
can't format message 13:796 -- message file C:\Windows\firebird.msg
not found. Dynamic SQL Error. SQL error code = -204. Table unknown.
TEMPCODES. At line 1, column 13.
These are the lines that I try to run:
create global temporary table TEMPCODES
(
codigo varchar(13)
)
on commit delete rows;
insert into TEMPCODES values('20-04422898-0');
Why can't it find the table if I create it first?
In Firebird, you cannot use a database object in the same transaction that created it. You need to commit before you can use the table.
In other words, you should use:
create global temporary table TEMPCODES
(
codigo varchar(13)
)
on commit delete rows;
commit;
insert into TEMPCODES values('20-04422898-0');
Also, it is important to realise that global temporary tables (GTT) are intended as permanent objects. The idea is to create a GTT once, and then use it whenever you need it. The content of a GTT is only visible to the current transaction (on commit delete rows) or to the current connection (on commit preserve rows). Creating a GTT on the fly is not the normal usage pattern for GTTs.
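For contrast, a connection-scoped GTT keeps its rows until the connection ends. A minimal sketch (the table name is illustrative):
create global temporary table SESSIONCODES
(
codigo varchar(13)
)
on commit preserve rows;
commit;
-- rows now survive across transactions within this connection
insert into SESSIONCODES values('20-04422898-0');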

How can I view the definition of a temporary table?

I make a temporary table in SQL Server:
create table #stun (name varchar(40),id int,gender varchar(40))
How can I view its definition afterwards?
You can check it this way:
-- create a temp table from any table in the database
SELECT *
INTO #TempTable
FROM table_name
-- sp_help, run in tempdb, prints the temp table's definition
EXEC tempdb..sp_help '#TempTable'
-- clean up
DROP TABLE #TempTable
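Alternatively, you can query tempdb's catalog views directly. A minimal sketch, assuming the #stun table from the question exists in the current session:
-- column names and types of a session temp table
SELECT c.name AS column_name, t.name AS type_name, c.max_length
FROM tempdb.sys.columns AS c
JOIN tempdb.sys.types AS t ON c.user_type_id = t.user_type_id
WHERE c.object_id = OBJECT_ID('tempdb..#stun');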
Solution 1:
You can query data against it within the current session:
SELECT * FROM #yourtemporarytable;
Sometimes, you may want to create a temporary table that is accessible across connections. In this case, you can use global temporary tables.
Unlike a temporary table, the name of a global temporary table starts with a double hash symbol (##).
CREATE TABLE ##global_temp (
...
);
SELECT * FROM ##global_temp;
Solution 2:
Using SSMS, you can find the table in the left pane, choose Design, and view the table structure.

Hive Table is MANAGED or EXTERNAL - issue post table type conversion

I have a hive table in XYZ db named ABC.
When I run describe formatted XYZ.ABC; from Hue, the output shows:
Table Type: MANAGED_TABLE
Table Parameters: EXTERNAL True
So is this actually an external or a managed/internal hive table?
This is treated as an EXTERNAL table. Dropping table will keep the underlying HDFS data. The table type is being shown as MANAGED_TABLE since the parameter EXTERNAL is set to True, instead of TRUE.
To fix this metadata, you can run this query:
hive> ALTER TABLE XYZ.ABC SET TBLPROPERTIES('EXTERNAL'='TRUE');
Some details:
The table XYZ.ABC must have been created via this kind of query:
hive> CREATE TABLE XYZ.ABC
<additional table definition details>
TBLPROPERTIES (
'EXTERNAL'='True');
Describing this table will give:
hive> desc formatted XYZ.ABC;
:
Location: hdfs://<location_of_data>
Table Type: MANAGED_TABLE
:
Table Parameters:
EXTERNAL True
Dropping this table will keep the data at the Location shown in the describe output.
hive> drop table XYZ.ABC;
# does not drop table data in HDFS
The Table Type still shows as MANAGED_TABLE, which is confusing.
Setting the value of EXTERNAL to TRUE fixes this:
hive> ALTER TABLE XYZ.ABC SET TBLPROPERTIES('EXTERNAL'='TRUE');
Now, doing a describe will show it as expected:
hive> desc formatted XYZ.ABC;
:
Location: hdfs://<location_of_data>
Table Type: EXTERNAL_TABLE
:
Table Parameters:
EXTERNAL TRUE
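To double-check just the properties, Hive's SHOW TBLPROPERTIES can be used as well:
hive> SHOW TBLPROPERTIES XYZ.ABC;
This should now list EXTERNAL with the value TRUE.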
Example:
Let's create a sample MANAGED table:
CREATE TABLE TEST_TBL(abc int, xyz string);
INSERT INTO TABLE test_tbl values(1, 'abc'),(2, 'xyz');
DESCRIBE FORMATTED test_tbl;
Changing the type to EXTERNAL (the wrong way, using True instead of TRUE):
ALTER TABLE test_tbl SET TBLPROPERTIES('EXTERNAL'='True');
The describe output now shows Table Type: MANAGED_TABLE with table parameter EXTERNAL True, as above.
Now let's DROP the table:
DROP TABLE test_tbl;
The result: the table is dropped, but the data on HDFS is not. That is correct external-table behavior.
If we re-create the table, we can see the data still exists:
CREATE TABLE test_tbl(abc int, xyz string);
SELECT * FROM test_tbl;
Result: the two rows inserted earlier, (1, 'abc') and (2, 'xyz'), are returned, confirming the data was kept.
The describe still wrongly shows MANAGED_TABLE along with EXTERNAL True because the metastore uses a case-sensitive .equals check on the property value.
Hive issue JIRA: HIVE-20057
Proposed fix: use a case-insensitive equals.
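The same fix as above applies to the demo table:
ALTER TABLE test_tbl SET TBLPROPERTIES('EXTERNAL'='TRUE');
After this, DESCRIBE FORMATTED reports Table Type: EXTERNAL_TABLE as expected.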

User defined type dropped while in use

I have an Oracle database with a table as follows:
create table Table1
(Column1 number(5,0),
Column2 special_type);
Now, due to some data errors, the support team decided the fix would be to drop and recreate the Type.
I now have a table as follows:
Table1
Column1 number(5,0),
Column2 null
The problem is, I cannot drop the table, I get a "Table has errors" message. I cannot alter the table, I get a "Table has errors" message. I have tried to manipulate the DDL in Oracle SQL Developer, guess what I get? A "Table has errors" message.
Can anyone point me in the right direction?
It seems the support team dropped the type forcefully. In the normal scenario, you cannot drop a type that has any dependent tables.
See demo:
SQL> CREATE OR REPLACE TYPE NUU AS TABLE OF NUMBER;
/
Type created.
SQL> CREATE TABLE TBLLL(NUM NUU)
NESTED TABLE NUM STORE AS VVVVV ;
/
Table created.
Now when I try a simple drop type, I get the error below:
SQL> drop type nuu;
drop type nuu
*
ERROR at line 1:
ORA-02303: cannot drop or replace a type with type or table dependents
So I drop it forcefully:
SQL> drop type nuu force;
Type dropped.
And when I try to run a select, I get the error:
SQL> select * from tblll;
select * from tblll
*
ERROR at line 1:
ORA-04063: table "USER.TBLLL" has errors
So in order to alter the table, you first need to create the type again. Once the type exists again, the table definition becomes valid and you can alter the table.
The other solution is to drop the table and recreate it, which you already mentioned is not working.
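A minimal sketch of that recovery path, reusing the demo objects above (the recreated type body must match the original definition):
SQL> CREATE OR REPLACE TYPE NUU AS TABLE OF NUMBER;
/
-- once the type exists again, the table is usable and can be altered or dropped
SQL> select * from tblll;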
You can try the following (a sketch follows below):
re-create the type
create another table with the same table structure
insert the data into the second table
use the second table from now on.
You may not want to keep relying on the recreated type, since it caused issues earlier; finding an alternative for the new table might solve your issue.
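A sketch of that approach, reusing the demo names (the new table and storage names are illustrative):
SQL> CREATE OR REPLACE TYPE NUU AS TABLE OF NUMBER;
/
-- second table with the same structure
SQL> CREATE TABLE TBLLL2 (NUM NUU)
NESTED TABLE NUM STORE AS VVVVV2;
-- copy the data over, then switch to the new table
SQL> INSERT INTO TBLLL2 SELECT * FROM TBLLL;
SQL> DROP TABLE TBLLL;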

Can AWS Redshift drop a table that is wrapped in transaction?

During the ETL we do the following operations:
begin transaction;
drop table if exists target_tmp;
create table target_tmp like target;
insert into target_tmp select * from source_a inner join source_b on ...;
analyze target_tmp;
drop table target;
alter table target_tmp rename to target;
commit;
The SQL command is performed by AWS Data Pipeline, if this is important.
However, the pipelines sometimes fail with the following error:
ERROR: table 111566 dropped by concurrent transaction
Redshift supports serializable isolation. Does one of the commands break isolation?
That works, but if generating the temp table takes a while, you can expect other queries to see that error while it runs. You could try generating the temp table in a separate transaction (a transaction may not be needed unless you worry about updates to the source tables), then do a quick rotation of the table names so there is much less time for contention:
-- generate target_tmp first then
begin;
alter table target rename to target_old;
alter table target_tmp rename to target;
commit;
drop table target_old;
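Combined with the generation step from the question, the whole flow might look like this (a sketch; the join condition is elided as in the question):
-- build the replacement table outside the renaming transaction
drop table if exists target_tmp;
create table target_tmp like target;
insert into target_tmp select * from source_a inner join source_b on ...;
analyze target_tmp;
-- a short transaction that only renames keeps the contention window small
begin;
alter table target rename to target_old;
alter table target_tmp rename to target;
commit;
drop table target_old;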