What is the use of PURGE in DROP statement of Hive? - hive

Hi, please describe the difference between DROP with and without PURGE in Hive, with an example.

DROP TABLE [IF EXISTS] table_name [PURGE];
If you don't use PURGE, the table's data goes to a Trash directory, from which it can be recovered after the drop. If you do use PURGE, the data does not go to the Trash directory, so it can't be recovered.
Regards !!
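To make the syntax above concrete, here is a minimal Python sketch that assembles the two variants of the statement. The helper name drop_table_sql and the table name sales_tmp are hypothetical, used only for illustration; the generated text follows the DROP TABLE [IF EXISTS] table_name [PURGE] form quoted above.

```python
def drop_table_sql(table_name: str, if_exists: bool = True, purge: bool = False) -> str:
    """Build a Hive DROP TABLE statement (hypothetical helper, not a Hive API)."""
    parts = ["DROP TABLE"]
    if if_exists:
        parts.append("IF EXISTS")
    parts.append(table_name)
    if purge:
        # PURGE asks Hive to skip the Trash directory entirely.
        parts.append("PURGE")
    return " ".join(parts) + ";"


# Recoverable drop: data is moved to Trash (when Trash is configured).
print(drop_table_sql("sales_tmp"))              # DROP TABLE IF EXISTS sales_tmp;
# Non-recoverable drop: data bypasses Trash.
print(drop_table_sql("sales_tmp", purge=True))  # DROP TABLE IF EXISTS sales_tmp PURGE;
```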

Related

Drop and "Create Table as Select" (CTaS) in Oracle

Could you explain why some DBAs advise against repeatedly running CTaS and then dropping the table? Does it impact the data dictionary in Oracle?
It impacts undo. Also, if the database is in archive log mode,
a lot of archive logs get created.
If the archive space fills up, the database can crash.
Also, in newer versions of Oracle, dropped tables sit in the Recycle Bin unless you do DROP TABLE ... PURGE or some such.

Hive: DROP TABLE IF EXISTS <Table Name> does not free memory

When I am using DROP TABLE IF EXISTS <Table Name> in hive, it is not freeing the memory. The files are created as 0000_n.bz2 and they are still on disk.
I have two questions here:
1) Will these files keep on growing for each and every insert?
2) Is there any DROP equivalent to remove the files as well on the disk?
Couple of things you can do:
Check if the table is an external table and in that case you need to drop the files manually on HDFS as dropping tables won't drop the files:
hadoop fs -rm /HDFS_location/filename
Secondly, check that you are in the right database. You need to issue a USE <database> command before dropping the tables. The database should be the same one in which the tables were created.
There are two types of tables in hive.
Hive managed table: If you drop a Hive managed table, the data in HDFS is automatically deleted.
External table: If you drop an external table, Hive doesn't delete the underlying data.
I believe yours is an external table.
Drop table if exists table_name purge;
With PURGE, the data files bypass the trash folder and are deleted immediately, so they cannot be recovered after the table is dropped.
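The answers in this thread can be summarised as a small decision table. The Python sketch below is a simplified model only, not Hive code: the function name data_after_drop and its parameters are hypothetical, and it assumes default table properties (an external table's files are never touched by DROP, and a managed table's files go to Trash only when Trash is configured and PURGE is not given).

```python
def data_after_drop(external: bool, purge: bool = False, trash_enabled: bool = True) -> str:
    """Return what happens to a table's data files when the table is dropped.

    Simplified model of the behaviour described in this thread:
      "kept"    - files stay where they are on HDFS
      "trash"   - files are moved to the Trash directory (recoverable)
      "deleted" - files are removed immediately (not recoverable)
    """
    if external:
        # Dropping an external table removes only the metadata.
        return "kept"
    if purge or not trash_enabled:
        # PURGE (or no Trash configured) means the files skip Trash.
        return "deleted"
    return "trash"


print(data_after_drop(external=True))               # kept
print(data_after_drop(external=False))              # trash
print(data_after_drop(external=False, purge=True))  # deleted
```

If the files are still on disk after a drop, the first row ("kept", external table) or the second row ("trash") are the two cases worth checking.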

Dropping all tables sqlplus

Whenever I perform select * from tab; I get tables that I did not create.
It looks like:
TNAME TABTYPE CLUSTERID
------------------------------ ------- ----------
BIN$GGrKjbVGTVaus4568IEhUQ==$0 TABLE
BIN$H+a0o3uyTTKTOA8WMkNltg==$0 TABLE
BIN$IUNyfOwkS0WSEVjbn04mNw==$0 TABLE
BIN$K/3NJw5zRXyRqPixL3tqDA==$0 TABLE
BIN$KQw9SejEToywXlHp18FMZA==$0 TABLE
BIN$MOEfgWgsS0GkC/CpYW+cxA==$0 TABLE
BIN$QkUYVciPQpWBwqBhxH+Few==$0 TABLE
BIN$QmtbaOYiTHCGEE0PRiLzmg==$0 TABLE
BIN$QxF4/JShTxu8PYIx8g/L7Q==$0 TABLE
BIN$UtEI7RbiQvOYzKqJEibwKQ==$0 TABLE
BIN$VMG0FXp2ROCKbedj3Ge9hg==$0 TABLE
I tried performing
select 'drop table '||table_name||' cascade constraints;' from user_tables;
with spool and executing the output, but those tables were not selected. It just looks really messy and is bothering me a lot. What are they? Is there any way I can get rid of them? Or do I have to just deal with it and work with it?
Q What are they?
A Looks like tables that were dropped and preserved in the RECYCLEBIN.
Q Is there any way I can get rid of them ?
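A You can use e.g. PURGE TABLE "BIN$GGrKjbVGTVaus4568IEhUQ==$0"; to remove them. (The system-generated name has to be enclosed in double quotes because of its special characters.)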
That will do them individually. Note that indexes and LOBs (and other out-of-line storage) may also have entries in the recycle bin. There are other statements you can use to clear out all entries from the recycle bin.
Q Or do I have to just deal with it and work with it?
A That's up to you.
There are statements to purge the recycle bin for the current user or, if you have the privileges, for the entire database. You can also disable the recycle bin, or use the PURGE option on DROP TABLE to drop a table without keeping it in the recycle bin (so these entries won't be created in the future).
There's no need for me to repeat the contents of the Oracle documentation.
Refer to: http://docs.oracle.com/cd/B28359_01/server.111/b28310/tables011.htm#ADMIN01511
Posting as a comment for better visibility..
Credits to @Tomás for pointing me in the right direction.
I had to do
purge recyclebin;
instead of purge dba_recyclebin, since I have insufficient privileges (I am a student logged in via ssh).
If you want to turn your recycle bin off, use ALTER SESSION SET recyclebin = OFF;
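If you would rather purge the entries one by one than clear the whole recycle bin, the statements can be generated from the names in the listing. This is a minimal Python sketch under stated assumptions: the helper name purge_statements is hypothetical, it only builds SQL text (it does not connect to Oracle), and it treats every name starting with BIN$ as a recycle bin table. The names are wrapped in double quotes because they contain characters such as = and +.

```python
def purge_statements(object_names):
    """Build one PURGE TABLE statement per recycle bin name (hypothetical helper)."""
    statements = []
    for name in object_names:
        # Recycle bin entries carry system-generated names starting with BIN$.
        if name.startswith("BIN$"):
            statements.append(f'PURGE TABLE "{name}";')
    return statements


names = [
    "BIN$GGrKjbVGTVaus4568IEhUQ==$0",
    "BIN$H+a0o3uyTTKTOA8WMkNltg==$0",
    "EMPLOYEES",  # hypothetical regular table, left alone
]
for stmt in purge_statements(names):
    print(stmt)
```

The printed statements can be spooled to a file and run in sqlplus, in the same way as the drop script attempted in the question.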

How to update partition metadata in Hive when partition data is manually deleted from HDFS

What is the way to automatically update the metadata of Hive partitioned tables?
If new partition data was added to HDFS (without running an ALTER TABLE ... ADD PARTITION command), we can sync up the metadata by executing the command 'msck repair'.
What should be done if a lot of partition data was deleted from HDFS (without running an ALTER TABLE ... DROP PARTITION command)?
What is the way to sync up the Hive metadata?
EDIT : Starting with Hive 3.0.0 MSCK can now discover new partitions or remove missing partitions (or both) using the following syntax :
MSCK [REPAIR] TABLE table_name [ADD/DROP/SYNC PARTITIONS]
This was implemented in HIVE-17824
As correctly stated by HakkiBuyukcengiz, MSCK REPAIR doesn't remove partitions if the corresponding folder on HDFS was manually deleted, it only adds partitions if new folders are created.
Extract from the official documentation:
In other words, it will add any partitions that exist on HDFS but not in metastore to the metastore.
This is what I usually do with external tables if multiple partition folders were manually deleted on HDFS and I want to quickly refresh the partitions:
Drop the table (DROP TABLE table_name)
(dropping an external table does not delete the underlying partition files)
Recreate the table (CREATE EXTERNAL TABLE table_name ...)
Repair it (MSCK REPAIR TABLE table_name)
Depending on the number of partitions this can take a long time. The other solution is to use ALTER TABLE DROP PARTITION (...) for each deleted partition folder but this can be tedious if multiple partitions were deleted.
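The tedious part of the ALTER TABLE DROP PARTITION route can be scripted. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes you have already collected two lists of partition specs in the dt=2020-01-01/country=US form, one from SHOW PARTITIONS and one from the directory names that still exist on HDFS. It quotes every value as a string and does no escaping, so values containing quotes would need extra care.

```python
def partition_clause(spec: str) -> str:
    """Turn 'dt=2020-01-01/country=US' into "dt='2020-01-01', country='US'"."""
    pairs = (kv.split("=", 1) for kv in spec.split("/"))
    return ", ".join(f"{key}='{value}'" for key, value in pairs)


def drop_stale_partitions(table: str, metastore_specs, hdfs_specs):
    """Build DROP PARTITION statements for partitions whose folder is gone."""
    # Partitions known to the metastore but no longer present on HDFS.
    stale = sorted(set(metastore_specs) - set(hdfs_specs))
    return [
        f"ALTER TABLE {table} DROP IF EXISTS PARTITION ({partition_clause(spec)});"
        for spec in stale
    ]


metastore = ["dt=2020-01-01", "dt=2020-01-02", "dt=2020-01-03"]
on_hdfs = ["dt=2020-01-01"]
for stmt in drop_stale_partitions("table_name", metastore, on_hdfs):
    print(stmt)
```

The statements are only printed, so they can be reviewed before being run in Hive.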
Try using
MSCK REPAIR TABLE <tablename>;
Ensure the table is set to external, drop all partitions then run the table repair:
alter table mytable_name set TBLPROPERTIES('EXTERNAL'='TRUE');
alter table mytable_name drop if exists partition (`mypart_name` <> 'null');
msck repair table mytable_name;
If msck repair throws an error, then run hive from the terminal as:
hive --hiveconf hive.msck.path.validation=ignore
or, inside an existing Hive session, run: set hive.msck.path.validation=ignore;

SQL Server Unable to Open Table or Rename

Hey, when I try to open the table I receive the message
Timeout Expired
Then when I try and rename the table I get
Rename Failed Lock Request Time out Expired
Basically I just want to delete the content of this table but in every step there is something stopping me.
Any Ideas ?
If you just want to empty it, use DELETE FROM TABLENAME or TRUNCATE TABLE TABLENAME in SSMS or SQLCMD.
Sounds like some running query/transaction is locking the table. Try using sp_who to see what activity is occurring:
USE master
EXEC sp_who 'active'
Or you can use SQL Server Profiler to see what queries are running against your DB.
SqlACID told you the best and easiest way to solve your problem.
TRUNCATE is faster and more efficient, and you will not get large archive log files, but afterwards you can't restore your data. It is deleted permanently and the flashback command will not help you.
If truncate does not work either, you must recreate your table with these commands:
drop table your_table cascade constraints;
-- Create table
create table your_table (col1 number, col2 varchar2(5));
If that does not work either, create another table like it:
create table temp_table2 as
select * from temp_table1 where 1=2;