Turn a non-Kudu table into a Kudu table in Impala

I'm having a problem with an Impala UPDATE statement. When I run the code below
update john_estares_db.tempdbhue set QU=concat(account_id,"Q",quarter(mrs_change_date)," ",year(mrs_change_date));
it returns this error message:
AnalysisException: Impala does not support modifying a non-Kudu table: john_estares_db.tempdbhue
I would like to know whether I can change my non-Kudu table into a Kudu table, or whether there is an alternative to the UPDATE statement for non-Kudu tables in Impala. TIA

Apache Kudu is a data store (think of it as an alternative to HDFS/S3, but one that stores only structured data) which allows updates based on a primary key. It has good integration with Impala: a Kudu table in Impala is a way to query data stored in Kudu.
In short, if you do not already have Kudu installed and set up, you cannot create a Kudu table in Impala.
If you do have Kudu installed and set up, you still cannot simply convert an existing table into a Kudu table. You have to create a new Kudu table with a similar structure and some primary-key columns (Kudu requires a primary key for all tables), then insert the data from the old non-Kudu table with insert into .. select * from ....
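For illustration, a minimal sketch of that conversion, keeping the column names from the question; the new table's name, the column types, and the partitioning scheme are assumptions:
-- Kudu requires a primary key, and primary-key columns must be listed first.
CREATE TABLE john_estares_db.tempdbhue_kudu (
  account_id STRING,
  mrs_change_date TIMESTAMP,
  qu STRING,
  PRIMARY KEY (account_id, mrs_change_date)
)
PARTITION BY HASH (account_id) PARTITIONS 4
STORED AS KUDU;

-- Copy the data across from the old non-Kudu table.
INSERT INTO john_estares_db.tempdbhue_kudu
SELECT account_id, mrs_change_date, NULL
FROM john_estares_db.tempdbhue;

-- The UPDATE now works. Note that Impala's concat() expects STRING arguments,
-- so the INT results of quarter() and year() are cast explicitly.
UPDATE john_estares_db.tempdbhue_kudu
SET qu = concat(account_id, 'Q',
                cast(quarter(mrs_change_date) AS STRING), ' ',
                cast(year(mrs_change_date) AS STRING));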

What is the type of the table john_estares_db.tempdbhue?
For Hive and other non-Kudu table types, UPDATE and UPSERT are not supported.
You can use show create table to check your table type:
show create table john_estares_db.tempdbhue;
If you have Kudu installed, you can create a Kudu table and move your data into it; then you can use your UPDATE code.

Related

Old records appear in the Hadoop table after dropping and creating a new table with the same old name

I have a question regarding creating tables in Hadoop.
I create an external table the following way:
CREATE EXTERNAL HADOOP TABLE SCHEMA.TABLENAME (
  ID BIGINT NOT NULL,
  CODE INTEGER,
  "VALUE" DOUBLE
)
STORED AS ORC
TBLPROPERTIES ('bigsql.table.io.doAs'='false',
               'bucketing_version'='2',
               'orc.compress'='ZLIB',
               'orc.create.index'='true')
After I created this table, I ran a Jenkins job (with a sqoop process) which loaded 70,000,000 records into this table.
Then I needed to remove this table, so I ran:
DROP TABLE SCHEMA.TABLENAME
Later on I want to create a table with the same name as the previous one, but I need it to be empty. I run the same query as before:
CREATE EXTERNAL HADOOP TABLE SCHEMA.TABLENAME (
  ID BIGINT NOT NULL,
  CODE INTEGER,
  "VALUE" DOUBLE
)
STORED AS ORC
TBLPROPERTIES ('bigsql.table.io.doAs'='false',
               'bucketing_version'='2',
               'orc.compress'='ZLIB',
               'orc.create.index'='true')
But when I create the table this way, it has 70,000,000 records in it again, although I didn't run any job to populate it.
This is why I have two questions:
When I drop and create a table with the old name, does it recover the records from the old table?
How can I drop (or truncate) a table in Big SQL/Hive so that I get an empty table with the old name?
I am using Big SQL and Hive.
Dropping an external table doesn't remove the stored data, only the metadata from the Hive Metastore.
Refer to Managed vs External Tables.
Key points:
Use external tables when the files are already present or in remote locations.
The files should remain even if the table is dropped.
Create a managed table (remove EXTERNAL from your query) if you want to be able to DROP and/or TRUNCATE.
Or have your Jenkins job run hadoop fs -rm -r -skipTrash on the table's HDFS location before the import.
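For the managed-table route, a minimal sketch in the question's own Big SQL syntax (the table definition is carried over from the question; whether TRUNCATE is available depends on your Big SQL version):
-- Without EXTERNAL, DROP TABLE removes the data files along with the metadata.
CREATE HADOOP TABLE SCHEMA.TABLENAME (
  ID BIGINT NOT NULL,
  CODE INTEGER,
  "VALUE" DOUBLE
)
STORED AS ORC
TBLPROPERTIES ('bigsql.table.io.doAs'='false',
               'bucketing_version'='2',
               'orc.compress'='ZLIB',
               'orc.create.index'='true');

-- Empty the table while keeping its definition:
TRUNCATE TABLE SCHEMA.TABLENAME;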

Setting transactional-table properties results in external table

I am creating a managed table via Impala as follows:
CREATE TABLE IF NOT EXISTS table_name
STORED AS parquet
TBLPROPERTIES ('transactional'='false', 'insert_only'='false')
AS ...
This should result in a managed table that does not support Hive ACID.
However, when I run the command I still end up with an external table.
Why is this?
I found out in the Cloudera documentation that omitting the EXTERNAL keyword when creating the table does not mean that the table will definitely be managed:
When you use EXTERNAL keyword in the CREATE TABLE statement, HMS stores the table as an external table. When you omit the EXTERNAL keyword and create a managed table, or ingest a managed table, HMS might translate the table into an external table or the table creation can fail, depending on the table properties.
Thus, setting transactional=false and insert_only=false leads to an external table in the Hive Metastore's interpretation.
Interestingly, setting only TBLPROPERTIES ('transactional'='false') is completely ignored and will still result in a managed table with transactional=true.
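A quick way to check what HMS actually created (the exact output fields vary by version):
-- Look for "Table Type" (MANAGED_TABLE vs. EXTERNAL_TABLE) and the
-- transactional entries under "Table Parameters".
DESCRIBE FORMATTED table_name;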

Oracle: create table in a new database with primary key and indexes

I have a table called TableA in DatabaseA, and I want to create the same TableA in DatabaseB. I am able to copy the structure and the data, but I seem unable to also create the primary keys and indexes. Is there an SQL statement I can use that copies the table structure, the table data, the primary keys, and the indexes?
I am using Oracle 11g.
1. First Method
To get the table and indexes without data, see the following post:
Stack Post
After creating the table, you can load the data using
insert into dest_table select * from source_table;
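Putting the first method together, a sketch with hypothetical table, constraint, and index names, assuming a database link db_link_a pointing at DatabaseA:
-- Copy structure and data in one step over the database link...
CREATE TABLE tablea AS SELECT * FROM tablea@db_link_a;

-- ...then re-create the primary key and indexes, which CTAS does not copy.
ALTER TABLE tablea ADD CONSTRAINT tablea_pk PRIMARY KEY (id);
CREATE INDEX tablea_ix1 ON tablea (code);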
2. Second Method
Use expdp to take a backup of the source table with the tables=yourtable parameter; by default this includes the indexes, and when you import with impdp into the destination database, it will automatically rebuild those indexes.

List Hive table properties to create another table

I have a table on Hive already created. Is there a way to copy the table schema to a terminal so I can pass it to a CREATE TABLE statement on another Hive server?
Have you tried the SHOW CREATE TABLE <tablename> command? It should give you the CREATE DDL you are looking for.
This link provides some background on when this was implemented.
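For example (database and table names are placeholders), run this on the source server and paste the printed DDL into a session on the target server:
SHOW CREATE TABLE my_db.my_table;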

Hive external table add new column

I see options that allow adding new columns in Hive [source].
However, I have an EXTERNAL table which is mapped to HBase with SERDEPROPERTIES, TBLPROPERTIES, and STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'.
Is it possible to add/alter columns after the external table with HBase is set up?
Do I just update SERDEPROPERTIES for the new columns, or do I need to re-do the whole table?
When you try to use ALTER TABLE xx ADD COLUMNS (xx string); you get the following error:
SemanticException [Error 10134]: ALTER TABLE cannot be used for a non-native table hbase_cdr2
So looking at this, it seems there is no way to update the existing table by adding new columns. But you can drop the Hive table and create a new table with the required columns. Since it is an external table, doing that only updates the metadata.
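A sketch of that re-creation with one added column; the table name comes from the error message above, while the column names, HBase mapping, and TBLPROPERTIES are hypothetical:
-- Dropping the external table removes only the Hive metadata; HBase data stays.
DROP TABLE hbase_cdr2;

-- Re-create the table with the extra column mapped to a new HBase qualifier.
CREATE EXTERNAL TABLE hbase_cdr2 (
  rowkey  STRING,
  old_col STRING,
  new_col STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping' = ':key,cf:old_col,cf:new_col'
)
TBLPROPERTIES ('hbase.table.name' = 'cdr2');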