ACID table error in Impala after Hive upgraded to Hive 3 - hive

I am very new to Hive and Impala.
I was trying to query an already existing table in Impala, but I got the following error.
AnalysisException: Table dev_test.customer not supported. Transactional (ACID) tables are only supported when they are configured as insert_only.
The version is Hive 3. I am clueless as to what to do. I looked at some documentation and articles online but still could not solve the issue. I have attached a screenshot of the error screen. Let me know if you need more information.
Any help is greatly appreciated. Thanks!

Unfortunately you can't see the data through Impala; you have to use Hive.
You can change the table properties to insert_only to make this data visible:
ALTER TABLE tmp2 SET TBLPROPERTIES (
  'transactional'='true',
  'transactional_properties'='insert_only'
);
When you set a table to full ACID, or Hive upgrades it to full ACID, the table is stored as ORC with ACID semantics, which Impala does not support, so you cannot access those tables from Impala; you need to use Hive for them.
If you choose the workaround and change the table properties, you will lose the full ACID benefits like UPDATE/DELETE etc.
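If it helps, you can first check the table's current properties from Hive to confirm it is full ACID; a minimal sketch using the table name from the question:
-- Run from Hive (or beeline); shows the ACID-related table properties.
SHOW TBLPROPERTIES dev_test.customer;
-- A full ACID table typically shows 'transactional'='true' without
-- 'transactional_properties'='insert_only'.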

Related

Getting a Databricks drop schema error for delta table

I have a Delta table schema that needs new columns/changed data types (usually I do this on non-Delta tables and it works fine).
I have already dropped the existing Delta table, then tried dropping the schema and got a 'v1 session catalog' error.
I am currently using SQL on a 10.4 LTS cluster, Spark 3.2.1, Scala 2.12 (I can't change these compute settings); driver and workers are Standard E_v4.
What I already did, and it worked as usual:
drop table if exists dbname.tablename;
What I wanted to do next:
drop schema if exists dbname.tablename;
The error I got instead:
Error in SQL statement: AnalysisException: Nested databases are not supported by v1 session catalog: dbname.tablename
When I try recreating the schema in the same location I get the error:
AnalysisException: The specified schema does not match the existing schema at dbfs:locationOfMy/table
... Differences
-Specified schema has additional fields newColNameIAdded, anotherNewColIAdded
-Specified type for myOldCol is different from existing schema ...
If your intention is to keep the existing schema, you can omit the schema from the create table command. Otherwise please ensure that the schema matches.
How can I drop the schema and re-register it at the same location, with the same name, using the new definitions?
Answering a month later since I didn't get replies and found the right solution.
Delta tables leave behind partition files and logs that cannot be cleaned up with the DROP commands; I had to manually delete them at my table's location.
Try this:
dbutils.fs.rm(path, True)
Use the path of your schema.
Then create your table again.
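To make that concrete: after removing the left-over files with dbutils.fs.rm as above, the table can be recreated at the same location. A minimal sketch in SQL, where the path is a placeholder and the column names come from the error message with hypothetical types:
CREATE TABLE dbname.tablename (
  myOldCol STRING,          -- placeholder for the changed data type
  newColNameIAdded INT,     -- hypothetical type
  anotherNewColIAdded INT   -- hypothetical type
)
USING DELTA
LOCATION 'dbfs:/path/to/table';  -- placeholder path, use your schema's location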

Data update in big data system

I'm using Spark Streaming and Hive. I want to insert or update data in an existing Hive table using Spark SQL, but I don't know how to check whether the data already exists. Please suggest an approach!
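For reference, if the Hive table is a full ACID (transactional) table, an upsert can be expressed with Hive's MERGE statement, run in Hive after the streaming batch has been written to a staging table; a minimal sketch where the table names, key, and columns are hypothetical:
MERGE INTO target_table AS t
USING staging_updates AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET value = s.value          -- row already exists: update it
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.value);  -- row is new: insert it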

Trying to copy data from Impala Parquet table to a non-parquet table

I am moving data around within Impala (not my design) and I have lost some data. I need to copy the data from the Parquet tables back to their original non-Parquet tables. Originally, the developers had done this with a simple one-liner in a script. Since I don't know anything about databases, and especially about Impala, I was hoping you could help me out. This is the one-liner that was used to convert to a Parquet table, which I need reversed.
impala-shell -i <ipaddr>
USE db;
INVALIDATE METADATA <text_table>;
CREATE TABLE <parquet_table> LIKE <text_table> STORED AS PARQUET;
INSERT OVERWRITE <parquet_table> SELECT * FROM <text_table>;
Thanks.
Have you tried simply doing
CREATE TABLE <text_table>
AS
SELECT *
FROM <parquet_table>
Per the Cloudera documentation, this should be possible.
NOTE: Ensure that your <text_table> does not already exist, or use a table name that does not already exist, so that you do not accidentally overwrite other data.
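If the original text table still exists with its schema, another option is simply to reverse the final step of the original one-liner (a sketch using the same placeholders):
-- Overwrite the text table's contents with the rows from the Parquet copy.
INSERT OVERWRITE <text_table> SELECT * FROM <parquet_table>;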

What is the default schema in Hive?

For Pig, the default schema is bytearray. Is there a default schema in Hive if we don't specify one? I tried to look at some Hive documentation but couldn't find any.
Hive is schema-on-read --- I am not sure this is the answer... If someone could give some insight on this, that would be great.
Hive does the best that it can to read the data. You will get lots of null values if there aren't enough fields in each record to match the schema. If some fields are numbers and Hive encounters nonnumeric strings, it will return nulls for those fields. Above all else, Hive tries to recover from all errors as best it can.
There is no default schema in Hive; in order to query data in Hive you have to first create a table describing the content of your data (by using CREATE EXTERNAL TABLE ... LOCATION).
So you basically have to tell Hive the schema before querying the data.
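For illustration, a minimal sketch of declaring such a schema over files that already sit in HDFS; the table name, columns, delimiter, and path are all hypothetical:
-- Declare a schema over existing tab-delimited files; Hive applies it at read time.
CREATE EXTERNAL TABLE web_logs (
  ip  STRING,
  ts  STRING,
  url STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/web_logs';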

Change a table's database in Hive or HCatalog

Is there a way to change a table's database in Hive or HCatalog?
For instance, I have the table foo in the database default, and I want to move this table to the database bar. I tried this, but it doesn't work:
ALTER TABLE foo RENAME TO bar.foo
Thanks in advance
AFAIK there is no way in HiveQL to do this. A ticket was raised long ago, but the issue is still open.
An alternative could be to use the EXPORT/IMPORT feature provided by Hive. With this feature we can export the data of a table to an HDFS location along with its metadata using the EXPORT command (the metadata is stored in JSON format). Data exported this way can then be imported into another database (even another Hive instance) using the IMPORT command.
More on this can be found in the Hive IMPORT/EXPORT manual.
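A minimal sketch of that approach for the tables in the question; the HDFS export path is hypothetical:
-- Export foo from the default database, then import it into bar.
USE default;
EXPORT TABLE foo TO '/tmp/foo_export';
USE bar;
IMPORT TABLE foo FROM '/tmp/foo_export';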
HTH
Thanks for your response. I found another way to change the database:
USE db1; CREATE TABLE db2.foo LIKE foo;