Getting exception while updating table in Hive

I have created one table in hive from existing s3 file as follows:
create table reconTable (
entryid string,
run_date string
)
LOCATION 's3://abhishek_data/dump1';
Now I would like to update one entry as follows:
update reconTable set entryid='7.24E-13' where entryid='7.24E-14';
But I am getting following error:
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
I have gone through a few posts here but haven't been able to figure out how to fix this.

I think you should create an external table when reading data from a source like S3. Also, for UPDATE/DELETE to work, the table has to be stored in ORC format with the table property 'transactional'='true' set.
Please refer to this for more info: attempt-to-do-update-or-delete-using-transaction-manager
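For illustration, here is a minimal sketch of an ACID-enabled copy of the table. This assumes a Hive version below 3.0 (where transactional tables must be managed, bucketed ORC tables) and that ACID support is enabled on the cluster; the recontable_acid name and the bucket count are made up:
-- enable the transaction manager for this session (cluster-side compactor settings may also be needed)
SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
CREATE TABLE recontable_acid (
entryid string,
run_date string
)
CLUSTERED BY (entryid) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
-- copy the rows over from the S3-backed table, then the update works:
INSERT INTO TABLE recontable_acid SELECT * FROM reconTable;
UPDATE recontable_acid SET entryid='7.24E-13' WHERE entryid='7.24E-14';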

You can refer to this Cloudera Community Thread:
https://community.cloudera.com/t5/Support-Questions/Hive-update-delete-and-insert-ERROR-in-cdh-5-4-2/td-p/29485


Getting a Databricks drop schema error for delta table

I have a Delta table whose schema needs new columns/changed data types (usually I do this on non-Delta tables and those work fine).
I have already dropped the existing Delta table and tried dropping the schema, but I am getting a 'v1 session catalog' error.
I am currently using SQL on a 10.4 LTS cluster, Spark 3.2.1, Scala 2.12 (I can't change this compute); driver and workers are Standard E_v4.
What I already did, and it worked as usual:
drop table if exists dbname.tablename;
What I wanted to do next:
drop schema if exists dbname.tablename;
The error I got instead:
Error in SQL statement: AnalysisException: Nested databases are not supported by v1 session catalog: dbname.tablename
When I try recreating the schema in the same location I get the error:
AnalysisException: The specified schema does not match the existing schema at dbfs:locationOfMy/table
... Differences
-Specified schema has additional fields newColNameIAdded, anotherNewColIAdded
-Specified type for myOldCol is different from existing schema ...
If your intention is to keep the existing schema, you can omit the
schema from the create table command. Otherwise please ensure that
the schema matches.
How can I drop the schema and re-register it at the same location, with the same name, but with the new definitions?
Answering a month later since I didn't get replies and have since found the right solution:
Delta files leave behind partitions and logs that are not cleaned up by the drop commands. I had to manually delete those files at the table's location.
Try this:
dbutils.fs.rm(path, True)
Use the path of your schema.
Then create your table again.
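For completeness, a minimal SQL sketch of the full reset, assuming the files at the old location have already been removed with dbutils.fs.rm as above; the column names and types below are hypothetical placeholders for the new definition:
DROP TABLE IF EXISTS dbname.tablename;
CREATE TABLE dbname.tablename (
myOldCol STRING, -- hypothetical changed type
newColNameIAdded INT, -- hypothetical new column
anotherNewColIAdded TIMESTAMP -- hypothetical new column
)
USING DELTA
LOCATION '<the same dbfs path that was removed above>';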

Truncating tables in Azure Data Factory Pre-Copy script?

I am building a pipeline, and I need to copy data into my destination tables in Azure SQL DB, but before each copy I need to truncate the destination tables. I can't figure out the script:
Click to view the ADF screenshot for SINK settings
Instead, I put this code, but that is wrong because it runs before every copy of the tables (5 times) and truncates all the tables except the last one, so I need to make it parameterized, I guess:
truncate table [dbo].[Global_data.csv]
truncate table [dbo].[Option_data.csv]
truncate table [dbo].[State_data.csv]
truncate table [dbo].[Status_data.csv]
truncate table [dbo].[Target_data.csv]
Also please see my source parameters:
ADLSv2 container: #pipeline().parameters.SourceContainer
ADLSv2 Directory: #pipeline().parameters.SourceDirectory
ADLSv2 filename: #item().name
Sink TableName: #item().name
So I'm guessing that my pre-copy script must be something like:
truncate table #item().name
but this resulted in an error for me:
Error Screenshot
ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Incorrect syntax near '#item'.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Incorrect syntax near '#item'.,Source=.Net SqlClient Data Provider,SqlErrorNumber=102,Class=15,ErrorCode=-2146232060,State=1,Errors=[{Class=15,Number=102,State=1,Message=Incorrect syntax near '#item'.,},],'
When I use TRUNCATE TABLE [#{item()}], I get the below error 5 times (one for each table):
ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Cannot find the object "{"name":"StateMetadata.csv","type":"File"}" because it does not exist or you do not have permissions.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Cannot find the object "{"name":"StateMetadata.csv","type":"File"}" because it does not exist or you do not have permissions.,Source=.Net SqlClient Data Provider,SqlErrorNumber=4701,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=4701,State=1,Message=Cannot find the object "{"name":"StateMetadata.csv","type":"File"}" because it does not exist or you do not have permissions.,},],'
Please use TRUNCATE TABLE [#{item().name}] as the pre-copy script. item() is the whole file object, so you have to reference its name property rather than the object itself.
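For context, a sketch of the pre-copy script with the schema included (assuming the destination tables live in the dbo schema, as in the original truncate statements, and that the sink table name matches the file name):
TRUNCATE TABLE [dbo].[#{item().name}]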

U-SQL External table error: 'Unable to cast object of type 'System.DBNull' to type 'System.Type'.'

I'm failing to create external tables for two specific tables from an Azure SQL DB;
I have already created a few external tables with no issues.
The only difference I can see between the failed and the successful external tables is that the failing tables contain geography-type columns, so I think this is the issue, but I'm not sure.
CREATE EXTERNAL TABLE IF NOT EXISTS [Data].[Devices]
(
[Id] int
)
FROM SqlDbSource LOCATION "[Data].[Devices]";
Failed to connect to data source: 'SqlDbSource', with error(s): 'Unable to cast object of type 'System.DBNull' to type 'System.Type'.'
I solved it with a workaround instead of the external table:
I created a view that selects from an external rowset using EXECUTE:
CREATE VIEW IF NOT EXISTS [Data].[Devices]
AS
SELECT Id FROM EXTERNAL SqlDbSource
EXECUTE "SELECT Id FROM [Data].[Devices]";
This makes the script completely ignore the geography-type column, which is currently not supported as a REMOTEABLE_TYPE for data sources in U-SQL.
Please have a look at my answer on the other thread you opened. To add to that, I would also recommend looking at how to create a table using a query; in that query you should be able to use extractors to create the tables. To read more about extractors, please have a look at this doc.
Hope this helps.

How to load data to Hive table and make it also accessible in Impala

I have a table in Hive:
CREATE EXTERNAL TABLE sr2015(
creation_date STRING,
status STRING,
first_3_chars_of_postal_code STRING,
intersection_street_1 STRING,
intersection_street_2 STRING,
ward STRING,
service_request_type STRING,
division STRING,
section STRING )
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES (
'colelction.delim'='\u0002',
'field.delim'=',',
'mapkey.delim'='\u0003',
'serialization.format'=',', 'skip.header.line.count'='1',
'quoteChar'= "\"")
Data is loaded into the table this way:
LOAD DATA INPATH "hdfs:///user/rxie/SR2015.csv" INTO TABLE sr2015;
Why is the table only accessible in Hive? When I attempt to access it in the Hue/Impala editor I get the following error:
AnalysisException: Could not resolve table reference: 'sr2015'
which seems to say there is no such table, although the table does show up in the left panel.
In impala-shell, the error is different, as below:
ERROR: AnalysisException: Failed to load metadata for table: 'sr2015'
CAUSED BY: TableLoadingException: Failed to load metadata for table:
sr2015 CAUSED BY: InvalidStorageDescriptorException: Impala does not
support tables of this type. REASON: SerDe library
'org.apache.hadoop.hive.serde2.OpenCSVSerde' is not supported.
I have always thought Hive tables and Impala tables are essentially the same, the only difference being that Impala is a more efficient query engine.
Can anyone help sort it out? Thank you very much.
Assuming that sr2015 is located in a DB called db, in order to make the table visible in Impala you would normally issue either
invalidate metadata;
or
invalidate metadata db.sr2015;
in impala-shell.
However, in your case the reason is probably the Impala version you're using, since it doesn't support this table format at all.
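If the SerDe really is the blocker, one workaround is to recreate the table with a plain delimited row format, which Impala can read. A sketch, assuming the CSV has no quoted fields containing commas (ROW FORMAT DELIMITED does not handle quoting); the sr2015_delim name is made up:
CREATE EXTERNAL TABLE sr2015_delim (
creation_date STRING,
status STRING,
first_3_chars_of_postal_code STRING,
intersection_street_1 STRING,
intersection_street_2 STRING,
ward STRING,
service_request_type STRING,
division STRING,
section STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ('skip.header.line.count'='1');
LOAD DATA INPATH 'hdfs:///user/rxie/SR2015.csv' INTO TABLE sr2015_delim;
-- then, in impala-shell:
INVALIDATE METADATA sr2015_delim;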

Error while reviewing file after inserting data in redshift table

I have a table in Redshift into which I am inserting data from S3.
I viewed the table before inserting the data and it returned an empty table.
However, after inserting data into the Redshift table, I get the error below when doing select * from the table.
The command that copies data into the table from S3 runs successfully without any error.
java.lang.NoClassDefFoundError: com/amazon/jdbc/utils/DataTypeUtilities$NumericRepresentation error in redshift
What could be the possible cause of, and solution for, this?
I have faced this java.lang.NoClassDefFoundError when the JDBC connection properties are set incorrectly.
If you are using the Postgres driver, then ensure the URL uses the postgresql:// prefix,
e.g. jdbc:postgresql://HostName:5439/
Let me know if this works.
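For reference, a sketch of the two URL forms; the cluster host name is a made-up example and 5439 is the default Redshift port:
# Amazon Redshift JDBC driver (requires the Redshift JDBC jar on the classpath):
jdbc:redshift://examplecluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev
# Generic PostgreSQL driver:
jdbc:postgresql://examplecluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev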