How to fix 'get table schema from server' error in Hive

I'm using the Microsoft Hive ODBC driver to connect to a Hive server. An error occurs when I try to execute 'select * from tb limit 100' against a table 'tb' stored as CSV with a partition key. Other tables without a partition key execute successfully.
ERROR [HY000] [Microsoft][Hardy] (97) Error occurred while trying to
get table schema from server. Error: [Microsoft][Hardy] (35) Error
from server: error code: '0' error message:
'MetaException(message:java.lang.UnsupportedOperationException:
Storage schema reading not supported)'.

Add the configuration below under "Custom hive-site":
metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
It worked for me.
Note: Restart affected services after saving the configuration.
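If you are editing hive-site.xml directly rather than through a management UI, the equivalent entry is sketched below (assuming you have access to the metastore host's configuration):

<!-- hive-site.xml: enables SerDe-based storage schema reading -->
<property>
  <name>metastore.storage.schema.reader.impl</name>
  <value>org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader</value>
</property>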

Related

Truncating tables in Azure Data Factory Pre-Copy script?

I am building a pipeline, and I need to truncate my destination tables in Azure SQL DB before each copy. But I can't figure out the script:
Instead, I put in this code, but that is wrong because it runs before every copy of the tables (5 times) and re-truncates all of them each time, so only the last table keeps its data. So I need to parameterize it, I guess:
truncate table [dbo].[Global_data.csv]
truncate table [dbo].[Option_data.csv]
truncate table [dbo].[State_data.csv]
truncate table [dbo].[Status_data.csv]
truncate table [dbo].[Target_data.csv]
Also please see my source parameters:
ADLSv2 container: @pipeline().parameters.SourceContainer
ADLSv2 Directory: @pipeline().parameters.SourceDirectory
ADLSv2 filename: @item().name
Sink TableName: @item().name
So I'm guessing that my pre-copy script must be something like:
truncate table @item().name, but this resulted in an error for me:
ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Incorrect syntax near '@item'.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Incorrect syntax near '@item'.,Source=.Net SqlClient Data Provider,SqlErrorNumber=102,Class=15,ErrorCode=-2146232060,State=1,Errors=[{Class=15,Number=102,State=1,Message=Incorrect syntax near '@item'.,},],'
When I use TRUNCATE TABLE [@{item()}], I get the below error 5 times (one for each table):
ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed with the following error: 'Cannot find the object "{"name":"StateMetadata.csv","type":"File"}" because it does not exist or you do not have permissions.',Source=,''Type=System.Data.SqlClient.SqlException,Message=Cannot find the object "{"name":"StateMetadata.csv","type":"File"}" because it does not exist or you do not have permissions.,Source=.Net SqlClient Data Provider,SqlErrorNumber=4701,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=4701,State=1,Message=Cannot find the object "{"name":"StateMetadata.csv","type":"File"}" because it does not exist or you do not have permissions.,},],'
Please use TRUNCATE TABLE [@{item().name}] instead: item() is the whole ForEach item object (which is why the error shows the JSON {"name":...,"type":"File"}), while item().name is just the file name.
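For context, a minimal sketch of the sink's pre-copy script inside the ForEach, assuming the dbo schema and sink table names that match the file names as in the question:

-- Pre-copy script on the Copy activity's sink, evaluated once per ForEach item.
-- @{item().name} interpolates the current file name (e.g. Global_data.csv);
-- a bare @{item()} would interpolate the whole JSON object and fail.
TRUNCATE TABLE [dbo].[@{item().name}]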

Large CSV file load into SQL columnstore table fails randomly; after restart the same SSIS package works fine

I experienced a strange error while loading a CSV file (comma-separated values) containing ~1.2 million rows into a SQL Server 2016 columnstore table using an SSIS package. I get the following error rarely, especially on datetime columns.
After I simply restart the failed ETL, it works fine. We load the same file in different environments, and the error appears only in one environment on the same day.
I have added an error output and will wait to see what it captures next time. Meanwhile, I would like to reach out to the experts and ask whether there is any known issue with SQL columnstore tables or SSIS when loading datetime values.
But the error occurs while inserting data, so it could be more of a database-side issue.
PackageName > Sequence Container > DFL Transactions. SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available. Source: "Microsoft SQL Server Native Client 11.0" Hresult: 0x80004005 Description: "Invalid date format".
PackageName > Sequence Container > DFL Transactions. There was an error with DST Transactions.Inputs[OLE DB Destination Input].Columns[Account_FromDate] on DST Transactions.Inputs[OLE DB Destination Input]. The column status returned was: "Conversion failed because the data value overflowed the specified type.".
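One way to check whether the file actually contains out-of-range datetime values is to bulk-load the suspect column into a varchar staging table first and probe it with TRY_CONVERT; a sketch, where staging_transactions is a hypothetical staging table and Account_FromDate is the column named in the error above:

-- TRY_CONVERT (SQL Server 2012+) returns NULL instead of raising
-- "Invalid date format" / overflow errors, so this lists the bad values.
SELECT Account_FromDate
FROM staging_transactions
WHERE Account_FromDate IS NOT NULL
  AND TRY_CONVERT(datetime, Account_FromDate) IS NULL;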

BigQuery Error: 8822097

On trying to load a JSON file into BigQuery, I get the following error: "An internal error occurred and the request could not be completed. Error: 8822097". Is this error related to hitting the BigQuery daily load limit? It would be amazing if someone could point me to a glossary of errors.
{Location: ""; Message: "An internal error occurred and the request could not be completed. Error: 8822097"; Reason: "internalError"}
Thanks!
Are you trying to load different types of files in a single command?
It may happen when you try to load from a Google Cloud Storage path that contains both compressed and uncompressed files:
$ gsutil ls gs://bucket/path/
gs://bucket/path/a.txt
gs://bucket/path/b.txt.gz
$ bq load --autodetect --noreplace --source_format=NEWLINE_DELIMITED_JSON "project-id:dataset_name.table_name" gs://bucket/path/*
Waiting on bqjob_id_1 ... (0s) Current status: DONE
BigQuery error in load operation: Error processing job 'project-id:bqjob_id_1': An internal error occurred and the request could not be completed. Error: 8822097
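If that is the cause, a sketch of a workaround is to issue separate load jobs so each wildcard matches only one kind of file (bucket and table names as in the hypothetical listing above):

$ bq load --autodetect --noreplace --source_format=NEWLINE_DELIMITED_JSON "project-id:dataset_name.table_name" "gs://bucket/path/*.txt"
$ bq load --autodetect --noreplace --source_format=NEWLINE_DELIMITED_JSON "project-id:dataset_name.table_name" "gs://bucket/path/*.txt.gz"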
This error can also occur when you hit BigQuery's limit of 10,000 columns per table.
To verify this, you can count the columns in the table in question:
bq --format=json show project:dataset.table | jq . | grep "type" | grep -v "RECORD" | wc -l
Reducing the number of columns would probably be the best and quickest way to work around this issue.
We got the same error "An internal error occurred and the request could not be completed. Error: 8822097" when running a standard SQL query. Running the corresponding legacy SQL query gave us an error message that was actually actionable:
Error while reading table: ABC, error message: The reference schema differs from the existing data: The required field 'XYZ' is missing.
Fixing the underlying error, exposed by the legacy SQL query, also fixed the error for the standard SQL query.
In our case we have Avro files, and the table was created from them. Newer Avro files didn't contain a certain field, but the table still contained that field. Rebuilding the table from the new Avro files (sketched below) solved the issue. We also have views on top of the table, which may or may not change the resulting error message.
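A sketch of such a rebuild with bq, assuming hypothetical paths and that replacing the table wholesale is acceptable:

# --replace drops the existing rows and schema and reloads from the new files
$ bq load --replace --source_format=AVRO "project-id:dataset_name.table_name" "gs://bucket/path/newer/*.avro"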

External Hive metastore issue in EMR cluster

I am pointing my EMR cluster's Hive metastore to an external MySQL RDS instance.
I created a new Hive database "mydb", and the entry appeared in the hive.DBS table in the external MySQL DB:
hdfs://ip-10-239-1-118.ec2.internal:8020/user/hive/warehouse/mydb.db mydb hadoop USER
I also created a new Hive table "mytable" under the mydb database, and the entry appeared in hive.TBLS in the external MySQL DB. So far everything is good.
I then terminated my cluster. When I came back the next day, I launched a new cluster.
Now, I did the below:
USE MYDB;
create table mytable_2(id int);
I am getting the below error:
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: java.net.NoRouteToHostException No Route to Host from ip-10-239-1-4.ec2.internal/10.239.1.4 to ip-10-239-1-118.ec2.internal:8020 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost)
Note:
IP 10.239.1.4 is my current cluster's name node.
IP 10.239.1.118 is my earlier cluster's name node.
Please let me know what properties I need to override to avoid this kind of error.
I had the same issue and fixed it. ^_^
hive> create table sales.t1(i int);
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got
exception: java.net.NoRouteToHostException
No Route to Host from ip-123-234-101-101.ec2.internal/123-234-101-101
to ip-111-111-202-202.ec2.internal:8020 failed on socket timeout
exception: java.net.NoRouteToHostException: No route to host;
For more details see: http://wiki.apache.org/hadoop/NoRouteToHost)
Cause:
We had an external metastore for the cluster so that we could get rid of the cluster and spin up a new one anytime. The Hive metastore still keeps references to the old cluster if there are 'MANAGED' tables.
Solution:
hive --service metatool -listFSRoot
hive --service metatool -updateLocation <new_value> <old_value>
E.g.:
new_value = hdfs://ip-XXX.New.XXX.XXX:PORT/user/hive/warehouse
old_value = hdfs://ip-YYY.Old.YYY.YYY:PORT/user/hive/warehouse
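Putting that together with the addresses from the question above (10.239.1.4 as the new name node, 10.239.1.118 as the old one), a sketch of the session:

# Show the filesystem root currently recorded in the metastore;
# here it would still print the old cluster's name node.
hive --service metatool -listFSRoot

# Repoint managed-table locations at the new cluster's name node.
hive --service metatool -updateLocation \
  hdfs://ip-10-239-1-4.ec2.internal:8020/user/hive/warehouse \
  hdfs://ip-10-239-1-118.ec2.internal:8020/user/hive/warehouse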
Alternatively, you can go into Glue in the AWS console, go to databases/default, and edit the entry to have the updated IP in the Location field (which is the output of hive --service metatool -listFSRoot).

How to configure error logging at Source level in a Data Flow Task of SSIS

How to configure error logging at Source level in a Data Flow Task of SSIS?
Sample script used in the example:
create table test_data (col1 int, col2 varchar(100))
insert into test_data values (1, '00 0');
insert into test_data values (2, '02');
The OLE DB Source uses a SQL Command with the following query:
select * from test_data where col2=0;
This query fails because comparing the varchar column col2 to the integer 0 forces an implicit conversion, and the value '00 0' cannot be converted to int.
The destination is a flat file.
During execution, the following error is reported:
OLE DB Source [1] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB
error has occurred. Error code: 0x80040E07. An OLE DB record is
available. Source: "Microsoft SQL Server Native Client 10.0" Hresult:
0x80040E07 Description: "Conversion failed when converting the varchar
value '00 0' to data type int.".
I get the above error when I run the package. I know the query is not correct, but I do not intend to change the source query; this is a sample derived from my application.
What should I do in order to log this error to a log file at a specified location?
I tried using the error output, but that handles only errors in the data coming from the source, not errors in the query itself.
There are several ways you can log the error.
1. You can use the SSIS logging feature:
SSIS -> Logging
Select the provider type (SSIS log provider for Text Files) if you want the error saved to a text file.
Go to the Details tab and select the OnError event.
2. In the package's event handlers, click the OnError event, drag and drop a Script Task, and write code to capture the error and save it to a text file. Use the system variables (System::ErrorCode) and (System::ErrorDescription) to get the error information, as sketched below.
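A minimal sketch of the Script Task body for option 2, assuming a C# Script Task and a hypothetical log path; both system variables must be listed in the task's ReadOnlyVariables:

// Inside the Script Task's Main() method on the OnError event handler.
public void Main()
{
    string logPath = @"C:\Logs\ssis_errors.log";  // hypothetical location
    string line = string.Format("{0}\t{1}\t{2}",
        DateTime.Now,
        Dts.Variables["System::ErrorCode"].Value,
        Dts.Variables["System::ErrorDescription"].Value);
    System.IO.File.AppendAllText(logPath, line + Environment.NewLine);
    Dts.TaskResult = (int)ScriptResults.Success;
}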
Hope this helps!