I am using Spark 3.x with Apache Hudi version 0.8.0.
While trying to create a Presto table using the hudi-hive-sync tool, I am getting the below error:
Got runtime exception when hive syncing
java.lang.IllegalArgumentException: Could not find any data file written for commit [20220116033425__commit__COMPLETED], could not get schema for table
But when I check the data for the partition keys using a Zeppelin notebook, I see all the data present.
I understand that I need to commit the file manually. How do I do that?
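For reference, a minimal sketch of how the standalone hudi-hive-sync tool is typically invoked, per the Hudi docs; the JDBC URL, credentials, base path, and table name below are placeholders, not taken from the question:

cd hudi-hive-sync
./run_sync_tool.sh \
  --jdbc-url jdbc:hive2://hiveserver:10000 \
  --user hive \
  --pass hive \
  --base-path /path/to/hudi_table \
  --database default \
  --table my_hudi_table \
  --partitioned-by partition_key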
I'm creating a table from a CSV text file on my machine. I'm getting this message:
Hive partitioned loads require that the HivePartitioningOptions be set, with both a sourceUriPrefix and mode provided. No sourceUriPrefix was provided.
How can I fix this?
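That error comes from BigQuery's Hive-partitioned load path, and the bq CLI exposes both required options as flags. A minimal sketch, assuming the files actually live under a Hive-style layout in GCS (the bucket, dataset, and table names here are made up):

bq load --source_format=CSV \
  --autodetect \
  --hive_partitioning_mode=AUTO \
  --hive_partitioning_source_uri_prefix=gs://my-bucket/my-table \
  my_dataset.my_table \
  "gs://my-bucket/my-table/*.csv"

Note that Hive-partitioned loads only apply to data in GCS; a file loaded directly from your machine has no partition layout for BigQuery to read, so these options should not be set for a purely local load.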
I have created an Avro Hive table and loaded data into it from another table using a Hive INSERT OVERWRITE command. I can see the data in the Avro Hive table, but when I try to load it into a BigQuery table, it gives an error.
Table schema:
CREATE TABLE `adityadb1.gold_hcth_prfl_datatype_acceptence`(
  `prfl_id` bigint,
  `crd_dtl` array<struct<
    cust_crd_id:bigint,
    crd_nbr:string,
    crd_typ_cde:string,
    crd_typ_cde_desc:string,
    crdhldr_nm:string,
    crd_exprn_dte:string,
    acct_nbr:string,
    cre_sys_cde:string,
    cre_sys_cde_desc:string,
    last_upd_sys_cde:string,
    last_upd_sys_cde_desc:string,
    cre_tmst:string,
    last_upd_tmst:string,
    str_nbr:int,
    lng_crd_nbr:string>>)
STORED AS AVRO;
Error that I am getting:
Error encountered during job execution:
Error while reading data, error message: The Apache Avro library failed to read data with the following error: Cannot resolve:
I am using the following command to load the data into BigQuery:
bq load --source_format=AVRO dataset.tableName avro-filePath
Make sure that there is data available in the GCS folder you are pointing at and that the data contains the schema (it should if you created it from Hive). Here is an example of how to load the data:
bq --location=US load --source_format=AVRO --noreplace my_dataset.my_avro_table gs://myfolder/mytablefolder/part-m-00001.avro
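If the load still fails with a schema resolution error, it can also help to inspect the schema embedded in the Avro file itself and compare it against the BigQuery table. A sketch using the avro-tools JAR (the jar version and file name below are placeholders):

java -jar avro-tools-1.10.2.jar getschema part-m-00001.avro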
I am trying to create a Data Ingest feed, but all the jobs are failing. I checked NiFi, and there are error marks saying that "org.apache.hive.jdbc.HiveDriver" was not found. I checked the NiFi logs and found the following error:
So where exactly do I need to put the HiveDriver JAR?
Based on the comments, this seems to be the solution, as mentioned by @Greg Hart:
Have you tried using a Data Transformation feed? The Data Ingest template is for loading data into Hive, but it looks like you're using it to move data from one Hive table into another.
I am new to BigQuery, and I was trying to load an Avro file into a BigQuery table. The first two times, I was able to load the Avro file into the BigQuery table. From the third time onwards it started failing, and the error message is:
Waiting on bqjob_r77fb1a791c9ab204_0000015c88ab3ad8_1 ... (0s) Current status: DONE
BigQuery error in load operation: Error processing job 'xxx-yz-df:bqjob_r77fb1a791c9ab204_0000015c88ab3ad8_1': Provided Schema does not match Table xxx-yz-df:adityadb.avro_poc3_part_stage$20120611.
I tried many times. How can the schema mismatch for the same file if you try it more than twice? The load command I was using is:
bq load --source_format=AVRO adityadb.avro_poc3_part_stage$20120611 gs://reair_ddh/apps/hive/warehouse/adityadb1.db/avro_poc3_part_txt/ingestion_time=20120611/000000_0
I don't know why this is happening. Any help would be appreciated. Thank you.
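One thing worth ruling out, assuming the command is run from a Unix shell: the unquoted $ in the partition decorator is expanded by the shell before bq ever sees it (here $2, which is almost certainly empty), so the table reference BigQuery receives is not the one intended. Single-quoting the table reference avoids this:

bq load --source_format=AVRO \
  'adityadb.avro_poc3_part_stage$20120611' \
  gs://reair_ddh/apps/hive/warehouse/adityadb1.db/avro_poc3_part_txt/ingestion_time=20120611/000000_0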