Inconsistency in BigQuery Data Ingestion on Streaming Error - google-bigquery

Hi,
While streaming data to BigQuery, we are seeing inconsistent ingestion when making https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll requests with the BigQuery Java library.
Some batches fail with error code backendError, while other requests time out with this exception stacktrace: https://gist.github.com/anonymous/18aea1c72f8d22d2ea1792bb2ffd6139
For batches that have failed, we have observed three different kinds of behaviour related to the ingested data:
1. All records in the batch fail to be ingested into BigQuery.
2. Only some of the records fail to be ingested into BigQuery.
3. All records are successfully ingested into BigQuery despite the thrown error.
Our questions are:
1. How can we distinguish between these three cases?
2. For case 2, how can we handle partially ingested data, i.e., determine which records from the batch should be retried?
3. For case 3, if all records were successfully ingested, why is an error thrown at all?
Thanks in advance...

For partial successes, the error response indicates which rows were inserted and which ones failed, particularly for parsing errors. There are also cases where the response never reaches your client, resulting in a timeout error even though the insert succeeded.
In general, you can retry the entire batch, and it will be deduplicated if you use the approach outlined in the data consistency documentation.
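As a rough illustration of that approach, here is a minimal sketch using the BigQuery Java client: it attaches a client-generated insertId to each row, so that a retried batch can be deduplicated by BigQuery on a best-effort basis, and it inspects the per-row error map to see which rows were rejected. The dataset, table, and field names are hypothetical.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryError;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class StreamingInsertSketch {
  public static void main(String[] args) {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    TableId table = TableId.of("my_dataset", "my_table"); // hypothetical names

    // A client-generated insertId lets BigQuery deduplicate the row
    // (best-effort) if the whole batch is retried after an error or timeout.
    InsertAllRequest request = InsertAllRequest.newBuilder(table)
        .addRow(InsertAllRequest.RowToInsert.of(
            UUID.randomUUID().toString(),          // insertId used for dedup
            Map.of("name", "alice", "score", 42))) // hypothetical row content
        .build();

    InsertAllResponse response = bigquery.insertAll(request);
    if (response.hasErrors()) {
      // Keys are the indexes of rejected rows within this request; rows not
      // listed here were accepted (case 2). You can retry only the listed
      // rows, or simply retry the whole batch and rely on insertId dedup.
      for (Map.Entry<Long, List<BigQueryError>> e :
          response.getInsertErrors().entrySet()) {
        System.out.println("Row " + e.getKey() + " failed: " + e.getValue());
      }
    }
  }
}

Note that a timeout (case 3) gives you no response at all, so retrying the whole batch with insertIds attached is the safe default there.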

Related

BigQuery internal error during copy job to move tables between datasets

I'm currently migrating around 200 tables in BigQuery (BQ) from one dataset (FROM_DATASET) to another (TO_DATASET). Each of these tables has a _TABLE_SUFFIX corresponding to a date (I have three years of data for each table). Each suffix typically contains between 5 GB and 80 GB of data.
I'm doing this using a Python script that asks BQ, for each table, for each suffix, to run the following query:
-- example table=T_SOME_TABLE, suffix=20190915
CREATE OR REPLACE TABLE `my-project.TO_DATASET.T_SOME_TABLE_20190915`
COPY `my-project.FROM_DATASET.T_SOME_TABLE_20190915`
Everything works except for three tables (and all their suffixes) where the copy job fails at each _TABLE_SUFFIX with this error:
An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 4893854
Retrying the job after some time actually works, but it of course slows down the process. Does anyone have an idea what the problem might be?
Thanks.
It turned out that those three problematic tables were some legacy ones with lots of columns. In particular, the BQ GUI shows this warning for two of them:
"Schema and preview are not displayed because the table has too many
columns and may cause the BigQuery console to become unresponsive"
This was probably the issue.
In the end, I managed to migrate everything by implementing a backoff mechanism to retry failed jobs.
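For illustration, here is a minimal sketch of such a retry-with-backoff loop using the BigQuery Java client (the asker's migration script was Python; the table names reuse the example above):

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.CopyJobConfiguration;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.TableId;

public class CopyWithBackoff {
  public static void main(String[] args) throws InterruptedException {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    TableId source = TableId.of("FROM_DATASET", "T_SOME_TABLE_20190915");
    TableId destination = TableId.of("TO_DATASET", "T_SOME_TABLE_20190915");

    long delayMillis = 1_000;
    for (int attempt = 1; attempt <= 5; attempt++) {
      Job job = bigquery.create(
          JobInfo.of(CopyJobConfiguration.of(destination, source)));
      job = job.waitFor(); // blocks until the copy job completes
      if (job != null && job.getStatus().getError() == null) {
        return; // copy succeeded
      }
      // Transient internal errors (like Error: 4893854) are worth retrying.
      Thread.sleep(delayMillis);
      delayMillis *= 2; // exponential backoff
    }
    throw new RuntimeException("Copy job still failing after 5 attempts");
  }
}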

When is data ready for querying in Google BigQuery after a Load Job?

Say I have a very large CSV file that I'm loading into a BigQuery table. Will this data be available for querying only after the whole file has been uploaded and the job is finished, or will it be available for querying as the file is being uploaded?
BigQuery commits data from a load job in an all-or-none fashion. Partial results will not appear in the destination table while the job is progressing; results are committed at the end of the load job.
A load job that terminates with an error commits no rows. However, for use cases where you have poorly sanitized data, you can optionally configure your load job to ignore bad/malformed data through configuration values like MaxBadRecords. In such cases, a job may have warnings and still commit the successfully processed data, but the commit semantics remain the same (all at the end, or none if the defined threshold for bad data is exceeded).
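As a rough sketch of these semantics using the BigQuery Java client (the bucket, file, and table names are hypothetical): nothing is visible in the destination table until waitFor() returns, and setMaxBadRecords sets the threshold for tolerated bad rows.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.LoadJobConfiguration;
import com.google.cloud.bigquery.TableId;

public class LoadCsvSketch {
  public static void main(String[] args) throws InterruptedException {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    TableId table = TableId.of("my_dataset", "my_table"); // hypothetical

    LoadJobConfiguration config =
        LoadJobConfiguration.newBuilder(table, "gs://my-bucket/big-file.csv")
            .setFormatOptions(FormatOptions.csv())
            .setMaxBadRecords(100) // tolerate up to 100 malformed rows
            .build();

    Job job = bigquery.create(JobInfo.of(config)).waitFor();
    // The commit is all-or-none at the end of the job: if the job failed,
    // or exceeded the bad-record threshold, no rows were committed.
    if (job == null || job.getStatus().getError() != null) {
      throw new RuntimeException("Load failed: "
          + (job == null ? "job no longer exists" : job.getStatus().getError()));
    }
  }
}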

BigQuery Error 6034920 when using UPDATE FROM

We are trying to perform an ELT in BigQuery. When using UPDATE FROM, it fails on some tables with the following error:
"An internal error occurred and the request could not be completed.
Error: 6034920"
Moreover, both the source and destination tables consist of data from a single partition.
We are unable to find details for error code 6034920. Any insights/solutions would be greatly appreciated.
This is a transient, internal error in BigQuery. The behavior is related to a BigQuery shuffling component (in the BQ service backend), and engineers are working to solve it. At the moment there is no ETA for a resolution.
In the meantime, the workaround is to retry the query when you hit this error. You can continue tracking the issue in the Stackdriver logs by using the following filter:
resource.type="bigquery_resource"
protoPayload.serviceData.jobCompletedEvent.job.jobStatus.additionalErrors.message="An internal error occurred and the request could not be completed. Error: 6034920"
You can also try not putting values into the partitioning column; that may stop the job failures. I hope you find the above pieces of information useful.

Is there a size limit on appending ORC data files to Vora tables?

I created a Vora table in Vora 1.3 and tried to append data to that table from ORC files that I got from an SAP BW archiving process (NLS on Hadoop). I had 20 files containing approximately 50 million records in total.
When I tried to use the "files" setting in the APPEND statement as "/path/*", Vora returned this error message after approximately 1 hour:
com.sap.spark.vora.client.VoraClientException: Could not load table F002_5F: [Vora [eba156.extendtec.com.au:42681.1640438]] java.lang.RuntimeException: Wrong magic number in response, expected: 0x56320170, actual: 0x00000000. An unsuccessful attempt to load a table might lead to an inconsistent table state. Please drop the table and re-create it if necessary. with error code 0, status ERROR_STATUS
Next, I tried appending the data from each file using separate APPEND statements. On the 15th append (of 20), I got the same error message.
The error indicates that the Vora engine on node eba156.extendtec.com.au is not available. I suspect it either crashed or ran into an out-of-memory situation.
You can check the log directory for a crash dump. If you find one, please open a customer message for further investigation.
If you do not find a crash dump, it is likely an out-of-memory situation. You should find confirmation either in the engine log file or in /var/log/messages (if the OOM killer ended the process). In that case, the available memory is not sufficient to load the data.

Getting error from bq tool when uploading and importing data on BigQuery - 'Backend Error'

I'm getting the error BigQuery error in load operation: Backend Error when I try to upload and import data into BQ. I have already reduced the file size and increased the time between imports, but nothing helps. The strange thing is that if I wait a while and retry, it just works.
In the BigQuery Browser Tool it appears as an error in some line/field, but I checked and there is none. This is clearly a spurious message, because if I wait and retry the upload/import of the same file, it works.
Thanks
I looked up your failing jobs in the BigQuery backend, and I couldn't find any jobs that terminated with 'backend error'. I did find several that failed because there were ASCII nulls in the data (it can be helpful to look at the error stream errors, not just the error result). It is possible that the data got garbled on the way to BigQuery... are you certain the data did not change between the failing import and the successful one on the same data?
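For anyone who wants to check this on their own jobs: the error stream mentioned above corresponds to a job's execution errors, as opposed to its final error result. A minimal sketch with the BigQuery Java client (the job ID is hypothetical):

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryError;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;
import java.util.List;

public class InspectJobErrors {
  public static void main(String[] args) {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    Job job = bigquery.getJob(JobId.of("my-load-job-id")); // hypothetical ID

    // The error result: set only when the job as a whole failed.
    System.out.println("Error result: " + job.getStatus().getError());

    // The error stream: individual problems encountered while running
    // (e.g. rows with ASCII nulls), even if the job itself succeeded.
    List<BigQueryError> errors = job.getStatus().getExecutionErrors();
    if (errors != null) {
      for (BigQueryError e : errors) {
        System.out.println("Stream error: " + e.getMessage());
      }
    }
  }
}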
I've found that exporting from a BigQuery table to CSV in Cloud Storage hits the same error when certain characters are present in one of the columns (in this case, a column storing the raw results from a prediction analysis). Removing that column from the export resolved the issue.