BigQuery Error 6034920 when using UPDATE FROM - google-bigquery

We are trying to perform an ELT in BigQuery. When using UPDATE FROM, it fails on some tables with the following error:
"An internal error occurred and the request could not be completed.
Error: 6034920"
Moreover, both the source and destination tables consist of data from a single partition.
We have been unable to find any details for error code 6034920. Any insights or solutions would be really appreciated.

This is a transient, internal error in BigQuery. The behavior is related to a BigQuery shuffling component (in the BQ service backend), and engineers are working to solve it. At the moment there is no ETA for a resolution.
In the meantime, the workaround is to retry the query whenever you detect this behavior. You can keep tracking the related logs in Stackdriver by using the following filter:
resource.type="bigquery_resource"
protoPayload.serviceData.jobCompletedEvent.job.jobStatus.additionalErrors.message="An internal error occurred and the request could not be completed. Error: 6034920"
Another thing you can try is to stop putting values into the partitioning column; that may fix the job failures. I hope you find the above information useful.
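If the ELT is driven from a script, the retry workaround can be automated. Below is a minimal sketch assuming the google-cloud-bigquery Python client; the project, dataset, table and column names are placeholders, not the asker's actual schema.

import time

from google.api_core.exceptions import InternalServerError
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder UPDATE ... FROM statement standing in for the failing ELT step.
UPDATE_SQL = """
UPDATE `my-project.my_dataset.destination` d
SET value = s.value
FROM `my-project.my_dataset.source` s
WHERE d.id = s.id
"""

def run_update_with_retry(max_attempts=5, base_delay=2.0):
    for attempt in range(max_attempts):
        try:
            client.query(UPDATE_SQL).result()  # waits for the DML job to finish
            return
        except InternalServerError:  # transient internal errors such as 6034920
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential back-off

run_update_with_retry()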

Related

Bigquery internal error during copy job to move tables between datasets

I'm currently migrating around 200 tables in BigQuery (BQ) from one dataset (FROM_DATASET) to another one (TO_DATASET). Each of these tables has a _TABLE_SUFFIX corresponding to a date (I have three years of data for each table). Each suffix typically contains between 5 GB and 80 GB of data.
I'm doing this using a Python script that asks BQ, for each table, for each suffix, to run the following query:
-- example table=T_SOME_TABLE, suffix=20190915
CREATE OR REPLACE TABLE `my-project.TO_DATASET.T_SOME_TABLE_20190915`
COPY `my-project.FROM_DATASET.T_SOME_TABLE_20190915`
Everything works except for three tables (and all their suffixes) where the copy job fails at each _TABLE_SUFFIX with this error:
An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 4893854
Retrying the job after some time actually works, but of course it slows down the process. Does anyone have an idea of what the problem might be?
Thanks.
It turned out that those three problematic tables were some legacy ones with lots of columns. In particular, the BQ GUI shows this warning for two of them:
"Schema and preview are not displayed because the table has too many
columns and may cause the BigQuery console to become unresponsive"
This was probably the issue.
In the end, I managed to migrate everything by implementing a backoff mechanism to retry failed jobs.
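For reference, such a back-off loop could look roughly like the sketch below, assuming the google-cloud-bigquery Python client; the table name and suffixes are the example values from the question, and the retry limits are arbitrary.

import time

from google.api_core.exceptions import InternalServerError
from google.cloud import bigquery

client = bigquery.Client()

def copy_with_backoff(table, suffix, max_attempts=6):
    sql = f"""
    CREATE OR REPLACE TABLE `my-project.TO_DATASET.{table}_{suffix}`
    COPY `my-project.FROM_DATASET.{table}_{suffix}`
    """
    for attempt in range(max_attempts):
        try:
            client.query(sql).result()  # blocks until the copy job completes
            return
        except InternalServerError:  # e.g. the transient "Error: 4893854"
            if attempt == max_attempts - 1:
                raise
            time.sleep(min(2 ** attempt, 60))  # exponential back-off, capped

for suffix in ("20190915", "20190916"):  # one copy per _TABLE_SUFFIX
    copy_with_backoff("T_SOME_TABLE", suffix)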

Data Fusion does not allow STRUCT type from BigQuery

I'm trying to create a pipeline in Data Fusion to read a table from BigQuery with a STRUCT type, but I received this error:
2021-06-01 19:13:53,818 - WARN [service-http-executor-1321:i.c.w.s.c.AbstractWranglerHandler#210] - Error processing GET /v3/namespaces/system/apps/dataprep/services/service/methods/contexts/interoper_prd/connections/interoper_bq_prd/bigquery/raw_deep/tables/deep_water_report/read?scope=plugin-browser-source, resulting in a 500 response.
java.lang.RuntimeException: BigQuery type STRUCT is not supported
I was able to reproduce your issue. For this reason, I have opened an issue in the GCP Issue Tracker platform, here.
The product team is already aware of the complaint, as you can see in the case comments. Please follow the thread for any updates.

Inconsistency in BigQuery Data Ingestion on Streaming Error

Hi,
While streaming data to BigQuery, we are seeing inconsistencies in the ingested data when making https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll requests using the BigQuery Java library.
Some of the batches fail with error code: backendError, while some requests time-out with exception stacktrace: https://gist.github.com/anonymous/18aea1c72f8d22d2ea1792bb2ffd6139
For batches which have failed, we have observed 3 different kinds of behaviours related to ingested data:
All records in that batch fail to be ingested into BigQuery
Only some of the records fail to be ingested into BigQuery
All records successfully get ingested into BigQuery in spite of the thrown error
Our questions are:
How can we distinguish between these 3 cases?
For case 2, how can we handle partially ingested data, i.e., which records from that batch should be retried?
For case 3, if all records were successfully ingested, why is error thrown at all?
Thanks in advance...
For partial success, the error response will indicate which rows got inserted and which ones failed, especially for parsing errors. There are cases where the response fails to reach your client, resulting in timeout errors even though the insert succeeded.
In general, you can retry the entire batch and it will be deduplicated if you use the approach outlined in the data consistency documentation.
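As a rough sketch of that approach (the question uses the Java library; this uses the Python client, with a placeholder table and example rows), supplying a stable insertId per row lets BigQuery de-duplicate a retried batch on a best-effort basis:

import uuid

from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.events"  # placeholder table

rows = [{"event": "click", "ts": "2021-06-01T19:13:53Z"},
        {"event": "view", "ts": "2021-06-01T19:14:02Z"}]
row_ids = [str(uuid.uuid4()) for _ in rows]  # keep these if you retry the batch

errors = client.insert_rows_json(table_id, rows, row_ids=row_ids)
if errors:
    # Per-row errors identify exactly which rows failed (case 2 above).
    # Retrying the whole batch with the same row_ids is safe because
    # BigQuery performs best-effort de-duplication on insertId.
    print("Failed rows:", errors)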

Backend error on import from Cloud Storage to BigQuery

Recently, we have begun to see a number of errors such as this when importing from Cloud Storage to BigQuery:
Waiting on job_72ae7db68bb14e93b7a6990ed628aedd ... (153s) Current status: RUNNING
BigQuery error in load operation: Backend Error
Waiting on job_894172da125943dbb2cd8891958d2d10 ... (364s) Current status: RUNNING
BigQuery error in load operation: Backend Error
This process runs hourly, and had previously been stable for a long time. Nothing has changed in the import script or the types of data being loaded. Please let me know if you need any more information.
I looked up these jobs in the BigQuery logs; both of them appear to have succeeded. It is possible that the error you got was in reading the job state. I've filed an internal bug that we should distinguish between errors in the job and errors getting the state of the job in the bq tool.
After the job runs, you can use bq show -j <job_id> to see what the actual state of the job is. If it is still running, you can run bq wait <job_id>.
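If you drive these jobs from the Python client rather than the bq tool, a rough equivalent of those checks is sketched below, assuming the google-cloud-bigquery library; the job ID is just the one from the output above, used as a placeholder.

from google.cloud import bigquery

client = bigquery.Client()
job = client.get_job("job_72ae7db68bb14e93b7a6990ed628aedd")  # like `bq show -j`
print(job.state, job.error_result)

job.result()  # like `bq wait`: blocks until the job finishes, raising on failure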
I also took a look at the front-end logs; all of the status requests for those job ids returned HTTP 200 (success) codes.
Can you add the --apilog=file.txt parameter to your bq command line (you'll need to add it to the beginning of the command line, as in bq --apilog=file.txt load ...) and send the output of a case where you get another failure? If you're worried about sensitive data, feel free to send it directly to me (tigani at google).
Thanks,
Jordan Tigani
Google BigQuery Engineer

Getting error from bq tool when uploading and importing data on BigQuery - 'Backend Error'

I'm getting the error BigQuery error in load operation: Backend Error when I try to upload and import data into BQ. I already reduced the size and increased the time between imports, but nothing helps. The strange thing is that if I wait for a while and retry, it just works.
In the BigQuery Browser Tool it shows up as an error in some line/field, but I checked and there is none. That message is clearly spurious, because if I wait and retry the upload/import of the same file, it works.
Thanks
I looked up your failing jobs in the BigQuery backend, and I couldn't find any jobs that terminated with 'backend error'. I found several that failed because there were ASCII null characters in the data (it can be helpful to look at the error stream, not just the error result). It is possible that the data got garbled on the way to BigQuery... are you certain the data did not change between the failing import and the successful one on the same data?
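For what it's worth, with the google-cloud-bigquery Python client the distinction between the error stream and the error result corresponds roughly to a load job's errors and error_result attributes. A minimal sketch, with a placeholder bucket, table and job configuration:

from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    max_bad_records=0,
)
job = client.load_table_from_uri(
    "gs://my-bucket/data.csv",
    "my-project.my_dataset.my_table",
    job_config=job_config,
)
try:
    job.result()
except Exception:
    print("error_result:", job.error_result)  # the single terminal error
    for err in job.errors or []:              # the full error stream
        print("stream error:", err)           # e.g. nulls found on a given line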
I've found that exporting from a BigQuery table to CSV in Cloud Storage hits the same error when certain characters are present in one of the columns (in this case a column storing the raw results from a prediction analysis). Removing that column from the export resolved the issue.