Backend error on import from Cloud Storage to BigQuery - google-bigquery

Recently, we have begun to see a number of errors such as this when importing from Cloud Storage to BigQuery:
Waiting on job_72ae7db68bb14e93b7a6990ed628aedd ... (153s) Current status: RUNNING
BigQuery error in load operation: Backend Error
Waiting on job_894172da125943dbb2cd8891958d2d10 ... (364s) Current status: RUNNING
BigQuery error in load operation: Backend Error
This process runs hourly, and had previously been stable for a long time. Nothing has changed in the import script or the types of data being loaded. Please let me know if you need any more information.

I looked up these jobs in the BigQuery logs; both of them appear to have succeeded. It is possible that the error you got was in reading the job state. I've filed an internal bug that we should distinguish between errors in the job and errors getting the state of the job in the bq tool.
After the job runs, you can use bq show -j <job_id> to see what the actual state of the job is. If it is still running, you can run bq wait <job_id>.
I also took a look at the front-end logs; all of the status requests for those job ids returned HTTP 200 (success) codes.
Can you add the --apilog=file.txt parameter to your bq command line (you'll need to add it to the beginning of the command line, as in bq --apilog=file.txt load ...) and send the output of a case where you get another failure? If you're worried about sensitive data, feel free to send it directly to me (tigani at google).
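For reference, here is a minimal sketch of the commands described above (the job id is a placeholder; substitute your own):
# Check the final state of a completed or running load job.
bq show -j <job_id>
# Block until the job finishes if it is still running.
bq wait <job_id>
# Re-run the load with API logging enabled; --apilog must come before the subcommand.
bq --apilog=file.txt load ...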
Thanks,
Jordan Tigani
Google BigQuery Engineer

Related

How to monitor Databricks jobs using CLI or Databricks API to get the information about all jobs

I want to monitor the status of the jobs to see whether they are running over time or have failed. If you have a script or any reference, please help me with this. Thanks.
You can use the databricks runs list command to list all job runs. This will show every run and its current status: RUNNING/FAILED/SUCCESS/TERMINATED.
If you want to see whether a job is running over, you would then have to use the databricks runs get --run-id command to fetch the metadata for that run. This returns JSON from which you can parse out the start_time and end_time, as shown in the sketch after the commands below.
# Lists job runs.
databricks runs list
# Gets the metadata about a run in json form
databricks runs get --run-id 1234
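A minimal parsing sketch, assuming jq is installed and that the run JSON exposes start_time and end_time as epoch milliseconds, as described above:
# Pull out the start and end timestamps from the run metadata.
databricks runs get --run-id 1234 | jq '{start_time, end_time}'
# Or compute the run duration (in milliseconds) directly.
databricks runs get --run-id 1234 | jq '.end_time - .start_time'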
Hope this helps get you on track!

BigQuery Error 6034920 when using UPDATE FROM

We are trying to perform an ELT in BigQuery. When using UPDATE FROM, it fails on some tables with the following error:
"An internal error occurred and the request could not be completed.
Error: 6034920"
Moreover, both (source and destination) tables consist of data from a single partition.
We are unable to find any details for error code 6034920. Any insights/solutions would be really appreciated.
This is a transient, internal error in BigQuery. The behavior is related to a BigQuery shuffling component (in the BQ service backend), and engineers are working to solve it. At the moment there is no ETA for a fix.
In the meantime, as a workaround, retry the query when you hit this behavior (see the retry sketch below). You can continue tracking the logs related to this issue in Stackdriver by using the following filter:
resource.type="bigquery_resource"
protoPayload.serviceData.jobCompletedEvent.job.jobStatus.additionalErrors.message="An internal error occurred and the request could not be completed. Error: 6034920"
Another thing you can try is to stop putting values into the partitioning column; that could hopefully fix the job failures. I hope you find the above information useful.
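A minimal retry sketch for the workaround above (the UPDATE statement, table names, retry count, and sleep interval are all placeholders; adapt them to your own ELT job):
# Retry the UPDATE FROM statement a few times, pausing between attempts.
for attempt in 1 2 3; do
  bq query --use_legacy_sql=false \
    'UPDATE mydataset.dest d SET d.value = s.value FROM mydataset.src s WHERE d.id = s.id' \
    && break
  sleep 30
done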

BigQuery "Backend Error, Job aborted" when exporting data

The export job for one of my tables fails in BigQuery with no error message. I checked the job id hoping to get more info, but it just says "Backend Error, Job aborted". I used the command-line tool with this command:
bq extract --project_id=my-proj-id --destination_format=NEWLINE_DELIMITED_JSON 'test.table_1' gs://mybucket/export
I checked this question, but I know that it is not a problem with my destination bucket in GCS, because exporting other tables to the same bucket succeeds.
The only difference here is that this table has a repeated record field and each JSON row can get pretty large, but I did not find any limit for this in the BigQuery docs.
Any ideas on what the problem could be?
Job Id from one of my tries: bqjob_r51435e780aefb826_0000015691dda235_1

BigQuery error in extract operation: Error processing job Unexpected. Please try again

I'm having a problem extracting data from BigQuery to Cloud Storage. I've set public read-write permissions on Cloud Storage, but I always receive this:
BigQuery error in extract operation: Error processing job Unexpected. Please try again.
The command I'm executing with the bq client tool is:
bq extract dummy_dev.users gs://dummy_dev/some.json
Is this a known issue ?
Thanks in advance
Were streaming inserts used to populate the data in the table being extracted? If so, this may be related to the difference in data durability for streaming data and the nature of how streaming data is buffered prior to full replication.
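One way to check is to look at the table metadata (a sketch, using the dummy_dev.users table from the question); a streamingBuffer section in the output indicates rows that are still buffered and not yet fully replicated:
# Print full table metadata as JSON; look for a "streamingBuffer" entry.
bq show --format=prettyjson dummy_dev.users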

Getting error from bq tool when uploading and importing data on BigQuery - 'Backend Error'

I'm getting the error BigQuery error in load operation: Backend Error when I try to upload and import data into BQ. I have already reduced the file size and increased the time between imports, but nothing helps. The strange thing is that if I wait a while and retry, it just works.
In the BigQuery Browser Tool it appears as an error in some line/field, but I checked and there is none. This seems to be a spurious message, because if I wait and retry the upload/import of the same file, it works.
Thanks
I looked up your failing jobs in the BigQuery backend, and I couldn't find any jobs that terminated with 'backend error'. I found several that failed because there were ASCII nulls in the data. (It can be helpful to look at the error stream, not just the error result; see the sketch below.) It is possible that the data got garbled on the way to BigQuery. Are you certain the data did not change between the failing import and the successful one on the same data?
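A quick way to inspect the full error stream of a load job (a sketch; replace the placeholder job id with your own):
# Print the job as JSON; status.errorResult is the single fatal error,
# while status.errors lists every error encountered (e.g. rows with ASCII nulls).
bq show --format=prettyjson -j <job_id>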
I've found that exporting from a BigQuery table to CSV in Cloud Storage hits the same error when certain characters are present in one of the columns (in this case, a column storing the raw results from a prediction analysis). Removing that column from the export resolved the issue.
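A minimal sketch of that workaround (hypothetical dataset, table, and column names): materialize a copy of the table without the problematic column, then extract the copy instead.
# Copy the table minus the suspect column into a new destination table.
bq query --use_legacy_sql=false \
  --destination_table=mydataset.mytable_clean \
  'SELECT * EXCEPT(raw_prediction) FROM mydataset.mytable'
# Export the cleaned copy to Cloud Storage as CSV.
bq extract --destination_format=CSV mydataset.mytable_clean gs://mybucket/export.csv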