BigQuery "Backend Error, Job aborted" when exporting data - google-bigquery

The export job for one of my tables fails in BigQuery with no useful error message. I checked the job ID hoping to get more info, but it just says "Backend Error, Job aborted". I used the command-line tool with this command:
bq extract --project_id=my-proj-id --destination_format=NEWLINE_DELIMITED_JSON 'test.table_1' gs://mybucket/export
I checked this question, but I know it is not a problem with my destination bucket in GCS, because exporting other tables to the same bucket succeeds.
The only difference here is that this table has a repeated record field and each JSON row can get pretty large, but I did not find any limit for this in the BigQuery docs.
Any ideas on what the problem could be?
Job Id from one of my tries: bqjob_r51435e780aefb826_0000015691dda235_1

Related

Bigquery internal error during copy job to move tables between datasets

I'm currently migrating around 200 tables in BigQuery (BQ) from one dataset (FROM_DATASET) to another (TO_DATASET). Each of these tables has a _TABLE_SUFFIX corresponding to a date (I have three years of data for each table). Each suffix typically contains between 5 GB and 80 GB of data.
I'm doing this using a Python script that asks BQ, for each table and each suffix, to run the following query:
-- example table=T_SOME_TABLE, suffix=20190915
CREATE OR REPLACE TABLE `my-project.TO_DATASET.T_SOME_TABLE_20190915`
COPY `my-project.FROM_DATASET.T_SOME_TABLE_20190915`
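For reference, a minimal sketch of such a script, assuming the google-cloud-bigquery Python client (the table and suffix lists here are illustrative, not taken from the original post):

from google.cloud import bigquery

client = bigquery.Client(project="my-project")
tables = ["T_SOME_TABLE"]      # illustrative subset of the ~200 tables
suffixes = ["20190915"]        # illustrative subset of the date suffixes

for table in tables:
    for suffix in suffixes:
        sql = (
            f"CREATE OR REPLACE TABLE `my-project.TO_DATASET.{table}_{suffix}` "
            f"COPY `my-project.FROM_DATASET.{table}_{suffix}`"
        )
        client.query(sql).result()  # block until the copy job finishes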
Everything works except for three tables (and all their suffixes) where the copy job fails at each _TABLE_SUFFIX with this error:
An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 4893854
Retrying the job after some time actually works, but of course it slows down the process. Does anyone have an idea of what the problem might be?
Thanks.
It turned out that those three problematic tables were some legacy ones with lots of columns. In particular, the BQ GUI shows this warning for two of them:
"Schema and preview are not displayed because the table has too many
columns and may cause the BigQuery console to become unresponsive"
This was probably the issue.
In the end, I managed to migrate everything by implementing a backoff mechanism to retry failed jobs.
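A backoff retry along these lines could be sketched as follows, assuming the google-cloud-bigquery client and that the failed copy surfaces as a GoogleAPIError from result():

import time
from google.api_core.exceptions import GoogleAPIError
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

def run_with_backoff(sql, max_attempts=5):
    # Retry the copy query with exponential backoff, as the error message suggests.
    for attempt in range(max_attempts):
        try:
            client.query(sql).result()
            return
        except GoogleAPIError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, 8s, ...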

Pub/Sub to BigQuery template

I'm currently evaluating a streaming pipeline from Pub/Sub to BigQuery using the template provided by Google, and I keep getting an error. The schema of the message matches the BigQuery table, but somehow the job is not able to write the messages to BigQuery. Any ideas what the issue could be?
It looks like the error occurs while inserting the data into BigQuery.
You need to check for these things:
Some fields are missing.
The type of the data you are inserting does not match the table schema.
A required field has no data.
A date does not have the correct format.
You can execute this command in the terminal to get more information about the error:
bq --format=prettyjson show -j <JobID>
It will return JSON with more details. For example:
"message": "Error while reading data, error message: Could not parse '16.66666666666667' as int for field Course_Percentage (position 46) starting at location 1717164"
You can see this document about troubleshooting streaming inserts.
Another option: go to the “Logging” section, then “Logs Explorer”, in the console to see more details about the errors.
You can filter the errors by product name, in this case “Dataflow” or “BigQuery”.
You can also add a filter by time.
You’ll see all the errors for that product, and if you click on one, you’ll see more details about that specific error.
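For example, filters roughly like these can be pasted into Logs Explorer (the resource types and the timestamp are illustrative and may need adjusting):

resource.type="dataflow_step" AND severity>=ERROR
resource.type="bigquery_resource" AND severity>=ERROR AND timestamp>="2022-01-01T00:00:00Z"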
You can see this documentation about logging.

BigQuery error in extract operation: Error processing job Unexpected. Please try again

I'm having a problem extracting data from BigQuery to Cloud Storage. I've set public read-write permissions on Cloud Storage, but I always receive this:
BigQuery error in extract operation: Error processing job Unexpected. Please try again.
The command I'm executing with the bq command-line tool is:
bq extract dummy_dev.users gs://dummy_dev/some.json
Is this a known issue?
Thanks in advance
Were streaming inserts used to populate the data in the table being extracted? If so, this may be related to the difference in data durability for streaming data and the nature of how streaming data is buffered prior to full replication.
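One way to check whether the table still has rows in the streaming buffer is via the Python client, for example (using the table name from the question):

from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("dummy_dev.users")
# streaming_buffer is None once all streamed rows have been fully committed
print(table.streaming_buffer)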

Backend error on import from Cloud Storage to BigQuery

Recently, we have begun to see a number of errors such as this when importing from Cloud Storage to BigQuery:
Waiting on job_72ae7db68bb14e93b7a6990ed628aedd ... (153s) Current status: RUNNING
BigQuery error in load operation: Backend Error
Waiting on job_894172da125943dbb2cd8891958d2d10 ... (364s) Current status: RUNNING
BigQuery error in load operation: Backend Error
This process runs hourly, and had previously been stable for a long time. Nothing has changed in the import script or the types of data being loaded. Please let me know if you need any more information.
I looked up these jobs in the BigQuery logs; both of them appear to have succeeded. It is possible that the error you got was in reading the job state. I've filed an internal bug that we should distinguish between errors in the job and errors getting the state of the job in the bq tool.
After the job runs, you can use bq show -j <job_id> to see what the actual state of the job is. If it is still running, you can run bq wait <job_id>.
I also took a look at the front-end logs; all of the status requests for those job ids returned HTTP 200 (success) codes.
Can you add the --apilog=file.txt parameter to your bq command line (you'll need to add it to the beginning of the command line, as in bq --apilog=file.txt load ...) and send the output of a case where you get another failure? If you're worried about sensitive data, feel free to send it directly to me (tigani at google).
Thanks,
Jordan Tigani
Google BigQuery Engineer

Getting error from bq tool when uploading and importing data on BigQuery - 'Backend Error'

I'm getting the error "BigQuery error in load operation: Backend Error" when I try to upload and import data into BQ. I already reduced the file size and increased the time between imports, but nothing helps. The strange thing is that if I wait for a while and retry, it just works.
In the BigQuery browser tool it appears as an error in some line/field, but I checked and there is none. This is obviously a misleading message, because if I wait and retry uploading/importing the same file, it works.
Thanks
I looked up your failing jobs in the BigQuery backend, and I couldn't find any jobs that terminated with 'backend error'. I found several that failed because there were ASCII nulls found in the data (it can be helpful to look at the error stream errors, not just the error result). It is possible that the data got garbled on the way to BigQuery... are you certain the data did not change between the failing import and the successful one on the same data?
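If garbling in transit is suspected, a quick local check for ASCII NUL bytes before re-uploading might look like this (the file name is illustrative):

# Count and strip ASCII NUL bytes from a local file before loading it
with open("data.csv", "rb") as f:
    raw = f.read()
print(raw.count(b"\x00"), "NUL bytes found")
with open("data_clean.csv", "wb") as f:
    f.write(raw.replace(b"\x00", b""))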
I've found that exporting from a BigQuery table to CSV in Cloud Storage hits the same error when certain characters are present in one of the columns (in this case, a column storing the raw results from a prediction analysis). Removing that column from the export resolved the issue.