BigQuery Job Failing with 500 Internal Error - google-bigquery

This is really a question for Jordan Tigani and Google's BigQuery support, which recommends we use Stack Overflow:
I have a BigQuery job that has been executing daily for the past several months but has now started erroring out with an internal 500 error. One example is job id job_4J9LL4vp3xtM30WgqduvQqFFUN4 - would it be possible to know why this job is causing an internal BigQuery error and whether there's something I can do to fix it?

This is the same issue as "bigquery error: Unexpected. Please try again".
We've got a fix that we're currently testing, it should go out in this week's release. Unfortunately, there is no workaround, other than to use a table decorator that doesn't include the latest streaming data.
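For anyone hitting the same thing, a rough sketch of that decorator workaround using the Python client library (the project, dataset, and table names below are placeholders, and the query assumes legacy SQL, which is what table decorators require): a snapshot decorator such as @-600000 reads the table as it existed 10 minutes ago, so the most recent streamed rows are excluded.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Placeholder project/dataset/table; legacy SQL is required for decorators.
    # The snapshot decorator @-600000 reads the table as of 10 minutes
    # (600,000 ms) ago, so rows streamed since then are excluded.
    query = """
    SELECT COUNT(*) AS row_count
    FROM [my_project:my_dataset.my_table@-600000]
    """

    job_config = bigquery.QueryJobConfig(use_legacy_sql=True)
    for row in client.query(query, job_config=job_config).result():
        print(row.row_count)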

Related

BigQuery job history in UI has disappeared

Experienced a weird problem with the BigQuery UI this morning - for a specific project, all Job History, both Personal and Project, has disappeared. Load jobs from the last month still show up when I use the bq ls command.
Has anyone seen this before? Any advice? I've raised a call with the service desk but wondered what you guys think.
best wishes
Dave
It seems to be a bug in the BigQuery UI. There is a public issue reporting this scenario [1]; as a workaround, you can list all the jobs by using the CLI [2].
[1] https://issuetracker.google.com/118569383
[2] https://cloud.google.com/bigquery/docs/managing-jobs#listing_jobs_in_a_project
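If the bq CLI isn't handy, a similar listing can be sketched with the Python client library's list_jobs call (the project ID below is a placeholder):

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # placeholder project ID

    # List recent jobs across all users in the project, roughly what the
    # UI's job history would normally display.
    for job in client.list_jobs(all_users=True, max_results=50):
        print(job.job_id, job.job_type, job.state, job.created)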
Turns out there is a shared 1,000-entry query/job list maximum in the UI. As we had begun to run many hundreds of queries per day, this was 'pushing' older jobs out of the list. I've filed a feature request to have this limit increased.
best wishes
Dave

BigQuery python client library dropping data on insert_rows

I'm using the Python API to write to BigQuery -- I've had success previously, but I'm pretty novice with the BigQuery platform.
I recently updated a table schema to include some new nested records. After creating this new table, I'm seeing significant portions of data not making it to BigQuery.
However, some of the data is coming through. In a single write statement, my code will try to send through a handful of rows. Some of the rows make it and some do not, but no errors are being thrown from the BigQuery endpoint.
I have access to the Stackdriver logs for this project and there are no errors or warnings indicating that a write would have failed. I'm not streaming the data -- I'm using the BigQuery client library to call the API endpoint (I saw other answers describing issues with streaming data to a newly created table).
Has anyone else had issues with the BigQuery API? I haven't found any documentation about a delay in accessing the data (I found the opposite -- it's supposed to be near real-time, right?) and I'm not sure what's causing the issue at this point.
Any help or reference would be greatly appreciated.
Edit: Apparently the API is the streaming API -- missed on my part.
Edit 2: This issue is related. Though I've been writing to the table every 5 minutes for about 24 hours, I'm still seeing missing data. I'm curious whether writing to a BigQuery table within 10 minutes of its creation puts you in a permanent state of losing data, or whether it would be expected to catch everything after the initial 10 minutes from creation.
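One detail worth checking in this situation: insert_rows returns per-row errors rather than raising them, so silent rejections are easy to miss. A minimal sketch, with placeholder table and row values:

    from google.cloud import bigquery

    client = bigquery.Client()

    # Placeholder table ID and rows.
    table = client.get_table("my_project.my_dataset.my_table")
    rows = [
        {"name": "alpha", "value": 1},
        {"name": "beta", "value": 2},
    ]

    # insert_rows() does not raise on per-row failures; it returns a list of
    # error mappings, one per rejected row. An empty list means every row
    # was accepted by the streaming buffer.
    errors = client.insert_rows(table, rows)
    if errors:
        for err in errors:
            print("Row rejected:", err)
    else:
        print("All rows accepted.")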

BigQuery keeps on running queries; account deactivated

I've been trying out BigQuery for a week or so. I've linked BigQuery to one of our organization's Firebase projects. I used my free trial of Google Cloud Platform for this test. I didn't want to use BQ anymore, so I deactivated the link between Firebase and BQ. I also downgraded our account from Blaze back to Spark. I assumed that, with the deactivation of the Blaze subscription and the link between Firebase and BQ, no queries would be run in BQ. This was almost a week ago. However, that is not what happened; the last query ran yesterday. I am not sure if I will be billed for these queries.
How do I cancel all queries in BQ in the future? I can't seem to find a(n easy) way to do that.
Thanks in advance.
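For the "cancel all queries" part, a minimal sketch with the Python client library (the project ID is a placeholder); note that this only cancels jobs that are currently running, it does not disable whatever is scheduling new ones:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-firebase-project")  # placeholder

    # Find jobs that are still running (across all users) and ask BigQuery
    # to cancel them.
    for job in client.list_jobs(all_users=True, state_filter="running"):
        client.cancel_job(job.job_id, location=job.location)
        print("Requested cancellation of", job.job_id)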

BigQuery data load - Is it possible to collect error record?

I can see the option "Number of errors allowed" to ignore bad records when running a load job.
bq command line parameter: --max_bad_records
Web interface: the "Number of errors allowed" field.
Is there a way to collect the bad records that are rejected while executing the job?
There currently is not a way, but I agree with the other commenter that it would be a good feature request. Even exposing a sampling of bad records could be helpful.
As of today, this option is used only to skip bad records in the file and allow the load job to succeed.
Maybe they are collected on the Logging page in GCP. Did you look there?
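For completeness, the same knob is exposed in the Python client's load job configuration; a sketch with placeholder bucket, file, and table names (the rejected records themselves still aren't returned):

    from google.cloud import bigquery

    client = bigquery.Client()

    # Placeholder GCS URI and destination table.
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        max_bad_records=10,  # same knob as --max_bad_records / "Number of errors allowed"
    )
    load_job = client.load_table_from_uri(
        "gs://my-bucket/data.csv",
        "my_project.my_dataset.my_table",
        job_config=job_config,
    )
    load_job.result()  # waits for the load; raises if the job itself failed

    # The job reports how many rows made it in, but the skipped records
    # themselves are not exposed anywhere.
    print("Rows loaded:", load_job.output_rows)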

BigQuery Backend Error when exporting results to a table

The query used to run perfectly, but lately this error has been appearing: "Backend Error".
I know that my query is huge, and it takes about 300 seconds to execute. But I imagine this is a BigQuery bug, so I wonder why this error is happening.
The error started appearing when I was executing some other queries where I just wanted the results and did not export them.
So I started creating a table with the results, hoping that BigQuery would then be able to perform the query.
Here is an image that shows the error:
I looked up your job in the BigQuery job database, and it completed successfully after 160 seconds.
BigQuery queries are fundamentally asynchronous. That is, when you run a query, it runs as a named Job by the BigQuery service. Since the original call may time out, the usual best practice is to poll for completion using the jobs.getQueryResults() API. My guess is that this is the API call that actually failed.
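A sketch of that pattern with the Python client library (the query text and polling interval are illustrative): insert the query as a job, then poll the job for completion rather than relying on the initial call returning in time.

    import time

    from google.cloud import bigquery

    client = bigquery.Client()

    # Illustrative query; client.query() inserts the job and returns quickly.
    query_job = client.query("SELECT 1 AS x")

    # Poll the job ourselves instead of blocking on the first response.
    while True:
        query_job.reload()  # refreshes the job state from the service
        if query_job.state == "DONE":
            break
        time.sleep(5)

    if query_job.error_result:
        raise RuntimeError(query_job.error_result)

    for row in query_job.result():  # fetches the query results
        print(row.x)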
We had reports of an elevated number of Backend Errors yesterday and we're still investigating. However, these don't appear to be actual failed queries; instead, they are failures in getting the status of queries or getting their results, which should go away on retry.
How did you run the query? Did you use the BigQuery Web UI? If you are using the API, did you call the bigquery.jobs.insert() api or the bigquery.jobs.query() api?