Frequent 503 errors raised from BigQuery Streaming API - google-bigquery

Streaming data into BigQuery keeps failing with the following error, which has been occurring more frequently recently:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 503 Service Unavailable
{
"code" : 503,
"errors" : [ {
"domain" : "global",
"message" : "Connection error. Please try again.",
"reason" : "backendError"
} ],
"message" : "Connection error. Please try again."
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:312)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1049)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:410)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:343)
Relevant question references:
Getting high rate of 503 errors with BigQuery Streaming API
BigQuery - BackEnd error when loading from JAVA API

We (the BigQuery team) are looking into your report of increased connection errors. From our internal monitoring, there hasn't been a global spike in connection errors in the last several days. However, that doesn't mean that your tables, specifically, weren't affected.
Connection errors can be tricky to chase down, because they can be caused by problems before the requests reach the BigQuery servers or after the responses leave them. The more information you can provide, the easier it is for us to diagnose the issue.
The best practice for streaming ingestion is to handle temporary errors like this by retrying the request. It can be a little tricky, since when you get a connection error you don't actually know whether the insert succeeded. If you include a unique insertId with your data (see the documentation here), you can safely resend the request (within the deduplication window, which I think is 15 minutes) without worrying that the same row will get added multiple times.
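As an illustration, here is a minimal sketch of that retry-with-insertId pattern using the Python client (the question above uses the Java client, but the idea is the same); the table name, rows, and retry limits are placeholders:

import time
import uuid

from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # placeholder table

rows = [{"name": "alice", "value": 42}]      # rows matching the table schema
# One stable insertId per row: resending the same batch inside the
# deduplication window will not create duplicate rows.
row_ids = [str(uuid.uuid4()) for _ in rows]

for attempt in range(5):
    try:
        errors = client.insert_rows_json(table_id, rows, row_ids=row_ids)
        if not errors:
            break
    except Exception:
        pass  # e.g. a 503 connection error; safe to resend with the same row_ids
    time.sleep(2 ** attempt)  # exponential backoff before resending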

Related

Load from GCS to GBQ causes an internal BigQuery error

My application creates thousands of load jobs daily to load data from Google Cloud Storage URIs into BigQuery, and only a few of them fail with the error:
"Finished with errors. Detail: An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 7916072"
The application is written in Python and uses these libraries:
google-cloud-storage==1.42.0
google-cloud-bigquery==2.24.1
google-api-python-client==2.37.0
The load job is created by calling:
load_job = self._client.load_table_from_uri(
    source_uris=source_uri,
    destination=destination,
    job_config=job_config,
)
This method has a default parameter:
retry: retries.Retry = DEFAULT_RETRY,
so the job should automatically retry on such errors.
ID of a specific job that finished with the error:
"load_job_id": "6005ab89-9edf-4767-aaf1-6383af5e04b6"
"load_job_location": "US"
After getting the error, the application recreates the job, but that doesn't help.
Subsequent failed job ids:
5f43a466-14aa-48cc-a103-0cfb4e0188a2
43dc3943-4caa-4352-aa40-190a2f97d48d
43084fcd-9642-4516-8718-29b844e226b1
f25ba358-7b9d-455b-b5e5-9a498ab204f7
...
As mentioned in the error message, wait according to the back-off requirements described in the BigQuery Service Level Agreement, then try the operation again.
If the error continues to occur and you have a support plan, please create a new GCP support case. Otherwise, open a new issue on the issue tracker describing the problem. You can also try to reduce the frequency of this error by using Reservations.
For more information about the error messages you can refer to this document.
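As a rough sketch (not the questioner's code), re-submitting the load job with exponential back-off could look like this with the Python client; the URI, destination table, and back-off parameters are placeholders. Note that, as far as I know, the retry parameter on load_table_from_uri retries the API call that creates the job, not a job that later finishes with an internal error, so the job itself has to be re-created:

import time

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder inputs; substitute your own URI, destination table, and job config.
source_uri = "gs://my-bucket/data/*.avro"
destination = "my-project.my_dataset.my_table"
job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.AVRO)

max_attempts = 5
for attempt in range(max_attempts):
    load_job = client.load_table_from_uri(
        source_uris=source_uri,
        destination=destination,
        job_config=job_config,
    )
    try:
        load_job.result()  # waits for the job and raises if it finished with errors
        break
    except Exception:
        if attempt == max_attempts - 1:
            raise
        time.sleep((2 ** attempt) * 10)  # back off before re-creating the job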

Why does the Google Drive API keep getting userRateLimitExceeded errors?

I'm building a service that mainly uses the Google Drive API. It shares files by adding and removing share permissions on multiple files in a personal drive. But certain users' accounts keep hitting an error while creating and inserting a Google Drive permission. I tried all the solutions suggested by Google Drive support: clear the cache, delete cookies, use another browser, wait 24 hours before trying again, etc. I've tried everything, but it's not working and the error rate of the service is increasing. Who can tell me how to solve this problem?
error response :
{
"code" : 403,
"errors" : [ {
"domain" : "usageLimits",
"location" : "user",
"locationType" : "other",
"message" : "User rate limit exceeded",
"reason" : "userRateLimitExceeded"
} ],
"message" : "User rate limit exceeded"
}
User rate limit exceeded
This error is flood protection; see Resolve a 403 error: User rate limit exceeded.
The solution is to slow down, implement exponential backoff, and retry the request.
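For illustration, a small sketch of that backoff-and-retry loop around the permission call, using the Python client library (the file ID, permission body, and retry limits are placeholders; a stricter version would also inspect the error reason):

import random
import time

from googleapiclient.errors import HttpError

def create_permission_with_backoff(drive_service, file_id, permission, max_retries=6):
    # Insert a Drive permission, backing off on rate-limit errors (403/429).
    for attempt in range(max_retries):
        try:
            return drive_service.permissions().create(
                fileId=file_id,
                body=permission,
            ).execute()
        except HttpError as err:
            if err.resp.status not in (403, 429):
                raise  # not a rate-limit problem, so don't retry
            time.sleep((2 ** attempt) + random.random())  # exponential backoff with jitter
    raise RuntimeError("Gave up after repeated rate-limit errors")

# Example usage (placeholders):
# permission = {"type": "user", "role": "reader", "emailAddress": "someone@example.com"}
# create_permission_with_backoff(drive_service, "FILE_ID", permission)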

Google Dataflow stalled after BigQuery outage

I have a Google Dataflow job running. The Dataflow job reads messages from Pub/Sub, enriches them, and writes the enriched data into BigQuery.
Dataflow was processing approximately 5000 messages per second, and I am using 20 workers to run the job.
Yesterday it seems there was a BigQuery outage, so the part that writes data into BigQuery failed. After some time, my Dataflow job stopped working.
I see 1000 errors like the one below:
(7dd47a65ad656a43): Exception: java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
"code" : 400,
"errors" : [ {
"domain" : "global",
"message" : "The project xx-xxxxxx-xxxxxx has not enabled BigQuery.",
"reason" : "invalid"
} ],
"message" : "The project xx-xxxxxx-xxxxxx has not enabled BigQuery.",
"status" : "INVALID_ARGUMENT"
}
com.google.cloud.dataflow.sdk.util.BigQueryTableInserter.insertAll(BigQueryTableInserter.java:285)
com.google.cloud.dataflow.sdk.util.BigQueryTableInserter.insertAll(BigQueryTableInserter.java:175)
com.google.cloud.dataflow.sdk.io.BigQueryIO$StreamingWriteFn.flushRows(BigQueryIO.java:2728)
com.google.cloud.dataflow.sdk.io.BigQueryIO$StreamingWriteFn.finishBundle(BigQueryIO.java:2685)
com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.finishBundle(DoFnRunnerBase.java:159)
com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.finishBundle(SimpleParDoFn.java:194)
com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.finishBundle(ForwardingParDoFn.java:47)
com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.finish(ParDoOperation.java:65)
com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
com.google.cloud.dataflow.sdk.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:719)
Stack trace truncated. Please see Cloud Logging for the entire trace.
Please note that the Dataflow job did not recover even after BigQuery started working again; I had to restart the Dataflow job to make it work.
This causes data loss, not only at the time of the outage but also until I notice the error and restart the Dataflow job. Is there a way to configure the retry options so that the Dataflow job does not stall in these cases?

GoogleApiException: Google.Apis.Requests.RequestError Backend Error [500] when streaming to BigQuery

I've been streaming data to BigQuery for the past year or so from a service in Azure written in C#, and recently started to get an increasing number of the following errors (most of the requests succeed):
Message: [GoogleApiException: Google.Apis.Requests.RequestError
An internal error occurred and the request could not be completed. [500]
Errors [
    Message[An internal error occurred and the request could not be completed.] Location[ - ] Reason[internalError] Domain[global]
] ]
This is the code I'm using in my service:
public async Task<TableDataInsertAllResponse> Update(List<TableDataInsertAllRequest.RowsData> rows, string tableSuffix)
{
    var request = new TableDataInsertAllRequest { Rows = rows, TemplateSuffix = tableSuffix };
    var insertRequest = mBigqueryService.Tabledata.InsertAll(request, ProjectId, mDatasetId, mTableId);
    return await insertRequest.ExecuteAsync();
}
Just like any other cloud service, BigQuery doesn't offer a 100% uptime SLA (it's actually 99.9%), so it's not uncommon to encounter transient errors like these. We also receive them frequently in our applications.
You need to build exponential backoff-and-retry logic into your application(s) to handle such errors. A good way of doing this is to use a queue to stream your data to BigQuery. This is what we do and it works very well for us.
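As a rough sketch of that queue-plus-backoff approach (in Python rather than C#, with placeholder table and queue names), a worker can drain an in-process queue and flush batches with retries, so a failed insert never blocks the producers:

import queue
import time
import uuid

from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # placeholder
row_queue = queue.Queue()                    # producers put row dicts here

def flush_with_backoff(batch, row_ids, max_attempts=5):
    # Stream one batch, retrying transient 5xx errors with exponential backoff.
    for attempt in range(max_attempts):
        try:
            if not client.insert_rows_json(table_id, batch, row_ids=row_ids):
                return True
        except Exception:
            pass  # e.g. 500/503 backend errors
        time.sleep(2 ** attempt)
    return False  # caller should re-queue or dead-letter the batch, not drop it

def worker():
    while True:
        batch, row_ids = [row_queue.get()], [str(uuid.uuid4())]
        while not row_queue.empty() and len(batch) < 500:
            batch.append(row_queue.get())
            row_ids.append(str(uuid.uuid4()))
        flush_with_backoff(batch, row_ids)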
Some more info:
https://cloud.google.com/bigquery/troubleshooting-errors
https://cloud.google.com/bigquery/loading-data-post-request#exp-backoff
https://cloud.google.com/bigquery/streaming-data-into-bigquery
https://cloud.google.com/bigquery/sla

com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found

When using a Talend BigQuery input component (BQ Java API) to read from BigQuery, I get the following error (for a long-running job):
Exception in component tBigQueryInput_4
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "Not found: Table rand-cap:_f000fcf374688fc5e7da50a4c0c04ba228d993c3.anon0849eba05949a62962f218a0433d6ee82bf13a7b",
"reason" : "notFound"
} ],
"message" : "Not found: Table rand-cap:_f000fcf374688fc5e7da50a4c0c04ba228d993c3.anon0849eba05949a62962f218a0433d6ee82bf13a7b"
}
Is this because the "temporary" table that BigQuery creates for query results is no longer available after 24 hours? Or is it because a rate limit was exceeded, since I am querying a large table?
In either case, how can I find more details on this error, and what steps should I take to prevent it?
Thank you!
This seems to be a problem in Talend; there are other users describing your issue: https://www.talendforge.org/forum/viewtopic.php?id=44734
Google BigQuery has an AllowLargeResults property, but it is not available in tBigQueryInput.
Hi there - I am currently using Talend Open Studio v6.1.1 and this issue still exists.
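For reference, this is roughly what the AllowLargeResults behaviour looks like outside Talend, using the BigQuery Python client to write query results into a permanent destination table instead of the anonymous cached-results table that expires after about 24 hours (project, dataset, and query names are placeholders):

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder destination; a permanent table does not disappear the way the
# anonymous cached-results table does (unless you set an expiration on it).
destination = "my-project.my_dataset.query_results"

job_config = bigquery.QueryJobConfig(
    destination=destination,
    # allow_large_results applies to legacy SQL; with standard SQL an explicit
    # destination table is enough to allow large results.
    allow_large_results=True,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

query_job = client.query(
    "SELECT * FROM `my-project.my_dataset.big_table`",
    job_config=job_config,
)
query_job.result()  # results land in `destination`, not in an anonymous table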