GoogleApiException: Google.Apis.Requests.RequestError Backend Error [500] when streaming to BigQuery - google-bigquery

I've been streaming data to BigQuery for the past year or so from a service in Azure written in C#, and recently I've started to get an increasing number of the following errors (most of the requests still succeed):
Message: [GoogleApiException: Google.Apis.Requests.RequestError An
internal error occurred and the request could not be completed. [500]
Errors [
Message[An internal error occurred and the request could not be completed.] Location[ - ] Reason[internalError] Domain[global] ] ]
This is the code I'm using in my service:
public async Task<TableDataInsertAllResponse> Update(List<TableDataInsertAllRequest.RowsData> rows, string tableSuffix)
{
    var request = new TableDataInsertAllRequest { Rows = rows, TemplateSuffix = tableSuffix };
    var insertRequest = mBigqueryService.Tabledata.InsertAll(request, ProjectId, mDatasetId, mTableId);
    return await insertRequest.ExecuteAsync();
}

Just like any other cloud service, BigQuery doesn't offer a 100% uptime SLA (it's actually 99.9%), so it's not uncommon to encounter transient errors like these. We also receive them frequently in our applications.
You need to build exponential backoff-and-retry logic into your application(s) to handle such errors. A good way of doing this is to use a queue to stream your data to BigQuery. This is what we do and it works very well for us.
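The question's code is C#, but as a language-neutral illustration of the backoff-and-retry pattern, here is a minimal sketch using the Python google-cloud-bigquery client; the table ID, delays, and helper name are placeholders of mine, not values from the question:
import random
import time

from google.api_core.exceptions import InternalServerError, ServiceUnavailable
from google.cloud import bigquery

client = bigquery.Client()

def insert_with_backoff(table_id, rows, max_attempts=5):
    """Stream rows into BigQuery, retrying transient 5xx errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            errors = client.insert_rows_json(table_id, rows)
            if errors:
                # Row-level problems (e.g. schema mismatches) come back in the
                # response instead of being raised; retrying rarely fixes them.
                raise RuntimeError("Row errors: %s" % errors)
            return
        except (InternalServerError, ServiceUnavailable):
            if attempt == max_attempts - 1:
                raise
            # Back off exponentially with jitter: ~1s, 2s, 4s, 8s, ...
            time.sleep((2 ** attempt) + random.random())

# Hypothetical usage; the table ID is a placeholder.
insert_with_backoff("my-project.my_dataset.my_table", [{"foo": "bar"}])
The same idea carries over to the C# client: catch the transient GoogleApiException, wait an exponentially increasing amount of time, and resend the request.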
Some more info:
https://cloud.google.com/bigquery/troubleshooting-errors
https://cloud.google.com/bigquery/loading-data-post-request#exp-backoff
https://cloud.google.com/bigquery/streaming-data-into-bigquery
https://cloud.google.com/bigquery/sla

Related

Load from GCS to GBQ causes an internal BigQuery error

My application creates thousands of "load jobs" daily to load data from Google Cloud Storage URIs to BigQuery, and only a few of them fail with the error:
"Finished with errors. Detail: An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 7916072"
The application is written in Python and uses these libraries:
google-cloud-storage==1.42.0
google-cloud-bigquery==2.24.1
google-api-python-client==2.37.0
The load job is started by calling:
load_job = self._client.load_table_from_uri(
    source_uris=source_uri,
    destination=destination,
    job_config=job_config,
)
This method has a default parameter:
retry: retries.Retry = DEFAULT_RETRY,
so the job should automatically be retried on such errors.
ID of a specific job that finished with the error:
"load_job_id": "6005ab89-9edf-4767-aaf1-6383af5e04b6"
"load_job_location": "US"
After getting the error, the application recreates the job, but that doesn't help.
Subsequent failed job ids:
5f43a466-14aa-48cc-a103-0cfb4e0188a2
43dc3943-4caa-4352-aa40-190a2f97d48d
43084fcd-9642-4516-8718-29b844e226b1
f25ba358-7b9d-455b-b5e5-9a498ab204f7
...
As mentioned in the error message, wait according to the back-off requirements described in the BigQuery Service Level Agreement, then try the operation again.
If the error continues to occur and you have a support plan, please create a new GCP support case. Otherwise, you can open a new issue on the issue tracker describing the problem. You can also try to reduce the frequency of this error by using Reservations.
For more information about the error messages you can refer to this document.
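As the answer says, a job that finishes with an internal error has to be recreated with back-off. For illustration, here is a minimal Python sketch of that loop around load_table_from_uri; the helper name, delays, and exception handling are assumptions, not part of the original code:
import random
import time

from google.api_core.exceptions import GoogleAPICallError
from google.cloud import bigquery

client = bigquery.Client()

def load_with_backoff(source_uri, destination, job_config, max_attempts=5):
    """Recreate a failed load job with exponential back-off, as the error message suggests."""
    for attempt in range(max_attempts):
        load_job = client.load_table_from_uri(
            source_uris=source_uri,
            destination=destination,
            job_config=job_config,
        )
        try:
            load_job.result()  # Waits for the job and raises if it finished with errors.
            return load_job
        except GoogleAPICallError:
            if attempt == max_attempts - 1:
                raise
            # Wait before recreating the job: ~10s, 20s, 40s, ... plus jitter.
            time.sleep((2 ** attempt) * 10 + random.random())
Note that the retry parameter on load_table_from_uri mainly covers the API call that creates the job; a job that completes with an internal error generally has to be recreated, which is what the loop above does.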

GCP - Message from PubSub to BigQuery

I need to get the data from my Pub/Sub message and insert it into BigQuery.
What I have:
const topicName = "-----topic-name-----";
const data = JSON.stringify({ foo: "bar" });

// Imports the Google Cloud client library
const { PubSub } = require("@google-cloud/pubsub");

// Creates a client; cache this for further use
const pubSubClient = new PubSub();

async function publishMessageWithCustomAttributes() {
  // Publishes the message as a string, e.g. "Hello, world!" or JSON.stringify(someObject)
  const dataBuffer = Buffer.from(data);

  // Add two custom attributes, origin and username, to the message
  const customAttributes = {
    origin: "nodejs-sample",
    username: "gcp",
  };

  const messageId = await pubSubClient
    .topic(topicName)
    .publish(dataBuffer, customAttributes);
  console.log(`Message ${messageId} published.`);
}

publishMessageWithCustomAttributes().catch(console.error);
I need to get the data/attributes from this message and insert them into BigQuery. Can anyone help me?
Thanks in advance!
There are two ways to consume the messages: either message by message, or in bulk.
First, before going into detail: because you will perform BigQuery calls (or Facebook API calls), you will spend a lot of the processing time waiting for the API response.
Message per Message
If you have an acceptable volume of messages, you can process them message by message. You have two options here:
You can handle each message with Cloud Functions. Set the minimal amount of memory for the function (128 MB) to limit the CPU cost and thus the overall cost. Because you will spend most of the time waiting, don't pay for expensive CPU to do nothing. The data will be processed more slowly, but it's a tradeoff.
Create a Cloud Function on the topic, or a push subscription that calls an HTTP-triggered Cloud Function.
You can also handle requests concurrently with Cloud Run. Cloud Run can handle up to 250 concurrent requests (in preview), and because you will wait a lot, it's perfectly suitable. If you need more CPU and memory, you can increase these values to 4 vCPUs and 8 GB of memory. It's my preferred solution.
In Bulk
Bulk processing is possible if you can easily manage multi-CPU, multi-(light)thread development. It's easy in Go. Concurrency in Node is also easy (async/await), but I don't know whether it's multi-CPU capable or single-CPU only. Anyway, the principle is the following:
Create a pull subscription on the Pub/Sub topic.
Create a Cloud Run service (better for multi-CPU, but App Engine or Cloud Functions also work) that listens to the pull subscription for a while (say, 10 minutes).
For each message pulled, run an async process: get the data/attributes, make the call to BigQuery, and ack the message.
After the pull connection times out, stop listening for messages, finish processing the current messages, and exit gracefully (return an HTTP 200 code).
Create a Cloud Scheduler job that calls the Cloud Run service every 10 minutes. Set the timeout to 15 minutes and discard retries.
Deploy the Cloud Run service with a timeout of 15 minutes.
This solution offers better message throughput (you can process more than 250 messages per Cloud Run instance), but it doesn't have a real advantage because you are limited by the API call latency.
EDIT 1
Code sample
// For a Pub/Sub-triggered function
exports.logMessageTopic = (message, context) => {
  console.log("Message Content");
  console.log(Buffer.from(message.data, 'base64').toString());
  console.log("Attribute list");
  for (let key in message.attributes) {
    console.log(key + " -> " + message.attributes[key]);
  }
};

// For a push subscription
exports.logMessagePush = (req, res) => {
  console.log("Message Content");
  console.log(Buffer.from(req.body.message.data, 'base64').toString());
  console.log("Attribute list");
  for (let key in req.body.message.attributes) {
    console.log(key + " -> " + req.body.message.attributes[key]);
  }
};
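The samples above only log the message. As an illustration of the remaining step (decode the Pub/Sub payload and stream it into BigQuery), here is a minimal Python sketch of a Pub/Sub-triggered Cloud Function; the table ID and column names are placeholders, not something from the question:
import base64
import json

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table; replace with your own project, dataset and table.
TABLE_ID = "my-project.my_dataset.my_table"

def pubsub_to_bigquery(event, context):
    """Background Cloud Function triggered by a Pub/Sub message."""
    # event["data"] is base64-encoded; attributes arrive as a plain dict.
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    attributes = event.get("attributes") or {}

    row = {
        "foo": payload.get("foo"),             # from the message body
        "origin": attributes.get("origin"),    # from the custom attributes
        "username": attributes.get("username"),
    }

    errors = client.insert_rows_json(TABLE_ID, [row])
    if errors:
        # Raising makes the function fail, so Pub/Sub can redeliver the message.
        raise RuntimeError("BigQuery insert errors: %s" % errors)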

Read Timed Out: synchronous query via BigQuery Java API

We are using the BigQuery Java API to retrieve results for our analytics reporting frontend. We are trying to retrieve the results synchronously. A lot of the time we get a "Read timed out" error, even before the query timeout specified in the parameters is reached. Here's the stack trace of a sample failure:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:331)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:830)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:787)
at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
at com.google.api.client.http.javanet.NetHttpResponse.<init>(NetHttpResponse.java:36)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:94)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:965)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:410)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:343)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:460)
I am not able to retrieve the job ID of the resulting job, as the error occurs before I can retrieve a JobReference object. The timeout specified in this case was 300 seconds, and the query failed well before that. The query contains three JOINs and several GROUP EACH BY clauses. Can you suggest a possible way to debug this?
Adding the code snippet:
QueryRequest queryInfo = new QueryRequest().setQuery(sql)
        .setTimeoutMs(timeOutInSec * 1000);
// get project id
BQGameConnectionDetails details = Config.getBQConnectionDetails(gameId);
String projectId = details.getProjectId();
Bigquery.Jobs.Query queryRequest = getInstance(gameId).jobs()
        .query(projectId, queryInfo);
QueryResponse response = queryRequest.execute();
There are two timeouts involved. The first is the timeout on the HTTP request you've sent to BigQuery. The second is the BigQuery request timeout. It sounds like you've set the latter to a large value, but the former is likely the timeout you're hitting. If the HTTP request times out before the BigQuery timeout, the connection will be closed and BigQuery won't have a chance to respond.
There are two options: the first is to increase the HTTP request timeout (which depends on the libraries you're using, but this page here may be helpful). The second is to decrease the BigQuery timeout. This means you'll have to use jobs.getQueryResults() to read the actual results, but it is a more robust method because it doesn't matter how long the query takes; you can just call getQueryResults() in a loop. I would post a link to a good Java sample that does this, but I don't know that one exists, unfortunately.
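The question uses the Java client, but as a sketch of the second option (submit the query as a job, then poll with short requests instead of holding one long HTTP request open), here is the pattern with the Python client; the query, table, and polling interval are placeholders:
import time

from google.cloud import bigquery

client = bigquery.Client()

# Submit the query as a job; this call returns quickly with a job reference.
query_job = client.query(
    "SELECT name, COUNT(*) AS c FROM `my-project.my_dataset.my_table` GROUP BY name"
)

# Poll the job state with short requests instead of holding one long-running
# HTTP request open, so no single connection has to survive the whole query.
while not query_job.done():
    time.sleep(5)

# The results are already computed at this point; fetching them is cheap.
for row in query_job.result():
    print(dict(row))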

Frequent 503 errors raised from BigQuery Streaming API

Streaming data into BigQuery keeps failing due to the following error, which has been occurring more frequently recently:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 503 Service Unavailable
{
"code" : 503,
"errors" : [ {
"domain" : "global",
"message" : "Connection error. Please try again.",
"reason" : "backendError"
} ],
"message" : "Connection error. Please try again."
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:312)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1049)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:410)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:343)
Relevant question references:
Getting high rate of 503 errors with BigQuery Streaming API
BigQuery - BackEnd error when loading from JAVA API
We (the BigQuery team) are looking into your report of increased connection errors. From internal monitoring, there hasn't been a global spike in connection errors in the last several days. However, that doesn't mean that your tables, specifically, weren't affected.
Connection errors can be tricky to chase down, because they can be caused by errors before requests get to the BigQuery servers or after they leave. The more information you can provide, the easier it is for us to diagnose the issue.
The best practice for streaming input is to handle temporary errors like this by retrying the request. It can be a little tricky, since when you get a connection error you don't actually know whether the insert succeeded. If you include a unique insertId with your data (see the documentation here), you can safely resend the request (within the deduplication window, which I think is 15 minutes) without worrying that the same row will get added multiple times.
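As a sketch of combining retries with the insertId-based deduplication described above (shown with the Python client, where row_ids carries the per-row insert IDs; the ID scheme, delays, and exception list are assumptions):
import random
import time
import uuid

from google.api_core.exceptions import InternalServerError, ServiceUnavailable
from google.cloud import bigquery

client = bigquery.Client()

def insert_with_dedup(table_id, rows, max_attempts=5):
    """Retry transient errors; stable per-row insert IDs let BigQuery drop resent duplicates."""
    # Generate the insert IDs once, outside the retry loop, so every retry
    # resends the same IDs and BigQuery can deduplicate them server-side.
    row_ids = [str(uuid.uuid4()) for _ in rows]
    for attempt in range(max_attempts):
        try:
            errors = client.insert_rows_json(table_id, rows, row_ids=row_ids)
            if errors:
                raise RuntimeError("Row errors: %s" % errors)
            return
        except (InternalServerError, ServiceUnavailable):
            if attempt == max_attempts - 1:
                raise
            time.sleep((2 ** attempt) + random.random())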

Exception in RavenDB when doing bulk inserts

I'm doing some bulk inserts with RavenDB, and at some random document I get an InvalidOperationException with an HTTP 403 (Forbidden) and the message "This single use token has expired".
The code is pretty straightforward:
using (var bulkInsert = store.BulkInsert("MyDatabase", new BulkInsertOptions { CheckForUpdates = true, BatchSize = 512 }))
{
    // Mapping stuff
    bulkInsert.Store(doc, productId);
}
I have tried experimenting with the batch size without any luck.
How do I fix it?
I'm using RavenDB 2.5 build 2700 and hosting RavenDB as a Windows service.
When running RavenDB in a console, I can see that it logs this message:
Error when using events transport. An operation was attempted on a
nonexistent network connection.