BigQuery table generated from a gSheet is inaccessible through the JDBC driver - google-bigquery

I'm trying to connect BigQuery to a BI Tool (Cognos Analytics) through a JDBC driver.
It all works fine until I try to read a gSheet-generated table. Then I get the following error:
Data source adapter error: java.sql.SQLException: [Simba][BigQueryJDBCDriver](100033) Error getting job status.
[Simba][BigQueryJDBCDriver](100033) Error getting job status.
[Simba][BigQueryJDBCDriver](100033) Error getting job status.
400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "Error while reading table: <GSHEET-TABLE>, error message: Failed to read the spreadsheet. Error code: PERMISSION_DENIED",
    "reason" : "invalid"
  } ],
  "message" : "Error while reading table: <GSHEET-TABLE>, error message: Failed to read the spreadsheet. Error code: PERMISSION_DENIED",
  "status" : "INVALID_ARGUMENT"
}
My connection string is the following:
jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<MY-PROJECT-ID>;OAuthType=0;OAuthServiceAcctEmail=<CLIENT-EMAIL-FROM-JSON>;OAuthPvtKeyPath=<PATH-TO-JSON>;Timeout=60; RequestGoogleDriveScope=1;
The Service Account has full Admin ownership of the entire project.
Does somebody know how to access gSheet-generated tables through the JDBC driver?
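To rule the BI tool out, here is a minimal sketch of the same read done with the Python BigQuery client, using the same service-account JSON and explicitly requesting both the BigQuery and Drive scopes (the key path, dataset and table names are placeholders):

# Minimal sketch: query a Sheets-backed table with the same service account,
# requesting the Drive scope explicitly (key path and table name are placeholders).
from google.oauth2 import service_account
from google.cloud import bigquery

SCOPES = [
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/drive",  # needed for Sheets-backed external tables
]

credentials = service_account.Credentials.from_service_account_file(
    "/path/to/key.json", scopes=SCOPES
)
client = bigquery.Client(credentials=credentials, project=credentials.project_id)

# If this also fails with PERMISSION_DENIED, the service account cannot read
# the spreadsheet itself, independent of the JDBC driver.
rows = client.query("SELECT * FROM `my_dataset.gsheet_table` LIMIT 10").result()
for row in rows:
    print(dict(row.items()))

If this also fails with PERMISSION_DENIED, the problem lies between the service account and the spreadsheet itself rather than in the JDBC driver or Cognos.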

Related

BigQuery API call by BigQuerySinkConnector throws an error

I have a problem when executing a query on BigQuery and I end up with the following error:
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:568)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:326)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:228)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:184)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:234)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.wepay.kafka.connect.bigquery.exception.BigQueryConnectException: A write thread has failed with an unrecoverable error
Caused by: The job encountered an internal error during execution and was unable to complete successfully.
at com.wepay.kafka.connect.bigquery.write.batch.KCBQThreadPoolExecutor.lambda$maybeThrowEncounteredError$0(KCBQThreadPoolExecutor.java:101)
at java.base/java.util.Optional.ifPresent(Optional.java:183)
at com.wepay.kafka.connect.bigquery.write.batch.KCBQThreadPoolExecutor.maybeThrowEncounteredError(KCBQThreadPoolExecutor.java:100)
at com.wepay.kafka.connect.bigquery.BigQuerySinkTask.put(BigQuerySinkTask.java:236)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:546)
... 10 more
Caused by: com.google.cloud.bigquery.BigQueryException: The job encountered an internal error during execution and was unable to complete successfully.
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:113)
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getQueryResults(HttpBigQueryRpc.java:623)
at com.google.cloud.bigquery.BigQueryImpl$34.call(BigQueryImpl.java:1222)
at com.google.cloud.bigquery.BigQueryImpl$34.call(BigQueryImpl.java:1217)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.bigquery.BigQueryImpl.getQueryResults(BigQueryImpl.java:1216)
at com.google.cloud.bigquery.BigQueryImpl.getQueryResults(BigQueryImpl.java:1200)
at com.google.cloud.bigquery.Job$1.call(Job.java:332)
at com.google.cloud.bigquery.Job$1.call(Job.java:329)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.poll(RetryHelper.java:64)
at com.google.cloud.bigquery.Job.waitForQueryResults(Job.java:328)
at com.google.cloud.bigquery.Job.getQueryResults(Job.java:291)
at com.google.cloud.bigquery.BigQueryImpl.query(BigQueryImpl.java:1187)
at com.wepay.kafka.connect.bigquery.MergeQueries.mergeFlush(MergeQueries.java:158)
at com.wepay.kafka.connect.bigquery.MergeQueries.lambda$mergeFlush$1(MergeQueries.java:119)
... 3 more
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
GET https://www.googleapis.com/bigquery/v2/projects/car-project-prd/queries/3df89651-3567-4b28-88a8-0655a174574c?location=EU&maxResults=0&prettyPrint=false
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "The job encountered an internal error during execution and was unable to complete successfully.",
    "reason" : "jobInternalError"
  } ],
  "message" : "The job encountered an internal error during execution and was unable to complete successfully.",
  "status" : "INVALID_ARGUMENT"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:149)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:112)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:39)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:443)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1108)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:541)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:474)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:591)
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getQueryResults(HttpBigQueryRpc.java:621)
... 20 more
The context is as follows:
I use the Kafka BigQuerySinkConnector to update partitioned tables in a dataset. The connector works perfectly for 1, 2 or 3 days and then fails with the trace above.
Can you tell me if you have an idea of the reason for this error, or any leads I could follow to track down and solve the problem?
I tried to get information from Google support, but all I found was a generic page for the error.
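If it helps with debugging, the failing job id, project and location are all visible in the 400 response above, so the job can be looked up directly for more detail. A rough sketch with the Python BigQuery client (values taken from the trace; adjust as needed):

# Rough sketch: look up the failed query job from the trace and print its
# error details (job id, project and location taken from the 400 response above).
from google.cloud import bigquery

client = bigquery.Client(project="car-project-prd")

job = client.get_job(
    "3df89651-3567-4b28-88a8-0655a174574c",  # job id from the failing GET request
    location="EU",
)

print(job.state)
print(job.error_result)   # e.g. {'reason': 'jobInternalError', 'message': ...}
for err in job.errors or []:
    print(err)
print(job.query)          # the SQL issued by mergeFlush()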

Pydrill Heap Memory Issue while Executing Spark Submit Commands

Exception :
[MainThread ] [ERROR] : Message : TransportError(500, '{\n "errorMessage" : "RESOURCE ERROR: There is not enough heap memory to run this query using the web interface. \n\nPlease try a query with fewer columns or with a filter or limit condition to limit the data returned. \nYou can also try an ODBC/JDBC client. \n\n[Error Id: 8dc824fc-b1b5-4352-85fe-84e2eb5ff71d on
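The RESOURCE ERROR above comes from Drill's web interface, which pydrill talks to over REST, and the message itself suggests selecting fewer columns or adding a filter or limit. A minimal sketch of retrying through pydrill with a narrower projection and a LIMIT (host, port, columns and path are placeholders):

# Minimal sketch: re-run the query through pydrill with a projection and LIMIT,
# as suggested by the RESOURCE ERROR message (host/port/query are placeholders).
from pydrill.client import PyDrill

drill = PyDrill(host="localhost", port=8047)
if not drill.is_active():
    raise RuntimeError("Drill is not reachable on localhost:8047")

# Select only the needed columns and cap the row count instead of SELECT *.
result = drill.query("SELECT col_a, col_b FROM dfs.`/data/input.parquet` LIMIT 1000")
for row in result:
    print(row)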

Dataflow insert into BigQuery fails with large number of files for asia-northeast1 location

I am using the Cloud Storage Text to BigQuery template on Cloud Composer.
The template is launched from the Python Google API client (a sketch of the launch call is below).
The same program:
works fine in the US location (for both Dataflow and BigQuery),
fails in the asia-northeast1 location,
works fine with fewer (less than 10,000) input files in the asia-northeast1 location.
Does anybody have an idea about this?
I want to run in the asia-northeast1 location for business reasons.
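For reference, a sketch of how the template launch looks from the Python Google API client (project, bucket, job name and parameter values are placeholders and may differ from the actual setup):

# Rough sketch of launching the Cloud Storage Text to BigQuery template from the
# Python Google API client (project, bucket, job name and parameters are placeholders).
from googleapiclient.discovery import build

dataflow = build("dataflow", "v1b3")

request = dataflow.projects().locations().templates().launch(
    projectId="my-project",
    location="asia-northeast1",  # works when this is a US location
    gcsPath="gs://dataflow-templates/latest/GCS_Text_to_BigQuery",
    body={
        "jobName": "gcs-text-to-bq",
        "parameters": {
            "inputFilePattern": "gs://my-bucket/input/*.txt",
            "JSONPath": "gs://my-bucket/schema/schema.json",
            "javascriptTextTransformGcsPath": "gs://my-bucket/udf/transform.js",
            "javascriptTextTransformFunctionName": "transform",
            "outputTable": "my-project:my_dataset.my_table",
            "bigQueryLoadingTemporaryDirectory": "gs://my-bucket/tmp",
        },
        "environment": {"tempLocation": "gs://my-bucket/tmp"},
    },
)
response = request.execute()
print(response["job"]["id"])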
More details about the failure:
The program worked until "ReifyRenameInput" and then failed. The Dataflow job failed with the error message below:
java.io.IOException: Unable to insert job: beam_load_textiotobigquerydataflow0releaser0806214711ca282fc3_8fca2422ccd74649b984a625f246295c_2a18c21953c26c4d4da2f8f0850da0d2_00000-0, aborting after 9 .
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:231)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startJob(BigQueryServicesImpl.java:202)
at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$JobServiceImpl.startCopyJob(BigQueryServicesImpl.java:196)
at org.apache.beam.sdk.io.gcp.bigquery.WriteRename.copy(WriteRename.java:144)
at org.apache.beam.sdk.io.gcp.bigquery.WriteRename.writeRename(WriteRename.java:107)
at org.apache.beam.sdk.io.gcp.bigquery.WriteRename.processElement(WriteRename.java:80)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException:
404 Not Found
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Dataset pj:datasetname",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Dataset pj:datasetname"
}
(pj and datasetname are not the real names; they are the project and dataset names from the outputTable parameter)
Although the error message says the dataset is not found, the dataset definitely exists.
Moreover, some new tables, which seem to be temporary tables, were created in the dataset after the program ran.
This is a known issue related to your Beam SDK version according to this public issue tracker. The Beam 2.5.0 SDK version doesn't have this issue.

com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found

When using a Talend BigQuery input component (BigQuery Java API) to read from BigQuery, I get the following error (for a long-running job):
Exception in component tBigQueryInput_4
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Table rand-cap:_f000fcf374688fc5e7da50a4c0c04ba228d993c3.anon0849eba05949a62962f218a0433d6ee82bf13a7b",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Table rand-cap:_f000fcf374688fc5e7da50a4c0c04ba228d993c3.anon0849eba05949a62962f218a0433d6ee82bf13a7b"
}
Is this because the "temporary" table that BigQuery creates for query results is no longer available after 24 hours? Or is it because a rate limit was exceeded, since I am querying a large table?
In either case, how can I find more details on this error, and what steps should I take to prevent it?
Thank you!
This seems to be a problem in Talend; there are other users describing your issue: https://www.talendforge.org/forum/viewtopic.php?id=44734
Google BigQuery has a property, AllowLargeResults, but it is not available in tBigQueryInput.
Hi there - I am currently using Talend open studio v6.1.1 and this issue still exists.
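Outside of Talend, one way to avoid depending on the anonymous cached results table (the rand-cap:_f000...anon... table above), which only lives for roughly 24 hours, is to write the query results to an explicit destination table. A sketch with the Python client (project, dataset and table names are placeholders):

# Sketch: write query results to an explicit destination table instead of the
# anonymous cached table (project/dataset/table names are placeholders).
from google.cloud import bigquery

client = bigquery.Client(project="rand-cap")

job_config = bigquery.QueryJobConfig(
    destination="rand-cap.my_dataset.query_results",
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    # With legacy SQL, large result sets additionally need:
    # allow_large_results=True, use_legacy_sql=True,
)

job = client.query("SELECT * FROM `rand-cap.my_dataset.big_table`", job_config=job_config)
job.result()  # wait for completion; results now live in a normal, durable table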

BigQuery Load Job [invalid] Too many errors encountered

I'm trying to insert data into BigQuery using the BigQuery API C# SDK.
I created a new job with newline-delimited JSON data.
When I use:
100 lines of input: OK
250 lines of input: OK
500 lines of input: KO
2500 lines of input: KO
The error encountered is:
"status": {
"state": "DONE",
"errorResult": {
"reason": "invalid",
"message": "Too many errors encountered. Limit is: 0."
},
"errors": [
{
"reason": "internalError",
"location": "File: 0",
"message": "Unexpected. Please try again."
},
{
"reason": "invalid",
"message": "Too many errors encountered. Limit is: 0."
}
]
}
The file loads fine when I use the bq tool with the command:
bq load --source_format=NEWLINE_DELIMITED_JSON dataset.datatable pathToJsonFile
Something seems to be wrong on the server side, or maybe in how I transmit the file, but we cannot get any more detail than "internal server error".
Does anyone have more information on this?
Thank you!
"Unexpected. Please try again." could either indicate that the contents of the files you provided had unexpected characters, or it could mean that an unexpected internal server condition occurred. There are several questions which might help shed some light on this:
Does this consistently happen no matter how many times you retry?
Does this directly depend on the lines in the file, or can you construct a simple upload file which doesn't trigger the error condition?
One option to potentially avoid these problems is to send the load job request with configuration.load.maxBadRecords higher than zero.
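For example, with the Python client (the same setting is also available as --max_bad_records on bq load), a sketch looks like this; the dataset, table and file path are placeholders:

# Rough sketch: newline-delimited JSON load with maxBadRecords > 0
# (dataset, table and file path are placeholders).
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    max_bad_records=10,  # tolerate up to 10 bad rows instead of failing at "Limit is: 0"
)

with open("pathToJsonFile", "rb") as source_file:
    job = client.load_table_from_file(
        source_file, "my_dataset.my_table", job_config=job_config
    )

job.result()  # raises if the job still fails; job.errors lists per-row problems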
Feel free to comment with more info and I can maybe update this answer.