Datastream Troubleshoot: "An unknown error occurred. Please try again. If the error persists, contact Google support" - google-bigquery

We are trying to replicate data from AlloyDB to Bigquery using Datastream.
We Get "An unknown error occurred. Please try again. If the error persists, contact Google support."
In the Datastream console --> objects list, we see all source tables with Object Status "Failed" and Backfill status "Completed".
In Bigquery we see only a subset of the tables (not all the "Completed" objects were synced).
In the Logs Explorer I can see this error on BQ:
I also see this error: error: {
code: 11
message: "Unsupported primary key column either does not exist or is a pseudocolumn at [1:401]"
}
The column referred in the error is of type enum.
The desired situation is having all the AlloyDB tables replicated into Bigquery.
The error message is not very informative...
What does it mean?
What would be the best way to go about troubleshooting this?

We're actively working on making these error messages be more informative, and improvements are continuously being rolled out as we identify more edge cases. Assuming you followed all the steps in the documentation, then you may need to open a ticket with support for further investigation. If a support ticket isn't an option, you can still report the issue using the public issue tracker

I just had this same issue but connecting to a PostgreSQL in AWS RDS:
Beginning with Postgres 10, passwords are encrypted using SCRAM-SHA-256 in PostgreSQL. Google DataStream still expects MD5 password encryption, or it will generate an "unknown error" in the logs and fail the backfills.
You'll need to update your postgresql.conf (or RDS Cluster Parameter Group if you're using AWS like me):
password_encryption = 'MD5'
Restart the database and make sure the parameter has changed with:
SHOW password_encryption;
Reset the password of your users:
ALTER USER "{username}" with password '{password}';
More info from the PostgreSQL docs: https://www.postgresql.org/docs/current/auth-password.html

Related

Load from GCS to GBQ causes an internal BigQuery error

My application creates thousands of "load jobs" daily to load data from Google Cloud Storage URIs to BigQuery and only a few cases causing the error:
"Finished with errors. Detail: An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 7916072"
The application is written on Python and uses libraries:
google-cloud-storage==1.42.0
google-cloud-bigquery==2.24.1
google-api-python-client==2.37.0
Load job is done by calling
load_job = self._client.load_table_from_uri(
source_uris=source_uri,
destination=destination,
job_config=job_config,
)
this method has a default param:
retry: retries.Retry = DEFAULT_RETRY,
so the job should automatically retry on such errors.
Id of specific job that finished with error:
"load_job_id": "6005ab89-9edf-4767-aaf1-6383af5e04b6"
"load_job_location": "US"
after getting the error the application recreates the job, but it doesn't help.
Subsequent failed job ids:
5f43a466-14aa-48cc-a103-0cfb4e0188a2
43dc3943-4caa-4352-aa40-190a2f97d48d
43084fcd-9642-4516-8718-29b844e226b1
f25ba358-7b9d-455b-b5e5-9a498ab204f7
...
As mentioned in the error message, Wait according to the back-off requirements described in the BigQuery Service Level Agreement, then try the operation again.
If the error continues to occur, if you have a support plan please create a new GCP support case. Otherwise, you can open a new issue on the issue tracker describing your issue. You can also try to reduce the frequency of this error by using Reservations.
For more information about the error messages you can refer to this document.

LeaseAlreadyPresent Error in Azure Data Factory V2

I am getting the following error in a pipeline that has Copy activity with Rest API as source and Azure Data Lake Storage Gen 2 as Sink.
"message": "Failure happened on 'Sink' side. ErrorCode=AdlsGen2OperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ADLS Gen2 operation failed for: Operation returned an invalid status code 'Conflict'. Account: '{Storage Account Name}'. FileSystem: '{Container Name}'. Path: 'foodics_v2/Burgerizzr/transactional/_567a2g7a/2018-02-09/raw/inventory-transactions.json'. ErrorCode: 'LeaseAlreadyPresent'. Message: 'There is already a lease present.'. RequestId: 'd27f1a3d-d01f-0003-28fb-400303000000'..,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.Azure.Storage.Data.Models.ErrorSchemaException,Message=Operation returned an invalid status code 'Conflict',Source=Microsoft.DataTransfer.ClientLibrary,'",
The pipeline runs in a for loop with Batch size = 5. When I make it sequential, the error goes away, but I need to run it in parallel.
This is known issue with adf limitation variable thread parallel running.
You probably trying to rename filename using variable.
Your option is to run another child looping after each variable execution.
i.e. variable -> Execute Pipeline
enter image description here
or
remove those variable, hard coded those variable expression in azure activity.
enter image description here
Hope this helps

Runtime error : DBSQL_SQL_ERROR exception : CX_SY_OPEN_SQL_DB, where to focus?

More info from stms :
Short Text
SQL error "SQL code: -10108" occurred while accessing table "CDHDR".
What happened?
Database error text: "SQL message: Session has been reconnected."
Return value of the database layer: "SQL dbsl rc: 99"
The problem is that client is running this code through the job in which I am not available to debug it. All I know is that it is working for other countries selected but for some specific it is not so the question is :
Was it a temporary connection error between two servers or is it data overload issue due to SELECT statement?
At the moment I am not really sure which thing I have to look on the system, the same program once produced the error for exceeding the limit in the db query ( > 10 minutes) so this might be related to system configuration or what?
Thanks in advance

Getting exception while reading data from blob in azure

While I am trying to read the list of blob data on azure, I am getting the following error:
Function evaluation disabled because a previous function evaluation timed out. You must continue execution to reenable function evaluation.
How to resolve this?
Please see the following link. Your code likely has a endless loop. https://msdn.microsoft.com/en-us/library/ms234762.aspx

updateblob fails in Powerbuilder

In Powerbuilder I am trying to update a table (Oracle) with blob but get sqlerror, "Database statement must refer to blob variable". My declaration and updateblob statements are as follows:
blob lblob_newxml
long llong_subid
UPDATEBLOB RP_XML_FORMS SET XML_DOC = :lblob_newxml
WHERE SUBMISSION_ID = :llong_subid
USING SQLCA;
Does anybody know why it is happening and or how to solve this problem? Thanks.
To get more information on this problem and the possible causes, I'd run with one of the database traces turned on. (You can check out database trace options in the Connecting to Your Database manual; link may not be appropriate for your PB version, which you haven't mentioned yet.) This may or may not tell you more, but it tracks everything between the app and when the PB drivers pass the commands "over the wall" to the database's driver.
Good luck,
Terry.
"The PowerBuilder VM can get the SQL syntax for the following types of errors, and passes it to the Transaction object’s DBError event for the following types of errors: ..." (see this page).
If your lblob_newxml is null then use this update statement instead:
UPDATE RP_XML_FORMS SET XML_DOC = NULL
WHERE SUBMISSION_ID = :llong_subid
USING SQLCA;