Update SurrealDB data error, output info: Failed to resolve lock - locking

I run the SurrealDB server with Docker. The start command is "docker run --rm -p 8000:8000 docker.io/surrealdb/surrealdb:latest start --log trace --user root --pass root tikv://10.206.0.9:2379", which specifies the TiKV cluster endpoint as the backing data store.
1. I run "UPDATE account SET username.first = "Tobie",username.last = "Morgan Hitchcock" where id="account:02vpvzway97jn2wj23kq" ". Updating one record works and returns:
[
{
"time": "222.085746ms",
"status": "OK",
"result": [
{
"age": 18,
"datatime": "2022-10-05T03:20:57.781059411Z",
"id": "account:02vpvzway97jn2wj23kq",
"name": "SurrealDB99",
"password": "b50339a10e1de285ac99d4c3990b8693",
"username": {
"first": "Tobie",
"last": "Morgan Hitchcock"
}
}
]
}
]
But updating multiple records returns ERR.
E.g. "UPDATE account SET password=crypto::md5(id) RETURN NONE;" returns:
[
{
"time": "1.035028887s",
"status": "ERR",
"detail": "There was a problem with a datastore transaction: Failed to resolve lock"
}
]

BigQuery Execute fails with no meaningful error on Cloud Data Fusion

I'm trying to use the BigQuery Execute function in Cloud Data Fusion (Google). The component validates fine, the SQL checks out but I get this non-meaningful error with every execution:
02/11/2022 12:51:25 ERROR Pipeline 'test-bq-execute' failed.
02/11/2022 12:51:25 ERROR Workflow service 'workflow.default.test-bq-execute.DataPipelineWorkflow.<guid>' failed.
02/11/2022 12:51:25 ERROR Program DataPipelineWorkflow execution failed.
I can see nothing else to help me debug this. Any ideas? The SQL in question is a simple DELETE from dataset.table WHERE ds = CURRENT_DATE()
This was the pipeline
{
"name": "test-bq-execute",
"description": "Data Pipeline Application",
"artifact": {
"name": "cdap-data-pipeline",
"version": "6.5.1",
"scope": "SYSTEM"
},
"config": {
"resources": {
"memoryMB": 2048,
"virtualCores": 1
},
"driverResources": {
"memoryMB": 2048,
"virtualCores": 1
},
"connections": [],
"comments": [],
"postActions": [],
"properties": {},
"processTimingEnabled": true,
"stageLoggingEnabled": false,
"stages": [
{
"name": "BigQuery Execute",
"plugin": {
"name": "BigQueryExecute",
"type": "action",
"label": "BigQuery Execute",
"artifact": {
"name": "google-cloud",
"version": "0.18.1",
"scope": "SYSTEM"
},
"properties": {
"project": "auto-detect",
"sql": "DELETE FROM GCPQuickStart.account WHERE ds = CURRENT_DATE()",
"dialect": "standard",
"mode": "batch",
"dataset": "GCPQuickStart",
"table": "account",
"useCache": "false",
"location": "US",
"rowAsArguments": "false",
"serviceAccountType": "filePath",
"serviceFilePath": "auto-detect"
}
},
"outputSchema": [
{
"name": "etlSchemaBody",
"schema": ""
}
],
"id": "BigQuery-Execute",
"type": "action",
"label": "BigQuery Execute",
"icon": "fa-plug"
}
],
"schedule": "0 1 */1 * *",
"engine": "spark",
"numOfRecordsPreview": 100,
"maxConcurrentRuns": 1
}
}
I was able to catch the error using Cloud Logging. You can enable Cloud Logging for Cloud Data Fusion by following the GCP documentation, and then view the Data Fusion logs in Cloud Logging. Replicating your scenario, this is the error I found:
"logMessage": "Program DataPipelineWorkflow execution failed.\njava.util.concurrent.ExecutionException: com.google.cloud.bigquery.BigQueryException: Cannot set destination table in jobs with DML statements\n at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)\n at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)\n at io.cdap.cdap.internal.app.runtime.distributed.AbstractProgramTwillRunnable.run(AbstractProgramTwillRunnable.java:274)\n at org.apache.twill.interna..."
}
To resolve the error "Cannot set destination table in jobs with DML statements", we left the Dataset Name and Table Name empty in the pipeline properties, as there is no need to specify a destination table for a DML statement.
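As a rough illustration of that fix applied to the exported pipeline JSON, here is a minimal Python sketch. The helper names (`is_dml`, `clear_destination_for_dml`) and the keyword list are hypothetical, and removing the "dataset" and "table" keys is only an assumption about what leaving those fields empty in the UI amounts to.

```python
import copy

# Assumption: statements starting with one of these keywords count as DML.
DML_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "MERGE")


def is_dml(sql):
    """Return True if the SQL text starts with a DML keyword."""
    words = sql.split(None, 1)
    return bool(words) and words[0].upper() in DML_KEYWORDS


def clear_destination_for_dml(pipeline):
    """Return a copy of the pipeline in which BigQueryExecute stages that run
    a DML statement no longer carry 'dataset' and 'table' properties."""
    cleaned = copy.deepcopy(pipeline)
    for stage in cleaned.get("config", {}).get("stages", []):
        plugin = stage.get("plugin", {})
        if plugin.get("name") != "BigQueryExecute":
            continue
        props = plugin.get("properties", {})
        if is_dml(props.get("sql", "")):
            # Hypothetical equivalent of leaving both fields empty in the UI.
            props.pop("dataset", None)
            props.pop("table", None)
    return cleaned


# Trimmed-down version of the pipeline from the question.
pipeline = {
    "name": "test-bq-execute",
    "config": {
        "stages": [
            {
                "name": "BigQuery Execute",
                "plugin": {
                    "name": "BigQueryExecute",
                    "properties": {
                        "sql": "DELETE FROM GCPQuickStart.account WHERE ds = CURRENT_DATE()",
                        "dataset": "GCPQuickStart",
                        "table": "account",
                        "location": "US",
                    },
                },
            }
        ]
    },
}

fixed = clear_destination_for_dml(pipeline)
print(sorted(fixed["config"]["stages"][0]["plugin"]["properties"]))
```

The function works on a copy, so the original export stays available for comparison.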

Set Subnet ID and EC2 Key Name in EMR Cluster Config via Step Functions

As of November 2019, AWS Step Functions has native support for orchestrating EMR clusters, so we are trying to configure a cluster and run some jobs on it.
We could not find any documentation on how to set the SubnetId or the key name used for the EC2 instances in the cluster. Is this possible?
Our create-cluster step currently looks as follows:
"States": {
"Create an EMR cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "TestCluster",
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.26.0",
"Applications": [
{ "Name": "spark" }
],
"ServiceRole": "SomeRole",
"JobFlowRole": "SomeInstanceProfile",
"LogUri": "s3://some-logs-bucket/logs",
"Instances": {
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"Name": "MasterFleet",
"InstanceFleetType": "MASTER",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge"
}
]
},
{
"Name": "CoreFleet",
"InstanceFleetType": "CORE",
"TargetSpotCapacity": 2,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 100 }
]
}
]
}
},
"ResultPath": "$.cluster",
"End": "true"
}
}
As soon as we try to add a "SubnetId" key to any of the subobjects in Parameters, or to Parameters itself, we get the error:
Invalid State Machine Definition: 'SCHEMA_VALIDATION_FAILED: The field "SubnetId" is not supported by Step Functions at /States/Create an EMR cluster/Parameters' (Service: AWSStepFunctions; Status Code: 400; Error Code: InvalidDefinition;
Referring to the Step Functions docs on the EMR integration, we can see that createCluster.sync uses the EMR API RunJobFlow. In RunJobFlow we can specify Ec2KeyName and Ec2SubnetId, located at the paths $.Instances.Ec2KeyName and $.Instances.Ec2SubnetId.
With that said, I managed to create a state machine with the following definition (on a side note, your definition had a syntax error: "End": "true" should be "End": true):
{
"Comment": "A Hello World example of the Amazon States Language using Pass states",
"StartAt": "Create an EMR cluster",
"States": {
"Create an EMR cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "TestCluster",
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.26.0",
"Applications": [
{
"Name": "spark"
}
],
"ServiceRole": "SomeRole",
"JobFlowRole": "SomeInstanceProfile",
"LogUri": "s3://some-logs-bucket/logs",
"Instances": {
"Ec2KeyName": "ENTER_EC2KEYNAME_HERE",
"Ec2SubnetId": "ENTER_EC2SUBNETID_HERE",
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"Name": "MasterFleet",
"InstanceFleetType": "MASTER",
"TargetOnDemandCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge"
}
]
},
{
"Name": "CoreFleet",
"InstanceFleetType": "CORE",
"TargetSpotCapacity": 2,
"InstanceTypeConfigs": [
{
"InstanceType": "m3.2xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 100
}
]
}
]
}
},
"ResultPath": "$.cluster",
"End": true
}
}
}
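If the definition is generated rather than written by hand, the same two changes can be applied programmatically. The following is a minimal Python sketch, not part of any AWS SDK: the helper names `add_ec2_settings` and `non_boolean_end_states` are hypothetical, and the key pair name and subnet ID are placeholder values.

```python
import json


def add_ec2_settings(definition, state_name, key_name, subnet_id):
    """Set Ec2KeyName and Ec2SubnetId under Parameters.Instances of one state."""
    state = definition["States"][state_name]
    instances = state["Parameters"].setdefault("Instances", {})
    instances["Ec2KeyName"] = key_name
    instances["Ec2SubnetId"] = subnet_id
    return definition


def non_boolean_end_states(definition):
    """Return the names of states whose "End" field is not a JSON boolean."""
    return [
        name
        for name, state in definition["States"].items()
        if "End" in state and not isinstance(state["End"], bool)
    ]


# Trimmed-down version of the definition from the question.
definition = {
    "StartAt": "Create an EMR cluster",
    "States": {
        "Create an EMR cluster": {
            "Type": "Task",
            "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
            "Parameters": {
                "Name": "TestCluster",
                "Instances": {"KeepJobFlowAliveWhenNoSteps": True},
            },
            "ResultPath": "$.cluster",
            "End": "true",  # the string form from the question
        }
    },
}

bad_states = non_boolean_end_states(definition)
for name in bad_states:
    definition["States"][name]["End"] = True  # replace the string with a boolean

# Placeholder values; substitute your own key pair name and subnet ID.
add_ec2_settings(definition, "Create an EMR cluster",
                 "my-key-pair", "subnet-0123456789abcdef0")
print(json.dumps(definition, indent=2))
```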

GraphJSON serialization in Gremlin.Net

I'm trying to query the TinkerPop server (hosted inside a Docker container) via the CosmosDB client library, which uses Gremlin.Net under the hood. I managed to connect and insert the data; here's the intercepted WebSocket request:
!application/vnd.gremlin-v1.0+json{
"requestId": "b64bd2eb-46c3-4095-9eef-768bca2a14ed",
"op": "eval",
"processor": "",
"args": {
"gremlin": "g.addV(\"User\").property(\"UserId\",2).property(\"CustomerId\",1)"
}
}
The response:
{
"requestId": "b64bd2eb-46c3-4095-9eef-768bca2a14ed",
"status": {
"message": "",
"code": 200,
"attributes": {
"host": "/172.19.0.1:38848"
}
},
"result": {
"data": [
{
"id": 0,
"label": "User",
"type": "vertex",
"properties": {}
}
],
"meta": {}
}
}
The problem is that I do see those properties when I'm connected via the Gremlin console:
gremlin> g.V().hasLabel("User").has("CustomerId",1).has("UserId",2).limit(1).valueMap()
==>{UserId=[2], CustomerId=[1]}
Also, I'm able to query the TinkerPop server with Gremlin.Net:
!application/vnd.gremlin-v1.0+json{
"requestId": "de35909f-4bc1-4aae-aa5f-28361b3c0933",
"op": "eval",
"processor": "",
"args": {
"gremlin": "g.V().hasLabel(\"User\").has(\"CustomerId\",1).has(\"UserId\",2).limit(1)"
}
}
But it returns a payload with a zero-valued ID and no properties:
{
"requestId": "de35909f-4bc1-4aae-aa5f-28361b3c0933",
"status": {
"message": "",
"code": 200,
"attributes": {
"host": "/172.19.0.1:38858"
}
},
"result": {
"data": [
{
"id": 0,
"label": "User",
"type": "vertex",
"properties": {}
}
],
"meta": {}
}
}
I tried switching between GraphSON v1, v2, and v3 with no luck. The documentation says that script serializers should include all the properties. Do I have to tweak the config somehow to make this work and return properties?
So it seems that, as of version 3.4 of Gremlin Server, ReferenceElementStrategy was added by default to traversals to preserve compatibility between binary and script serializers. In our case we wanted to mimic the behavior of CosmosDB, so to get the desired behavior, just remove the strategy from the init script (in our case it was empty-sample.groovy), changing
globals << [g : graph.traversal().withStrategies(ReferenceElementStrategy.instance())]
to
globals << [g : graph.traversal()]
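To check whether a server is still returning reference-only elements after such a change, one can inspect the raw response. Below is a minimal Python sketch that assumes the untyped JSON response shape shown in the question; the helper name `vertices_without_properties` is hypothetical and not part of any Gremlin client library.

```python
import json


def vertices_without_properties(response_text):
    """Return the ids of vertices in a Gremlin Server JSON response whose
    "properties" map is empty or missing."""
    response = json.loads(response_text)
    data = response.get("result", {}).get("data", [])
    return [
        item.get("id")
        for item in data
        if isinstance(item, dict)
        and item.get("type") == "vertex"
        and not item.get("properties")
    ]


# Response body from the question, trimmed to the relevant fields.
sample = """
{
  "requestId": "de35909f-4bc1-4aae-aa5f-28361b3c0933",
  "status": {"message": "", "code": 200},
  "result": {
    "data": [
      {"id": 0, "label": "User", "type": "vertex", "properties": {}}
    ],
    "meta": {}
  }
}
"""

stripped = vertices_without_properties(sample)
print(stripped)
```

A non-empty result for a vertex that is known to have properties suggests the strategy is still active on the traversal source.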

Azure Data Factory v2 If activity always fails

I'm currently struggling with the Azure Data Factory v2 If activity, which always fails with the error message "Activity failed: Activity failed because an inner activity failed."
I've designed two separate pipelines, one takes the full snapshot of the data (1333 records) from the on-premises SQL Server and loads the data into the Azure SQL Database, and another one just takes delta from the same source.
Both pipelines work fine when executed independently.
I then decided to wrap these two pipelines into one parent pipeline, which would do this:
1. Execute a Lookup activity to check whether the target table in Azure SQL Database has any records, using a basic Select Count(Request_ID) As record_count From target_table. This activity works fine, and I can preview the returned record count.
2. Pass the output from the Lookup activity to the If activity with the condition that if record_count = 0, the parent pipeline invokes the full load pipeline; otherwise it invokes the delta load pipeline.
This is the actual expression:
{#activity('lookup_sites_record_count').output.firstRow.record_count}==0"
Whenever I try to execute this parent pipeline, it fails with the above message of "Activity failed: Activity failed because an inner activity failed."
Both inner activities, that is, full load and delta load pipelines, work just fine when triggered independently.
What am I missing?
Many thanks in advance :).
mikhailg
Pipeline's JSON definition below:
{
"name": "pl_remedyreports_load_rs_sites",
"properties": {
"activities": [
{
"name": "lookup_sites_record_count",
"type": "Lookup",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false
},
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": "Select Count(Request_ID) As record_count From mdp.RS_Sites;"
},
"dataset": {
"referenceName": "ds_azure_sql_db_sites",
"type": "DatasetReference"
}
}
},
{
"name": "If_check_site_record_count",
"type": "IfCondition",
"dependsOn": [
{
"activity": "lookup_sites_record_count",
"dependencyConditions": [
"Succeeded"
]
}
],
"typeProperties": {
"expression": {
"value": "{#activity('lookup_sites_record_count').output.firstRow.record_count}==0",
"type": "Expression"
},
"ifFalseActivities": [
{
"name": "pl_remedyreports_invoke_load_sites_inc",
"type": "ExecutePipeline",
"typeProperties": {
"pipeline": {
"referenceName": "pl_remedyreports_load_sites_inc",
"type": "PipelineReference"
}
}
}
],
"ifTrueActivities": [
{
"name": "pl_remedyreports_invoke_load_sites_full",
"type": "ExecutePipeline",
"typeProperties": {
"pipeline": {
"referenceName": "pl_remedyreports_load_sites_full",
"type": "PipelineReference"
}
}
}
]
}
}
],
"folder": {
"name": "Load Remedy Reference Data"
}
}
}
Your expression should be:
@equals(activity('lookup_sites_record_count').output.firstRow.record_count,0)
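If the pipeline JSON is generated by a script, the "expression" object of the If activity can be built the same way. This is a minimal Python sketch; the helper name `record_count_is_zero_expression` is hypothetical, and the leading "@" is the prefix that marks an expression in Data Factory.

```python
import json


def record_count_is_zero_expression(lookup_activity, column="record_count"):
    """Build the "expression" object for an IfCondition activity that is true
    when the named Lookup activity returned a count of 0 in the given column."""
    value = f"@equals(activity('{lookup_activity}').output.firstRow.{column},0)"
    return {"value": value, "type": "Expression"}


expression = record_count_is_zero_expression("lookup_sites_record_count")
print(json.dumps(expression, indent=2))
```

The resulting object takes the place of the "expression" entry under "typeProperties" of If_check_site_record_count in the pipeline definition above.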

Invalid Path while inserting job from google cloud storage to google bigquery

I am trying to insert a job through an HTTP POST request, but I am getting an "Invalid path" error.
My request body is as follows:
{
"configuration": {
"load": {
"sourceUris": [
"gs://onianalytics/PersData.csv"
],
"schema": {
"fields": [
{
"name": "Name",
"type": "STRING"
},
{
"name": "Age",
"type": "INTEGER"
}
]
},
"destinationTable": {
"datasetId": "Test_Dataset",
"projectId": "lithe-anvil-404",
"tableId": "tb_test_Pers"
}
}
},
"jobReference": {
"jobId": "10",
"projectId": "lithe-anvil-404"
}
}
For the sourceUris parameter, I am passing "gs://onianalytics/PersData.csv", where onianalytics is my bucket name and PersData.csv is my CSV file (from which I want to load data into Google BigQuery).
I am getting the response below:
"status": {
"state": "DONE",
"errorResult": {
"reason": "invalid",
"message": "Invalid path: gs://onianalytics/PersData.csv"
},
"errors": [
{
"reason": "invalid",
"message": "Invalid path: gs://onianalytics/PersData.csv"
}
]
},
"statistics": {
"creationTime": "1387276603674",
"startTime": "1387276603751",
"endTime": "1387276603751"
}
}
Can you explain why this error is occurring?
Is your bucket under the same projectId that has the BigQuery service activated and that you requested tokens with? If not, have you tried enabling read/write access for that project?