Pyodbc to Hive on Dataproc intermittent error: (79) Failed to reconnect to server. (79) (SQLExecDirectW) - hive

I launch a Dataproc cluster with Hive using the Google APIs in Python and connect to Hive with pyodbc. Hive queries succeed and fail seemingly at random.
Cloudera Hive ODBC driver 2.6.9
pyodbc 4.0.30
Error: "pyodbc.OperationalError: ('08S01', '[08S01] [Cloudera][Hardy] (79) Failed to reconnect to server. (79) (SQLExecDirectW)')"
Some server logs:
{
  "insertId": "xxx",
  "jsonPayload": {
    "filename": "yarn-yarn-timelineserver-xxx-m.log",
    "class": "org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager",
    "message": "ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted"
  }
},
{
  "insertId": "xxx",
  "jsonPayload": {
    "container": "container_xxx_0002_01_000001",
    "thread": "ORC_GET_SPLITS #4",
    "application": "application_xxx_0002",
    "message": "Failed to get files with ID; using regular API: Only supported for DFS; got class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "class": "|io.AcidUtils|",
    "filename": "application_xxx_0002.container_xxx_0002_01_000001.syslog_dag_xxx_0002_4",
    "container_logname": "syslog_dag_xxx_0002_4"
  }
}

Related

Validation error in aws cloudwatch events rule?

I am triggering my CodeBuild project using the CodeBuild triggers feature with a cron expression cron(*/2 * * * ? *), which should run every 2 minutes. Unfortunately, it didn't run. When I checked the CloudWatch metrics I could see there were some FailedInvocations. To find the cause of the error I enabled CloudTrail logs, and I can see an error like this:
{
  "eventVersion": "1.04",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "xx",
    "arn": "arn:aws:iam::xx:user/xx",
    "accountId": "xx",
    "accessKeyId": "xx",
    "userName": "xx",
    "sessionContext": {
      "attributes": {
        "mfaAuthenticated": "true",
        "creationDate": "2019-03-04T06:21:22Z"
      }
    },
    "invokedBy": "signin.amazonaws.com"
  },
  "eventTime": "2019-03-04T09:04:56Z",
  "eventSource": "monitoring.amazonaws.com",
  "eventName": "DescribeAlarms",
  "awsRegion": "ap-south-1",
  "sourceIPAddress": "xxx",
  "userAgent": "signin.amazonaws.com",
  "errorCode": "ValidationException",
  "errorMessage": "1 validation error detected: Value 'INVALID_FOR_SUMMARY' at 'stateValue' failed to satisfy constraint: Member must satisfy enum value set: [INSUFFICIENT_DATA, ALARM, OK]",
  "requestParameters": {
    "stateValue": "INVALID_FOR_SUMMARY"
  },
  "responseElements": null,
  "requestID": "94f3a789-3e5c-11e9-92f8-xxx",
  "eventID": "c9ecfca2-a650-4997-b707-xxx",
  "eventType": "AwsApiCall",
  "recipientAccountId": "xxx"
}
What exactly does this mean: 1 validation error detected: Value 'INVALID_FOR_SUMMARY' at 'stateValue' failed to satisfy constraint: Member must satisfy enum value set: [INSUFFICIENT_DATA, ALARM, OK]?
Is this error the reason my CodeBuild was not triggered?
Any help is appreciated.
Thanks
Do it from the AWS command line; it looks like a known issue in the AWS console. I managed to update mine via the regular CI job of the orchestration.
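If it helps, here is a minimal sketch of re-applying the rule's schedule programmatically with boto3 (the rule name is hypothetical; put_rule on an existing rule updates it in place):
import boto3

# Minimal sketch: re-apply the schedule to the existing CloudWatch Events rule.
# "my-codebuild-trigger" is a hypothetical rule name; replace with your own.
events = boto3.client("events", region_name="ap-south-1")
events.put_rule(
    Name="my-codebuild-trigger",
    ScheduleExpression="cron(*/2 * * * ? *)",  # every 2 minutes
    State="ENABLED",
)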

My AKS Cluster was brought down, how can I recover?

I have been playing around with load-testing my application on a single agent cluster in AKS. During the testing, the connection to the dashboard stalled and never resumed. My application seems down as well, so I am assuming the cluster is in a bad state.
The API server is restate-f4cbd3d9.hcp.centralus.azmk8s.io
kubectl cluster-info dump shows the following error:
{
  "name": "kube-dns-v20-6c8f7f988b-9wpx9.14fbbbd6bf60f0cf",
  "namespace": "kube-system",
  "selfLink": "/api/v1/namespaces/kube-system/events/kube-dns-v20-6c8f7f988b-9wpx9.14fbbbd6bf60f0cf",
  "uid": "47f57d3c-d577-11e7-88d4-0a58ac1f0249",
  "resourceVersion": "185572",
  "creationTimestamp": "2017-11-30T02:36:34Z",
  "InvolvedObject": {
    "Kind": "Pod",
    "Namespace": "kube-system",
    "Name": "kube-dns-v20-6c8f7f988b-9wpx9",
    "UID": "9d2b20f2-d3f5-11e7-88d4-0a58ac1f0249",
    "APIVersion": "v1",
    "ResourceVersion": "299",
    "FieldPath": "spec.containers{kubedns}"
  },
  "Reason": "Unhealthy",
  "Message": "Liveness probe failed: Get http://10.244.0.4:8080/healthz-kubedns: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)",
  "Source": {
    "Component": "kubelet",
    "Host": "aks-agentpool-34912234-0"
  },
  "FirstTimestamp": "2017-11-30T02:23:50Z",
  "LastTimestamp": "2017-11-30T02:59:00Z",
  "Count": 6,
  "Type": "Warning"
}
As well as some Pod Sync errors in Kube-System.
Example of issue:
az aks browse -g REstate.Server -n REstate
Merged "REstate" as current context in C:\Users\User\AppData\Local\Temp\tmp29d0conq
Proxy running on http://127.0.0.1:8001/
Press CTRL+C to close the tunnel...
error: error upgrading connection: error dialing backend: dial tcp 10.240.0.4:10250: getsockopt: connection timed out
You'll probably need to SSH into the node to see whether the kubelet service is running. For the future, you can set resource quotas to keep workloads from exhausting all resources on the cluster nodes.
Resource Quotas - https://kubernetes.io/docs/concepts/policy/resource-quotas/
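For example, a minimal sketch with the official Kubernetes Python client (the quota name, namespace, and limits are illustrative, not taken from your cluster):
from kubernetes import client, config

# Minimal sketch: create a ResourceQuota so one namespace cannot exhaust the node.
# The name, namespace, and hard limits below are illustrative placeholders.
config.load_kube_config()
quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="load-test-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={
            "requests.cpu": "2",
            "requests.memory": "4Gi",
            "limits.cpu": "4",
            "limits.memory": "8Gi",
        }
    ),
)
client.CoreV1Api().create_namespaced_resource_quota(namespace="default", body=quota)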

successful snapshot fails to load some shards, RepositoryMissingException in elasticsearch

I had a backup successfully complete to my s3 bucket in elasticsearch:
{
  "state": "SUCCESS",
  "start_time": "2014-12-06T00:12:39.362Z",
  "start_time_in_millis": 1417824759362,
  "end_time": "2014-12-06T00:33:34.352Z",
  "end_time_in_millis": 1417826014352,
  "duration_in_millis": 1254990,
  "failures": [],
  "shards": {
    "total": 345,
    "failed": 0,
    "successful": 345
  }
}
But when I restore from the snapshot, I have a few failed shards, with the following message:
[2014-12-08 00:00:05,580][WARN ][cluster.action.shard] [Sunder] [kibana-int][4] received shard failed for [kibana-int][4],
node[_QG8dkDaRD-H1uPL_p57lw], [P], restoring[elasticsearch:snapshot_1], s[INITIALIZING], indexUUID [SAuv_EU3TBGZ71NhkC7WOA],
reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[kibana-int][4] failed recovery];
nested: IndexShardRestoreFailedException[[kibana-int][4] restore failed];
nested: RepositoryMissingException[[elasticsearch] missing]; ]]
How do I reconcile the data, or, if necessary, remove the shards from my cluster to complete the recovery?
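One thing I can try is to confirm that the restoring cluster actually has a repository registered under the name the error mentions; a minimal sketch (the endpoint, bucket, and region are placeholders):
import requests

# Minimal sketch: list registered repositories, then (re-)register the
# "elasticsearch" repository named in the error. URL, bucket, and region are
# placeholders for my real settings.
es = "http://localhost:9200"
print(requests.get(f"{es}/_snapshot/_all").json())

body = {
    "type": "s3",
    "settings": {
        "bucket": "my-backup-bucket",
        "region": "us-east-1",
    },
}
print(requests.put(f"{es}/_snapshot/elasticsearch", json=body).json())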

Elasticsearch snapshot restore throwing "repository missing" exception

"error": "RemoteTransportException[[Francis Underwood][inet[/xx.xx.xx.xx:9300]][cluster/snapshot/get]]; nested: RepositoryMissingException[[xxxxxxxxx] missing]; ",
"status": 404
I am also unable to create a new snapshot repository for snapshots on S3:
PUT _snapshot/bkp_xxxxx_master
{
  "type": "s3",
  "settings": {
    "region": "us-xxxx-x",
    "bucket": "elasticsearch-backups",
    "access_key": "xxxxxxxxxxxx",
    "secret_key": "xxxxxxxxxxxxxxxxxxx"
  }
}
The response I receive for this PUT is below:
{
  "error": "RemoteTransportException[[Francis Underwood][inet[/xx.xx.xx.xx:9300]][cluster/repository/put]]; nested: RepositoryException[[bkp_xxxxxxx_master] failed to create repository]; nested: AbstractMethodError[org.elasticsearch.cloud.aws.blobstore.S3BlobStore.immutableBlobContainer(Lorg/elasticsearch/common/blobstore/BlobPath;)Lorg/elasticsearch/common/blobstore/ImmutableBlobContainer;]; ",
  "status": 500
}
Thanks in advance!
I know this is an old issue, but I was able to reproduce this across multiple Elasticsearch versions, and it turned out the reason was a conflict between JVM versions and elasticsearch-aws-cloud plugin versions.
It goes away as long as you have consistent versions across the cluster; in my case, the Joda version bundled with elasticsearch-aws-cloud was not compatible with the newer JVM version I had installed on the newer nodes.
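To check whether the versions line up, here is a minimal sketch against the nodes-info API (assuming it is reachable on localhost; adjust the host as needed) that prints each node's JVM version and installed plugins:
import requests

# Minimal sketch: print the JVM version and installed plugins for every node,
# so mismatched plugin/JVM combinations stand out. The host is a placeholder.
nodes = requests.get("http://localhost:9200/_nodes/jvm,plugins").json()["nodes"]
for node_id, info in nodes.items():
    plugins = [f"{p['name']} {p.get('version', '?')}" for p in info.get("plugins", [])]
    print(info["name"], info["jvm"]["version"], plugins)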

AWS xray put trace segment command return error

I am trying to send a segment document manually using the CLI, following the example on this page: https://docs.aws.amazon.com/xray/latest/devguide/xray-api-sendingdata.html#xray-api-segments
I created my own trace ID as well as start and end times.
The commands I used are:
> DOC='{"trace_id": "'$TRACE_ID'", "id": "6226467e3f841234", "start_time": 1581596193, "end_time": 1581596198, "name": "test.com"}'
> echo $DOC
{"trace_id": "1-5e453c54-3dc3e03a3c86f97231d06c88", "id": "6226467e3f845502", "start_time": 1581596193, "end_time": 1581596198, "name": "test.com"}
> aws xray put-trace-segments --trace-segment-documents $DOC
{
  "UnprocessedTraceSegments": [
    {
      "ErrorCode": "ParseError",
      "Message": "Invalid segment. ErrorCode: ParseError"
    },
    {
      "ErrorCode": "MissingId",
      "Message": "Invalid segment. ErrorCode: MissingId"
    },
    {
      "ErrorCode": "MissingId",
      "Message": "Invalid segment. ErrorCode: MissingId"
    },
    .................
put-trace-segments keeps giving me errors, even though the segment document complies with the JSON schema. Am I missing something else?
Thanks.
I needed to enclose the JSON in double quotes. The command that works for me was: aws xray put-trace-segments --trace-segment-documents "$DOC"
This is probably due to an error in the documentation, or the X-Ray team was using a different kind of shell.
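Alternatively, sending the segment from Python with boto3 sidesteps the shell quoting entirely; a minimal sketch (the segment name and region are placeholders):
import json
import os
import time

import boto3

# Minimal sketch: build a segment document and send it via boto3 instead of the
# CLI, which avoids shell quoting. The name and region below are placeholders.
now = time.time()
segment = {
    "trace_id": "1-{:08x}-{}".format(int(now), os.urandom(12).hex()),
    "id": os.urandom(8).hex(),
    "start_time": now - 5,
    "end_time": now,
    "name": "test.com",
}
xray = boto3.client("xray", region_name="us-east-1")
resp = xray.put_trace_segments(TraceSegmentDocuments=[json.dumps(segment)])
print(resp["UnprocessedTraceSegments"])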