Our AKS cluster suddenly stopped responding to az aks and kubectl commands. We tried az aks upgrade, since that has previously been recommended here. First we upgraded from 1.7.7 to 1.7.7 (the same version); that upgrade completed successfully but didn't fix the cluster state. Next we tried to upgrade from 1.7.7 to 1.7.12. Unfortunately that failed, and now the cluster does not seem operational.
Here is the error response returned at the end of the upgrade command:
{
  "additionalProperties": {
    "endTime": "2018-03-07T14:15:43.7948662Z",
    "error": {
      "code": "ControlPlaneCloudProviderNotSet",
      "message": "CloudProviderProfile is not set"
    },
    "startTime": "2018-03-07T14:14:31.6196846Z",
    "status": "Failed"
  },
  "agentPoolProfiles": null,
  "dnsPrefix": null,
  "fqdn": null,
  "id": null,
  "kubernetesVersion": null,
  "linuxProfile": null,
  "location": null,
  "name": "03ae4ea8-58ef-0c47-8346-64a665d0edf7",
  "provisioningState": null,
  "servicePrincipalProfile": null,
  "tags": null,
  "type": null
}
We found an existing GitHub issue, https://github.com/Azure/AKS/issues/165, which looks similar to what we are seeing here. Is it the same issue?
Correlation id: 7377a386-dfef-4c30-923d-b894001e14ac
Location: west-europe
GitHub issue for this StackOverflow post: https://github.com/Azure/AKS/issues/229
I would probably try to get the context set again by using:
az aks get-credentials -n CLUSTER-NAME -g RESOURCE-GROUP
Then try an upgrade without specifying the version:
az aks upgrade -n CLUSTER-NAME -g RESOURCE-GROUP
Hopefully, that should get it back up again.
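If it's still failing after that, it can help to check what the control plane reports before retrying. A minimal sketch, with CLUSTER-NAME and RESOURCE-GROUP as placeholders as above:
# Check the cluster's provisioning state (Succeeded when healthy)
az aks show -n CLUSTER-NAME -g RESOURCE-GROUP --query provisioningState -o tsv
# List the Kubernetes versions the cluster can currently upgrade to
az aks get-upgrades -n CLUSTER-NAME -g RESOURCE-GROUP -o table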
I'm trying to upload NFT assets on Solana devnet using Candy Machine v2.
This is the command that I am running:
ts-node ~/metaplex-foundation/metaplex/js/packages/cli/src/candy-machine-v2-cli.ts upload -e devnet -k ~/.config/solana/devnet.json -cp config.json ./assets
When I run the command above, I get this error message:
Beginning the upload for 7 (img+json) pairs
started at: 1644272598092
initializing candy machine
Error deploying config to Solana network. RangeError: indeterminate span
at Structure.getSpan (/Users/dlku/metaplex-foundation/metaplex/js/node_modules/buffer-layout/lib/Layout.js:1221:13)
at Structure.encode (/Users/dlku/metaplex-foundation/metaplex/js/node_modules/buffer-layout/lib/Layout.js:1267:23)
at InstructionCoder._encode (/Users/dlku/metaplex-foundation/metaplex/js/packages/cli/node_modules/@project-serum/anchor/src/coder/instruction.ts:85:24)
Has anyone else run into this issue? Any help would be appreciated.
Here's an example of the json data:
{
  "name": "NFT NAME",
  "symbol": "NN",
  "description": "My collection",
  "image": "7.png",
  "properties": {
    "creators": [{"address": "00x00x00x00x000x00x00x", "share": 100}],
    "files": [{"uri": "7.png", "type": "image/png"}]
  }
}
and this is the config:
{
  "price": 0.50,
  "number": 10,
  "solTreasuryAccount": "00x00x000x0x000x00x00x000x0000x00x00",
  "storage": "arweave"
}
Apparently many users are having issues with devnet.
As of March 7, 2022 devnet is acting up, so I'd recommend using a custom RPC endpoint.
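For what it's worth, a minimal sketch of pointing the local Solana tooling at an explicit RPC endpoint. The URL below is the public devnet endpoint and is only an example (a dedicated RPC provider URL would go in its place); note that the candy-machine CLI selects its cluster via -e, so check whether your CLI version also exposes its own RPC option:
# Point the Solana CLI config at an explicit RPC endpoint
solana config set --url https://api.devnet.solana.com
# Confirm which RPC endpoint is now configured
solana config get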
I get an empty response when I fetch config from the config server URL,
e.g. http://localhost:8888/emi-app/dev
Response:
{
  "name": "emi-app",
  "profiles": [
    "dev"
  ],
  "label": null,
  "version": null,
  "state": null,
  "propertySources": []
}
I need to know why I am getting an empty response.
Make sure emi-app-dev.properties or emi-app-dev.yml is present in the Git config repo. Also make sure it has been added and committed.
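For example, a minimal sketch of what that could look like; the repo path and the sample property are placeholders, and the file name follows the {application}-{profile} pattern from the URL above:
# Inside a local clone of the Git repo that the config server reads from
cd /path/to/config-repo                               # placeholder path
echo "app.greeting=hello" > emi-app-dev.properties    # any sample property
git add emi-app-dev.properties
git commit -m "Add dev profile properties for emi-app"
git push
# Fetch the config again; propertySources should now list the file
curl http://localhost:8888/emi-app/dev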
I'm working with Kong 0.13.1. Following the docs, I added a certificate as follows:
{
  "data": [
    {
      "cert": "certificate is really here",
      "created_at": 1529667116000,
      "id": "6ae77f49-a13f-45b1-a370-8d53b35d7bfd",
      "key": "The key is really here",
      "snis": [
        "myapp.local",
        "mockbin.myapp.local"
      ]
    }
  ],
  "total": 1
}
Then I added an API, which works perfectly well over HTTP:
{
  "data": [
    {
      "created_at": 1529590900803,
      "hosts": [
        "mockbin.myapp.local"
      ],
      "http_if_terminated": false,
      "https_only": false,
      "id": "216c23c5-a1ae-4bef-870b-9c278113f8f8",
      "name": "mockbin",
      "preserve_host": false,
      "retries": 5,
      "strip_uri": true,
      "upstream_connect_timeout": 60000,
      "upstream_read_timeout": 60000,
      "upstream_send_timeout": 60000,
      "upstream_url": "http://localhost:3000"
    }
  ],
  "total": 1
}
But unfortunately Kong keeps serving me the default certificate located at /usr/local/kong/ssl/kong-default.crt.
I'm testing it with:
openssl s_client -connect localhost:8443/products -host mockbin.myapp.local -debug
Back in the day there was a dynamic SSL plugin (API-level SSL was added in version 0.3.0), but it has been gone since the 0.10 release.
I know this is something of a "fix my configuration" question, but someone else might run into a similar issue.
I spent some time figuring it out but didn't manage to fix it. As the Kong docs say, the API entity is deprecated, so I ended up rewriting everything to routes and services, and I advise you to do the same. Routes and services work perfectly well when implemented step by step from the docs, for example as sketched below.
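A rough sketch of the equivalent setup through the Admin API on the default port 8001; the service name, upstream URL, and host come from the question, the certificate file paths are placeholders, and the exact parameter names can differ slightly between Kong versions:
# Create a service pointing at the upstream
curl -i -X POST http://localhost:8001/services \
  --data name=mockbin \
  --data url=http://localhost:3000
# Create a route for that service, matching the host used in the SNI
curl -i -X POST http://localhost:8001/services/mockbin/routes \
  --data 'hosts[]=mockbin.myapp.local' \
  --data 'protocols[]=https'
# Upload the certificate and key and associate them with the SNI
curl -i -X POST http://localhost:8001/certificates \
  --data-urlencode cert@/path/to/myapp.crt \
  --data-urlencode key@/path/to/myapp.key \
  --data 'snis[]=mockbin.myapp.local'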
The Kong documentation seems clear on how to use the Admin API to configure SSL certificates. It is certainly easier to maintain the certificate at the global level than with service- and route-specific administration.
Others looking for the answer to this question should find it straightforward to follow the instructions in the latest Kong documentation linked above.
I have been playing around with load-testing my application on a single agent cluster in AKS. During the testing, the connection to the dashboard stalled and never resumed. My application seems down as well, so I am assuming the cluster is in a bad state.
The API server is restate-f4cbd3d9.hcp.centralus.azmk8s.io
kubectl cluster-info dump shows the following error:
{
  "name": "kube-dns-v20-6c8f7f988b-9wpx9.14fbbbd6bf60f0cf",
  "namespace": "kube-system",
  "selfLink": "/api/v1/namespaces/kube-system/events/kube-dns-v20-6c8f7f988b-9wpx9.14fbbbd6bf60f0cf",
  "uid": "47f57d3c-d577-11e7-88d4-0a58ac1f0249",
  "resourceVersion": "185572",
  "creationTimestamp": "2017-11-30T02:36:34Z",
  "InvolvedObject": {
    "Kind": "Pod",
    "Namespace": "kube-system",
    "Name": "kube-dns-v20-6c8f7f988b-9wpx9",
    "UID": "9d2b20f2-d3f5-11e7-88d4-0a58ac1f0249",
    "APIVersion": "v1",
    "ResourceVersion": "299",
    "FieldPath": "spec.containers{kubedns}"
  },
  "Reason": "Unhealthy",
  "Message": "Liveness probe failed: Get http://10.244.0.4:8080/healthz-kubedns: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)",
  "Source": {
    "Component": "kubelet",
    "Host": "aks-agentpool-34912234-0"
  },
  "FirstTimestamp": "2017-11-30T02:23:50Z",
  "LastTimestamp": "2017-11-30T02:59:00Z",
  "Count": 6,
  "Type": "Warning"
}
There are also some pod sync errors in kube-system.
Example of issue:
az aks browse -g REstate.Server -n REstate
Merged "REstate" as current context in C:\Users\User\AppData\Local\Temp\tmp29d0conq
Proxy running on http://127.0.0.1:8001/
Press CTRL+C to close the tunnel...
error: error upgrading connection: error dialing backend: dial tcp 10.240.0.4:10250: getsockopt: connection timed out
You'll probably need to SSH into the node to see whether the kubelet service is running. For the future, you can set resource quotas to keep workloads from exhausting all resources on the cluster nodes.
Resource Quotas: https://kubernetes.io/docs/concepts/policy/resource-quotas/
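For example, a minimal sketch of creating such a quota with kubectl; the quota name, limits, and namespace are placeholders:
# Cap CPU, memory and pod count for everything in one namespace
kubectl create quota load-test-quota --hard=cpu=2,memory=4Gi,pods=10 -n my-namespace
# Check how much of the quota is currently in use
kubectl describe quota load-test-quota -n my-namespace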
"error": "RemoteTransportException[[Francis Underwood][inet[/xx.xx.xx.xx:9300]][cluster/snapshot/get]]; nested: RepositoryMissingException[[xxxxxxxxx] missing]; ",
"status": 404
I am also unable to create a new snapshot repository for snapshots on S3:
PUT _snapshot/bkp_xxxxx_master
{
  "type": "s3",
  "settings": {
    "region": "us-xxxx-x",
    "bucket": "elasticsearch-backups",
    "access_key": "xxxxxxxxxxxx",
    "secret_key": "xxxxxxxxxxxxxxxxxxx"
  }
}
The response I receive for this PUT is below:
{
  "error": "RemoteTransportException[[Francis Underwood][inet[/xx.xx.xx.xx:9300]][cluster/repository/put]]; nested: RepositoryException[[bkp_xxxxxxx_master] failed to create repository]; nested: AbstractMethodError[org.elasticsearch.cloud.aws.blobstore.S3BlobStore.immutableBlobContainer(Lorg/elasticsearch/common/blobstore/BlobPath;)Lorg/elasticsearch/common/blobstore/ImmutableBlobContainer;]; ",
  "status": 500
}
Thanks in advance!
I know this is an old issue, but I was able to reproduce it across multiple Elasticsearch versions, and it turned out the reason was a conflict between the JVM version and the elasticsearch-aws-cloud plugin version.
It works as long as you have consistent versions across the cluster; in my case, the Joda version in elasticsearch-aws-cloud was not compatible with the newer JVM version I had installed on the newer nodes.
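If you want to check this on your own cluster, a rough sketch of how to compare the JVM and plugin versions reported by each node; the host and port are placeholders for one of your nodes:
# Show the Elasticsearch and JVM versions reported by every node
curl -s 'http://localhost:9200/_nodes/jvm?pretty'
# Show which plugins (and plugin versions) each node has installed
curl -s 'http://localhost:9200/_cat/plugins?v'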