Aerospike Python client: get host name from INFO_NODE

We are looking for details on the questions below.
Using INFO_NODE, we want to build a UI that displays network information (similar to asadm's "info network" output), for example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2020-07-21 00:39:27 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cluster  Node                    Node Id         Ip                   Build      Cluster  Migrations  Cluster   Cluster    Principal      Client  Uptime
Name                                                                             Size                 Key       Integrity                 Conns
test     testbox.test.com:3000   *aaaaaaaaaaaa0  XX.XXX.XX.XXXX:3000  E-4.5.3.5  1        0.000       key12344  True       aaaaaaaaaaaa0  3       14:12:13
How can we get the cluster name and node name?
Can we get the other details from INFO_NODE?
Can we run quiesce and recluster from the Python client?

You should be able to issue any info command from a client; see this Knowledge Base article. Getting any statistic or configuration parameter value can also be done through similar info calls, for example node-id or cluster-name.
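For illustration, here is a minimal sketch with the Python client, assuming a node reachable at 127.0.0.1:3000 (the host/port is an assumption, and the exact set of info commands available depends on your server version):

import aerospike

# Connect to the cluster through a single seed node (address is an assumption).
client = aerospike.client({'hosts': [('127.0.0.1', 3000)]}).connect()

# info_all() sends an info command to every node and returns the per-node responses.
print(client.info_all('node'))          # node id of each node
print(client.info_all('cluster-name'))  # cluster name, if set on the server
print(client.info_all('statistics'))    # full statistics string per node

# Cluster-management commands can be issued the same way, e.g. recluster for the
# whole cluster; quiesce is normally sent only to the node you want to drain.
# client.info_all('recluster:')

client.close()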

Related

Presto cannot access the web page while using SSL

My Presto version is 0.240. I want to enable SSL so that Presto is served over HTTPS,
so I changed my config following only this URL: https://trino.io/docs/current/security/internal-communication.html
But I can't access the Presto address https://192.168.100.142:9999/
I don't know which step I did wrong. What should I do to implement HTTPS for Presto?
This is my config (a cluster of two machines).
Node 1 (192.168.100.142), hostname sbider-dev-01: /opt/presto-server-0.240/etc/config.properties
coordinator=true
node-scheduler.include-coordinator=true
query.max-memory=7.5GB
query.max-memory-per-node=3.5GB
query.max-total-memory-per-node=3.5GB
experimental.reserved-pool-enabled=false
memory.heap-headroom-per-node=0.5GB
#experimental.spill-enabled=true
#experimental.max-spill-per-node=8GB
#experimental.query-max-spill-per-node=8GB
query.low-memory-killer.policy=total-reservation-on-blocked-nodes
#http-server.http.port=9999
#discovery-server.enabled=true
#discovery.uri=http://192.168.100.142:9999
internal-communication.shared-secret="8HRJWX41DwtuYZcNw8uMbshA8wDLoLS78tT3UVL+Z+m0xG7KCygGurE9SXEbGy2bLtPLza1MhAnWJp2mJp/S+j9EFWWuztXz7cHJhSz9QFiVxYCs1Wzn+IVKgHD5z+iGbdKjwRtgUjwNvS4MIfqwqwKlVZiEtGgEDv7j/kAgpOYPvFCRJfb/U/+b7qPpwPNDA6kXu3Dj5p1Q81+kmbFO59WSh6c4QwqdbFHAaY8XFWo8tIogxpmwQQqV3BvICmesxlIhBH/pOGgoyl86QQ/TaAMaWjaddNcgO5keTGhhOj/juGZ/gbOL/PHGNs1ENSPRnjvIGLHFQPDrm36YenhfTH5L7X0Q9HwwnEpEoYkDJsmMEV+elPZK767nZXHryuvDvHGs0PhYSRO8ekOgC3CaE1tfiGh5M9H5C2fnyeGRQ0iwtgXh83kRDuPzVrRx5yj2cHQJOZu+CcXCJ3aa1Tijxq56RfdcEz9Frr8n8aXaNMtRlchcXn3+B4biByS9duq28VHHBDlyYQQ6VSKbLDt1GBi5oOQICtrGuOY+/MD+rnV5uxPUQcSIh9KmA1WjahJEz0ItDKpB66JgVkTrVDWEJPeozKTvHRLG9sBudRhQ5abJGEAhx9b78dUbTcEkRlPuvUN1WjwVlUzjyUDKd14ocuhpoOBzjV9kFhTqQZ4zgNo="
http-server.http.enabled=false
#node.internal-address-source=FQDN
node.internal-address=sbider-dev-01,sbider-dev-02
http-server.https.enabled=true
http-server.https.port=9999
# full path to the JKS keystore file
http-server.https.keystore.path=/ceshi/keystore.jks
http-server.https.keystore.key=123456
discovery.uri=https://192.168.100.142:9999
internal-communication.https.required=true
internal-communication.https.keystore.path=/ceshi/keystore.jks
internal-communication.https.keystore.key=123456
Node 2 (192.168.100.143), hostname sbider-dev-02: cat /opt/presto-server-0.240/etc/config.properties
coordinator=flase
query.max-memory=7.5GB
query.max-memory-per-node=3.5GB
query.max-total-memory-per-node=3.5GB
experimental.reserved-pool-enabled=false
memory.heap-headroom-per-node=0.5GB
#experimental.spill-enabled=true
#experimental.max-spill-per-node=8GB
#experimental.query-max-spill-per-node=8GB
query.low-memory-killer.policy=total-reservation-on-blocked-nodes
#discovery.uri=http://192.168.100.142:9999
internal-communication.shared-secret="8HRJWX41DwtuYZcNw8uMbshA8wDLoLS78tT3UVL+Z+m0xG7KCygGurE9SXEbGy2bLtPLza1MhAnWJp2mJp/S+j9EFWWuztXz7cHJhSz9QFiVxYCs1Wzn+IVKgHD5z+iGbdKjwRtgUjwNvS4MIfqwqwKlVZiEtGgEDv7j/kAgpOYPvFCRJfb/U/+b7qPpwPNDA6kXu3Dj5p1Q81+kmbFO59WSh6c4QwqdbFHAaY8XFWo8tIogxpmwQQqV3BvICmesxlIhBH/pOGgoyl86QQ/TaAMaWjaddNcgO5keTGhhOj/juGZ/gbOL/PHGNs1ENSPRnjvIGLHFQPDrm36YenhfTH5L7X0Q9HwwnEpEoYkDJsmMEV+elPZK767nZXHryuvDvHGs0PhYSRO8ekOgC3CaE1tfiGh5M9H5C2fnyeGRQ0iwtgXh83kRDuPzVrRx5yj2cHQJOZu+CcXCJ3aa1Tijxq56RfdcEz9Frr8n8aXaNMtRlchcXn3+B4biByS9duq28VHHBDlyYQQ6VSKbLDt1GBi5oOQICtrGuOY+/MD+rnV5uxPUQcSIh9KmA1WjahJEz0ItDKpB66JgVkTrVDWEJPeozKTvHRLG9sBudRhQ5abJGEAhx9b78dUbTcEkRlPuvUN1WjwVlUzjyUDKd14ocuhpoOBzjV9kFhTqQZ4zgNo="
http-server.http.enabled=false
#node.internal-address-source=FQDN
node.internal-address=sbider-dev-01,sbider-dev-02
http-server.https.enabled=true
http-server.https.port=9999
http-server.https.keystore.path=/ceshi/keystore.jks
http-server.https.keystore.key=123456
discovery.uri=https://192.168.100.142:9999
internal-communication.https.required=true
internal-communication.https.keystore.path=/ceshi/keystore.jks
internal-communication.https.keystore.key=123456
Server log on sbider-dev-01 (cat /opt/presto-server-0.240/var/log/server.log):
Companion catalogs: catalog_name1=catalog_name2,catalog_name3=catalog_name4,...
2021-01-12T12:41:09.766+0800 INFO main Bootstrap transaction.idle-check-interval 1.00m 1.00m Time interval between idle transactions checks
2021-01-12T12:41:09.766+0800 INFO main Bootstrap transaction.idle-timeout 5.00m 5.00m Amount of time before an inactive transaction is considered expired
2021-01-12T12:41:09.767+0800 INFO main Bootstrap transaction.max-finishing-concurrency 1 1 Maximum parallelism for committing or aborting a transaction
2021-01-12T12:41:09.767+0800 WARN main Bootstrap UNUSED PROPERTIES
2021-01-12T12:41:09.767+0800 WARN main Bootstrap internal-communication.shared-secret
2021-01-12T12:41:09.767+0800 WARN main Bootstrap
2021-01-12T12:41:11.037+0800 ERROR main com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:
1) Configuration property 'internal-communication.shared-secret' was not used
at com.facebook.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:238)
1 error
com.google.inject.CreationException: Unable to create injector, see the following errors:
1) Configuration property 'internal-communication.shared-secret' was not used
at com.facebook.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:238)
1 error
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:543)
at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:159)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)
at com.google.inject.Guice.createInjector(Guice.java:87)
at com.facebook.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:245)
at com.facebook.presto.server.PrestoServer.run(PrestoServer.java:131)
at com.facebook.presto.server.PrestoServer.main(PrestoServer.java:77)
You're following the Trino (formerly Presto SQL) documentation for securing internal communication, but got the Presto binary from Facebook's fork of the project (prestodb).
Go to https://trino.io/download.html to get the latest Trino release.
The alternative (using prestodb's documentation with prestodb's binary) is NOT a safe, viable option, due to known security issues that are not fixed in the prestodb code base.
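If you want a quick way to confirm the HTTPS listener is actually up once you are on a Trino build that understands these properties, a rough sketch from Python (the IP and port are the ones in the question; the CA bundle path is hypothetical, or use verify=False just for a smoke test):

import requests

# Probe the coordinator's REST info endpoint over TLS.
resp = requests.get(
    "https://192.168.100.142:9999/v1/info",
    verify="/ceshi/ca.pem",  # hypothetical CA bundle exported from the keystore
    timeout=10,
)
print(resp.status_code)
print(resp.json())  # basic node/version info if the server answers over HTTPS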

Inconsistent behavior of Quartz2 scheduler in Apache Camel

I have an Apache Camel project that uses Quartz2 as the scheduler. The requirement is to make it a cluster. The code is deployed to WebLogic 12c, and Quartz is configured as in many samples, with clustering enabled.
This is my properties file (without the datasource):
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.scheduler.skipUpdateCheck = true
org.quartz.scheduler.jobFactory.class = org.quartz.simpl.SimpleJobFactory
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5
org.quartz.jobStore.misfireThreshold = 60000
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.oracle.OracleDelegate
org.quartz.jobStore.useProperties=true
org.quartz.JobBuilder.requestRecovery=true
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000
When I deploy and start both nodes I see that the QRTZ_SCHEDULER_STATE table has an extra entry for one of the nodes:
MyScheduler-routerContext server_node21567108546690
MyScheduler-routerContext-1 server_node11565896495100
MyScheduler-routerContext-1 server_node11567108547295
And I am guessing that because of this, one node is invoked only once in a while while the other node gets called all the time (so occasionally both nodes are invoked at the same time).
I have tried a clean restart of the WebLogic nodes, but the issue is still there.
This is what my route(s) look like:
from("quartz2://provRegGroup/createUsersTrigger?cron={{create_users_cron}}&job.name=createUsersJob")
.routeId("createUsersRB")
.log("**** starting check for create users");
//where
//create_users_cron=0+0,5,10,15,20,25,30,35,40,45,50,55+*+*+*+?
//expecting one node being called by the scheduler at a time..
I figured out what caused the issue. Apparently there were orphan WebLogic processes running on one (or even both) of the nodes; why things were left in such a mess is a question for our tech architects. ps showed two WebLogic servers running on a node: one that I had started recently and one that had been there for about a month.
Since I expect this would never happen in a production environment, I assume the issue has been resolved.

Redis cluster cannot add nodes

There are two Redis servers, and I have run three Redis instances on each server.
When I executed cluster meet [ip] [port] to add the cluster nodes, I found I could only add the nodes running on the same server. Every time I run this command, it always echoes "OK", but when I use cluster nodes to check the node list, it always shows this:
172.18.0.155:7010> cluster meet 172.18.0.156 7020
OK
172.18.0.155:7010> cluster nodes
ad829d8b297c79f644f48609f17985c5586b4941 127.0.0.1:7010@17010 myself,master - 0 1540538312000 1 connected
87a8017cfb498e47b6b48f0ad69fc066c466a9c2 172.18.0.156:7020@17020 handshake - 1540538308677 0 0 disconnected
fdf5879554741759aab14eba701dc185b605ac16 127.0.0.1:7012@17012 master - 0 1540538313000 0 connected
ec7b3ecba7a175ddb81f254821243dd469a7f961 127.0.0.1:7011@17011 master - 0 1540538314288 2 connected
You can see the node's status is disconnected, and it will disappear from the list if you check again about 5 seconds later.
Has anybody met this problem before? I have no idea how to solve it. Please help me. Thanks a lot.
I have solved the problem. I had made some mistakes in the bind configuration. Once I set the bind directive to the one IP that communicates with the other nodes, the cluster nodes could be added normally.
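For anyone hitting the same symptom, the change is roughly this in each instance's redis.conf (the interface IP here is taken from the question and will differ in your setup):

# Bind the interface the other cluster nodes reach, not only the loopback address,
# so the node is not announced to its peers as 127.0.0.1.
bind 172.18.0.155 127.0.0.1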

Authentication failures in cassandra when 1 of 16 nodes is down

I have a Cassandra cluster running:
Cassandra 2.0.11.83 | DSE 4.6.0 | CQL spec 3.1.1 | Thrift protocol 19.39.0
The cluster has 18 nodes, split among 3 datacenters, 6 in each. My system_auth keyspace has the following replication defined:
replication = {
'class': 'NetworkTopologyStrategy',
'DC1': '4',
'DC2': '4',
'DC3': '4'}
and my authenticator/authorizer are set to:
authenticator: org.apache.cassandra.auth.PasswordAuthenticator
authorizer: org.apache.cassandra.auth.CassandraAuthorizer
This morning I brought down one of the nodes in DC1 for maintenance. Within a few seconds to a minute, client applications started logging exceptions like this:
"User my_application_user has no MODIFY permission on or any of its parents"
Running 'LIST ALL PERMISSIONS of my_application_user' on one of the other nodes shows that user to have SELECT and MODIFY on the keyspace xxxxx, so I am rather confused. Do I have a setup issue? Is this a bug of some sort?
Re-posting this as the answer, as BrianC suggested above.
So this is resolved... Here's the sequence of events that seems to have fixed it:
Add 18 more nodes
Run cleanup on original nodes (this was part of the original plan)
Run a scrub on 1 table, since it was throwing exceptions on cleanup
Run a repair on the system_auth KS on the original troubled node
Wait for repair service to complete a full pass on all keyspaces
Decommission the original 18 nodes.
Honestly, I don't know exactly what fixed it. The system_auth repair makes the most sense, but what doesn't make sense is that the repair service had run many passes before, so I don't know why it worked this time. I hope this at least helps someone.
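If someone wants to try just the system_auth repair step in isolation before anything bigger, the commands are roughly as follows (the user name is the one from the question; run nodetool on the troubled node, and add cqlsh credentials as needed):

# Check what the cluster thinks the user can do, then repair the auth keyspace.
cqlsh -e "LIST ALL PERMISSIONS OF my_application_user;"
nodetool repair system_auth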

julia on PBS cluster: what to give to addprocs()?

I'm trying to set up a cluster across machines on a PBS-managed cluster. I'm perfectly able to compute within one node by saying julia -p 12 (after having reserved one node with 12 CPUs).
I understand that to use several machines, I have to add them to the master process with addprocs. I was able to do that on a different cluster (SGE), but on this one something is going wrong.
You can see everything I'm doing, including submit scripts etc, on this branch of a github repo.
To get a list of machines, I parse the PBS_NODEFILE, which for a submit script with the option
#PBS -l nodes=2:ppn=12 # give me 2 nodes with 12 processors each
looks something like this:
red0004
red0004
...
red0004
red0347
...
red0347
I parse this file with bind_pe_procs() in sge.jl in the repo and give a vector of machine names to addprocs. When I submit this, I get an SSH error, which I have put up in a gist. I don't know what it means.
Does this have to do with a system setting, i.e. do I have to talk to the sysadmin about SSH between machines? What are the right questions to ask?
I am also unsure about what exactly I have to give to addprocs(). I don't want to add the master process (I don't want worker 1 SSHing into itself?), so I exclude ENV["HOST"] = node001 from my list. But what about all the processors with the same name, node002? Do I list all of those
machines = [ "red0347" for i=1:12]
or just once
machines = ["red0347"]
in addprocs(machines)
Thanks!
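In case it helps to see the two candidate calls next to each other, here is a small sketch (the host name is taken from the question; the tuple form is one common way to ask for several workers on a single host):

# Either repeat the hostname once per desired worker ...
machines = ["red0347" for i in 1:12]
addprocs(machines)

# ... or ask for 12 workers on that host with one entry (in newer Julia, after `using Distributed`):
# addprocs([("red0347", 12)])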