Communication between two Ignite clusters (possibly merging two Ignite clusters into one)

We have two Ignite clusters. They work completely independently of each other. This is how it works now:
IgniteCluster1
SpringBootApplication1 starts up IgniteCluster1's server node
/request1 comes to SpringBootApplication1
SpringBootApplication1 has InstanceOfIgniteCluster1
through InstanceOfIgniteCluster1, SpringBootApplication1 executes something like
String response1 = compute.execute(Task1.class, request1);
IgniteCluster1 has Cache1
a method in the Task1 class takes some data from Cache1 and does some calculations
response1 is sent as the response to /request1
IgniteCluster2 works the same way
SpringBootApplication2 starts up its server node
/request2 comes to SpringBootApplication2
SpringBootApplication2 has InstanceOfIgniteCluster2
through InstanceOfIgniteCluster2, SpringBootApplication2 executes something like
String response2 = compute.execute(Task2.class, request2);
IgniteCluster2 has Cache2
a method in the Task2 class takes some data from Cache2 and does some calculations
response2 is sent as the response to /request2
But now Task1 from IgniteCluster1 needs response2 for the calculation of response1. We do not want to have Cache2 in IgniteCluster1 just for that, so we must somehow get response2 from IgniteCluster2. We also don't want to send a /request to SpringBootApplication2 across the network because of latency - we want to ask IgniteCluster2 for the data directly.
We currently see only one solution: everything remains the same, but in SpringBootApplication1 we also start up a client node of IgniteCluster2 (a sketch of starting such a client node is shown below). In Task1 we first get response2 through this client node
String response2 = compute.execute(Task2.class, request);
and after that we get response1 through the server node of IgniteCluster1
String response1 = compute.execute(Task1.class, request + response2);
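For reference, a minimal sketch of what starting that client node inside SpringBootApplication1 could look like (the discovery address, port, and instance name below are placeholders for whatever IgniteCluster2 actually uses):
IgniteConfiguration cfg = new IgniteConfiguration();
// a name to distinguish this instance from the IgniteCluster1 server node in the same JVM
cfg.setIgniteInstanceName("cluster2-client");
// client mode: joins IgniteCluster2 but stores no cache data
cfg.setClientMode(true);
// point discovery at IgniteCluster2's nodes ("cluster2-host:47500" is a placeholder)
TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
ipFinder.setAddresses(Collections.singletonList("cluster2-host:47500"));
cfg.setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(ipFinder));
Ignite cluster2Client = Ignition.start(cfg);
// Task2 runs on IgniteCluster2, but the call is made from inside SpringBootApplication1
String response2 = cluster2Client.compute().execute(Task2.class, request);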
Is this the right solution?
Is it the only solution?
Is it the best solution?
Actually, it would be better for us if there were only one Ignite cluster with nodes of two different types: some nodes started by SpringBootApplication1 (Type1) and some started by SpringBootApplication2 (Type2). Nodes of different types could then communicate with each other (so a Type1 node could ask for data directly from Type2 nodes), and the caches would be stored only on the appropriate nodes (Cache1 on Type1 nodes, Cache2 on Type2 nodes). Is it possible? How?

1, 2, 3: This solution will definitely work, but how well it serves you depends on your service design/architecture.
Yes, it's possible. You can colocate a cache with a subset of the cluster nodes; see this reference: https://www.gridgain.com/docs/latest/developers-guide/configuring-caches/managing-data-distribution
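As a rough sketch of that approach (the attribute name "node.type" and its values here are placeholders, not anything Ignite predefines): each node advertises its type as a user attribute, and the cache configuration carries a node filter that checks that attribute.
// On startup of a Type1 node (SpringBootApplication1), tag the node:
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setUserAttributes(Collections.singletonMap("node.type", "type1"));
// Cache1 is then deployed only on nodes whose "node.type" attribute is "type1".
// The filter class must be available on every node; the docs recommend a dedicated
// predicate class, but a lambda is used here for brevity.
CacheConfiguration<String, String> cache1Cfg = new CacheConfiguration<>("myCache1");
cache1Cfg.setNodeFilter(node -> "type1".equals(node.attribute("node.type")));
cfg.setCacheConfiguration(cache1Cfg);
Type2 nodes would do the same with "type2" and myCache2.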
Optionally, after configuring the cache's node filter, you can use the ClusterGroup feature to run a compute task on the nodes that hold the necessary cache partitions. Like so:
Ignite ignite = Ignition.ignite();
IgniteCluster cluster = ignite.cluster();
ClusterGroup cacheGroup1 = cluster.forCacheNodes("myCache1");
ClusterGroup cacheGroup2 = cluster.forCacheNodes("myCache2");
// Request to type1 nodes
String response1 = ignite.compute(cacheGroup1).execute(Task1.class, request1);
// Request to type2 nodes
String response2 = ignite.compute(cacheGroup2).execute(Task2.class, request2);
More details here https://www.gridgain.com/docs/latest/developers-guide/distributed-computing/cluster-groups
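If you tag nodes with a user attribute as sketched above, you can also build the cluster group from the attribute itself rather than from cache placement (same placeholder attribute name as before):
// Alternative to forCacheNodes(): group nodes by the user attribute directly
ClusterGroup type1Nodes = ignite.cluster().forAttribute("node.type", "type1");
String response1 = ignite.compute(type1Nodes).execute(Task1.class, request1);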

Related

Task queues and result queues with Celery and RabbitMQ

I have implemented Celery with RabbitMQ as the broker. I rely on Celery v4.4.7 since I have read that v5.0+ doesn't support RabbitMQ anymore. RabbitMQ is a MUST in my case.
Everything has been containerized and then deployed as pods within Kubernetes 1.19. I am able to execute long-running tasks and everything apparently looks fine at first glance. However, I have a few concerns which require your expertise.
I have declared inbound and outbound queues, but Celery created its own, and I do not see any messages within those queues (inbound or outbound):
from celery import Celery
from kombu import Exchange, Queue

inbound_queue = "_IN"
outbound_queue = "_OUT"

app = Celery()
app.conf.update(
    broker_url = 'pyamqp://%s//' % path,
    broker_heartbeat = None,
    broker_connection_timeout = int(timeout),
    result_backend = 'rpc://',
    result_persistent = True,
    task_queues = (
        Queue(algorithm_queue, Exchange(inbound_queue), routing_key='default', auto_delete=False),
        Queue(result_queue, Exchange(outbound_queue), routing_key='default', auto_delete=False),
    ),
    task_default_queue = inbound_queue,
    task_default_exchange = inbound_exchange,
    task_default_exchange_type = 'direct',
    task_default_routing_key = 'default',
)

@app.task(bind=True,
          name='osmq.tasks.add',
          queue=inbound_queue,
          reply_to = outbound_queue,
          autoretry_for=(Exception,),
          retry_kwargs={'max_retries': 5, 'countdown': 2})
def execute(self, data):
    <method_implementation>
I have implemented callbacks to get results back via REST APIs. However, it randomly may or may not return results even when the status is successful. This is probably related to message persistence. In detail: when I use the Flower API to get info, the status is successful and the result is partially displayed (shortened JSON messages); when I call AsyncResult for the same status, the result is either None or the right one. I do not understand the mechanism between the RabbitMQ queues and kombu, which seems to cache the resulting message. I must guarantee that the result can be retrieved every time the task has executed successfully.
def callback(uuid):
    task = app.AsyncResult(uuid)
Specifically, what changed is that Celery 5.0+ no longer supports amqp:// as a result backend. However, as in your example, rpc:// is supported.
The relevant snippet is here: https://docs.celeryproject.org/en/stable/getting-started/backends-and-brokers/index.html#rabbitmq
We tend to always set ignore_result=True in our implementation, so I can't give any practical tips on how to use rpc://, other than to infer that any response is put on an application-specific queue, instead of being able to be put on a specified queue (or even a different broker / RabbitMQ instance) via amqp://.

Using .NET StackExchange.Redis with "wait" isn't working as expected

I am doing a R/W test with a Redis cluster (servers): 1 master + 2 slaves. The following is the key WRITE code:
var trans = redisDatabase.CreateTransaction();
Task<bool> setResult = trans.StringSetAsync(key, serializedValue, TimeSpan.FromSeconds(10));
Task<RedisResult> waitResult = trans.ExecuteAsync("wait", 3, 10000);
trans.Execute();
trans.WaitAll(setResult, waitResult);
using the following as the connection string:
[server1 ip]:6379,[server2 ip]:6379,[server3 ip]:6379,ssl=False,abortConnect=False
running 100 threads which do 1000 loops of the following steps:
generate a GUID as the key and 1024 bytes of random data as the value
write the key (using the above code)
retrieve the key using "var stringValue = redisDatabase.StringGet(key, CommandFlags.PreferSlave);"
compare the two values and print an error if they differ.
Running this test a few times generates several errors. I am trying to understand why, as the "wait" operation (with 10 seconds!) should have guaranteed the write to all slaves before returning.
Any idea?
WAIT isn't supported by SE.Redis, as explained by its prolific author in "Stackexchange.redis lacks the "WAIT" support".
What about improving the consistency guarantees by adding in some "check, write, read" iterations? For example (a sketch follows this list):
1. SET a new key-value pair (master node).
2. Read it (with CommandFlags set to DemandReplica).
3. Not there yet? Wait and try X times.
4. a) Still not there? SET again and go back to (3), or give up.
   b) There? You're "done".
It won't be perfect, but it should reduce the probability of losing a SET.
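To make that loop concrete, here is a sketch of the check-write-retry pattern. It is shown in Java with the Jedis client purely as an illustration of the shape (the SE.Redis version would look different), and the retry counts and delay are arbitrary placeholders:
// Returns true once the value is visible on a replica, false if we give up.
public static boolean setAndVerify(Jedis master, Jedis replica,
                                   String key, String value,
                                   int maxSets, int maxReads) throws InterruptedException {
    for (int set = 0; set < maxSets; set++) {
        master.set(key, value);                     // (1) SET on the master
        for (int read = 0; read < maxReads; read++) {
            if (value.equals(replica.get(key)))     // (2) read it back from a replica
                return true;                        // (4b) replicated: done
            Thread.sleep(50);                       // (3) not there yet: wait, then retry
        }
    }                                               // (4a) still not there after maxSets: give up
    return false;
}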

How to run multiple Akka.NET Lighthouse seeds

On my way to using Akka.NET for a scalable application, I am trying to set up a cluster of Lighthouse seed nodes. I am testing 3 Lighthouse nodes as seed nodes, each running on the same machine on a different port. The following is my HOCON config sample:
lighthouse.actorsystem: "my-system"
# See petabridge.cmd configuration options here: https://cmd.petabridge.com/articles/install/host-configuration.html
petabridge.cmd.host = "0.0.0.0"
petabridge.cmd.port = 9111/9112/9113 #one in each node
akka.actor.provider = cluster
akka.remote.log-remote-lifecycle-events = DEBUG
akka.remote.dot-netty.tcp.transport-class = "Akka.Remote.Transport.DotNetty.TcpTransport, Akka.Remote"
akka.remote.dot-netty.tcp.applied-adapters = []
akka.remote.dot-netty.tcp.transport-protocol = tcp
akka.remote.dot-netty.tcp.public-hostname = "localhost"
akka.remote.dot-netty.tcp.hostname = "localhost"
akka.remote.dot-netty.tcp.port = 4001/4002/4003
akk.cluster.seed-nodes = ["akka.tcp://my-system@localhost:4001","akka.tcp://my-system@localhost:4002","akka.tcp://my-system@localhost:4003"]
akk.cluster.roles = [lighthouse]
If I start up these nodes from 3 command prompts, each is printing the following messages:
[INFO][22-01-2019 11:45:17][Thread 0020][Cluster] Cluster Node [akka.tcp://my-system@localhost:4001/4002/4003] - Node [akka.tcp://my-system@localhost:4001/4002/4003] is JOINING itself (with roles []) and forming a new cluster
[INFO][22-01-2019 11:45:17][Thread 0020][Cluster] Cluster Node [akka.tcp://my-system@localhost:4001/4002/4003] - Leader is moving node [akka.tcp://my-system@localhost:4001/4002/4003] to [Up]
My concern here is that, as per the logs printed, these three instances are not forming one cluster; they seem to be forming three separate clusters, as the nodes are not getting any messages about the other Lighthouse nodes.
Can somebody please clarify whether this is the expected behavior? There seems to be no example available online.

Celery with RabbitMQ creates multiple result queues

I have installed Celery with RabbitMQ.
The problem is that for every result that is returned, Celery creates a queue in RabbitMQ named after the task's ID, on the celeryresults exchange.
I still want to have results, but in ONE queue.
my celeryconfig:
from datetime import timedelta
BROKER_URL = 'amqp://'
CELERY_RESULT_BACKEND = 'amqp'
#CELERY_IGNORE_RESULT = True
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT=['json', 'application/json']
CELERY_TIMEZONE = 'Europe/Oslo'
CELERY_ENABLE_UTC = True
from celery.schedules import crontab
CELERYBEAT_SCHEDULE = {
    'every-minute': {
        'task': 'tasks.remote',
        'schedule': timedelta(seconds=30),
        'args': (),
    },
}
Is that possible? How?
Thanks!
The amqp backend creates a new queue for each task. Alternatively, there is a newer rpc backend which keeps results in a single queue:
http://docs.celeryproject.org/en/master/whatsnew-3.1.html#new-rpc-result-backend
Nothing unusual.
That is how Celery works when we use amqp as the result backend: it creates a new temporary queue for every result, corresponding to each task the worker consumes.
If you are not interested in the results, you can try the CELERY_IGNORE_RESULT = True setting.
If you do want to store the results, then I would recommend using a different result backend, like Redis.
You say you want Celery to keep the results on one queue. Now, to answer your question, let me ask you one:
How do you expect each producer to check for its relevant result without reading every single message off the queue to find the one it needs/wants?
In essence, what you want is a database of key-value pairs, so that the lookup is O(1). The only way to do that with a queue broker is to create one queue for each "pair".
I understand that having many GUID queues is not neat or pretty, but it's conceptually the only way to do it on a messaging broker.
This solution won't keep all the results in ONE queue, but it will at least clean up the extra queues as soon as you're done with them.
If you use Redis as your backend, then when you're done with a result that has created an errant queue, run result.forget(). This will cause both the result and its queue to disappear. This can help you manage the number of queues you have and prevent OOM issues.

Extensions for Computationally-Intensive Cypher queries

As a follow-up to a previous question of mine, I want to find 30 paths that exist between two given nodes within a depth of 4. Something to the effect of this:
start startnode = node(1), endnode = node(1000)
match startnode-[r:rel_Type*1..4]->endnode
return r
limit 30;
My database contains ~50k nodes and 2M relationships.
Expectedly, the computation time for this query is very, very large; I even ended up with the following GC message in the messages.log file: GC Monitor: Application threads blocked for an additional 14813ms [total block time: 182.589s]. This error keeps occurring and blocks all threads for an indefinite period of time. Therefore, I am looking for a way to lower the computational strain of this query on the server by optimizing it.
Is there any extension I could use to help optimize this query?
Give this one a try:
https://github.com/wfreeman/findpaths
You can query the extension like so:
.../findpathslen/1/1000/4/30
And it will give you a json response with the paths found. Hopefully that helps you.
The meat of it is here, using the built-in graph algorithm to find paths of a certain length:
@GET
@Path("/findpathslen/{id1}/{id2}/{len}/{count}")
@Produces(Array("application/json"))
def fof(@PathParam("id1") id1:Long, @PathParam("id2") id2:Long, @PathParam("len") len:Int, @PathParam("count") count:Int, @Context db:GraphDatabaseService) = {
  val node1 = db.getNodeById(id1)
  val node2 = db.getNodeById(id2)
  val pathFinder = GraphAlgoFactory.pathsWithLength(Traversal.pathExpanderForAllTypes(Direction.OUTGOING), len)
  val pathIterator = pathFinder.findAllPaths(node1,node2).asScala
  val jsonMap = pathIterator.take(count).map(p => obj(p))
  Response.ok(compact(render(decompose(jsonMap))), MediaType.APPLICATION_JSON).build()
}