The C# IgniteSessionStateStoreProvider appears to be launching an entire instance of a JVM in-process in order to run a ignite node. I was wondering if it would be feasible to create a lightweight provider that talks to the cluster nodes, but that isn't a full node itself, i.e. doesn't require an in-process JVM.
By feasible I mean is there some fundamental architectural reason why there can't be a lightweight/thin provider?
is there some fundamental architectural reason why there can't be a lightweight/thin provider
We just used Ignite.NET API to create the provider, and that API starts a JVM in process. Advantage is this works very fast.
Thin client mode is coming in Ignite.NET 2.4 (expected in January). I've filed a ticket to add thin session store: https://issues.apache.org/jira/browse/IGNITE-7269
Related
We are analyzing ignite to use it in .NET platform, In Ignite document we can see there is some limitation on running cluster in Java and Thin Client in .NET. The list provided in the document doesn't mention about "Data Streaming". So would like to know whether "Data Streaming" is supported in "Mixed-Platform" or not ?
Yes, Data Streaming is supported in mixed-platform clusters, and you can do it from Ignite.NET Thin Client without any limitations.
P.S. The document is slightly outdated (fix on the way), Services are also supported in this scenario, you can make Java <-> .NET service calls in both directions, which some customers use extensively:
https://ignite.apache.org/docs/latest/net-specific/net-java-services-execution
I see a two ways to setup client in Ignite:
then Ignition.start(IgniteConfiguration.clientMode = true)
Ignition.startClient
but can't find any details about the first mode in docs.
What's the difference between two ways?
The first one will start a thick client node that joins the cluster via an internal protocol, receive all of the cluster-wide updates such as topology changes, and support all of the Ignite APIs.
While the second one will start a Java Thin Client which is a lightweight client that connects to the existing cluster node and performs all operations through that node. The thin client does not become a part of the cluster topology.
You can find more details regarding the difference between the client types here.
The former starts a thick client, the latter a thin client.
A thick client is (mostly) a server node that’s configured not to store data or run compute tasks. So you get the full API, but it’s relatively heavy-weight.
A thin client does not become a cluster node, so it’s smaller and simpler. There are a few APIs that are not available.
If you can use a thin client, that’s likely the correct solution.
I am try planning to introduce Apache Ignite into existing old .net web api project to use it as key/value store for detecting duplicate requests sent to the load balanced api.
I would like to introduce minimum overhead to each request.
As I understand Client node is communicating to the server through TCP.
My current go to plan is to create a singleton object which will establish connection to the remote cache and register it in my DI container.
Is it ok to leave node running and TCP connection open or should make the ignite object scoped to start close on each request/response cycle?
Keep in open, as a singleton.
Ignite object is thread safe
It is expensive to create and connect to the cluster (in case of classic "Thick" client)
There is also a "Thin" client, which is very lightweight and can be created and disposed often. Note that thin client is also thread safe.
Also, you can try to use REST:
https://apacheignite.readme.io/docs/rest-api
Hi I am new to Marklogic and Apache. I have been provided task to use apache as loadbalancer for our Marklogic cluster of 3 machines. Marklogic cluster is currently running on Linux servers.
How can we achieve this? Any information regarding this would be helpful.
You could use mod_proxy_balancer. How you configure it depends what MarkLogic client you would like to use. If you would like to use the Java Client API, please follow the second example here to allow apache to generate stickiness cookies. If you would like to use XCC, please configure it to use the ML-Server-generated or backend-generated "SessionID" cookie.
The difference here is that XCC uses sessions whereas the Java Client API builds on the REST API which is stateless, so there are no sessions. However, even in the Java Client API when you use multi-request transactions, that imposes state for the duration of that transaction so the load balancer needs a way to route requests during that transaction to the correct node in the MarkLogic cluster. The stickiness cookie will be resent by the Java Client API with every request that uses a Transaction so the load balancer can maintain that stickiness for requests related to that transaction.
As always, do some testing of your configuration to make sure you got it right. Properly configuring apache plugins is an advanced skill. Since you are new to apache, your best hope of ensuring you got it right is checking with an HTTP monitoring tool like WireShark to look at the HTTP traffic from your application to MarkLogic Server to make sure things are going to the correct node in the cluster as expected.
Note that even with the client APIs (Java, Node.js) its not always obvious or explicit at the language API layer what might cause a session to be created. Explicitly creating multi statement transactions definately will, but other operations may do so as well. If you are using the same connection for UI (browser) and API (REST or XCC) then the browser app is likely to be doing things that create session state.
The safest, but least flexable configuration is "TCP Session Affinity". If they are supported they will eliminate most concerns related to load balancing. Cookie Session Affinity relies on guarenteeing that the load balencer uses the correct cookie. Not all code is equal. I have had cases where it the load balancer didn't always use the cookie provided. Changing the configuration to "Load Balancer provided Cookie Affinity" fixed that.
None of this is needed if all your communications are stateless at the TCP layer, the HTTP layer and the app layer. The later cannot be inferred by the server.
Another conern is if your app or middle tier is co-resident with other apps or the same app connecting to the same load balancer and port. That can be difficult to make sure there are no 'crossed wires' . When ML gets a request it associates its identity with the client IP and port. Even without load balencers, most modern HTTP and TCP client libraries implement socket caching. A great perfomrance win, but a hidden source of subtle random severe errors if the library or app are sharing "cookie jars" (not uncomnon). A TCP and Cookie Jar cache used by different application contexts can end up sending state information from one unrelated app in the same process to another. Mostly this is in middle tier app servers that may simply pass on requests from the first tier without domain knowledge, presuming that relying on the low level TCP libraries to "do the right thing" ... They are doing the right thing -- for the use case the library programmers had in mind -- don't assume that your case is the one the library authors assumed. The symptoms tend to be very rare but catastrophic problems with transaction failures and possibly data corruption
and security problems (at an application layer) because the server cannot tell the difference between 2 connections from the same middle tier.
Sometimes a better strategy is to load balance between the first tier and the middle tier, and directly connect from the middle tier to MarkLogic.
Especially if caching is done at the load balancer. Its more common for caching to be useful between the middle tier and the client then the middle tier and the server. This is also more analogous to the classic 3 tier architecture used with RDBMS's .. where load balancing is between the client and business logic tiers not between business logic and database.
We are working on developing a Java EE based application. Our application is Java 1.5 compatible and will be deployed to WAS ND 6.1.0.21 with EBJ 3.0 and Web Services feature packs. The configuration is currently one cell with two clusters. Each cluster will have two nodes.
Our application, or our system, as I should rather say, comes in two or three parts.
Part 1: An ear deployed to one cluster that contains 3rd party vendor code combined with customization code. Their code is EJB 2.0 compliant and has a lot of Remote Home interfaces.
Part 2: An ear deployed to the same cluster as the first ear. This ear contains EBJ 3's that make calls into the EJB 2's supplied by the vendor and the custom code. These EJB 3's are used by the JSF UI also packaged with the EAR, and some of them are also exposed as web services (JAX-WS 2.0 with SOAP 1.2 compliance) for other clients.
Part 3: There may be other services that do not depend on our vendor/custom code app. These services will be EJB 3.0's and web services that are deployed to the other cluster.
Per a recommendation from some IBM staff on site here, communication between nodes in a cluster can be EJB RMI. But if we are going across clusters and/or other cells, then the communication should be web services.
That said, some of us are wondering about performance and optimizing communication for speed of our applications that will use our web services and EJB's. Right now most EJB's are exposed as remote. (and our vendor set theirs up that way, rather than also exposing local home interfaces). We are wondering if WAS does any optimizations between apps in the same node/cluster node space. If two apps are installed in the same area and they call each other via remote home interface, is WAS smart enough to make it a local home interface call?
Are their other optimization techniques? Should we consider them? Should we not? What are the costs/benefits? Here is the question from one of our team members as sent in their email:
The question is: Supposing we develop our EJBs as remote EJBs, where our UI controller code is talking to our EXT java services via EJB3...what are our options for performance optimization when both the EJB server and client are running in the same container?
As one point of reference, google has given me some oooooold websphere performance tuning documentation from 2000 that explains a tuning configuration you can set to enable Call By Reference for EJB communication when they're in the same application server JVM. It states the following:
Because EJBs are inherently location independent, they use a remote programming
model. Method parameters and return values are serialized over RMI-IIOP and returned
by value. This is the intrinsic RMI "Call By Value" model.
WebSphere provides the "No Local Copies" performance optimization for running EJBs
and clients (typically servlets) in the same application server JVM. The "No Local
Copies" option uses "Call By Reference" and does not create local proxies for called
objects when both the client and the remote object are in the same process. Depending
on your workload, this can result in a significant overhead savings.
Configure "No Local Copies" by adding the following two command line parameters to
the application server JVM:
* -Djavax.rmi.CORBA.UtilClass=com.ibm.CORBA.iiop.Util
* -Dcom.ibm.CORBA.iiop.noLocalCopies=true
CAUTION: The "No Local Copies" configuration option improves performance by
changing "Call By Value" to "Call By Reference" for clients and EJBs in the same JVM.
One side effect of this is that the Java object derived (non-primitive) method parameters
can actually be changed by the called enterprise bean. Consider Figure 16a:
Also, we will also be using Process Server 6.2 and WESB 6.2 as well in the future. Any ideas? recommendations?
Thanks
The only automatic optimization that can really be done for remote EJBs is if they are colocated (accessed from within the same JVM). In that case, the ORB will short-circuit some of the work that would otherwise be required if the request needed to go across the wire. There will still be some necessary ORB overhead including object serialization (unless you turn on noLocalCopies, with all the caveats it brings).
Alternatively, if you know that the UI controller is colocated, your method calls do not rely on parameter or return value copying, and your interface does not rely on the exception differences between local and remote views, then you could create and expose a local subinterface that will be much faster than remote access through the ORB.