Gemfire cache pre-heat completion

I would like to have one server and a few clients. The server will be my own Java application that uses CacheFactory. I will read all my static data from a database and populate the cache even before any client requests it. While the cache is being populated on the server, it will also be spreading to all clients connected to the server. Once the cache population is complete, I would like to give a green signal to all clients to start requesting data. Is there something I need to do so that the server sends an event to the clients, or so that the clients generate an event signalling the completion of cache pre-heating?
Thanks,
Yash

One way to accomplish this would be to create a region on the server and the client (say /server-ready) for notification only. The client will register interest for all keys in this region. The client will also register a CacheListener for this region.
When the server is done loading data, you can put an entry into the server-ready region. The event will be sent to the client, and the afterCreate() method on the CacheListener will be invoked, which can serve as the notification to your clients that the server is done populating data.
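For illustration, a minimal client-side sketch of this idea using the Apache Geode/GemFire Java client API; the locator address and the /server-ready region name are assumptions:

```java
import org.apache.geode.cache.EntryEvent;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.util.CacheListenerAdapter;

public class ServerReadyWatcher {
    public static void main(String[] args) {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)    // assumed locator address
                .setPoolSubscriptionEnabled(true)      // required to receive server events
                .create();

        Region<String, String> serverReady = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                .addCacheListener(new CacheListenerAdapter<String, String>() {
                    @Override
                    public void afterCreate(EntryEvent<String, String> event) {
                        // Fires when the server puts the "ready" entry.
                        System.out.println("Server finished pre-heating: " + event.getKey());
                        // ...unblock the client's request threads here...
                    }
                })
                .create("server-ready");

        serverReady.registerInterest("ALL_KEYS"); // subscribe to all keys in /server-ready
    }
}
```

On the server side, a single put into /server-ready after the database load completes would trigger this callback on every subscribed client.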

Related

Apache Ignite's Continuous Queries event handler group & sequencing

We are trying to use the Continuous Query feature of Ignite, but we are facing an issue handling the resulting events. Below is our problem statement:
We have defined a Continuous Query with a remote filter for a cache and shared the filter definition with the Thick Client.
We are running multiple replicas of the "Thin Client" in a Kubernetes cluster.
Now the problem is that each instance of the "Thin Client" running in the k8s cluster has registered the remote filter, and each instance receives the event and tries to process the data in parallel. This results in duplicate processing, or even data being overwritten in my store.
Is there any way to form a consumer group and ensure that only one instance of the "Thin Client" receives the notification and processes the data?
My Thick Client and Thin Clients are in .NET.
I couldn't find any details in the Ignite documentation:
https://ignite.apache.org/docs/latest/key-value-api/continuous-queries
Here each thin client is starting its own continuous query, and thereby, by design, each thin client gets its own copy of every event to consume. If you want to route an event to a specific client, then you would need to start only one continuous query and distribute its events to your app instances as you see fit.
Take a look at Ignite messaging to see whether it fits your use case.
Also check out the distributed Queue/Set, which have unique delivery guarantees.
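A minimal sketch of the queue-based option (the asker's clients are in .NET, but Ignite's Java and .NET APIs mirror each other closely; the cache and queue names here are illustrative assumptions): one designated node runs the single continuous query and enqueues events into an IgniteQueue, and every worker replica takes from the queue, so each event reaches exactly one consumer.

```java
import javax.cache.event.CacheEntryEvent;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteQueue;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.configuration.CollectionConfiguration;

public class SingleConsumerCq {
    public static void main(String[] args) throws Exception {
        Ignite ignite = Ignition.start();
        IgniteCache<Integer, String> cache = ignite.getOrCreateCache("data");

        // Distributed queue shared by all instances (capacity 0 = unbounded).
        IgniteQueue<String> events =
                ignite.queue("cq-events", 0, new CollectionConfiguration());

        // Run this part on ONE designated node only: a single continuous
        // query that feeds the queue instead of processing events itself.
        ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();
        qry.setLocalListener(evts -> {
            for (CacheEntryEvent<? extends Integer, ? extends String> e : evts)
                events.put(e.getValue());
        });
        cache.query(qry); // keep the returned cursor open for the app's lifetime

        // Every worker replica runs this loop; take() hands each element to
        // exactly one consumer, giving consumer-group-like semantics.
        while (true) {
            String value = events.take(); // blocks until an element is available
            System.out.println("processing " + value); // application logic here
        }
    }
}
```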

Controlling Gemfire cache updates in the background

I will be implementing a Java program that acts as a GemFire client. The program will continuously process records that it receives on its port from a remote program. Each record will be processed using the static data cached in my program. The cache may get updated behind the scenes in my program when it is changed on the GemFire server. Processing one record may take a few seconds, so I run the risk of processing half the record with the static data that was in effect before the change and the rest of the record with the static data that took effect after it. Is there a way I can tell GemFire not to apply updates to the local client cache until I am done processing the ongoing record?
Regards,
Yash
Consider this approach: use a continuous query ("SELECT *") instead of interest registration. A CQ does not update the client region the way a subscription does. Make your client region LOCAL. After receiving the CQ event on the client, execute your long-running process and then put the value that you received from the CQ into your client region. Decoupling client and server in this way allows your client to run long-running processes.
Alternatively: if you must have the client cache proxied with the server as an absolute requirement, then keep the interest registration AND register a CQ. Ignore the event callback from the subscription but handle your long-running process using the event callback from the CQ.
The following is from page 683 at http://gemfire.docs.pivotal.io/pdf/pivotal-gemfire-ug.pdf
CQs do not update the client region. This is in contrast to other server-to-client messaging like the updates sent to satisfy interest registration and responses to get requests from the client's Pool.
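A hedged Java sketch of the first approach (LOCAL client region plus a CQ). The region name, query, and locator address are assumptions, and the lock here just stands in for whatever guard your record-processing loop uses:

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.query.CqAttributesFactory;
import org.apache.geode.cache.query.CqEvent;
import org.apache.geode.cache.query.CqQuery;
import org.apache.geode.cache.query.QueryService;
import org.apache.geode.cache.util.CqListenerAdapter;

public class DecoupledStaticData {
    private static final Object LOCK = new Object();

    public static void main(String[] args) throws Exception {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)  // assumed locator address
                .setPoolSubscriptionEnabled(true)
                .create();

        // LOCAL region: the server never updates it behind the scenes.
        Region<String, Object> staticData = cache
                .<String, Object>createClientRegionFactory(ClientRegionShortcut.LOCAL)
                .create("StaticData");

        CqAttributesFactory caf = new CqAttributesFactory();
        caf.addCqListener(new CqListenerAdapter() {
            @Override
            public void onEvent(CqEvent event) {
                // Apply the server-side change only between records.
                synchronized (LOCK) {
                    staticData.put((String) event.getKey(), event.getNewValue());
                }
            }
        });

        QueryService qs = cache.getQueryService();
        CqQuery cq = qs.newCq("staticDataCq",
                "SELECT * FROM /StaticData", caf.create());
        cq.execute(); // server-side region changes now arrive as CQ events
    }

    // The record-processing loop reads under the same lock, so one record
    // always sees a single consistent snapshot of the static data.
    static void processRecord(Region<String, Object> staticData, byte[] record) {
        synchronized (LOCK) {
            // ... use staticData to process the record ...
        }
    }
}
```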

Redis Stale Data

I'm new to Redis. I'm designing a pub/sub mechanism in which there's a specific channel for every client (business client) that has at least one user (browser) connected. Those users then receive information about the client to which they belong.
I need Redis because I have a distributed system: there is a backend which pushes data to the corresponding client channels, and there is a webapp which has its own server (multiple instances) that holds the users' connections (websockets).
To summarize:
The backend is the publisher and webapp server is the subscriber
A Client has multiple Users
One channel per Client with at least 1 User connected
If Client doesn't have connected Users, then no channel exists
Backend pushes data to every existing Client channel
Webapp Server consumes data only from the Client channels that correspond to the Users connected to it.
So, in order to reduce work, my Backend shouldn't push data for Clients that have no Users connected. It seems that I need a way to share the list of connected Users from my Webapp to my Backend, so that the Backend can decide which Clients' data to push to Redis. The obvious place to share that piece of data would be the same Redis instance.
My approach is to have a key in Redis with something like this:
[USERS: User1/ClientA/WebappServer1, User2/ClientB/WebappServer1,
User3/ClientA/WebappServer2]
So here comes my question...
How can I overcome stale data if, for example, one of my Webapp nodes crashes and doesn't get the chance to remove its list of connected Users from Redis?
Thanks a lot!
Firstly, good luck with the overall project - sounds challenging and fun :)
I'd use a slightly different design to keep track of the users: have each Client/Webapp maintain a set (possibly sorted, with login time as the score) of its users. Set a TTL on the set and have the client/webapp reset it periodically, so it expires if the owning process crashes.
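A small sketch of that heartbeat idea with the Jedis Java client; key name, TTL, and refresh period are assumptions, and production code would use a JedisPool rather than sharing one connection across threads:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class ConnectedUsersRegistry {
    private static final String KEY = "users:webapp-server-1"; // one set per webapp node
    private static final int TTL_SECONDS = 30;

    public static void main(String[] args) {
        Jedis jedis = new Jedis("localhost", 6379);

        // Sorted set of this node's connected users, scored by login time.
        jedis.zadd(KEY, System.currentTimeMillis(), "User1/ClientA");
        jedis.zadd(KEY, System.currentTimeMillis(), "User2/ClientB");
        jedis.expire(KEY, TTL_SECONDS);

        // Heartbeat: keep resetting the TTL while this node is alive. If the
        // node crashes, the key expires and the stale user list disappears,
        // so the backend never sees users of a dead webapp instance.
        ScheduledExecutorService heartbeat =
                Executors.newSingleThreadScheduledExecutor();
        heartbeat.scheduleAtFixedRate(
                () -> jedis.expire(KEY, TTL_SECONDS),
                TTL_SECONDS / 3, TTL_SECONDS / 3, TimeUnit.SECONDS);
    }
}
```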

How does a load-balanced server work?

Thanks for taking the time to read my questions.
I have some basic doubts about load-balanced servers.
I assume that one application is hosted on two servers, and when one server is heavily loaded, the load balancer switches the responsibility of handling a particular request to the other server.
That is how I understand the load balancer.
What manages and monitors the load, and performs all the transfers of requests?
How are static variables handled during processing? For example, I have a variable called 'totalNumberOfClicks' which is incremented whenever we hit the page.
If a GET request is handled by one server, should its POST also be handled by that same server? For example: a user requests a page for editing; the ASP.NET runtime creates a set of viewstate (control IDs and their values) that is maintained on the server and client side. When we hit the post button, the server validates the viewstate and does the rest of the processing.
If the POST gets transferred to another server, how can that server's runtime process it?
If you are using the load balancing built into Windows, then there are several options for how the load is distributed. The servers keep in communication with each other and organise the load between themselves.
The most scalable option is to evenly balance the requests across all of the servers. This means that each request could end up being processed by a different server so a common practice is to use "sticky sessions". These are tied to the user's IP address, and make sure that all requests from the same user go to the same server.
There is no way to share static variables across multiple servers so you will need to store the value in a database or on another server.
If you use an out-of-process session state store (such as StateServer or SQL Server), then you can process any request on any server. Viewstate allows the server to recreate most of the data needed to regenerate the page.
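To make the static-variable point concrete, a hedged sketch (in Java, with Redis as the shared store; the key name is an assumption) of keeping a page-hit counter outside any single server's process:

```java
import redis.clients.jedis.Jedis;

public class ClickCounter {
    public static void main(String[] args) {
        // INCR is atomic on the Redis server, so any number of load-balanced
        // web servers can bump the same counter without losing updates.
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            long total = jedis.incr("totalNumberOfClicks");
            System.out.println("Clicks so far: " + total);
        }
    }
}
```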
I have some answers for you.
When it comes to web applications, load balancers need to provide what is called session stickiness. That means that once a server is elected to serve a client's request, all subsequent requests will be directed to the same node as long as the session is active. Of course this is not necessary if your web application does not rely on any state that has to be preserved (i.e., it is stateless/sessionless).
I think this answers your third and maybe even your second question.
Your first question is about how load balancers work internally. Since I am not an expert in that, I can only guess that the load balancer each client talks to measures ping response times to estimate the load on each server. More sophisticated techniques could certainly be used.

Is it possible to have asynchronous processing

I have a requirement where I need to send continuous updates to my clients. The client is a browser in this case. We have some data which updates every second, so once a client connects to our server, we maintain a persistent connection and keep pushing data to the client.
I am looking for suggestions for the implementation at the server end. Basically what I need is this:
1. A client connects to the server. I maintain the socket and metadata about the socket. The metadata describes what updates need to be sent to this client.
2. The server process then waits for new client connections.
3. One other process has the list of all the open sockets and goes through each of them, sending updates where required.
Can we do something like this in an Apache module:
1. An Apache process gets a new connection. It maintains the state for the connection, keeps that state in some global memory, and returns to the root process to signal that it is done, so that the root process can accept a new connection.
2. The Apache process, though it has returned its status to the root process, keeps executing in parallel: it goes through its global store and sends updates to the clients, if any.
So, can an Apache process do these things:
1. Have more than one connection associated with it?
2. Asynchronously wait for new connections while at the same time processing the previous connections?
This is a complicated and inefficient model of updating. Your server will try to update clients that have shut down, and it has to maintain all that client data and metadata (last update time, etc.).
Usually, for continuous updates, Ajax is used in a polling model. The client has a JavaScript timer; when it fires, the client hits a service that provides updated data. The client keeps getting updates at regular intervals without you having to write an Apache module.
Would this model work for your scenario?
More reasons to opt for poll instead of push: Periodic_Refresh
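The answer above assumes a browser-side JavaScript timer; as a self-contained illustration in Java (keeping this document's examples in one language), the same poll loop with Java 11's built-in HttpClient might look like this. The URL and one-second interval are assumptions:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PollingClient {
    public static void main(String[] args) {
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest req = HttpRequest.newBuilder(
                URI.create("http://localhost/updates")).GET().build();

        // Fire every second, like the JavaScript timer in the answer.
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            try {
                HttpResponse<String> resp =
                        http.send(req, HttpResponse.BodyHandlers.ofString());
                System.out.println("update: " + resp.body());
            } catch (Exception e) {
                e.printStackTrace(); // server unreachable; try again next tick
            }
        }, 0, 1, TimeUnit.SECONDS);
    }
}
```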
With a little patch to resume a SUSPENDED mpm_event connection, I've got an asynchronous Apache module working. With this you can do improved polling:
javascript connects to Apache and asks for an update;
if there are no updates, then instead of answering immediately the module uses SUSPENDED;
some time later, after an update or a timeout happens, a callback fires;
the callback gives an update (or a "no updates" message) to the client and resumes the connection;
the client goes to step 1, repeating the poll, which with Keep-Alive will reuse the same connection.
That way the number of round trips between the client and the server is reduced, and the client receives updates immediately. (This is known as Comet, or Reverse Ajax, AFAIK.)
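The same suspend/resume pattern can be sketched outside Apache, for instance with Java's Servlet 3.0 async API. This is an analogy I'm supplying, not the patched module the answer describes; the URL and timeout are assumptions:

```java
import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/updates", asyncSupported = true)
public class LongPollServlet extends HttpServlet {
    // Connections parked until an update arrives (the "SUSPENDED" state).
    private static final Queue<AsyncContext> waiters = new ConcurrentLinkedQueue<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync(); // suspend: no immediate answer
        ctx.setTimeout(30_000);              // a real version would also register an
        waiters.add(ctx);                    // AsyncListener to send "no updates" on timeout
    }

    // Called by whatever produces updates (e.g. a background thread).
    public static void publish(String update) {
        AsyncContext ctx;
        while ((ctx = waiters.poll()) != null) {
            try {
                ctx.getResponse().getWriter().write(update); // answer the poll
            } catch (IOException ignored) {
            }
            ctx.complete(); // resume and finish the request; client re-polls
        }
    }
}
```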