Is it possible to lose events when using long-polling to retrieve real time notifications? - notifications

When subscribing to real-time notifications, I go through the normal handshake, subscribe, connect flow.
Once the connection returns with events, I reconnect and wait for the next response to return. My question is:
If events are generated the first response and the next reconnect, could they be lost?
As an example: A synchronous application which processes returned response data after it returns and only reconnects once the data processing has finished could cause a significant delay between the response and the next reconnect. Are the cumulocity events generated during that delay buffered in the real-time queue for that particular client id or are they just lost?
Another possible example is when the client ID is no longer valid (this seems to happen every day at midnight), I have to resubscribe, causing a period of time during which no one is subscribed.

The client ID that you receive when handshaking is connected to a queue on the server side. That queue keeps all notifications that you are not able to receive until the next connect. It delivers them when you reconnect. (Try it out with Postman: After a connect returns, send a couple of events, then connect again. You will notice that you will get all events at once.)
However, as you noticed, the queue is not kept forever. If you are not able to reconnect within two hours (I believe), the queue is thrown away in order to not block server resources. This is what you noticed. In that case, you need to query the database to determine any missed events (e.g., poll any operations in pending state from devices).

Related

RabbitMQ pause a queue

I am using a RabbitMQ Server (v3.8.9) with Java clients.
Use case is:
Our Backend creates messages for different clients. We send them out to their respective Endpoints.
1 Producer -> Outbound Queue -> 1 Consumer
The producer creates messages for n clients
Which the consumer should send out to the clients' endpoints
Messages must be kept in the correct order regarding each client
Works fine, unless all clients are up and running. Problem: If one client becomes unavailable, we need to have a bulletproof retry mechanism for that.
Say:
Wait 1 Minute and try again
All following messages must NOT be delivered before the first failed one and kept in the correct order
If a retry works, then ALL other messages should be send to the client immediately
As you can see, it is not a solution to just "supsend" the consumer, because it should still deliver msg to the other (alive) clients. Due to application limitations and a dynamic number of clients, we cannot spawn one consumer per client queue.
My best approach right now is to dynamically create one queue per client, which are then routed to a single outbound queue. If one msg to one client cannot be delivered by the consumer, I would like to "pause" the clients queue for x minutes. An API call like "queue_pause('client_q1', '5 Minutes')" would help. But even then I have to deal with the other, already routed messages to that particular client and keep them in the correct order...
Any better ideas?
I think the key here is that a single consumer script can consume from multiple queues. So if I'm understanding correctly, you could model this as:
Each client has its own queue. These could be created by the consumer script when it starts up, or by a back-end process when a new client is created.
The consumer script subscribes to each queue separately
When a message is received, the consumer tries to send it immediately to the client; if it succeeds, it is manually acknowledged with basic.ack, and the consumer is ready to send the next message to that client.
When a message cannot be delivered to the client, it is requeued (basic.nack or basic.reject with requeue=1), retaining its position in the client's queue.
The consumer then needs to pause consuming from that particular queue. Depending on how its written, that could be as simple as a sleep in that particular thread, but if that's not practical, you can effectively "pause" the subscription to the queue:
Cancel the subscription to that queue, leaving other subscriptions in tact
Store the queue name and the retry time in an appropriate variable
If the consumer script is implemented with an event/polling loop, check the list of "paused" subscriptions each time around that loop; if the retry time has been reached, re-subscribe.
Alternatively, if the library / framework supports it, register a delayed event that will fire at the appropriate time and re-subscribe the queue. The exact mechanics of this depend on the technologies you're using.
All the other subscriptions will continue, so messages to other clients will be delivered. The queue with no subscribers will retain the messages for the offline client in order until the consumer script starts consuming them again.

managing lock on message in RabbitMQ

I'm trying to use RabbitMQ in a more unconventional way (though at this point i can pick any other message queue implementation if needed)
I have one queue (I can have more if needed) that where customers are fetching N messages asynchronous. After they do their work I send the results from the client to the db.
I have two problems: first I don't want that they will work on the same message, second I want to grantee that I wont lose messages in case that my customer will close the browser or just stop working.
I looked at the documentation and saw the TTL which was perfect for me if I could alter that message that got timeout isn't going to be deleted but to move to another queue. can't find a way to alter this.
Moreover I looked at the confirmation option which in the first glance looked what I wanted,that mechanism is working like this: when the consumer gets a message he send confirmation to queue, I thought I can delay this confirm and send it when the work is done on the client side.
my problem was that I can't program the queue that if any message didn't get confirm then return it to the queue (or to another).
I also find how to do a scheduled message but it didn't help either because I don't want that the message will be inserted to the queue in five min,I want that when a customer will receive a message it will be locked in the queue for 5 min until confirm to delete is set otherwise return it to the queue.
Can I do temporary queue that enables my mechanism?
If someone can help with one of the problems or suggest another architecture or option to do it in another MQ it would be great.
Resources:
confirmation:
http://www.rabbitmq.com/blog/2011/02/10/introducing-publisher-confirms/
post about locks but his problem was a batcher component:
Locks and batch fetch messages with RabbitMq
TTL:
https://www.rabbitmq.com/ttl.html
Schedule a message:
https://www.rabbitmq.com/blog/2015/04/16/scheduling-messages-with-rabbitmq/
my problem was that I can't program the queue that if any message
didnt get confirm then return it to the queue (or to another).
RabbitMQ does this anyhow, so all you have to do is switch off the auto-ack flag, you figured this out
I thought I can delay this confirm and send it when the work is done
on the client side.
so just send the ACK once you've finished with processing the message.
All the unacknowledged messages remain in the queue and are re-delivered to next consumer (or the same one when it's up again, depending on your setup)

RabbitMQ consumer crash and consumer-count

If a consumer of a RabbitMQ crashes, with no graceful disconnection, will a subsequent declare-ok request fired several milliseconds later report a diminished consumer-count? Or is there an amount of time that needs to pass before the reported number will change?
declare-ok count all known consumers regardless their actual state.
See, in fact, some time after connection get dangled it still marked as alive (exact time depends on OS settings and whether do you use heartbeats and whether there are any network operation over that connection). In RabbitMQ management panel you may see connection and it channels with consumer tags listed some time after connection died.

How to specify another timeout queue for NSB?

I am using NSB 4.4.2
I want to have something like heartbeats on my saga to show processing statistics.
When i request a timeout it sends to sagas input queue.
In case of many messages prior to this timeout message, IHandleTimeouts may not be fired at specific time.
Is it a bug? Or how can i use separate queue for timeout messages?
Thanks
You are correct - when a timeout is ready to be dispatched, it is sent to the incoming queue of the endpoint, and if there are already many other messages in there, it will have to wait its turn to be processed.
Another thing you might want to consider, is that the endpoint may be down at that time.
If you want to guarantee that your saga code will be invoked at (or very close to) the time of the timeout, you'll need to set up a high availability deployment first. Then, you should look at setting the SLA required of that endpoint - how quickly messages should be processed, and then monitor the time to breach SLA performance counter.
See here for more information: http://docs.particular.net/nservicebus/monitoring-nservicebus-endpoints
You should be prepared to scale out your endpoint as needed to guarantee enough processing power to keep up with the load coming in.
NOTE: The reason we use the same incoming queue for processing these timeouts is by design. A timeout message is almost always the same priority or lower than the other business messages being processed by a saga. As such, it doesn't make sense to have them cut ahead of other messages in line.
Timeouts are sent to the [endpointname].timeouts

Setting a long timeout for RabbitMQ ack message

I was wondering if this is possible. I want to pull a task from a queue and have some work that could potentially take anywhere from 3 seconds or longer (possibly) minutes before an ack is sent back to RabbitMQ notifying that the work has been completed. The work is done by a user, hence this is why the time it takes to process the job varies.
I don't want to ack the message immediately after I pop off the queue because I want the message to be requeued if no ack is received. Can anyone give me any insights into how to solve my problem?
Having a long timeout should be fine, and certainly as you say you want redelivery if something goes wrong, so you want to only ack after you finish.
The best way to achieve that, IMO, would be to have multiple consumers on the queue (i.e. multiple threads/processes consuming from the same queue). That should be fine as long as there's no particular ordering constraint on your queue contents (i.e. the way there might be if the queue were to contain contents representing Postgres data that involves FK constraints).
This tutorial on the RabbitMQ website provides more info (Python linked, but there should be similar tutorials for other languages): https://www.rabbitmq.com/tutorials/tutorial-two-python.html
Edit in response to comment from OP:
What's your heartbeat set to? If your worker doesn't acknowledge the heartbeat within the set period of time, the server will consider the connection to be dead.
Not sure which language you're using, but for Java you would use the setRequestedHeartbeat method to specify the heartbeat.
The way you implement your workers, it's vital that the heartbeat can still be sent back to the RabbitMQ server. If something blocks the client from sending the heartbeat, the server will kill the connection after the time interval expires.