What is the frequency threshold for Google Cloud Messaging throttling? - google-cloud-messaging

As https://developer.android.com/google/gcm/adv.html#throttling indicates:
To prevent abuse (such as sending a flood of messages to a device) and to optimize for the overall network efficiency and battery life of devices, GCM implements throttling of messages using a token bucket scheme. Messages are throttled on a per application and per collapse key basis (including non-collapsible messages). Each application collapse key is granted some initial tokens, and new tokens are granted periodically thereafter. Each token is valid for a single message sent to the device. If an application collapse key exhausts its supply of available tokens, new messages are buffered in a pending queue until new tokens become available at the time of the periodic grant.
Is there any exact threshold, or related data that implies a proper sending frequency?

Related

Sabre API - ReachedTALimit issue

I keep getting this response whenever I try to call SessionCreateRQ
<soap-env:Fault>
<faultcode>soap-env:Client.ReachedTALimit</faultcode>
<faultstring>You have reached the limit of Host TAs allocated to you</faultstring>
<detail>
<StackTrace>com.sabre.universalservices.base.exception.ApplicationICEException: errors.authentication.USG_RESOURCE_UNAVAILABLE</StackTrace>
</detail>
</soap-env:Fault>
How can I keep track of opened sessions? Is there a way to terminate unused active session tokens if I don't hold those tokens?
The EPR you use in SessionCreateRQ is associated with a pool of connections (similar in concept to a database connection pool). Sabre support can advise you of the maximum size of that pool. When you have the maximum number of concurrent sessions active, calling SessionCreateRQ will return the error you are getting.
SessionCloseRQ will release a connection back to the TA pool; otherwise, connections are automatically released after 15 minutes of inactivity. If you are sharing the same pool with other EPRs (or the same EPR in different applications) and you don't have access to those session tokens, there's not much you can do to free up connections in your TA pool other than wait for those sessions to either close (via the other application calling SessionCloseRQ) or time out.
There are a few ways to keep track of open sessions, along the lines of connection pooling. I've seen a database table used for this purpose: a SessionCreateRQ wrapper service checks whether there are any existing unused tokens in the table. If so, an existing token is returned; otherwise the Sabre SessionCreateRQ service is called to create a new token, which is then inserted into the table. A SessionCloseRQ wrapper service marks the token as free in the table, without calling the underlying Sabre SessionCloseRQ service. That's the high-level concept; there are other implementation details to consider, such as Sabre transactions that might be associated with sessions if you are going to reuse them, and handling free tokens that have timed out after 15 minutes and need to be removed from the table. Having that table then gives you visibility of all the session tokens you have, in use or free, and lets you manage the size of the connection pool.
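To make that concrete, here is a rough sketch of the wrapper idea in TypeScript. Everything named here is hypothetical: the Db and SabreClient interfaces stand in for your own database layer and SOAP client, and only the check-out/check-in pattern follows the description above.

interface PooledToken {
  token: string;     // the Sabre session token
  inUse: boolean;    // currently checked out by an application?
  createdAt: number; // used to expire tokens idle for ~15 minutes
}

// Placeholders for your own persistence layer and SOAP client.
interface Db {
  findFreeToken(): Promise<PooledToken | null>;
  markInUse(token: string): Promise<void>;
  markFree(token: string): Promise<void>;
  insertToken(t: PooledToken): Promise<void>;
}

interface SabreClient {
  sessionCreate(): Promise<string>; // wraps the real SessionCreateRQ SOAP call
}

const SESSION_TIMEOUT_MS = 15 * 60 * 1000; // Sabre closes idle sessions after 15 min

// "SessionCreateRQ wrapper": reuse a free token if one exists, else create a new one.
async function acquireSession(db: Db, sabre: SabreClient): Promise<string> {
  const free = await db.findFreeToken();
  if (free && Date.now() - free.createdAt < SESSION_TIMEOUT_MS) {
    await db.markInUse(free.token);
    return free.token;
  }
  // No reusable token (stale tokens would also need purging; omitted here).
  const token = await sabre.sessionCreate();
  await db.insertToken({ token, inUse: true, createdAt: Date.now() });
  return token;
}

// "SessionCloseRQ wrapper": mark the token free instead of actually closing it,
// so the underlying Sabre session stays warm for the next caller.
async function releaseSession(db: Db, token: string): Promise<void> {
  await db.markFree(token);
}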
You have reached the maximum number of open sessions for your credential.
You must close unused sessions to get back under the operational limit of open sessions.
To avoid this type of situation, create a session manager, or manage the opening and closing of each session within a workflow.
If you are having this issue with the BargainFinderMaxRQ or AdvancedAirshoppingRQ services, then I suggest using the TokenCreateRQ service for flight availability.
Tokens from TokenCreateRQ are managed by Sabre itself, so your SessionCreateRQ sessions remain free for booking creation, ticket issuance, and so on.

Send FCM message to a topic chunk by chunk

We have a website with two million users. When we have new events on the website, we send an FCM notification to our users' mobile app. But the website does not have enough resources to serve lots of users at once.
Can we send FCM messages to a topic chunk by chunk, or deliberately decrease the fanout rate and put a delay between each fanout?
What is your suggestion?
There is no way to control the fanout rate of topics in Firebase Cloud Messaging.
The only options I can think of are to:
Create a number of more specific topics (e.g. topic-001, topic-002, ... topic-100), subscribe each client to one of the topics randomly (a form of sharding), and then send a message to each topic in turn with a delay in between (sketched below).
Use a data-only message, and delay the display in your application code by a random amount.
Stop using topics and deliver straight to FCM tokens in your code, so that you fully control when each individual message gets sent.
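A minimal sketch of the first option, assuming the firebase-admin Node.js SDK; the shard count, the topic naming scheme (topic-001 ... topic-100), and the delay are arbitrary choices to tune against your site's capacity.

import * as admin from "firebase-admin";

admin.initializeApp(); // assumes default credentials are configured

const SHARDS = 100;      // clients were subscribed randomly to topic-001 ... topic-100
const DELAY_MS = 30_000; // pause between shards to spread the incoming traffic

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function fanOutInChunks(title: string, body: string): Promise<void> {
  for (let i = 1; i <= SHARDS; i++) {
    const topic = `topic-${String(i).padStart(3, "0")}`;
    await admin.messaging().send({ topic, notification: { title, body } });
    await sleep(DELAY_MS); // each shard's users arrive DELAY_MS apart
  }
}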

FCM Maximum message rate to a single device

Per https://firebase.google.com/docs/cloud-messaging/concept-options#device_throttling, it says:
You can send up to 240 messages/minute and 5,000 messages/hour to a single device. This high threshold is meant to allow for short term bursts of traffic, such as when users are interacting rapidly over chat. This limit prevents errors in sending logic from inadvertently draining the battery on a device.
Does this mean a device can only receive 240 messages per minute in total?
Or does it mean it can receive 240 messages per minute from a particular sender?
Say, could 4 other devices each send 240 messages to one device?
This is a project-wide limit on the number of downstream messages that can be sent to a single device.
Sending messages to a device should only happen from a trusted environment (your development machine, the Firebase console, a server you control, Cloud Functions). There is no ability to send downstream messages with the Firebase client-side SDKs.
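For illustration, a downstream send from a trusted environment using the firebase-admin Node.js SDK might look like the sketch below; notifyDevice and deviceToken are hypothetical names, and each such call counts against the per-device limits quoted above.

import * as admin from "firebase-admin";

admin.initializeApp(); // runs on your server or in Cloud Functions, never in the client app

// Sends one downstream message to a single registration token.
async function notifyDevice(deviceToken: string): Promise<void> {
  await admin.messaging().send({
    token: deviceToken,
    notification: { title: "Hello", body: "Sent from a trusted server" },
  });
}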

Can RabbitMQ (or similar message queuing system) be used to single thread requests per user?

The issue is we have some modern web applications that are integrated with a legacy system that was never designed to support multiple concurrent requests from a single user. Basically there are certain types of requests that the legacy system can only handle one-at-a-time from a single user. It can handle multiple concurrent requests coming from different users, but for technical reasons cannot handle multiple from a single user. In these situations, the user's first request will complete successfully, but any subsequent requests from that same user that come in while the first request is still executing will fail.
Because our apps are ajax-enabled and multi-tab/multi-browser friendly, and because there are multiple apps, there are certain scenarios where a user could wind up having more than one of these types of requests being sent to the legacy system at the same time.
I'm trying to determine if something like RabbitMQ could be positioned in front of the legacy system and leveraged to single-thread requests per user/IP. The thinking being that the web apps would send all requests to MQ, and they'd stack into per-user queues and pass on to the legacy system one at a time.
I don't know if there would be concerns about the potential number of queues this could create - we have a user-base of approx 4,000.
And I know we could somewhat address this in the web apps individually, but since there are multiple apps it'd be duplicating logic across them, and you'd still have the potential for two different apps to fire off concurrent requests.
Any feedback would be appreciated. Thanks-
I'm not sure a unique queue per user will work, as each queue would need a backend worker process listening for its messages, and those workers would have to be created dynamically.
Below is one option, but it has a potential performance bottleneck, since a single backend process would handle all requests sequentially. You could use multiple worker processes, but then you wouldn't know whether one had completed before another, causing a race condition if your app requires a specific sequence of actions.
You could simply put all transactions (from all users) into a single queue and have a backend process pull from that queue and service each request. If a response needs to go back to the user once the request is serviced, the worker process can reply on a separate queue with a correlationId that is used to send the response data back to the correct user.
I've done this before with ExpressJS apps where the following flow would happen:
The user/process/ajax makes a request
Express takes the payload from the request object and sends it to a RabbitMQ queue with a unique correlationId (e.g. UUID).
Express then takes the response object and stores it in a responseStore object with the key being the correlationId
Meanwhile, a backend worker process pulls the item from the queue, does some work and then sends a message to a different response queue with the same correlationId
The ExpressJS application has a connection to the response queue, and when it receives a message, it takes the correlationId from the response and looks for a response object stored with the same correlationId in the responseStore. If it finds it, it takes the payload from the message and does something like response.send(payload) or response.json(payload)
To do this, you should also have a mechanism that stores the creation time of the response object in the responseStore along with the response object. Then have a separate process that will check the responseStore and clean up old response objects after a certain timeout in case there are issues with the backend process completing.
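Here is a condensed sketch of that flow, assuming the amqplib client and Express; the queue names, timeout, and port are arbitrary, and error handling is omitted for brevity.

import express from "express";
import amqp from "amqplib";
import { randomUUID } from "crypto";

const REQUEST_QUEUE = "legacy.requests";
const RESPONSE_QUEUE = "legacy.responses";
const TIMEOUT_MS = 30_000;

// correlationId -> pending Express response (the "responseStore" described above)
const responseStore = new Map<string, { res: express.Response; createdAt: number }>();

async function main(): Promise<void> {
  const conn = await amqp.connect("amqp://localhost");
  const ch = await conn.createChannel();
  await ch.assertQueue(REQUEST_QUEUE);
  await ch.assertQueue(RESPONSE_QUEUE);

  // When the worker replies, find the original HTTP response by correlationId.
  await ch.consume(RESPONSE_QUEUE, (msg) => {
    if (!msg) return;
    const pending = responseStore.get(msg.properties.correlationId);
    if (pending) {
      responseStore.delete(msg.properties.correlationId);
      pending.res.json(JSON.parse(msg.content.toString()));
    }
    ch.ack(msg);
  });

  const app = express();
  app.use(express.json());

  app.post("/legacy", (req, res) => {
    const correlationId = randomUUID();
    responseStore.set(correlationId, { res, createdAt: Date.now() });
    ch.sendToQueue(REQUEST_QUEUE, Buffer.from(JSON.stringify(req.body)), {
      correlationId,
      replyTo: RESPONSE_QUEUE,
    });
  });

  // Clean up requests whose worker never replied (the timeout mechanism above).
  setInterval(() => {
    for (const [id, pending] of responseStore) {
      if (Date.now() - pending.createdAt > TIMEOUT_MS) {
        responseStore.delete(id);
        pending.res.status(504).send("legacy system timed out");
      }
    }
  }, 5_000);

  app.listen(3000);
}

main();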
Look here for more info on RPC with RabbitMQ:
https://www.rabbitmq.com/tutorials/tutorial-six-javascript.html
Hope this helps.

Message throttling in GCM / FCM push notification

I would like to know what is meant by message throttling in Google FCM push notifications. I am trying to implement a sample push notification using FCM, but I didn't understand the message throttling mentioned in their steps. I also couldn't find any documentation about it.
https://aerogear.org/docs/unifiedpush/aerogear-push-android/guides/#google-setup
Could someone clarify this term?
This copy of the GCM throttling documentation hosted at https://stuff.mit.edu explains it really well:
To prevent abuse (such as sending a flood of messages to a device) and to optimize for the overall network efficiency and battery life of devices, GCM implements throttling of messages using a token bucket scheme. Messages are throttled on a per application and per collapse key basis (including non-collapsible messages). Each application collapse key is granted some initial tokens, and new tokens are granted periodically thereafter. Each token is valid for a single message sent to the device. If an application collapse key exhausts its supply of available tokens, new messages are buffered in a pending queue until new tokens become available at the time of the periodic grant. Thus throttling in between periodic grant intervals may add to the latency of message delivery for an application collapse key that sends a large number of messages within a short period of time. Messages in the pending queue of an application collapse key may be delivered before the time of the next periodic grant, if they are piggybacked with messages belonging to a non-throttled category by GCM for network and battery efficiency reasons.
On a simpler note, you can think of throttling as a funnel that prevents an overflow of messages (normally for downstream messaging), regulating the inflow of messages to avoid flooding.
For example, if you send 1,000 messages to a single device (and suppose all of them are sent successfully), there is a chance that GCM will throttle your messages so that only a few actually push through, or so that each message is delivered but not simultaneously.
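To make the token bucket idea concrete, here is a toy model in TypeScript. The capacity and refill numbers are invented for illustration; as far as I know, Google has never published the real values, which is also why there is no exact threshold to give for the question at the top of this page.

// Toy model of per-application-collapse-key throttling.
class TokenBucket {
  private tokens: number;
  private readonly pending: string[] = []; // messages buffered while throttled

  constructor(
    private readonly capacity: number,       // initial/maximum tokens
    private readonly refillPerGrant: number, // tokens added at each periodic grant
  ) {
    this.tokens = capacity;
  }

  // Each message consumes one token; with none left it waits in the pending queue.
  send(message: string): "delivered" | "queued" {
    if (this.tokens > 0) {
      this.tokens--;
      return "delivered";
    }
    this.pending.push(message);
    return "queued";
  }

  // The periodic grant: add tokens, then drain as much of the pending queue as possible.
  grant(): string[] {
    this.tokens = Math.min(this.capacity, this.tokens + this.refillPerGrant);
    const released: string[] = [];
    while (this.tokens > 0 && this.pending.length > 0) {
      this.tokens--;
      released.push(this.pending.shift()!);
    }
    return released;
  }
}

For instance, new TokenBucket(100, 10) would deliver the first 100 messages immediately, queue everything after that, and release 10 queued messages at each periodic grant: exactly the "delivered but not simultaneously" behavior described above.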