Number of concurrent connections to GCM service - google-cloud-messaging

We are sending push notifications to Android devices via GCM API.
People are allowed to subscribe to different topics and receive alert every couple of days.
There is between 100_000 to 1_000_000 users subscribed for given topic, so we wanted to speed things up using more than ten connections.
We see answers with retry, so we retry after specified period of time as stated in the docs.
Can we get rid of retires by using more connections and sending the requests slower?
Or is the quota set for given API key and starting more connections will even hurt us?
EDIT:
We are using GCM HTTP interface. To be precise erlang-gcm library: https://github.com/pdincau/gcm-erlang We are sending message to 1M users. We are not sending to topic. We are performing multicast send to list of users. gcm-erlang library allows us to pass 1000 users per request (which is also the limit of GCM API). This means, we have to perform at least 1000 requests.
It takes something around 10 minutes to process all those 1000 requests, so we wanted to make them in parallel, but it doesn't make it faster. Here I've found information on throttling: https://stuff.mit.edu/afs/sipb/project/android/docs/google/gcm/adv.html#throttling
"Messages are throttled on a per application"
Does it mean, that even if this are messages to different users, we are still throttled, because they are using single API key for our mobile application?
Will the XMPP endpoint faster?

It is weird that parallelizing requests didn't make them faster. How come? Are you sure that the bottleneck is not on your side?
No, it doesn't look like you're throttled (you would receive errors if you were instead of waiting on line)
I still don't understand why topics don't work for you. They seem like a good match.
Anyway, if you want to send messages individually, I would highly recommend switching to XMPP. You will be able to send one hundred messages at a time per connection and open up to 1000 connection (but you're not gonna need that much really).

Related

Is it a good practice to create a channel for each user in redis message bus

We are using redis message bus and handling messages using a channel. But if our application is deployed in multiple instances then the request and response is passed to all the instances. To avoid this scenario which of the below approach is better?
Create a channel for each instance of the application
Create a channel for each user
Any suggestions will be highly appreciated
The limiting factor here is the number of subscribers to the same channel. Number of channels can be large as such. So you can choose the granularity accordingly. Read more here:
https://groups.google.com/forum/#!topic/redis-db/R09u__3Jzfk
All the complexity on the end is on the PUBLISH command, that performs
an amount of work that is proportional to:
a) The number of clients receiving the message.
b) The number of clients subscribed to a pattern, even if they'll not
match the message.
This means that if you have N clients subscribed to 100000 different
channels, everything will be super fast.
If you have instead 10000 clients subscribed to the same channel,
PUBLISH commands against this channel will be slow, and take maybe a
few milliseconds (not sure about the actual time taken). Since we have
to send the same message to everybody.
Similar question asked before : How does Redis PubSub subscribe mechanism works?

Push notifications emulate 100000 subscribers

I have made a service which sends push notifications to subscribers via GCM and APNS. For a small number of subscribers everything works fine, but I`d like to test it, say, for 100000 subscribers. In particular, I am interested how service workers act when trying to fetch a data for such amount of subscribers, what if my DB or server is not able to serve all of them in one second etc.
For now I don`t have such amount of subscribers, are there any way to emulate them for testing purposes?
Thank you.
Service Worker is a client-side JavaScript Worker, so the number of subscribers should be irrelevant. However, do note that when sending downstream messages, you can only have up to 1000 registration tokens at a time, so you'll have to repeat the batch by 100 (in this example). Once a message is sent to GCM, the subscribers' web client should be able to handle it separately.
You have to have valid registration_ids to test it out to that degree. That means, registering thousands of clients! It can still be done, though. There are a few ready-made scripts or sites where you can test GCM, but you'll still need to provide those registration_ids.

EWS: Streaming Notifications vs Push Notifcations

Microsoft Exchange introduced Streaming notifications as an alternative to pull/push notifications with Exchange 2010. Basic introduction on streaming can be found on this msdn article and blog
However, I cant figure out the actual advantage of streaming over push notifications.
The only advantage mentioned in the blog is "..and you don’t have to create a listener application as for the push notifications." Apart from that, are there any other advantages and disadvantages?
How do other factors like managing the subscription, re-subscription logic, scalability, max num of subscriptions, etc compare over push? Also, Streaming subscription has a maximum alive time of 30 mins and I would have to re-subscribe every 30 mins? Isnt that a disadvantage for a large number of subscriptions(my application has to manage 20K+ mailboxes)?
Any light on the comparison factors would be helpful.
The main reason for Streaming Notifications (SNs) is Exchange Online. You can't have EOL opening up an HTTP connection to an application potentially behind a firewall. Even within a corporate network there are firewall issues. I've had several cases where my app could not get Push Notifications (PNs) because of the firewall on its own server.
On the surface SNs would also seem to be more efficient because each notification is not opening up its own TCP connection, but rather they flow in on a single pipe. After doing some Wiresharking on this, I'm not really convinced this is so because it seems like under the covers they're doing Long Polling, so each notification coming in will cause a new HTTP call back up to Exchange.
The 30 minute max is no big deal, just reopen the connection in the handler and you're done--you don't have to actually re-subscribe. In fact, I am of the opinion that I want to lower that even more to say 3 minutes. You apparently cannot add new subscriptions or remove old ones except within the disconnect handler. (Try it, and you get errors.)
And yes, you don't have to code an HTTP handler, which is nice I guess.

ZMQ device queue does not load balance properly

I know that ZMQ offers all of the flexibility to do your own load-balancing. However I would expect the out-of-the-box broker, about 4 lines of code using the line
zmq_device (ZMQ_QUEUE, frontend, backend);
to load balance quite well as the documentation says it does load balance.
ZMQ_QUEUE creates a shared queue that collects requests from a set of clients, and distributes these fairly among a set of services. Requests are fair-queued from frontend connections and load-balanced between backend connections. Replies automatically return to the client that made the original request.
I have an army of back-end services and yet find that often my front-end clients have to wait several seconds for something that takes < 1/10 of a second in a 1:1 setting (there are same # of client and service machines). I suspect that ZMQ is not load-balancing properly out of the box - it's sending too many requests to the same service even though it doesn't have bandwidth, etc.
I think this is partly because the services are multithreaded in a way that lets them take up to 10 concurrent requests yet it slows down greatly at near the 10th request even though it can still accept them. Random distribution would be ideal. Is there an out-of-the-box way to do this or can it be done in a few lines of code, or do I have to write my own broker from scratch?
Fwiw issue was the workers were taking on work when they didn't have room for it, issue was not in ZMQ layer per se.

Redis publish/subscribe: see what channels are currently subscribed to

I am currently interested in seeing what channels are subscribed to in a Redis pub/sub application I have. When a client connects to our server, we register them to a channel that looks like:
user:user_id
The reason for this is I want to be able to see who's "online". I currently blindly fire off messages to a channel without knowing if a client is online since it's not critical that they receive these types of messages.
In an effort to make my application smarter, I'd like to be able to discover if a client is online or not using the pub/sub API, and if they are offline, cache their messages to a separate redis queue which I can push to them when they get back online.
This does not have to be 100% accurate, but the more accurate it is, the better. I'm assuming a generic key does not get created when a channel gets subscribed to, so I cannot do something as trivial as:
redis-cli keys user* to find all online users.
The other strategy I've thought of is just maintaining my own Redis Set whenever a user published or removes themselves from a channel (which the client automatically handles when they hop online and close the app). That would be an additional layer of complexity that I need to manage and I'm hoping there is a more trivial approach with the data that's already available.
As of Redis 2.8 you can do:
PUBSUB CHANNELS [pattern]
The PUBSUB CHANNELS command has O(N) complexity, where N is the number of active channels.
So in your case:
redis-cli PUBSUB CHANNELS user*
would give you want you want.
There is currently no command for showing what channels "exist" by way of being subscribed to, but there is and "approved" issue and a pull request that implements this.
https://github.com/antirez/redis/issues/221
https://github.com/antirez/redis/pull/412
Due to the nature of this call, it is not something that can scale, and is thus a "DEBUG" command.
There are a few other ways to solve your problem, however.
If you have reason to believe that a channel may be subscribed to, you can send it a message and look at the result. The result is the number of subscribers that got the message. If you got 0, you know that they're not there.
Assuming that your user_ids are incremental, you might be interested in using SETBIT to set a 1 or 0 to a user's offset bit to track presence. You can then do cool things like the new BITCOUNT to see how many users are online, and GETBIT to determine if a specific user is online.
The way I have solved your problem more specifically in the past is by signaling a subscription manager that I have subscribed to a channel. The manager then "pings" the channel by sending a blank message to confirm that there is a subscriber, and occasionally pings the channel thereafter to determine if the user is still online. Not ideal, but better than using DEBUG CHANNELS in production.
From version 2.8.0 redis has a pubsub command that would help in this case:
http://redis.io/commands/pubsub
Remark: currently the state of 2.8.0 is not stable yet (RC2)
I am unaware of any specific way to query what channels are being subscribed to, and you are correct that there isn't any key created when this happens. Also, I wouldn't use the KEYS command in production anyway, as it's really a debugging command.
You have the right idea about using a set to add the user when they're online, and then query this with SISMEMBER <set> <user_id> to determine if the messages should be sent to them or added to a Redis list for processing once they do come online.
You will need to figure out when a user logs off so you can remove them from the list of online users, but I don't know enough about your system to know exactly how you would go about that.
If the connected clients have the ability to send a message back to inform the server that the message(s) were consumed, you could use this to keep track of which messages should be stored for later retrieval.
Cheers,
Mike
* PUBSUB NUMSUB [channel-1 ... channel-N]
Returns the number of subscribers (not counting clients subscribed to patterns) for the specified channels.
https://redis.io/commands/pubsub