Is it possible to thread pool IMAP connections? - ssl

From what I understand IMAP requires a connection per each user. I'm writing an IMAP client (currently just gmail) that supports many (100s, 1000s maybe 10000s+) users at a time. Obviously cutting down the number of open connections would be great. I'm wondering if it's possible to use thread pooling on my side to connect to gmail via IMAP or if that simply isn't supported by the IMAP protocol.

IMAP typically uses SSL over TCP/IP. And a TCP/IP connection will need to be maintained per IMAP client connection, meaning that there will be many simultaneous open connections.
These multiple simultaneous connections can easily be maintained in a non-threaded (single thread) implementation without affecting the state of the TCP connections. You'll have to have some sort of a flow concept per IMAP TCP/IP connection, and store all of the flows in a container (a c++ STL map for instance) using the TCP/IP five-tuple (or socketFd) as a key. For each data packet received, lookup the flow and handle the packet accordingly. There is nothing about this approach that will affect the TCP nor IMAP connections.
Considering that this will work in a single-thread environment, adding a thread pool will only increase the throughput of the application, since you can handle data packets for several flows simultaneously (assuming its a multi-core CPU) You will just need to make sure that 2 threads dont handle data packets for the same flow at the same time, which could cause the packets to be handled out of order. An approach could be to have a group of flows per thread, maybe using IP pools or something similar.

Related

RabbitMQ as Message Broker used by Spring Websocket dies under load

I develop an application where we need to handle 160k concurrent users which are connected to the backend via a websocket connection.
We decided to use the spring websocket implementation and RabbitMQ as the message broker.
In our application every user needs to subscribe to its user queue /exchange/amq.direct/update as well as to another queue where also other users can potential subscribe to /topic/someUniqueName.
In our first performance test we did the naive approach where every user subscribes to two new queues.
When running the test RabbitMQ dies silently when around 800 users are connected at the same time, so around 1600 queues are active (See the graph of all RabbitMQ objects here).
I read though that you should be careful opening many connections to RabbitMQ.
Now I wonder if the approach that is anticipated by Spring Websocket with opening one queue per user is a conceptional problem for systems with high load or if there is another error in my system.
Limiting factors for RabbitMQ are usually:
memory (can be checked in dashboard) that needs to grow with number of messages and number of queues (if you don't use lazy queues that go directly to disk).
maximum number of file descriptors (at least 1 per connection) that often defaults to too low values on many distributions (ref: https://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2012-April/019615.html)
CPU for routing the messages
I did find the issue. I actually misconfigured the RabbitMQ service and just gave it a 1024 file descriptor limit. Increasing it solved the issue.

Efficient local transport for ActiveMQ broker

I have multiple applications (the producers) that produce messages to be processed by another application (the consumer). The messages will be sent through an ActiveMQ broker running on the same server. I don't have access to the applications' code, therefore the messages will be produced by executing a script (I currently don't know which language to use). The consumers will be Java application that will process the received messages.
I'm looking for an efficient transport that fits my use case. The VM transport cannot be used here. Also, I would like to avoid opening a TCP connection with the broker every time the producer script is executed (i.e. I would like to avoid using the TCP transport). I thought that UDP may be a good fit unless you know another transport which is more appropriate.
Thanks,
Mickael
There are pros and cons of both the TCP and UDP protocol
1)If ordering of messages and reliable-delivery of messages doesn't matter to you then UDP might be a nice choice,moreover in UDP it can also happen that duplicate messages are delievered to broker.
2)Using TCP offers reliable-delivery of messages along with ordering, but if you want to eliminate the stream Transport delay of TCP then you might think against it.
There are couple of others as well which you can retrospect based on your requirements
NIO protocol(USed in case of high traffic requirements)
HTTP protocol(In case you want to bypass firewalls)
Hope this helps!
Good luck!

IPC between server and many clients on Mac OS X

I have following scenario:
Server should be Daemon.
Other Apps should be clients.
Many clients should communicate with server to get their task done by server at a time.
These tasks are such as copyfile, deletefile etc.
My solution:
Server has 5 worker threads each containing named pipe. Each pipe's availability status is kept in Shared memory structure. When client wants to communicate with server, it checks which pipe is available from shared memory then opens that pipe & sends its message on that pipe, respective worker thread of server servers this client request. That worker thread sends request status (Success/failure) on that pipe so that client will become aware of last operation status.
As far as I know, pipes on Mac os x are unidirectional & they lack capability of creating unlimited instances like Windows.
What mechanism could be best suited for such kind of communication?
Thanks,
Vaibhav.
As far as I know, pipes on Mac os x are unidirectional & they lack capability of creating unlimited instances like Windows.
Pipes are one directional, but Unix sockets are not. This is probably what you are after if you want to directly port your code to OS X.
However, there are probably better ways to do what you want to do, including stuff like Distributed Objects which I admit I have never used. Even if you stick with a socket interface, I think one socket would be easier with a thread monitoring the socket and handing off work to worker threads as it arrives, using listen and accept. Better still, have an NSOperationQueue or a dispatch queue to put the work on, then the OS will handle the task of optimising the thread count.

Data broadcasting between instances of distributed server

I'm trying to get some feedback on the recommendations for a service 'roster' in my specific application. I have a server app that maintains persistant socket connections with clients. I want to further develop the server to support distributed instances. Server "A" would need to be able to broadcast data to the other online server instances. Same goes for all other active instances.
Options I am trying to research:
Redis / Zookeeper / Doozer - Each server instance would register itself to the configuration server, and all connected servers would receive configuration updates as it changes. What then?
Maintain end-to-end connections with each server instance and iterate over the list with each outgoing data?
Some custom UDP multicast, but I would need to roll my own added reliability on top of it.
Custom message broker - A service that runs and maintains a registry as each server connects and informs it. Maintains a connection with each server to accept data and re-broadcast it to the other servers.
Some reliable UDP multicast transport where each server instance just broadcasts directly and no roster is maintained.
Here are my concerns:
I would love to avoid relying on external apps, like zookeeper or doozer but I would use them obviously if its the best solution
With a custom message broker, I wouldnt want it to become a bottleneck is throughput. Which would mean I might have to also be able to run multiple message brokers and use a load balancer when scaling?
multicast doesnt require any external processes if I manage to roll my own, but otherwise I would need to maybe use ZMQ, which again puts me in the situation of depends.
I realize that I am also talking about message delivery, but it goes hand in hand with the solution I go with.
By the way, my server is written in Go. Any ideas on a best recommended way to maintain scalability?
* EDIT of goal *
What I am really asking is what is the best way to implement broadcasting data between instances of a distributed server given the following:
Each server instance maintains persistent TCP socket connections with its remote clients and passes messages between them.
Messages need to be able to be broadcasted to the other running instances so they can be delivered to relavant client connections.
Low latency is important because the messaging can be high speed.
Sequence and reliability is important.
* Updated Question Summary *
If you have multiple servers / multiple end points that need to pub/sub between each other, what is a recommended mode of communication between them? One or more message brokers to re-pub messages to a roster of the discovered servers? Reliable multicast directly from each server?
How do you connect multiple end points in a distributed system while keeping latency low, speed high, and delivery reliable?
Assuming all of your client-facing endpoints are on the same LAN (which they can be for the first reasonable step in scaling), reliable UDP multicast would allow you to send published messages directly from the publishing endpoint to any of the endpoints who have clients subscribed to the channel. This also satisfies the low-latency requirement much better than proxying data through a persistent storage layer.
Multicast groups
A central database (say, Redis) could track a map of multicast groups (IP:PORT) <--> channels.
When an endpoint receives a new client with a new channel to subscribe, it can ask the database for the channel's multicast address and join the multicast group.
Reliable UDP multicast
When an endpoint receives a published message for a channel, it sends the message to that channel's multicast socket.
Message packets will contain ordered identifiers per server per multicast group. If an endpoint receives a message without receiving the previous message from a server, it will send a "not acknowledged" message for any messages it missed back to the publishing server.
The publishing server tracks a list of recent messages, and resends NAK'd messages.
To handle the edge case of a server sending only one message and having it fail to reach a server, server can send a packet count to the multicast group over the lifetime of their NAK queue: "I've sent 24 messages", giving other servers a chance to NAK previous messages.
You might want to just implement PGM.
Persistent storage
If you do end up storing data long-term, storage services can join the multicast groups just like endpoints... but store the messages in a database instead of sending them to clients.

What are common UDP usecases?

Can anyone tell be where to use the UDP protocol except live streaming of music/video? What are default usecases for UDP?
UDP is also good for broadcast, such as service discovery - finding that newly plugged in printer.
Also of note is that broadcast is anonymous, you don't need to specify target hosts, as such it can form the foundation of a convenient plug-and-play or high-availability network.
UDP is stateless and is good for applications that have large numbers of clients connecting to a server such as time servers or DNS. The fact that no connection has to established and maintained reduces the memory required by the server. There is no handshaking involved and so this reduces the traffic on the network. On the downside, if the information transferred requires multiple packets there is no transmission control to ensure that all packets arrive and in the correct order - but in games packets lost are probably better than late or disordered.
Anything else where you need performance but can survive if a packet gets lost along the way. Multiplayer games come to mind, for example.
A very common use case is DNS, since the overhead of creating a TCP connection would by far outweight the actual payload.
Additional use cases are NTP (network time service) and most video games.
I use UDP to add chat capabilities to our applications. No need to create a server. It is also useful to dispatch events to all users of our applications.