Active connections on web farm - asp.net-core

I'm trying to build a simple chat with WebSockets. I'm also displaying the number of currently active users in the chat, and here is where the problems start: we use a web farm.
A user connects to a server through a load balancer. When a new connection hits a server, it increments a counter in a SQL database and notifies the other servers in the farm through RabbitMQ.
All the other servers fetch the new count and push it to their connected users.
If a user disconnects, the same thing happens: the server decrements the counter in the SQL database, and through RabbitMQ all the other servers find out about it.
But what happens when a server dies? Say 10 users are connected to it. When that server goes down, all of those users are disconnected, but the database is never updated to reflect that.
What's the best way to get the total number of active users in a web farm, and to notify the users when that number changes?
Thanks in advance!
Oh, by the way: we're using SignalR.

I think the typical way to deal with nodes asynchronously disconnecting from a mesh is to implement a heartbeat/keep-alive mechanism. In this case the heartbeat messages would be between servers, and there must also be an accessible record of which users are connected to which server. When a server does not produce a heartbeat for a period of time, the other servers can update their records and mark all the users associated with that server as disconnected.
It looks like you have a few options for keeping track of users (a SQL database, or every server listening for RabbitMQ messages). As for the heartbeat, you can implement it yourself, or see whether the load balancer's own failure-detection mechanism can be used.
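A minimal sketch of that heartbeat/reaper idea in Go, assuming a hypothetical Postgres-flavored schema with a servers table (id, last_seen) and a connected_users table (user_id, server_id); the table names, intervals, and driver are illustrative, not something from the question:

    package main

    import (
        "database/sql"
        "log"
        "time"

        _ "github.com/lib/pq" // Postgres driver; any database/sql driver works
    )

    const serverID = "web-01" // this node's identity in the farm

    // heartbeat periodically stamps this server's row so peers can see it is alive.
    func heartbeat(db *sql.DB) {
        for range time.Tick(5 * time.Second) {
            if _, err := db.Exec(`UPDATE servers SET last_seen = now() WHERE id = $1`, serverID); err != nil {
                log.Println("heartbeat failed:", err)
            }
        }
    }

    // reap removes the users of servers that have gone silent. Every server can run
    // this; the subquery makes the cleanup idempotent, so concurrent reapers are safe.
    func reap(db *sql.DB) {
        for range time.Tick(15 * time.Second) {
            _, err := db.Exec(`DELETE FROM connected_users WHERE server_id IN
                (SELECT id FROM servers WHERE last_seen < now() - interval '30 seconds')`)
            if err != nil {
                log.Println("reap failed:", err)
            }
            // After reaping, publish the new count over RabbitMQ as before.
        }
    }

    func main() {
        db, err := sql.Open("postgres", "postgres://user:pass@localhost/chat?sslmode=disable") // placeholder DSN
        if err != nil {
            log.Fatal(err)
        }
        go heartbeat(db)
        reap(db)
    }

Deriving the active-user count with a SELECT COUNT(*) over connected_users, instead of maintaining a separate counter, also removes the risk of the counter drifting after a crash.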

Related

How does dropbox server keep connection alive with all its client app?

Dropbox has more than 300M users. The Dropbox desktop application needs to keep a connection alive with the Dropbox server to receive every update.
But how does the Dropbox server keep connections alive with all of its desktop clients?
The Dropbox client keeps a TCP connection constantly open to listen for server-side notifications. When it receives a notification, it initiates an HTTPS conversation to see what changed and to download it. When something changes on the client side, it likewise initiates an HTTPS conversation to update the files on the server.
Source: http://www-net.cs.umass.edu/imc2012/papers/p481.pdf
The Dropbox client keeps continuously opened a TCP connection to a notification server (notifyX.dropbox.com), used for receiving information about changes performed elsewhere. In contrast to other traffic, notification connections are not encrypted. Delayed HTTP responses are used to implement a push mechanism: a notification request is sent by the local client asking for eventual changes; the server response is received periodically about 60 seconds later in case of no change; after receiving it, the client immediately sends a new request. Changes on the central storage are instead advertised as soon as they are performed.
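That "delayed HTTP response" technique is plain long polling. Here is a minimal sketch of the client side of such a loop in Go; the endpoint URL and timeouts are assumptions for illustration, not Dropbox's actual API:

    package main

    import (
        "io"
        "log"
        "net/http"
        "time"
    )

    func main() {
        // Allow the server to hold the request open for ~60s before answering.
        client := &http.Client{Timeout: 90 * time.Second}
        for {
            resp, err := client.Get("https://notify.example.com/subscribe") // hypothetical endpoint
            if err != nil {
                time.Sleep(5 * time.Second) // back off on network errors
                continue
            }
            body, _ := io.ReadAll(resp.Body)
            resp.Body.Close()
            log.Printf("notification or keep-alive: %s", body)
            // Immediately re-issue the request, exactly as the paper describes.
        }
    }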
While the unencrypted headers give no indication of what servers Dropbox uses to keep so many TCP connections open, people report being able to keep over 600k (https://stackoverflow.com/a/9676852/15472) or even over 1M (http://blog.whatsapp.com/196/1-million-is-so-2011) connections open on a single box. With enough load balancing, 300M users, of whom only a fraction are connected simultaneously and actively sharing data with each other, certainly seems within reach.
I doubt that all 300M users are connected at the same time... And given the amount of storage they provide, they will have enough servers to handle the required number of connections, maybe 1% of their user count at a time.
If you'd like to investigate yourself, you could use tools like TCPView (part of the Sysinternals Suite) to check which connections the application opens, or Wireshark to inspect the transferred data.
I assume that you mean 'updates' to the storage content; those could also happen at fixed intervals by opening a connection, fetching the file list, and closing the connection afterwards. In that case the connection would be used for a few seconds in an interval of, say, 5 minutes, which would again reduce the number of simultaneous connections needed by a factor of ~100.

BizTalk connectivity issue to SQL during VM snapshot

We have one VM for BizTalk and a separate VM for the SQL backend. We use Veeam for backups, which basically kicks off a snapshot of each VM. When this snapshot is being finalized on the SQL VM, the BizTalk services on the application server fail. Usually they restart automatically, but sometimes manual intervention is required to start the services. The error below is logged on the BizTalk server.
Is there any timeout setting or config changes that will allow BizTalk services to stay up during the snapshot process?
An error occurred that requires the BizTalk service to terminate. The most common causes are the following:
1) An unexpected out of memory error.
OR
2) An inability to connect or a loss of connectivity to one of the BizTalk databases.
The service will shutdown and auto-restart in 1 minute. If the problematic database remains unavailable, this cycle will repeat.
Error message: [DBNETLIB][ConnectionRead (recv()).]General network error. Check your network documentation.
Error source:
BizTalk host name: BizTalkServerApplication
Windows service name: BTSSvc$BizTalkServerApplication
We experienced the same situation and error with both BizTalk 2009 and BizTalk 2013, each set up with two App servers and one SQL DB server.
When our VMware does the final step of the snapshot backup on the application servers, it freezes the application server for about 10 seconds, preventing it from receiving packets. SQL Server 2008 and 2012 by default send keep-alive packets to their clients every 30 seconds (30,000 ms). If the SQL server fails to receive a response from the App server, it sends out 5 retries (the default setting) of the keep-alive request, 1 second (1,000 ms) apart. If SQL still does not receive a response, it terminates the connection, which causes the BizTalk hosts on the App server to reset; in our case, when our German-made ERP system sends its EDI documents over to BizTalk during that reset period, the transmission fails.
We trapped the issue by running NetMon on the DB and App servers and waiting for the next error. Upon inspection, we saw the five SQL keep-alive packets being sent to the App servers 1 second apart, while at the same time NO packets at all were received on the application server. At first guess one might think they were "just dropped network packets", which is rarely the case. We then made the correlation to the timing of the VM snapshots, and can now confirm that each day, as the snapshot finishes, the App servers freeze.
As a short-to-mid-term workaround, we raised the number of retries SQL attempts before declaring a connection dead (5 by default) by adding the registry value TcpMaxDataRetransmissions and setting it to 30 (thus 30 seconds before SQL declares the client unresponsive). This has masked the problem for us for now; use at your own discretion.
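For reference, a sketch of that registry change, assuming the standard Windows TCP/IP parameters key (the value applies to the whole TCP stack on the box, not just SQL Server, and TCP parameter changes generally require a reboot to take effect; test before using in production):

    reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters" /v TcpMaxDataRetransmissions /t REG_DWORD /d 30 /f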
We are also looking at an Agent-based version of the VM Snapshot, which may alleviate the condition of freezing the server.
Is there any timeout setting or config changes that will allow BizTalk services to stay up during the snapshot process?
Not that I am aware of; however, you might want to Google the config options in the btsntsvc.exe.config file, which is located in your BizTalk installation directory.
All messages that pass through BizTalk are written to the BizTalkMsgBoxDb, and its other databases are involved if you are running tracking, BAM, etc. The only service that can cache 'stuff' and ride out a database outage is the Enterprise Single Sign-On (ESSO) Service. BizTalk therefore needs a persistent connection to the database server to remain 'up', which is why your Host Instance (BizTalkServerApplication) is stopping: it simply can't process messages if the database isn't there.
I would add that your approach to backups probably isn't supported by Microsoft, and I would further suggest that you seriously consider whether an approach that takes your database server offline during the backup is viable.
BizTalk has a pretty robust backup solution for its various databases built into the product, and I would recommend that you take a look at using this supported method.
If you do need to take snapshots of the database system - say once a night - you might want to consider stopping the BizTalk Host Instances, performing the snapshot, and then re-starting the Host Instances through some scripted task.
You might also want to consider checking whether there are any hotfixes for your version of BizTalk Server included in a Cumulative Update that might help address your problem.

LDAP Connections

I have a very basic question about the LDAP protocol:
Can a client stay connected for an indefinite period of time, or does each authentication require opening and closing a TCP connection?
Professional-quality LDAP servers can be configured to terminate clients after a period of time, a maximum number of operations, or other conditions; or alternatively, to leave the client connected forever. Ask your LDAP server administrator whether client connections are terminated for any of the conditions listed, or perhaps others.
In addition to what Terry says, professional-quality LDAP client APIs use a connection pool to hide all these gory details from you, to keep connections open as long as possible, and to recover from situations where the server imposes a connection-termination rule.
LDAP servers may implement multiple limits on the server side, and LDAP client APIs also provide options to set limits on the client side. Some of the server-side limits (in the case of Oracle DSEE) are:
Size limit - the number of search result entries returned.
Time limit - the time taken to process the request.
Idle time limit - how long a connection may stay idle (keep-alives at load balancers can keep a connection alive; the server access log marks connections closed because of the idle limit).
Lookthrough limit - the number of candidate entries examined for a given LDAP search.
Client APIs may set their own time and size limits.
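To illustrate the client-side limits, here is a minimal sketch using Go's github.com/go-ldap/ldap/v3 package; the server URL, base DN, and filter are made up for the example:

    package main

    import (
        "log"

        "github.com/go-ldap/ldap/v3"
    )

    func main() {
        conn, err := ldap.DialURL("ldap://ldap.example.com:389") // hypothetical server
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        req := ldap.NewSearchRequest(
            "dc=example,dc=com",    // base DN (assumed)
            ldap.ScopeWholeSubtree,
            ldap.NeverDerefAliases,
            100,                    // client-side size limit: at most 100 entries
            10,                     // client-side time limit, in seconds
            false,                  // typesOnly
            "(objectClass=person)", // filter (assumed)
            []string{"cn"},
            nil,
        )
        res, err := conn.Search(req)
        if err != nil {
            log.Fatal(err) // e.g. a size-limit-exceeded result if the server's limit is lower
        }
        log.Printf("got %d entries", len(res.Entries))
    }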

Data broadcasting between instances of distributed server

I'm trying to get some feedback on recommendations for a service 'roster' in my specific application. I have a server app that maintains persistent socket connections with clients. I want to develop the server further to support distributed instances. Server "A" would need to be able to broadcast data to the other online server instances, and the same goes for all other active instances.
Options I am trying to research:
Redis / ZooKeeper / Doozer - each server instance registers itself with the configuration server, and all connected servers receive configuration updates as they change. What then?
Maintain end-to-end connections with each server instance and iterate over the list for each outgoing message?
Some custom UDP multicast, but I would need to roll my own reliability on top of it.
Custom message broker - A service that runs and maintains a registry as each server connects and informs it. Maintains a connection with each server to accept data and re-broadcast it to the other servers.
Some reliable UDP multicast transport where each server instance just broadcasts directly and no roster is maintained.
Here are my concerns:
I would love to avoid relying on external apps like ZooKeeper or Doozer, but I would obviously use them if that's the best solution.
With a custom message broker, I wouldn't want it to become a throughput bottleneck, which might mean I'd also have to run multiple message brokers and use a load balancer when scaling?
Multicast doesn't require any external processes if I manage to roll my own, but otherwise I would need to use something like ZMQ, which again leaves me with a dependency.
I realize that I am also talking about message delivery, but it goes hand in hand with whichever solution I choose.
By the way, my server is written in Go. Any ideas on a best recommended way to maintain scalability?
* EDIT of goal *
What I am really asking is what is the best way to implement broadcasting data between instances of a distributed server given the following:
Each server instance maintains persistent TCP socket connections with its remote clients and passes messages between them.
Messages need to be broadcast to the other running instances so they can be delivered to the relevant client connections.
Low latency is important because the messaging can be high speed.
Sequencing and reliability are important.
* Updated Question Summary *
If you have multiple servers / multiple endpoints that need to pub/sub between each other, what is the recommended mode of communication between them? One or more message brokers that re-publish messages to a roster of discovered servers? Reliable multicast directly from each server?
How do you connect multiple end points in a distributed system while keeping latency low, speed high, and delivery reliable?
Assuming all of your client-facing endpoints are on the same LAN (which they can be for the first reasonable step in scaling), reliable UDP multicast would allow you to send published messages directly from the publishing endpoint to any of the endpoints that have clients subscribed to the channel. This also satisfies the low-latency requirement much better than proxying data through a persistent storage layer.
Multicast groups
A central database (say, Redis) could track a map of multicast groups (IP:PORT) <--> channels.
When an endpoint receives a new client with a new channel to subscribe to, it can ask the database for the channel's multicast address and join the multicast group.
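A rough sketch of the join step in Go, using golang.org/x/net/ipv4; the group address and port are invented stand-ins for what the Redis lookup would return:

    package main

    import (
        "log"
        "net"

        "golang.org/x/net/ipv4"
    )

    // joinChannelGroup joins the multicast group for a channel. In a real system the
    // group address would come from the central database, keyed by channel name.
    func joinChannelGroup(channel string) (*ipv4.PacketConn, error) {
        groupIP := net.ParseIP("239.0.0.1") // hypothetical group from the Redis lookup
        c, err := net.ListenPacket("udp4", ":9999")
        if err != nil {
            return nil, err
        }
        p := ipv4.NewPacketConn(c)
        // A nil interface lets the OS pick the default multicast interface.
        if err := p.JoinGroup(nil, &net.UDPAddr{IP: groupIP}); err != nil {
            c.Close()
            return nil, err
        }
        log.Printf("joined group %s for channel %q", groupIP, channel)
        return p, nil
    }

    func main() {
        p, err := joinChannelGroup("lobby")
        if err != nil {
            log.Fatal(err)
        }
        buf := make([]byte, 1500)
        for {
            n, _, src, err := p.ReadFrom(buf) // blocks until a group packet arrives
            if err != nil {
                log.Fatal(err)
            }
            log.Printf("channel message: %d bytes from %s", n, src)
        }
    }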
Reliable UDP multicast
When an endpoint receives a published message for a channel, it sends the message to that channel's multicast socket.
Message packets will contain ordered identifiers, per server per multicast group. If an endpoint receives a message without having received the previous message from that server, it will send a "not acknowledged" message back to the publishing server for any messages it missed.
The publishing server keeps a list of recent messages and resends NAK'd ones.
To handle the edge case of a server sending only one message and having it fail to reach a peer, servers can periodically send a packet count to the multicast group over the lifetime of their NAK queue ("I've sent 24 messages"), giving the other servers a chance to NAK previous messages.
You might want to just implement PGM.
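For reference, a minimal sketch in Go of the sequencing half of that NAK scheme; the wire format and type names are invented for illustration, and a real implementation would still need the resend timers, the periodic packet-count announcements, and the multicast socket itself:

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    // Publisher stamps each outgoing packet with a sequence number and keeps
    // recent packets around so NAK'd ones can be resent.
    type Publisher struct {
        seq    uint64
        recent map[uint64][]byte // bounded in a real implementation
    }

    func (p *Publisher) Publish(payload []byte) []byte {
        p.seq++
        pkt := make([]byte, 8+len(payload))
        binary.BigEndian.PutUint64(pkt, p.seq)
        copy(pkt[8:], payload)
        p.recent[p.seq] = pkt // retain for retransmission on NAK
        return pkt            // send this to the multicast group
    }

    func (p *Publisher) Resend(seq uint64) ([]byte, bool) {
        pkt, ok := p.recent[seq]
        return pkt, ok
    }

    // Receiver tracks the next sequence number expected from one publisher and
    // reports gaps so the caller can send NAKs back to that publisher.
    type Receiver struct {
        next uint64
    }

    func (r *Receiver) Accept(pkt []byte) (payload []byte, missing []uint64) {
        seq := binary.BigEndian.Uint64(pkt)
        if r.next == 0 {
            r.next = seq // first packet seen from this publisher
        }
        for s := r.next; s < seq; s++ {
            missing = append(missing, s) // NAK these back to the publisher
        }
        r.next = seq + 1
        return pkt[8:], missing
    }

    func main() {
        pub := &Publisher{recent: make(map[uint64][]byte)}
        rcv := &Receiver{}
        p1 := pub.Publish([]byte("hello"))
        _ = pub.Publish([]byte("lost")) // pretend this one is dropped in transit
        p3 := pub.Publish([]byte("world"))

        rcv.Accept(p1)
        _, missing := rcv.Accept(p3)
        fmt.Println("NAK:", missing) // prints: NAK: [2]
    }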
Persistent storage
If you do end up storing data long-term, storage services can join the multicast groups just like endpoints... but store the messages in a database instead of sending them to clients.

Create multiple connections in a single browser window with wcf in silverlight

I have developed a Silverlight 3 chat application in which one user chats with multiple users at the same time.
In my application, a chat window is a Silverlight control, and a user can open more than 10 chat windows at the same time in a single browser window. That means every chat window makes its own connection to WCF.
I have already increased the connection limit of WCF using the throttling service behavior. That works for multiple clients, i.e. multiple browsers open at the same time on different machines, even more than 10 of them. But when one user chats with more than 10 users at the same time, the 11th connection breaks.
Please help me and provide a solution for this problem.
Thanks
I think there is something wrong with your client implementation. Do your clients keep the connection to your server open for too long? Ideally you should only have very compact and short request/reply messages between the client and server such that each connection is only short-lived.
A user cannot send messages from every chat window simultaneously, I suspect, so you should hardly ever have to open more than one connection between the client and server at a time.
Do you get the exception even when all the other channels are closed? There may be a limit on the number of active connections. You may have to dole out connections between windows when there are more than ten open, to ensure that you never attempt to open that 11th connection.
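One way to implement that doling out on the client side is a small semaphore that caps the number of simultaneously open connections; the idea is language-agnostic, sketched here in Go, and the limit of 10 matches the behavior described in the question:

    package main

    import (
        "fmt"
        "sync"
    )

    // connLimiter caps how many connections may be open at once. A chat window
    // acquires a slot before opening its channel and releases it when done.
    type connLimiter struct {
        slots chan struct{}
    }

    func newConnLimiter(max int) *connLimiter {
        return &connLimiter{slots: make(chan struct{}, max)}
    }

    func (l *connLimiter) acquire() { l.slots <- struct{}{} } // blocks at the cap
    func (l *connLimiter) release() { <-l.slots }

    func main() {
        limiter := newConnLimiter(10) // the observed limit from the question
        var wg sync.WaitGroup
        for window := 1; window <= 15; window++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                limiter.acquire()
                defer limiter.release()
                fmt.Printf("window %d: connection open\n", id)
                // ... exchange messages, then close the connection promptly ...
            }(window)
        }
        wg.Wait()
    }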