I wanted to understand the pro's/cons of using a client node within a cluster vs a external thin client. Ofcourse the thin client will be less chatty Vs a client node and hence less n/w interactions. Changes in the cluster topology(nodes adding/removing) would not affect the client, while it directly affects the client node.
All these make me wonder will a thin client always be a better option or are then other cases where having a client node makes much more sense.
If Apache/Gridgain has any documentation/links around this. That would do too.
TIA
I think there won't be any thick client in future major releases; it will be superseded by a thin one instead because of a single protocol and lightweight design.
At the moment, a thick client still has some features advantage:
faster and better discovery and communication (topology changes)
peer class loading
near caching
advanced compute capabilities
events listening
full data structures support/distributed locking
etc
The feature parity list is constantly shrinking, but it's also worth mentioning that some features might be available for a particular platform only. For example, in .NET thin client, but not in Java one.
You have mentioned the cons already - being a cluster-wide citizen implies the same obligation a server node has. I.e. a good network channel and participation in all global events.
That means in some cases a thick client might not be deployed and working as expected. Usually it's about NAT, private networks, firewalls, and so on.
In general, I'd say if your task could be implemented by a thin client, use it. If a required feature/API is not yet available, consider using a thick one. For example, if you need something like a health-check for your application running every minute, you definitely would like to have a thin client for that task and not to trigger PME.
Thick clients are aware of all nodes, data distribution, and are more efficient in most cases, use them if your deployment allows for it. Plus, thick clients support all of the GridGain APIs.
Thin clients are lightweight(similar to a jdbc driver), connect to the cluster via binary protocol with a well-defined message format, support a limited set of APIs, and allow for support of multiple languages: Java, .NET, C++, Python, Node.JS, and PHP are supported out of the box.
see docs on thin/thick clients differences
Also take a look at capabilities of thin clients.
This section explains how to choose a client.
For example, a thick client serves as a reducer for queries, thereby you avoid an extra hop(from server to thin client), and lessening the cluster load when executing a query on a partitioned cache.
A Thick client could also directly participate in compute jobs, usually it is used as a reducer, whereas a thin client just submits a job to the cluster.
A thick client could also receive event notifications.
Thick clients could reconnect more reliably(because they know the current
cluster state) if the cluster topology changes.
Related
I've got a working Silverlight/WCF application that I need to start thinking about scaling. An obvious target for scaling, of course, is Azure.
The key architectural feature of the application is that 2-10 Silverlight clients will join a given "room" (using a duplex Net.TCP connection), and any of those clients can then send a message (for instance, a chat message), which then needs to be pushed in real-time to every other client connected to the same room, using the underlying duplex WCF connection.
Right now, the way the WCF service works is basically to keep in-memory a list of sessions and the rooms that they're associated with, so that when a message from one session comes in, it can automatically send the message to every other session in the room.
This works fine for a single WCF server instance, but it gets complicated if you need to scale it so that multiple WCF instances are in play. If you use network-layer load balancing, of course, you would typically find that only some of the members of your room are on the same server you're on, which means that when it comes time to push out messages to all those members, only some of them would actually get notified.
Apart from Azure, I had been thinking that I would handle it via some sort of application-layer load balancing. For instance, the web server that each client downloads the Silverlight application from might do a primitive round-robin sort of load-balancing, i.e., "OK, everyone in room x, you use WCF instance 1. Everyone in room y, you use WCF instance 2." That sort of thing.
So I have two questions:
(1) Is there any other, better way to architect this, so as to be able to use network-layer load balancing rather than needing to make the application aware of the underlying infrastructure?
(2) If I have to do the application-layer load balancing, what's the best way to handle this in Azure? Do I have to use the IAAS (full VM's), or is there a way to do this using PAAS (worker roles)? My understanding is that it's not possible to independently address worker roles, which would make a roles-based approach difficult, if not impossible.
SignalR powered by the Azure Service bus, may work for you.
http://vasters.com/clemensv/2012/02/13/SignalR+Powered+By+Service+Bus.aspx
I've been trying to read up on the DDS standard, and OpenSplice in particular and I'm left wondering about the architecture.
Does DDS require that a broker be running, or any particular daemon to manage message exchange and coordination between different parties?
If I just launch a single process publishing data for a topic, and launch another process subscribing for the same topic, is this sufficient? Is there any reason one might need another process running?
In the alternative, does it use UDP multicasting to have some sort of automated discovery between publishers and subscribers?
In general, I'm trying to contrast this to traditional queue architectures such as MQ Series or EMS.
I'd really appreciate it if anybody could help shed some light on this.
Thanks,
Faheem
DDS doesn't have a central broker, it uses a multicast based discovery protocol. OpenSplice has a model with a service for each node, but that is an implementation detail, if you check for example RTI DDS, they don't have that.
DDS specification is designed so that implementations are not required to have any central daemons. But of course, it's a choice of implementation.
Implementations like RTI DDS, MilSOFT DDS and CoreDX DDS have decentralized architectures, which are peer-to-peer and does not need any daemons. (Discovery is done with multicast in LAN networks). This design has many advantages, like fault tolerance, low latency and good scalability. And also it makes really easy to use the middleware, since there's no need to administer daemons. You just run the publishers and subscribers and the rest is automatically handled by DDS.
OpenSplice DDS used to require daemon services running on each node, but they have added a new feature in v6 so that you don't need daemons anymore. (They still support the daemon option).
OpenDDS is also peer-to-peer, but it needs a central daemon running for discovery as far as I know.
Think its indeed good to differentiate between a 'centralized broker' architecture (where that broker could be/become a single-point of failure) and a service/daemon on each machine that manages the traffic-flows based on DDS-QoS's such as importance (DDS:transport-priority) and urgency (DDS: latency-budget).
Its interesting to notice that most people think its absolutely necessary to have a (real-time) process-scheduler on a machine that manages the CPU as a critical/shared resource (based on timeslicing, priority-classes etc.) yet that when it comes to DDS, which is all about distributing information (rather than processing of application-code), people find it often 'strange' that a 'network-scheduler' would come in 'handy' (the least) that manages the network(-interface) as a shared-resource and schedules traffic (based on QoS-policy driven 'packing' and utilization of multiple traffic-shaped priority-lanes).
And this is exactly what OpenSplice does when utilizing its (optional) federated-architecture mode where multiple applications that run on a single-machine can share data using a shared-memory segment and where there's a networking-service (daemon) for each physical network-interface that schedules the in- and out-bound traffic based on its actual QoS policies w.r.t. urgency and importance. The fact that such a service has 'access' to all nodal information also facilitates combining different samples from different topics from different applications into (potentially large) UDP-frames, maybe even exploiting some of the available latency-budget for this 'packing' and thus allowing to properly balance between efficiency (throughput) and determinism (latency/jitter). End-to-End determinism is furthermore facilitated by scheduling the traffic over pre-configured traffic-shaped 'priority-lanes' with 'private' Rx/Tx threads and DIFSERV settings.
So having a network-scheduling-daemon per node certainly has some advantages (also as it decouples the network from faulty-applications that could be either 'over-productive' i.e. blowing up the system or 'under-reactive' causing system-wide retransmissions .. an aspect thats often forgotten when arguing over the fact that a 'network-scheduling-daemon' could be viewed as a 'single-point-of-failure' where as the 'other view' could be that without any arbitration, any 'standalone' application that directly talks to the wire could be viewed as a potential system-thread when it starts misbehaving as described above for ANY reason.
Anyhow .. its always a controversial discussion, thats why OpenSplice DDS (as of v6) supports both deployment modes: federated and non-federated (also called 'standalone' or 'single process').
Hope this is somewhat helpful.
We've got a Windows service that is connected to various client applications via a duplex WCF channel. The client and server applications are installed on different machines, in different locations, potentially at widely different times, and by different people. In addition, the client can be pointed at a different machine running the same Windows service at startup.
Going forward, we know that the interface between the client and the server applications will likely evolve. The application in the field will be administered by local IT personnel, and we have no real control over what version of either of these applications will be installed when/where or which will be connecting to the other. Since these are installed at various physical locations and by different people, there's a high likely that either the client or server application will be out of date compared to the other.
Since we can't control what versions of the applications in the field are trying to connect to each other, I'd like to be able to verify that the contracts between the client application and the server application are compatible.
Some things I'm looking for (may not be able to realistically get them all):
I don't think I care if the server's interface is newer or older, as long as the server's interface is a super-set of the client's
I want to use something other than an "interface version number". Any developer-kept version number will eventually be forgotten about or missed.
I'd like to use a computed interface comparison if that's possible
How can I do this? Any ideas on how to go about this would be greatly appreciated.
Seems like this is a case of designing your service for versioning. WCF has very good versioning capabilities and extension points. Here are a couple of good MSDN articles on versioning the service contract and more specifically the data contracts. For backward and "forward" compatible versioning look at this article on using the IExtensibleDataObject interface.
If the server's endpoint has metadata publishing enabled, you can programmatically inspect an endpoint's interface by using the MetadataResolver class. This class lets you retrieve the metadata from the server endpoint, and in your case, you would be interested in the ContractDescription which contains the list of all operations. You could then compare the list of operations to your client proxy's endpoint operations.
Of course now, comparing the lists of operations would need to be implemented, you could simply compare the operations names and fail if one of the client's operations is not found within the server's operations. This would not necessarily cover all incompatiblities, ex. request/response schema changes.
I have not tried implementing any of this by the way, so it's more of a theoretical view of your problem. If you don't want to fiddle with the framework, you could implement a custom operation that would return the list of operation names. This would be of minimal effort but is less standards-compliant.
On my team at work, we use the IBM MQ technology a lot for cross-application communication. I've seen lately on Hacker News and other places about other MQ technologies like RabbitMQ. I have a basic understanding of what it is (a commonly checked area to put and get messages), but what I want to know what exactly is it good at? How will I know where I want to use it and when? Why not just stick with more rudimentary forms of interprocess messaging?
All the explanations so far are accurate and to the point - but might be missing something: one of the main benefits of message queueing: resilience.
Imagine this: you need to communicate with two or three other systems. A common approach these days will be web services which is fine if you need an answers right away.
However: web services can be down and not available - what do you do then? Putting your message into a message queue (which has a component on your machine/server, too) typically will work in this scenario - your message just doesn't get delivered and thus processed right now - but it will later on, when the other side of the service comes back online.
So in many cases, using message queues to connect disparate systems is a more reliable, more robust way of sending messages back and forth. It doesn't work well for everything (if you want to know the current stock price for MSFT, putting that request into a queue might not be the best of ideas) - but in lots of cases, like putting an order into your supplier's message queue, it works really well and can help ease some of the reliability issues with other technologies.
MQ stands for messaging queue.
It's an abstraction layer that allows multiple processes (likely on different machines) to communicate via various models (e.g., point-to-point, publish subscribe, etc.). Depending on the implementation, it can be configured for things like guaranteed reliability, error reporting, security, discovery, performance, etc.
You can do all this manually with sockets, but it's very difficult.
For example: Suppose you want to processes to communicate, but one of them can die in the middle and later get reconnected. How would you ensure that interim messages were not lost? MQ solutions can do that for you.
Message queueuing systems are supposed to give you several bonuses. Among most important ones are monitoring and transactional behavior.
Transactional design is important if you want to be immune to failures, such as power failure. Imagine that you want to notify a bank system of ATM money withdrawal, and it has to be done exactly once per request, no matter what servers failed temporarily in the middle. MQ systems would allow you to coordinate transactions across multiple database, MQ and other systems.
Needless to say, such systems are very slow compared to named pipes, TCP or other non-transactional tools. If high performance is required, you would not allow your messages to be written thru disk. Instead, it will complicate your design - to achieve exotic reliable AND fast communication, which pushes the designer into really non-trivial tricks.
MQ systems normally allow users to watch the queue contents, write plugins, clear queus, etc.
MQ simply stands for Message Queue.
You would use one when you need to reliably send a inter-process/cross-platform/cross-application message that isn't time dependent.
The Message Queue receives the message, places it in the proper queue, and waits for the application to retrieve the message when ready.
reference: web services can be down and not available - what do you do then?
As an extension to that; what if your local network and your local pc is down as well?? While you wait for the system to recover the dependent deployed systems elsewhere waiting for that data needs to see an alternative data stream.
Otherwise, that might not be good enough 'real time' response for today's and very soon in the future Internet of Things (IOT) requirements.
if you want true parallel, non volatile storage of various FIFO streams(at least at some point along the signal chain) use an FPGA and FRAM memory. FRAM runs at clock speed and FPGA devices can be reprogrammed on the fly adding and taking away however many independent parallel data streams are needed(within established constraints of course).
We've an existing system which connects to the the back end via http (apache/ssl) and polls the server for new messages, needless to say we have scalability issues.
I'm researching on removing this polling and have come across BOSH/XMPP but I'm not sure how we should take the BOSH technique (using long lived http connection).
I've seen there are few libraries available but the entire thing seems bloaty since we do not need buddy lists etc and simply want to notify the clients of available messages.
The client is written in C/C++ and works across most OS so that is an important factor. The server is in Java.
does bosh result in huge number of httpd processes? since it has to keep all the clients connected, what would be the limit on that. we are also planning to move to 64 bit JVM/apache what would be the max limit of clients in that case.
any hints?
I would note that BOSH is separate from XMPP, so there's no "buddy lists" involved. XMPP-over-BOSH is what you're thinking of there.
Take a look at collecta.com and associated blog posts (probably by Jack Moffitt) about how they use BOSH (and also XMPP) to deliver real-time information to large numbers of users.
As for the scaling issues with Apache, I don't know — presumably each connection is using few resources, so you can increase the number of connections per Apache process. But you could also check out some of the connection manager technologies (like punjab) mentioned on the BOSH page above.