NServiceBus analytics

NServiceBus analytics - nservicebus

We have gone with our first integration deployment into production with NServiceBus. ServiceInsight and ServicePulse are very handy, especially ServiceInsight - it is very helpful to understand things during go-live activities, even though it is sluggish.
I would like to see if there are any easy ways to pull information from ServiceControl into Excel to come up with basic analytics such as volume of messages/events we process, messages/event that takes more time, failed messages, busy endpoints, peak period etc.
Has anyone done this or is there something available already or any recommended ways to do this?

Take a look here: http://code972.com/blog/2015/02/83-real-time-analytics-for-nservicebus-powered-systems-using-elasticsearch
I used Elasticsearch + Kibana to provide real-time analytics for the NServiceBus platform.
You can also export the data directly from ServiceControl's RavenDB instance into CSV and then Excel - but that's indeed a lot less pretty.

Related

Organizing services dataflow / eip

Say I have like 1000 VMs with different services running on them with different technologies used like python, NET, java and different middleware like rabbitmq, redis etc.
How can I dynamically handle the interactions between the services and provide scalability?
For Example, say I have Service A which is pushing Data to a rabbitmq then the data is processed by service B while fetching additional data from Service C. You see at the end I have a decentralized system which is pulling data somewhere and pushing it somewhere else... a total mess! Scale it up to 2000 microservices omg XD.
The moment I change one thing a lot of other systems are affected.
Do you know something maybe like an ESB where I can couple two services together with a message transform adapter in the middle of it and I can change dependenciesat runtime? Like the stream doesn't end in service F anymore and does end in G for example?
I think microservices are a good idea because they can be stateless, can scale, can easily be deployed as a container. But I don't know a good tool/program for managing the data flow. The rabbitmq doesn't support enough enterprise integration patterns. Do you have any advice?

How can I dynamically handle the interactions -
See if using an existing EIP pattern solves your problem to implement the logistics
Depending on how your design shapes up, you may need to use Distributed Lock Management
Or maybe your application is simple enough to use a Consul K/V store as a semaphore & a simple mosquitto topic based bus.
Provide scalability
What is the solution you are trying to scale? AMQP, Consul, "microservices" in themselves are very scalable & distributed
However, to scale your thought process & devops, you need to find a way to see things as patterns that help you split the problem & tackle the complexity
Do you know something maybe like an ESB where I can couple two services together with a message transform adapter in the middle of it and I can change dependenciesat runtime?
Read up on EIP. ESBs are just one of the many ways you can solve your problem. RTFM, & get some perspective.
But I don't know a good tool/program for managing the data flow.
Ask yourself if your problem is related to distributed workflow management, or if a data pipeline is what you are really looking for
Look at Spark, Storm, Luigi, Airflow - they all have a different purpose - but you will know what to do with them if you manage to read up on everything else in this post ;)

Gathering distributed data into central database

I was assigned to update existing system of gathering data coming from points of sale and inserting it into central database. The one that is working now is based on FTP/SFTP transmission, where the information is sent once a day, usually at night. Unfortunately, because of unstable connection links (low quality 2G/3G modems), some of the files appear to be broken. With just a few shops connected that way everything was working smooth, but along with increasing number of shops, errors became more often. What is worse, the time needed to insert data into central database is taking up to 12 - 14h (including waiting for the data to be downloaded from all of the shops) and that cannot happen during the working day as it would block the process of creating sale reports and other activities with the database - so we are really tight with processing time here.
The idea my manager suggested is to send the data continuously, during the day. Data packages would be significantly smaller, so their transmission and insertion would be much faster, central server would contain actual (almost real time) data and night could be used for long running database activities like creating backups, rebuilding indexes etc.
After going through many websites, I found that:
using ASMX web service is now obsolete and WCF should be used instead
WCF with MSMQ or System Messaging could be used to safely transmit data, where I don't have to care that much about acknowledging delivery of data, consistency, nodes going offline etc.
according to http://blogs.msdn.com/b/motleyqueue/archive/2007/09/22/system-messaging-versus-wcf-queuing.aspx WCF queuing is better
there are also other technologies for implementing message queue, like RabbitMQ, ZeroMQ etc.
And that is where I become confused. With so many options, do you have any pros and cons of these technologies?
We were using .NET with Windows Forms and SQL Server, but if it would be necessary, we could change to something more suitable. I am also a bit afraid of server efficiency. After some calculations, server would be receiving about 15 packages of data per second (peak). Is it much? I know there are many websites without serious server infrastructure, that handle hundreds of visitors online and still run smooth, but the website mainly uploads data to the client, and here we would download it from the client.
I also found somewhat similar SO question: Middleware to build data-gathering and monitoring for a distributed system
where DDS was mentioned. What do you think about introducing some middleware servers that would cope with low quality links to points of sale, so the main server would not be clogged with 1KB/s transmission?
I'd be grateful with all your help. Thank you in advance!

Rabbitmq can easily cope with thousands of 1kb messages per second.
As your use case is not about processing real time data, I'd say you should combine few messages and send them as a batch. That would be good enough in order to spread load over the day.
As the motivation here is not to process the data in real time, then any transport layer would do the job. Even ftp/sftp. As rabbitmq will work fine here, it's not the typical use case for it.
As you mentioned that one of your concerns is slow/unreliable network, I'd suggest to compress the files before sending them, and on the receiving end, immediately verify their integrity. Rsync or similar will probably do great job in doing that.

From what I understand, you have basically two problems:
Potential for loss/corruption of call data
Database write performance
The potential for loss/corruption of call data is being caused by a lack of reliability in the transmission of data from client to service.
And it's not clear what is causing the database contention/performance issues, beyond a vague reference to high volumes, so this answer will be more geared towards solving the first problem.
You have correctly identified the need for reliable asynchronous communication transport as a way to address the reliability issues in your current setup.
Looking at MSMQ to deliver this is a valid first step. MSMQ provides reliable communication via a store and forward messaging semantic which comes out of the box and requires very little in the way of configuration.
Unfortunately, while suitable for your needs, MSMQ relies on 2 things:
A reliable network protocol, and
A client service running on both sending and receiving machine.
From your description above, I don't believe 1 exists (the internet is not a reliable network), and you might well struggle with 2 - MSMQ only ships with Windows Server or business/enterprise versions of Windows on the desktop.(*see below...)
As a possible solution to the network reliability problem, you could use a WCF or a RESTful endpoint (using Nancy or WebApi) to expose a service operation(s) exposed over HTTP, which would accept the incoming calls from the client machines. These technologies are quite different, so you'll need to make sure you're making the correct choice early on.
WCF supports WS-ReliableMessaging from the SOAP 1.2 specification out of the box, which allows for reliable web service calls over http, however it's very config-heavy and not generally a nice framework to work with.
REST much simpler than WCF in .Net, is very lightweight and easy to use. However, for reliable delivery you would have to expose some kind of GET operation (in addition to a POST to allow the client to send data) to be called (within a reasonable time-frame) to verify the data was committed. The client would have to implement some kind of retry semantic if the result of the GET "acknowledgement" was negative.
Despite requiring two operations rather than one for the WCF route, I would favour the REST approach. I've done plenty of both and find REST services way nicer to work with.
(*) That's not to say that MSMQ wouldn't work in your ultimate solution, just that it would not be used to address the transmission reliability issue. However it could still be used to address another of your problems, that of database write contention. If you were to queue incoming requests once they came into the server, then these could be processed by an "offline" process, which could then perform the required database operations in a reliable manner. This could be done by using MSMQ transactional queues.
In response to comments:
99% messages are passed from shop to main server, but if some change
is needed (price correction, discounts etc.), that data has to be sent
to shop.
This kind of changes things. Had I understood from the beginning that you had a bidirectional requirement, and seeing as how you have managed to establish msmq communication, I would have nudged you towards NServiceBus, which is a really, really cool wrapper around MSMQ. The reason I would have done this is that you appear to have both a one way, and a publish-subscribe requirement, which is supported really nicely by NServiceBus.

On NServiceBus Profiles

I've been trying to find out ways to improve our nservicebus code performance. I searched and stumbled on these profiles that you can set upon running/installing the nservicebus host.
Currently we're running the nservicebus host as-is, and I read that by default we are using the "Lite" version of the available profiles. I've also learnt from this link:
http://docs.particular.net/nservicebus/hosting/nservicebus-host/profiles
that there are Integrated and Production profiles. The documentation does not say much - has anyone tried the Production profiles and noticed an improvement in nservicebus performance? Specifically affecting the speed in consuming messages from the queues?

One major difference between the NSB profiles is how they handle storage of subscriptions.
The lite, integration and production profiles allow NSB to configure how reliable it is. For example, the lite profile uses in-memory subscription storage for all pub/sub registrations. This is a concern because in order to register a subscriber in the lite profile, the publisher has to already be running (so the publisher can store the subscriber list in memory). What this means is that if the publisher crashes for any reason (or is taken offline), all the subscription information is lost (until each subscriber is restarted).
So, the lite profile is good if you are running on a developer machine and want to quickly test how your services interact. However, it is just not suitable to other environments.
The integration profile stores subscription information on a local queue. This can be good for simple environments (like QA etc.). However, in a highly distributed environment holding the subscription information in a database is best, hence the production profile.
So, to answer your question, I don't think that by changing profiles you will see a performance gain. If anything, changing from the lite profile to one of the other profiles is likely to decrease performance (because you incur the cost of accessing queue or database storage).

Unless you tuned the logging yourself, we've seen large improvements based on reduced logging. The performance from reading off the queues is same all around. Since the queues are local, you won't gain much from the transport. I would take a look at tuning your handlers and the underlying infrastructure. You may want to check out tuning MSMQ and look at the disk you are using etc. Another spot would be to look at how distributed transactions are working assuming you are using a remote database that requires them.
Another option to increase processing time is to increase the number of threads consuming the queue. This will require a license. If a license is not an option you can have multiple instances of a single threaded endpoint running. This requires you shard your work based on message type or something else.
Continuing up the scale you can then get into using the Distributor to load balance work. Again this will require a license, but you'll be able to add more nodes as necessary. All of the opportunities above also apply to this topology.

How can I use NServiceBus with a database instead of MSMQ

Is it possible to use NServiceBus with a database as the queue storage instead of MSMQ? If so, how can I get started and what are the pros and cons of using a database instead of MSMQ?

If you want to use something other than MSMQ you'll have to plug in your own ITransport. I would take a look at the NSB Contrib project on GitHub, there is an implementation of of ITransport for the SQL Server Broker(messaging).
The cons I see for using a database includes cost and maintenance overhead. MSMQ comes with the OS for free and most admins have the skills to maintain it. Once you get in a DB, you have to pay for it and find someone to maintain it. This starts out ok, but once you get into multiple environments and things like clustering, licensing gets out of control.

Using an ESB system to replicate data among databases

I work in a small supermarket chain (4 stores). Each store has its own local database which contains information of each product, prices, and transactions that have ocurred on the store. In addition, each store needs to replicate this information back and forth to a central location.
Right now we are using something called SQLRemote, which is a feature of Sybase's SQL Anywhere database. It works, but sometimes fails and is difficult to manage. To its' credit, SQLRemote actually wasn't designed for this type of scenarios, so it could be said that we are using it incorrectly.
I was thinking that an ESB system such as Mule (or ChainBuilder which seems easier to set up) might be a good alternative to SQL remote. I understand that these systems can detect when changes occur in the database (i.e. when records are added, modified or deleted), and can be set up to deliver a message in a transaction.
Would this be a viable solution to my scenario?
Best regards,
Edgard

Yeah I am sure Mule should be able to do this.
However I work for a company which provides Fuse ESB which is using Apache projects such as Apache ServiceMix, Apache ActiveMQ, Apache Camel and Apache CXF.
We have a user story about a very big retailler in US which uses Fuse ESB to integrate their stores and warehouses and whatnot
http://fusesource.com/collateral/17
Fuse ESB
http://fusesource.com/products/enterprise-servicemix/

Yes, Mule can support this scenario thought it might be overkill. There are targeted database replication solutions out there. The advantage of Mule would be it's ability to handle failure and other scenarios where you need the workflow to be adapted based on what is happening. This allows you to build a very robust solution.
Mule flows could be a very good choice to address this problem. It's a new feature of Mule 3 designed for orchestrating integrations like this.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas