IWantToRunWhenBusStartsAndStops not for production? - nservicebus

New to NServiceBus (4.7.5) and just implemented an NSB host.exe hosted service (implementing IWantToRunWhenBusStartsAndStops) that detects changes to database tables and notifies subscribing web apps by publishing events, e.g. "CustomerDataWasUpdatedEvent". In the future we will perform the actual update through messagehandlers receiving commands obviously, but at the moment this publishing service just polls the database etc.
It all works well, however, approaching production, I noticed that David Boike, in his latest edition of "Learning NServiceBus", states that classes implementing
IWantToRunWhenBusStartsAndStops are really mostly for development and rarely used in production. I set up my database change detection in the Start method and it works nicely, does anyone know why this is discouraged?
Here is the comment in the actual book:
https://books.google.se/books?id=rvpzBgAAQBAJ&pg=PA110&lpg=PA110&dq=nservicebus+iwanttorunwhenbusstartsandstops+in+production+david+boike&source=bl&ots=U6sNII0nm3&sig=qIXffOVFhcy-_3qDnSExRpwRlD4&hl=sv&sa=X&ei=lHWRVc2_BKrWywPB65fIBw&ved=0CBsQ6AEwAA#v=onepage&q=nservicebus%20iwanttorunwhenbusstartsandstops%20in%20production%20david%20boike&f=false

The actual quote is:
...it isn't common to have widespread use of in a production system.
Uncommon is not the same thing as discouraged.
That said I do think there is intent here by the author to highlight the fact that further up the page they assert that this is not a good place to be doing lots of coding, as an unhandled exception can cause the whole process to fail.
The author actually does go on to mention a possible use case for when you may want to load a resource(s) to do work within the handler.
Ok, maybe it's just this scenario we have that is a bit uncommon
Agreed - there is nothing fundamentally wrong with your approach. I recently did the same thing as you for wiring up SqlDependency to listen for database events and then publish a message as a result. In these scenarios there is literally nothing else you can do other than to use IWantToRunAtStatup.
Also, David himself often trawls the nservicebus tag, maybe he'll provide a more definitive answer than mine.

I'll copy the answer I gave in the Particular Software Google Group...
I'll quote myself directly here:
An implementation of IWantToRunWhenBusStartsAndStops is a great place to create a quick interface in order to test messages during debugging by allowing you to send messages based on the console input. Apart from this, it isn't common to have widespread use of them in a production system. One possible production use case will be to provision a resource needed by the endpoint at startup and then tear it down when the endpoint stops.
I think if I could add a little bit of emphasis it would be to "widespread use". I'm not trying to say you won't/can't have an IWantToRunWhenBusStartsAndStops in production code or that avoiding them is a best practice. I am trying to say that having a ton of them is probably a code smell.
Above that paragraph in the book, I warn about IWantToRunWhenBusStartsAndStops not having any ambient transactions or try/catch stuff going on. THAT is really the key part. If you end up throwing an exception in an IWantToRunWhenBusStartsAndStops, tyou can run into big problems. If you use something like a .NET Timer and then throw an exception, you can crash your process!
Let me tell you how I screwed up on this in my first-ever NServiceBus system. The system (still in use today, from what I hear) is responsible for ingesting more than 3000 RSS feeds (probably a lot more than that now) into a CMS. So processing each feed, breaking it up into items, resizing images, encoding attached video for mobile ... all those things were handled in NServiceBus message handlers, which was scaled out to multiple servers, and that was all fantastic.
The problem was the scheduler. I implemented that as an IWantToRunWhenBusStartsAndStops (well, actually IWantToRunAtStartup at that time) and it quickly turned into a mess. I kept the whole table worth of feed information in memory so that I could calculate when to fire off the next ProcessFeed command. I was using the .NET Timer class, and IIRC, I eventually had to use threading primitives like ManualResetEvent in order to coordinate the activity. And because I was using .NET Timer, if the scheduler threw an exception, that endpoint failed and had to restart. Lots of weird edge cases and it was always a quagmire of bugs. Plus, this was now a singleton "commander app" so while the feed/item processors could be scaled out, the scheduler could not.
As I got more experienced with NServiceBus, I realized that each feed should have been a saga, starting from a FeedCreated event, controlled through PauseProcessing and ResumeProcessing commands, using timeouts to control the next processing time, and finally (perhaps) ended via a FeedRemoved event. This would have been MUCH more straightforward and everything would have executed inside transactionally-controlled message handlers.
That experience led me to be a little bit distrustful/skeptical of IWantToRunWhenBusStartsAndStops. Not saying it's bad, just something to be aware of. Always be prepared to consider if what you're trying to do couldn't be better accomplished in another way.

Related

how should slow diagnostics be handled in VS Code API direct implementation

The diagnostics in my VS Code extension, implemented directly (no language server), are blocking other things (e.g. completions) while they run. Furthermore, it seems, if they are triggered again before the prior analysis completes the delays are multiplied several-fold. As a result the experience can be very unpleasant on complex workspaces.
Is there a small example of an extension that does not use LSP, and that handles this well, the API documentation is very light on this point, and I don't find much info by searching. Or, can someone give a summary of a broad strategy for dealing with it. Again, the issue is diagnostics are blocking other services for too long.
Further notes:
In early attempts I tried simply invoking the diagnostics on every document change event, results very nice for small files, but disastrous for large ones.
Somewhat better results by using a timer to prevent diagnostics from running in rapid succession, i.e., do not run if another run was recently started. A run is forced on "document will save".
Briefly toyed with putting diagnostics in an asynchronous function, seemingly no improvement, but maybe did not spend enough time working at it. Should this be necessary? I kind of thought the API would take care of concurrent operations behind the scenes.

Best ESB/Message Queue for appharbor

I'm currently trying to find the best message queue solution for an appharbor application. Most of the ones of looked at assume you have a windows environment with MSMQ and DTC installed, which I don't believe the appharbor environment provides.
I would like something that works well with ravendb, as that is the database we are using. Something who's only dependence is on raven would be ideal, especially if it integrates with our existing unit of work. Ie, when save changes is called in our controller action the messages are saved in the same transaction.
It would also need a host that works in a console application for background processing.
Ideally I would like something that "just works" in a development environment also. With raven, for example, we use the embedded mode while developing and I would like something that doesn't require installation.
I've looked at nServicebus, which seems to fail these conditions because it needs a transport (msmq, sql, etc) and much of the documentation is out of date.
I also looked at rhino service bus but there is a distinct lack of documentation and community. I'm also not sure if it can depend entirely on ravendb.
The others I looked at all seemed quite heavyweight and required installation and configuration to run in a development environment.
Edit: the other option, is to implement our own.
First of all, congratulations on being the 1000th NServiceBus question on StackOverflow!
Second, if you were to use SQL for persisting your business data, then you could run NServiceBus on top of that same SQL where all the messages go through tables (instead of queues) and then you wouldn't need the DTC.
Third, if you did want to go with RavenDB as your transport for NServiceBus, you would have to implement the ISendMessages and IReceiveMessages interfaces on top of it, but I believe that somebody in the community has already started working on that, so possibly you could join forces with them.
Finally, I wouldn't recommend writing your own ESB these days - not when there are so many good choices already out there. You mentioned the issues of community and documentation - those tend to be handled the worst when writing your own infrastructure.

How to simulate production concurrency issue using tests

I've have an application which receives an object from queue, transforms it and publishes it on a topic. It's a message driven bean (spring) message listener container with a fair number of inner beans.
Some strange activity recently happened on the prod boxes. We want to check if this is a concurrency issue. Which is great, but not something I've done before.
My approach is to pump a load of messages at application. Write a piece of software to listen to it's publishing topic. Consume these and process them via something similar to Junit tests which compares the objects attributes with the expected result.
I've added the above for a bit of scope on the problem but basically is there any applications on the market that I could plug into either my code or my IDE which would enable me to do that. I think this is somewhat over the capabilities for JUNIT
I would recommend JavaPathfinder (JPF) for this type of exercise.
JPF is essentially a JVM that simulates execution of your code in every possible way - e.g. simulating all the possible interleavings of the bytecode instructions. JPF works as a model checker that explores the set of possible states that an application can enter and evaluates them based on predefined rules -e.g. deadlock free, non-null, etc.
Model checkers usually fail when applications become too complex but JPF is able to execute the app without traversing the whole state space and is usually able to reach problem states pretty quickly. Try to have a look at that one.

Testing fault tolerant code

I’m currently working on a server application were we have agreed to try and maintain a certain level of service. The level of service we want to guaranty is: if a request is accepted by the server and the server sends on an acknowledgement to the client we want to guaranty that the request will happen, even if the server crashes. As requests can be long running and the acknowledgement time needs be short we implement this by persisting the request, then sending an acknowledgement to the client, then carrying out the various actions to fulfill the request. As actions are carried out they too are persisted, so the server knows the state of a request on start up, and there’s also various reconciliation mechanisms with external systems to check the accuracy of our logs.
This all seems to work fairly well, but we have difficult saying this with any conviction as we find it very difficult to test our fault tolerant code. So far we’ve come up with two strategies but neither is entirely satisfactory:
Have an external process watch the server code and then try and kill it off at what the external process thinks is an appropriate point in the test
Add code the application that will cause it to crash a certain know critical points
My problem with the first strategy is the external process cannot know the exact state of the application, so we cannot be sure we’re hitting the most problematic points in the code. My problem with the second strategy, although it gives more control over were the fault takes, is I do not like have code to inject faults within my application, even with optional compilation etc. I fear it would be too easy to over look a fault injection point and have it slip into a production environment.
I think there are three ways to deal with this, if available I could suggest a comprehensive set of integration tests for these various pieces of code, using dependency injection or factory objects to produce broken actions during these integrations.
Secondly, running the application with random kill -9's, and disabling of network interfaces may be a good way to test these things.
I would also suggest testing file system failure. How you would do that depends on your OS, on Solaris or FreeBSD I would create a zfs file system in a file, and then rm the file while the application is running.
If you are using database code, then I would suggest testing failure of the database as well.
Another alternative to dependency injection, and probably the solution I would use, are interceptors, you can enable crash test interceptors in your code, these would know the state of the application and introduce the above listed failures at the correct time, or any others you may want to create. It would not require changes to your existing code, just some additional code to wrap it.
A possible answer to the first point is to multiply experiments with your external process so that probability to impact problematic parts of code is increased. Then you can analyze core dump file to determine where the code has actually crashed.
Another way is to increase observability and/or commandability by stubbing library or kernel calls, i.e., without modifying your application code.
You can find some resources on Fault Injection page of Wikipedia, in particular in Software Implemented Fault Injection section.
Your concern about fault injection is not a fundamental concern. You merely need a foolproof way to prevent such code ending up in deployment. One way to do so is by designing your fault injector as a debugger. I.e. the faults are injected by a process external to your process. This already provides a level of isolation. Furthermore, most OS'es provide some kind of access control which prevents debugging unless specifially enabled. In the most primitive form, it's by limiting it to root, on other operating systems it requires a specific "debug privilege". Naturally, on production nobody will have that, and thus your fault injector cannot even run on production.
Practially, the fault injector can set breakpoints at specific addresses, i.e. function or even line of code. You can then react to that, e.g. by terminating the process after a certain breakpoint is hit three times.
I was just about to write the same as Justin :)
The component I would suggest to replace during testing could be the logging component (if you have one, if not, I'd strongly suggest to implement one...). It's relatively easy to replace it with code that generates error and the logger usually gets enough information to know the current application state.
Also it seems to be feasible to make sure that the testing code doesn't go into production. I would discourage conditional compilation though but rather go with some configuration file to select the logging component.
Using "random" kills might help to detect errors but is not well suited for systematic testing because of its non-determinism. Therefore I wouldn't use it for automatic tests.

Why use AMQP/ZeroMQ/RabbitMQ

as opposed to writing your own library.
We're working on a project here that will be a self-dividing server pool, if one section grows too heavy, the manager would divide it and put it on another machine as a separate process. It would also alert all connected clients this affects to connect to the new server.
I am curious about using ZeroMQ for inter-server and inter-process communication. My partner would prefer to roll his own. I'm looking to the community to answer this question.
I'm a fairly novice programmer myself and just learned about messaging queues. As i've googled and read, it seems everyone is using messaging queues for all sorts of things, but why? What makes them better than writing your own library? Why are they so common and why are there so many?
what makes them better than writing your own library?
When rolling out the first version of your app, probably nothing: your needs are well defined and you will develop a messaging system that will fit your needs: small feature list, small source code etc.
Those tools are very useful after the first release, when you actually have to extend your application and add more features to it.
Let me give you a few use cases:
your app will have to talk to a big endian machine (sparc/powerpc) from a little endian machine (x86, intel/amd). Your messaging system had some endian ordering assumption: go and fix it
you designed your app so it is not a binary protocol/messaging system and now it is very slow because you spend most of your time parsing it (the number of messages increased and parsing became a bottleneck): adapt it so it can transport binary/fixed encoding
at the beginning you had 3 machine inside a lan, no noticeable delays everything gets to every machine. your client/boss/pointy-haired-devil-boss shows up and tell you that you will install the app on WAN you do not manage - and then you start having connection failures, bad latency etc. you need to store message and retry sending them later on: go back to the code and plug this stuff in (and enjoy)
messages sent need to have replies, but not all of them: you send some parameters in and expect a spreadsheet as a result instead of just sending and acknowledges, go back to code and plug this stuff in (and enjoy.)
some messages are critical and there reception/sending needs proper backup/persistence/. Why you ask ? auditing purposes
And many other use cases that I forgot ...
You can implement it yourself, but do not spend much time doing so: you will probably replace it later on anyway.
That's very much like asking: why use a database when you can write your own?
The answer is that using a tool that has been around for a while and is well understood in lots of different use cases, pays off more and more over time and as your requirements evolve. This is especially true if more than one developer is involved in a project. Do you want to become support staff for a queueing system if you change to a new project? Using a tool prevents that from happening. It becomes someone else's problem.
Case in point: persistence. Writing a tool to store one message on disk is easy. Writing a persistor that scales and performs well and stably, in many different use cases, and is manageable, and cheap to support, is hard. If you want to see someone complaining about how hard it is then look at this: http://www.lshift.net/blog/2009/12/07/rabbitmq-at-the-skills-matter-functional-programming-exchange
Anyway, I hope this helps. By all means write your own tool. Many many people have done so. Whatever solves your problem, is good.
I'm considering using ZeroMQ myself - hence I stumbled across this question.
Let's assume for the moment that you have the ability to implement a message queuing system that meets all of your requirements. Why would you adopt ZeroMQ (or other third party library) over the roll-your-own approach? Simple - cost.
Let's assume for a moment that ZeroMQ already meets all of your requirements. All that needs to be done is integrating it into your build, read some doco and then start using it. That's got to be far less effort than rolling your own. Plus, the maintenance burden has been shifted to another company. Since ZeroMQ is free, it's like you've just grown your development team to include (part of) the ZeroMQ team.
If you ran a Software Development business, then I think that you would balance the cost/risk of using third party libraries against rolling your own, and in this case, using ZeroMQ would win hands down.
Perhaps you (or rather, your partner) suffer, as so many developers do, from the "Not Invented Here" syndrome? If so, adjust your attitude and reassess the use of ZeroMQ. Personally, I much prefer the benefits of Proudly Found Elsewhere attitude. I'm hoping I can proud of finding ZeroMQ... time will tell.
EDIT: I came across this video from the ZeroMQ developers that talks about why you should use ZeroMQ.
what makes them better than writing your own library?
Message queuing systems are transactional, which is conceptually easy to use as a client, but hard to get right as an implementor, especially considering persistent queues. You might think you can get away with writing a quick messaging library, but without transactions and persistence, you'd not have the full benefits of a messaging system.
Persistence in this context means that the messaging middleware keeps unhandled messages in permanent storage (on disk) in case the server goes down; after a restart, the messages can be handled and no retransmit is necessary (the sender does not even know there was a problem). Transactional means that you can read messages from different queues and write messages to different queues in a transactional manner, meaning that either all reads and writes succeed or (if one or more fail) none succeeds. This is not really much different from the transactionality known from interfacing with databases and has the same benefits (it simplifies error handling; without transactions, you would have to assure that each individual read/write succeeds, and if one or more fail, you have to roll back those changes that did succeed).
Before writing your own library, read the 0MQ Guide here: http://zguide.zeromq.org/page:all
Chances are that you will either decide to install RabbitMQ, or else you will make your library on top of ZeroMQ since they have already done all the hard parts.
If you have a little time give it a try and roll out your own implemntation! The learnings of this excercise will convince you about the wisdom of using an already tested library.