Monitor and handle MSGW messages on a job on an IBM i-series (AS/400) from Java - jobs

Does anyone know how one can automatically reply to messages with status MSGW that block a job on an IBM i-series (AS/400)?
I'm using the jt400/jtopen library to access a program on an AS/400 from Java. I'm using the com.ibm.as400.access.ProgramCall class, which works fine, unless the program fails for some reason. As with almost any program, failures will happen sometimes, but unfortunately, in this case, it does not result in a status message or an exception. Instead, the calling thread just hangs. What's worse, any call to the AS/400 to get information on the Job (another class in jt400 that mostly does what you would expect) backing the queue will hang as well.
I could of course monitor the thread in which the call runs and simply kill it after waiting for a while, but that's a last resort. Getting an error message back from the system would be nice.

You could try execute this command before invoke your pcml with com.ibm.as400.access.CommandCall.run() method:
CHGJOB INQMSGRPY(*DFT)
It sets 'C' as default answer for all messages.
but you should ensure you have log of the messages in order to know the problem which generates this message
Regards,

I don't believe Java can directly trap errors that occur on the other side of that API. What I've done is to 'harden' the RPG (IBM i side) program so that it monitors for errors rather than let the default error handler get them. When an error occurs, the RPG program gracefully terminates and passes back an error code or even the entire message back to the Java application.

I've found that you can use the timeout mechanism of ExecutorService to interrupt a ProgramCall in MSGW.
You must discard the AS400 object afterwards, and the server job is still in MSGW, but at least you can continue on the Javaside.
(You need to use a separate AS400 object if you want to investigate on the hanging job.)

Related

AXON framework synchronous response

I am new to AXON framework and are using it for our development. We have a requirement where command (command side) is created for the persisting data, for the same event is triggered which is consumed at query side. Now we need to have a response back to command side from query side which says if the record is persisted into database successfully (custom successful message) or if failed then the reason of the failure (custom exception message as response). Kindly help if there is any way to achieve such scenario.
Here command side and query side are 2 different micro-services and we are using Rabbit Mq for event driven technique.
Thanks in advance
I think that what you are asking is if there is a way for the command and event to be processed in a single transaction?
If you use a subscribing event processor, running in the same JVM, the event is processed synchronously and the whole transaction is rolled back in case of an exception in an event handler. This is not the case here, because you have loosely coupled separate services, which is good.
It's best practice for the aggregate with the command handler to have all the information available to decide whether or not the command can successfully be processed, and when an event is applied, this is a signal that it has happened, and the other services (the query side in this case) have to be informed. It's not good practice for a query module to overrule this ("you say it happened, I say it didn't"). If there is an error in the query side, you fix it, and replay the event.
If it really is an error in the event handler that the whole system must know about, that is really a separate event. You can apply such an event directly on the event bus and notify the whole system. Something like this:
#Autowired
private EventBus eventBus;
(...)
CatastrophicFailureEvent failureEvent = new CatastrophicFailureEvent("OH NO!");
eventBus.publish(GenericEventMessage.asEventMessage(failureEvent));
I think you might need to reconsider your architecture. Keep in mind that events should encapsulate the irreversible state changes of your system. These state changes should not be questioned after they have happened. Your query side should only need to care about projecting these valid state changes that your command side has decided on.
If you need to check whether a user already existed, you need to do this on the command side in your aggregate. The aggregate can keep a list of all the existing usernames and throw an exception if an invalid command is given. The command response (tip: using the sendAndWait() method on the CommandGateway returns a response) can then be used as the system to inform your user about the success/failure of its action.
The following flow might solve your problem, but keep in mind that the user will get a callback on the success of the action even though the query side might not have processed its result yet. This part is eventually consistent.
Command Side:
Request from frontend handled by a Controller class and creates an corresponding command
The above command is invoked and handled by a command handler which creates the corresponding event or throws an exception if the user already exists.
The invoker of the command is informed about the success of the command or the exception is handled and the error shown to the user.
The above event is published through rabbit mq event bus if the command was successful.
Query side:
The event that is published in the step 4 is consumed by the event handler in query side. No checks or validations should be necessary, since they were already handled on the command side.
#Mzzl
Series of activities
Command Side:
1. Request from frontend handled by a Controller class and creates an corresponding command
2. The above command is invoked and handled by a command handler which in return create corresponding event
3. The above event is then published through rabbit mq event bus.
Query Side:
4. The event that is published in the step 3 is consumed by the event handler in query side.
5. The event handler has the logic to perform db transaction (lets assume add a user). Once a user is added then a success message or failure message (lets assume user already available in the DB so could not create duplicate entry) should flow from query side to command side and eventually back to UI as a repsonse.
I'm not sure I've fully understand your issue (especially the microservice part :)),
but if your problem is related to having the query side up to date after the command execution, then you can have a look at this project.
In this example, you can see that he uses a SubscriptionQueryResult in conjunction with a QueryUpdateEmitter (see here)
Basically you will subscribe to query side changes before the command is issued, and you will block after the command execution until the query side send a notification when it is up to date.
This way you can avoid the eventual consistency.

How to catch application crash in DLL

I am coding DLL and want to send some IOCTL to Kernel eventually to attached hardware to clean-up in case of client application crashes.
The crash can be client application programmer's mistake e.g. invalid access, divide by 0 etc. In this situation, my attached hardware has to take some clean up action.
How DLL gets notified of attached client application's crash?
I do not think you can deal with all the crash in the dll, because the whole process is ended. The common way to do that, is to get the exit code of the process, you can write a service to get the exit code of the process, and do the clean job depends on exit code.
But you can still deal with some exceptions in your dll. You can use API
"AddVectoredExceptionHandler". Please read Effective Exception Handling in Visual C++ and Exception Handling - Inform your users! .

How do I get a list of worker threads of nservicebus

How do I get a list of worker threads of nservicebus. I need to register workerThread ids in to db and then bind some type of messages to the exact workerthread. Real idea is handling poison messages. Want to block all the threads not to handle poison messages except specified ones. There will be a seperate service that will manage threads through database.
I would not try to do that. It is almost sure to run into problems.
Of course, in order to get some sort of "identity" for each thread, you could place something like this in your message handler:
[ThreadStatic]
private static readonly Guid ThreadId = Guid.NewGuid();
But again, I wouldn't do that! The guids would change every time the endpoint was restarted, for one.
You could also query the list of threads direct from .NET and try to determine which ones were the message handling threads, but that sounds so scary I don't even want to go into it.
The real issue: Poison Message Handling
As your comment states, the real problem is that a poison message is REALLY poison. Not only is it failing, but it's taking so long to do so that it's really screwing up all the other threads!
Since you are able to identify these messages based on certain properties of the message, I would detect and throw an exception before the operation that times out. All the time.
If you want to be able to test periodically to see if the issue has been fixed, you have a few options:
Test via other means, and return the messages to the source queue when it has been fixed.
Add an appSetting so that the quick-throw behavior is skipped when the config setting is enabled. Then periodically you can edit the config, restart the endpoint, see if it's fixed, and then switch it back if it isn't.
Create another message handler that maintains a thread-locked increment value of zero. Send it a control message to say "Hey, try one now." Then your quick-throw behavior can decrement that value and allow one through to see what happens. This is also dangerous of course. Make sure your locking is tight since you are now sharing this state between different message processing threads.

Does NserviceBus finishes message handling if windows service is stopped?

I have nservicebus program installed as windows service.
When I stop windows service, does NserviceBus waits until current message handler is completed?
If the answer is no, then I may do a half of the work, and appear in incorrect state(if my program uses database).
I have done an experiment - created a program that does Thread.Sleep(100 * 1000) when it receives message.
When I stop windows service while message processing, it stops quickly and do not wait 100 seconds. So I assume that the answer is no.
If the answer is yes, please point me to the place in source code, where such stuff occurs.
NSB does not wait so the current message will rollback to the queue. All handlers are run in a transaction so there should be no risk to get in an inconsistent state. Assuming that your database is enlisted in the dtc transaction. If not you'll be inconsistent anyway so all bets are off.

Cocoa Distributed Objects, Long Polling, launchd and "Not Responding" in Activity Monitor

Scenario:
I have a Distributed-objects-based IPC between a mac application and a launchd daemon (written with Foundation classes). Since I had issues before regarding asynchronous messaging (e.g. I have a registerClient: on the server's root object and whenever there's an event the server's root object notifies / calls a method in the client's proxy object), I did long-polling which meant that the client "harvests" lists of events / notifications from the daemon. This "harvest" is done through a server object method call, which then returns an NSArray instance.
It works pretty well, until for a few seconds, the server object's process (launched thru launchd) starts being labeled red with the "(Not responding)" tag beside it (inside Activity Monitor). Like I said, functionally, it works well, but we just want to get rid of this "Not responding" label.
How can I prevent this "Not responding" tag?
FYI, I already did launchd-based processes before and this is the first time I did long-polling. Also, I tried NSSocketPortNameServer-based connections and also NSSocketPort-based ones. They didn't have this problem. Locking wasn't also an issue 'coz the locks used were only NSCondition's and we logged and debugged the program and it seems like the only locking "issue" is on the harvesting part, which actually, functionally works. Also, client-process is written in PyObjC while server process was written using ObjC.
Thanks in advance.
Sample the process to find out what it's doing or waiting on.
Peter's correct in the approach, though you may be able to figure it out through simple inspection. "Not responding" means that you're not processing events on your event queue for at least 5 seconds (used to be 2 seconds, but they upped it in 10.4). For a UI process, this would create a spinning wait cursor, but for a non-UI process, you're not seeing the effects as easily.
If this is a runloop-based program, it means you're probably doing something with a blocking (synchronous) operation that should be done with the run loop and a callback (async). Alternately, you need a second thread to process your blocking operations so your mainthread can continue to respond to events.
My problem was actually the call for getting a process's PID using the signature FNDR... that part caused the "Not responding" error and it never was the locks or the long-polling part. Sorry about this guys. But thank God I already found the answer.