Raising/Clearing SNMP alarm best practices - snmp4j

I have an application that raises SNMP alarms when it cannot reach an external API. I clear the alarm whenever I successfully get a response back from the API.
The code is below.
try {
// Call the web service to check whether the external API is up
logger.debug("Sending trap data Clear Alarm {}" , trapData);
AlarmTrap.INTERFACE_SMSC_STATUS.clear(trapData);
}
catch(CustomException e)
{
AlarmTrap.INTERFACE_SMSC_STATUS.raise(trapData);
logger.error("Error " + e);
throw e;
}
As you can see, I clear the alarm on every successful response. This has no impact on the current execution, because the SNMP server discards repeated alarms of the same kind. I want to know whether this is good practice, and whether the SNMP protocol itself handles duplicate alarms so that they are not sent across the network.

If you do not want to send duplicate alarms for consecutive successful API responses, you can create an AtomicBoolean class variable, isErrorAlert, so that the SNMP clear trap is sent only when isErrorAlert is true.
AtomicBoolean isErrorAlert = new AtomicBoolean();
try{
//API Success case
if(isErrorAlert.compareAndSet(true, false)){
//send the clear trap only if an error occurred earlier
}
} catch(Exception e) {
//Fail case
isErrorAlert.set(true);
}
References:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicBoolean.html
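A slightly fuller sketch of the same idea, gating both directions so that a trap goes out only when the state actually changes. The class name AlarmGate, the sentTraps list and the "RAISE"/"CLEAR" strings are placeholders invented for illustration; in the real application they would be replaced by the AlarmTrap raise/clear calls from the question. Note that SNMP itself does not deduplicate traps: every trap the agent sends goes onto the network, so any suppression has to happen in the sender or in the receiving manager.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Sends a trap only when the alarm state actually changes (illustrative sketch). */
class AlarmGate {
    // true while an alarm has been raised and not yet cleared
    private final AtomicBoolean alarmActive = new AtomicBoolean(false);

    // Placeholder for the real trap sender: records what would have been sent
    final List<String> sentTraps = Collections.synchronizedList(new ArrayList<>());

    /** Call after a successful API response. */
    void onSuccess() {
        // Succeeds only for the first success that follows a failure
        if (alarmActive.compareAndSet(true, false)) {
            sentTraps.add("CLEAR");
        }
    }

    /** Call when the API call fails. */
    void onFailure() {
        // Succeeds only for the first failure that follows a success
        if (alarmActive.compareAndSet(false, true)) {
            sentTraps.add("RAISE");
        }
    }
}
```

One trade-off to weigh: traps are normally sent over UDP without acknowledgement, so if the single RAISE or CLEAR is lost, the manager stays out of date until the next state change. Re-sending periodically is one way to cover that, at the cost of the duplicates.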


Apache HTTPClient5 - How to Prevent Connection/Stream Refused

Problem Statement
Context
I'm a Software Engineer in Test running order permutations of Restaurant Menu Items to confirm that they succeed order placement w/ the POS
In short, this POSTs a JSON payload to an endpoint which then validates the order w/ a POS to define success/fail/other
Where POS, and therefore Transactions per Second (TPS), may vary, but each Back End uses the same core handling
This can be as high as ~22,000 permutations per item, in easily manageable JSON size, that need to be handled as quickly as possible
The Network can vary wildly depending upon the Restaurant, and/or Region, one is testing
E.g. where some have a much higher latency than others
Therefore, the HTTPClient should be able to intelligently negotiate the same content & endpoint regardless of this
Direct Problem
I'm using Apache's HTTP Client 5 w/ PoolingAsyncClientConnectionManager to execute both the GET for the Menu contents, and the POST to check if the order succeeds
This works out of the box, but sometimes loses connections w/ Stream Refused, specifically:
org.apache.hc.core5.http2.H2StreamResetException: Stream refused
No individual tuning seems to work across all network contexts w/ variable latency, that I can find
Following the stack trace suggests that the stream had already closed, so I need a way either to keep it open or to avoid executing on an already-closed connection
if (connState == ConnectionHandshake.GRACEFUL_SHUTDOWN) {
throw new H2StreamResetException(H2Error.PROTOCOL_ERROR, "Stream refused");
}
Some Attempts to Fix Problem
Tried to use Search Engines to find answers but there are few hits for HTTPClient5
Tried to use official documentation but this is sparse
Changing max connections per route to a reduced number, shifting inactivity validations, or connection time to live
Where the inactivity checks may fix the POST, but stall the GET for some transactions
And that tuning for one region/restaurant may work for 1 then break for another, w/ only the Network as variable
PoolingAsyncClientConnectionManager manager = PoolingAsyncClientConnectionManagerBuilder
.create()
.setTlsStrategy(getTlsStrategy())
.setMaxConnPerRoute(12)
.setMaxConnTotal(12)
.setValidateAfterInactivity(TimeValue.ofMilliseconds(1000))
.setConnectionTimeToLive(TimeValue.ofMinutes(2))
.build();
Shifting to a custom RequestConfig w/ different timeouts
private HttpClientContext getHttpClientContext() {
RequestConfig requestConfig = RequestConfig.custom()
.setConnectTimeout(Timeout.of(10, TimeUnit.SECONDS))
.setResponseTimeout(Timeout.of(10, TimeUnit.SECONDS))
.build();
HttpClientContext httpContext = HttpClientContext.create();
httpContext.setRequestConfig(requestConfig);
return httpContext;
}
Initial Code Segments for Analysis
(In addition to the above segments w/ change attempts)
Wrapper handling to init and get response
public SimpleHttpResponse getFullResponse(String url, PoolingAsyncClientConnectionManager manager, SimpleHttpRequest req) {
try (CloseableHttpAsyncClient httpclient = getHTTPClientInstance(manager)) {
httpclient.start();
CountDownLatch latch = new CountDownLatch(1);
long startTime = System.currentTimeMillis();
Future<SimpleHttpResponse> future = getHTTPResponse(url, httpclient, latch, startTime, req);
latch.await();
return future.get();
} catch (IOException | InterruptedException | ExecutionException e) {
e.printStackTrace();
return new SimpleHttpResponse(999, CommonUtils.getExceptionAsMap(e).toString());
}
}
With actual handler and probing code
private Future<SimpleHttpResponse> getHTTPResponse(String url, CloseableHttpAsyncClient httpclient, CountDownLatch latch, long startTime, SimpleHttpRequest req) {
return httpclient.execute(req, getHttpContext(), new FutureCallback<SimpleHttpResponse>() {
@Override
public void completed(SimpleHttpResponse response) {
latch.countDown();
logger.info("[{}][{}ms] - {}", response.getCode(), getTotalTime(startTime), url);
}
@Override
public void failed(Exception e) {
latch.countDown();
logger.error("[{}ms] - {} - {}", getTotalTime(startTime), url, e);
}
@Override
public void cancelled() {
latch.countDown();
logger.error("[{}ms] - request cancelled for {}", getTotalTime(startTime), url);
}
});
}
Direct Question
Is there a way to configure the client so that it handles these variances on its own, without explicitly modifying the configuration for each endpoint context?
Fixed w/ Combination of the below to Assure Connection Live/Ready
(Or at least is stable)
Forcing HTTP 1
HttpAsyncClients.custom()
.setConnectionManager(manager)
.setRetryStrategy(getRetryStrategy())
.setVersionPolicy(HttpVersionPolicy.FORCE_HTTP_1)
.setConnectionManagerShared(true);
Setting Effective Headers for POST
Specifically the close header
req.setHeader("Connection", "close, TE");
Note: Inactivity check helps, but still sometimes gets refusals w/o this
Setting Inactivity Checks by Type
Set POSTs to validate immediately after inactivity
Note: Using 1000 for both caused a high drop rate for some systems
PoolingAsyncClientConnectionManagerBuilder
.create()
.setValidateAfterInactivity(TimeValue.ofMilliseconds(0))
Set GET to validate after 1s
PoolingAsyncClientConnectionManagerBuilder
.create()
.setValidateAfterInactivity(TimeValue.ofMilliseconds(1000))
Given the Error Context
Tracing the connection problem in stacktrace to AbstractH2StreamMultiplexer
Shows ConnectionHandshake.GRACEFUL_SHUTDOWN as triggering the stream refusal
if (connState == ConnectionHandshake.GRACEFUL_SHUTDOWN) {
throw new H2StreamResetException(H2Error.PROTOCOL_ERROR, "Stream refused");
}
Which corresponds to
connState = streamMap.isEmpty() ? ConnectionHandshake.SHUTDOWN : ConnectionHandshake.GRACEFUL_SHUTDOWN;
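To make the failure mode concrete, here is a toy model of it. This is NOT the HttpClient implementation: ToyConnection and all of its methods are invented for illustration, and the only part taken from the source above is the ternary that picks SHUTDOWN or GRACEFUL_SHUTDOWN. The move to SHUTDOWN once the last stream closes is an assumption of the toy. It shows why a request handed to a pooled connection that has already started winding down gets "Stream refused".

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a multiplexed connection; not the library's code. */
class ToyConnection {
    enum State { ACTIVE, GRACEFUL_SHUTDOWN, SHUTDOWN }

    private State state = State.ACTIVE;
    private final Set<Integer> openStreams = new HashSet<>();
    private int nextStreamId = 1;

    /** Opens a new stream, or refuses it if the connection is winding down. */
    int openStream() {
        if (state != State.ACTIVE) {
            throw new IllegalStateException("Stream refused");
        }
        int id = nextStreamId++;
        openStreams.add(id);
        return id;
    }

    /** Closes a stream; assumed here to finish a graceful shutdown when none remain. */
    void closeStream(int id) {
        openStreams.remove(id);
        if (state == State.GRACEFUL_SHUTDOWN && openStreams.isEmpty()) {
            state = State.SHUTDOWN;
        }
    }

    /** Mirrors the quoted ternary: in-flight streams keep the connection half-alive. */
    void beginShutdown() {
        state = openStreams.isEmpty() ? State.SHUTDOWN : State.GRACEFUL_SHUTDOWN;
    }

    State state() {
        return state;
    }
}
```

In this model the existing stream is allowed to finish, but any new request on the same connection is rejected, which matches what the validation and close-header settings above are trying to avoid.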
Reasoning
If I'm understanding correctly:
The connections were being un/intentionally closed
However, they were not being confirmed ready before executing again
Which caused it to fail because the stream was not viable
Therefore the fix works because (it seems)
Given Forcing HTTP1 allows for a single context to manage
Where HttpVersionPolicy NEGOTIATE/FORCE_HTTP_2 had greater or equivalent failures across the spectrum of regions/menus
And it assures that all connections are valid before use
And POSTs are always closed due to the close header, which is unavailable to HTTP2
Therefore
GET is checked for validity w/ reasonable periodicity
POST is checked every time, and since it is forcibly closed, it is re-acquired before execution
Which leaves no room for unexpected closures
And otherwise the potential that it was incorrectly switching to HTTP2
Will accept this until a better answer comes along, as this is stable but sub-optimal.

RabbitMQ - Non Blocking Consumer with Manual Acknowledgement

I'm just starting to learn RabbitMQ so forgive me if my question is very basic.
My problem is actually the same as the one posted here:
RabbitMQ - Does one consumer block the other consumers of the same queue?
However, upon investigation, I found out that manual acknowledgement prevents other consumers from getting a message from the queue (a blocking state). I would like to know how I can prevent it. Below is my code snippet.
...
var message = receiver.ReadMessage();
Console.WriteLine("Received: {0}", message);
// simulate processing
System.Threading.Thread.Sleep(8000);
receiver.Acknowledge();
public string ReadMessage()
{
bool autoAck = false;
Consumer = new QueueingBasicConsumer(Model);
Model.BasicConsume(QueueName, autoAck, Consumer);
_ea = (BasicDeliverEventArgs)Consumer.Queue.Dequeue();
return Encoding.ASCII.GetString(_ea.Body);
}
public void Acknowledge()
{
Model.BasicAck(_ea.DeliveryTag, false);
}
I changed how I get messages from the queue, and that seems to have fixed the blocking issue. My code is below.
public string ReadOneAtTime()
{
Consumer = new QueueingBasicConsumer(Model);
var result = Model.BasicGet(QueueName, false);
if (result == null) return null;
DeliveryTag = result.DeliveryTag;
return Encoding.ASCII.GetString(result.Body);
}
public void Reject()
{
Model.BasicReject(DeliveryTag, true);
}
public void Acknowledge()
{
Model.BasicAck(DeliveryTag, false);
}
Going back to my original question, I added the QoS setting and noticed that other consumers can now get messages. However, some messages are left unacknowledged and my program seems to hang. The code changes are below:
public string ReadMessage()
{
Model.BasicQos(0, 1, false); // control prefetch
bool autoAck = false;
Consumer = new QueueingBasicConsumer(Model);
Model.BasicConsume(QueueName, autoAck, Consumer);
_ea = Consumer.Queue.Dequeue();
return Encoding.ASCII.GetString(_ea.Body);
}
public void AckConsume()
{
Model.BasicAck(_ea.DeliveryTag, false);
}
In Program.cs
private static void Consume(Receiver receiver)
{
int counter = 0;
while (true)
{
var message = receiver.ReadMessage();
if (message == null)
{
Console.WriteLine("NO message received.");
break;
}
else
{
counter++;
Console.WriteLine("Received: {0}", message);
receiver.AckConsume();
}
}
Console.WriteLine("Total message received {0}", counter);
}
I appreciate any comments and suggestions. Thanks!
Well, RabbitMQ provides an infrastructure in which one consumer can't lock or block another consumer working with the same queue.
The behavior you encountered can be the result of a couple of issues:
Because you are not using auto-ack mode on the channel, a consumer can take a message and not yet have sent the acknowledgement (basic ack). This means the computation is still in progress, the consumer may yet fail to process the message, and the message must be kept in the RabbitMQ queue to prevent message loss (the total number of messages will not change in the management console). During this period, from delivering the message to client code until the explicit acknowledgement is sent, the message is marked as being used by a specific client and is not available to other consumers. However, this doesn't prevent other consumers from taking other messages from the queue, if there are more messages to take.
IMPORTANT: to prevent message loss with manual acknowledgement, make sure
to close the channel or send a nack when processing fails, to avoid
the situation where your application takes the message from the queue,
fails to process it, and the message ends up removed from the queue and lost.
Another reason why other consumers can't work with the same queue is QoS, a channel parameter that declares how many messages should be pushed to the client's cache to improve dequeue latency (by working from a local cache). Your code example lacks this part, so I am just guessing. In a case like this, the QoS value can be so large that all messages on the server are marked as belonging to one client and no other client can take any of them, exactly as with the manual ack I've already described.
Hope this helps.
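The prefetch effect described above can be seen in a small simulation. This is a toy model in Java, not the RabbitMQ client or broker: ToyBroker, Consumer and their methods are invented for illustration, and the only convention borrowed from RabbitMQ is that a prefetch of 0 means "no limit". It assumes the messages are already in the queue when the first consumer subscribes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy broker: pushes ready messages to consumers up to their prefetch limit. */
class ToyBroker {
    static class Consumer {
        final String name;
        final int prefetch; // 0 means "no limit"
        final List<String> unacked = new ArrayList<>();

        Consumer(String name, int prefetch) {
            this.name = name;
            this.prefetch = prefetch;
        }

        boolean hasCapacity() {
            return prefetch == 0 || unacked.size() < prefetch;
        }
    }

    final Deque<String> ready = new ArrayDeque<>();
    private final List<Consumer> consumers = new ArrayList<>();

    void publish(String msg) {
        ready.add(msg);
        dispatch();
    }

    Consumer subscribe(String name, int prefetch) {
        Consumer c = new Consumer(name, prefetch);
        consumers.add(c);
        dispatch();
        return c;
    }

    /** Acknowledging frees one slot, so the broker may push another message. */
    void ack(Consumer c, String msg) {
        if (c.unacked.remove(msg)) {
            dispatch();
        }
    }

    // Hand out ready messages in turn until the queue is empty or nobody has capacity
    private void dispatch() {
        boolean delivered = true;
        while (delivered && !ready.isEmpty()) {
            delivered = false;
            for (Consumer c : consumers) {
                if (!ready.isEmpty() && c.hasCapacity()) {
                    c.unacked.add(ready.poll());
                    delivered = true;
                }
            }
        }
    }
}
```

With six queued messages and no limit, the first subscriber ends up holding all six as unacknowledged and a second subscriber receives nothing. With a prefetch of 1, each subscriber holds one message and the rest stay ready in the queue until someone acks.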

How to make an Attended call transfer with UCMA

I'm struggling with making a call transfer in a UCMA IVR app I've built. This is not using Lync.
Essentially, I have an established call from an outside user and, as part of the IVR application, they select an option to be transferred. This transfer is to a configured outside number (i.e. our live operator). What I want to do is transfer the original caller to the outside number, and if a valid transfer is established, I want to terminate the original call. If the transfer isn't established, I want to send control back to the IVR application to handle this gracefully.
My problem is my EndTransferCall doesn't get hit when the transfer is established. I would have expected it to hit, set my AutoResetEvent and return a True, and then in my application I can disconnect the original call. Can somebody tell me what I'm missing here?
_call is an established AudioVideoCall. My application calls the Transfer method
private AutoResetEvent _waitForTransferComplete = new AutoResetEvent(false);
public override bool Transfer(string number, int retries = 3)
{
var success = false;
var attempt = 0;
CallTransferOptions transferOptions = new CallTransferOptions(CallTransferType.Attended);
while ((attempt < retries) && (success == false))
{
try
{
attempt++;
_call.BeginTransfer(number, transferOptions, EndTransferCall, null);
// Wait for the transfer to complete
_waitForTransferComplete.WaitOne();
success = true;
}
catch (Exception)
{
//TODO: Log that the transfer failed
//TODO: Find out what exceptions get thrown and catch the specific ones
}
}
return success;
}
private void EndTransferCall(IAsyncResult ar)
{
try
{
_call.EndTransfer(ar);
}
catch (OperationFailureException opFailEx)
{
Console.WriteLine(opFailEx.ToString());
}
catch (RealTimeException realTimeEx)
{
Console.WriteLine(realTimeEx.ToString());
}
finally
{
_waitForTransferComplete.Set();
}
}
Is the behavior the same if you don't use the _waitForTransferComplete object? You shouldn't need it: it should be fine for the method to end, and the event will still be raised. If you're forcing synchronous behavior in order to fit in with the rest of the application, though, try it like this:
_call.EndTransfer(
_call.BeginTransfer(number, transferOptions, null, null)
);
I'm just wondering if the waiting like that causes a problem if running on a single thread or something...
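On the waiting question in general, an unbounded wait hangs forever if the callback never fires, and a flag set after the wait does not say whether the operation actually succeeded. The sketch below shows the general pattern in Java rather than C#/UCMA, purely as an analogy: TransferWaiter and awaitTransfer are invented names, and the CompletableFuture stands in for the pending transfer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Bounded wait on an asynchronous operation (illustrative analogy only). */
class TransferWaiter {
    /** Returns true only if the operation completed successfully within the timeout. */
    static boolean awaitTransfer(CompletableFuture<Void> transfer, long timeoutMillis) {
        try {
            transfer.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the operation never completed in time
        } catch (ExecutionException e) {
            return false; // the operation completed, but with a failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }
}
```

The point of the design is that success is derived from the outcome of the operation, so a retry loop around it only stops when a transfer really went through.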

JMS expiration not working after the message has been received

The title might be confusing, but this is what I want to accomplish. I want to send a JMS message from one EJB to another; the 2nd EJB has a message listener, and this is now working properly. I also wanted the 1st EJB to create a temporary destination queue where the 2nd EJB would respond, and this is working properly too.
My problem is in the 2nd EJB: it calls a 3rd-party web service that on some occasions responds after a long time, by which point the temporary queue should have expired. But it doesn't. According to java.net: http://java.net/projects/mq/lists/users/archive/2011-07/message/22
The message hasn't been delivered to a client and it expires -- in this case, the message is deleted when TTL is up.
The message is delivered to the JMS client (it's in-flight). Once this happens, since control is handed to the jms client, the broker cannot expire the message.
Finally, the jms client will check TTL just before it gives the message to the user application. If it's expired, we will not give it to the application and it will send a control message back to the broker indicating that the message was expired and not delivered.
So the message was received, but there is no reply yet. By the time ejb2 writes to the temporary queue, it should already have expired, but for some reason I was still able to write to the queue, and I have the following in my imq log:
1 messages not expired from destination jmsXXXQueue [Queue] because they have been delivered to client at time of the last expiration reaping
Is there another approach that lets me detect whether the temporary queue has already expired, so that I can perform a different set of actions? My problem right now is that ejb2 responds late and there is no longer a JMS reader in ejb1, because it's already gone.
It works now. My solution was to wrap the 1st stateless bean (the one where the first JMS message originates) in a bean-managed transaction. See the code below:
@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
@LocalBean
public class MyBean {
public void startProcess() {
Destination replyQueue = send(jmsUtil, actionDTO);
responseDTO = readReply(jmsUtil, replyQueue, actionDTO);
jmsUtil.dispose();
}
public Destination send(JmsSessionUtil jmsUtil, SalesOrderActionDTO soDTO) {
try {
utx.begin();
jmsUtil.send(soDTO, null, 0L, 1,
Long.parseLong(configBean.getProperty("jms.payrequest.timetolive")), true);
utx.commit();
return jmsUtil.getReplyQueue();
} catch (Exception e) {
try {
utx.rollback();
} catch (Exception e1) {
}
}
return null;
}
public ResponseDTO readReply(JmsSessionUtil jmsUtil, Destination replyQueue,
SalesOrderActionDTO actionDTO) {
ResponseDTO responseDTO = null;
try {
utx.begin();
responseDTO = (ResponseDTO) jmsUtil.read(replyQueue);
if (responseDTO != null) {
//do some action
} else { // timeout
((TemporaryQueue) replyQueue).delete();
jmsUtil.dispose();
}
utx.commit();
return responseDTO;
} catch (Exception e) {
try {
utx.rollback();
} catch (Exception e1) {
}
}
return responseDTO;
}
}
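A possible complement, not part of the solution above: the replying side can check the request's deadline itself just before it writes the reply. In JMS the value returned by Message.getJMSExpiration() is an absolute time in milliseconds, and 0 means the message never expires. The helper below is a plain-Java sketch of that check; ReplyDeadline and isExpired are invented names, and it assumes ejb2 has kept the expiration value of the request it received.

```java
/** Deadline check for a late reply (illustrative sketch, plain Java). */
class ReplyDeadline {
    /**
     * @param jmsExpirationMillis value taken from the request's JMSExpiration header;
     *                            0 means "never expires"
     * @param nowMillis           current time, e.g. System.currentTimeMillis()
     * @return true if the deadline has passed and the reply should be skipped
     */
    static boolean isExpired(long jmsExpirationMillis, long nowMillis) {
        return jmsExpirationMillis != 0 && nowMillis > jmsExpirationMillis;
    }
}
```

If the check returns true, ejb2 can skip the reply and run whatever compensating action is appropriate, instead of writing to a temporary queue that nobody is reading any more.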

How can I send a notification message from server to all clients in WCF (broadcast you can say)?

I want to send a notification message every second from a net.tcp WCF service to all clients.
A broadcast, you could say?
After the helpful answers,
I wrote the following method, which sends notifications (heartbeats) to all connected users:
foreach (IHeartBeatCallback callback in subscribers)
{
ThreadPool.QueueUserWorkItem(delegate(object state)
{
ICommunicationObject communicationCallback = (ICommunicationObject)callback;
if (communicationCallback.State == CommunicationState.Opened)
{
try
{
callback.OnSendHeartBeat(_heartbeatInfo.message, _heartbeatInfo.marketstart,_heartbeatInfo.marketend, _heartbeatInfo.isrunning, DateTime.Now);
}
catch (CommunicationObjectAbortedException)
{
Logger.Log(LogType.Info, "BroadCast", "User aborted");
communicationCallback.Abort();
}
catch (TimeoutException)
{
Logger.Log(LogType.Info, "BroadCast", "User timeout");
communicationCallback.Abort();
}
catch (Exception ex)
{
Logger.Log(LogType.Error, "BroadCast", "Exception " + ex.Message + "\n" + ex.StackTrace);
communicationCallback.Abort();
}
}
else
{
DeletionList.Add(callback);
}
}
);
}
I am worried about calling the callback method, since a client may close its application. I handle that with the try/catch, a decreased timeout, and sending the broadcast in parallel. Is that sufficient?
You'll need to set up a callback service; I wrote a simple beginner's guide a while back
In order to do that, you need to create and maintain a list of all connected clients (the general practice for this is to create LogIn and LogOut methods that build and manage a list of objects representing your clients, including their CallbackContext).
Then, with a System.Timers.Timer, you can loop through the connected client list and send the notification.
Tip: this method could also act as a keep-alive or heartbeat method (if that isn't already its purpose by design) by adding the possibility of removing clients from your list when the service cannot send the callback to them.
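The register, broadcast and prune-on-failure pattern described above looks roughly like this. The sketch is in Java rather than WCF/C#, and HeartbeatHub, Subscriber, logIn and logOut are names invented for illustration; it only shows the shape of the idea.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Keeps a list of connected clients and drops those that can no longer be reached. */
class HeartbeatHub {
    /** Hypothetical callback contract; a real one would be the service's callback interface. */
    interface Subscriber {
        void onHeartbeat(String message) throws Exception;
    }

    // Copy-on-write list: safe to modify while another thread is iterating
    private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();

    void logIn(Subscriber s) {
        subscribers.add(s);
    }

    void logOut(Subscriber s) {
        subscribers.remove(s);
    }

    int size() {
        return subscribers.size();
    }

    /** Sends one heartbeat to every client and removes any client whose callback fails. */
    void broadcast(String message) {
        for (Subscriber s : subscribers) {
            try {
                s.onHeartbeat(message);
            } catch (Exception e) {
                subscribers.remove(s); // iteration runs over a snapshot, so this is safe
            }
        }
    }
}
```

For the once-per-second requirement, broadcast could be driven by a scheduler, for example a ScheduledExecutorService calling scheduleAtFixedRate with a one-second period. A slow client still delays the others in this simple loop, so per-client timeouts or parallel dispatch (as in the question's code) remain worth having.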