What to use for logging errors in a Quartz scheduled Job? - error-handling

I have an ASP.NET MVC 3 application. In this application I have a reminder system that uses Quartz to grab messages from the database and send them out.
I am wondering what the best way is to log when something goes wrong (say the database times out - I want to know about this).
In my MVC application I use ELMAH for logging and it works great. However, since Quartz.NET runs on its own thread without an HttpContext, I can't use ELMAH (or at least I don't think I can). I tried to obtain an HttpContext by first requesting my home page and then using that context from the scheduler code, but that did not work:
System.ArgumentNullException was unhandled by user code
Message=Value cannot be null.
Parameter name: application
Source=Elmah
ParamName=application
StackTrace:
at Elmah.ErrorSignal.Get(HttpApplication application)
at Elmah.ErrorSignal.FromContext(HttpContext context)
at Job.Execute(JobExecutionContext context) in Job.cs:line 19
at Quartz.Core.JobRunShell.Run()
InnerException:
The line in the job that throws it is:
ErrorSignal.FromCurrentContext().Raise(new System.Exception());
So I am looking either for a way to get ELMAH working in this scenario or for something comparable to it (something that sends emails, stack traces and everything else nice that ELMAH has).

You can try NLog. It's very simple to implement and effective.
You can send email and trace pretty much everything.
I normally keep everything in a separate config file (NLog.config):
<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
<targets>
<target name="DebugHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${identity} ${aspnet-request} ${message} ${exception}" />
<target name="ErrorHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${aspnet-request} ${message} ${exception}" />
<target name="FatalHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${aspnet-request} ${message} ${exception}" />
<target name="GenericHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${aspnet-request} ${message} ${exception}" />
</targets>
<rules>
<logger name="*" level="Debug" appendTo="DebugHandler" />
<logger name="*" level="Error" appendTo="ErrorHandler" />
<logger name="*" level="Fatal" appendTo="FatalHandler" />
<logger name="*" levels="Info,Warn" appendTo="GenericHandler" />
</rules>
</nlog>
As you can see, you can activate or deactivate the different levels.
I've used it in my Quartz.NET jobs as well as for debugging/tracing, and it does its job.
The only limitation is that you don't get an ELMAH-style interface, which can be a big drawback at times.
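For illustration, here is a minimal sketch of what the job's Execute method might look like with NLog, assuming the Quartz.NET 1.x IJob signature from the stack trace above and an NLog version that has the Error(Exception, string) overload (older versions use ErrorException instead):

using System;
using NLog;
using Quartz;

public class Job : IJob
{
    // One logger per class; its name appears as ${logger} in the layout.
    private static readonly Logger Log = LogManager.GetCurrentClassLogger();

    public void Execute(JobExecutionContext context)
    {
        try
        {
            Log.Debug("Reminder job started");
            // ... grab messages from the database and send them out ...
        }
        catch (Exception ex)
        {
            // Written to the Error/Fatal file targets (or a mail target, if one is configured).
            Log.Error(ex, "Reminder job failed");
            throw new JobExecutionException(ex);
        }
    }
}

Because the logger does not depend on HttpContext, this works on Quartz's worker threads without any of the ELMAH workarounds described in the question.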

Related

NLog initial send and multiple bulk emailing

Background
I have a web application written in ASP.NET Core 2.2. I use NLog as my third-party logger, which emails me each time there's an application error, and that works well.
<targets>
<!--eg email on error -->
<target xsi:type="Mail"
name="emailnotify"
header="An Error has been reported By foo.com (${machinename}) ${newline} "
layout="${longdate} ${level:uppercase=true:padding=5} - ${logger:shortName=true} - ${message} ${exception:format=tostring} ${newline}"
html="true"
addNewLines="true"
replaceNewlineWithBrTagInHtml="true"
subject="Error on foo.com (${machinename})"
to="foo#foo.com"
from="foo#foo.com"
secureSocketOption="StartTls"
smtpAuthentication="basic"
smtpServer="${environment:EmailConfigNlogSMTPServer}"
smtpPort="25"
smtpusername="${environment:EmailConfigNlogSMTPUsername}"
smtppassword="${environment:EmailConfigNlogSMTPPassword}" />
<!-- set up a blackhole log catcher -->
<target xsi:type="Null" name="blackhole" />
</targets>
<rules>
<!-- Skip Microsoft logs and so log only own logs-->
<logger name="Microsoft.*" level="Info" writeTo="blackhole" final="true" />
<!-- Send errors via emailnotify target -->
<logger name="*" minlevel="Error" writeTo="emailnotify" final="true" />
</rules>
However, there are times when a large number of emails can arrive in short succession, making me feel like I'm being "spammed" before I've had a chance to resolve the issue.
An example is the rare occasion when my application's third-party search engine is down. At that point every user searching the website generates an error, resulting in a potentially large number of emails.
Another example is when speculative 404s occur, either from companies gathering website technology statistics or from plain dodgy requests, e.g. /bitcoin/xxx, /shop/xxx or git/.head/xxx.
Historically, I've used the "filters" feature to ignore some of the 404s I was receiving; see the example below. However, for the bulk notifications I want to be told about all 404s and 500s. I don't think it's a good idea to filter out things that don't seem relevant, as you can start missing key information such as hacking attempts, SQL injection attacks or application sniffing:
<filters>
<when condition="contains('${message}','.php')" action="Ignore" />
<when condition="contains('${message}','wp-includes')" action="Ignore" />
<when condition="contains('${message}','wordpress')" action="Ignore" />
<when condition="contains('${message}','downloader')" action="Ignore" />
</filters>
</logger>
Requirement
I want to introduce the following logic into my application logging:
Email me when the first 500 error of a given type occurs (e.g. error performing a search on the website, error accessing the database) so I know straight away when the application first hits that error
Bulk-notify subsequent 500 errors after the initial one so I know when else the error occurred, without it continuing to "spam" me
Bulk-notify all 404 and 405 errors
These bulk emails would notify me after 50 errors or once a day, whichever comes first.
Ideally, I would like the format of the bulk email to be:
Bulk email
Subject: 500 errors (13) | 404 errors (2)
Body in a table format:
500 Errors (13)
-----------------------------------------------------
| When | Message |
-----------------------------------------------------
| 2019-11-13 11:10:29 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:10:30 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:10:31 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:10:32 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:10:33 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:10:34 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:10:35 | Website search engine down |
-----------------------------------------------------
| 2019-11-13 11:11:01 | Login failed for user foo |
-----------------------------------------------------
| 2019-11-13 11:11:01 | Login failed for user foo |
-----------------------------------------------------
| 2019-11-13 11:11:01 | Login failed for user foo |
-----------------------------------------------------
| 2019-11-13 11:11:02 | Login failed for user foo |
-----------------------------------------------------
| 2019-11-13 11:11:02 | Login failed for user foo |
-----------------------------------------------------
| 2019-11-13 11:11:02 | Login failed for user foo |
-----------------------------------------------------
404 Errors (2)
-----------------------------------------------------
| When | Message |
-----------------------------------------------------
| 2019-11-13 11:10:45 | /bitcoin not found |
-----------------------------------------------------
| 2019-11-13 11:10:57 | /shop not found |
-----------------------------------------------------
Alternatively, I'd be happy to have separate bulk emails for the 500s, 404s or 405s.
In the example bulk email above, the application would have informed me straight away, with 2 separate emails, that there was an issue with:
the website search engine being down
the user not being able to access the SQL database
Then, after hitting 50 errors or once a day, it would send out the bulk email.
Question
Is my requirement possible using just NLog configuration?
I can see how I can carry out bulk notification using NLog's buffer wrapper configuration, although I don't think it's possible to output in the table format I'd ideally like.
<target xsi:type="BufferingWrapper"
name="bulkemailnotify"
bufferSize="50"
flushTimeout="86400000‬"
slidingTimeout="False"
overflowAction="Flush">
<target xsi:type="emailnotify" />
</target>
If not, would it come down to one of the following?
Extending NLog
Removing NLog and adding a custom logger
Ideally, I would continue to use NLog and would only look to add a custom logger as a last resort.
Let's say that ${aspnet-response-statuscode} is the layout renderer that returns the HTTP status code.
Then we can have the following two targets:
An instant mail target for the first error (triggers instantly on the first event, then waits 5 minutes before triggering again)
A bulk mail target for all errors (triggers after 5 minutes with all buffered events)
Then you could probably do this:
<targets async="true">
<target xsi:type="SplitGroup" name="special-mail">
<target xsi:type="FilteringWrapper" name="filter-mail">
<filter type="whenRepeated" layout="${aspnet-response-statuscode}" timeoutSeconds="300" action="Ignore" />
<target xsi:type="Mail" name="instant-mail">
<header>Instant Error Report</header>
</target>
</target>
<target xsi:type="BufferingWrapper" name="buffer-mail" flushTimeout="300000">
<target xsi:type="Mail" name="bulk-mail">
<header>Bulk Error Report</header>
</target>
</target>
</target>
</targets>
<rules>
<logger name="Microsoft.*" maxlevel="Info" final="true" />
<logger name="*" minlevel="Error" writeTo="special-mail" final="true" />
</rules>
I'm not that skilled with the mail target layouts (header + body + footer) or with making a good-looking HTML email, but you can probably borrow something from the internet.
I've marked @Rolf Kristensen's answer as the accepted answer because it gave me a large chunk of the functionality I asked for.
However, I thought I'd share what my final config looks like, in case it helps anyone else.
Ideally, I wanted to use ${exception:format=HResult} to check for a repeating log entry in the "whenRepeated" filter, but there's a bug that will be fixed in a future release of NLog.
Edit: updated the layout of the whenRepeated <filter> to include ${exception:format=TargetSite} ${exception:format=Type}, as per Rolf Kristensen's recommendation.
<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
autoReload="true"
internalLogLevel="Warn"
internalLogFile="c:\temp\internal.txt">
<!-- We're using the Mail xsi:type so reference mailkit -->
<extensions>
<add assembly="NLog.MailKit"/>
</extensions>
<!-- Set up the variables that we want to use across the various email targets -->
<variable name="exceptiontostring" value="${exception:format=tostring}" />
<variable name="abbreviatedenvironment" value="${left:inner=${environment:ASPNETCORE_ENVIRONMENT}:length=1}" />
<variable name="websiteshortname" value="${configsetting:cached=True:name=WebsiteConfig.ShortName}" />
<!-- If you change the below value, ensure you update the nummillisecondstobufferkeyerrors variable too -->
<variable name="numsecondstobufferkeyerrors" value="10" />
<!-- This should always be set to the value of numsecondstobufferkeyerrors variable *1000 to ensure initial logs aren't missed being sent -->
<variable name="nummillisecondstobufferkeyerrors" value="10000" />
<!-- Set the timings around how often the email the non key error email gets sent. If you want an email once a day, set this to 86400000 -->
<variable name="nummillisecondstobuffernonkeyerrors" value="5000" />
<variable name="borderstyle" value="border: 1px solid #000000;" />
<!-- Define various log targets -->
<!-- Targets that write info and error logs to a file -->
<targets async="true">
<!-- Write logs to file -->
<target xsi:type="File" name="error" fileName="D:\Dev\dotnet\Core\Logs\Logging2.2\error-log-${shortdate}.log"
layout="${longdate}|${logger}|${uppercase:${level}}|${message} ${exception}" />
<target xsi:type="File" name="info" fileName="D:\Dev\dotnet\Core\Logs\Logging2.2\info-log-${shortdate}.log"
layout="${longdate}|${logger}|${uppercase:${level}}|${message}" />
</targets>
<!-- Targets that send emails -->
<targets>
<!-- Ensure we try for up to a minute to resend emails if there's a temporary issue with the email service so we still get to be notified of any logs -->
<default-wrapper xsi:type="AsyncWrapper">
<wrapper-target xsi:type="RetryingWrapper" retryDelayMilliseconds="6000" retryCount="10" />
</default-wrapper>
<!-- Set up the default parameters for each email target so we don't have to repeat them each time -->
<default-target-parameters xsi:type="Mail">
<to>${configsetting:cached=True:name=EmailConfig.SendErrorsTo}</to>
<from>${configsetting:cached=True:name=EmailConfig.FromAddress}</from>
<secureSocketOption>StartTls</secureSocketOption>
<smtpAuthentication>basic</smtpAuthentication>
<smtpServer>${environment:EmailConfigNlogSMTPServer}</smtpServer>
<smtpPort>25</smtpPort>
<smtpusername>${environment:EmailConfigNlogSMTPUsername}</smtpusername>
<smtppassword>${environment:EmailConfigNlogSMTPPassword}</smtppassword>
<html>true</html>
<addNewLines>true</addNewLines>
<replaceNewlineWithBrTagInHtml>true</replaceNewlineWithBrTagInHtml>
</default-target-parameters>
<!-- Set up a SplitGroup target that allows us to split out key errors between
1) Sending an initial email notification for each key distinct error
2) Buffer all key errors and send them later so we don't keep getting the same error multiple times straight away -->
<target xsi:type="SplitGroup" name="key-error-mail">
<target xsi:type="FilteringWrapper" name="filter-mail">
<!-- Ignore any previous logs for the same controller, action and exception
TO DO: Change ${exception:format=TargetSite} ${exception:format=Type} in the layout below to be ${exception:format=HResult} when bug is fixed -->
<filter type="whenRepeated" layout="${exception:format=TargetSite} ${exception:format=Type}" timeoutSeconds="${numsecondstobufferkeyerrors}" action="Ignore" />
<!-- If we get past the ignore filter above then we send the instant mail -->
<target xsi:type="Mail" name="key-error-instant-mail">
<subject>Error on ${websiteshortname} (${abbreviatedenvironment}) ${newline}</subject>
<layout>${longdate} ${level:uppercase=true:padding=5} - ${logger:shortName=true} - ${message} ${exceptiontostring} ${newline}</layout>
</target>
</target>
<!-- Ensure that we buffer all logs to the bulk email -->
<target xsi:type="BufferingWrapper" name="buffer-key-error-mail" flushTimeout="${nummillisecondstobufferkeyerrors}">
<!-- Send out bulk email of key logs in a table format -->
<target xsi:type="Mail" name="key-error-bulk-mail">
<subject>Bulk Key Errors on ${websiteshortname} (${abbreviatedenvironment})</subject>
<layout>
<strong>${aspnet-response-statuscode} Error ${when:when='${aspnet-request-url}'!='':inner=on ${aspnet-request-url} (${aspnet-mvc-controller} > ${aspnet-mvc-action})}</strong>
<table width="100%" border="0" cellpadding="2" cellspacing="2" style="border-collapse:collapse; ${borderstyle}">
<thead>
<tr style="background-color:#cccccc;">
<th style="${borderstyle}">Date</th>
<th style="${borderstyle}">IP Address</th>
<th style="${borderstyle}">User Agent</th>
<th style="${borderstyle}">Method</th>
</tr>
</thead>
<tbody>
<tr>
<td style="${borderstyle}">${longdate}</td>
<td style="${borderstyle}">${aspnet-request-ip}</td>
<td style="${borderstyle}">${aspnet-request-useragent}</td>
<td style="${borderstyle}">${aspnet-request-method}</td>
</tr>
<tr><td colspan="4" style="${borderstyle}">${message}</td></tr>
<tr><td colspan="4" style="${borderstyle}">${exceptiontostring}</td></tr>
</tbody></table>
${newline}
</layout>
</target>
</target>
</target>
<!-- Set up a non key error buffer wrapper that we're not interested in receiving an immediate notification for
Send a periodic email with a list of the logs encountered since the last email was sent
Send the email out in table format -->
<target xsi:type="BufferingWrapper" name="buffer-mail-non-key-error" flushTimeout="${nummillisecondstobuffernonkeyerrors}">
<target xsi:type="Mail" name="bulk-mail-non-key-error">
<subject>Bulk Errors on ${websiteshortname} (${abbreviatedenvironment})</subject>
<layout>
<strong>${aspnet-response-statuscode} Error ${when:when='${aspnet-request-url}'!='':inner=on ${aspnet-request-url} (${aspnet-mvc-controller} > ${aspnet-mvc-action})}</strong>
<table width="100%" border="0" cellpadding="2" cellspacing="2" style="border-collapse:collapse; ${borderstyle}">
<thead>
<tr style="background-color:#cccccc;">
<th style="${borderstyle}">Date</th>
<th style="${borderstyle}">IP Address</th>
<th style="${borderstyle}">User Agent</th>
<th style="${borderstyle}">Method</th>
</tr>
</thead>
<tbody>
<tr>
<td style="${borderstyle}">${longdate}</td>
<td style="${borderstyle}">${aspnet-request-ip}</td>
<td style="${borderstyle}">${aspnet-request-useragent}</td>
<td style="${borderstyle}">${aspnet-request-method}</td>
</tr>
<tr><td colspan="4" style="${borderstyle}">${message}</td></tr>
</tbody></table>
${newline}
</layout>
</target>
</target>
<!-- Set up a blackhole log catcher -->
<target xsi:type="Null" name="blackhole" />
</targets>
<rules>
<!-- Skip Microsoft logs and so log only own logs-->
<logger name="Microsoft.*" level="Info" writeTo="blackhole" final="true" />
<!-- Now we've sent Microsoft logs to the blackhole, we only log non Microsoft info to the info log file-->
<logger name="*" level="Info" writeTo="info" final="true" />
<!-- Log errors to the error email buffer and error file -->
<logger name="*" minlevel="Error" writeTo="error" />
<!-- log any key errors to the SplitGroup target -->
<logger name="*" minlevel="Error" writeTo="key-error-mail">
<filters defaultAction="Log">
<when condition="starts-with('${aspnet-response-statuscode}','4')" action="Ignore" />
</filters>
</logger>
<!-- log any non key errors to the bufferwrapper target -->
<logger name="*" minlevel="Error" writeTo="buffer-mail-non-key-error">
<filters defaultAction="Log">
<when condition="not starts-with('${aspnet-response-statuscode}','4')" action="Ignore" />
</filters>
</logger>
</rules>
</nlog>
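As a side note, the status-code filters in the rules above rely on errors being logged while the request's HttpContext is still available, so that ${aspnet-response-statuscode} has a value. A hypothetical piece of middleware like the following (not part of the original posts; the names are made up) is one way to produce such log events for 4xx responses and unhandled exceptions:

using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

// Hypothetical middleware: logs failed requests so the NLog rules can route them by status code.
public class ErrorLoggingMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<ErrorLoggingMiddleware> _logger;

    public ErrorLoggingMiddleware(RequestDelegate next, ILogger<ErrorLoggingMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task Invoke(HttpContext context)
    {
        try
        {
            await _next(context);
            if (context.Response.StatusCode >= 400)
            {
                // 404/405 responses fall through to the bulk-only rule; other failures to the key-error rule.
                _logger.LogError("Request {Path} returned {StatusCode}",
                    context.Request.Path, context.Response.StatusCode);
            }
        }
        catch (Exception ex)
        {
            // Unhandled exceptions surface as 500s and hit the instant + bulk mail targets.
            _logger.LogError(ex, "Unhandled exception for {Path}", context.Request.Path);
            throw;
        }
    }
}

It would be registered in Startup.Configure with app.UseMiddleware<ErrorLoggingMiddleware>(), before MVC.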

NLog nlog.config for asp.net core duplicate rules

I am trying to apply NLog to my ASP.NET Core application. I am following the guide from the NLog website: https://github.com/NLog/NLog.Web/wiki/Getting-started-with-ASP.NET-Core-2
Below are the rules suggested for nlog.config:
<rules>
<!--All logs, including from Microsoft-->
<logger name="*" minlevel="Trace" writeTo="allfile" />
<!--Skip non-critical Microsoft logs and so log only own logs-->
<logger name="Microsoft.*" maxLevel="Info" final="true" /> <!-- BlackHole without writeTo -->
<logger name="*" minlevel="Trace" writeTo="ownFile-web" />
</rules>
Two of the lines have the same logger name pattern:
logger name="*" minlevel="Trace"
but they write to different targets: one to the allfile target, the other to the ownFile-web target.
That does not make sense to me; it looks like a duplicate. Any comments? Thanks!
You need to read the rules from top to bottom. There are 3 rules:
If the level is Trace or above, log to allfile. This file will therefore contain all log messages (including those from external libraries).
If the level is at most Info (so Trace, Debug and Info) and the logger name starts with Microsoft., skip the message (there is no writeTo=) and stop processing it (notice the final=true).
If the level is Trace or above, log to ownFile-web. Because the Trace, Debug and Info messages from Microsoft.* were already skipped by the previous rule, this file will only contain your own logs plus warnings and errors from Microsoft, and not Microsoft's Trace, Debug and Info output.
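For completeness, the wiring that accompanies this config looks roughly like the following sketch, based on the NLog.Web getting-started approach the question links to (a Startup class is assumed; this is not copied verbatim from the guide):

using System;
using Microsoft.AspNetCore;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Logging;
using NLog.Web;

public class Program
{
    public static void Main(string[] args)
    {
        // Load nlog.config first so start-up errors are also captured.
        var logger = NLogBuilder.ConfigureNLog("nlog.config").GetCurrentClassLogger();
        try
        {
            CreateWebHostBuilder(args).Build().Run();
        }
        catch (Exception ex)
        {
            logger.Error(ex, "Stopped program because of an exception");
            throw;
        }
        finally
        {
            // Flush and stop internal timers/threads before the application exits.
            NLog.LogManager.Shutdown();
        }
    }

    public static IWebHostBuilder CreateWebHostBuilder(string[] args) =>
        WebHost.CreateDefaultBuilder(args)
            .UseStartup<Startup>()
            .ConfigureLogging(logging =>
            {
                logging.ClearProviders();
                logging.SetMinimumLevel(LogLevel.Trace); // the NLog rules decide the final filtering
            })
            .UseNLog(); // from NLog.Web.AspNetCore
}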

Why does log4net stop logging after a while on .NET WCF but not a usual website?

I've searched many sites and tried several different suggestions, but I still could not solve it. Here is what I have done so far:
In Global.asax, at Application_Startup, I give log4net the config file path and start it up (roughly as sketched below).
Right after I start log4net, I write a log that says "Application has started".
Currently, there is only 1 worker process in IIS for the WCF application.
The IIS user has WRITE, MODIFY and READ privileges.
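For context, that start-up wiring would look something like the following sketch (the ~/Config.xml path and the shape of the Global class are assumptions, not the asker's actual code):

using System;
using System.IO;
using log4net;
using log4net.Config;

public class Global : System.Web.HttpApplication
{
    private static ILog _log;

    protected void Application_Start(object sender, EventArgs e)
    {
        // Point log4net at the external config file and watch it for changes.
        XmlConfigurator.ConfigureAndWatch(new FileInfo(Server.MapPath("~/Config.xml")));

        // The logger name matches the <logger name="ProcessInfo"> element in the config below.
        _log = LogManager.GetLogger("ProcessInfo");
        _log.Info("Application has started");
    }
}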
The problem:
1. When I invoke a service method directly (without doing step 2 below), no logs are written.
2. In a browser, I enter the WCF URL and hit ENTER; log4net creates the folder and the files (the files are EMPTY at this point).
3. If I then make requests and invoke the methods (as in step 1), log4net now writes the logs.
The actual problem:
After step 3 (let's say we wait around 10 minutes or more without any invocations of the WCF methods), invoking the methods DOES NOT CREATE log4net text logs ANYMORE.
Sometimes, if I repeat step 2, it begins writing the logs again, but the results are not consistent.
Here is the Config.xml:
<?xml version="1.0" encoding="utf-8"?>
<log4net>
<appender name="ProcessInfo_FileAppender" type="log4net.Appender.RollingFileAppender">
<file type="log4net.Util.PatternString" value="L:\LOGs\ProcessInfo\ProcessInfo_[%processid].txt" />
<lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
<appendToFile value="true" />
<rollingStyle value="Composite" />
<maxSizeRollBackups value="200" />
<maximumFileSize value="30MB" />
<layout type="log4net.Layout.PatternLayout">
<conversionPattern value="%date [%thread] - %message%newline" />
</layout>
</appender>
<logger name="ProcessInfo">
<levelMin value="ERROR" />
<levelMax value="INFO" />
<appender-ref ref="ProcessInfo_FileAppender" />
</logger>
<root></root>
</log4net>
I have other WCF projects that have no problem, even with multiple worker processes (I used the exact same IIS and log4net XML configuration with them). Also, as mentioned in the title, I have a website with exactly the same logging code (both use a common third-party DLL that I wrote) that has NO PROBLEM writing log4net text logs at all.
Please help.
Thanks.
The problem is not in your logging configuration; you should enable log4net internal debugging. This will tell you why the logging stops. My guess is that some code reconfigures your logging to load its configuration from your web.config, where it is not present.
<configuration>
...
<appSettings>
<add key="log4net.Internal.Debug" value="true"/>
</appSettings>
...
<system.diagnostics>
<trace autoflush="true">
<listeners>
<add
name="textWriterTraceListener"
type="System.Diagnostics.TextWriterTraceListener"
initializeData="C:\tmp\log4net.txt" />
</listeners>
</trace>
</system.diagnostics>
</configuration>
Log4net FAQ
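As an alternative to the appSettings switch, internal debugging can also be turned on from code before log4net is configured; a small sketch (the trace listener path is just an example):

using System.Diagnostics;
using log4net.Util;

// Call this before XmlConfigurator.Configure/ConfigureAndWatch runs.
static void EnableLog4NetInternalDebugging()
{
    // Equivalent to <add key="log4net.Internal.Debug" value="true"/> in appSettings.
    LogLog.InternalDebugging = true;

    // Capture the internal messages somewhere visible, e.g. a text file.
    Trace.Listeners.Add(new TextWriterTraceListener(@"C:\tmp\log4net.txt"));
    Trace.AutoFlush = true;
}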

Understanding about FUSE Server and camel

I'm new to FUSE and Camel.
I downloaded the CBR project from https://github.com/jboss-fuse/quickstarts and was able to run it as a standalone Camel project.
cbr.xml is as follows. It picks up messages from the work/cbr/input directory and drops them into other directories based on their content. I could run this with mvn camel:run:
<route id="cbr-route">
<from uri="file:work/cbr/input" />
<log message="Receiving order ${file:name}" />
<choice>
<when>
<xpath>/order:order/order:customer/order:country = 'UK'</xpath>
<log message="Sending order ${file:name} to the UK" />
<to uri="file:work/cbr/output/uk" />
</when>
<when>
<xpath>/order:order/order:customer/order:country = 'US'</xpath>
<log message="Sending order ${file:name} to the US" />
<to uri="file:work/cbr/output/us" />
</when>
<otherwise>
<log message="Sending order ${file:name} to another country" />
<to uri="file:work/cbr/output/others" />
</otherwise>
</choice>
<log message="Done processing ${file:name}" />
</route>
</camelContext>
But the README says to start the FUSE SERVER.
I'm trying to understand why I need the FUSE container at all if I am able to run this standalone.
There is a project requirement that web service calls from the client go through FUSE to make them asynchronous.
I'm assuming I would not need the FUSE container in this case.
Thanks for taking the time to read this.
FUSE is a container, just like any other container (say, Tomcat), that you deploy your code to.
If you have a lot of integration scenarios, then you need a dedicated server to hold and run all your deployables. At least, that is what I would suggest.
But if there are only a few such scenarios, you can use the power of Camel as a framework: use it as a supporting JAR for your Java code and run it as standalone code.
This means you are responsible for handling the volume of requests and for any availability or scalability issues for that piece of integration code.

Infinispan Initial State Transfer Hangs and times out

I'm trying to cluster a pair of servers with a shared Infinispan cache (replicated asynchronously). One always starts successfully and registers itself properly with the JDBC database. When the other starts, it registers properly with the database and I see a bunch of chatter between them; then, while waiting on a response from the second server, I get
`org.infinispan.commons.CacheException: Initial state transfer timed out`
I think it's just a configuration issue, but I'm not sure how to debug it. I've spent several days configuring and re-configuring my Infinispan XML and my jgroups.xml:
Infinispan:
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:infinispan:config:6.0"
xsi:schemaLocation="urn:infinispan:config:6.0 http://www.infinispan.org/schemas/infinispan-config-6.0.xsd
urn:infinispan:config:remote:6.0 http://www.infinispan.org/schemas/infinispan-cachestore-remote-config-6.0.xsd"
xmlns:remote="urn:infinispan:config:remote:6.0"
>
<!-- *************************** -->
<!-- System-wide global settings -->
<!-- *************************** -->
<global>
<shutdown hookBehavior="DEFAULT"/>
<transport clusterName="DSLObjectCache">
<properties>
<property name="configurationFile" value="jgroups.xml"/>
</properties>
</transport>
<globalJmxStatistics enabled="false" cacheManagerName="Complex.com"/>
</global>
<namedCache name="ObjectCache">
<transaction transactionMode="TRANSACTIONAL" />
<locking
useLockStriping="false"
/>
<invocationBatching enabled="true"/>
<clustering mode="replication">
<async asyncMarshalling="true" useReplQueue="true" replQueueInterval="100" replQueueMaxElements="100"/>
<stateTransfer fetchInMemoryState="true" />
</clustering>
<eviction strategy="LIRS" maxEntries="500000"/>
<expiration lifespan="86400000" wakeUpInterval="1000" />
</namedCache>
<default>
<!-- Configure a synchronous replication cache -->
<locking
useLockStriping="false"
/>
<clustering mode="replication">
<async asyncMarshalling="true" useReplQueue="true" replQueueInterval="100" replQueueMaxElements="100"/>
<stateTransfer fetchInMemoryState="true" />
</clustering>
<eviction strategy="LIRS" maxEntries="500000"/>
<expiration lifespan="86400000" wakeUpInterval="1000" />
<persistence>
<cluster remoteCallTimeout="60000" />
</persistence>
</default>
</infinispan>
jgroups.xml:
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.0.xsd">
<!-- Default the external_addr to #DEADBEEF so we can see errors coming through
on the backend -->
<TCP
external_addr="${injected.external.address:222.173.190.239}"
receive_on_all_interfaces="true"
bind_addr="0.0.0.0"
bind_port="${injected.bind.port:12345}"
conn_expire_time="0"
reaper_interval="0"
sock_conn_timeout="20000"
tcp_nodelay="true"
/>
<JDBC_PING
datasource_jndi_name="java:jboss/datasources/dsl/control"
/>
<MERGE2 max_interval="30000" min_interval="10000"/>
<FD_SOCK
external_addr="${injected.external.address:222.173.190.239}"
bind_addr="0.0.0.0"
/>
<FD timeout="10000" max_tries="5"/>
<VERIFY_SUSPECT timeout="1500"
bind_addr="0.0.0.0"
/>
<pbcast.NAKACK use_mcast_xmit="false"
retransmit_timeouts="300,600,1200,2400,4800"
discard_delivered_msgs="true"/>
<UNICAST3 ack_batches_immediately="true"
/>
<RSVP ack_on_delivery="true"
throw_exception_on_timeout="true"
timeout="1000"
/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="5000"
view_bundling="true" view_ack_collection_timeout="5000"/>
<FRAG2 frag_size="60000"/>
<pbcast.STATE_SOCK
bind_port="54321"
external_addr="${injected.external.address:222.173.190.239}"
bind_addr="0.0.0.0"
/>
<pbcast.FLUSH timeout="1000"/>
</config>
I've tried, frankly, every configuration option I can think of, and I'm not sure why the replication keeps timing out. All communication between these servers is wide open. Sorry to just dump so much XML, but I'm not even sure how to collect more information.
Continued exploration showed that Infinispan was writing logs to server.log, but, due to my configuration, these were not duplicated on the console. Further inspection revealed that I had left a single element in my cache objects unserializable, making it impossible for them to be written to the wire and transferred. The logs are very specific, which made this a very easy problem to track down once I realized where they were being written.
If you come here from the future, my advice is to tail every single log you can on the working server and see what comes up.