Infinispan Initial State Transfer Hangs and times out - jboss7.x

I'm trying to cluster a pair of servers with a shared Infinispan cache (Replicated Asynchronously). One always starts successfully, and registers itself properly with the JDBC database. When the other starts, it registers properly with the database, and I see a bunch of chatter between them, then, while waiting on a response from the second server, I get
`org.infinispan.commons.CacheException: Initial statue transfer timed out`
I think it's just an issue of configuration, but I'm not sure how to debug my configuration issues. I've spent several days configuring and re-configuring my Infinispan XML, and my JGroups.xml:
Infinispan:
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:infinispan:config:6.0"
xsi:schemaLocation="urn:infinispan:config:6.0 http://www.infinispan.org/schemas/infinispan-config-6.0.xsd
urn:infinispan:config:remote:6.0 http://www.infinispan.org/schemas/infinispan-cachestore-remote-config-6.0.xsd"
xmlns:remote="urn:infinispan:config:remote:6.0"
>
<!-- *************************** -->
<!-- System-wide global settings -->
<!-- *************************** -->
<global>
<shutdown hookBehavior="DEFAULT"/>
<transport clusterName="DSLObjectCache">
<properties>
<property name="configurationFile" value="jgroups.xml"/>
</properties>
</transport>
<globalJmxStatistics enabled="false" cacheManagerName="Complex.com"/>
</global>
<namedCache name="ObjectCache">
<transaction transactionMode="TRANSACTIONAL" />
<locking
useLockStriping="false"
/>
<invocationBatching enabled="true"/>
<clustering mode="replication">
<async asyncMarshalling="true" useReplQueue="true" replQueueInterval="100" replQueueMaxElements="100"/>
<stateTransfer fetchInMemoryState="true" />
</clustering>
<eviction strategy="LIRS" maxEntries="500000"/>
<expiration lifespan="86400000" wakeUpInterval="1000" />
</namedCache>
<default>
<!-- Configure a synchronous replication cache -->
<locking
useLockStriping="false"
/>
<clustering mode="replication">
<async asyncMarshalling="true" useReplQueue="true" replQueueInterval="100" replQueueMaxElements="100"/>
<stateTransfer fetchInMemoryState="true" />
</clustering>
<eviction strategy="LIRS" maxEntries="500000"/>
<expiration lifespan="86400000" wakeUpInterval="1000" />
<persistence>
<cluster remoteCallTimeout="60000" />
</persistence>
</default>
</infinispan>
Jboss.xml:
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.0.xsd">
<!-- Default the external_addr to #DEADBEEF so we can see errors coming through
on the backend -->
<TCP
external_addr="${injected.external.address:222.173.190.239}"
receive_on_all_interfaces="true"
bind_addr="0.0.0.0"
bind_port="${injected.bind.port:12345}"
conn_expire_time="0"
reaper_interval="0"
sock_conn_timeout="20000"
tcp_nodelay="true"
/>
<JDBC_PING
datasource_jndi_name="java:jboss/datasources/dsl/control"
/>
<MERGE2 max_interval="30000" min_interval="10000"/>
<FD_SOCK
external_addr="${injected.external.address:222.173.190.239}"
bind_addr="0.0.0.0"
/>
<FD timeout="10000" max_tries="5"/>
<VERIFY_SUSPECT timeout="1500"
bind_addr="0.0.0.0"
/>
<pbcast.NAKACK use_mcast_xmit="false"
retransmit_timeouts="300,600,1200,2400,4800"
discard_delivered_msgs="true"/>
<UNICAST3 ack_batches_immediately="true"
/>
<RSVP ack_on_delivery="true"
throw_exception_on_timeout="true"
timeout="1000"
/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="5000"
view_bundling="true" view_ack_collection_timeout="5000"/>
<FRAG2 frag_size="60000"/>
<pbcast.STATE_SOCK
bind_port="54321"
external_addr="${injected.external.address:222.173.190.239}"
bind_addr="0.0.0.0"
/>
<pbcast.FLUSH timeout="1000"/>
</config>
I've tried, frankly, every configuration option I can think of, and I'm not sure why the replication keeps timing out. All communication between these servers is wide open. Sorry to just dump so much XML, but I'm not even sure how to collect more information.

Continued exploration indicated that Infinispan was pushing logs to the server.log, but - due to my configuration, this was not duplicated on the console. Further inspection revealed that I left a single element in my cache objects unserializable - making it impossible for it to be written to the wire and transferred. The logs are very specific, making this actually a very easy problem to track down once I realized where the logs were being written.
If you come here from the future, my advice is to just tail every single log you can on the working server, and see what comes up.

Related

Orbeon Forms PE (Ehcache) replication without multicasting

We want to set up Orbeon Forms PE replication but we cannot use multicasting as proposed in the docs.
We have two nodes - 172.13.238.241 and 172.13.238.242 and the problem seems to be with the EhCache part. I open the form, load balancer (haproxy) directs me to a node, I turn of the second node and then for several minutes all the requests in the brower fail. Eventually the request start to work again but they are very slow.
This is what I have in node 1 (another node has same config with IP-s replaced with each other and it has a different uniqueId value):
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
channelSendOptions="asynchronous"
channelStartOptions="3">
<Manager className="org.apache.catalina.ha.session.DeltaManager"
expireSessionsOnShutdown="false"
notifyListenersOnReplication="true"/>
<Channel className="org.apache.catalina.tribes.group.GroupChannel">
<Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
<Member className="org.apache.catalina.tribes.membership.StaticMember"
port="4100"
host="172.13.238.242"
uniqueId="{0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2}" />
</Interceptor>
<Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
address="172.13.238.241"
port="4100"
autoBind="0"
maxThreads="6"
selectorTimeout="5000" />
<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
</Sender>
<Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
<Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
<Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
</Channel>
<Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=".*\.gif|.*\.js|.*\.jpeg|.*\.jpg|.*\.png|.*\.htm|.*\.html|.*\.css|.*\.txt"/>
<ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>
I wonder if anyone could spot some mistakes in my ehcache configuration.

Making Icecast SSL

So I have just finished setting up Icecast on a Centos 7 VPS and everything is working perfectly fine, but i was needing my stream to be SSL...
However i'm not entirely sure how to do it, I looked at Icecast's website and found this page which says it can be done.
http://icecast.org/docs/icecast-2.4.1/config-file.html
However I ain't entirely sure where i'm putting the part as I pretty much just followed a tutorial online how to set it up so i'm not very familiar how it works, i do have a SSL certificate all set up and what not working with my site I just wanted the stream to be SSL too.
Any help would be great thanks!
<!-- LIMITS -->
<limits>
<clients>100</clients>
<sources>10</sources>
<threadpool>5</threadpool>
<queue-size>524288</queue-size>
<client-timeout>30</client-timeout>
<header-timeout>15</header-timeout>
<source-timeout>10</source-timeout>
<burst-on-connect>1</burst-on-connect>
<burst-size>65535</burst-size>
</limits>
<!-- GENRIC -->
<authentication>
<source-password>password</source-password>
<admin-user>admin</admin-user>
<admin-password>password</admin-password>
</authentication>
<hostname>MyHost/IP</hostname>
<listen-socket>
<port>8000</port>
</listen-socket>
<fileserve>1</fileserve>
<!-- PATHES -->
<paths>
<basedir>/opt/icecast/latest/share/icecast</basedir>
<webroot>/opt/icecast/latest/share/icecast/web</webroot>
<adminroot>/opt/icecast/latest/share/icecast/admin</adminroot>
<logdir>/var/log/icecast</logdir>
<pidfile>/var/run/icecast/icecast.pid</pidfile>
<alias source="/" dest="/status.xsl"/>
</paths>
<!-- LOG -->
<logging>
<accesslog>access.log</accesslog>
<errorlog>error.log</errorlog>
<playlistlog>playlist.log</playlistlog>
<loglevel>1</loglevel>
<logsize>10000</logsize>
<logarchive>1</logarchive>
</logging>
<!-- SECURITY -->
<security>
<chroot>0</chroot>
<changeowner>
<user>icecast</user>
<group>icecast</group>
</changeowner>
</security>
You have nothing referring to SSL.
Try replacing this
<!-- GENRIC -->
<authentication>
<source-password>password</source-password>
<admin-user>admin</admin-user>
<admin-password>password</admin-password>
</authentication>
<hostname>MyHost/IP</hostname>
<listen-socket>
<port>8000</port>
</listen-socket>
<fileserve>1</fileserve>
With this
<!-- GENRIC -->
<authentication>
<source-password>password</source-password>
<admin-user>admin</admin-user>
<admin-password>password</admin-password>
</authentication>
<listen-socket>
<port>8000</port>
<bind-address>127.0.0.1</bind-address>
</listen-socket>
<listen-socket>
<port>8443</port>
<ssl>1</ssl>
</listen-socket>
<fileserve>1</fileserve>

Why does log4net stop logging after a while on .NET WCF but not a usual website?

I've search many sites and tried several different opinions. But I still could not solve it. Here are the things I did right now:
In Global.asax at Application_Startup I give the file path and startup the Log4Net.
Right after I start log4net, I write a log that says "Application has
stared'"
Currently, there is only 1 worker in IIS for the WCF application The
IIS user has access to WRITE, MODIFY and READ privileges
The problem:
When I invoke a method of service directly (without doing the 2.
step below), No Logs is written
On a browser, I write te WCF url and hit ENTER, Log4Net creates the
folder and the files (files are EMPTY at this point).
If I make requests and invoke the methods (doing the 1st step), now
Log4Net writes the logs.
The actual problem:
After the 3rd step, (lets say we waited without any invokes of the
WCF methods around 10 minutes or more), the invoking DOES NOT CREATE
Log4Net Text logs ANYMORE.
Sometimes, if I repeat the 2nd step, it begins writing the logs
again. But there is no coherent results.
Here is the Config.xml:
<?xml version="1.0" encoding="utf-8"?>
<log4net>
<appender name="ProcessInfo_FileAppender" type="log4net.Appender.RollingFileAppender">
<file type="log4net.Util.PatternString" value="L:\LOGs\ProcessInfo\ProcessInfo_[%processid].txt" />
<lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
<appendToFile value="true" />
<rollingStyle value="Composite" />
<maxSizeRollBackups value="200" />
<maximumFileSize value="30MB" />
<layout type="log4net.Layout.PatternLayout">
<conversionPattern value="%date [%thread] - %message%newline" />
</layout>
</appender>
<logger name="ProcessInfo">
<levelMin value="ERROR" />
<levelMax value="INFO" />
<appender-ref ref="ProcessInfo_FileAppender" />
</logger>
<root></root>
</log4net>
I have other WCF projects which have no problem even with multiple Workers. (I used the exact same IIS and Log4Net xml configuration with them). Also, as I mentioned on the title, I have a WebSite who has exact same logging codes (they both using a common 3rd party dll which I wrote) and has NO PROBLEM of writing Log4Net text logging at all.
Please help.
Thanks.
The problem is not in your logging configuration, you should try to enable log4net internal debugging. This will tell you why the logging stops. I guess there is some code that reconfigures your logging to load configuration from your web.config which is not there.
<configuration>
...
<appSettings>
<add key="log4net.Internal.Debug" value="true"/>
</appSettings>
...
<system.diagnostics>
<trace autoflush="true">
<listeners>
<add
name="textWriterTraceListener"
type="System.Diagnostics.TextWriterTraceListener"
initializeData="C:\tmp\log4net.txt" />
</listeners>
</trace>
</system.diagnostics>
</configuration>
Log4net FAQ

Infinispan distributed cluster with shared index

Does anybody have a working example of how to configure a cluster of nodes to share an index using the infinispan directory provider? All the documentation on Infinispan (the documentation is seriously lacking btw) implies that it should be as easy as having some properties set but no matter how I try I cannot get it to work. The nodes in the cluster find eachother fine and I can do get operations on one node and get object that were put on another. But as soon as I do queries (use the index) it just starts to fail.
My infinispan config:
<global>
<transport clusterName="SomeCluster">
<properties>
<property name="configurationFile" value="jgroups-udp.xml" />
</properties>
</transport>
</global>
<namedCache name="access">
<clustering mode="distribution" />
<indexing enabled="true" indexLocalOnly="true">
<properties>
<property name="default.directory_provider" value="infinispan"/>
<property name="default.worker.backend" value="jgroups"/>
</properties>
</indexing>
</namedCache>
I have not found one example/tutorial which covers a distributed cache with a shared index, and I consider my google-fu to be great. I have asked on the infinispan community forum but havent gotten any replies there.
The errors I get are all related to the fact that only one node can be able to write to the index (the master node) but the config above, which according some the documentation on Hibernet Search should make one node a master node, does nothing as far as I can se.
Edit:Im using Infinispan 6.0.2.Final
Rather than JGroups backend I'd use InfinispanIndexManager - this manager already provides its own backend.
<indexing enabled="true" indexLocalOnly="true">
<properties>
<property name="default.indexmanager" value="org.infinispan.query.indexmanager.InfinispanIndexManager" />
<property name="default.exclusive_index_use" value="false" />
<property name="default.metadata_cachename" value="lucene_metadata_repl" />
<property name="default.data_cachename" value="lucene_data_dist" />
<property name="default.locking_cachename" value="lucene_locking_repl" />
<property name="lucene_version" value="LUCENE_36" />
</properties>
</indexing>
Now, configure all the caches to be clustered (distributed or replicated). Without specifying the cache configuration this way, the three caches are created using the default cache configuration - which is by default non-clustered.
I am not sure about the exclusive_index_use, though, maybe it's not necessary.
I agree that Infinispan documentation could be much better, usually I have to fallback to investigating source code. For examples of indexing configuration, you can look into the infinispan-query module/src/test/resources.

What to use for logging errors in a Quartz scheduled Job?

I have an asp.net mvc 3 application. In this application I have a reminder system that uses quartz to grab messages from the database and send them out.
I am wondering what is the best way to log if something happens(say the database times out - I want to know about this).
I use in my mvc application ELMAH for my logging and it works great. However since quartz.net is it's own thread with out the httpcontext I can't use ELMAH(or at least I don't think I can). I tried to make an httpcontext by first querying my home page then through the scheduler code and using that as the context but that did not work.
System.ArgumentNullException was unhandled by user code
Message=Value cannot be null.
Parameter name: application
Source=Elmah
ParamName=application
StackTrace:
at Elmah.ErrorSignal.Get(HttpApplication application)
at Elmah.ErrorSignal.FromContext(HttpContext context)
at Job.Execute(JobExecutionContext context) in Job.cs:line 19
at Quartz.Core.JobRunShell.Run()
InnerException:
ErrorSignal.FromCurrentContext().Raise(new System.Exception());
So I am looking for either how to get ELMAH working in this scenario or something comparable to it(something that sends emails, stack trace and everything nice ELMAH has).
You can try NLog. It's very simple to implement and effective.
You can send email, trace pretty much everything.
I normally tend to keep everything in a separate config file (NLog.Config)
<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
<targets>
<target name="DebugHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${identity} ${aspnet-request} ${message} ${exception}" />
<target name="ErrorHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${aspnet-request} ${message} ${exception}" />
<target name="FatalHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${aspnet-request} ${message} ${exception}" />
<target name="GenericHandler" type="File" filename="${basedir}/_Logs/${date:format=yyyyMMdd}_${level}.Log"
layout="${longdate} ${logger} ${aspnet-session:variable=UserName} ${threadid} ${environment} ${aspnet-request} ${message} ${exception}" />
</targets>
<rules>
<logger name="*" level="Debug" appendTo="DebugHandler" />
<logger name="*" level="Error" appendTo="ErrorHandler" />
<logger name="*" level="Fatal" appendTo="FatalHandler" />
<logger name="*" levels="Info,Warn" appendTo="GenericHandler" />
</rules>
</nlog>
As you can see you can activate or deactivate the different levels.
I've used it on in my quartz.net jobs as well as a debugging/tracing system and it does its job.
The only limit is you do not have an ELMAH interface, which can be a huge limit sometimes.