I am trying to set up manual replication for Ehcache across two different servers. Below is my configuration:
<cache name="codeTaskCache" maxElementsInMemory="1000" eternal="false"
       timeToIdleSeconds="0" timeToLiveSeconds="0" overflowToDisk="false" />

<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
    properties="peerDiscovery=manual,
                rmiUrls=//server1:40001/codeTaskCache|//server2:40001/codeTaskCache"
    propertySeparator="," />
The same configuration is present on both Tomcat servers, which run on two different Unix boxes.
Replication works fine from server1 -> server2 but NOT from server2 -> server1, which is pretty strange.
The documentation states: 'The rmiUrls is a list of the cache peers of the server being configured. Do not include the server being configured in the list.'
But if it works one way then why not the other way? Could someone please throw some light on this? Thanks in advance.
The RMI URLs should not point to the instance itself, just to the other Ehcache instances; at least that's what worked for me.
Also make sure the hostnames are resolvable from both servers, and as a last resort you can even try setting hostname=name_of_server_dot_domain_dot_com in the properties of the cacheManagerPeerListenerFactory element.
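For reference, here is a minimal sketch of the listener and per-cache replicator configuration that RMI replication needs on each node; the hostname, port, and timeout values are placeholders, not taken from the question:

<!-- listener: on server1, bind to server1's own name that server2 can resolve -->
<cacheManagerPeerListenerFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
    properties="hostname=server1.example.com,port=40001,socketTimeoutMillis=2000" />

<!-- each replicated cache also needs an RMI replicator listener -->
<cache name="codeTaskCache" maxElementsInMemory="1000" eternal="false"
       timeToIdleSeconds="0" timeToLiveSeconds="0" overflowToDisk="false">
    <cacheEventListenerFactory
        class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
        properties="replicateAsynchronously=true" />
</cache>

If the listener on server1 binds to an address that server2 cannot reach (for example, localhost), replication fails in exactly one direction, which would match the symptom here.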
Just trying out Splunk, and I've had an issue with integrating a search head cluster with an indexer cluster.
I have 3 machines in a search head cluster and 3 machines in an indexer cluster. These are all on CentOS 7 with no firewall installed, and all machines are able to ping / view each other's Splunk instances (ip:8000 / ip:8089).
When following https://docs.splunk.com/Documentation/Splunk/6.6.2/DistSearch/SHCandindexercluster specifically
splunk edit cluster-config -mode searchhead -master_uri https://10.152.31.202:8089 -secret newsecret123
I get an error of
Could not contact master. Check that the master is up, the master_uri=https://10.152.31.202:8089 and secret are specified correctly
I have set the pass4SymmKey to be the same on all servers.
thanks
Please check the shclustering pass4SymmKey in both the search head cluster and the master.
I suspect a pass4SymmKey issue.
You should check splunkd.log to see what the error is. I would recommend not setting the pass4SymmKey at first and verifying that it works; if it doesn't, you've found your issue.
Also, you did not mention having an extra server acting as the cluster master. This should be a server independent from your indexers. You have one, right?
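As a quick sketch of what to verify (the stanza and log path below are the standard ones; the secret value is a placeholder): the key that must match between the search heads and the indexer cluster master lives in server.conf under [clustering], while [shclustering] only governs the key used inside the search head cluster itself.

# server.conf on the search heads and on the cluster master
[clustering]
pass4SymmKey = newsecret123

# then watch the handshake from a search head
tail -f $SPLUNK_HOME/var/log/splunk/splunkd.log | grep -i clustering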
Recently we decided to add a cache layer to our Mule APIs, and Redis came into scope.
We are on Mule 3.8.0 and Redis connector 4.0.0, and we met the following issues while configuring:
How do we separate our keys by Redis DB? This is not mentioned in the documentation. The only thing that seems close is the 'Default Partition Name' in the configuration, but whatever value we put there seems to have no effect: it is always db0 that contains all the keys, so we can't really have "dev", "qa" and "test" key sets in the same Redis cluster.
The Redis connector documentation has an example like the one below:
<redis:sorted-set-select-range-by-index config-ref="Redis_configuration" key="my_key" start="0" end="-1" />
However, when we tried the same thing, it complained that the 'end' value should be >= 0, so the operation is not usable as documented.
How do we configure a connection pool properly with the Redis connector configuration? This is not mentioned in the documentation either. The only attribute is 'Pool Config Reference'; I tried putting a Spring bean reference to my own JedisPoolConfig there, but it seems to have no effect, and the number of connections remains the same no matter what values I put in that bean.
Thanks in advance if someone could help with these issues.
James
How to separate our keys by Redis DB ?
You can use Redis in cluster mode with data sharding (http://redis.io/topics/cluster-tutorial).
I don't think you need special configuration in Mule.
I think you are mixing up the term 'partition' in Mule with the term 'partition' in Redis.
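Note that Redis Cluster only supports database 0, which would explain why everything ends up in db0 no matter what you configure. A common workaround (my suggestion, not something the connector documents) is to namespace keys with an environment prefix, e.g. from redis-cli:

redis-cli -c -h redis-host SET dev:codeTask:42 "value"   # 'dev:' prefix keeps environments apart
redis-cli -c -h redis-host SET qa:codeTask:42 "value"    # same cluster, separate key namespace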
Regards,
So today we ran into a disturbing Solr issue.
After a restart of the whole cluster, one of the shards stopped being able to index/store documents.
We had no hint about the issue until we started indexing (querying the server looked fine).
The error is:
2014-05-19 18:36:20,707 ERROR o.a.s.u.p.DistributedUpdateProcessor [qtp406017988-19] ClusterState says we are the leader, but locally we don't think so
2014-05-19 18:36:20,709 ERROR o.a.s.c.SolrException [qtp406017988-19] org.apache.solr.common.SolrException: ClusterState says we are the leader (http://x.x.x.x:7070/solr/shard3_replica1), but locally we don't think so. Request came from null
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:503)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:267)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:550)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:126)
at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:101)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:65)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
We run Solr 4.7 in cluster mode (5 shards) on Jetty.
Each shard runs on a different host with one ZooKeeper server.
I checked the ZooKeeper log and I cannot see anything there.
The only difference is that in the /overseer_elect/election folder I see this specific server repeated 3 times, while the other servers are only mentioned twice.
45654861x41276x432-x.x.x.x:7070_solr-n_00000003xx
74030267x31685x368-x.x.x.x:7070_solr-n_00000003xx
74030267x31685x369-x.x.x.x:7070_solr-n_00000003xx
Not even sure if this is relevant. (Can it be?)
Any clue what other checks we can do?
We've experienced this error under 2 conditions.
Condition 1
On a single ZooKeeper host there was an orphaned ZooKeeper ephemeral node in
/overseer_elect/election. The session this ephemeral node was associated with no longer existed, and the orphaned ephemeral node could not be deleted.
Caused by: https://issues.apache.org/jira/browse/ZOOKEEPER-2355
This condition will also be accompanied by a /overseer/queue directory that is clogged-up with queue items that are forever waiting to be processed.
To resolve the issue, you must restart the ZooKeeper node in question, the one holding the orphaned ephemeral node.
If after the restart you see 'Still seeing conflicting information about the leader of shard shard1 for collection <name> after 30 seconds', you will need to restart the Solr hosts as well to resolve the problem.
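A sketch of how to inspect the suspect znodes with the stock ZooKeeper CLI (the host and port are placeholders):

# connect to the ensemble
./zkCli.sh -server zkhost:2181

# list the election nodes; the entry whose session no longer exists is the orphan
ls /overseer_elect/election

# a queue that only ever grows indicates the clogged overseer
ls /overseer/queue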
Condition 2
Cause: a mis-configured systemd service unit.
Make sure you have Type=forking and have PIDFile configured correctly if you are using systemd.
systemd was not tracking the PID correctly: it thought the service was dead, but it wasn't, and at some point two services were started. Because the second service cannot actually start (both can't listen on the same port), it just sits there hanging in a failed state, or fails to start the process but messes up the other Solr processes somehow, possibly by overwriting temporary clusterstate files locally.
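A minimal sketch of the relevant unit settings, assuming Solr is launched through its forking start script; the paths below are assumptions and will differ per install:

[Service]
# bin/solr start forks into the background, so systemd must wait for the parent to exit
Type=forking
# point systemd at the PID file the start script writes so it tracks the right process
PIDFile=/var/solr/solr-8983.pid
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop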
Solr logs reported the same error the OP posted.
Interestingly enough, another symptom was that ZooKeeper listed no leader for our collection in /collections/<name>/leaders/shard1/leader. Normally this zk node contains contents such as:
{"core":"collection-name_shard1_replica1",
"core_node_name":"core_node7",
"base_url":"http://10.10.10.21:8983/solr",
"node_name":"10.10.10.21:8983_solr"}
But the node was completely missing on the cluster where duplicate Solr instances were attempting to start.
This error also appeared in the Solr Logs:
HttpSolrCall null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /roles.json
To correct the issue, kill all instances of Solr (or of java, if you know it's safe) and restart the Solr service.
We figured it out!
The issue was that Jetty didn't really stop, so we had two running processes; for whatever reason this was fine for reading but not for writing.
Killing the older java process solved the issue.
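For anyone hitting the same thing, a sketch of the cleanup (assuming a Solr 4.x instance launched via Jetty's start.jar; the PID is a placeholder):

ps -ef | grep start.jar   # two Jetty processes on the same port means the old one never stopped
kill 12345                # send SIGTERM to the older PID; escalate to kill -9 only if it is ignored
                          # then restart Solr however it is managed (init script, systemd, or manually)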
There are scripts that build the admin server and then create clusters, managed servers, machines, etc. When the domain is built, an additional phantom server, osb_server1 with port 8011, gets created that isn't attached to any cluster or any machine.
It is built when wlsb.jar is referenced during one of the scripts.
Once the admin server is up and running and the other managed servers exist as well, I was trying to remove osb_server1, and this error creeps up:
weblogic.management.configuration.AppDeploymentMBeanImpl.isCacheInAppDirectorySet()
Errors must be corrected before proceeding
There are about 120 default deployments on OSB that are targeted to osb_server1. I was trying to retarget them to another server, but that is also throwing an error...
Any ideas?
That's due to a weird behaviour/bug of the standard OSB template. There is a discussion here: http://theheat.dk/blog/?p=1255.
I didn't follow the steps given by Oracle (as in the URL). What I did was:
I keep the default osb_server1 and make it part of the cluster during domain creation (i.e., it's the first server). Once the domain is created, I reset osb_server1 to the desired values. That way the singleton services are still deployed to the first server and the others to the cluster. Using WLST:
readDomain(domain_name)
# rename the default OSB server and change its listen port
cd('/Servers/osb_server1')
set('ListenPort', osb1_listen_port)
set('Name', osb1_name)
# the nested ServerDiagnosticConfig still carries the old name; rename it too
cd('/Servers/' + osb1_name + '/ServerDiagnosticConfig/osb_server1')
set('Name', osb1_name)
# save the changes and close the domain
updateDomain()
closeDomain()
I have configured a cluster with two instances on GlassFish 3.1.1 and iPlanet Web Server as a load balancer (on the same machine). For the test application provided with GlassFish everything works fine (and this application has session replication enabled).
But when I try to make my own application work, the following situation takes place: it responds when I send requests to the ports of the particular instances (that is, 28080 and 28081), but when I send a request through the load balancer (port 81) I get error 404. My application does not have session replication enabled yet, but it can at least make a connection and create separate sessions on each instance. I would like to get a similar effect through the load balancer.
So I would like to determine:
Is session replication strictly required for the load balancer to work fine?
Does anyone know any other reasons for this error?
Message from iPlanet log:
[23/Aug/2012:05:44:16] failure ( 4120) myHost: for host 127.0.0.1 trying to GET /myApp/login.jsp, service-j2ee reports: PWC6117: File "c:/webserver7/https-myHost/docs/myApp/login.jsp" not found
Additional conclusions:
(81 - http-listener port on iPlanet)
When I send GET http://localhost:81/testApp, the load balancer passes it to GlassFish and returns the correct site. But when I try the same with my own application, GET http://localhost:81/myApp, iPlanet looks for the site in its own resources (the docs directory, as in the log above).
fragment of myHost-obj.conf:
<Object name="default">
AuthTrans fn="match-browser" browser="*MSIE*" ssl-unclean-shutdown="true"
NameTrans fn="name-trans-passthrough" name="lbplugin" config-file="C:/WebServer7/https-myHost/config/loadbalancer.xml"
NameTrans fn="assign-name" name="perf" from="/.perf"
NameTrans fn="ntrans-j2ee" name="j2ee"
NameTrans fn="pfx2dir" from="/mc-icons" dir="C:/WebServer7/lib/icons" name="es-internal"
PathCheck fn="uri-clean"
PathCheck fn="check-acl" acl="default"
PathCheck fn="find-pathinfo"
PathCheck fn="find-index-j2ee"
PathCheck fn="find-index" index-names="index.html,home.html,index.jsp"
ObjectType fn="type-j2ee"
ObjectType fn="type-by-extension"
ObjectType fn="force-type" type="text/plain"
Service method="(GET|HEAD)" type="magnus-internal/directory" fn="index-common"
Service method="(GET|HEAD|POST)" type="*~magnus-internal/*" fn="send-file"
Service method="TRACE" fn="service-trace"
Error fn="error-j2ee"
AddLog fn="flex-log"
</Object>
First, if you are running the Load Balancer plugin, then you may have a support contract (a GlassFish license is required before you put the plugin into production). If so, calling support is a good option.
To answer your first question, session replication is not required for the Load Balancer to work.
As a shameless plug, I have a 5-part YouTube series on setting this up. You can skip the videos on downloading and installing and go straight to setup/configuration/testing. Based on what you describe, I suspect the issue isn't the plugin itself but the loadbalancer.xml configuration. Look at loadbalancer.xml and see if myApp is configured, as in the sketch below.
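For illustration, a minimal loadbalancer.xml of the shape the plugin expects; the cluster and instance names, hosts, and ports are placeholders that must match your actual setup:

<loadbalancer>
  <cluster name="cluster1">
    <instance name="instance1" enabled="true" listeners="http://localhost:28080"/>
    <instance name="instance2" enabled="true" listeners="http://localhost:28081"/>
    <!-- without a web-module entry for myApp the plugin never claims /myApp,
         and iPlanet serves it from its own docs directory instead -->
    <web-module context-root="myApp" enabled="true"
                disable-timeout-in-minutes="30" error-url=""/>
    <health-checker url="/" interval-in-seconds="10" timeout-in-seconds="30"/>
  </cluster>
</loadbalancer>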
Hope this helps.