Set up distributed index using Hibernate Search and Lucene - lucene

Our application is using Hibernate Search for indexing some of its data. The application is running on two JBoss EAP 6.2 application servers for load distribution and failover. We need changes made on one machine to be immediately visible on the other. The index is a central part of the application and needs to be consistent with the database data. Completely rebuilding it takes a long time so it is important that it remains intact even in the case of a server crash. Also, the index is expected to grow too large to keep all of it in memory.
Our current solution is to use the standard filesystem directory with a shared filesystem (NFS) and the JGroups backend to ensure that only one server writes to a given index at any time. This works more or less, but sometimes we have problems with index updates taking very long (up to 20 seconds) or failing completely. Due to some other reasons we need to migrate away from the currently used file system, so we are evaluating alternatives for the current setup.
One thing we tried is the Infinispan directory with a file cache store for persistence, but we had some problems there regarding OutOfMemoryErrors (see also my post in the Infinispan forums https://developer.jboss.org/thread/253732). Also, performance was still not acceptable in our first tests (about 3 seconds for an index update with two clustered servers set up on my developer machine), though that may be due to configuration issues.
I think this is not such an uncommon requirement, but I couldn't find much information on best practices to implement it.
Who has experiences with similar setups? Does the Infinispan directory work for you? Can anybody suggest a working configuration or how to proceed to arrive at one? What alternatives have you tried and which work?

You need to be careful about which versions are being used. The Infinispan version which is bundled within JBoss EAP is not intended (i.e. tested as extensively as for other purposes) for storing the Lucene index.
When JBoss EAP 6.2 was released, the bundled Infinispan was considered good to go for the internal needs of the application server, but as you might have discovered, the feature of index storage was having at least some performance issues.
In recent developments of Infinispan we applied many improvements to the index storage feature, fixing some bugs and getting very significant performance improvements out of it. I would hope you could be willing to try Infinispan 7.2.0.Beta1 ?
All of these improvements are also being backported to JBoss Data Grid, version 6.5 will make them available as a supported product. Note this feature of storing an Hibernate Search index wasn't supported before - it is going to be a new feature of JDG 6.5.
Modules from JDG 6.5 will be compatible with JBoss EAP, you'll just have to make sure you'll use the Infinispan build provided by JDG and not the one meant for internal usage of EAP.
Performance improvements are still being worked on. It's much better already - especially compared to that older version - but we won't stop working on that yet so if you could try latest bleeding edge versions of Infinispan 7.2.x (another release is scheduled for tomorrow), I'd highly appreciate your feedback to keep pushing it.

Related

What is the best strategy to select which parameters to use for Geode server and locator startup script

Our company uses Geode services for some of our applications, we are making use of Geode Member group configurations as well for maintaining different regions.
We have been undergoing an effort of migrating our applications from Geode version 1.6 to the latest version 1.12.
We have seen dramatic performance decrease after the upgrade, if we use the older parameters for the server and locator startup scripts, and things work fine when we remove those parameters.
We are now planning to take the route of understanding the parameters (earlier used) and available, to determine the most optimal configurations for the server and locator to get the best out of the new Geode version.
I was wondering if someone has any best practices or recommendations to follow for this task.
Below are the configurations for the Geode locator and server startup scripts for old and new versions.
Locator startup command
---Old configurations ( works great with Geode 1.6 version but not with any version after Geode 1.8)
gfsh start locator --locators=$locators_str --name=${EC2_HOSTNAME}.aws.compnaynamedigital.net --initial-heap=2G --max-heap=2G --dir=/opt/compnayname/geode/locator --J=-Dlog4j.configurationFile=/opt/compnayname/geode/log4j2-locator.xml --J=-DCLUSTER=${ECS_CLUSTER} --J='-javaagent:/opt/compnayname/geode/jmxtrans-agent-1.2.6.jar=/opt/compnayname/geode/jmxtrans-agent-locator.xml' --J=-Dgemfire.distributed-system-id=${DISTRIBUTED_SYSTEM_ID} --J=-Dgemfire.member-timeout=30000 --J=-Dgemfire.max-num-reconnect-tries=0 --J=-Dgemfire.jmx-manager=true --J=-Dgemfire.jmx-manager-start=true --J=-Dgemfire.jmx-manager-port=1099 --J=-Dgemfire.http-service-port=0 --J=-Dgemfire.log-level=info --J=-Dgemfire.log-file-size-limit=10 --J=-Dgemfire.log-disk-space-limit=10 --J=-Dgemfire.disable-auto-reconnect=true
---New configuration (works great with all versions)
gfsh start locator --locators=$locators_str --name=${EC2_HOSTNAME}.aws.compnaynamedigital.net --J=-Xmx2048m --dir=/opt/compnayname/geode/locator --J=-Dlog4j.configurationFile=/opt/compnayname/geode/log4j2-locator.xml --J='-javaagent:/opt/compnayname/geode/jmxtrans-agent-1.2.6.jar=/opt/compnayname/geode/jmxtrans-agent-locator.xml'
Server Startup command
---Old configurations ( works great with Geode 1.6 version but not with any version after Geode 1.8)
gfsh start server --locators=$locators_str --name=${EC2_HOSTNAME}.aws.compnaynamedigital.net --initial-heap=${GEODE_INIT_HEAP} --max-heap=${GEODE_MAX_HEAP} --group=${SERVER_GROUP} --dir=/opt/compnayname/geode/server --classpath=/opt/compnayname/geode/services-geode.jar --J=-Dlog4j.configurationFile=/opt/compnayname/geode/log4j2-server.xml --J=-DCLUSTER=${ECS_CLUSTER} --J='-javaagent:/opt/compnayname/geode/jmxtrans-agent-1.2.6.jar=/opt/compnayname/geode/jmxtrans-agent-server.xml' --J=-Dgemfire.distributed-system-id=${DISTRIBUTED_SYSTEM_ID} --J=-Dgemfire.member-timeout=30000 --J=-Dgemfire.max-num-reconnect-tries=0 --J=-Dgemfire.socket-buffer-size=16777215 --J=-Dgemfire.off-heap-memory-size=${GEODE_OFF_HEAP} --J=-XX:+UseParNewGC --J=-XX:+UseConcMarkSweepGC --J=-XX:CMSInitiatingOccupancyFraction=60 --eviction-heap-percentage=70 --critical-heap-percentage=90 --J=-Dgemfire.http-service-port=0 --J=-Dgemfire.log-level=info --J=-Dgemfire.log-file-size-limit=10 --J=-Dgemfire.log-disk-space-limit=10 --J=-Dgemfire.disable-auto-reconnect=true ${ADDTL_GEODE_SERVER_OPTS}
---New configuration (works great with all versions)
gfsh start server --locators=$locators_str --name=${EC2_HOSTNAME}.aws.compnaynamedigital.net --J=-Xmx${GEODE_MAX_HEAP} --group=${SERVER_GROUP} --dir=/opt/compnayname/geode/server --classpath=/opt/compnayname/geode/services-geode.jar --J=-Dlog4j.configurationFile=/opt/compnayname/geode/log4j2-server.xml --J='-javaagent:/opt/compnayname/geode/jmxtrans-agent-1.2.6.jar=/opt/compnayname/geode/jmxtrans-agent-server.xml'
Test Environment Details
We are using the exact same environment (read AWS) for testing the old and new configurations and performing the same test to measure the response time. We are using 3 Geode locators and 3 Geode servers for the different member groups.
The only difference is the Geode version
We are actually doing a count operation (we have written a count function to execute on Geode regions to count the records existing in the downloaded data which is actually data sketches (https://datasketches.apache.org/)). This count operation on the same data in the same testing environment is giving a drastically slow response with the old configuration using any Geode version beyond 1.8
Another surprising thing is that if I use the old configurations in my local laptop (my laptop serves as locator and server both) with any Geode version greater than 1.8 (including the latest version of Geode), then I am not seeing this issue. Somehow these extra configurations are causing slowness in the AWS environment in the distributed infrastructure.
Please let me know if more information is required and I will be glad to provide more details.
Any information on this will be appreciated.
The main difference I see is the inclusion of --J='-javaagent:/opt/xyz/geode/jmxtrans-agent-1.2.6.jar=/opt/xyz/geode/jmxtrans-agent-server.xml' in the startup parameters. This seems to be a third party Java Agent to expose several JVM metrics through JMX.
Do you know if the agent itself is modifying the byte code?, I've seen negative effects of that approach for Geode applications in the past (not performance related, though). Have you tried upgrading the agent to the latest available version (1.2.10)?. As a side note, Geode already exposes a lot of metrics and information through JMX out of the box, is there any reason why you're relying on yet another external tool for this?.
We have seen dramatic performance decrease after the upgrade
How are you measuring performance?, where do you see the degradation?, are you executing exactly the same workload on exactly the same machines, where the only difference is the Geode version?. There are several actors in play here.
That said, diagnosing and troubleshooting performance degradations can be a long and though process, so my suggestion would be to open a Geode JIRA Ticket with all the relevant information and artefacts.
Cheers.

Liferay Cloud IDE, Multiple developpers working on same liferay server

We want to start working with liferay. But the server is too heavy and the developpers computer don't have enought RAM. We want to centralize the server instance.
In other words, we want to build a development server where all developpers can connect and directly develop in their web browser, compile, view the result and push the code to git repository.
I found some good cloud IDE like eclipse CHE and a good maven archetype for liferay projet. So i can build the projet with maven. But now i want to know if it is possible to configure Liferay like every developpers can work without troubling another. And if possible, How ?
The developpers can share the same database and can use different port. Maybe, the server can generate tempory URL like some online cloud editor.
I found this post Liferay With Multiple Server Instances, but i don't think is the best way because he create one server per project. I think is too heavy.
If necessary, We have kubernetes in our IS.
Liferay's tomcat bundle, by default, is configured to take a maximum of 2.5G for the process, but it can run with far less - the default only recently was bumped up, because many people never change the default and then wonder why production systems run out of memory. For 1 concurrent user (the sole developer) on a machine, I guess that the previous default of 1G heap space is enough. Are you saying that that's too much for your developers' machines?
Having many developers on a shared server poses one problem: Yes, you may deploy different code from different machines, but: How about setting a breakpoint? Can you connect with multiple debuggers? If something fails, how do you know whos recent deployment caused the failure?
Sharing a server is an integration technique, not a development technique. If your developers don't have enough memory available for running their own Liferay server next to their IDE, it's a lot cheaper to upgrade their machines than to slow them down when everybody is accessing the same server and they can't properly debug. You pay the memory once, but your waiting developers by the hour.
Is it possible to share one server? Sure it is.
Is it possible to share one server without troubling each other? I doubt.
When you say: You think it's too heavy: What are you basing that assumption on? What does the actual developer machine look like and what keeps you from investing in the extra memory?
It's trivial to share some infrastructure - i.e. have all of them connect to the same database server (and give everyone their own schema). But just the extra effort and setup might require you to pay the developers by the hour as much as you'd otherwise pay for a couple of memory chips.
And yet another option is: Run Liferay on a remote server, but keep 1 instance per developer. This way you don't need the local memory, but can have the memory in the cloud. Calculate if you pay more for remote cloud machines than for local memory - that decision is up to you.

Websphere Migration from was7 to was9

Planning to Migrate the Websphere from 7.0 to 9 and 8.5 to 9.
Can anyone help me getting the detailed Process
Migration here is "In place". (Migration will be done on the same servers, where the old Installation are in)
if at all any migration tools need to be used, please provide the clear info on them.
any documental references, or any video references for the questioner is appreciated.
OS used : RHEL
CUrrent version: WAS 7x and 8.5
Migrating to : WAS 9.0
It sounds like you're in the very beginning stages of doing this migration. Therefore, I highly recommend you take some time to plan this out, especially to figure out the exact steps you'll be taking and how you'll handle something going wrong. For WebSphere, there is a collection of documents from IBM that discuss planning and executing the upgrade. Look around there for documentation on the tools and step by step guides for different kinds of topologies. The step by step guide for an in place migration of a cell is here.
You should make sure to take good backups before you start the process so you can restore back to before the migration if you need to.
In addition to doing the upgrade, an important part is to also make sure your applications are going to work on the new version if you haven't already. IBM provides this tool to scan applications and identify potential issues that developers will have to fix. There is documentation for the tool at that link as well.
If you are in the planning phase, I'd strongly suggest you to consider migrating to WebSphere Liberty instead of traditional WAS v9. All these migration tools (toolkit for binaries, Eclipse migration toolkit) support both migration scenarios.
Choosing Liberty might be a bit more work at the beginning, but you will gain more deployment flexibility and speed up future development. Liberty is also much better fitted for any cloud/containers environments, as it is much more lightweight, so in the future, if you would like to move to containers, it would be much easier.
Check this tutorial Migrate traditional WebSphere apps to WebSphere Liberty on IBM Cloud Private by using Kubernetes, although it shows the steps to migrate to Liberty on ICP, beginning is the same - analyzing of the application whether they are fit for Liberty and migrating. If you don't have access to IBM Cloud or ICP, you can use stand alone version of the Transformation Advisor that was released recently - IBM Cloud Transformation Advisor.
Having said all that, some apps include old or proprietary traditional WebSphere APIs and in that case it may be easier and cheaper to temporary migrate them to WAS v9, and modernize in the future.

Effects of maintaining many MobileFirst versions

We have an iOS and Android app that we deployed to our MFP server v7.0.
I know that maintaining a lot of versions is hard and the user who installed the old versions may not experience the new features if we don't let them install the latest version.
I want to know what are the other implications/effects if we maintain a lot of version in server. Like if it'll take a long time to restart the server or the server might experience downtime.
What is the good number of versions to be maintained as well? If there's any.
I want to know what are the other implications/effects if we maintain a lot of version in server. Like if it'll take a long time to restart the server or the server might experience downtime.
Yes, it could take longer for the server to start if you maintain a lot of resources, I would say this is expected. It's natural.
You will want to look at the following doc: https://mobilefirstplatform.ibmcloud.com/learn-more/scalability-and-hardware-sizing-7-1/
What is the good number of versions to be maintained as well? If there's any.
This is really up to your development, marketing, etc... You probably want to always have your customers on the latest, so you could play with 2-3 versions and Remote Disable those you decide to not support anymore.

Apache Tomcat 6.0.35 is taking 100% CPU in prodcution

I have been using apache-tomcat-6.0.35 in production environment. Our application is hosted on Amazon EC2 using Small Instance. The problem we are facing is that the apache tomcat is using 100% CPU. We have verified it by running htop and it shows multiple threads of tomcat running.
Out application has been developed in Grails 2.0.1.
We are puzzled that why it is happening? Can any body suggest any solutions?
Thanks
Probable Cause
Most likely this has been caused by the recent Leap Second and its impact on quite some unaware/unprepared IT systems, including parts of Linux, MySQL, Java and indeed Tomcat - see the Wired article about the ‘Leap Second’ Bug Wreaks Havoc Across Web for the whole story:
[...], saying it experienced the leap bug problem with the
Java-happy Tomcat web servers it uses to serve up its site. “Our web
servers running tomcat came close to zero response (we were able to
handle some requests),” read an e-mail from a site spokesman. “We were
able to connect to servers in order to reset them. Only rebooting the
servers cleared up the issue.” [emphasis mine]
Workaround / Fix
Accordingly, the solution usually boils down to turning it off and on again, i.e. restarting the server in question, though you might be able to avoid this by simply setting the date, as suggested e.g. in the context of:
Linux/Tomcat, see July 1 2012 Linux problems? High CPU/Load? Probably caused by the Leap Second!:
Apparently, simply forcing a reset of the date is enough to fix the
problem:
date -s "`date`"
MySQL, see MySQL and the Leap Second, High CPU and the Fix (also linked from the comments on wwwhizz' answer to MySQL high CPU usage, where you'll find two specific variations how to do this depending on your OS):
The fix is quite simple – simply set the date. Alternatively, you can
restart the machine, which also works. Restarting MySQL (or Java, or
whatever) does NOT fix the problem.
Background / Proposed Solutions
Please note that while the underlying issue is utterly tricky, it is all but unknown in principle, hence there have been prominent posts/users warning about and explaining this and offering suggestions on how to deal with it in principle, in particular:
An humble attempt to work around the leap second by Marco Marongiu
Time, technology and leaping seconds by Christopher Pascoe
We can't say anything for sure with the information provided. For performance issue, I would recommend a profiler, especially JProfiler, to investigate the cause of this problem. By this way you will be able to locate where the problem is.
This program has a trial license, I think that's enough for a quick look.
UPDATE: after carefully read your question, I see that you have many tomcat instance running for a website? It means that the previous tomcat instances failed to stop; they still run and hog up all the resources. This is possible. You must kill all the old tomcat process before trying to start a new one.
You can kill the processes by hand by "kill -9 " if you are on Linux, before trying to start the server again.