Clearing Locks between JUnit tests in Hibernate Search 4.1

We recently upgraded to Hibernate Search 4.1 and are getting errors when we run our JUnit tests, apparently because of the changes Hibernate made to index locking. When we run JUnit tests with AbstractTransactionalJUnit4SpringContextTests, we often see locks left behind after each test. After reviewing (How to handle Hibernate-Search index recovery) we tried the native locks, but this did not resolve the issue.
We've tried out the various locking mechanisms (simple, single, and native) using the default directory provider (Filestore) and regularly see messages like:
build 20-Apr-2012 07:07:53 ERROR 2012-04-20 07:07:53,290 154053 (LogErrorHandler.java:83) org.hibernate.search.exception.impl.LogErrorHandler - HSEARCH000058: HSEARCH000117: IOException on the IndexWriter
build 20-Apr-2012 07:07:53 org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock#target/indexes/Resource/write.lock
build 20-Apr-2012 07:07:53 at org.apache.lucene.store.Lock.obtain(Lock.java:84)
build 20-Apr-2012 07:07:53 at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1108)
or
build 19-Apr-2012 19:31:09 ERROR 2012-04-19 19:31:09,395 153552 (LuceneBackendTaskStreamer.java:61) org.hibernate.search.backend.impl.lucene.LuceneBackendTaskStreamer - HSEARCH000072: Couldn't open the IndexWriter because of previous error: operation skipped, index ouf of sync!
Some of these messages seem to show the lock issue cascading from one test to another, hence the need for a reset. Others may be valid, because the tests exercise 'invalid' behaviors and how our application reacts to them. Often, though, they come from cases like this one, where the ID is null:
build 19-Apr-2012 19:31:11 Primary Failure:
build 19-Apr-2012 19:31:11 Entity org.tdar.core.bean.resource.CodingSheet Id null Work Type org.hibernate.search.backend.PurgeAllLuceneWork
Regardless, we need to make sure that one test does not affect another.
In reading some of the discussions (email discussion on directory providers) it was suggested that the RAM based directory provider might be a better option, but we'd prefer to use the same provider as we use in production wherever possible.
How should we reset Hibernate Search between tests to clean up lock files and recover from cases where the index is out of sync or corrupted? At the beginning of the test suite we wipe the index directory; is it recommended to wipe it after every test?
Thanks.

If you have stale locks in the directory, it means that Hibernate Search wasn't shut down properly, because a clean shutdown always releases its locks.
If you start a new Hibernate SessionFactory in each test, you should make sure it is closed as well after the test has run:
org.hibernate.SessionFactory.close()
(This is missing from many examples because there are no noticeable problems when you forget to close a Hibernate SessionFactory, but it has never been optional and omitting it might leak connections or threads.)
The thread from the Hibernate mailing list you linked to ended up changing the locks to use native handles in Hibernate Search 4.1, so that locks are cleaned up automatically in case the JVM crashes or is killed. But in your case I guess you're not killing the VM between tests, so you just need to make sure locks are released properly by shutting down the service.
exclusive_index_use=false hides the problem as the IndexWriter will be closed at the end of each transaction. That makes it slower though, as it's significantly more efficient to reuse the IndexWriter. The reason you have this issue after upgrading to Hibernate Search 4.1 is that this option was changed to true by default. But even then, you should still close it properly.
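To illustrate why a lock stays in place until its owner is shut down, here is a minimal stand-alone sketch using plain java.nio file locks, the kind of OS-level handle that native locking relies on. It does not use Hibernate Search or Lucene at all; the class name, the method names, and the temporary "index" directory are made up for the demonstration. While the first channel is open, a second attempt on the same write.lock file fails; once the first channel is closed (the analogue of closing the SessionFactory), a new attempt succeeds.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Hibernate Search or Lucene.
final class NativeLockDemo {

    /** Opens the lock file and tries to take an exclusive OS-level lock.
     *  Returns the open channel (which holds the lock) or null on failure. */
    public static FileChannel openAndLock(Path lockFile) {
        try {
            FileChannel channel = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            try {
                FileLock lock = channel.tryLock();
                if (lock != null) {
                    return channel; // the lock is held until this channel is closed
                }
            } catch (OverlappingFileLockException heldInThisJvm) {
                // Another channel in this JVM already holds the lock.
            }
            channel.close();
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {first attempt, second attempt while first is open, attempt after close}. */
    public static boolean[] run() {
        try {
            Path dir = Files.createTempDirectory("index");
            Path lockFile = dir.resolve("write.lock");

            FileChannel first = openAndLock(lockFile);
            FileChannel second = openAndLock(lockFile); // owner still open
            boolean secondOk = second != null;
            if (second != null) second.close();

            if (first != null) first.close(); // clean shutdown releases the lock
            FileChannel third = openAndLock(lockFile);
            boolean thirdOk = third != null;
            if (third != null) third.close();

            Files.deleteIfExists(lockFile);
            Files.deleteIfExists(dir);
            return new boolean[] { first != null, secondOk, thirdOk };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("first=" + r[0] + " whileOpen=" + r[1] + " afterClose=" + r[2]);
    }
}
```

In a test suite the same idea applies one level up: whatever opened the index has to be closed before the next test (or the next application context) tries to open it again.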

My understanding is that Spring manages the SessionFactory lifecycle, so it is not necessary to call close() yourself.
I have seen this locking error when multiple contexts are loaded during a test run. For example, the first context creates the locks on the index files. The second context then attempts to access the same indexes and fails because the SessionFactory from the first context is still open.
I fixed this by using @DirtiesContext, which closes the context before the next one is instantiated.

Related

Is it okay to initialize/seed a database data during Startup of an application?

We would like to programmatically ensure that a database table has a certain set of rows (based on a sometimes-changing enum). We are using EF Core 2.2 with code-first migrations and are looking for the right place to seed this data. We had thought that adding a seeding method to our Startup.cs would be a good idea, but Microsoft's documentation says
The seeding code should not be part of the normal app execution as this can cause concurrency issues when multiple instances are running and would also require the app having permission to modify the database schema.
Is the code in Startup.cs considered "part of the normal app execution"?
Our app currently only runs with 1 instance, but there might be multiple in the future. Plus, we have an Azure Functions app and a console app which might also need to ensure that the database table has the correct rows before executing. Despite these concerns, I have seen accepted and upvoted answers on other threads saying that initializing as part of Startup.cs is okay. Will we be shooting ourselves in the foot by doing this?

From the docs:
Depending on the constraints of your deployment the initialization code can be executed in different ways:
Running the initialization app locally.
Deploying the initialization app with the main app, invoking the initialization routine and disabling or removing the initialization app.
My interpretation is that you could deploy a separate console app, using publishing profiles, that seeds the database when it is launched.

programmatically set show_sql without recreating SessionFactory?

In NHibernate, I have show_sql turned on for running unit tests. Each of my unit tests clears the database and refills it, and this results in lots of sql queries that I don't want NHibernate to output.
Is it possible to control show_sql without destroying the SessionFactory? If possible, I'd like to turn it off when running setup for a test, then turn it on again when the body of the test starts to run.
Is this possible?

The only place you can set this is when building an NHibernate.Cfg.Configuration.
Once you've created a SessionFactory from your Configuration, there's no way to access the configuration settings, which I think is one of the reasons for using a factory pattern: to ensure that instances once successfully built can't be messed up by runtime re- or mis-configuration.
If you really need that feature, get the NH source code and find the place where the show_sql setting is evaluated.

Another option, although it may not be as good, is to use NHProf and initialise it only when testing.
NHProf doesn't log the setting up or clearing of the database, just the queries used.

How to override edit locks

I'm writing a WLST script to deploy some WARs and an EAR. However, intermittently, the script will time out because it can't seem to get an edit lock (this script is part of a chain of many other scripts). I was wondering, is there a way to override or stop any current locks on the server? This is only a temporary solution, but in the interest of time, it will do for now.
Thanks.

You could try setting a wait period and timeout:
startEdit([waitTimeInMillis], [timeoutInMillis], [exclusive]).
Are other scripts erroring out and leaving the session locked? You could try adding exception handling around those. Also, if you have "Automatically acquire lock" enabled in the Admin Console and you sometimes use the Admin Console while scripts are running, it can cause problems, even though you are not making "lock-requiring" changes.
Also, are you using the same user for the chained scripts?

Within WLST, you can pass a number as a parameter to gain an exclusive lock. This allows the script to grab a different lock than the regular one that's used whenever an administrator locks from the console. It also prevents two instances of the same script from stepping on each other.
However, this creates complex change merge scenarios that are best avoided (by processes).
Oracle's documentation on configuration locks can be found here.
Alternatively, if you want the script to temporarily relieve any existing locks regardless of the pending changes, you may as well disable change management from the console, minimizing the inconvenience caused.
WLST also contains the cancelEdit command, which you could run before you call startEdit. Hope one of these options pans out!

To take the configuration change lock from another administrator:
If another administrator already has the configuration lock, the following message appears: Another user already owns the lock. You will need to either wait for the lock to be released, or take the lock.
Locate the Change Center in the upper left corner of the Administration Console.
Click Take Lock & Edit.
Make your configuration changes.
In the Change Center, click Activate Changes. Not all changes take effect immediately. Some require a restart (see Use the Change Center).

As long as you're running WLST as an administrative user, you should be able to jump into an existing edit session with the edit() command - I've done a quick test with two admin users, one in the Admin Console, and one using WLST, and it appears to work fine - I can see the changes in the Admin Console session inside the WLST interpreter.
You could put a very simple exception handler around your calls to startEdit that will log the exception's stack trace, but do nothing else. And then rely on the edit call to pop you into the change session.
Relying on that is going to be tricky though if another script has started an edit session and is expecting to be able to commit that change session itself - you'll be getting exceptions and unreliable behaviour across multiple invocations.

TeamCity: Managing deployment dependencies for acceptance tests?

I'm trying to configure a set of build configurations in TeamCity 6 and want to model a specific requirement in the cleanest way TeamCity allows.
I have a set of acceptance tests (around 4-8 suites of tests grouped by the functional area of the system they pertain to) that I wish to run in parallel (I'll model them as build configurations so they can be distributed across a set of agents).
From my initial research, it seems that having an AcceptanceTests meta-build config that pulls in the set of individual acceptance test configs via Snapshot dependencies should do the trick. Then all I have to do is say that my Commit build config should trigger AcceptanceTests and they'll all get pulled in. So, let's say I also have AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC.
So far, so good (I know I could also turn it around the other way and have the Commit config trigger AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC directly; the problem there is that I would need to manually aggregate the results to determine the overall success of the acceptance tests as a whole).
The complicating bit is that while AcceptanceSuiteC just needs some Commit artifacts and can then live on its own, AcceptanceSuiteA and AcceptanceSuiteB need to:
DeploySite (let's say it takes 2 minutes and I can't afford to spin up a completely isolated one just for this run)
Run tests against the deployed site
The problem is that I need to be able to ensure that:
the website only gets configured once
The website does not get clobbered while the two suites are running
If I set up DeploySite as a build config and have AcceptanceSuiteA and AcceptanceSuiteB pull it in as a snapshot dependency, AFAICT:
a subsequent or parallel run of AcceptanceSuiteB could trigger another DeploySite which would clobber the deployment that AcceptanceSuiteA and/or AcceptanceSuiteB are in the middle of using.
While I can use "Limit the number of simultaneously running builds" to force only one DeploySite to run at a time, I need more than that: one at a time, and never while the dependent suites are still running.
Is there a way in TeamCity to model such a hierarchy?
EDIT: Ideas:-
A crap solution is that DeploySite could set an 'in use' flag and then have the AcceptanceTests config clear that flag [after AcceptanceSuiteA and AcceptanceSuiteB have completed]. The problem then becomes one of having the next DeploySite down the pipeline wait until said gate has been opened again (doing a blocking wait within the build doesn't feel right; I want it to be flagged as 'not yet started' rather than looking like it's taking a long time to do something). However, this sort of "stuff a flag over here and have this bit check it" approach is exactly the mutable state / flakiness smell I'm trying to get away from.
EDIT 2: if I could programmatically alter the agent configuration, I could set Agent Requirements to require InUse=false and then set the flag when a deploy starts and clear it after the tests have run

It seems you should first search the JetBrains Devnet and the YouTrack tracker, remembering to use the magic word clobber in your search.
Then you install groovy-plug and use the StartBuildPrecondition facility:
To use the feature, add system.locks.readLock. or system.locks.writeLock. property to the build configuration.
The build with writeLock will only start when there are no builds running with read or write locks of the same name.
The build with readLock will only start when there are no builds running with write lock of the same name.
therein to manage the fact that the dependent configs 'read' and the DeploySite config 'writes' the shared item.
(This is not a full productised solution hence the tracker item remains open)
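The two start rules quoted above can be sketched as a small in-memory model. This is only an illustration of the read/write semantics as described, not the plugin's implementation; the class and method names are hypothetical, and "site" is an invented lock name standing in for the shared deployment.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of named read/write build locks; not TeamCity or groovy-plug code.
final class BuildLockRegistry {
    private final Map<String, Integer> runningReaders = new HashMap<>();
    private final Set<String> runningWriters = new HashSet<>();

    /** A readLock build may start only if no writeLock build of that name is running. */
    public boolean tryStartWithReadLock(String name) {
        if (runningWriters.contains(name)) {
            return false;
        }
        runningReaders.merge(name, 1, Integer::sum);
        return true;
    }

    /** A writeLock build may start only if no read or write build of that name is running. */
    public boolean tryStartWithWriteLock(String name) {
        if (runningWriters.contains(name) || runningReaders.getOrDefault(name, 0) > 0) {
            return false;
        }
        runningWriters.add(name);
        return true;
    }

    /** Called when a readLock build finishes. */
    public void finishRead(String name) {
        runningReaders.computeIfPresent(name, (key, count) -> count > 1 ? count - 1 : null);
    }

    /** Called when a writeLock build finishes. */
    public void finishWrite(String name) {
        runningWriters.remove(name);
    }
}
```

Under this model DeploySite would take the write lock and AcceptanceSuiteA and AcceptanceSuiteB would each take the read lock, so both suites can run in parallel while a second DeploySite is held back until they finish. Note that the model says nothing about what happens in the gap between the writer finishing and the readers starting.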
EDIT: And I still don't know whether the lock should go under Build Parameters|System Properties or what the exact name format should be. Is it locks.writeLock.MYLOCKNAME (i.e., showing up in the config with the reference syntax %system.locks.writeLock.MYLOCKNAME%)?
Other puzzlers are: how does one manage giving builds triggered by build completion of a writeLock task read access - does the lock get dropped until the next one picks up (which would allow another writer in) - or is it necessary to have something queue up the parent and child dependency at the same time ?

Nhibernate Profiler - Shows no information other than "session"?

I am having problems getting NHibernate integrated into my MVC project. I therefore installed NHProfiler and initialized it in the Global.asax.cs file (NhibernateProfiler.Initialize();).
However, all I can see in NHProf is a Session # and the time it took to come up. Selecting it, or performing any other operation, doesn't show me any information about the connection to the database, or any information at all in the other windows, such as:
- Statements, Entities, Session Usage
The Session Factory Statistics only shows the start time and execution time, and that's it.
Any thoughts?

Do you have any custom log4net configuration? Just thinking that might be overwriting NHProf's log4net listener after startup. If you refresh the page (and hence start another session*), does NHProf display another Session start? Also verify that your HibernatingRhinos.Profiler.Appender.dll (or HibernatingRhinos.Profiler.Appender.v4.0.dll if you're using .NET 4) is the same one as the current version of NHProf.
* I'm assuming that you're using Session-per-Request since this is a web app.
