Tuning Infinispan for local MVCC HashMap - infinispan

I am new to Infinispan and basically stumbled over it whilst looking for an isolated MVCC HashMap for Java.
I am concerned that Infinispan may be somewhat heavy for what I need or that there may be a more performant way to achieve what I need with Infinispan. I don't need clustering or distribution, I just need the embedded Infinispan inside a single JVM.
At the moment I need a Map implementation which has transactions and Repeatable Read semantics. I currently have the following initialization code:
final ConfigurationBuilder builder = new ConfigurationBuilder();
builder.jmxStatistics().available(false);
builder.invocationBatching().enable();
builder.versioning().scheme(VersioningScheme.SIMPLE);
builder.versioning().enable();
builder.locking().concurrencyLevel(Runtime.getRuntime().availableProcessors() * 2);
builder.locking().writeSkewCheck(true);
builder.transaction().locking().isolationLevel(IsolationLevel.REPEATABLE_READ);
builder.transaction().lockingMode(LockingMode.OPTIMISTIC);
builder.transaction().transactionMode(TransactionMode.TRANSACTIONAL);
final DefaultCacheManager cacheManager = new DefaultCacheManager(builder.build());
final Cache<String, String> cache = cacheManager.getCache();
final TransactionManager transactionManager = cache.getAdvancedCache().getTransactionManager();
I then use the cache from various threads like so:
transactionManager.begin();
cache.put(KEY, VALUE);
...
transactionManager.commit();
Is this the most effective/performant way I could achieve this, or should I consider a different class or are there some tuning options that I am not aware of?

Infinispan development is more focused on the clustered setup; local caches are rather a special case, and therefore their implementation may seem somewhat heavier.
You've missed builder.transaction().notifications(false). That could shave off another percent or so, at the cost of disabling transaction listeners. You can also experiment with builder.transaction().useSynchronization(true), though with the dummy TM (which is used for invocation batching) it probably doesn't matter much.
There's a cache mode optimized for local operations, the simple cache, but it does not support transactions. So I'd say that this is pretty much all you can tune.

Related

What are the key differences between using Redis Cache via ConnectionMultiplexer and via AddStackExchangeRedisCache (IDistributedCache) in Startup.cs?

I want to implement distributed caching (Redis) in an ASP.NET Core project. After a bit of research I found that there are two ways of creating a Redis connection: using AddStackExchangeRedisCache in Startup.cs, and using ConnectionMultiplexer directly.
AddStackExchangeRedisCache - This happens in Startup.cs.
Doubts in above approach:
Does this work in Prod environment?
When and how is the connection initialized?
Is it a thread-safe way to create the connection?
By using the ConnectionMultiplexer, we can initialize the DB instance ourselves. According to a few articles, lazy initialization will take care of thread safety as well.
Doubts:
From above approaches, which is the better approach?
I tried both approaches on my local machine and both work fine, but I could not find the pros and cons of each approach.
With ConnectionMultiplexer, you have the full list of commands that you can execute on your Redis server. With DistributedCaching, you can only store/retrieve a byte array or a string, and you cannot execute any other commands that Redis provides. So if you just want to use it as a cache store, DistributedCaching provides a good abstraction layer. However, even the simplest increment/decrement command for Redis will not be available unless you use ConnectionMultiplexer.
The extension method AddStackExchangeRedisCache uses a ConnectionMultiplexer under the hood (see here, and here for the extension method itself).
#2: works in prod either way
#3: connection is established lazily on first use, the ConnectionMultiplexer instance is re-used (registered as DI singleton)
#4: yes, see above and here: a SemaphoreSlim is used to ensure the connection is only created once
pros and cons: since both use the ConnectionMultiplexer, they are pretty similar.
You can pick between the advantages of using the implementation agnostic IDistributedCache vs. direct use of the multiplexer and the StackExchange.Redis API (which has more specific functions than the interface).
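The "created lazily on first use, and only once" behaviour described above can be illustrated with a small sketch. This is not the actual RedisCache source and it is written in Java rather than C#; LazyConnection and its factory parameter are hypothetical names, and java.util.concurrent.Semaphore merely stands in for the role SemaphoreSlim plays in the .NET code.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: creates an expensive connection object on first use only.
final class LazyConnection<T> {
    private final Supplier<T> factory;
    private final Semaphore gate = new Semaphore(1); // one thread may create at a time
    private volatile T connection;                   // published once created

    LazyConnection(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T existing = connection;
        if (existing != null) {
            return existing;                 // fast path: already connected
        }
        gate.acquireUninterruptibly();
        try {
            if (connection == null) {        // re-check after winning the gate
                connection = factory.get();  // runs at most once
            }
            return connection;
        } finally {
            gate.release();
        }
    }
}
```

Every caller shares the same instance afterwards, which is the same effect as registering the multiplexer as a DI singleton.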
Wrappers like IDistributedCache and StackExchangeRedis.Extensions do not expose all the functions of the original library. In particular, I needed to delete all the keys in the Redis cache, which was not possible through these wrappers.

What OOP patterns can be used to implement a process over multiple "step" classes?

In OOP everything is an object with its own attributes and methods. However, often you want to run a process that spans multiple steps that need to be run in sequence. For example, you might need to download an XML file, parse it and run business actions accordingly. This includes at least three steps: downloading, unmarshalling, interpreting the decoded request.
In a really bad design you would do this all in one method. In a slightly better design you would put the single steps into methods or, much better, new classes. Since you want to test and reuse the single classes, they shouldn't know about each other. In my case, a central control class runs them all in sequence, taking the output of one step and passing it to the next. I noticed that such control-and-command classes tend to grow quickly and are not very flexible or extensible.
My question therefore is: what OOP patterns can be used to implement a business process and when to apply which one?
My research so far:
The mediator pattern seems to be what I'm using right now, but some definitions say it's only managing "peer" classes. I'm not sure it applies to a chain of isolated steps.
You could probably call it a strategy pattern when more than one of the aforementioned mediators is used. I guess this would solve the problem of the mediator not being very flexible.
Using events (probably related to the Chain of Responsibility pattern) I could make the single steps listen for special events and send different events. This way the pipeline is super-flexible, but also hard to follow and to control.
Chain of Responsibility is the best fit for this case. It is pretty much the definition of CoR.
If you are using Spring, you can consider this interesting Spring-based implementation of the pattern:
https://www.javacodegeeks.com/2012/11/chain-of-responsibility-using-spring-autowired-list.html
Obviously, without Spring it is very similar.
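As a rough illustration without Spring, the chain can be an ordered list of handlers that each work on a shared context and may stop the chain. This is only a sketch under assumptions: RequestContext, Step, StepChain and the three step classes are hypothetical names, and the download is faked with a hard-coded string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared state handed from one step to the next.
final class RequestContext {
    String rawXml;
    String parsedCommand;
    final List<String> log = new ArrayList<>();
}

// One link in the chain; returning false stops further processing.
interface Step {
    boolean handle(RequestContext ctx);
}

final class DownloadStep implements Step {
    public boolean handle(RequestContext ctx) {
        ctx.rawXml = "<command>greet</command>"; // stand-in for a real download
        ctx.log.add("download");
        return true;
    }
}

final class ParseStep implements Step {
    public boolean handle(RequestContext ctx) {
        if (ctx.rawXml == null) {
            return false;                        // nothing to parse, stop the chain
        }
        ctx.parsedCommand = ctx.rawXml.replaceAll("<[^>]+>", ""); // strip the tags
        ctx.log.add("parse");
        return true;
    }
}

final class InterpretStep implements Step {
    public boolean handle(RequestContext ctx) {
        if (!"greet".equals(ctx.parsedCommand)) {
            return false;                        // unknown command, stop the chain
        }
        ctx.log.add("interpret:" + ctx.parsedCommand);
        return true;
    }
}

// Runs the steps in order; the steps themselves never reference each other.
final class StepChain {
    private final List<Step> steps;

    StepChain(List<Step> steps) {
        this.steps = new ArrayList<>(steps);
    }

    boolean run(RequestContext ctx) {
        for (Step step : steps) {
            if (!step.handle(ctx)) {
                return false;
            }
        }
        return true;
    }
}
```

Adding, removing or reordering steps then only changes the list passed to StepChain, which is roughly what the autowired list does in the Spring variant.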
Is dependency injection not sufficient? It makes your code reusable and testable (as you requested), with no need for a complicated design pattern.
public final class SomeBusinessProcess {
    private final Server server;
    private final Marshaller marshaller;
    private final Codec codec;

    public SomeBusinessProcess(Server server, Marshaller marshaller, Codec codec) {
        this.server = server;
        this.marshaller = marshaller;
        this.codec = codec;
    }

    public Foo retrieve(String filename) {
        File f = server.download(filename);
        byte[] content = marshaller.unmarshal(f);
        return codec.decode(content);
    }
}
I believe that a Composite Command (a variation of the Command pattern) would fit what you describe. These are used frequently in Eclipse.
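A minimal sketch of a composite command, assuming a hypothetical Command interface whose execute method appends to a journal so the effect is visible (this is not Eclipse's own API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical command abstraction; the journal only makes execution order observable.
interface Command {
    void execute(StringBuilder journal);
}

// A command made of other commands, executed in insertion order.
final class CompositeCommand implements Command {
    private final List<Command> children = new ArrayList<>();

    CompositeCommand add(Command child) {
        children.add(child);
        return this; // allows chained construction
    }

    @Override
    public void execute(StringBuilder journal) {
        for (Command child : children) {
            child.execute(journal);
        }
    }
}
```

Because a CompositeCommand is itself a Command, a whole sub-process can be nested as a single step, for example new CompositeCommand().add(download).add(new CompositeCommand().add(unmarshal).add(interpret)), where the three leaf commands are whatever step objects you already have.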

With EclipseLink, is it possible to create orm.xml mappings at runtime for a class?

I am new to EclipseLink. I am trying to generate orm mappings for a class during runtime and do mapping.
Is it possible at all?
I see examples where a class is generated during runtime but that doesn't fit my situation.
thanks
It could be possible, depending on what you are trying to do and when. Persistence units are pretty static creations that should be known upfront, just like Java classes themselves. So if you are not using dynamic entities, why wouldn't you know upfront that the class should be a part of the persistence unit?
While it is not a great idea, you could create a static persistence unit and specify that it use a customizer as described here http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Customizers with which you could add in descriptors or mappings to the persistence unit. The customizer is only run once though, during initialization. So if you wanted to make changes later on, you would need to refresh the persistence unit using the refreshMetadata on the EntityManagerFactory to have it reload the persistence unit. Running EntityManagers will not be affected by the changes.
Using the EMF refreshMetadata, you could also use a MetadataRepository to pick up different or extended ORM.xml files for your entities - so you could incorporate changes made to the xml instead of using a customizer. This is described somewhat here:
http://www.eclipse.org/eclipselink/documentation/2.5/solutions/extensible001.htm#CIAIJHAG

C# Task Parallel Library and NHibernate/Spring.NET

I have been using Spring.NET and NHibernate for some years and I am very satisfied. However, I have always been playing around with multithreading, Reactive Extensions and eventually the Task Parallel Library, which is a great framework. Unfortunately, all kinds of multithreading approaches fail because NHibernate's session is not thread-safe.
My question is: how can I benefit from parallel programming while still using NHibernate?
For instance: I have a CustomerRegistrationService class whose Register method performs several tasks:
ICustomer customer = this.CreateCustomerAndAdresses(params);
this.CreateMembership(customer);
this.CreateGeoLookups(customer.Address);
this.SendWelcomeMail(customer);
The last two methods would be ideal candidates to run in parallel. CreateGeoLookups calls some web services to determine the geo locations of the customer's address, creates some new entities and updates the customer itself. SendWelcomeMail does what it says.
Because CreateGeoLookups uses NHibernate (although through repository objects, so NHibernate is actually hidden behind interfaces/dependency injection), it won't work with Task.Factory.StartNew(...) or other threading mechanisms.
My question is not about solving this particular issue; rather, I would like to hear from you about NHibernate, Spring.NET and parallel approaches.
Thank you very much
Max
In NH it's the ISession that isn't thread-safe but the ISessionFactory is entirely thread-safe, easily supporting what it seems you are after. If you have designed your session-lifecycle management (and the repositories that depend upon it) such that you assume one single consistent ISession across calls, then, yes, you will have this kind of trouble. But if you have designed your session-handling pattern to assume only a single ISessionFactory and to make no assumptions about ISession, then there is nothing inherently preventing you from interacting with NH in parallel.
Although you don't specifically mention your use case as being for the web, it's important to note that in web-centric use cases (a pretty common case for Spring.NET users as well as many other NH-managing frameworks), the often-used 'Session-Per-Request' pattern of ISession management (often referred to in Spring.NET as 'Open Session In View' or just 'OSIV') will NOT work, and you will need to switch to a different ISession lifecycle. This is because (as the name suggests) the session-per-request/OSIV pattern assumes (now incorrectly in your case) that there is only a single ISession instance for the duration of each HttpRequest, and presumably you would want to spawn these parallel NH calls within the context of a single HttpRequest in the web use case.
Obviously, in the non-web case, where there's rarely a concept similar to session-per-request, you are less likely to run into this issue, as session-lifecycle management is rarely as fine-grained or short-lived as it is in web-based applications.
Hope this helps.
-Steve B.
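The "share the factory, never the session" idea can be sketched as follows. This is a Java analogue rather than NHibernate code: Session, SessionFactory, CountingSessionFactory and ParallelWork are hypothetical stand-ins invented for illustration, and the sketch does not address sharing a single transaction across the tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a cheap, non-thread-safe unit-of-work object.
interface Session extends AutoCloseable {
    String load(String key);

    @Override
    void close();
}

// Hypothetical stand-in for the expensive, thread-safe factory.
interface SessionFactory {
    Session openSession();
}

// Toy factory that counts how many sessions were opened.
final class CountingSessionFactory implements SessionFactory {
    final AtomicInteger opened = new AtomicInteger();

    @Override
    public Session openSession() {
        final int id = opened.incrementAndGet();
        return new Session() {
            @Override
            public String load(String key) {
                return key + "@session" + id;
            }

            @Override
            public void close() {
                // nothing to release in this toy implementation
            }
        };
    }
}

final class ParallelWork {
    // Only the factory crosses thread boundaries; each task owns its session.
    static List<String> loadAll(SessionFactory factory, List<String> keys) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (String key : keys) {
            futures.add(CompletableFuture.supplyAsync(() -> {
                try (Session session = factory.openSession()) {
                    return session.load(key);
                }
            }));
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> future : futures) {
            results.add(future.join()); // wait for each task, preserving key order
        }
        return results;
    }
}
```

Repositories written against the factory (or against a session passed in per call) fit this shape; repositories that capture one session at construction time do not.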
This is a difficult thing you are asking for. The DTC has to be handled with care.
The only solution I know of is to use reliable, transactional messaging (e.g. MSMQ + NServiceBus/MassTransit).
Such a design enables you to do this. It would look like this:
var customerUid=CreateCustomers();
Bus.Publish(new CustomerCreatedEvent() { CustomerUid = customerUid});
Then you could use two event handlers (Reactors) that handle the event and send an EMail or create the lookups.
This won't let you share the transaction either, but it will ensure that the Reactors run (in a new transaction) once the creation of the customer has succeeded.
Also this has nothing to do with the TPL.
Well, thank you for answering. I know that the 'ISession that isn't thread-safe but the ISessionFactory is entirely thread-safe'. My problem in the above code, for example, is that the whole operation is wrapped in one transaction. So this.CreateCustomerAndAdresses(params) on main thread #1 will use, for instance, ISession #1 with transaction #1. Calling the other three in parallel will create three more threads and three more sessions and transactions, which leads to database timeouts in my case. My assumption is that transaction #1 is not successfully committed because it waits for the three concurrent tasks to complete, while those three tasks try to read from the database while that transaction is still active, leading to deadlocks/timeouts.
So is there some way to tell the other threads/sessions not to create a new transaction but use the main transaction #1?
I am using the TxScopeTransactionManager from Spring.NET which utilises DTC (System.Transactions). I have googled that maybe System.Transactions.DependentTransaction could work but do not have a clue how to integrate it in my Spring.NET transaction managed scenario.
Thanks

Is Unit of Work more efficient if I don't care for transactions etc?

Suppose I have 10 database calls on my web page and none of them requires a transaction; they simply read data from the database. Should I still use the unit of work class?
Is it because creating a new session per database call is too expensive?
With NHibernate, session factory creation is very expensive (so you'll want to cache the session factory once it's created, probably on the HttpApplication) but session creation is very cheap. In other words, if it keeps your code cleaner, creating multiple sessions is not necessarily a bad thing. I think the NH documentation says it best:
"An ISessionFactory is an expensive-to-create, threadsafe object intended to be shared by all application threads. An ISession is an inexpensive, non-threadsafe object that should be used once, for a single business process, and then discarded."
So, using the UoW pattern is probably not more efficient due to the extra overhead, but it's a good practice and the overhead is probably not going to hurt you. Premature optimization and all that.
Yes, you should use a transaction. From Ayende's blog:
"NHibernate assume that all access to the database is done under a transaction, and strongly discourage any use of the session without a transaction."
For more details, here's a link to his blog posting:
http://ayende.com/Blog/archive/2008/12/28/nh-prof-alerts-use-of-implicit-transactions-is-discouraged.aspx