Conflicting class versions in Apache Flink

I have 2 applications. The first is a Play! Framework (v2.5.1) application. This application's job is to read the aggregated data. The second is an Apache Flink (v1.1.2) application. This application's job is to write the aggregated data.
The error
java.lang.NoSuchMethodError: com.typesafe.config.ConfigFactory.defaultApplication(Lcom/typesafe/config/ConfigParseOptions;)Lcom/typesafe/config/Config;
This is caused by Play & Flink using different versions of com.typesafe.config (1.3.0 vs 1.2.1).
I've tried
I've tried shading, but there are further complications once I get to Akka. Akka also has conflicting versions, so I shade both config and Akka, which leads to a configuration error in Akka. If I then duplicate the configuration at the shaded path, the ActorSystem fails to initialize because of an incorrect class version.
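For concreteness, the shading I tried looks roughly like this (an sbt-assembly 0.14+ sketch; the shaded package prefixes are arbitrary):

    // build.sbt — relocate the conflicting packages inside the fat jar
    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("com.typesafe.config.**" -> "shaded.typesafeconfig.@1").inAll,
      ShadeRule.rename("akka.**" -> "shaded.akka.@1").inAll
    )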
Research
I don't know this area well, but it seems that a number of JVM servers handle this with parent-last (child-first) class loading. Is that possible in Flink?
There may also be other, simpler solutions that I haven't tried. If there are, let me know and I'll gladly try them.
Thanks for your help!

Related

Camel readlock strategy in cluster

We are trying to move to a cluster with Apache Camel. So far we have had it on one node and it worked well.
One node:
I have the readLock strategy set to 'changed', which keeps track of file changes via a camelLock file, and only when the file has finished downloading is it picked up for processing. But the 'changed' readLock strategy is discouraged in clustering; according to the Camel documentation, 'idempotent' is recommended. This is what happens when I test with a 5GB file.
Two nodes:
I have the readLock strategy set to 'idempotent', which distributes files to one of the nodes, but Camel starts processing the file even before it has finished downloading.
Is there a way to stop Camel from processing a file before it has finished downloading when the readLock strategy is 'idempotent'?
Even though both "readLock=changed" and "readLock=idempotent" cause the file-consumer to wait, they really address quite different use-cases: while "readLock=changed" guards against the file being incomplete (i.e. still being written by some producer/sender), "readLock=idempotent" guards against a file being read by two consumer routes. It's a bit confusing that they're addressed by the same option.
First, to address the "changed" scenario: can the sender be changed so that it writes the file in one directory and then, when it is done writing, moves it into the directory being monitored by your file-consumer? (A rename on the same filesystem is atomic, so this lets the OS handle the problem instead of you trying to deal with it yourself.) This does not address the issue of multiple readers, however. If the sender is not under your control, I suggest you revert to readLock=changed.
Next, on multiple readers: one workaround is to have this route run on only one node of your cluster. This might seem to defeat the purpose of clustering, but it is quite possible that you're starting up additional nodes to help with other routes and you're fine with this particular route running on just one node. It's a bit of a hack, because all nodes are no longer equal, but it is still an option to consider. The simplest approach would be to start one node with an environment property that flags it as the node that will handle file-reading, or something similar.
If you do want the route on multiple nodes, you can start by using the option idempotent=true, but this is not good enough on its own. The option uses a repository in which it records which files have been read before, and the default repository is in-memory (i.e. each node has its own). So the default implementation is helpful if the same file is actually received more than once and you wish to skip it; if you want it to work across nodes, however, you have to use a different repository.
One central repository could be a database; in that case you can use Camel's JDBC- or JPA-based repositories. Or you could use something like Hazelcast. See here for your options: http://camel.apache.org/idempotent-consumer.html
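As a concrete sketch (assuming Camel 2.x, a shared javax.sql.DataSource, and made-up directory names), backing the file consumer with Camel's JDBC-based repository so that every node sees the same "already consumed" state:

    import javax.sql.DataSource
    import org.apache.camel.builder.RouteBuilder
    import org.apache.camel.impl.{DefaultCamelContext, SimpleRegistry}
    import org.apache.camel.processor.idempotent.jdbc.JdbcMessageIdRepository

    def startFileConsumer(dataSource: DataSource): Unit = {
      val registry = new SimpleRegistry
      // The repository table lives in the shared database, so a file consumed
      // on one node is skipped on all the others.
      registry.put("fileRepo", new JdbcMessageIdRepository(dataSource, "fileConsumer"))

      val context = new DefaultCamelContext(registry)
      context.addRoutes(new RouteBuilder {
        override def configure(): Unit =
          from("file:inbox?readLock=idempotent&idempotentRepository=#fileRepo")
            .to("file:outbox")
      })
      context.start()
    }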
You can use readLock=idempotent-changed.
idempotent-changed uses an idempotentRepository and changed together as the combined read-lock. This allows you to use read locks that support clustering, if the idempotent repository implementation supports that.
You can read more about these idempotent-changed options here: https://camel.apache.org/components/3.13.x/file-component.html
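As a sketch, the endpoint would look like this inside a RouteBuilder (the directory names and the #fileRepo reference are placeholders for a clustered repository such as the JDBC one above):

    // The file is picked up only once it has stopped changing AND has not yet
    // been recorded in the shared idempotent repository by any node.
    from("file:inbox?readLock=idempotent-changed&idempotentRepository=#fileRepo")
      .to("file:outbox")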
We also used readLock=changed in Docker clustered mode and it worked perfectly, since we set readLockMinAge to a certain interval.
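For example (a sketch with made-up paths and intervals):

    // readLockMinAge: don't even start the 'changed' check until the file is
    // at least 5 minutes old, giving slow transfers time to finish.
    from("file:inbox?readLock=changed&readLockMinAge=5m&readLockCheckInterval=10000")
      .to("file:outbox")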

How to set up a project and break it into sub-projects, and how to use Slick in this setup

This is a brand new project, so I can use the latest version of Play.
I am using IntelliJ 13.
I want to break out the models/db/service layer because I will also have a job service (reading messages off a queue, for example) that will need this service layer too.
Since Slick is outside of Play, how do I set up the datasource for this project, keeping in mind that I will be connecting to multiple databases?
Do I need to create a custom config file for this?
web-app (Play 2)
  - depends on: service
service (models + dao)
  - models
  - dao
jobs
  - depends on: service
I don't see any examples like this, which I find strange, because I think pretty much any real-world project (beyond simple examples) would have to be set up this way.
Can someone show me sample code where things are broken down like this?
This example isn't broken into sub-projects, but it is very split up and would allow you to specify multiple databases.
https://github.com/geigerma/play-cake
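Not a full example either, but here is a minimal sbt sketch of the layout from the question (all project names, versions, and the driver are assumptions, so treat it as a starting point):

    // build.sbt (root project)
    lazy val service = (project in file("service")) // models + dao, no Play dependency
      .settings(
        libraryDependencies ++= Seq(
          "com.typesafe.slick" %% "slick" % "3.1.1",
          "org.postgresql" % "postgresql" % "9.4.1212" // or your JDBC driver
        )
      )

    lazy val jobs = (project in file("jobs")) // queue consumers, reuses the service layer
      .dependsOn(service)

    lazy val webApp = (project in file("web-app")) // the Play application
      .enablePlugins(PlayScala)
      .dependsOn(service)
      .aggregate(service, jobs)

As for the datasource question: since Slick runs outside of Play here, the service module can own its own configuration. With Slick 3.x, Database.forConfig reads a named block from an application.conf on the classpath, which also covers connecting to multiple databases (the block names below are made up):

    import slick.jdbc.JdbcBackend.Database

    // application.conf in the service module, e.g.:
    //   mainDb      { url = "jdbc:postgresql://localhost/main",    driver = "org.postgresql.Driver", connectionPool = HikariCP }
    //   reportingDb { url = "jdbc:postgresql://localhost/reports", driver = "org.postgresql.Driver", connectionPool = HikariCP }
    object Databases {
      val main = Database.forConfig("mainDb")
      val reporting = Database.forConfig("reportingDb")
    }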

Isolate a FlywayException only for a migration error

During a Flyway migration, a FlywayException can be thrown in different cases: on a migration failure, when the given database URL cannot be found, etc.
Every time it is a FlywayException with a JdbcSQLException as the cause, but in my app I'd like to distinguish these cases in order to provide different behaviors.
Is there any way to do this?
I can see that a JdbcSQLException contains an SQLState; maybe that could be a solution, but I don't know if it's the best one.
It is never good to parse SQL exceptions, as they may change from version to version of the database. I would submit a feature request to Flyway to have specific exceptions instead of one generic FlywayException: https://github.com/flyway/flyway/issues.
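In the meantime, a fragile stopgap (a sketch; which SQLState values matter depends on your database) is to unwrap the cause and branch on the SQLState, as the question suggests:

    import java.sql.SQLException
    import org.flywaydb.core.Flyway
    import org.flywaydb.core.api.FlywayException

    def migrateWithDiagnostics(flyway: Flyway): Unit =
      try flyway.migrate()
      catch {
        case e: FlywayException =>
          Option(e.getCause) match {
            // SQLState class "08" covers connection exceptions (e.g. a bad URL or host)
            case Some(sql: SQLException) if Option(sql.getSQLState).exists(_.startsWith("08")) =>
              throw new IllegalStateException("Could not reach the database", e)
            // Any other SQL cause is most likely a failed migration statement
            case Some(sql: SQLException) =>
              throw new IllegalStateException(s"Migration failed (SQLState ${sql.getSQLState})", e)
            case _ =>
              throw e
          }
      }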

Migration patch from NServiceBus 2.6 to NServiceBus 3.0

I have an existing NServiceBus 2.6 application that I want to start moving to 3.0. I'm looking for the minimum-change upgrade in the first instance. Is it as simple as replacing the 2.6 DLLs with the 3.0 NuGet packages, or are there other considerations?
For the most part the application migration is quite straightforward, but depending on your configuration and environment, you may need to make the following changes:
The new convention over configuration for endpoints may mean you will need to rename your endpoints to match your queue names (@andreasohlund has a good post about this).
Persistence of sagas, timeouts, subscriptions, etc. now defaults to RavenDB, so if you use SQL Server to persist data, you need to make sure you have the correct profile and endpoint configuration. For SQL Server storage, make sure you add a reference to NServiceBus.NHibernate, as it is no longer part of the core.
Error queues are now configured differently: use MessageForwardingInCaseOfFaultConfig instead of the error property on the regular MsmqTransportConfig (see the snippet below). You should still be able to use the old property, but NServiceBus will look for MessageForwardingInCaseOfFaultConfig first.
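For example, the app.config section looks roughly like this (a sketch from memory; double-check the section type against the 3.0 documentation):

    <configSections>
      <!-- Registers the new error-queue configuration section -->
      <section name="MessageForwardingInCaseOfFaultConfig"
               type="NServiceBus.Config.MessageForwardingInCaseOfFaultConfig, NServiceBus.Core" />
    </configSections>

    <!-- Replaces the ErrorQueue attribute previously set on MsmqTransportConfig -->
    <MessageForwardingInCaseOfFaultConfig ErrorQueue="error" />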
Other than that, I don't think you need to do anything else to get your upgrade working. I modified some of my message definitions to take advantage of the new ICommand and IEvent interfaces as a way of communicating intent more clearly.
Anyway, I'm sure there will be some cases specific to your environment that require different changes, but I hope this helps a bit.

How to install Jena SemanticWeb Framework in Play Framework

I put the Jena jar files in the lib folder and see the message:
A JPA error occurred (Cannot start a JPA manager without a properly configured database): No datasource configured
What am I doing wrong?
I found the answer. This was a problem in Play: for some reason, something had been inserted in front of the class directive from the javax module. I don't know why it happened; I simply removed it and it worked.
This error is not related to Jena, because if you don't choose a model (dataset) when you execute a query, you get a different message: "No dataset description for query", with a com.hp.hpl.jena.query.QueryExecException. But if you chose Jena as a datasource in Play, you may get your message (sorry, but I don't know much about Play).
What operations are you doing with Jena?
I don't know Jena well, but it seems that some persistent ontologies might be stored in a database. Would that mean Jena needs a database connection?
Is this error from Jena, or from Play?
What are you trying to do in your code before getting this error?
If Jena requires some configuration and resource creation before you can use it, you should think about creating a little Jena Play plugin to initialize your Jena context...
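One quick way to check where the error comes from: run a query against a purely in-memory model, with no datasource at all (a Scala sketch using the old com.hp.hpl.jena API mentioned above; the file name is made up). If this works outside your Play code, the "No datasource configured" error is coming from Play's JPA setup rather than from Jena:

    import com.hp.hpl.jena.query.{QueryExecutionFactory, QueryFactory}
    import com.hp.hpl.jena.rdf.model.ModelFactory

    // In-memory model: Jena itself needs no database connection for this.
    val model = ModelFactory.createDefaultModel()
    model.read("file:ontology.rdf") // hypothetical RDF file

    val query = QueryFactory.create("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
    val qexec = QueryExecutionFactory.create(query, model)
    try {
      val results = qexec.execSelect()
      while (results.hasNext) println(results.next())
    } finally qexec.close()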