Jackrabbit Clustering Configuration - lucene

My application uses the standalone version of Jackrabbit, and we want to move to embedded mode so that we can cluster it.
I read the requirements on the Jackrabbit clustering page but am still confused. Should I have a different home directory for each cluster node? That is, if I need to configure two nodes, do I need ~/node1/repository.xml and ~/node2/repository.xml, or can they share the same ~/node/repository.xml?

As described in the Clustering Overview, "each cluster node needs its own (private) repository directory, including repository.xml file, workspace FileSystem and Search index."
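To make that concrete, here is a minimal sketch of the <Cluster> section that each node's private repository.xml would contain, assuming a shared database journal; the JDBC driver, URL and credentials below are placeholders. The cluster node id must be unique per node, while the journal settings are identical everywhere; the search index and workspace FileSystem stay local to each node.

    <!-- Sketch for ~/node1/repository.xml: per-node id, shared journal. -->
    <Cluster id="node1" syncDelay="2000">
      <Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
        <param name="revision" value="${rep.home}/revision.log"/>
        <param name="driver" value="com.mysql.jdbc.Driver"/>      <!-- placeholder -->
        <param name="url" value="jdbc:mysql://db-host/journal"/>  <!-- placeholder -->
        <param name="user" value="jcr"/>                          <!-- placeholder -->
        <param name="password" value="secret"/>                   <!-- placeholder -->
        <param name="databaseType" value="mysql"/>
      </Journal>
    </Cluster>

The second node would keep the same Journal settings but use id="node2" and its own repository home (and therefore its own local search index).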

Related

How to redirect the Apache log in Kubernetes

I have one namespace and one Deployment (ReplicaSet). My Apache logs should be written outside the pod. How is this possible in Kubernetes?
You should specify more precisely what you mean by "outside the pod", but as David Maze has already suggested in his comment, take a closer look at the Logging Architecture section of the official Kubernetes documentation. Depending on what exactly you need, a different solution may be the most suitable in your case.
As you can read there:
Kubernetes provides no native storage solution for log data, but you can integrate many existing logging solutions into your Kubernetes
cluster ... Cluster-level logging architectures are described in assumption that a logging backend is present inside or outside of your cluster.
The three most popular cluster-level logging architectures mentioned there are:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
The second solution is widely used. Unlike the third one, where pushing the logs has to be handled by your application container, the sidecar approach is application-independent, which makes it a much more flexible solution.
To complicate matters a little, it can be implemented in two different ways (a minimal sketch of the first variant follows the list):
Streaming sidecar container
Sidecar container with a logging agent
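As an illustration of the first variant, here is a minimal, hypothetical streaming-sidecar Pod: the application container writes its log files to a shared emptyDir volume, and the sidecar tails them to its own stdout, where kubectl logs or a node-level agent can pick them up. The image names, paths and file names are placeholders, and your Apache configuration would need to write its access/error logs into the mounted directory rather than to stdout.

    # Streaming-sidecar sketch: the app writes to a file on a shared volume,
    # the sidecar streams that file to its own stdout.
    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-with-log-sidecar        # hypothetical name
    spec:
      containers:
      - name: app
        image: httpd:2.4                   # placeholder for your Apache image
        volumeMounts:
        - name: logs
          mountPath: /usr/local/apache2/logs
      - name: log-streamer
        image: busybox
        args: ["/bin/sh", "-c", "touch /logs/access_log; tail -n+1 -f /logs/access_log"]
        volumeMounts:
        - name: logs
          mountPath: /logs
      volumes:
      - name: logs
        emptyDir: {}

The second variant replaces the tail command with a full logging agent (for example fluentd) that ships the files directly to your logging backend.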

Hortonworks schema registry cluster mode

I'm using Hortonworks Schema Registry with NiFi and things are working fine. I have installed Hortonworks Schema Registry on a single node, and I'm worried about what will happen to my NiFi flows if that machine goes down. I have seen in the Hortonworks Schema Registry architecture that we can use MySQL, PostgreSQL or in-memory storage for storing schemas. AFAIK none of them is a distributed system. Is there any way to achieve cluster mode for high availability?
Sure, you can do active-active or active-passive replication for MySQL and Postgres, but that is left up to you to implement; Hortonworks will likely forward you to the respective documentation for each tool, which is why the documentation doesn't guide you through these design decisions itself. Either way, you should be aware of the drawbacks of having a SPoF.
The Schema Registry itself is just a web app, so you could put it behind your favorite reverse proxy, or run it within a container orchestrator, such as the Docker support in HDP 3.x.
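If you do go the orchestrator route, a rough sketch (assuming Kubernetes rather than HDP's own Docker support, with a hypothetical image name and an assumed default port of 9090) is simply a Deployment with a couple of stateless replicas behind one Service, all pointing at the same external, separately replicated MySQL/Postgres:

    # Hypothetical sketch: stateless Schema Registry replicas behind one Service;
    # the schemas themselves still live in the shared external database.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: schema-registry
    spec:
      replicas: 2
      selector:
        matchLabels: {app: schema-registry}
      template:
        metadata:
          labels: {app: schema-registry}
        spec:
          containers:
          - name: registry
            image: my-repo/hortonworks-schema-registry:0.5   # placeholder image
            ports:
            - containerPort: 9090                            # assumed default port
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: schema-registry
    spec:
      selector: {app: schema-registry}
      ports:
      - port: 9090
        targetPort: 9090

NiFi would then be pointed at the Service (or reverse-proxy) URL instead of a single host, so losing one replica does not break your flows.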

Do I need multiple masters on OKD?

So I have a question regarding setting up OKD for our needs - our team has already established that Kubernetes is basically the simplest way for us to manage our stack. We don't have too much workload; probably 3 dedicated servers could work through all of it, but we have a lot of services and tools that are best served by running in docker containers, and we also strongly benefit from running our fairly monolithic core application as a container to make deployment and maintenance simpler.
The question, though, is how many nodes we need; specifically, whether we need HA master nodes.
From the documentation, it seems that Infrastructure nodes are responsible for routing. Does this mean that even if the master node goes down, the other nodes are still available and routing works, so long as domains point at the infrastructure nodes? Or would a failed master make all the other nodes unreachable?
In our environment the router pods run on the infra nodes, and we can safely turn off the master node without impact on the applications.
master node: api, controllers, etcd
infra node: registry, router, metrics, logging etc.
With the master turned off you just can't manage the cluster; the rest works fine. It is good to have more than one master node for etcd redundancy, but with such a small environment I think it makes no sense to maintain more.

Kubernetes Custom Volume Plugin with Dynamic Provisioning

I have a proprietary file-system and I would like to use it to provide file storage to my K8S pods. I am currently running K8S v1.5.1, but am open to upgrading to 1.6 if need be.
I would like to make use of Dynamic Provisioning so that the volumes are created on need basis. I went through the official documentation on kubernetes.io and this is what I have understood so far:
I need to write a Kubernetes custom volume plugin for my proprietary file-system.
I need to create a StorageClass which makes use of a provisioner that provisions volumes from my proprietary filesystem.
I then create a PVC that refers to my StorageClass.
I then create my Pods referring to my StorageClass by name.
What I am not able to make out is:
Are the provisioner referred to by the StorageClass and the K8S volume plugin one and the same? If they are different, how?
There is mention of an external provisioner in the K8S documentation. Does this mean I can write the K8S volume plugin for my filesystem out-of-tree (outside the K8S code)?
My filesystem provides REST APIs to create filesystem volumes. Can I invoke them in my provisioner/volume plugin?
If I write an out-of-tree plugin, how do I load it into my K8S cluster so that it can be used to provision volumes using the StorageClass?
Appreciate any help in answering any or all of the above.
Thanks!
Are the provisioner referred to by the StorageClass and the K8S volume plugin one and the same? If they are different, how?
It should be the same if you want to provision the storage using that plugin.
There is mention of an external provisioner in the K8S documentation. Does this mean I can write the K8S volume plugin for my filesystem out-of-tree (outside the K8S code)?
Yes, that's correct.
My filesystem provides REST APIs to create filesystem volumes. Can I invoke them in my provisioner/volume plugin?
Yes, as long as the client is part of the provisioner code.
If I write an out-of-tree plugin, how do I load it into my K8S cluster so that it can be used to provision volumes using the StorageClass?
It can run as a container, or you can invoke it via a binary execution model.
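Pulling those answers together, here is a hedged sketch of how the pieces could fit: the out-of-tree provisioner runs as an ordinary Deployment inside the cluster, the StorageClass names it via the provisioner field, and every PVC that references the class is then handled by that container (which is where your filesystem's REST API would be called). All names here are hypothetical, RBAC/service-account wiring is omitted, and on a 1.5/1.6 cluster you would use the older beta API versions.

    # Hypothetical external provisioner wired to a StorageClass.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: my-fs-storage
    provisioner: example.com/my-fs                    # hypothetical provisioner name
    parameters:
      apiEndpoint: "https://my-fs.example.com/api"    # hypothetical REST endpoint
    ---
    # The provisioner itself runs as a normal Deployment in the cluster.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-fs-provisioner
    spec:
      replicas: 1
      selector:
        matchLabels: {app: my-fs-provisioner}
      template:
        metadata:
          labels: {app: my-fs-provisioner}
        spec:
          containers:
          - name: provisioner
            image: example.com/my-fs-provisioner:latest   # hypothetical image
    ---
    # A claim that asks the class above for a dynamically provisioned volume;
    # Pods then mount this PVC.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-fs-claim
    spec:
      storageClassName: my-fs-storage
      accessModes: [ReadWriteOnce]
      resources:
        requests:
          storage: 10Gi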

IBM Worklight 6.2. Analytics JNDI properties in WAS ND

About Worklight 6.2 Analytics.
https://www-01.ibm.com/support/knowledgecenter/api/content/SSZH4A_6.2.0/com.ibm.worklight.monitor.doc/monitor/t_setting_up_production_cluster.html
There are several JNDI properties to configure, but it is not explained how to configure them in WAS ND, nor in which scope they must be configured (if that makes sense).
For example, the worklight.properties values are configured as application properties during the application installation.
How are the Analytics JNDI properties configured on WAS?
Also, in which scope should they be configured? This is also puzzling me. For example, the documentation says that properties like "analytics/shards" or "analytics/replicas_per_shard" must be configured on the first node, but to me these look like properties that should be configured at cluster level, not at node level.
Also, a WAS ND topology is completely dynamic and flexible; what happens if I remove that "first" node?
OK, now I understand that when the Worklight Analytics documentation talks about a cluster, it is not talking about a WAS cluster but about an Elasticsearch cluster.
Taking this into account, configuring a cluster for Analytics does not mean installing analytics.war in a WAS cluster; it means that you install the analytics.war file on a number of WAS servers (not WAS clusters, not WAS nodes), and with the Elasticsearch properties you configure the Elasticsearch cluster.
Is this correct?
The specific answer to my question is that the values of the properties are set during the detailed installation of the analytics.war file, just as is done with the application project WAR file, worklightadmin.war or worklightconsole.war.
You only need to set those properties if you are configuring Analytics on more than one server.