I'm looking for some help with an ApacheDS Multi Master solution.
I'm new to setting up LDAP solutions, so it's quite possible that I'm making some pretty basic errors.
I have two CentOS VMs running, LDAP1 and LDAP2, each with a running ApacheDS instance.
LDAP1 is running ApacheDS on port 10389 and LDAP2 is running ApacheDS on port 10399.
I can connect to both servers using Apache Directory Studio with no problems and can see the default structures listed.
I have then imported the SevenSeas structure into LDAP1, and enabled DEBUG for both LDAP1 and LDAP2 for replication by uncommenting the
lines in /instances/default/conf/log4j.properties
log4j.logger.org.apache.directory.server.PROVIDER_LOG=DEBUG
log4j.logger.org.apache.directory.server.CONSUMER_LOG=DEBUG
I then attempt to create the MultiMaster config following this guide : http://joacim.breiler.com/apacheds/ch08s02.html
1) I enable the replication handler on LDAP1 by importing the following LDIF
dn: ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
changetype: modify
add: ads-replReqHandler
ads-replReqHandler: org.apache.directory.server.ldap.replication.provider.SyncReplRequestHandler
2) I then enable the replication handler on LDAP2 by importing the same LDIF as above
3) I then restart both LDAP1 and LDAP2 ApacheDS servers.
4) Once the servers have restarted I check apacheDS.log and see the following entries (on both LDAP1 and LDAP2)
[08:41:28] DEBUG [org.apache.directory.server.PROVIDER_LOG] - initializing the syncrepl provider
[08:41:28] DEBUG [org.apache.directory.server.PROVIDER_LOG] - Starting the replication consumer manager
[08:41:28] DEBUG [org.apache.directory.server.PROVIDER_LOG] - no replica logs found to initialize
[08:41:28] DEBUG [org.apache.directory.server.PROVIDER_LOG] - syncrepl provider initialized successfully
5) I then import the following LDIF on LDAP2 (No Errors are generated)
dn: ads-replConsumerId=1,ou=replConsumers,ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
objectClass: ads-base
objectClass: ads-replConsumer
objectClass: top
ads-replAliasDerefMode: never
ads-replAttributes: *
ads-replConsumerId: 1
ads-replProvHostName: ldap1
ads-replProvPort: 10389
ads-replRefreshInterval: 60000
ads-replRefreshNPersist: true
ads-replSearchFilter: (objectClass=*)
ads-replSearchScope: sub
ads-replSearchSizeLimit: 0
ads-replSearchTimeOut: 0
ads-replUserDn: uid=admin,ou=system
ads-replUserPassword:: c2VjcmV0
ads-searchBaseDN: o=SevenSeas
6) I then import the following LDIF on LDAP1
dn: ads-replConsumerId=2,ou=replConsumers,ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
objectClass: ads-base
objectClass: ads-replConsumer
objectClass: top
ads-replAliasDerefMode: never
ads-replAttributes: *
ads-replConsumerId: 2
ads-replProvHostName: ldap2
ads-replProvPort: 10399
ads-replRefreshInterval: 60000
ads-replRefreshNPersist: true
ads-replSearchFilter: (objectClass=*)
ads-replSearchScope: sub
ads-replSearchSizeLimit: 0
ads-replSearchTimeOut: 0
ads-replUserDn: uid=admin,ou=system
ads-replUserPassword:: c2VjcmV0
ads-searchBaseDN: o=SevenSeas
7) I then attempt to restart the LDAP1 and LDAP2 ApacheDS servers and hit the following error on both servers (reported in ApacheDS.log)
[08:52:42] ERROR [org.apache.directory.server.config.ConfigPartitionReader] - An error occured while reading the configuration DN 'ou=replConsumers,ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config' for the objectClass 'ads-replConsumer':
ERR_04274 Can't find an OID for the name ads-base
[08:52:42] ERROR [org.apache.directory.server.UberjarMain] - Failed to start the service.
org.apache.directory.server.config.ConfigurationException: An error occured while reading the configuration DN 'ou=replConsumers,ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config' for the objectClass 'ads-replConsumer':
ERR_04274 Can't find an OID for the name ads-base
at org.apache.directory.server.config.ConfigPartitionReader.read(ConfigPartitionReader.java:641)
at org.apache.directory.server.config.ConfigPartitionReader.read(ConfigPartitionReader.java:600)
at org.apache.directory.server.config.ConfigPartitionReader.read(ConfigPartitionReader.java:600)
at org.apache.directory.server.config.ConfigPartitionReader.readConfig(ConfigPartitionReader.java:754)
at org.apache.directory.server.config.ConfigPartitionReader.readConfig(ConfigPartitionReader.java:718)
at org.apache.directory.server.config.ConfigPartitionReader.readConfig(ConfigPartitionReader.java:690)
at org.apache.directory.server.ApacheDsService.start(ApacheDsService.java:177)
at org.apache.directory.server.UberjarMain.start(UberjarMain.java:76)
at org.apache.directory.server.UberjarMain.main(UberjarMain.java:54)
8) I then reverted the config to before the import to allow me to restart the servers
9) If I remove the following line from the LDIF used in point 5 / 6, I'm then able to start the servers.
objectClass: ads-base
10) The debug log (on both LDAP1 and LDAP2) suggests that the replication is working
[09:02:31] DEBUG [org.apache.directory.server.PROVIDER_LOG] - initializing the syncrepl provider
[09:02:31] DEBUG [org.apache.directory.server.PROVIDER_LOG] - Starting the replication consumer manager
[09:02:31] DEBUG [org.apache.directory.server.PROVIDER_LOG] - no replica logs found to initialize
[09:02:31] DEBUG [org.apache.directory.server.PROVIDER_LOG] - syncrepl provider initialized successfully
11) I then log in to LDAP2 and do not see the partition o=SevenSeas, which to me says the replication hasn't worked. Have I missed a stage?
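A side note on the `ads-replUserPassword:: c2VjcmV0` line in steps 5 and 6: in LDIF, a double colon after the attribute name marks the value as base64-encoded. The following minimal Python sketch shows how to check what such a value contains. The helper name `decode_ldif_value` is made up for illustration and is not part of any LDAP tool.

```python
import base64
from typing import Tuple


def decode_ldif_value(line: str) -> Tuple[str, str]:
    """Split one LDIF attribute line; decode the value if it uses '::' (base64)."""
    attr, _, rest = line.partition(":")
    if rest.startswith(":"):
        # 'attr:: <base64>' form
        return attr, base64.b64decode(rest[1:].strip()).decode("utf-8")
    # plain 'attr: value' form
    return attr, rest.strip()


print(decode_ldif_value("ads-replUserPassword:: c2VjcmV0"))
```

This handles only single-line values; folded (continued) LDIF lines are not covered.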
UPDATE
I've been able to make some more progress on this. While using Apache Directory Studio, I stumbled on the server configuration tabs (right-click on the LDAP connection). Among them is a Replication tab.
This allows you to add consumers. On LDAP1 I added a consumer pointing to LDAP2, and on LDAP2 I added a consumer pointing to LDAP1, with the base DN set to ou=system. The consumer IDs appear to need to match the IDs given in points 5 and 6.
I then restarted both the LDAP servers.
When the servers restarted I could see (via the debug log) that they were talking to each other. I made an edit to the ou=system partition on LDAP1 and it was replicated to LDAP2. I then made an edit to the ou=system partition on LDAP2 and it was replicated to LDAP1.
My issue is now that I cannot replicate other partitions - no matter what the BASE DN is in the consumer config.
I guess we started working on the same issue at the same time today. The replication to a custom partition seems to work when you set the cache to 1000 for the partition.
I have three nodes running Solr and ZooKeeper with TLS/SSL enabled, where ZooKeeper listens only on the securePort and Solr only on HTTPS.
Now I want to connect Solr to Apache Ranger for audit logs,
where I am setting:
ranger.audit.solr.urls = https://HOST1:8983/solr/ranger_audits
and
ranger_admin_solr_zookeepers = HOST1:2281,HOST2:2281,HOST3:2281
Apache Ranger is not in SSL mode and listens only on HTTP.
For Solr I have successfully created the ranger_audits configset and a collection with the same name.
The ZooKeeper election is also successful, with 1 leader and 2 followers.
So everything works as expected except the Apache Ranger audit communication.
The version of the Apache Ranger is 2.0.
ZooKeeper version - 3.6.3
Solr version - 8.11.1
With the current settings I get the following exception when I open the Audit tab in the Ranger UI:
2022-03-22 06:54:08,189 [http-bio-6080-exec-2] INFO org.apache.ranger.common.RESTErrorUtil (RESTErrorUtil.java:326) - Operation error. response=VXResponse={org.apache.ranger.view.VXResponse#7ef95c52statusCode={1} msgDesc={Error running solr query, please check solr configs. java.util.concurrent.TimeoutException: Could not connect to ZooKeeper HOST1:2281,HOST2:2281,HOST3:2281 within 15000 ms} messageList={[VXMessage={org.apache.ranger.view.VXMessage#3bd495a3name={ERROR_SYSTEM} rbKey={xa.error.system} message={System Error. Please try later.} objectId={null} fieldName={null} }]} }
javax.ws.rs.WebApplicationException
UPDATE:
The solution was to provide a jaas.conf and the following Java system properties, which fixed the problem.
-Dzookeeper.client.secure=true
-Djava.security.auth.login.config=/etc/ranger/admin/conf/jaas.conf
A sample jaas.conf:
Client {
org.apache.zookeeper.server.auth.DigestLoginModule required
username="admin"
password="admin-pass";
};
Please note that this is not a complete solution; the connection from Ranger through the TLS-enabled ZooKeepers is still problematic.
I'm connected to my LDAP connection.
I'm trying to import an LDIF file to it by right clicking and using the wizard.
As I choose the file, check the overwrite option and press OK,
I get the following error:
Error while importing LDIF
javax.naming.NameAlreadyBoundException:
at org.apache.directory.studio.connection.core.io.api.DirectoryApiConnectionWrapper.checkResponse(DirectoryApiConnectionWrapper.java:1359)
And this appears in the Modification Logs tab:
#!RESULT ERROR
#!CONNECTION ldap://192.168.99.100:389
#!DATE 2018-01-24T11:01:17.743
#!ERROR
dn: dc=mycompany,dc=net
changetype: add
dc: mycompany
objectclass: dcObject
objectclass: organization
o: mycompany.net
I tried googling around with the error but can't find anything on this particular matter.
Also, even after uninstalling and reinstalling the program, as soon as I select the LDIF file it warns that the "selected logfile already exists".
Judging by its description, javax.naming.NameAlreadyBoundException is simply JNDI's odd name for the LDAP result code entryAlreadyExists (68), returned when processing an LDAP Add operation.
It means just that: An LDAP entry with this DN already exists. You cannot add a second one with the same DN.
New to LDAP. Exported the DIT as an LDIF from Apache Studio. Tried to import the LDIF file. Error occurs:
...
#!ERROR [LDAP: error code 32 - Unable to add entry 'dc=example,dc=com' because its parent entry 'dc=com' does not exist in the server.]
dn: dc=example,dc=com
changetype: add
dc: example
objectClass: domain
objectClass: top
The LDAP server is UnboundID LDAP SDK for Java 3.2.0.
I don't know enough LDAP to fix it.
Should Apache Studio have created dc=com before this entry?
LDIF export does not guarantee ordering. LDIF import assumes ordering (parents before children).
So to answer your question, yes, you should have created dc=com first.
There is a subtle exception: you can have dc=example.com as a single node, which looks confusing, but periods are legal in a name.
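Since the import assumes parents come before children, one workaround is to reorder the exported file before importing it. The following is a minimal Python sketch under simplifying assumptions: records are separated by blank lines, DNs are plain `dn:` lines (not base64 `dn::` and not folded across lines), and the function names are made up for illustration. It only reorders entries; it cannot create a parent such as dc=com that is missing from the export.

```python
import re


def dn_depth(dn: str) -> int:
    """Count RDN components, ignoring commas escaped with a backslash."""
    return len(re.split(r"(?<!\\),", dn))


def sort_ldif_parents_first(ldif_text: str) -> str:
    """Reorder LDIF records so shallower DNs (parents) come before deeper ones."""
    records = [r.strip() for r in re.split(r"\n\s*\n", ldif_text.strip()) if r.strip()]

    def depth(record: str) -> int:
        for line in record.splitlines():
            if line.lower().startswith("dn:"):
                return dn_depth(line[3:].strip())
        return 0  # records without a dn line (e.g. "version: 1") stay first

    # sorted() is stable, so entries at the same depth keep their original order
    return "\n\n".join(sorted(records, key=depth)) + "\n"
```

Sorting by depth is enough here because a parent DN always has fewer components than any of its children.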
I am trying to cluster Ehcache and Lucene with the Liferay 6.2 EE SP2 bundle on 2 servers with multicast enabled. We have Apache HTTPD servers fronting the Tomcat servers as a reverse proxy. A valid 6.2 license is deployed on both nodes.
We use the following properties in portal-ext.properties:
cluster.link.enabled=true
lucene.replicate.write=true
ehcache.cluster.link.replication.enabled=true
# Since we are using SSL on the frontend
web.server.protocol=https
# set this to any server that is visible to both the nodes
cluster.link.autodetect.address=dbserverip:dbport
#ports and ips we know work in our environment for multicast
multicast.group.address["cluster-link-control"]=ip
multicast.group.port["cluster-link-control"]=port1
multicast.group.address["cluster-link-udp"]=ip
multicast.group.port["cluster-link-udp"]=port2
multicast.group.address["cluster-link-mping"]=ip
multicast.group.port["cluster-link-mping"]=port3
multicast.group.address["hibernate"]=ip
multicast.group.port["hibernate"]=port4
multicast.group.address["multi-vm"]=ip
multicast.group.port["multi-vm"]=port5
We are running into issues with the Ehcache and Lucene clustering not working. The following test fails:
A portlet moved on node 1 does not show up on node 2
There are no errors except for a startup error with lucene.
14:19:35,771 ERROR [CLUSTER_EXECUTOR_CALLBACK_THREAD_POOL-1][LuceneHelperImpl:1186] Unable to load index for company 10157
com.liferay.portal.kernel.exception.SystemException: java.net.ConnectException: Connection refused
at com.liferay.portal.search.lucene.LuceneHelperImpl.getLoadIndexesInputStreamFromCluster(LuceneHelperImpl.java:488)
at com.liferay.portal.search.lucene.LuceneHelperImpl$LoadIndexClusterResponseCallback.callback(LuceneHelperImpl.java:1176)
at com.liferay.portal.cluster.ClusterExecutorImpl$ClusterResponseCallbackJob.run(ClusterExecutorImpl.java:614)
at com.liferay.portal.kernel.concurrent.ThreadPoolExecutor$WorkerTask._runTask(ThreadPoolExecutor.java:682)
at com.liferay.portal.kernel.concurrent.ThreadPoolExecutor$WorkerTask.run(ThreadPoolExecutor.java:593)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
at sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:160)
at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:275)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:371)
We verified that JGroups multicast works outside of Liferay by running the following commands with a downloaded copy of jgroups.jar, substituting each of our 5 multicast IPs and ports.
Testing with JGROUPS
1) McastReceiver -
java -cp ./jgroups.jar org.jgroups.tests.McastReceiverTest -mcast_addr 224.10.10.10 -port 5555
ex. java -cp jgroups-final.jar org.jgroups.tests.McastReceiverTest -mcast_addr 224.10.10.10 -port 5555
2) McastSender -
java -cp ./jgroups.jar org.jgroups.tests.McastSenderTest -mcast_addr 224.10.10.10 -port 5555
ex. java -cp jgroups-final.jar org.jgroups.tests.McastSenderTest -mcast_addr 224.10.10.10 -port 5555
From there, typing things into the McastSender will result in the Receiver printing it out.
Thanks!
After a lot of troubleshooting and help from various folks on my team and at Liferay support, we switched to unicast, and it worked a lot better.
Here is what we did:
Extracted jgroups.jar from tomcat home/webapps/ROOT/WEB-INF/lib and saved it locally.
Unzipped the jgroups.jar file and saved the tcp.xml from the jar's WEB_INF folder
As a baseline test, changed the TCPPING section in tcp.xml and saved:
<TCPPING timeout="3000"
initial_hosts="${jgroups.tcpping.initial_hosts:servername1[7800],servername2[7800]}"
port_range="1"
num_initial_members="10"/>
Copy the tcp.xml to the liferay home on both the nodes
Change portal-ext.properties to remove the multicast properties and add the following lines.
cluster.link.channel.properties.control=${liferay.home}/tcp.xml
cluster.link.channel.properties.transport.0=${liferay.home}/tcp.xml
Start node 1
start node 2
check logs
Do the cluster cache test:
Moving a portlet on node 1, shows up on node 2
Under control panel -> License manager both the nodes show up with valid licenses.
Searching for a user on node 2 after adding it on node 1 under Control Panel -> Users and Organizations.
All of the above tests worked.
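Because TCPPING relies on each node being able to open a TCP connection to the others on the listed port, a quick reachability check can rule out firewall or binding problems before the nodes are started. A minimal Python sketch follows; the function name `can_connect` is made up for illustration, and the host names and port 7800 come from the initial_hosts line above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers connection refused, timeouts and name resolution failures
        return False
```

For example, calling `can_connect("servername2", 7800)` on node 1 while node 2 is running should return True if the JGroups port is reachable. A False result points at the network rather than at the Liferay configuration.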
So we shut down the servers and changed tcp.xml to use JDBC_PING rather than TCPPING, so we don't have to specify node names manually.
Steps for the JDBC_PING config:
Create the table in the liferay database manually.
CREATE TABLE JGROUPSPING (own_addr varchar(200) not null, cluster_name varchar(200) not null, ping_data blob default null, primary key (own_addr, cluster_name))
Change tcp.xml: remove the TCPPING section and add the following.
<JDBC_PING datasource_jndi_name="java:comp/env/jdbc/LiferayPool"
initialize_sql="" />
Save and push the file manually to both the nodes.
Start the servers and repeat tests above.
It should work seamlessly.
It was invaluable to have the debug logging on for jgroups, as mentioned in the following post:
https://bitsofinfo.wordpress.com/2014/05/21/clustering-liferay-globally-across-data-centers-gslb-with-jgroups-and-relay2/
This is the tomcat home/webapps/ROOT/WEB-INF/classes/META-INF/portal-log4j-ext.xml file I used to triage various clustering-related issues on bootup.
<?xml version="1.0"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
<category name="com.liferay.portal.cluster">
<priority value="TRACE" />
</category>
<category name="com.liferay.portal.license">
<priority value="TRACE" />
</category>
</log4j:configuration>
We also found that the Lucene cluster replication startup errors were fixed in a fix pack and are getting a patch for it.
https://issues.liferay.com/browse/LPS-51714
https://issues.liferay.com/browse/LPS-51428
We added the following portal instance properties for lucene replication to work better between the 2 nodes:
portal.instance.http.port=port that the app servers listen on ex. 8080
portal.instance.protocol=http
Hope this helps someone.
Update
The issue with loading the Lucene index in a cluster was resolved by a Liferay 6.2 EE patch from support for the LPS tickets mentioned above.
I have 2 ApacheDS services running on my workstation (Windows 7). One runs as a Windows service, and I run the other from its jar file on the command line using
java -jar "C:\Program Files (x86)\ApacheDS - Instance2\lib\apacheds-service-2.0.0-M15.jar" "C:\Program Files (x86)\ApacheDS - Instance2\instances\instance2"
The first ApacheDS installation is at C:\Program Files (x86)\ApacheDS and runs on port 10389. The second service runs on port 11389.
Using Apache Directory Studio I can connect to both instances running on my workstation, and there are no errors on the console.
Using the following LDIF file I imported the settings for Instance 1, which will be the master
dn: ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
changetype: modify
add: ads-replReqHandler
ads-replReqHandler: org.apache.directory.server.ldap.replication.provider.SyncReplRequestHandler
Then I have also imported following ldif file to Instance 1/master -
dn: ads-replConsumerId=1,ou=replConsumers,ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
changetype: add
ads-replRefreshNPersist: TRUE
ads-replAliasDerefMode: never
ads-replProvPort: 10389
ads-replSearchSizeLimit: 0
ads-replProvHostName: localhost
objectClass: ads-replConsumer
objectClass: ads-base
ads-replUserDn: uid=admin, ou=system
ads-replRefreshInterval: 60000
ads-replUserPassword: secret
ads-replConsumerId: 1
ads-replAttributes: *
ads-replSearchTimeOut: 0
ads-replSearchScope: sub
ads-replSearchFilter: (objectClass=*)
ads-searchBaseDN: ou=system
I added a few users under ou=users,ou=system on the master, but nothing gets replicated to the slave. Neither instance's console shows any sign that the two instances are trying to talk to each other, so I think this configuration is wrong or incomplete. I could not find anything in the ApacheDS documentation about what needs to be added to the consumer configuration on Instance 2 (the slave). Am I missing something?
Thanks !
ApacheDS 2.0.0-M15 had a replication bug. It has already been fixed, and the fix will be in M16. I built the installers from the ApacheDS svn trunk and ran 2 separate instances. I added the following config on the provider/master (running on port 10389) -
dn: ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
changetype: modify
add: ads-replReqHandler
ads-replReqHandler: org.apache.directory.server.ldap.replication.provider.SyncReplRequestHandler
Added following config to the consumer/slave instance (Running on 11389) -
dn: ads-replConsumerId=1,ou=replConsumers,ads-serverId=ldapServer,ou=servers,ads-directoryServiceId=default,ou=config
changetype: add
ads-replRefreshNPersist: TRUE
ads-replAliasDerefMode: never
ads-replProvPort: 10389
ads-replSearchSizeLimit: 0
ads-replProvHostName: localhost
objectClass: ads-replConsumer
objectClass: ads-base
ads-replUserDn: uid=admin, ou=system
ads-replRefreshInterval: 60000
ads-replUserPassword: secret
ads-replConsumerId: 1
ads-replAttributes: *
ads-replSearchTimeOut: 0
ads-replSearchScope: sub
ads-replSearchFilter: (objectClass=*)
ads-searchBaseDN: ou=system
I restarted both instances, added an entry on the provider, and it was replicated to the consumer.
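To make the direction of the setup explicit: the consumer entry lives on the consumer instance and names the provider's host and port. The following Python sketch renders such an entry from a few parameters. It only builds LDIF text using the attribute names from the config above; the function name `consumer_ldif` and its defaults are assumptions for illustration, and the output still has to be imported on the consumer by hand.

```python
def consumer_ldif(consumer_id: int, provider_host: str, provider_port: int,
                  base_dn: str, user_dn: str = "uid=admin,ou=system",
                  password: str = "secret") -> str:
    """Render an ads-replConsumer entry that points at the given provider."""
    dn = (f"ads-replConsumerId={consumer_id},ou=replConsumers,"
          "ads-serverId=ldapServer,ou=servers,"
          "ads-directoryServiceId=default,ou=config")
    lines = [
        f"dn: {dn}",
        "changetype: add",
        "objectClass: ads-replConsumer",
        "objectClass: ads-base",
        f"ads-replConsumerId: {consumer_id}",
        f"ads-replProvHostName: {provider_host}",  # the provider, not the consumer
        f"ads-replProvPort: {provider_port}",
        f"ads-replUserDn: {user_dn}",
        f"ads-replUserPassword: {password}",
        f"ads-searchBaseDN: {base_dn}",
        "ads-replRefreshNPersist: TRUE",
        "ads-replRefreshInterval: 60000",
        "ads-replAliasDerefMode: never",
        "ads-replAttributes: *",
        "ads-replSearchScope: sub",
        "ads-replSearchFilter: (objectClass=*)",
        "ads-replSearchSizeLimit: 0",
        "ads-replSearchTimeOut: 0",
    ]
    return "\n".join(lines) + "\n"
```

For the setup described here, `consumer_ldif(1, "localhost", 10389, "ou=system")` would produce the entry to import on the instance running on port 11389.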
Although the configuration I posted in the question is incorrect, a couple of confusing things led me to it:
Lack of clear documentation on the ApacheDS website.
Apache Directory Studio: when you create a connection to an ApacheDS service, you can right-click the connection and open its configuration (which is stored under ou=config; the editor is basically a GUI for ou=config). The last tab, 'Replication', is titled 'All Replication Consumers' and has an 'Add' button on the right. This is misleading, as it gives the user the impression that consumer/slave details need to be added here and that this config belongs on the master/provider side.